Hate speech detection in social media apps using deep learning and machine learning techniques

This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model and Natur...

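Bibliographic Details
Main Author: Paredes Benavides, Jimmy Gerardo (author)
Format: bachelorThesis
Language: eng
Published: 2025
Subjects: Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning
Online Access: http://repositorio.yachaytech.edu.ec/handle/123456789/954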
_version_ 1863534788130897920
author Paredes Benavides, Jimmy Gerardo
author_facet Paredes Benavides, Jimmy Gerardo
author_role author
collection Repositorio Universidad Yachay Tech
dc.contributor.none.fl_str_mv Cuenca Pauta, Erick Eduardo
dc.creator.none.fl_str_mv Paredes Benavides, Jimmy Gerardo
dc.date.none.fl_str_mv 2025-05-12T22:44:03Z
2025-05-12T22:44:03Z
2025-05
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv http://repositorio.yachaytech.edu.ec/handle/123456789/954
dc.language.none.fl_str_mv eng
dc.publisher.none.fl_str_mv Universidad de Investigación de Tecnología Experimental Yachay
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.source.none.fl_str_mv reponame:Repositorio Universidad Yachay Tech
instname:Universidad Yachay Tech
instacron:Yachay
dc.subject.none.fl_str_mv Procesamiento de lenguaje natural
Aprendizaje automático
Aprendizaje profundo
Natural language processing
Machine learning
Deep learning
dc.title.none.fl_str_mv Hate speech detection in social media apps using deep learning and machine learning techniques
dc.type.none.fl_str_mv info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/bachelorThesis
description This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model, and Natural Language Processing techniques were used to process the text data. Because the dataset has imbalanced classes, data augmentation and data reduction were analyzed to determine which technique performs better on it. Data augmentation proved useful because one of the classes has few samples, whereas data reduction gave poor results: the dataset is small, so the data reduction technique is not suitable for it. Of the four models used (K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network, 1D-CNN), the 1D-CNN outperformed the others in all experiments. The best-performing experiment used data augmentation without data reduction; the best accuracy obtained for this combination was 84%.
eu_rights_str_mv openAccess
format bachelorThesis
id Yachay_40ba1610332fdf30ace92e0f1d4e15bf
instacron_str Yachay
institution Yachay
instname_str Universidad Yachay Tech
language eng
network_acronym_str Yachay
network_name_str Repositorio Universidad Yachay Tech
oai_identifier_str oai:repositorio.yachaytech.edu.ec:123456789/954
publishDate 2025
publisher.none.fl_str_mv Universidad de Investigación de Tecnología Experimental Yachay
reponame_str Repositorio Universidad Yachay Tech
repository.mail.fl_str_mv .
repository.name.fl_str_mv Repositorio Universidad Yachay Tech - Universidad Yachay Tech
repository_id_str 10284
spelling Hate speech detection in social media apps using deep learning and machine learning techniques
Paredes Benavides, Jimmy Gerardo
Procesamiento de lenguaje natural
Aprendizaje automático
Aprendizaje profundo
Natural language processing
Machine learning
Deep learning
This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model, and Natural Language Processing techniques were used to process the text data. Because the dataset has imbalanced classes, data augmentation and data reduction were analyzed to determine which technique performs better on it. Data augmentation proved useful because one of the classes has few samples, whereas data reduction gave poor results: the dataset is small, so the data reduction technique is not suitable for it. Of the four models used (K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network, 1D-CNN), the 1D-CNN outperformed the others in all experiments. The best-performing experiment used data augmentation without data reduction; the best accuracy obtained for this combination was 84%.
Ingeniero/a en Tecnologías de la Información
Universidad de Investigación de Tecnología Experimental Yachay
Cuenca Pauta, Erick Eduardo
2025-05-12T22:44:03Z
2025-05-12T22:44:03Z
2025-05
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/bachelorThesis
application/pdf
http://repositorio.yachaytech.edu.ec/handle/123456789/954
eng
info:eu-repo/semantics/openAccess
reponame:Repositorio Universidad Yachay Tech
instname:Universidad Yachay Tech
instacron:Yachay
2025-07-09T07:00:23Z
oai:repositorio.yachaytech.edu.ec:123456789/954
Institucional
https://repositorio.yachaytech.edu.ec/
Universidad pública
https://www.yachaytech.edu.ec/
https://repositorio.yachaytech.edu.ec/oai
Ecuador
...
opendoar:10284
2025-07-09T07:00:23
false
Repositorio Universidad Yachay Tech - Universidad Yachay Tech
false
spellingShingle Hate speech detection in social media apps using deep learning and machine learning techniques
Paredes Benavides, Jimmy Gerardo
Procesamiento de lenguaje natural
Aprendizaje automático
Aprendizaje profundo
Natural language processing
Machine learning
Deep learning
status_str publishedVersion
title Hate speech detection in social media apps using deep learning and machine learning techniques
title_full Hate speech detection in social media apps using deep learning and machine learning techniques
title_fullStr Hate speech detection in social media apps using deep learning and machine learning techniques
title_full_unstemmed Hate speech detection in social media apps using deep learning and machine learning techniques
title_short Hate speech detection in social media apps using deep learning and machine learning techniques
title_sort Hate speech detection in social media apps using deep learning and machine learning techniques
topic Procesamiento de lenguaje natural
Aprendizaje automático
Aprendizaje profundo
Natural language processing
Machine learning
Deep learning
url http://repositorio.yachaytech.edu.ec/handle/123456789/954
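
The description compares data augmentation and data reduction as ways to handle class imbalance. The record does not say which specific techniques the thesis used, so the following is only a minimal sketch under an assumption: it stands in random oversampling for "augmentation" and random undersampling for "reduction". The function names, the toy texts, and the 8-to-2 class split are all hypothetical and not taken from the thesis.

```python
import random
from collections import Counter


def group_by_label(texts, labels):
    """Return {label: [texts with that label]}."""
    groups = {}
    for text, label in zip(texts, labels):
        groups.setdefault(label, []).append(text)
    return groups


def resample(texts, labels, mode, seed=0):
    """Balance classes by random over- or undersampling (assumed stand-ins).

    mode="over":  grow every class to the size of the largest one
                  by repeating randomly chosen samples.
    mode="under": shrink every class to the size of the smallest one
                  by keeping a random subset.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)
    groups = group_by_label(texts, labels)
    sizes = [len(items) for items in groups.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    pairs = []
    for label, items in groups.items():
        if mode == "over":
            # Keep all originals, then add random repeats up to the target.
            extra = [rng.choice(items) for _ in range(target - len(items))]
            chosen = items + extra
        else:
            # Keep only a random subset of size `target`.
            chosen = rng.sample(items, target)
        pairs.extend((text, label) for text in chosen)
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [l for _, l in pairs]


# Hypothetical toy data: 8 majority-class and 2 minority-class placeholders.
texts = [f"neutral tweet {i}" for i in range(8)] + [f"flagged tweet {i}" for i in range(2)]
labels = [0] * 8 + [1] * 2

over_texts, over_labels = resample(texts, labels, mode="over")
under_texts, under_labels = resample(texts, labels, mode="under")

print(sorted(Counter(over_labels).items()))   # [(0, 8), (1, 8)]
print(sorted(Counter(under_labels).items()))  # [(0, 2), (1, 2)]
```

On this toy input, undersampling leaves only four samples in total, which is consistent with the description's remark that reduction is a poor fit when the dataset is already small.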