Hate speech detection in social media apps using deep learning and machine learning techniques

This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean constitutional reform plebiscite. Machine learning and deep learning approaches were used to obtain the best model and Natur...

Bibliographic Details
Main Author:	Paredes Benavides, Jimmy Gerardo (author)
Format:	bachelorThesis
Language:	eng
Published:	2025
Subjects:	Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning
Online Access:	http://repositorio.yachaytech.edu.ec/handle/123456789/954

_version_	1863534788130897920
author	Paredes Benavides, Jimmy Gerardo
author_facet	Paredes Benavides, Jimmy Gerardo
author_role	author
collection	Repositorio Universidad Yachay Tech
dc.contributor.none.fl_str_mv	Cuenca Pauta, Erick Eduardo
dc.creator.none.fl_str_mv	Paredes Benavides, Jimmy Gerardo
dc.date.none.fl_str_mv	2025-05-12T22:44:03Z 2025-05-12T22:44:03Z 2025-05
dc.format.none.fl_str_mv	application/pdf
dc.identifier.none.fl_str_mv	http://repositorio.yachaytech.edu.ec/handle/123456789/954
dc.language.none.fl_str_mv	eng
dc.publisher.none.fl_str_mv	Universidad de Investigación de Tecnología Experimental Yachay
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess
dc.source.none.fl_str_mv	reponame:Repositorio Universidad Yachay Tech instname:Universidad Yachay Tech instacron:Yachay
dc.subject.none.fl_str_mv	Procesamiento de lenguaje natural Aprendizaje automático Aprendizaje profundo Natural language processing Machine learning Deep learning
dc.title.none.fl_str_mv	Hate speech detection in social media apps using deep learning and machine learning techniques
dc.type.none.fl_str_mv	info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/bachelorThesis
description	This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean constitutional reform plebiscite. Machine learning and deep learning approaches were compared to obtain the best model, and Natural Language Processing techniques were used to process the text data. Because the dataset's classes are imbalanced, data augmentation and data reduction were analyzed to find out which technique performs better on this dataset. Data augmentation proved useful because one of the classes has few samples, whereas data reduction did not give good results: the dataset is not large, which makes the data reduction technique unsuitable for it. Of the four models used, K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network (1D-CNN), the 1D-CNN outperformed the others in all the experiments carried out. The best-performing experiment used data augmentation without data reduction; the best accuracy obtained for this combination was 84%.
eu_rights_str_mv	openAccess
format	bachelorThesis
id	Yachay_40ba1610332fdf30ace92e0f1d4e15bf
instacron_str	Yachay
institution	Yachay
instname_str	Universidad Yachay Tech
language	eng
network_acronym_str	Yachay
network_name_str	Repositorio Universidad Yachay Tech
oai_identifier_str	oai:repositorio.yachaytech.edu.ec:123456789/954
publishDate	2025
publisher.none.fl_str_mv	Universidad de Investigación de Tecnología Experimental Yachay
reponame_str	Repositorio Universidad Yachay Tech
repository.mail.fl_str_mv	.
repository.name.fl_str_mv	Repositorio Universidad Yachay Tech - Universidad Yachay Tech
repository_id_str	10284
spelling	Hate speech detection in social media apps using deep learning and machine learning techniques; Paredes Benavides, Jimmy Gerardo; Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning; Ingeniero/a en Tecnologías de la Información; Universidad de Investigación de Tecnología Experimental Yachay; Cuenca Pauta, Erick Eduardo; 2025-05-12T22:44:03Z; 2025-05; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/bachelorThesis; application/pdf; http://repositorio.yachaytech.edu.ec/handle/123456789/954; eng; info:eu-repo/semantics/openAccess; reponame:Repositorio Universidad Yachay Tech; instname:Universidad Yachay Tech; instacron:Yachay; 2025-07-09T07:00:23Z; oai:repositorio.yachaytech.edu.ec:123456789/954; Institucional; https://repositorio.yachaytech.edu.ec/; Universidad pública; https://www.yachaytech.edu.ec/; https://repositorio.yachaytech.edu.ec/oai; Ecuador; ...; opendoar:10284; Repositorio Universidad Yachay Tech - Universidad Yachay Tech
spellingShingle	Hate speech detection in social media apps using deep learning and machine learning techniques Paredes Benavides, Jimmy Gerardo Procesamiento de lenguaje natural Aprendizaje automático Aprendizaje profundo Natural language processing Machine learning Deep learning
status_str	publishedVersion
title	Hate speech detection in social media apps using deep learning and machine learning techniques
title_full	Hate speech detection in social media apps using deep learning and machine learning techniques
title_fullStr	Hate speech detection in social media apps using deep learning and machine learning techniques
title_full_unstemmed	Hate speech detection in social media apps using deep learning and machine learning techniques
title_short	Hate speech detection in social media apps using deep learning and machine learning techniques
title_sort	Hate speech detection in social media apps using deep learning and machine learning techniques
topic	Procesamiento de lenguaje natural Aprendizaje automático Aprendizaje profundo Natural language processing Machine learning Deep learning
url	http://repositorio.yachaytech.edu.ec/handle/123456789/954
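
The abstract contrasts data augmentation with data reduction as ways of handling class imbalance, but this record does not say which concrete techniques the thesis used. As a rough illustration only, the sketch below uses random oversampling as a stand-in for augmentation and random undersampling as a stand-in for reduction. The function names, the 90/10 class split, and the toy tweets are all hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter


def group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return by_class


def oversample(samples, labels, seed=0):
    """Random oversampling (assumed stand-in for data augmentation):
    duplicate minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = max(len(items) for items in by_class.values())
    out = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, label) for s in items + extra)
    rng.shuffle(out)
    return out


def undersample(samples, labels, seed=0):
    """Random undersampling (assumed stand-in for data reduction):
    drop majority-class samples until every class matches the smallest."""
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = min(len(items) for items in by_class.values())
    out = []
    for label, items in by_class.items():
        out.extend((s, label) for s in rng.sample(items, target))
    rng.shuffle(out)
    return out


# Hypothetical toy data: 90 non-hate tweets (0) and 10 hate tweets (1).
texts = [f"tweet {i}" for i in range(100)]
labels = [0] * 90 + [1] * 10

over = oversample(texts, labels)
under = undersample(texts, labels)
over_counts = Counter(label for _, label in over)
under_counts = Counter(label for _, label in under)

print(len(over), sorted(over_counts.items()))
print(len(under), sorted(under_counts.items()))
```

With this toy split, oversampling yields 180 balanced samples while undersampling leaves only 20, which is consistent with the abstract's remark that reduction is a poor fit when the dataset is small to begin with.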
