Hate speech detection in social media apps using deep learning and machine learning techniques
This work presents a model able to detect if a tweet contains hate speech or not, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model and Natur...
| Main Author: | Paredes Benavides, Jimmy Gerardo |
|---|---|
| Format: | bachelorThesis |
| Language: | eng |
| Published: | 2025 |
| Subjects: | |
| Online Access: | http://repositorio.yachaytech.edu.ec/handle/123456789/954 |
| _version_ | 1863534788130897920 |
|---|---|
| author | Paredes Benavides, Jimmy Gerardo |
| author_facet | Paredes Benavides, Jimmy Gerardo |
| author_role | author |
| collection | Repositorio Universidad Yachay Tech |
| dc.contributor.none.fl_str_mv | Cuenca Pauta, Erick Eduardo |
| dc.creator.none.fl_str_mv | Paredes Benavides, Jimmy Gerardo |
| dc.date.none.fl_str_mv | 2025-05-12T22:44:03Z 2025-05-12T22:44:03Z 2025-05 |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | http://repositorio.yachaytech.edu.ec/handle/123456789/954 |
| dc.language.none.fl_str_mv | eng |
| dc.publisher.none.fl_str_mv | Universidad de Investigación de Tecnología Experimental Yachay |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess |
| dc.source.none.fl_str_mv | reponame:Repositorio Universidad Yachay Tech instname:Universidad Yachay Tech instacron:Yachay |
| dc.subject.none.fl_str_mv | Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning |
| dc.title.none.fl_str_mv | Hate speech detection in social media apps using deep learning and machine learning techniques |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/bachelorThesis |
| description | This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model, and Natural Language Processing techniques were used to process the text data. Because the dataset is class-imbalanced, data augmentation and data reduction were compared to determine which technique performs better on this dataset. Data augmentation proved useful because one of the classes has few samples, whereas data reduction did not give good results: the dataset is too small for that technique to be suitable. Of the four models used, K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network (1D-CNN), the 1D-CNN outperformed the others in all experiments. The best-performing configuration used data augmentation without data reduction; its best accuracy was 84%. |
| eu_rights_str_mv | openAccess |
| format | bachelorThesis |
| id | Yachay_40ba1610332fdf30ace92e0f1d4e15bf |
| instacron_str | Yachay |
| institution | Yachay |
| instname_str | Universidad Yachay Tech |
| language | eng |
| network_acronym_str | Yachay |
| network_name_str | Repositorio Universidad Yachay Tech |
| oai_identifier_str | oai:repositorio.yachaytech.edu.ec:123456789/954 |
| publishDate | 2025 |
| publisher.none.fl_str_mv | Universidad de Investigación de Tecnología Experimental Yachay |
| reponame_str | Repositorio Universidad Yachay Tech |
| repository.mail.fl_str_mv | . |
| repository.name.fl_str_mv | Repositorio Universidad Yachay Tech - Universidad Yachay Tech |
| repository_id_str | 10284 |
| spelling | Hate speech detection in social media apps using deep learning and machine learning techniques; Paredes Benavides, Jimmy Gerardo; Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning; This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to obtain the best model, and Natural Language Processing techniques were used to process the text data. Because the dataset is class-imbalanced, data augmentation and data reduction were compared to determine which technique performs better on this dataset. Data augmentation proved useful because one of the classes has few samples, whereas data reduction did not give good results: the dataset is too small for that technique to be suitable. Of the four models used, K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network (1D-CNN), the 1D-CNN outperformed the others in all experiments. The best-performing configuration used data augmentation without data reduction; its best accuracy was 84%.; Information Technology Engineer (Ingeniero/a en Tecnologías de la Información); Universidad de Investigación de Tecnología Experimental Yachay; Cuenca Pauta, Erick Eduardo; 2025-05-12T22:44:03Z; 2025-05-12T22:44:03Z; 2025-05; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/bachelorThesis; application/pdf; http://repositorio.yachaytech.edu.ec/handle/123456789/954; eng; info:eu-repo/semantics/openAccess; reponame:Repositorio Universidad Yachay Tech; instname:Universidad Yachay Tech; instacron:Yachay; 2025-07-09T07:00:23Z; oai:repositorio.yachaytech.edu.ec:123456789/954; Institutional; https://repositorio.yachaytech.edu.ec/; Public university; https://www.yachaytech.edu.ec/; https://repositorio.yachaytech.edu.ec/oai; Ecuador; ...; opendoar:10284; 2025-07-09T07:00:23; false; Repositorio Universidad Yachay Tech - Universidad Yachay Tech; false |
| spellingShingle | Hate speech detection in social media apps using deep learning and machine learning techniques; Paredes Benavides, Jimmy Gerardo; Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning |
| status_str | publishedVersion |
| title | Hate speech detection in social media apps using deep learning and machine learning techniques |
| title_full | Hate speech detection in social media apps using deep learning and machine learning techniques |
| title_fullStr | Hate speech detection in social media apps using deep learning and machine learning techniques |
| title_full_unstemmed | Hate speech detection in social media apps using deep learning and machine learning techniques |
| title_short | Hate speech detection in social media apps using deep learning and machine learning techniques |
| title_sort | Hate speech detection in social media apps using deep learning and machine learning techniques |
| topic | Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning |
| url | http://repositorio.yachaytech.edu.ec/handle/123456789/954 |
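
The description above contrasts two ways of handling class imbalance: adding samples to the under-represented class versus removing samples from the over-represented one. The record does not say which augmentation or reduction methods the thesis used, so the sketch below assumes the simplest stand-ins, random oversampling and random undersampling. The function names `random_oversample` and `random_undersample` and the toy data are hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter


def _group_by_class(texts, labels):
    """Collect the samples of each class into a dict: label -> list of texts."""
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    return by_class


def random_oversample(texts, labels, seed=0):
    """Assumed stand-in for data augmentation: duplicate randomly chosen
    samples of the smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_texts.extend(items + extra)
        out_labels.extend([label] * target)
    return out_texts, out_labels


def random_undersample(texts, labels, seed=0):
    """Assumed stand-in for data reduction: keep a random subset of each
    class so that every class matches the smallest."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = min(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        out_texts.extend(rng.sample(items, target))
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Made-up toy data: 8 samples of class 0 and 2 samples of class 1.
texts = [f"tweet {i}" for i in range(10)]
labels = [0] * 8 + [1] * 2

over_texts, over_labels = random_oversample(texts, labels)
under_texts, under_labels = random_undersample(texts, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

On this toy input, oversampling yields 16 samples (8 per class) while undersampling leaves only 4 (2 per class). That shrinkage is one plausible reading of why the description reports that data reduction was unsuitable for a small dataset, though the thesis itself may have used different methods.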