Hate speech detection in social media apps using deep learning and machine learning techniques

Bibliographic details
Main author:	Paredes Benavides, Jimmy Gerardo (author)
Format:	bachelorThesis
Language:	eng
Published:	2025
Subjects:	Procesamiento de lenguaje natural; Aprendizaje automático; Aprendizaje profundo; Natural language processing; Machine learning; Deep learning
Online access:	http://repositorio.yachaytech.edu.ec/handle/123456789/954

Description
Summary:	This work presents a model that detects whether a tweet contains hate speech, using as primary data tweets collected from the X platform in the context of the 2023 Chilean Plebiscite Constitutional Reform. Machine learning and deep learning approaches were used to find the best model, and Natural Language Processing techniques were used to process the text data. Because the classes in the dataset are imbalanced, data augmentation and data reduction were compared to determine which technique performs better on this dataset. Data augmentation proved useful because one of the classes has few samples. Data reduction did not give good results because the dataset is already small, which makes that technique unsuitable here. Four models were evaluated: K-Nearest Neighbors, Decision Tree Classifier, Logistic Regression, and a 1-dimensional Convolutional Neural Network (1D-CNN). The 1D-CNN outperformed the other models in all experiments. The best-performing configuration used data augmentation without data reduction, and its best accuracy was 84%.
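
The trade-off between data augmentation and data reduction on an imbalanced dataset can be illustrated with a minimal sketch. This record does not say which augmentation or reduction methods the thesis used, so the code below substitutes the simplest stand-ins: random oversampling of the minority class and random undersampling of the majority class. The function names `oversample` and `undersample`, the placeholder texts, and the 8-to-2 class split are all assumptions made for illustration and are not taken from the thesis.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Group samples into a dict keyed by class label."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return by_class


def oversample(samples, labels, seed=0):
    """Stand-in for augmentation (assumption): duplicate randomly chosen
    items of the smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(items) for items in by_class.values())
    pairs = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((s, label) for s in items + extra)
    rng.shuffle(pairs)
    return [s for s, _ in pairs], [y for _, y in pairs]


def undersample(samples, labels, seed=0):
    """Stand-in for reduction (assumption): randomly drop items of the
    larger classes until every class matches the smallest."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(items) for items in by_class.values())
    pairs = []
    for label, items in by_class.items():
        pairs.extend((s, label) for s in rng.sample(items, target))
    rng.shuffle(pairs)
    return [s for s, _ in pairs], [y for _, y in pairs]


# Hypothetical toy data: label 1 marks the minority class, label 0 the majority.
texts = [f"placeholder majority text {i}" for i in range(8)]
texts += [f"placeholder minority text {i}" for i in range(2)]
labels = [0] * 8 + [1] * 2

over_texts, over_labels = oversample(texts, labels)
under_texts, under_labels = undersample(texts, labels)

print("original:    ", sorted(Counter(labels).items()))
print("oversampled: ", sorted(Counter(over_labels).items()))
print("undersampled:", sorted(Counter(under_labels).items()))
```

On this toy split, undersampling keeps only four of the ten samples, while oversampling grows the set to sixteen. That is consistent with the summary's observation that reduction is a poor fit when the dataset is already small. To avoid leaking duplicated samples into evaluation, either technique would normally be applied to the training split only.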
