Advancing Online Hate Speech Detection Using External Features and Large Language Models


Metadata

dc.contributor.advisor: Seals, Cheryl
dc.contributor.author: Das, Amit
dc.date.accessioned: 2024-07-24T14:55:46Z
dc.date.available: 2024-07-24T14:55:46Z
dc.date.issued: 2024-07-24
dc.identifier.uri: https://etd.auburn.edu//handle/10415/9349
dc.description.abstract (en_US):

Social media was conceived to connect people and make the world smaller, but it has recently become a hub for hateful posts targeting individuals and communities. As a result, hostile actions and harassing remarks are increasingly common online. Because this issue can cause immense harm to a person, it must be addressed with high priority. Many natural language processing models have been applied to hate speech detection.

In our study, we began by combining a BERT representation with TF-IDF features to tackle the challenge of identifying irony and stereotype-spreading authors on Twitter, employing a logistic regression classifier for the classification task. Our findings indicated that the combination of the BERT representation with TF-IDF yielded very promising results.

To delve deeper into the issue, we addressed sexism, another form of hate speech, one that predominantly targets women. For sexism detection we introduced a fine-tuned RoBERTa model: the initial data representation was encoded with RoBERTa, and three distinct multilayer perceptrons (MLPs) were implemented for the three sub-tasks. The experimental results showcased the effectiveness of our proposed strategy.

Additionally, we explored the potential benefits of incorporating external features into the detection of sexism and hate speech. Specifically, we examined the impact of user gender information on online sexism detection in both binary and multi-class classification. Given that most sexist comments are directed at individuals of a particular gender, understanding the role of user gender information is crucial. Our experiments demonstrated that integrating user gender information with textual features enhanced classification performance in both settings.

Further advancing our research, we introduced OffensiveLang, a novel community-based implicit offensive language dataset generated by ChatGPT 3.5 and covering 38 different target groups. Although ethical constraints limit the generation of offensive text via ChatGPT, we devised a prompt-based approach that effectively generates implicit offensive language. To ensure data quality, we evaluated the dataset through human assessment. We also employed a prompt-based zero-shot method with ChatGPT and compared detection results between human annotations and ChatGPT annotations, used existing state-of-the-art models to evaluate their effectiveness in detecting such language, and investigated annotator biases in hate speech data annotation using large language models.

Lastly, we investigated gender, race, religion, and disability biases in LLMs used for hate speech detection and proposed mitigation strategies. We demonstrated the presence of these biases in LLMs such as GPT-4o and GPT-3.5 when annotating hate speech data, explored the underlying factors that contribute to them through a thorough analysis of the annotated data, and emphasized the role of subjective interpretation. Finally, we suggested potential solutions to mitigate these biases, highlighting the importance of tailored prompts and of fine-tuning LLMs to enhance the fairness and accuracy of annotations.

(Hedged, illustrative code sketches of the main techniques summarized here appear after this record.)
dc.rights (en_US): EMBARGO_NOT_AUBURN
dc.subject (en_US): Computer Science and Software Engineering
dc.title (en_US): Advancing Online Hate Speech Detection Using External Features and Large Language Models
dc.type (en_US): PhD Dissertation
dc.embargo.length (en_US): MONTHS_WITHHELD:24
dc.embargo.status (en_US): EMBARGOED
dc.embargo.enddate (en_US): 2026-07-24
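
The abstract's first study pairs a BERT representation with TF-IDF features and classifies with logistic regression. The sketch below shows one minimal way to build such a pipeline; the checkpoint name (bert-base-uncased), the [CLS]-token pooling, and the toy data are assumptions, not the dissertation's actual setup.

```python
# Hedged sketch: BERT [CLS] embeddings concatenated with TF-IDF features,
# fed to a logistic regression classifier. Model choice and data are assumed.
import numpy as np
import torch
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

texts = ["example tweet one", "example tweet two"]  # placeholder data
labels = [0, 1]                                     # placeholder labels

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Use the [CLS] token embedding as a fixed-size sentence representation.
    cls_vecs = bert(**enc).last_hidden_state[:, 0, :].numpy()

tfidf = TfidfVectorizer().fit_transform(texts).toarray()

# Concatenate the contextual and sparse lexical views, then classify.
features = np.hstack([cls_vecs, tfidf])
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.predict(features))
```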
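
For the sexism sub-tasks, the abstract describes a fine-tuned RoBERTa encoder feeding three distinct MLPs, one per sub-task. The following is a minimal sketch of one plausible reading of that architecture; the head widths and per-sub-task label counts are illustrative assumptions.

```python
# Hedged sketch: shared RoBERTa encoding with three independent MLP heads.
import torch.nn as nn
from transformers import AutoModel

class MultiTaskSexismModel(nn.Module):
    def __init__(self, num_labels=(2, 4, 11)):  # assumed sub-task label counts
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-base")
        hidden = self.encoder.config.hidden_size
        # One independent MLP per sub-task over the shared sentence encoding.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, n))
            for n in num_labels
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0, :]  # <s> token as sentence vector
        return [head(pooled) for head in self.heads]
```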
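
The external-feature experiments integrate user gender information with textual features. One simple realization, assuming gender is available as categorical per-author metadata (the encoding scheme and the toy data below are assumptions), is to concatenate a one-hot gender vector with the text features before classification.

```python
# Hedged sketch: fusing a one-hot user-gender feature with TF-IDF text
# features. Feature choices and the toy data are assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

texts = ["comment a", "comment b", "comment c"]   # placeholder comments
genders = [["female"], ["male"], ["female"]]      # placeholder author metadata
labels = [1, 0, 1]                                # placeholder sexism labels

text_feats = TfidfVectorizer().fit_transform(texts).toarray()
gender_feats = OneHotEncoder().fit_transform(genders).toarray()

X = np.hstack([text_feats, gender_feats])  # text + external feature fusion
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```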
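
The OffensiveLang study applies a prompt-based zero-shot method with ChatGPT to annotate texts. Below is a minimal sketch against the OpenAI Python client; the prompt wording, label set, and model name are assumptions rather than the dissertation's actual prompts.

```python
# Hedged sketch: zero-shot offensive-language annotation via a ChatGPT-family
# model. Prompt text and model name are assumed, not taken from the thesis.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def annotate(text: str) -> str:
    prompt = (
        "Label the following sentence as 'offensive' or 'not offensive', "
        "considering implicit as well as explicit offense.\n"
        f"Sentence: {text}\nLabel:"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # near-deterministic labels for annotation
    )
    return resp.choices[0].message.content.strip()

print(annotate("Some people just shouldn't be allowed to vote."))
```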
