A comparative study of deep learning algorithms for hate speech detection on Twitter

Mutanga, Raymond

Please use this identifier to cite or link to this item: https://hdl.handle.net/10321/3776

DC Field	Value	Language
dc.contributor.advisor	Naicker, N.	-
dc.contributor.advisor	Olugbara, Oludayo O.	-
dc.contributor.author	Mutanga, Raymond	en_US
dc.date.accessioned	2022-01-17T10:40:22Z	-
dc.date.available	2022-01-17T10:40:22Z	-
dc.date.issued	2021-10-29	-
dc.identifier.uri	https://hdl.handle.net/10321/3776	-
dc.description	A dissertation submitted in fulfilment of the requirement for the Master of Information and Communications Technology degree, Faculty of Accounting and Informatics, Department of Information Technology, Durban University of Technology, 2021.	en_US
dc.description.abstract	Hate speech is an undesirable phenomenon with severe psychological and physical consequences. The emergence of mobile computing and Web 2.0 technologies has increasingly facilitated the spread of hate speech. The speed, accessibility and anonymity afforded by these tools present challenges in enforcing measures that minimise the spread of hate speech. The continued dissemination of hate speech online has triggered the development of various machine learning techniques for its automated detection. However, current approaches are inadequate because of further challenges such as the use of domain-specific language and language subtleties. Recent studies on automated hate speech detection have focused on the use of deep learning as a possible solution to these challenges. Although some studies have explored deep learning methods for hate speech detection, there are no studies that critically compare and evaluate their performance. This work investigates the use of deep learning algorithms as possible solutions to hate speech detection on Twitter. Three taxonomic classes of deep learning algorithms, namely, traditional deep learning algorithms, traditional algorithms with a partial attention mechanism, and Transformer models, which are entirely based on the attention mechanism, are evaluated for performance using two publicly available corpora. One of the datasets contained 24 786 tweets annotated into three different classes, while the other dataset contained 2300 tweets annotated into two different classes. All tweets from the two datasets were first preprocessed to rid them of characters and words deemed irrelevant to the classification decision, for instance, hashtags, stop words and punctuation marks. The preprocessed text was then transformed into feature vectors, which were used as input for the deep learning algorithms explored in this study.
A series of experiments was performed to measure the performance of the deep learning algorithms in hate speech detection. The algorithms were tested on a wide spectrum of tweets containing different forms of hate speech. The efficacy of the deep learning algorithms was objectively evaluated using six state-of-the-art statistical evaluation metrics: precision, F-measure, recall, accuracy, Matthews correlation coefficient and area under the curve. The results from this study indicate that variations in parameters do not affect the efficacy of all the deep learning algorithms to the same degree. The findings of this empirical study, therefore, provide deep-learning practitioners with a better understanding of the adaptation of robust deep-learning techniques for automated hate speech detection tasks.	en_US
dc.format.extent	124 p	en_US
dc.language.iso	en	en_US
dc.subject.lcsh	Deep learning (Machine learning)	en_US
dc.subject.lcsh	Online hate speech	en_US
dc.subject.lcsh	Algorithms	en_US
dc.title	A comparative study of deep learning algorithms for hate speech detection on Twitter	en_US
dc.type	Thesis	en_US
dc.description.level	M	en_US
dc.identifier.doi	https://doi.org/10.51415/10321/3776	-
item.grantfulltext	open	-
item.cerifentitytype	Publications	-
item.fulltext	With Fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.openairetype	Thesis	-
item.languageiso639-1	en	-
Appears in Collections:	Theses and dissertations (Accounting and Informatics)
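
The preprocessing step described in the abstract (removing hashtags, stop words and punctuation marks, then turning the cleaned text into feature vectors) can be illustrated with a minimal sketch. This is not the thesis's code: the stop-word list, the function names `preprocess_tweet`, `build_vocabulary` and `encode`, and the integer-index encoding used as a stand-in for the feature vectors are all assumptions made for illustration.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; the record does not say
# which list the thesis used.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "this"}

HASHTAG_RE = re.compile(r"#\w+")                      # a whole "#tag" token
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess_tweet(text):
    """Lowercase a tweet and drop hashtags, punctuation and stop words."""
    text = HASHTAG_RE.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to an integer id; 0 is kept for padding/unknown."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode(tokens, vocab, max_len):
    """Assumed stand-in for feature vectors: fixed-length list of token ids."""
    ids = [vocab.get(tok, 0) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


tokens = preprocess_tweet("This is an EXAMPLE tweet, with a #hashtag!")
vocab = build_vocabulary([tokens])
vector = encode(tokens, vocab, max_len=5)
```

A fixed-length id sequence of this kind is a common input format for embedding layers in neural text classifiers, but the record does not state which vectorisation the thesis actually used.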
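
The six evaluation metrics named in the abstract have standard definitions, sketched below for the two-class case only. The function names are hypothetical, the labels are assumed to be encoded as 0 and 1, and the area under the curve is computed here as the ROC AUC via pairwise ranking, which is an assumption since the record does not say which curve was used. The three-class dataset would need an averaged or multiclass variant that this sketch does not cover.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return 0.0 instead of raising when a denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Precision, recall, F-measure (F1), accuracy and Matthews correlation."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "mcc": safe_div(
            tp * tn - fp * fn,
            math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        ),
    }


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Unlike accuracy, the Matthews correlation coefficient uses all four confusion-matrix cells, which is why it is often reported alongside accuracy when classes are imbalanced.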

Files in This Item:

File	Description	Size	Format
MutangaR_2021.PDF	thesis	1.7 MB	Adobe PDF


Page view(s): 410 (checked on Dec 13, 2024)

Download(s): 445 (checked on Dec 13, 2024)
