|Title:||A comparative study of deep learning algorithms for hate speech detection on Twitter|
|Authors:||Mutanga, Raymond|
|Issue Date:||29-Oct-2021|
|Abstract:||
Hate speech is an undesirable phenomenon with severe psychological and physical
consequences. The emergence of mobile computing and Web 2.0 technologies has increasingly
facilitated the spread of hate speech. The speed, accessibility and anonymity afforded by these
tools present challenges in enforcing measures that minimise the spread of hate speech. The
continued dissemination of hate speech online has triggered the development of various
machine learning techniques for its automated detection. However, current approaches are
inadequate because of further challenges such as the use of domain-specific language and
language subtleties. Recent studies on automated hate speech detection have focused on the
use of deep learning as a possible solution to these challenges. Although some studies have
explored deep learning methods for hate speech detection, there are no studies that critically
compare and evaluate their performance.
This work investigates the use of deep learning algorithms as possible solutions to hate speech
detection on Twitter. Three taxonomic classes of deep learning algorithms, namely traditional
deep learning algorithms, traditional algorithms with a partial attention mechanism, and
Transformer models, which are based entirely on the attention mechanism, are evaluated for
performance using two publicly available corpora. One of the datasets contained 24 786 tweets
annotated into three different classes, while the other dataset contained 2300 tweets annotated
into two different classes. All tweets from the two datasets were first preprocessed to rid
them of characters and words deemed irrelevant to the classification decision, for instance,
hashtags, stop words and punctuation marks. The preprocessed text was then transformed into
feature vectors which were used as input for deep learning algorithms explored in this study.
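The preprocessing step described above can be illustrated with a minimal sketch. It is not the pipeline used in the dissertation: the function name `preprocess_tweet` and the short `STOP_WORDS` set are assumptions made for illustration, and the abstract does not specify which stop-word list or tokeniser was used.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; the study does not name its list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on"}

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess_tweet(text: str) -> list[str]:
    """Lowercase a tweet, then drop hashtags, punctuation and stop words."""
    text = text.lower()
    text = re.sub(r"#\w+", " ", text)       # remove hashtags such as "#sunny"
    text = text.translate(_PUNCT_TO_SPACE)  # replace punctuation with spaces
    return [tok for tok in text.split() if tok not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess_tweet("The weather is GREAT today, #sunny!"))
```

The resulting token list would then be mapped to feature vectors (for example, word indices or embeddings) before being passed to a model; that mapping is left out here because the abstract does not describe it.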
A series of experiments were performed to measure the performance of the deep learning
algorithms in hate speech detection. The algorithms were tested on a wide spectrum of tweets
containing different forms of hate speech. The efficacy of the deep learning algorithms was
objectively evaluated using six state-of-the-art statistical evaluation metrics: precision, F-measure, recall, accuracy, Matthews correlation coefficient and area under the curve. The
results from this study indicate that variations in parameters do not affect the efficacy of all
deep learning algorithms to the same extent. The findings of this empirical study, therefore,
provide deep-learning practitioners with a better understanding of the adaptation of robust
deep-learning techniques for automated hate speech detection tasks.
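For readers unfamiliar with the six evaluation metrics named above, the following sketch shows how they are commonly computed for a binary task. It is an illustration under stated assumptions, not the evaluation code used in the dissertation: the function names `binary_metrics` and `auc` are hypothetical, labels are assumed to be 0 or 1, and the abstract does not say how the metrics were averaged for the three-class dataset.

```python
import math


def binary_metrics(y_true, y_pred):
    """Precision, recall, F-measure, accuracy and MCC from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure,
            "accuracy": accuracy, "mcc": mcc}


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive example receives the higher score;
    ties count as half. Requires at least one example of each class."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(binary_metrics(y_true, y_pred))
    print(auc(y_true, scores))
```

The pairwise AUC shown here is quadratic in the number of examples, which is fine for illustration; on corpora of the size described above, a library implementation would normally be used instead.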
A dissertation submitted in fulfilment of the requirement for the Master of Information and Communications Technology degree, Faculty of Accounting and Informatics, Department of Information Technology, Durban University of Technology, 2021.
|Appears in Collections:||Theses and dissertations (Accounting and Informatics)|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.