Please use this identifier to cite or link to this item:
https://hdl.handle.net/10321/3794
Title: | Automatic speech recognition of the isiZulu language |
Authors: | Shezi Nokwanda |
Issue Date: | 1-Dec-2021 |
Abstract: | A key component of artificial intelligence is human-to-machine communication. Such communication has been realised through virtual assistants such as Apple's Siri, Google Now and Amazon's Alexa. This technology is made possible through Automatic Speech Recognition (ASR). Only in recent years have previously marginalised or developing countries started researching ASR for their indigenous languages. This research focuses on ASR in isiZulu, which is one of South Africa's most spoken indigenous languages. The research involves two main fields of study, namely digital signal processing (DSP) and machine learning (ML). DSP was applied in word boundary estimation and feature extraction. Machine learning was used to convert the word boundary estimation problem to a classification problem as well as for word recognition. Word boundary estimation achieved an accuracy of 68.4%, which is on par with current research. Mel-frequency cepstral coefficients (MFCC) were used for the feature extraction of the speech and deep neural networks were chosen for the ML component. For the detection and classification of a word in a sentence, the trained neural network was tested by considering the effect of including and excluding explicit boundaries on the overall recognition. Word recognition accuracy with manually demarcated boundaries was 78.18%. In-sentence recognition accuracy without demarcated boundaries was 17.74%, while an accuracy of 23.28% was achieved without demarcated boundaries using classification. While in-sentence recognition accuracy was low for both algorithms, the accurately recognised words were determined by different heuristics. Other factors, such as the complex differences between the indigenous isiZulu language and other more commonly spoken languages, are also highlighted and further research avenues are proposed. |
Description: | Submitted in fulfillment of the degree of Master of Engineering at the Department of Electronic and Computer Engineering in the Faculty of Engineering and the Built Environment at the Durban University of Technology, 2021. |
URI: | https://hdl.handle.net/10321/3794 |
DOI: | https://doi.org/10.51415/10321/3794 |
Appears in Collections: | Theses and dissertations (Engineering and Built Environment) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Nokwanda Shezi.pdf | | 12.19 MB | Adobe PDF |
Page view(s): | 362 (checked on Dec 22, 2024) |
Download(s): | 185 (checked on Dec 22, 2024) |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.