DEEP LEARNING-BASED AUTOMATIC AMHARIC-ENGLISH CODE-SWITCHING DETECTION AND CORRECTION

GIRMAW, ANDUALEM

URI: http://ir.bdu.edu.et/handle/123456789/14391

Date: 2022-02

Abstract:

Human language technology (HLT) studies how computer programs can analyze, create, modify, or respond to human text and speech. The field is interdisciplinary, drawing on natural language processing (NLP), computational linguistics (CL), deep neural networks, and speech technology. NLP and CL concentrate on language comprehension and on transforming human language into a form that a computer program can process. NLP is an active research area that aims to close the gap between humans and computers through natural language. Amharic is a natural language with its own characters, words, phrases, sentences, dialects, and grammatical structure. On social media, however, there is a tendency to mix English into Amharic text, and this code-switching poses challenges for Amharic natural language processing tasks. Code-switching is a linguistic term for the use of two or more languages or language varieties within a conversation, in speech or in writing. The task is to detect whether a sentence from a social media post or comment contains a code switch. If it does, the switch can be of several types, such as an intra-word code switch, a tag code switch, or an inter-sentential code switch. The objective of this study is to develop deep learning-based automatic Amharic-English code-switching detection and correction. Four deep learning models were evaluated: CNN, Bi-LSTM, Bi-GRU, and CNN with Bi-GRU. The experiments were run on the Anaconda platform with Python 3.8.5 and Keras with TensorFlow as the backend. Bi-LSTM achieved an accuracy of 92.9%, F1-score of 92.9%, recall of 92.8%, and precision of 93%. Bi-GRU achieved an accuracy of 93.8%, F1-score of 93.9%, recall of 94.1%, and precision of 93.8%. CNN achieved an accuracy of 95.8%, F1-score of 95.6%, recall of 95.5%, and precision of 95.7%. CNN with Bi-GRU achieved an accuracy of 93.8%, F1-score of 93.7%, recall of 93.9%, and precision of 93.6%. Based on these results, CNN was selected for this work because it outperformed the other three models (Bi-LSTM, Bi-GRU, and CNN with Bi-GRU).

Keywords: Deep Learning Approach, Code-Switching, Convolutional Neural Network, Bidirectional LSTM, Bidirectional GRU
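To make the detection task concrete, here is a minimal sketch of a script-based baseline, assuming Amharic tokens can be recognized by the Unicode Ethiopic block (U+1200 to U+137F) and English tokens by ASCII letters. It is not the deep learning method evaluated in the thesis. The function names `script_of` and `detect_code_switch` and the labels `"word-level"` and `"none"` are hypothetical, and the sketch does not try to separate tag switches from inter-sentential switches.

```python
def script_of(token):
    """Tag one token as 'am' (Ethiopic), 'en' (Latin), 'mixed', or 'other'."""
    # Assumption: the Ethiopic block U+1200..U+137F stands in for Amharic.
    has_ethiopic = any("\u1200" <= ch <= "\u137f" for ch in token)
    # Assumption: ASCII letters stand in for English.
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_ethiopic and has_latin:
        return "mixed"
    if has_ethiopic:
        return "am"
    if has_latin:
        return "en"
    return "other"


def detect_code_switch(sentence):
    """Return a coarse, hypothetical label for a whitespace-tokenized sentence."""
    tags = [script_of(token) for token in sentence.split()]
    if "mixed" in tags:
        # A single token containing both scripts.
        return "intra-word"
    if "am" in tags and "en" in tags:
        # Separate Amharic and English tokens in the same sentence.
        return "word-level"
    return "none"
```

A heuristic like this cannot handle Amharic written in Latin script or English words transliterated into Ethiopic script, which is one reason a learned model such as those compared above may be needed.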
