DESIGNING NEXT PHRASE PREDICTION MODEL FOR AMHARIC LANGUAGE USING DEEP LEARNING TECHNIQUES

WELELA, AMESALU

dc.contributor.author	WELELA, AMESALU
dc.date.accessioned	2025-02-24T07:31:34Z
dc.date.available	2025-02-24T07:31:34Z
dc.date.issued	2024-02
dc.identifier.uri	http://ir.bdu.edu.et/handle/123456789/16473
dc.description.abstract	Text entry is an essential aspect of human-computer interaction and can be performed through a keyboard, which mostly carries English letters. Typing Amharic text on a computer system can therefore pose challenges such as reduced typing speed and spelling and grammar errors. These challenges motivate text prediction, which facilitates fast entry of text into computers and handheld devices. Previous studies of Amharic next-word prediction lacked syntactic agreement because of inaccurate part-of-speech tagging. In addition, predicting a single word did not capture the context of the sentence. This study aims to design a next-phrase prediction model using deep learning approaches. The dataset for the prediction model was collected from Amharic student textbooks, Amharic teacher's guidebooks, the Amharic grammar የአማርኛ ሰዋሰው by Baye Yimam, and news from Amhara mass media. The collected Amharic sentences were preprocessed, part-of-speech tagged with a pre-trained model, and chunk tagged using a rule-based method for model development. Two prediction models were designed, one using LSTM and one using an Encoder-Decoder architecture, in order to compare them and select the better one. The models were trained and evaluated using 2176 simple declarative sentences with two split ratios for the training, validation, and testing sets: 80%, 10%, 10% and 70%, 15%, 15%. On the 80%, 10%, 10% split, the Encoder-Decoder and LSTM models achieved accuracies of 68.8% and 70.4%, respectively, so the LSTM model performed better than the Encoder-Decoder model on this split. The findings of this study are valuable especially for non-native users and users with dyslexia in typing coherent sentences, because the model captures the context of a sentence by considering sequences of words rather than individual words. 
This study was limited to declarative sentences and syntactic information; future research could extend it to other sentence types and to semantic information. Keywords: Phrase prediction, deep learning, Long Short Term Memory, Encoder-Decoder, sentence chunk	en_US
dc.language.iso	en_US	en_US
dc.subject	Information Technology	en_US
dc.title	DESIGNING NEXT PHRASE PREDICTION MODEL FOR AMHARIC LANGUAGE USING DEEP LEARNING TECHNIQUES	en_US
dc.type	Thesis	en_US