BDU IR

Automatic Multiword Expressions Detection in Amharic language using machine learning approach


dc.contributor.author Atinkut, Muche
dc.date.accessioned 2022-12-31T06:38:51Z
dc.date.available 2022-12-31T06:38:51Z
dc.date.issued 2022-11
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/14783
dc.description.abstract Applying NLP techniques to multiword expressions (MWEs) has become an important area of research. A multiword expression is a lexical unit larger than a word that can carry both idiomatic and compositional meaning. The purpose of this study is to investigate how to automate multiword expression detection for the Amharic language. Multiword expressions have been shown to affect NLP tasks such as machine translation, question answering, word sense disambiguation (WSD), information retrieval, and next-word prediction. For other languages, such as English, Japanese, and Indian languages, multiword expressions have been identified through a variety of approaches; for the Amharic language, however, there has been no research on detecting multiword expressions. This study aimed to develop a multiword expression detection model for the Amharic language using a supervised machine learning approach. A dataset of three thousand three hundred entries was collected from Amharic textbooks, the Amharic Bible, fiction, Amharic idiom books, Amharic dictionaries, and novels. We used an experimental research methodology to develop the model. TF-IDF and Keras embedding techniques were applied to vectorize the dataset for the traditional machine learning and deep learning models, respectively. The experimental results show that the MLP algorithm outperformed the SVM, LSTM, and BiLSTM algorithms, achieving an accuracy of 98.94 percent. We attribute this to the MLP's suitability for classification problems in which inputs are assigned a class or label, and to its ability to learn more complex patterns through its multiple layers of neurons. In general, further work on this problem will need a larger dataset that includes multiword expressions of more than two words. Keywords: NLP, Machine learning, Multiword, Multiword Expression, Multiword Expression Detection. en_US
dc.language.iso en_US en_US
dc.subject INFORMATION TECHNOLOGY en_US
dc.title Automatic Multiword Expressions Detection in Amharic language using machine learning approach en_US
dc.type Thesis en_US

