BDU IR

Sarcasm Detection Model for Amharic Lemmatize Text Using Machine Learning

dc.contributor.author Biniyam, Damtie
dc.date.accessioned 2024-12-06T07:16:46Z
dc.date.available 2024-12-06T07:16:46Z
dc.date.issued 2023-06
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/16300
dc.description.abstract By enabling machines to perform human-language tasks in place of human beings, natural language processing (NLP) can fill real-world gaps and make it possible to solve more than fifty percent of the world's problems. NLP covers tasks such as natural language inference, semantic similarity measurement, semantic analysis, semantic role labeling, sentiment analysis, and sarcasm detection. These tasks focus on understanding the meaning and role of each word in a sentence. Opinion mining is a typical natural language understanding task that identifies the polarity of a given sentence or document. However, it is very challenging because of sarcastic expressions. In this study, we propose a sarcasm detection model for the Amharic language based on classic (shallow) machine learning approaches. We collected data from Abe Tokyo Amharic Shimut and Mitsetoch Book, his Facebook channel, and other Telegram and Facebook channels focusing on sarcasm. We annotated the collected sentences as sarcastic or non-sarcastic. After preprocessing (normalization, tokenization, and removal of non-Amharic components), we extracted features, namely inter-class and intra-class frequency. We then applied threshold-based feature selection: a minimum threshold for intra-class frequency and a maximum threshold for inter-class frequency. Finally, we implemented Artificial Neural Network (ANN), Nearest Neighbor, Support Vector Machine, Random Forest, and Naïve Bayes classifiers. In our experiments, ANN outperformed the traditional machine learning classifiers, achieving 98.09% training accuracy and 94.05% testing accuracy. Machine learning algorithms cannot process raw text as input, so the text must be encoded in another format; TF-IDF was applied to vectorize the dataset, that is, to encode the text numerically for the traditional machine learning models. The goal of sarcasm detection is to determine whether a sentence is sarcastic or non-sarcastic. Key Words: Amharic Lemmatize Text, Sarcasm Detection Model en_US
dc.language.iso en_US en_US
dc.subject Information Technology en_US
dc.title Sarcasm Detection Model for Amharic Lemmatize Text Using Machine Learning en_US
dc.type Thesis en_US
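
The abstract's preprocessing and threshold-based feature selection steps might look roughly like the sketch below. It is not the thesis's implementation. It assumes that "intra-class frequency" means the number of sentences of the target class containing a token and that "inter-class frequency" means the number of sentences of the other classes containing it; the function names, the threshold defaults, and the sample sentences are hypothetical. Character normalization is omitted, and non-Amharic components are approximated as anything outside the Unicode Ethiopic blocks.

```python
import re
from collections import Counter

# Assumption: "non-Amharic components" = characters outside the Unicode
# Ethiopic block (U+1200-U+137F) and Ethiopic Supplement (U+1380-U+139F).
NON_ETHIOPIC = re.compile(r"[^\u1200-\u137F\u1380-\u139F\s]")
# Ethiopic punctuation marks (section mark, wordspace, full stop, ...).
ETHIOPIC_PUNCT = re.compile(r"[\u1360-\u1368]")


def preprocess(sentence):
    """Remove non-Ethiopic characters and punctuation, then tokenize on whitespace."""
    cleaned = NON_ETHIOPIC.sub(" ", sentence)
    cleaned = ETHIOPIC_PUNCT.sub(" ", cleaned)
    return cleaned.split()


def class_frequencies(samples):
    """For each label, count how many of its sentences contain each token."""
    freq = {}
    for text, label in samples:
        counter = freq.setdefault(label, Counter())
        counter.update(set(preprocess(text)))  # set(): count once per sentence
    return freq


def select_features(samples, target, min_intra=2, max_inter=0):
    """Keep tokens frequent in the target class and rare in the other classes.

    min_intra and max_inter are hypothetical thresholds, not values from the thesis.
    """
    freq = class_frequencies(samples)
    intra = freq.get(target, Counter())
    selected = []
    for token, count in intra.items():
        inter = sum(c[token] for lbl, c in freq.items() if lbl != target)
        if count >= min_intra and inter <= max_inter:
            selected.append(token)
    return sorted(selected)


# Made-up sentences for illustration only (not from the thesis dataset).
samples = [
    ("ሰላም ነው hello", "sarcastic"),
    ("ሰላም ነው።", "sarcastic"),
    ("ነው ጥሩ", "non-sarcastic"),
]
features = select_features(samples, "sarcastic")
```

The selected tokens would then form the vocabulary that a vectorizer such as TF-IDF encodes before the classifiers are trained.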