BDU IR

Hate Speech Detection for Amharic Language on Facebook Using Deep Learning

dc.contributor.author Melat, Fissha Atnafu
dc.date.accessioned 2022-11-18T08:28:12Z
dc.date.available 2022-11-18T08:28:12Z
dc.date.issued 2022-07
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/14487
dc.description.abstract Amharic, with 65 million speakers, is the second most spoken Semitic language in Sub-Saharan Africa after Arabic, which has over 300 million speakers. Some content on Facebook pages contains hate speech. Hate speech generally refers to expressions, speech, gestures, or writing that advocate, threaten, or encourage violent acts towards someone based on gender, religion, political view, or disability. In recent years, social activity over the internet, especially on Facebook, has increased dramatically. Unfortunately, social media platforms such as Facebook have also become channels for disseminating hate speech, which disrupts the social lives of many people and leads to conflict. To address this problem, this research develops an Amharic hate speech detection model using deep learning algorithms. New Amharic hate speech datasets were prepared from Facebook, Twitter, and YouTube, with posts and comments collected from selected social media groups and individual channels. Of the 308,160 posts and comments collected, 113,959 were used to train and test the models. Keras embedding layers are used for feature extraction in the deep learning models: Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Unit (GRU), and Multilayer Perceptron (MLP). Each of the four models was trained on 80% of the dataset and tested on the remaining 20%. The models were trained and tested individually, and their performance was evaluated using precision, recall, and F-measure, with promising results. In the experiments conducted, BiLSTM and GRU achieved the highest accuracy (91%), while LSTM and MLP achieved 90% accuracy.
One of the challenges during the experiments was scraping the dataset and labeling the scraped comments. Once these challenges were overcome, it was possible to detect Amharic hate speech with good performance. Keywords: Amharic language, Hate Speech, Classification, Word Embedding, Deep Learning en_US
dc.language.iso en_US en_US
dc.subject Faculty of Electrical and Computer Engineering en_US
dc.title Hate Speech Detection for Amharic Language on Facebook Using Deep Learning en_US
dc.type Thesis en_US
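
The abstract describes an 80%/20% train/test split and an evaluation based on precision, recall, and F-measure. The following is a minimal Python sketch of those two steps only, not the thesis's own code: the helper names `train_test_split` and `precision_recall_f1`, the fixed random seed, the binary label encoding (1 = hate, 0 = not hate), and the toy data are all assumptions made for illustration.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy data (hypothetical): ten labeled comments, 1 = hate, 0 = not hate.
examples = [(f"comment {i}", i % 2) for i in range(10)]
train, test = train_test_split(examples, test_fraction=0.2)

# Toy predictions (hypothetical) scored against toy gold labels.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In practice the predictions would come from a trained classifier such as the LSTM, BiLSTM, GRU, or MLP models named in the abstract, and the metrics would be computed on the held-out 20%.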