Abstract:
The connectivity and accessibility of social media platforms worldwide allow people to
express their ideas and share experiences easily. However, the anonymity and flexibility
afforded by the Internet have also made it easy for users to communicate aggressively. Hate
speech harms society in many ways: it affects the mental health of targeted audiences,
disrupts social interaction, and can lead to violence and destruction of property.
Identifying text that contains hate speech by hand is difficult: it is time-consuming and
tedious, and it depends on subjective notions of what makes a text hateful or offensive.
To address this problem, this research develops a detection model for Amharic hate speech
texts using deep learning approaches. We prepare a new Amharic hate speech dataset from
Facebook and Twitter, labeled into four classes, and then augment the data to balance the
classes. Word2vec embeddings and word embeddings learned by a Keras embedding layer are
used as features for the deep learning models. CNN, LSTM, Bi-LSTM, GRU, and combined
CNN-LSTM models are trained on the whole dataset, both with the Word2vec embedding
features and with features generated automatically by the embedding layer, for the
augmented and the original datasets. We evaluate the models using an 80/20 train-test
split and compare them using precision, recall, and F1-score.
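
The evaluation protocol described above can be illustrated with a minimal sketch. It is
not the study's code: the function names, the integer class labels 0 to 3, the fixed
random seed, and the macro-averaging of the F1-score are all assumptions made here for
illustration, since the abstract does not state how the per-class scores were combined.

```python
import random
from collections import Counter


def train_test_split_80_20(samples, seed=0):
    """Shuffle the samples and split them into 80% train and 20% test.

    The seed is a hypothetical choice so that the split is reproducible.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1-scores (assumed averaging)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy labels for four classes; these are placeholders, not study data.
    y_true = [0, 0, 1, 1, 2, 3]
    y_pred = [0, 1, 1, 1, 2, 2]
    scores = per_class_scores(y_true, y_pred)
    for label, (p, r, f) in scores.items():
        print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
    print(f"macro F1 = {macro_f1(scores):.2f}")
```

Per-class scores are kept separate in this sketch because confusion between individual
classes is easier to see there than in a single averaged number.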
Using the two datasets, the study developed five different models with each feature type,
on both the original and the augmented data. The Bi-LSTM model with Word2vec achieves
slightly better performance than the other models on both the augmented and original datasets.
According to the classification results, the model trained with augmented data shows
slightly less confusion between the offensive class and the both (hate and offensive) class
than the model trained without augmentation. However, the models mostly tend to misclassify
hate speech as both (hate and offensive) speech. Overall, Bi-LSTM achieves the highest
F1-score (90%), and the CNN classifier achieves an F1-score of 89%.