Abstract:
Extremist attitudes expressed in social media posts and comments are often tied to groups that
spread propaganda against an individual, group, city, or country, which is a major concern.
Twitter, Facebook, YouTube, Instagram, TikTok, LinkedIn, and similar services are now widely used
social media platforms. Some opinions expressed on them spread extremist ideas, and some of the
content is considered harmful because it can be used to attack and insult people.
We propose a machine learning approach for classifying extremist attitude posts and comments
written in Afan Oromo on social media into five categories: political, religious, ethnic, racial,
and neutral. We experimented with 14,500 posts and comments collected with the Facepager tool from
the Afan Oromo Facebook pages of BBC, OBN, FBC, OLF, KFO, and Taye. Of these, 11,600 (80%) were
used for training and 2,900 (20%) for testing. The collected data were annotated and preprocessed
with lowercase conversion, link removal, email removal, number removal, special character removal,
short word removal, and tokenization. For feature extraction, we used count vectors and TF-IDF.
Finally, we trained seven machine learning algorithms (RFC, SVM, LG, NB, KNC, DTC, and NN) on the
training split and evaluated them on the test split.
Of the seven models, the Random Forest Classifier (RFC) with count-vector features achieved the
highest performance. The model was also evaluated on a test dataset, where it achieved an F1
score of 95%. It is therefore suitable for classifying extremist attitude posts and comments in
Afan Oromo from social media into the political, religious, racial, ethnic, and neutral categories.
Keywords: Afan Oromo, extremist attitude, extremism, sentiment analysis, deep learning,
classification, Twitter, Facebook, annotation.
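
The text processing and count-vector steps named in the abstract can be illustrated with a minimal
sketch. It is not the authors' implementation: the function names, the regular expressions, the
three-character threshold for "short" words, and the sample strings are all assumptions made for
illustration, and only the Python standard library is used.

```python
import re
from collections import Counter

# Assumed threshold: the abstract does not state how short a "short word" is.
MIN_TOKEN_LEN = 3


def preprocess(text, min_len=MIN_TOKEN_LEN):
    """Apply the cleaning steps listed in the abstract and return tokens."""
    text = text.lower()                                  # lowercase conversion
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # link removal
    text = re.sub(r"\S+@\S+", " ", text)                 # email removal
    text = re.sub(r"\d+", " ", text)                     # number removal
    # Special character removal. Keeping the apostrophe is an assumption
    # (it is used inside words in Afan Oromo orthography).
    text = re.sub(r"[^\w\s']|_", " ", text)
    tokens = text.split()                                # whitespace tokenization
    return [t for t in tokens if len(t) >= min_len]      # short word removal


def count_vectors(docs):
    """Build a sorted vocabulary and one term-count row per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            row[index[tok]] = n
        rows.append(row)
    return vocab, rows


# Made-up sample strings, used only to exercise the functions.
sample = [
    "Nagaa nagaa Oromiyaa",
    "Oromiyaa 123 biyya https://example.com",
]
tokenized = [preprocess(s) for s in sample]
vocab, rows = count_vectors(tokenized)
```

Each row of `rows` holds the per-document counts of the terms in `vocab`; a TF-IDF representation
would reweight these counts by how rare each term is across documents before they are passed to a
classifier.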