Abstract:
News is an actual or current event, and an event becomes news once it has been reported and broadcast. Text news is news presented in written form. Today, large volumes of unstructured text are released regularly, and the data is too large and difficult to organize manually. Suitable techniques for collecting and organizing this data are therefore needed, and text classification is one of them. Text classification is a machine-learning technique for categorizing text documents into a set of predefined groups. In previous studies, text news classification was done mainly at the document level, and no work has been attempted for Awngi text news. To address these problems, we have developed an automatic Awngi text news classification model. The experiments were performed on 1605 news documents collected at Amhara Media Corporation (Awngi section) from chair-bewa (ቼር-ቤዋ) journals. To build the classification models for Awngi news text, multinomial Naive Bayes, support vector machine (SVM), logistic regression, and XGBoost classifiers were used, achieving accuracies of 80%, 83%, 81%, and 70%, respectively. Based on these experimental results, the SVM with TF-IDF feature extraction performs best, at 83% accuracy. This is a promising result for constructing a better text news classification model using a support vector machine with the TF-IDF feature extraction scheme.
Keywords: Text classification, Machine learning technique, News classification, Awngi text