Abstract:
The presence of anaphoric terms in a text affects natural language
processing tasks such as machine translation, semantic analysis, and sentiment analysis. This
study uses deep learning and machine learning approaches to design an anaphoric term
identification and resolution model for the Amharic language. A total of 4,524 Amharic
input sentences were collected for anaphoric term identification and resolution. The data sources for
this study are Amharic textbooks, the Amharic Holy Bible, the Amharic Holy Quran, Amharic
news, and fiction. The data were then annotated by expert annotators using annotation
guidelines developed by the researcher. Preprocessing tasks such as stop word and
punctuation mark removal, normalization, POS tagging, and morphological analysis were applied.
We experimented with Naive Bayes (NB) and Random Forest (RF) as machine learning algorithms, and
with LSTM and BiLSTM as deep learning algorithms. These models achieved accuracies of 85%, 93%, 95%, and
98%, respectively. The experimental results show that the proposed BiLSTM
model outperforms NB, RF, and LSTM.
Keywords: Anaphora resolution, anaphoric term identification, independent anaphor,
Anaconda Jupyter Notebook, antecedent.