Abstract:
Question answering (QA) systems are widely useful because many deep learning problems can be modeled as question answering problems. However, QA is language dependent, and existing systems do not work for local languages such as Ge’ez, whose grammar and other features distinguish it from other languages. Existing systems also require manual feature engineering, linguistic tools, or other external resources. We address these problems by proposing a Ge’ez factoid question answering (GFQA) system that uses a deep learning approach to learn features directly from training documents. GFQA gives researchers a starting point for further research and serves university students who need short answers from a Ge’ez corpus. The proposed system comprises question analysis, document analysis, and answer extraction modules, together with an embedding layer and a model block. Our data were collected from Ethiopian Orthodox Tewahido Church (EOTC) sources, including the New Testament of the Bible, Dersane Mickael, and Akismaros, using data collection methods such as interviews and document analysis. We built our models with LSTM and BiLSTM architectures because our data call for sequential models. Each model was trained for 10 epochs with a batch size of 32 and a hidden size of 128. The corpus contains more than 30,000 sentences collected from the above-mentioned sources. From this corpus we prepared a total of 3200 question-answer pairs, which we split in a 70:15:15 ratio into training, testing, and validation sets of 2240, 480, and 480 pairs respectively. During testing, the LSTM model achieved 78.13% accuracy and the BiLSTM model achieved 79.17% accuracy. Hence, BiLSTM is the best-performing model for GFQA.
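The partition sizes reported above can be reproduced with a simple shuffled split. The following is a minimal sketch, not the authors' code: the function name `split_pairs`, the fixed seed, and the placeholder question-answer pairs are assumptions made for illustration only.

```python
import random


def split_pairs(pairs, n_train=2240, n_test=480, n_val=480, seed=0):
    """Shuffle question-answer pairs and cut them into three disjoint sets.

    The default sizes are the ones reported in the abstract; the seed is
    an arbitrary assumption used only to make the shuffle repeatable.
    """
    if n_train + n_test + n_val != len(pairs):
        raise ValueError("partition sizes must sum to the number of pairs")
    shuffled = list(pairs)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


# Placeholder data standing in for the 3200 real question-answer pairs.
pairs = [(f"question {i}", f"answer {i}") for i in range(3200)]
train, test, val = split_pairs(pairs)
print(len(train), len(test), len(val))
```

Shuffling before cutting keeps the three sets disjoint while avoiding any ordering bias from the source documents; whether the original study shuffled its data is not stated.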