Abstract:
Natural Language Processing deals with natural language understanding (NLU) and
natural language generation, which enable computers to understand and analyze human
language. Cross-lingual Textual Entailment is an application of NLU in which, given a
premise (P) in a source language and a hypothesis (H) in a target language, the task is
to decide whether the inferential relationship between the premise and the hypothesis is
forward entailment, backward entailment, bidirectional entailment, contradiction, or
neutral. Recognizing Cross-lingual Textual Entailment is challenging when transferring
information between an Ethiopian Semitic language (Amharic) and a foreign language
(English). Amharic is a structurally complex language, so an application developed for
foreign or other Ethiopian languages cannot be applied to it directly. In this study, we
propose a Cross-lingual Textual Entailment model based on deep neural network
approaches. For sentence embedding, we use a hybrid of XLNet and Bi-LSTM. Within
XLNet, we implement Transformer-XL, a multi-head attention mechanism, and relative
position embedding. Neural machine translation (NMT) with IBM Model 5 alignment is
used to translate English sentences into Amharic. In the translation step, we combine
Bi-LSTM with a Transformer (multi-head attention and relative position embedding).
We also implement cross-lingual embedding and compare its performance with NMT.
We combine the Amharic dataset with the SNLI dataset and annotate the data for
multi-way classification. The NMT correctly predicts 96.87% of the training dataset.
We obtain 86.89% testing accuracy, outperforming the Bi-LSTM, XLNet, and
Bi-LSTM-with-Transformer models by 8.88%, 6.73%, and 5.56%, respectively, using
10 training epochs.
In general, the deep learning-based Cross-lingual Textual Entailment model achieves
89.92%. A limitation of this research is that it ignores multi-sentence inference, which
is a major issue requiring further investigation. In addition, it does not use word sense
disambiguation. We therefore recommend integrating word sense disambiguation to
enhance the performance of Cross-lingual Textual Entailment.
Keywords: Cross-lingual Textual Entailment, deep learning algorithms, hybrid model,
pre-trained models, cross-lingual embedding, translation, concatenation, classifiers.