Abstract:
Causal relation extraction is a difficult but important task in Natural Language
Processing (NLP). Many approaches and techniques have been developed to tackle it.
These approaches are categorized as either rule-based (non-statistical) or
machine-learning-based (statistical) methods. Rule-based (non-statistical) approaches
require extensive manual work to design and construct handcrafted patterns and rules,
yet their precision and recall are low because of the complexity of causal relation
expressions in text. Statistical (machine learning) approaches, on the other hand,
rely either on sophisticated feature engineering, which is error-prone, or on large
amounts of labeled data, which is impractical for causal relation extraction.
To address these issues, we have
proposed a deep learning approach, a Convolutional Neural Network (CNN), for causal
relation extraction in this paper. The CNN uses word embeddings and position
embeddings. The word embedding represents each word as a vector, and the position
embedding encodes the position of the word in the given sentence, from head to tail.
Furthermore, we create additional semantic features that help identify causal relations
stated in ambiguous form and determine the direction of causality. To assess how well
the CNN extracts causal relations from text, we used our own data set, collected from
Amharic text in three domains: health, agriculture, and environment. The proposed model
outperforms current state-of-the-art models for causal relation extraction.
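
As an illustration of how word embeddings and position embeddings can feed a convolutional layer, the following is a minimal sketch, not the model evaluated in this paper. It assumes that a position is simply the token's index counted from the head of the sentence, and every table size, dimension, filter width, and token id in it is a hypothetical value chosen for the example. The embeddings are randomly initialised stand-ins for trained ones, and the additional semantic features and the final classifier are omitted.

```python
import random

def make_table(rows, dim, rng):
    """Randomly initialised lookup table (stands in for trained embeddings)."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]

def embed_sentence(token_ids, word_table, pos_table):
    """Concatenate each token's word vector with the vector for its position."""
    last = len(pos_table) - 1  # positions beyond the table share the last row
    return [word_table[tok] + pos_table[min(i, last)]  # list concatenation
            for i, tok in enumerate(token_ids)]

def conv_max_pool(x, filters, width):
    """Slide each filter over windows of `width` tokens, apply ReLU, keep the max."""
    pooled = []
    for f in filters:  # f is a flat weight list of length width * dim
        best = 0.0     # ReLU floor: negative responses are clipped to zero
        for start in range(len(x) - width + 1):
            window = [v for row in x[start:start + width] for v in row]
            best = max(best, sum(w * v for w, v in zip(f, window)))
        pooled.append(best)
    return pooled

rng = random.Random(0)
word_table = make_table(50, 8, rng)        # assumed vocabulary of 50 words, 8 dims
pos_table = make_table(20, 4, rng)         # assumed maximum of 20 positions, 4 dims
filters = make_table(6, 3 * (8 + 4), rng)  # 6 filters, each spanning 3 tokens

token_ids = [4, 17, 9, 30, 2]              # a made-up five-token sentence
x = embed_sentence(token_ids, word_table, pos_table)
features = conv_max_pool(x, filters, width=3)
```

Here `x` holds one 12-dimensional vector per token (8 word dimensions followed by 4 position dimensions), and `features` is a fixed-length vector with one value per filter, regardless of sentence length. In a full model, that vector would be passed to a classifier that decides whether a causal relation is present.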