Abstract:
Identifying rhetorical roles in legal judgment documents is essential for understanding their structure and content, but the lack of clear headings, poor organization, and complexity of these documents make manual analysis time-consuming and inconsistent. Legal professionals struggle to locate critical information such as facts or arguments efficiently, which hinders legal research and case preparation, introduces variability in legal analysis, and delays case outcomes.
This study develops a multi-label rhetorical role identification model for Amharic legal judgment documents using both machine learning and deep learning approaches. A dataset of 7,000 annotated sentences, comprising 190,975 words and 18,137 unique terms, was built from 300 legal judgment documents collected from the Amhara Region Supreme Court. Three annotators labeled the sentences using the Potato annotation tool, and a judge served as the corrector. The data was split 80/20 into training and test sets, with 10% of the training data reserved for validation. The models evaluated include the deep learning approaches LSTM, Bi-LSTM, and Bi-GRU, alongside the traditional machine learning models SVM, Random Forest, Gradient Boosting, and Logistic Regression.
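As a rough illustration of the split described above, the following sketch computes the resulting partition sizes. It assumes the split is made at the sentence level over all 7,000 sentences and that counts are rounded down; neither detail is stated here, and the function name `split_counts` is hypothetical.

```python
def split_counts(total, test_pct=20, val_pct_of_train=10):
    """Return (train, validation, test) sizes for a percentage-based split.

    Assumption: the split is applied to individual sentences, and the
    validation set is carved out of the training portion.
    """
    test = total * test_pct // 100                  # 20% held out for testing
    train_full = total - test                       # remaining 80%
    val = train_full * val_pct_of_train // 100      # 10% of the training portion
    train = train_full - val                        # what is left for fitting
    return train, val, test


train, val, test = split_counts(7000)
print(train, val, test)  # 5040 560 1400
```

Under these assumptions, roughly 72% of the sentences are used for fitting, 8% for validation, and 20% for testing.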
The evaluation showed that the Bi-LSTM model achieved the best overall performance, with a precision of 89%, recall of 85%, F1 score of 87%, and a Hamming loss of 0.042. Random Forest performed comparably, with a precision of 92%, recall of 81%, F1 score of 86%, and a Hamming loss of 0.049. In contrast, the other traditional models (SVM, Gradient Boosting, and Logistic Regression) exhibited lower performance. By capturing contextual information and sequence dependencies, Bi-LSTM identified rhetorical roles in complex legal documents more accurately than the traditional machine learning models and handled imbalanced data effectively. LIME was used to interpret the Bi-LSTM model, making its sentence-level predictions more transparent.
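To make the reported metrics concrete, the sketch below computes Hamming loss and precision, recall, and F1 for a multi-label prediction matrix. The toy labels are invented for illustration only and are not drawn from the study's data. Micro-averaging is an assumption, since the averaging scheme is not stated here.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    n_slots = len(y_true) * len(y_true[0])
    wrong = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / n_slots


def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over all label slots."""
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example: 4 sentences, 3 hypothetical rhetorical-role labels each.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

print(hamming_loss(y_true, y_pred))  # 0.25 (3 wrong slots out of 12)
precision, recall, f1 = micro_prf(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.667 0.727
```

Because F1 is the harmonic mean of precision and recall, the reported Bi-LSTM values are internally consistent: a precision of 0.89 and a recall of 0.85 give an F1 of about 0.87. A lower Hamming loss is better, since it counts the share of label assignments that are wrong.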
Keywords: Deep learning, legal judgment, LIME, machine learning, multi-label classification, rhetorical role identification