BDU IR

Continuous Ethiopian Sign Language Recognition using CNN and Bidirectional LSTM


dc.contributor.author Shimels, Alem
dc.date.accessioned 2023-06-19T07:21:33Z
dc.date.available 2023-06-19T07:21:33Z
dc.date.issued 2022-08
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/15393
dc.description.abstract Sign language is an independent language that conveys meaning through gestures and body language. People who rely on sign language are often unable to communicate effectively with people who do not know it, so an application that translates sign language to text would be useful for many people. Several studies on sign language recognition have been conducted all over the world for various sign languages. Since signs and linguistic features differ from one sign language to another, an algorithm that recognizes one sign language may not be applicable to another. Several studies have been conducted on Ethiopian Sign Language (EthSL) recognition, but the majority are limited to recognizing finger spelling and isolated words. Only one study has addressed continuous EthSL recognition; however, it relies on specialized equipment (a Kinect sensor) and achieves comparatively low accuracy. To fill these gaps, we propose a continuous EthSL recognition model using Bidirectional Long Short-Term Memory (BiLSTM) and Connectionist Temporal Classification (CTC). This study uses video captured with a standard digital camera as input. First, we collected a total of 420 sentence-level sign videos covering 10 unique sentences. The video frames are then extracted and passed through a sequence of preprocessing steps: resizing, noise removal, and segmentation. Next, the spatial features of each frame are extracted using a Convolutional Neural Network (CNN) and stored in a single .csv file. Finally, temporal dependencies are modeled using the BiLSTM. To avoid explicit temporal segmentation and achieve end-to-end continuous EthSL recognition, we apply CTC on top of the network. We conducted three experiments to select the proposed model, obtaining a Word Error Rate (WER) of 33% with Long Short-Term Memory (LSTM) and CTC, 38% with Gated Recurrent Units (GRU) and CTC, and 32% with BiLSTM-CTC. The experimental results show that BiLSTM-CTC achieves the lowest WER and therefore the highest recognition accuracy. Keywords: Convolutional Neural Network, Bidirectional Long Short-Term Memory, Connectionist Temporal Classification, Continuous Ethiopian Sign Language Recognition. en_US
dc.language.iso en_US en_US
dc.subject Computing en_US
dc.title Continuous Ethiopian Sign Language Recognition using CNN and Bidirectional LSTM en_US
dc.type Thesis en_US
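
The abstract describes a frame-extraction and preprocessing stage (resizing and noise removal). Below is a minimal sketch of such a stage, assuming OpenCV; the frame size and blur kernel are illustrative choices, not values from the thesis, and the segmentation step is omitted because the abstract does not specify its method.

    import cv2

    def extract_frames(video_path, size=(224, 224)):
        """Read a sign video and return a list of preprocessed frames."""
        frames = []
        cap = cv2.VideoCapture(video_path)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame = cv2.resize(frame, size)             # resizing step
            frame = cv2.GaussianBlur(frame, (3, 3), 0)  # simple noise removal
            frames.append(frame)
        cap.release()
        return frames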
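The recognition network combines per-frame CNN feature extraction, a BiLSTM over the frame sequence, and CTC to align the outputs with the word sequence without explicit temporal segmentation. A minimal TensorFlow/Keras sketch follows; the layer sizes, frame shape, and 20-word vocabulary are illustrative assumptions, and the stages are wired end-to-end for brevity, whereas the thesis stores the CNN features in a .csv file first.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    NUM_WORDS = 20                # assumed vocabulary size, not from the thesis
    FRAME_SHAPE = (224, 224, 3)   # assumed preprocessed frame shape

    # Per-frame CNN feature extractor, applied across time via TimeDistributed.
    cnn = models.Sequential([
        tf.keras.Input(shape=FRAME_SHAPE),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),  # one feature vector per frame
    ])

    video = tf.keras.Input(shape=(None,) + FRAME_SHAPE)  # variable-length frame sequence
    x = layers.TimeDistributed(cnn)(video)                # spatial features: (batch, time, 64)
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    logits = layers.Dense(NUM_WORDS + 1)(x)               # word classes plus a CTC blank class
    model = models.Model(video, logits)

    def ctc_loss(labels, label_lengths, logits, logit_lengths):
        """CTC loss for a batch; the blank symbol is the last class index."""
        return tf.reduce_mean(tf.nn.ctc_loss(
            labels=labels,               # (batch, max_label_len) int32 word IDs
            logits=logits,               # (batch, time, NUM_WORDS + 1) unnormalized scores
            label_length=label_lengths,  # (batch,) true label length per sample
            logit_length=logit_lengths,  # (batch,) number of frames per video
            logits_time_major=False,
            blank_index=NUM_WORDS,
        ))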
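The reported results are in Word Error Rate (WER): the word-level edit distance (substitutions, deletions, and insertions) between the recognized sentence and the reference, divided by the reference length. A minimal sketch of the metric:

    def wer(reference, hypothesis):
        """Word Error Rate between two sentences (reference must be non-empty)."""
        ref, hyp = reference.split(), hypothesis.split()
        # Levenshtein distance over words via dynamic programming.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
        return d[len(ref)][len(hyp)] / len(ref)

For example, wer("he goes to school", "he school") counts two deletions against a four-word reference, giving 0.5.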

