BDU IR

Continuous Ethiopian Sign Language Recognition using CNN and Bidirectional LSTM


dc.contributor.author Shimels, Alem
dc.date.accessioned 2023-06-19T07:21:33Z
dc.date.available 2023-06-19T07:21:33Z
dc.date.issued 2022-08
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/15393
dc.description.abstract Sign language is an independent language that conveys meaning through gestures and body language. People who rely on sign language are often unable to communicate effectively with people who do not know it, so an application that translates sign language to text would be useful for many people. Several studies on sign language recognition have been conducted all over the world for various sign languages. Since signs and linguistic features differ from one sign language to another, an algorithm that recognizes one sign language may not be applicable to another. Several studies have been conducted on Ethiopian Sign Language (EthSL) recognition, but the majority are limited to recognizing finger spelling and isolated words. Only one study has addressed continuous EthSL recognition; however, it relies on specialized equipment (a Kinect sensor) and achieves comparatively low accuracy. To fill these gaps, we propose a continuous EthSL recognition model using Bidirectional Long Short-Term Memory (BiLSTM) and Connectionist Temporal Classification (CTC). This study uses video captured with a standard digital camera as input. First, we collected a total of 420 sentence-level sign videos covering 10 unique sentences. The video frames are then extracted and passed through a sequence of preprocessing steps: resizing, noise removal, and segmentation. Next, the spatial features of each frame are extracted using a Convolutional Neural Network (CNN) and stored in a single .csv file. Finally, temporal dependencies are modeled using the BiLSTM. To avoid explicit temporal segmentation and achieve end-to-end continuous EthSL recognition, we apply CTC on top of the network. We conducted three experiments to select the proposed model, obtaining a Word Error Rate (WER) of 33% with Long Short-Term Memory (LSTM) and CTC, 38% with Gated Recurrent Units (GRU) and CTC, and 32% with BiLSTM-CTC. The experimental results show that BiLSTM-CTC achieves the lowest WER and therefore the highest recognition accuracy. Keywords: Convolutional Neural Network, Bidirectional Long Short-Term Memory, Connectionist Temporal Classification, Continuous Ethiopian Sign Language Recognition. en_US
dc.language.iso en_US en_US
dc.subject Computing en_US
dc.title Continuous Ethiopian Sign Language Recognition using CNN and Bidirectional LSTM en_US
dc.type Thesis en_US
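
The abstract describes a frame-extraction and preprocessing stage (resizing and noise removal). Below is a minimal sketch of such a stage, assuming OpenCV; the frame size and blur kernel are illustrative choices, not values from the thesis, and the segmentation step is omitted because the abstract does not specify its method.

    import cv2

    def extract_frames(video_path, size=(224, 224)):
        """Read a sign video and return a list of preprocessed frames."""
        frames = []
        cap = cv2.VideoCapture(video_path)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame = cv2.resize(frame, size)             # resizing step
            frame = cv2.GaussianBlur(frame, (3, 3), 0)  # simple noise removal
            frames.append(frame)
        cap.release()
        return frames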
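The recognition network combines per-frame CNN feature extraction, a BiLSTM over the frame sequence, and CTC to align the outputs with the word sequence without explicit temporal segmentation. A minimal TensorFlow/Keras sketch follows; the layer sizes, frame shape, and 20-word vocabulary are illustrative assumptions, and the stages are wired end-to-end for brevity, whereas the thesis stores the CNN features in a .csv file first.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    NUM_WORDS = 20                # assumed vocabulary size, not from the thesis
    FRAME_SHAPE = (224, 224, 3)   # assumed preprocessed frame shape

    # Per-frame CNN feature extractor, applied across time via TimeDistributed.
    cnn = models.Sequential([
        tf.keras.Input(shape=FRAME_SHAPE),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),  # one feature vector per frame
    ])

    video = tf.keras.Input(shape=(None,) + FRAME_SHAPE)  # variable-length frame sequence
    x = layers.TimeDistributed(cnn)(video)                # spatial features: (batch, time, 64)
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    logits = layers.Dense(NUM_WORDS + 1)(x)               # word classes plus a CTC blank class
    model = models.Model(video, logits)

    def ctc_loss(labels, label_lengths, logits, logit_lengths):
        """CTC loss for a batch; the blank symbol is the last class index."""
        return tf.reduce_mean(tf.nn.ctc_loss(
            labels=labels,               # (batch, max_label_len) int32 word IDs
            logits=logits,               # (batch, time, NUM_WORDS + 1) unnormalized scores
            label_length=label_lengths,  # (batch,) true label length per sample
            logit_length=logit_lengths,  # (batch,) number of frames per video
            logits_time_major=False,
            blank_index=NUM_WORDS,
        ))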
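The reported results are in Word Error Rate (WER): the word-level edit distance (substitutions, deletions, and insertions) between the recognized sentence and the reference, divided by the reference length. A minimal sketch of the metric:

    def wer(reference, hypothesis):
        """Word Error Rate between two sentences (reference must be non-empty)."""
        ref, hyp = reference.split(), hypothesis.split()
        # Levenshtein distance over words via dynamic programming.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
        return d[len(ref)][len(hyp)] / len(ref)

For example, wer("he goes to school", "he school") counts two deletions against a four-word reference, giving 0.5.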

