BDU IR

AMHARIC SPEECH TO ETHIOPIAN SIGN LANGUAGE TRANSLATION

dc.contributor.author SELAMAWIT, BELAY
dc.date.accessioned 2025-02-24T08:03:04Z
dc.date.available 2025-02-24T08:03:04Z
dc.date.issued 2025-06
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/16480
dc.description.abstract Ethiopian Sign Language (EthSL) is the primary means of communication for the deaf community in Ethiopia. However, the lack of accessible tools that convert Amharic speech into EthSL makes communication between hearing and deaf people difficult. To close this gap, this study develops a deep learning-based model that translates Amharic speech into EthSL video. Four deep learning architectures were compared: Long Short-Term Memory (LSTM) networks, Variational Autoencoders (VAEs), Recurrent Neural Networks (RNNs), and Convolutional Neural Networks (CNNs). After hyperparameter optimization, the LSTM achieved the highest accuracy, improving from 91% to 97%. The study followed a design science methodology to ensure model performance, interpretability, and usability, and applied Integrated Gradients to make the model's decision-making process more transparent. A Flutter-based mobile application embedding the trained LSTM model was evaluated for usability through surveys. Results show that the LSTM-based model considerably improves the accuracy of Amharic speech-to-EthSL translation compared with the other deep learning models, and the prototype improves accessibility for the deaf community by enabling real-time generation of EthSL video from Amharic speech. By offering a dependable and interpretable solution, this research advances assistive technology, promotes inclusivity, and lays the groundwork for future work in Amharic speech-to-sign language translation. en_US
dc.language.iso en_US en_US
dc.subject Software Engineering en_US
dc.title AMHARIC SPEECH TO ETHIOPIAN SIGN LANGUAGE TRANSLATION en_US
dc.type Thesis en_US