BDU IR

Amharic Visual Question Answering on Ethiopian Tourism

dc.contributor.author Alebachew, Molla Dinku
dc.date.accessioned 2025-02-24T08:06:34Z
dc.date.available 2025-02-24T08:06:34Z
dc.date.issued 2024-11
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/16481
dc.description.abstract Visual Question Answering (VQA) is a Vision-to-Text (V2T) task that integrates visual features of images with natural language questions to generate meaningful responses. Most existing research has focused on English, leaving a significant gap for other languages, including Amharic. Tourism, a major global industry, relies heavily on interactions in which visitors seek information about natural, historical, cultural, and religious sites. Ethiopia is a remarkable tourist destination, home to unique sites, and most of its visitors are local, creating an urgent need for a VQA model that can deliver accurate, culturally relevant information in Amharic. Unfortunately, no such model currently exists to assist tourists at these heritage sites. This research addresses the gap by developing an Amharic Visual Question Answering model specifically tailored for Ethiopian tourism. A new Amharic VQA dataset was created using 2,200 diverse images from Ethiopian tourist sites paired with 6,600 questions in Amharic. The dataset was collected from various sources, including the UNESCO website, the Amhara Tourism office, and online platforms such as Facebook, Free pixel, and Instagram. Each image is accompanied by three questions formulated by three individual experts and answered by ten candidates. The questions, answers, and images are linked through annotations and fed into the model. We used ResNet-50 for image feature extraction and a Bidirectional Gated Recurrent Unit (BiGRU) with attention mechanisms, achieving a test accuracy of 54.98%, which demonstrates the model's effectiveness in answering questions about Ethiopian heritage. We will extend this research by using external knowledge, to obtain answers and descriptions beyond the image, and custom object detection. Keywords: Amharic Language; Ethiopian tourism; Deep learni en_US
dc.language.iso en_US en_US
dc.subject Computer Science en_US
dc.title Amharic Visual Question Answering on Ethiopian Tourism en_US
dc.type Thesis en_US
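
The abstract states that questions, answers, and images are linked through annotations, with three questions per image and ten answers per question. A minimal sketch of what such a linking record might look like is given below. The class name, field names, and the majority-vote rule for picking a target answer are assumptions made for illustration; the record does not describe the thesis's actual annotation schema or how the ten answers are combined.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class VqaAnnotation:
    """Hypothetical record linking one image to one question and its answers."""
    image_id: int
    question_id: int
    question: str  # Amharic question text (left empty in this sketch)
    answers: List[str] = field(default_factory=list)  # one answer per candidate

    def consensus_answer(self) -> str:
        # Assumption: the most frequent candidate answer is used as the label.
        return Counter(self.answers).most_common(1)[0][0]


def build_index(num_images: int, questions_per_image: int) -> List[VqaAnnotation]:
    """Create one empty annotation per (image, question) pair."""
    records = []
    for image_id in range(num_images):
        for k in range(questions_per_image):
            question_id = image_id * questions_per_image + k
            records.append(VqaAnnotation(image_id, question_id, question=""))
    return records


# Counts taken from the abstract: 2,200 images, three questions each.
records = build_index(num_images=2200, questions_per_image=3)
print(len(records))  # 6600, matching the 6,600 questions reported

# Placeholder answers (not real data) from ten hypothetical candidates.
records[0].answers = ["answer_a"] * 7 + ["answer_b"] * 3
print(records[0].consensus_answer())  # answer_a
```

The count check reproduces the figure given in the abstract (2,200 images × 3 questions = 6,600 questions); everything else in the sketch is illustrative only.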

