BDU IR

INFORMATION RETRIEVAL FOR SILT’E TEXT USING LATENT SEMANTIC INDEXING


dc.contributor.author Yoseph, Mare Guwta
dc.date.accessioned 2022-03-18T07:11:38Z
dc.date.available 2022-03-18T07:11:38Z
dc.date.issued 2021-10
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/13223
dc.description.abstract Information retrieval is a mechanism for finding relevant, unstructured material in a large collection that satisfies a user’s information need. Because the same concept can usually be expressed in many ways, the terms in a user’s query may not appear in a relevant document. Conversely, many words have more than one meaning, which may confuse the retrieval system. This research applies latent semantic indexing to handle synonymous and polysemous words in Silt’e text documents and user queries. The Silt’e text retrieval system developed in this study has indexing and searching subsystems. Indexing organizes index terms, while searching matches query terms against index terms in order to retrieve relevant documents. For the experiments, 700 Silt’e text documents and 56 queries were used to test the prototype of the system. The Silt’e text corpus was prepared by the researcher from various reports of the Silt’e culture and tourism bureau and from books. Text preprocessing techniques, including tokenization, normalization, stop word removal and stemming, were used to identify content-bearing words. Experimental results show that the prototype achieved, on average, 68% recall, 79% precision and 72% F-measure. The major challenges affecting the performance of the IR prototype are the lack of a standard dataset for the Silt’e language and the ineffectiveness of the Silt’e stemmer at conflating inflected Silt’e words to their stems. Therefore, improving the performance of the prototype requires developing a standard Silt’e dataset and an effective Silt’e stemmer. Keywords: Information Retrieval, Latent Semantic Indexing, Singular Value Decomposition en_US
dc.language.iso en_US en_US
dc.subject INFORMATION TECHNOLOGY en_US
dc.title INFORMATION RETRIEVAL FOR SILT’E TEXT USING LATENT SEMANTIC INDEXING en_US
dc.type Thesis en_US
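
The abstract describes retrieval by latent semantic indexing over a term-document matrix reduced with singular value decomposition. The thesis implementation is not part of this record, so the following is only a minimal sketch of the general technique, assuming NumPy, raw term counts as weights, and cosine similarity for ranking. The function names (`build_term_document_matrix`, `rank_documents`) and the toy English tokens, which stand in for already preprocessed Silt’e terms, are hypothetical.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Build a raw-count term-document matrix from tokenized documents.

    `docs` is a list of token lists that are assumed to be already
    tokenized, normalized, stop-word filtered and stemmed.
    """
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for term in doc:
            matrix[index[term], j] += 1.0
    return matrix, index


def rank_documents(query_tokens, docs, k=2):
    """Rank documents against a query in a k-dimensional latent space.

    Returns a list of (document index, cosine similarity) pairs,
    best match first. Query terms missing from the corpus are ignored.
    """
    matrix, index = build_term_document_matrix(docs)

    # Truncated SVD: keep the k largest singular directions.
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    k = min(k, len(s))
    u_k = u[:, :k]

    # Project documents and query with the same basis U_k, so both
    # live in the same latent space (A^T U_k equals V_k S_k).
    doc_vecs = matrix.T @ u_k
    query = np.zeros(matrix.shape[0])
    for term in query_tokens:
        if term in index:
            query[index[term]] += 1.0
    query_vec = query @ u_k

    # Cosine similarity, guarding against zero-length vectors.
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (doc_vecs @ query_vec) / np.maximum(norms, 1e-12)

    order = np.argsort(-sims)
    return [(int(i), float(sims[i])) for i in order]


# Toy corpus (hypothetical placeholder tokens, not Silt'e data).
toy_docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["flower", "garden"],
]

if __name__ == "__main__":
    for doc_id, score in rank_documents(["car"], toy_docs, k=2):
        print(doc_id, round(score, 3))
```

In this toy corpus the query term appears only in the first document, yet the second document can still be matched because it shares a co-occurring term with the first. That is the synonymy effect the abstract refers to. The choice of `k` and of the term weighting are assumptions here and would need tuning on a real corpus.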

