INFORMATION RETRIEVAL FOR SILT’E TEXT USING LATENT SEMANTIC INDEXING

Yoseph, Mare Guwta

dc.contributor.author	Yoseph, Mare Guwta
dc.date.accessioned	2022-03-18T07:11:38Z
dc.date.available	2022-03-18T07:11:38Z
dc.date.issued	2021-10
dc.identifier.uri	http://ir.bdu.edu.et/handle/123456789/13223
dc.description.abstract	Information retrieval (IR) is the task of finding unstructured material in a large collection that satisfies a user’s information need. Because the same concept can usually be expressed in many ways, the terms in a user’s query may not appear in a relevant document. Conversely, many words have more than one meaning, which can confuse a retrieval system. This research applies latent semantic indexing to handle synonymous and polysemous words in Silt’e text documents and user queries. The Silt’e text retrieval system developed in this study has indexing and searching subsystems: indexing organizes the index terms, and searching matches query terms against index terms in order to retrieve relevant documents. For the experiments, 700 Silt’e text documents and 56 queries were used to test the prototype of the system. The Silt’e text corpus was prepared by the researcher and encompasses various reports from the Silt’e culture and tourism bureau as well as books. Text preprocessing techniques, including tokenization, normalization, stop word removal and stemming, were used to identify content-bearing words. The experimental results show that the prototype achieved, on average, 68% recall, 79% precision and 72% F-measure. The major challenges affecting the performance of the IR prototype are the lack of a standard dataset for the Silt’e language and the ineffectiveness of the Silt’e stemmer at conflating inflected Silt’e words to their stems. Therefore, to improve the performance of the prototype, a standard Silt’e dataset and an effective Silt’e stemmer need to be developed. Keywords: Information Retrieval, Latent Semantic Indexing, Singular Value Decomposition	en_US
dc.language.iso	en_US	en_US
dc.subject	INFORMATION TECHNOLOGY	en_US
dc.title	INFORMATION RETRIEVAL FOR SILT’E TEXT USING LATENT SEMANTIC INDEXING	en_US
dc.type	Thesis	en_US
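
The abstract describes an indexing subsystem, a searching subsystem based on latent semantic indexing, and an evaluation using precision, recall and F-measure. The following is a minimal sketch of that general technique, not the thesis's implementation. It assumes NumPy is available, uses a four-document English toy corpus as a stand-in for the Silt’e data, tokenizes on whitespace only (no normalization, stop word removal or stemming), and picks k = 2 latent dimensions arbitrarily. The names `to_vector`, `search` and `precision_recall_f1` are hypothetical.

```python
import numpy as np

# Toy English stand-in corpus (assumption: not the thesis's Silt'e documents).
docs = [
    "car engine repair",
    "automobile engine repair",
    "coffee harvest market",
    "coffee market price export",
]

def tokenize(text):
    # Whitespace tokenization only; a real system would also normalize,
    # remove stop words and stem, as the abstract describes.
    return text.lower().split()

# Indexing: build the vocabulary and the term-document count matrix A.
vocab = sorted({tok for d in docs for tok in tokenize(d)})
term_index = {tok: i for i, tok in enumerate(vocab)}

def to_vector(text):
    vec = np.zeros(len(vocab))
    for tok in tokenize(text):
        if tok in term_index:
            vec[term_index[tok]] += 1.0
    return vec

A = np.column_stack([to_vector(d) for d in docs])  # shape: (terms, docs)

# Truncated SVD: keep the k largest singular directions (k is an assumption).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k = U[:, :k]

# Project every document into the k-dimensional latent space.
doc_vecs = (U_k.T @ A).T  # shape: (docs, k)

def search(query):
    # Searching: project the query the same way, then rank by cosine similarity.
    q = U_k.T @ to_vector(query)
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q)
    scores = doc_vecs @ q / np.where(norms == 0, 1.0, norms)
    ranked = [int(i) for i in np.argsort(-scores)]
    return ranked, scores

def precision_recall_f1(retrieved, relevant):
    # Standard set-based definitions of precision, recall and F-measure.
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

ranked, scores = search("automobile")
print(ranked, np.round(scores, 3))
print(precision_recall_f1(ranked[:2], {0, 1}))
```

In this toy setup the query "automobile" scores the document that only says "car" as highly as the one containing the query term, because both load on the same latent dimension. This is the synonymy effect the abstract refers to; results on real Silt’e text would depend on the preprocessing and on the choice of k.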