BDU IR

SPECTROGRAM IMAGE ASSISTED SPEAKER INDEPENDENT GE’EZ LANGUAGE PRONUNCIATION CLASSIFICATION


dc.contributor.author Bekalu, Mogne
dc.date.accessioned 2023-07-04T07:32:17Z
dc.date.available 2023-07-04T07:32:17Z
dc.date.issued 2023-03-15
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/15454
dc.description.abstract Language is the medium through which humans communicate in everyday life. Ge’ez is a classical language of Ethiopia in which ancient histories and manuscripts were written. Ge’ez has four types of word pronunciation: Tenesh/ተነሽ, Tetay/ተጣይ, Wodaki/ወዳቂ, and Seyaf/ሰያፍ. Each has its own manner of utterance that distinguishes it from the others, and knowing a word’s pronunciation category is the first step toward reading Ge’ez scripts proficiently. This study aims to reduce the difficulty of assigning Ge’ez words to their correct pronunciation category through spectrogram-assisted Ge’ez language pronunciation classification. A total of 2308 word utterances were recorded directly from Ge’ez students and Ge’ez experts (three men and three women) using an Infinix Smart 4 at a sampling rate of 16 kHz. Because the recording environment was uncontrolled, MMSE noise removal was applied to mitigate the noise. FFT and STFT were used for spectrogram and Mel spectrogram generation, respectively. Before classification, preprocessing was performed both on the audio and on the generated spectrogram images. MFCCs (Mel frequency cepstral coefficients) were extracted from the enhanced WAV files, and texture features were extracted from the spectrogram images using GLCM. An SVM achieved an accuracy of 83.116% with combined MFCC features (MFCC delta, delta-delta MFCC), 70.04% with GLCM texture features, and 88.96% with combined MFCC and GLCM texture features. With combined texture and MFCC features, a SoftMax classifier achieved an accuracy of 85.93% and a KNN classifier attained 84.55%. SVM with MFCC and textural features achieved a better result of 91.12%. Key Words: GLCM, Spectrogram, MFCC, SoftMax, pronunciation classification en_US
dc.language.iso en_US en_US
dc.subject Electrical and Computer Engineering en_US
dc.title SPECTROGRAM IMAGE ASSISTED SPEAKER INDEPENDENT GE’EZ LANGUAGE PRONUNCIATION CLASSIFICATION en_US
dc.type Thesis en_US
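
The abstract mentions texture features taken from spectrogram images using GLCM (gray-level co-occurrence matrix). As a rough illustration of that idea only, the sketch below builds a normalized GLCM for a single pixel offset and derives three common texture statistics. The offset, the number of gray levels, the choice of statistics, and the function names `glcm` and `texture_features` are all assumptions made for this example; the record does not state which settings the thesis used.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, levels=4):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset and level count are illustrative assumptions.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def texture_features(p):
    """Contrast, angular second moment, and homogeneity of a GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "asm": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
    }


# Tiny 4-level "image" standing in for a quantized spectrogram patch.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch)
features = texture_features(p)
```

In a pipeline like the one the abstract describes, such statistics would be computed per spectrogram image and concatenated with the MFCC features before classification; how the thesis actually combined them is not specified here.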


