BDU IR

AMHARIC SPEECH EMOTION RECOGNITION USING MACHINE LEARNING APPROACH


dc.contributor.author Frehiwot, Mekuriaw
dc.date.accessioned 2021-08-13T06:15:49Z
dc.date.available 2021-08-13T06:15:49Z
dc.date.issued 2021-03
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/12381
dc.description.abstract Facial expression and affective speech are the two most common channels through which humans express emotion. Speech emotion recognition (SER) is a technology that extracts emotional features from speech signals and analyses the emotional changes they convey. In this work we focus on SER because speech can reveal a person's internal feelings, and SER remains an active and widely pursued research area. Relying on either auditory features (F0, MFCC, LPCC, energy, zero-crossing rate, etc.) or spectrogram-based features alone has limitations: auditory features encode human knowledge about speech, whereas spectrogram-based features provide a more general representation. To improve SER performance, we therefore combine both spectrogram-based and auditory features to discriminate one emotion from another. For this purpose, we collected an Amharic corpus from hospitals and residential areas. The main challenges in SER are identifying good features together with suitable feature extraction and classification approaches, handling identical content expressed with different emotions, and obtaining datasets recorded in real environments. This research aims to recognize emotion in the Amharic language, which differs from other languages in that slackening or tightening a single emotional signal can change its meaning; in our experiments this characteristic caused confusion between the Neutral and Angry classes. We applied hybrid noise filtering (spectral subtraction and MMSE) to remove noise from the speech, used a CNN-BiLSTM network to extract features from spectrogram images, and used handcrafted extraction methods for the auditory features. After extracting the deep features, we applied PCA to reduce the dimensionality of the extracted feature vectors.
Finally, the combined features were passed to an SVM classifier with an RBF kernel to label the emotional classes Sadness, Anger, Happiness, and Neutral over a dataset of 1042 utterances in total. The hybrid model achieved a recognition accuracy of 96.75%. en_US
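The abstract's preprocessing stage uses spectral subtraction (alongside MMSE) to remove noise before feature extraction. A minimal numpy sketch of plain magnitude spectral subtraction, assuming the first few frames contain only noise; the function name, frame length, and noise-estimation scheme are illustrative and not taken from the thesis:

```python
import numpy as np

def spectral_subtraction(signal, frame_len=256, noise_frames=5):
    """Simplified magnitude spectral subtraction (illustrative sketch).

    Assumes the first `noise_frames` non-overlapping frames contain only
    noise; subtracts their average magnitude spectrum from every frame
    and resynthesises using the original phase.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    noise_mag = mag[:noise_frames].mean(axis=0)   # average noise spectrum
    clean_mag = np.maximum(mag - noise_mag, 0.0)  # floor negatives at zero

    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return clean.reshape(-1)
```

A production system would use overlapping windowed frames and a spectral floor rather than a hard zero to avoid musical noise, and would combine this with an MMSE estimator as the thesis describes.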
dc.language.iso en_US en_US
dc.subject computer science en_US
dc.title AMHARIC SPEECH EMOTION RECOGNITION USING MACHINE LEARNING APPROACH en_US
dc.type Thesis en_US
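Among the handcrafted auditory features the abstract lists are short-time energy and zero-crossing rate, with PCA used afterwards to reduce the dimensionality of the combined feature vectors. A minimal numpy sketch of those two pieces, assuming per-frame features stacked into a matrix; function names and the SVD-based PCA are illustrative, not the thesis's implementation:

```python
import numpy as np

def short_time_energy(frame):
    # mean squared amplitude of one analysis frame
    return np.sum(frame ** 2) / len(frame)

def zero_crossing_rate(frame):
    # fraction of adjacent sample pairs whose sign changes
    return np.mean(np.abs(np.diff(np.signbit(frame).astype(int))))

def pca_reduce(X, k):
    """Project feature matrix X (n_samples x n_features) onto its
    top-k principal components via SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

The reduced vectors would then be concatenated with the CNN-BiLSTM deep features and fed to the SVM classifier with an RBF kernel, as the abstract describes.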



