Abstract:
Automatic lip motion recognition is an essential visual method for silent communication, particularly for deaf and hard-of-hearing people. However, recognition remains challenging because of factors such as variations in pronunciation, speech rate, face movement, gestures, lip makeup, skin color, video camera quality, and the feature extraction method used. This study presents an approach to automatic lip motion recognition for the Amharic language that identifies lip movements and characterizes their association with spoken words, using only the information available in the lip movements.
The approach is a computer vision technique that uses shape information to identify features of the lip contour. The process involves detecting the face region, extracting features from shape information, and recognizing words using an Artificial Neural Network (ANN) and a Support Vector Machine (SVM). The sample data was collected from 10 Amharic speakers, each uttering 14 selected Amharic words chosen for communication between voiceless patients and health care providers. Each input video was converted into consecutive image frames. The mouth region was detected using the Viola-Jones object detection algorithm, and the mouth image was converted into the YIQ color space, whose saturation component was used to separate the lip from the surrounding face area. Finally, Sobel edge detection and morphological image operations were applied to extract the exact lip contour. Eight morphological features of the lip contour were extracted from the resulting image, and the final feature vector for each word was generated by averaging each feature over the frames. We compared the ANN and SVM classifiers on these averaged shape features for each classification parameter. The experiments show that the SVM classifier performs slightly better than the ANN: the classification accuracies obtained are 65.71% with the ANN and 66.43% with the SVM.
The proposed lip motion shape extraction model is built from three main components: the Viola-Jones object detection algorithm to detect the mouth region, the YIQ color space to provide acceptable discrimination between the lip region and the face area, and lip contour shape information as input to the classifier. The experimental results indicate that the approach is suitable for real-time lip motion recognition, can run on machines with limited memory, and offers efficient processing speed.
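The color-space and feature-averaging steps described above can be illustrated with a minimal sketch. It is not the study's implementation. It assumes that the "saturation component" in YIQ means the chroma magnitude sqrt(I^2 + Q^2), that RGB values are scaled to [0, 1], and that a fixed threshold separates lip from skin. The function names and the threshold value 0.15 are hypothetical. Only the Python standard library is used.

```python
import colorsys
import math


def chroma_saturation(r, g, b):
    """Chroma magnitude sqrt(I^2 + Q^2) of an RGB pixel with channels in [0, 1].

    Assumption: this stands in for the "saturation component" of YIQ.
    """
    _, i, q = colorsys.rgb_to_yiq(r, g, b)
    return math.hypot(i, q)


def lip_mask(image, threshold=0.15):
    """Binary mask of pixels whose chroma magnitude exceeds the threshold.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The threshold is a hypothetical value, not taken from the study.
    """
    return [[chroma_saturation(*px) > threshold for px in row] for row in image]


def average_features(per_frame_features):
    """Average per-frame feature vectors into one vector per spoken word."""
    n = len(per_frame_features)
    return [sum(column) / n for column in zip(*per_frame_features)]


# Toy mouth image: one gray (skin-like placeholder) pixel and one reddish pixel.
toy_image = [[(0.5, 0.5, 0.5), (0.8, 0.2, 0.2)]]
mask = lip_mask(toy_image)

# Toy per-frame features (two frames, two features each).
word_vector = average_features([[1.0, 2.0], [3.0, 4.0]])
```

In a fuller pipeline the mask would be cleaned with Sobel edge detection and morphological operations before the eight contour features are measured; those steps are omitted here to keep the sketch short.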