Abstract:
Human action recognition is a computer vision technique that identifies the activity being performed in a scene. Computer vision is now widely applied in areas such as surveillance systems, robot vision, and satellite imaging. Recognizing human activity nevertheless remains challenging because of the dynamic nature of human motion, noise, occlusion, complex backgrounds, variable clothing styles, and varying illumination. Despite considerable research, these problems remain open. In this study, we applied human action recognition techniques to interpret the meaning of Tekele Zemamie actions in the Ethiopian Orthodox Church. Tekele Zemamie is the action most frequently performed by church choirs during stage ceremonies and other church events. For the experiments, we prepared a dataset of 900 video clips, recorded from a distance of 2.5 meters with a 48MP camera at Abune Gebere Menfes Kidus Church in Bahir Dar, Ethiopia, and covering six action classes. We used 80% of the dataset for training and the remaining 20% to test model performance. We applied skip-frame selection to sample 30 frames from each video, and extracted spatial, temporal, and skeleton pose features. To build the Tekele Zemamie recognition model, we evaluated five deep-learning models, BlazePose_Bi-LSTM, BlazePose_LSTM, BlazePose_SoftMax, CNN_LSTM, and CNN_Bi-LSTM, which achieved recognition accuracies of 84%, 76%, 67%, 95%, and 97%, respectively. Of these, CNN_Bi-LSTM performed best.
Keywords: CNN, Bi-LSTM, HPE, SKE, HAR