
Integrating Music Genre Classification for Precision in Constructing an Automatic Music Transcription Model for Ethiopian Begena


dc.contributor.author Bisrat, Getnet
dc.date.accessioned 2024-12-05T07:22:45Z
dc.date.available 2024-12-05T07:22:45Z
dc.date.issued 2023-12
dc.identifier.uri http://ir.bdu.edu.et/handle/123456789/16273
dc.description.abstract This thesis investigates Automatic Music Transcription (AMT) for the Begena, a ten-stringed Ethiopian musical instrument. The primary aim is to address the drawbacks of manual music transcription, including the time it consumes, its inconsistencies, and its cost, and to overcome limitations of existing AMT models and systems, such as the absence of scale identification, variations in notation, and low precision. A novel aspect of this research is the integration of Music Genre Classification (MGC) into AMT for the Begena, a first in the field.

The MGC investigation combines manual feature extraction with spectrogram-based experiments. The manual feature extraction approach employs several traditional machine learning algorithms and a Convolutional Neural Network (CNN) architecture, while the spectrogram-based experiments use Constant-Q Transform (CQT), Mel-scale, and Mel-frequency cepstral coefficient (MFCC) features trained with various deep learning algorithms. The MFCC-trained models stand out, achieving a perfect accuracy score of 1.00 in both the validation and testing phases for all models.

For the AMT task, three deep learning architectures (CNN, Convolutional Recurrent Neural Network (CRNN), and Generative Adversarial Network (GAN)) are implemented and trained on distinct dataset groups. The GAN model, designed to guide the CRNN model during training, emerges as the most effective, achieving frame-level F1 scores of 0.867 for validation and 0.860 for testing. An attempt to refine AMT predictions through MGC-based auto-correction yields only minimal improvement.

Analysis of the experiments shows robust performance on specific playing styles, such as Qoutera songs, but comparatively weaker performance on Derib songs. The models show strengths in note identification, capturing note overlaps, and scale identification, alongside limitations in the form of note omissions and note-expansion errors.

In conclusion, this research contributes to the progression of AMT tailored to traditional instruments such as the Begena. The study is a starting point rather than an endpoint: exploring more advanced architectures and refining the dataset, particularly its time-interval annotations, holds promise for greater accuracy and reliability. Adopting Connectionist Temporal Classification (CTC) is another promising avenue, since it could eliminate the need for explicit time-interval annotations, simplifying the learning process and potentially improving overall transcription performance.

Keywords: Automatic Music Transcription, Begena, Music Genre Classification, Convolutional Neural Network, Convolutional Recurrent Neural Network, Generative Adversarial Network, Constant-Q Transform, Mel scale, Mel-frequency cepstral coefficients. en_US
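As a concrete illustration of the spectrogram-based features named in the abstract, the following is a minimal sketch of how CQT, Mel-scale, and MFCC features might be extracted from a Begena recording using the librosa library. The file path, sample rate, and parameter values are hypothetical placeholders, not the thesis's actual configuration.

```python
import numpy as np
import librosa

# Hypothetical path to a Begena recording; the thesis's dataset is not reproduced here.
y, sr = librosa.load("begena_sample.wav", sr=22050)

# Constant-Q Transform magnitude, converted to decibels.
cqt = librosa.amplitude_to_db(np.abs(librosa.cqt(y, sr=sr)), ref=np.max)

# Mel-scaled power spectrogram, converted to decibels.
mel = librosa.power_to_db(
    librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128), ref=np.max
)

# Mel-frequency cepstral coefficients; 20 coefficients is an assumed choice.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)

print(cqt.shape, mel.shape, mfcc.shape)  # each array is (features, frames)
```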
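The frame-level F1 scores reported for the AMT models are conventionally computed by comparing binary piano-roll matrices frame by frame. Below is a minimal sketch of that metric, assuming aligned (pitch x frame) activation matrices; it follows the standard definition rather than the thesis's exact evaluation code.

```python
import numpy as np

def frame_level_f1(ref: np.ndarray, est: np.ndarray) -> float:
    """Frame-level F1 between binary piano rolls of shape (n_pitches, n_frames)."""
    tp = np.sum((ref == 1) & (est == 1))  # active in both reference and estimate
    fp = np.sum((ref == 0) & (est == 1))  # predicted but absent from reference
    fn = np.sum((ref == 1) & (est == 0))  # in reference but missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 3 pitches over 4 frames.
ref = np.array([[1, 1, 0, 0],
                [0, 1, 1, 0],
                [0, 0, 0, 1]])
est = np.array([[1, 1, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 1, 1]])
print(frame_level_f1(ref, est))  # 0.8 (precision 0.8, recall 0.8)
```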
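On the CTC direction suggested in the closing paragraph: the appeal of CTC is that it aligns an unsegmented prediction sequence to a label sequence without per-frame time annotations. A minimal sketch using PyTorch's built-in torch.nn.CTCLoss follows; the sequence lengths, class count, and tensor shapes are illustrative assumptions, not the thesis's setup.

```python
import torch
import torch.nn as nn

# Assumed sizes: T time steps, N batch items, C classes (note labels + CTC blank).
T, N, C = 50, 2, 11                                  # class 0 reserved as the blank
logits = torch.randn(T, N, C, requires_grad=True)    # stand-in for model output
log_probs = logits.log_softmax(dim=2)                # CTC expects log-probabilities

# Target note-label sequences (values 1..C-1), one row per batch item.
targets = torch.randint(1, C, (N, 12), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)    # no frame-level alignment annotations required
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()              # trainable end to end from sequence labels alone
print(loss.item())
```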
dc.language.iso en_US en_US
dc.subject Computer Science en_US
dc.title Integrating Music Genre Classification for Precision in Constructing an Automatic Music Transcription Model for Ethiopian Begena en_US
dc.type Thesis en_US

