dc.description.abstract |
This thesis addresses Automatic Music Transcription (AMT) for the Begena, a ten-stringed Ethiopian musical instrument. The primary aim is to overcome the challenges of manual music transcription, which is time-consuming, inconsistent, and costly, along with the limitations of existing AMT models and systems, such as the absence of scale identification, variations in notation, and low precision. A novel aspect of this research is the integration of Music Genre Classification (MGC) into AMT for the Begena. The investigation combines manual feature extraction methods and spectrogram-based experiments within the MGC model. The manual feature extraction approach employs several traditional machine learning algorithms and a Convolutional Neural Network (CNN) architecture.
Meanwhile, the spectrogram-based experiment uses Constant-Q Transform (CQT), Mel scale, and Mel-frequency cepstral coefficient (MFCC) features, trained with several deep learning algorithms. The MFCC-trained models perform best, achieving perfect accuracy (1.00) in both the validation and testing phases for all models. For the AMT task, three deep learning architectures, namely a CNN, a Convolutional Recurrent Neural Network (CRNN), and a Generative Adversarial Network (GAN), are implemented and trained on distinct dataset groups. The GAN model, designed to guide the CRNN model during training, proves the most effective, achieving frame-level F1 scores of 0.867 on validation and 0.860 on testing.
Furthermore, an attempt to refine AMT predictions through MGC-based auto-correction yields only minimal improvement. Analysis of the experiments reveals robust performance on certain playing styles, such as Qoutera songs, but weaker performance on Derib songs. The models show strengths in note identification, capturing note overlaps, and scale identification, while also exhibiting limitations in the form of note omissions and note-expansion errors. In conclusion, this research contributes to the advancement of AMT for traditional instruments such as the Begena. The study represents a starting point rather than an endpoint, leaving room for further improvement. Exploring more advanced architectures and refining the dataset, particularly its time-interval annotations, holds promise for greater accuracy and reliability. Adopting Connectionist Temporal Classification (CTC) is another promising avenue: by removing the need for explicit time-interval annotations, it could simplify the learning process and improve overall transcription performance.
Keywords: Automatic Music Transcription, Begena, Music Genre Classification, Convolutional Neural
Network, Convolutional Recurrent Neural Network, Generative Adversarial Network, Constant-Q
Transform, Mel scale, Mel-frequency cepstral coefficients. |
en_US |