Application of Machine Learning Algorithms for Pneumonia Detection and Classification

Hamza, Shukri Abdurahman

URI: http://ir.bdu.edu.et/handle/123456789/15795

Date: 2023-06

Abstract:

Pneumonia is the leading cause of death in children, killing 5 million children under the age of five in 2020, with Nigeria, India, Pakistan, the Democratic Republic of the Congo, and Ethiopia accounting for half of all deaths, according to the WHO 2020 report. In this thesis, we propose a simple CNN model that detects the presence of pneumonia from a chest x-ray image, using public and local x-ray image datasets. The x-ray images collected from two regional hospitals in Ethiopia (Merawi and Felegehiwot Referral Hospital) are processed, analyzed, and combined in an appropriate ratio. To evaluate the images in each dataset, exploratory data analysis techniques are employed to compute image quality, similarity, and variance. The exploratory analysis revealed that x-ray images obtained from Merawi hospital are of lower quality than those from the other sources because of exposure imbalance during imaging, whereas the images obtained from the online source are of high quality. Following that, three classical machine learning algorithms (SVM, KNN, and Logistic Regression) and six pretrained models (VGG16, VGG19, DenseNet121, MobileNetV2, InceptionResNetV2, and Xception) were selected and trained on the prepared data. In addition, we propose a CNN model with few convolution layers built with the Keras Sequential API; its performance was examined using selected metrics and compared with that of the pretrained models. The proposed CNN model outperformed both the deep transfer learning and classical models, with a test accuracy and weighted precision, recall, and F1 score of 93% and an AUC of 96.02%. The model missed 7 pneumonia-infected images obtained from Felegehiwot hospital and 8 normal images obtained from Merawi hospital.

During model debugging, we observed that the model could not learn enough from the pneumonia images obtained from Felegehiwot during training because of the small sample size of only 177 images, whereas the misclassified normal images from Merawi hospital are noisy because of low exposure. The model performs well on the online test data, with a test accuracy and weighted precision, recall, and F1 score of 99% and an AUC of 100%. For external validation, the model was evaluated on chest x-ray images from Debre Markos Comprehensive and specialty hospitals. The model correctly predicts all images with pneumonia and mislabels only one normal image as having pneumonia. We also discuss the factors that link clinical and machine learning approaches to pneumonia diagnosis, and the key challenges in applying machine learning techniques in clinical practice. This work is therefore expected to be especially beneficial in developing countries, where healthcare costs are a critical concern.

Keywords: Chest X-ray image, Evaluation metrics, Machine learning algorithm, Medical image dataset, Pneumonia.
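
The abstract reports accuracy together with weighted precision, recall, and F1 score. As a rough illustration of how such support-weighted metrics are usually computed, the sketch below uses only the Python standard library. The function name `weighted_scores` and the toy labels (1 = pneumonia, 0 = normal) are assumptions made for this example; they are not taken from the thesis, and the thesis may have used a library routine instead.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration only. Each class's score is
    weighted by that class's share of the true labels.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    total = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    precision = recall = f1 = 0.0

    for label, n_true in support.items():
        # True positives: samples of this class predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)

        p_c = tp / n_pred if n_pred else 0.0
        r_c = tp / n_true
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0

        weight = n_true / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for this example: 6 pneumonia (1) and 4 normal (0) images.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
scores = weighted_scores(y_true, y_pred)
print(scores)
```

With support weighting, weighted recall always equals overall accuracy, which is consistent with the abstract reporting the same value for both.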
