Abstract:
Neonatal jaundice is a leading cause of neonatal admission and death in most low- and middle-income countries. It presents as a yellowing of a baby’s skin and eyes, is very common, and occurs when a baby has a high level of bilirubin. If left untreated, elevated bilirubin can lead to brain damage. Early prediction of jaundice has therefore become increasingly necessary for identifying babies at risk of hyperbilirubinemia. In Ethiopia, no research has been conducted on the early prediction of neonatal jaundice severity using machine learning to support the health sector. We therefore propose a machine learning model for early prediction of neonatal jaundice severity.
In this paper, we developed a prediction model that uses identified classification factors to classify the risk level of jaundice as low, moderate, or high. We collected the dataset from Bahir Dar Felege Hiwot Referral Hospital, Ethiopia. It contains records of 550 neonates, each described by 19 classification factors, which are used to build the predictive model for classifying the risk level of jaundice. 80% of the dataset is used for training and the remaining 20% for testing; 10-fold cross-validation is also used. After the data were organized, pre-processing was applied: missing values were replaced with the mean of the corresponding column, categorical variables were encoded with one-hot encoding, and the independent variables were feature-scaled.
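The pre-processing and data-splitting steps described above can be illustrated with a minimal sketch. It uses only the Python standard library so that each step is visible; it is not the authors' implementation. The column names (`bilirubin`, `feeding`), the toy values, and all helper function names are hypothetical, since the 19 classification factors are not listed here, and standardization to zero mean and unit variance is assumed as the form of feature scaling.

```python
import random
from statistics import mean, pstdev

def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def one_hot(column):
    """Encode a categorical column as one 0/1 indicator per category."""
    categories = sorted(set(column))
    return [[1.0 if v == c else 0.0 for c in categories] for v in column]

def standardize(column):
    """Scale a numeric column to zero mean and unit variance (assumed scaling)."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]

def train_test_split(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = round(n_rows * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train, validation

# Hypothetical toy columns; the real dataset has 550 rows and 19 factors.
bilirubin = [12.0, None, 18.0, 15.0]
feeding = ["breast", "formula", "breast", "mixed"]

bilirubin_filled = impute_mean(bilirubin)
bilirubin_scaled = standardize(bilirubin_filled)
features = [[b] + f for b, f in zip(bilirubin_scaled, one_hot(feeding))]

# 80/20 split over 550 rows, then 10 folds over the training rows.
train_idx, test_idx = train_test_split(550)
folds = list(k_fold(train_idx, k=10))
```

In practice the imputation mean and the scaling statistics are usually computed from the training rows only and then applied to the test rows, so that no information from the test set leaks into training.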
In this work, we developed three machine learning models. The first is a multi-layer perceptron, trained on the pre-processed data with different optimizers, from which the best was selected. The second is a support vector machine, trained on the pre-processed data with different kernel functions, from which the best was selected. The third is a random forest, trained on the pre-processed data with different numbers of trees, from which the best was selected. The models were implemented in the Python programming language. Finally, we evaluated the models on the test dataset and report the results using the confusion matrix, accuracy, precision, and F1-score. The random forest achieved 95% accuracy, the support vector machine 97%, and the multi-layer perceptron 98%.
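As an illustration of how the reported evaluation measures relate to one another, the following minimal sketch computes a confusion matrix, accuracy, per-class precision, and per-class F1-score for the three risk levels. The labels in `y_true` and `y_pred` are hypothetical toy values, not results from this study, and the function names are chosen for this example only. Recall is computed as well because the F1-score is the harmonic mean of precision and recall.

```python
from collections import Counter

LABELS = ["low", "moderate", "high"]

def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def per_class_scores(cm, labels=LABELS):
    """Precision, recall, and F1-score for each class."""
    scores = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum
        actual = sum(cm[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores

# Hypothetical toy predictions for six neonates.
y_true = ["low", "low", "moderate", "moderate", "high", "high"]
y_pred = ["low", "moderate", "moderate", "moderate", "high", "high"]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
scores = per_class_scores(cm)
```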