Abstract:
Solar irradiance is a critical factor influencing the output of solar power plants, and its accurate
prediction can help reduce the uncertainty caused by its intermittent nature. This study evaluates
the performance of Multiple Linear Regression (MLR) and Decision Tree (DT) machine learning
models in predicting Direct Normal Irradiance (DNI) over Lalibela, Ethiopia, using
satellite-derived meteorological data from the National Solar Radiation Database (NSRDB) covering
January 1, 2017, to December 31, 2019. The models were trained on 80% of the dataset and
tested on the remaining 20%, with input variables including temperature, relative humidity,
solar zenith angle, wind speed, wind direction, diffuse horizontal irradiance, and global
horizontal irradiance. Statistical metrics such as Root Mean Square Error (RMSE), Mean
Absolute Error (MAE), and Pearson's correlation coefficient (R) were used to assess the models'
prediction accuracy. The DT model demonstrated superior performance, achieving an R-value of
0.9997 in the testing phase, with RMSE and MAE values of 7.8378 W/m² and 3.1578 W/m²,
respectively. In contrast, the MLR model exhibited a lower R-value of 0.94, with higher RMSE
and MAE values of 102.33 W/m² and 72.28 W/m², respectively. The results indicate that the DT
model outperforms the MLR model in predicting DNI, suggesting that nonlinear machine learning
techniques such as Decision Trees handle the complex relationships between meteorological
variables and solar irradiance more effectively than linear regression.
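
As an illustration of the evaluation procedure summarized above, the following minimal Python sketch shows one way the 80/20 split and the three metrics (RMSE, MAE, and Pearson's R) could be computed. It is not the authors' code. The function names, the random shuffling before the split, the seed, and the sample DNI values are all assumptions introduced for illustration; the abstract does not state whether the split was random or chronological.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Split rows into 80% training and 20% testing subsets.

    Assumption: rows are shuffled before splitting; the seed is arbitrary.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root Mean Square Error, in the units of the inputs (here W/m^2)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error, in the units of the inputs (here W/m^2)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between observations and predictions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical DNI values in W/m^2, for illustration only (not from the study).
observed = [0.0, 150.0, 480.0, 720.0, 910.0]
predicted = [5.0, 140.0, 500.0, 700.0, 900.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("R:   ", pearson_r(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, so reporting both, as the study does, indicates whether a model's error is dominated by a few large misses or spread evenly across samples.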