Abstract:
For regions where total electron content (TEC) data from ground observations are unavailable, modeled TEC is
used to characterize ionospheric conditions. Accurate TEC prediction is crucial
to mitigating ionospheric effects on radio communication and related applications. This study
investigated the performance of four machine learning models for predicting hourly GPS-derived vertical TEC (VTEC)
at a single station in Addis Ababa, Ethiopia: the gradient
boosting machine (GBM), extreme gradient boosting (XGBoost), light gradient boosting
machine (LightGBM), and a stacked model that combines these three algorithms, with a linear
regression model serving as both a base learner and the meta-learner. Model input variables
include the effects of solar activity, geomagnetic activity, season, time of day, solar wind,
and the interplanetary magnetic field. The models were trained using the available GPS VTEC data from 2011 to April 30, 2017, and their performance was tested using the
data from May 1, 2017, to the end of 2018. The RandomizedSearchCV algorithm (randomized hyperparameter search with cross-validation) was
applied to tune the hyperparameters of the models. Based on statistical
analysis, the VTEC values predicted by the four models have similar
linear correlations with the GPS VTEC, with a correlation coefficient (R) of approximately 0.95. The stacked model has
slightly lower errors, with root-mean-square error (RMSE), mean absolute error (MAE), and standard error values of 3.013, 2.364,
and 2.946 TECU, respectively, and slightly improves on the predictive performance
of the three gradient-boosting models. The three gradient-boosting-based models have
similar performance in VTEC prediction. The GPS VTEC and predicted values were also compared
across diurnal and seasonal conditions, and
the results show that, in most cases, the VTEC predictions of the developed models
are well correlated with the GPS VTEC. Our results indicate that gradient-boosting-based methods and their stacked integration show potential for predicting
VTEC with good accuracy and efficiency in the low-latitude ionospheric region.
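
The error statistics quoted above (RMSE, MAE, standard error, and R) can be illustrated with a minimal sketch. The function name `evaluate_vtec` and the sample values are hypothetical, and the abstract does not define "standard error"; the sketch assumes it means the standard deviation of the prediction errors, which may differ from the definition used in the study.

```python
import math
from statistics import mean, pstdev


def evaluate_vtec(observed, predicted):
    """Compare predicted VTEC with observed GPS VTEC (both in TECU).

    Hypothetical helper, not taken from the study.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    # Prediction errors (predicted minus observed).
    errors = [p - o for o, p in zip(observed, predicted)]

    rmse = math.sqrt(mean(e * e for e in errors))
    mae = mean(abs(e) for e in errors)
    # Assumption: "standard error" = standard deviation of the errors.
    std_error = pstdev(errors)

    # Pearson linear correlation coefficient R.
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {"rmse": rmse, "mae": mae, "std_error": std_error, "r": r}


# Made-up illustrative values, not data from the study.
metrics = evaluate_vtec([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Under this assumed definition, RMSE squared equals the squared mean error (bias) plus the squared standard error, so a standard error only slightly below the RMSE would indicate a small overall bias.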