Sirpa-Poma J. W., Calle J., Uscamayta-Ferrano E., Molina-Carpio J., Satgé Frédéric, Toledo O. C., Duran R., Mollinedo P. P., Hussain R., Pillco-Zolá R. (2025). Development of hourly resolution air temperature across Titicaca Lake on auxiliary ERA5 variables and machine learning-based gap-filling. Sensors, 25 (23), p. 7165 [20 p.].
Document title
Development of hourly resolution air temperature across Titicaca Lake on auxiliary ERA5 variables and machine learning-based gap-filling
Publication year
2025
Authors
Sirpa-Poma J. W., Calle J., Uscamayta-Ferrano E., Molina-Carpio J., Satgé Frédéric, Toledo O. C., Duran R., Mollinedo P. P., Hussain R., Pillco-Zolá R.
Source
Sensors, 2025, 25 (23), p. 7165 [20 p.]
This article presents an innovative procedure that combines advanced quality control (QC) methods with machine learning (ML) techniques to produce reliable, continuous, high-resolution meteorological data. The approach was applied to hourly air temperature records from six automatic weather stations located around Lake Titicaca in the Altiplano region of South America. The raw dataset contained time gaps, inconsistencies, and outliers. To address these, the QC stage employed Interquartile Range, Biweight, and Local Outlier Factor (LOF) statistics, resulting in a clean dataset. Two gap-filling methods were implemented: a spatial approach using time series from nearby stations and a temporal approach based on each station's time series and selected variables from the ERA5-Land reanalysis. Several ML models were also employed in this process: Random Forest (RF), Support Vector Machine (SVM), Stacking (STACK), and AdaBoost (ADA). Model performance was evaluated on a validation subset (30% of station data). The RF model achieved the best results, with R² values up to 0.9 and Root Mean Square Error (RMSE) below 1.5 °C. The spatial approach performed best when stations were strongly correlated, while the temporal approach was more suitable for locations with low inter-station correlation and high local variability. Overall, the procedure substantially improved data reliability and completeness, and it can be extended to other meteorological variables.
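As a rough illustration of two ingredients named in the abstract, the sketch below flags outliers with an Interquartile Range rule and computes the RMSE and R² scores used to compare models. It is not the authors' implementation: the function names, the sample temperatures, and the multiplier `k=1.5` (the conventional Tukey fence) are assumptions, and the Biweight and LOF checks as well as the ML gap-filling models are not reproduced here.

```python
import math
import statistics


def iqr_flags(values, k=1.5):
    """Return True for each value outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional Tukey multiplier, assumed here; the
    abstract does not state the threshold used in the study.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def rmse(observed, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly temperatures (°C) with one implausible spike.
temps = [8.1, 8.4, 7.9, 8.0, 35.0, 8.3, 7.8, 8.2]
flags = iqr_flags(temps)
clean = [t for t, bad in zip(temps, flags) if not bad]
```

In a workflow like the one described, values flagged this way would be treated as gaps and then filled by the spatial or temporal model, with RMSE and R² computed on the held-out validation subset.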
Classification scheme
Basic sciences / Analysis and research techniques [020] ; Environmental sciences [021]
Geographic description
PERU ; BOLIVIA ; LAKE TITICACA
Location
IRD collection [F B010095905]
IRD identifier
fdi:010095905