Publications by IRD scientists

Sirpa-Poma J. W., Calle J., Uscamayta-Ferrano E., Molina-Carpio J., Satgé Frédéric, Toledo O. C., Duran R., Mollinedo P. P., Hussain R., Pillco-Zolá R. (2025). Development of hourly resolution air temperature across Titicaca Lake on auxiliary ERA5 variables and machine learning-based gap-filling. Sensors, 25 (23), p. 7165 [20 p.].

Document title
Development of hourly resolution air temperature across Titicaca Lake on auxiliary ERA5 variables and machine learning-based gap-filling
Year of publication
2025
Document type
Article indexed in the Web of Science WOS:001635326000001
Authors
Sirpa-Poma J. W., Calle J., Uscamayta-Ferrano E., Molina-Carpio J., Satgé Frédéric, Toledo O. C., Duran R., Mollinedo P. P., Hussain R., Pillco-Zolá R.
Source
Sensors, 2025, 25 (23), p. 7165 [20 p.]
This article presents a procedure that combines quality control (QC) methods with machine learning (ML) techniques to produce reliable, continuous, high-resolution meteorological data. The approach was applied to hourly air temperature records from six automatic weather stations located around Lake Titicaca in the Altiplano region of South America. The raw dataset contained time gaps, inconsistencies, and outliers. To address these, the QC stage employed Interquartile Range, Biweight, and Local Outlier Factor (LOF) statistics, resulting in a clean dataset. Two gap-filling methods were implemented: a spatial approach using time series from nearby stations, and a temporal approach based on each station's own time series and selected variables from the ERA5-Land reanalysis. Four ML models were used for gap-filling: Random Forest (RF), Support Vector Machine (SVM), Stacking (STACK), and AdaBoost (ADA). Model performance was evaluated on a validation subset (30% of station data). The RF model achieved the best results, with R² values up to 0.9 and Root Mean Square Error (RMSE) below 1.5 °C. The spatial approach performed best when stations were strongly correlated, while the temporal approach was more suitable for locations with low inter-station correlation and high local variability. Overall, the procedure substantially improved data reliability and completeness, and it can be extended to other meteorological variables.
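As a rough illustration of the two stages summarized in the abstract (outlier screening, then spatial gap-filling from a nearby station), the following is a minimal sketch and not the authors' implementation. It assumes hourly series held as plain Python lists with `None` marking missing hours, uses only the Interquartile Range rule for QC, and swaps in an ordinary least-squares line for the Random Forest and other ML models named in the article. All function names and the `k=1.5` fence multiplier are hypothetical choices made for this example.

```python
import math
import statistics


def iqr_mask(values, k=1.5):
    """True where a value lies within [Q1 - k*IQR, Q3 + k*IQR]; None counts as missing."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v is not None and low <= v <= high for v in values]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for the ML models)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fill_from_neighbor(target, neighbor, k=1.5):
    """Screen the target series with the IQR rule, then fill rejected or
    missing hours from a neighboring station through a linear fit."""
    keep = iqr_mask(target, k)
    pairs = [(n, t) for n, t, ok in zip(neighbor, target, keep)
             if ok and n is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [t if ok else (slope * n + intercept if n is not None else None)
            for t, n, ok in zip(target, neighbor, keep)]


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(statistics.fmean(
        (o - p) ** 2 for o, p in zip(observed, predicted)))
```

In a setting closer to the one described, the linear fit would be replaced by a trained regressor, and `rmse` would be computed on a held-out share of the station data (30% in the article) rather than on the training hours.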
Classification scheme
Basic sciences / Analysis and research techniques [020]; Environmental sciences [021]
Geographic description
PERU; BOLIVIA; LAKE TITICACA
Location
IRD collection [F B010095905]
IRD identifier
fdi:010095905