Accurate prediction of daily mean temperature in high-altitude environments remains challenging because of pronounced thermal variability, strong seasonal oscillations, and the data limitations typical of remote meteorological stations. This study proposes a vectorial feature engineering strategy for temporal variables within a Gated Recurrent Unit (GRU) framework, applied to the Puno station (3,827 m a.s.l.) in the Peruvian Altiplano. The dataset comprises 7,279 daily observations (2003–2024), preprocessed through a four-stage pipeline: sentinel value detection, dual-consensus outlier removal (interquartile range and monthly Z-score), cubic spline imputation, and Min–Max normalization. The core contribution is the cyclic sine–cosine encoding of calendar variables (month, day), which projects discrete temporal indices onto the unit circle in ℝ². This representation removes the artificial discontinuities at calendar boundaries (e.g., December–January) that persist under conventional linear scaling. The proposed architecture stacks GRU (64), SpatialDropout1D (0.2), GRU (32), Dense (16, ReLU), and Dense (1) layers and is trained on 30-day sliding windows for one-step-ahead forecasting. A controlled ablation study compares the proposed approach against an otherwise identical baseline that uses normalized but non-cyclic temporal features. On the test set (n = 1,088; 2021–2024), cyclic encoding yields consistent improvements (relative change in parentheses): MAE decreases from 0.0663 to 0.0655 (1.17%), RMSE decreases from 0.0836 to 0.0829 (0.82%), MAPE decreases from 13.20% to 12.63% (4.32%), and R² increases from 0.6705 to 0.6758 (0.80%). The denormalized error (0.81 °C) remains within standard measurement uncertainty, supporting cyclic temporal encoding as an effective inductive bias that does not increase model complexity.
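The boundary effect of the cyclic encoding can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the angle convention used here (a 1-based index shifted to start at angle 0, with period 12 for months) and the helper names `cyclic_encode`, `linear_scale`, and `euclidean` are assumptions made for illustration only, not the study's implementation.

```python
import math

def cyclic_encode(value, period):
    """Map a 1-based calendar index onto the unit circle as (sin, cos).

    Assumed convention: index 1 maps to angle 0, index `period` to the
    last step before a full turn.
    """
    angle = 2.0 * math.pi * (value - 1) / period
    return math.sin(angle), math.cos(angle)

def linear_scale(value, period):
    """Min-Max scaling of a 1-based index to [0, 1] (non-cyclic baseline)."""
    return (value - 1) / (period - 1)

def euclidean(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Encode December, January, and February as points on the unit circle.
dec, jan, feb = (cyclic_encode(m, 12) for m in (12, 1, 2))

# Under cyclic encoding, December-January is as close as January-February.
dec_jan_cyclic = euclidean(dec, jan)
jan_feb_cyclic = euclidean(jan, feb)

# Under linear scaling, December and January sit at opposite ends of [0, 1].
dec_jan_linear = abs(linear_scale(12, 12) - linear_scale(1, 12))

print(f"cyclic Dec-Jan distance: {dec_jan_cyclic:.4f}")
print(f"cyclic Jan-Feb distance: {jan_feb_cyclic:.4f}")
print(f"linear Dec-Jan distance: {dec_jan_linear:.4f}")
```

In this sketch every pair of adjacent months is separated by the same chord length on the circle, whereas linear scaling places December and January at the maximum possible distance. The same function would apply to the day variable with a suitable period, which the abstract leaves unspecified.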
This work is licensed under a Creative Commons Attribution 4.0 International License.