Streamlined meteorological drought monitoring through fuzzy clustering and deep learning

Di Persio, Luca
2025-01-01

Abstract

This work presents a rigorous mathematical framework for monitoring meteorological drought by integrating advanced machine learning, feature selection, and clustering strategies. The central aim is to predict drought events based on a combination of meteorological indicators, using the Standardized Precipitation Index (SPI) as the target variable. A new Composite Drought Index (CDI) is introduced to encapsulate multivariate drought information, constructed from the Precipitation Concentration Index, Temperature Condition Index, Wind Speed Condition Index, and Soil Moisture Condition Index. The CDI demonstrates strong empirical consistency with the established drought index, SPI, based on four decades of data from thirty-two meteorological stations. To capture spatial heterogeneity, the study employs fuzzy clustering to group stations into meteorologically homogeneous classes. Within each cluster, the Boruta algorithm is used to isolate the most relevant features by assessing their relative importance, ensuring that only statistically informative variables contribute to model construction. Drought prediction is then performed using a suite of machine learning models, including Random Forest, Support Vector Regression, Extreme Gradient Boosting, and Deep Feedforward Neural Networks. A hybrid model combining deep neural networks with random forests achieves the best overall performance by extracting latent features through deep architectures and refining predictions via ensemble methods. This hybrid yields the lowest prediction errors, with Mean Absolute Error ranging from 0.1570 to 0.2664, Mean Squared Error between 0.0409 and 0.1093, and Root Mean Squared Error between 0.2022 and 0.3306. It also attains the highest Nash-Sutcliffe Efficiency, from 0.8973 to 0.9547, and Kling-Gupta Efficiency, from 0.7253 to 0.8807. 
The study's main contributions include the formal definition of CDI as a multivariate index, the incorporation of fuzzy clustering to enhance spatial generalization, and the deployment of a deep-ensemble model to capture complex nonlinear and temporal dependencies in meteorological data. Empirical results demonstrate that CDI significantly outperforms univariate indices, and that the hybrid model provides better predictive performance than conventional deep learning approaches such as CNN and LSTM. The framework is adaptable for real-time drought monitoring and early warning systems, offering practical value for climate resilience in drought-prone regions.
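The five scores quoted in the abstract follow standard definitions, and a small sketch makes it easier to see how the reported ranges relate to one another. For instance, the reported RMSE bounds (0.2022 and 0.3306) are the square roots of the reported MSE bounds (0.0409 and 0.1093). The code below is illustrative only: the function names and the sample `obs`/`sim` values are hypothetical, and the KGE shown is the common three-component form (correlation, variability ratio, bias ratio), which is assumed here because the abstract does not state which KGE variant was used.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def mae(obs, sim):
    # Mean Absolute Error
    return _mean([abs(o - s) for o, s in zip(obs, sim)])


def mse(obs, sim):
    # Mean Squared Error
    return _mean([(o - s) ** 2 for o, s in zip(obs, sim)])


def rmse(obs, sim):
    # Root Mean Squared Error is the square root of MSE
    return math.sqrt(mse(obs, sim))


def nse(obs, sim):
    # Nash-Sutcliffe Efficiency: 1 minus the ratio of the squared error
    # to the squared deviation of the observations around their mean.
    mean_obs = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, sim):
    # Kling-Gupta Efficiency (assumed variant):
    # 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
    # Caveat: beta divides by the observed mean, so it is unstable when
    # that mean is close to zero, as can happen with standardized indices.
    mo, ms = _mean(obs), _mean(sim)
    so = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    ss = math.sqrt(_mean([(s - ms) ** 2 for s in sim]))
    cov = _mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)])
    r = cov / (so * ss)   # linear correlation
    alpha = ss / so       # variability ratio
    beta = ms / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical SPI-like values, for illustration only (not data from the study).
obs = [-1.5, -0.5, 0.0, 0.8, 1.7]
sim = [-1.3, -0.6, 0.1, 0.7, 1.5]

for name, fn in [("MAE", mae), ("MSE", mse), ("RMSE", rmse), ("NSE", nse), ("KGE", kge)]:
    print(f"{name}: {fn(obs, sim):.4f}")
```

Lower is better for MAE, MSE, and RMSE, while NSE and KGE both equal 1 for a perfect prediction, which is why the abstract reports the hybrid model as having the lowest errors and the highest efficiencies.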
Keywords: meteorological drought monitoring, fuzzy clustering, deep learning, stochastics

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1171690