Overview: In a study published on March 20 in the journal Nature, Grey Nearing and his colleagues on Google Research’s flood forecasting team demonstrated a breakthrough in applying AI technology to global flood forecasting, significantly enhancing both timeliness and accuracy. Using only publicly available global data, the model extends the warning period from same-day (0 days), as offered by the most advanced global flood forecasting system (GloFAS), to 5 days. The model also outperforms traditional models in predicting five-year flood events and bridges the gap for developing countries that lack long-term hydrological data, raising flood prediction results in Africa and Asia to levels comparable to Europe. The AI model has been integrated into an operational early warning system that provides publicly available real-time flood information to over 80 countries worldwide. This achievement represents a significant advance in global flood management and disaster prevention technology, underscores the importance of open sharing of global hydrological data, and opens new ways to improve the accuracy of global flood forecasts.
01
Research Background and Challenges
- The Universality and Impact of Floods: Floods are among the most common natural disasters worldwide; approximately 19% of the world’s population faces the threat of flooding. Economic losses due to floods reach up to $50 billion annually. Since 2000, the frequency of floods has more than doubled as climate change and human activities intensify the hydrological cycle.
- The Importance and Challenges of Flood Forecasting Systems: Flood forecasting systems can significantly reduce casualties and property losses. However, low- and middle-income countries account for 90% of the global population vulnerable to flooding, and only a few basins in these countries have river hydrological data. According to the World Bank, if flood warning systems in developing countries were improved to the level of developed countries, an average of 23,000 lives could be saved each year.
- Runoff Estimation in Basins Without Hydrological Data: The International Association of Hydrological Sciences (IAHS) designated the “Prediction in Ungauged Basins (PUB)” problem as a decade-long challenge from 2003 to 2012. At the end of this decade, IAHS reported minimal substantive progress on this issue. The PUB problem poses a technical bottleneck for effectively forecasting flood events in developing countries lacking runoff data.
02
Research Methodology
- AI Model Construction: This study employed a Long Short-Term Memory (LSTM) based encoder-decoder architecture to estimate daily river runoff over the next seven days. The architecture consists of an encoder and a decoder (as shown in the figure below): the encoder processes weather data from the past year, and the decoder estimates river runoff from weather forecast data for the next 7 days. At each time step, the model predicts the parameters of an asymmetric Laplace distribution over river flow, thereby improving prediction accuracy and reliability.
- Model Composition: The model uses an LSTM with 256 hidden units. It was trained for 10 hours on an NVIDIA V100 GPU over 50,000 batches and evaluated using temporal and spatial cross-validation. At each time step, the model predicts the parameters of a single asymmetric Laplace distribution over area-normalized surface runoff.
- Data Support: The model training dataset includes input and target information covering a combined 152,259 years across 5,680 basins. The dataset relies on publicly available global river flow data from the Global Runoff Data Centre (GRDC), daily single-layer forecast data from the ECMWF Integrated Forecasting System (IFS), ERA5-Land reanalysis data, precipitation data from NASA’s IMERG, and geographical and hydrological attributes from the HydroATLAS database, totaling 60 GB in size.
- Cross-Validation: First, the data were split in time into cross-validation folds such that no test data from any flow gauge fell within one year (the sequence length of the LSTM encoder) of any training data. Second, the data were split in space using random k-fold cross-validation without replacement (k=10). The process was repeated so that all data from all gauges (1984-2021) were predicted out-of-sample in both time and space. In additional experiments, the gauges were split spatially in non-random ways.
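The random spatial split described above can be sketched as follows. This is a minimal illustration, not the study's code: the gauge identifiers, the function names `spatial_k_fold` and `train_test_splits`, and the round-robin way of forming folds are all assumptions made for the example, and the one-year temporal buffer is not modeled here.

```python
import random


def spatial_k_fold(gauge_ids, k=10, seed=0):
    """Randomly partition gauges into k disjoint folds (no gauge is reused)."""
    ids = list(gauge_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Deal the shuffled gauges round-robin so fold sizes differ by at most one.
    return [ids[i::k] for i in range(k)]


def train_test_splits(gauge_ids, k=10, seed=0):
    """Yield (train, test) gauge lists; every gauge is held out exactly once."""
    folds = spatial_k_fold(gauge_ids, k, seed)
    for i, test in enumerate(folds):
        train = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield train, test


# Hypothetical gauge identifiers, for illustration only.
gauges = [f"gauge_{n:04d}" for n in range(25)]
splits = list(train_test_splits(gauges, k=10))
```

Because each gauge appears in exactly one test fold, a model trained on the remaining folds never sees data from the gauge it is evaluated on, which is what makes the evaluation comparable to prediction in an ungauged basin.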
In the non-random spatial cross-validation experiments, the numbers of folds were as follows: (1) split across continents (k=6); (2) split across climate zones (k=13); (3) split across hydrologically separated basin groups (k=8).
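To make the asymmetric Laplace output mentioned under AI Model Construction more concrete, the sketch below evaluates the negative log-likelihood of one observation under such a distribution. The parametrization (location `mu`, scale `b`, asymmetry `tau` between 0 and 1) is a common textbook form and is an assumption here; the source does not state which parametrization or loss the study used, and the function names are hypothetical.

```python
import math


def check_loss(u, tau):
    """Pinball (quantile) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_nll(y, mu, b, tau):
    """Negative log-likelihood of y under an asymmetric Laplace distribution.

    Assumed density: f(y) = tau * (1 - tau) / b * exp(-check_loss((y - mu) / b, tau)).
    """
    if b <= 0 or not 0.0 < tau < 1.0:
        raise ValueError("need b > 0 and 0 < tau < 1")
    return math.log(b) - math.log(tau * (1.0 - tau)) + check_loss((y - mu) / b, tau)


def ald_pdf(y, mu, b, tau):
    """Density value, recovered from the negative log-likelihood."""
    return math.exp(-ald_nll(y, mu, b, tau))
```

In this form `mu` is the `tau`-quantile of the distribution, so values of `tau` below 0.5 put more probability mass above `mu`, which allows a long upper tail of the kind seen in river flow. A network trained this way would output `mu`, `b`, and `tau` at each time step and minimize the summed negative log-likelihood.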
03
Research Results: AI Enhances Prediction Reliability
To measure the reliability of extreme-event predictions, the study used the methods above to evaluate precision, recall, and F1 score (the harmonic mean of precision and recall) for events of different recurrence intervals. The results are as follows:
Figure 1 | Differences in F1 scores between the AI model and GloFAS for same-day forecasts (lead time: 0 days) of two-year recurrence flood events (1984-2021). The AI model outperformed GloFAS at 70% of the gauges (N=3,673).
Figure 1 shows the global distribution of differences in F1 scores for same-day warnings (lead time: 0 days) of two-year recurrence flood events from 1984 to 2021 (N=3,360). Lead time is expressed as the number of days from the time of prediction, so a lead time of 0 days means the river flow forecast is for the same day. The AI model was superior to (superior or equivalent to) the GloFAS system at 64% (65%), 70% (73%), 60% (73%), and 49% (76%) of gauges for events with recurrence intervals of 1 year (N=3,638, P=6×10^-87, Cohen’s d=0.22), 2 years (N=3,673, P<3×10^-181, d=0.41), 5 years (N=3,360, P=8×10^-130, d=0.42), and 10 years (N=2,920, P<1×10^-66, d=0.33), respectively.
(1) Comparison of Forecast Accuracy for Different Recurrence Intervals
Figure 2 | Boxplots of precision (a) and recall (b) for same-day forecasts (0 days lead time) at different recurrence intervals. On average, the AI model is more reliable across all recurrence intervals. The precision of the AI model for five-year flood events is statistically indistinguishable from GloFAS’s precision for one-year events, while its recall for five-year events surpasses that of GloFAS for one-year events. The main text provides detailed explanations of the statistical tests. In all scenarios, GloFAS and the AI model were compared over the same set of gauges. GloFAS simulated data is sourced from climate data repositories.
(2) Comparison of Results for Different Forecast Lead Times
Figure 3 shows the distribution of F1 scores at lead times of up to 7 days for recurrence intervals from 1 to 10 years. At lead times of up to 5 days, the reliability (F1 score) of the AI predictions is better than, or not significantly different from, that of GloFAS’s same-day forecasts (0 days lead time) for events with recurrence intervals of 1 year (AI significantly better; N=2,415, P=6×10^-6, d=0.08), 2 years (no statistical difference; N=2,162, P=0.98, d=2×10^-4), and 5 years (no statistical difference; N=1,298, P=0.69, d=0.025).
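The precision, recall, and F1 scores reported in this section can be illustrated with a small sketch. It assumes a flood event is any time step on which flow meets or exceeds a recurrence-interval threshold; the threshold value, the flow series, the function names, and the choice to return 0.0 when a denominator is zero are all assumptions made for the example, not details taken from the study.

```python
def exceedance_events(flows, threshold):
    """Mark each time step as an event when flow meets or exceeds the threshold."""
    return [q >= threshold for q in flows]


def precision_recall_f1(observed, predicted):
    """Compute precision, recall, and F1 from two equal-length boolean series."""
    tp = sum(o and p for o, p in zip(observed, predicted))        # hits
    fp = sum((not o) and p for o, p in zip(observed, predicted))  # false alarms
    fn = sum(o and (not p) for o, p in zip(observed, predicted))  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical daily flows and a hypothetical two-year recurrence threshold.
observed_flow = [120, 480, 510, 90, 530, 100]
predicted_flow = [130, 495, 300, 520, 540, 95]
threshold = 450

obs_events = exceedance_events(observed_flow, threshold)
pred_events = exceedance_events(predicted_flow, threshold)
precision, recall, f1 = precision_recall_f1(obs_events, pred_events)
```

In this toy series the forecast catches two of the three observed events and raises one false alarm, so precision, recall, and F1 all equal 2/3. Because F1 is a harmonic mean, it stays low unless precision and recall are both high.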
Figure 3 | Distribution of F1 scores over all evaluated gauges at different lead times and recurrence intervals (a-d). The AI model’s F1 scores for events with recurrence intervals of 1 year (a), 2 years (b), 5 years (c), and 10 years (d) at lead times of up to 5 days are either better than or not statistically different from those of GloFAS for the same events at 0 days lead time. The main text reports the statistical tests. The boxes represent the interquartile range of the distribution, while the whiskers represent the full range excluding outliers.
(3) Comparison of Forecast Results Across Continents
Figure 4 | Boxplots of F1 Score Distribution for Different Continents and Recurrence Intervals (a-d). The AI model scored higher across all continents for events with recurrence intervals of 1 year (a), 2 years (b), 5 years (c), and 10 years (d), with three exceptions: one-year recurrence events in Africa and five-year and ten-year recurrence events in Asia, where no statistical differences were observed. There are significant geographic differences in reliability between the two models, which can be addressed by increasing global access to open hydrological data. The main text reports the statistical tests. The boxes represent the interquartile range of the distribution, while the whiskers represent the full range excluding outliers. GloFAS simulated data comes from climate data repositories.
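The effect sizes (Cohen’s d) quoted alongside the P values in this section measure how large the difference between two sets of scores is relative to their spread. The sketch below uses the classic pooled-standard-deviation form; the source does not say which variant of Cohen’s d the study computed, so this choice, the function name, and the sample scores are assumptions for illustration only.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d with a pooled sample standard deviation: (mean(a) - mean(b)) / s."""
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    pooled = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    if pooled == 0:
        raise ValueError("pooled standard deviation is zero")
    return (statistics.mean(a) - statistics.mean(b)) / pooled


# Hypothetical per-gauge F1 scores for two models, for illustration only.
model_a_f1 = [0.6, 0.7, 0.8]
model_b_f1 = [0.5, 0.6, 0.7]
d = cohens_d(model_a_f1, model_b_f1)
```

Here the two score sets differ by one pooled standard deviation, so d is approximately 1.0. By the usual rule of thumb, values near 0.2 are considered small effects and values near 0.5 medium, which gives a sense of scale for the d values between 0.08 and 0.42 reported above.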
04
Conclusion and Discussion (Credibility and Breakthrough)
Although hydrological simulation is a relatively mature research field, the areas most susceptible to flood risks often lack reliable forecasting and warning systems. By leveraging artificial intelligence and open datasets, this study significantly improves the precision, recall, and early warning period of short-term (0-7 days) predictions of extreme flood events.
Figure 5 | Comparison of model results with averages at any given location (a: GloFAS model; b: AI model); c: correlation of F1 scores with HydroATLAS basin attributes.
Figure 6 | Spatial distribution of F1 scores for two-year recurrence flood events at global scale using the AI model. This map shows the AI model’s F1 scores for two-year recurrence flood events across 1,030,000 level-12 HydroBASINS basins.
- Credibility: By comparing with existing global flood warning systems (such as GloFAS), the AI model demonstrates at least comparable accuracy in predictions, and even superior performance. This comparative result confirms the high credibility of the AI model in predicting floods in basins without observed hydrological data.
- Timeliness: The AI model’s forecasts at lead times of up to five days are as reliable as the same-day forecasts (0 days lead time) of the traditional system, providing a valuable time window for disaster prevention and mitigation.
- Global Applicability: This study overcomes the challenges of incomplete or missing gauge data, enabling accurate flood predictions in basins without hydrological observation data worldwide, which is particularly significant for developing countries.
In summary, this research not only highlights the potential application of AI in global flood forecasting but also provides an effective approach for accurate flood predictions in basins without hydrological data, breaking the limitations of traditional flood forecasting models and offering new directions for improving global flood management and forecasting systems.
Source: International Conference on Flood Management
Editor: Xu Puqiong
First Review: Du Zhongying
Second Review: Zhang Jixiao