GenCast, a new artificial intelligence model from Google DeepMind, is accurate enough to compete with traditional weather forecasting. Newly published research shows that it outperformed a leading forecast model when tested on 2019 data.
Artificial intelligence won’t replace traditional forecasting any time soon, but it could add to the arsenal of tools used to predict weather and warn the public about severe storms. GenCast is one of several AI weather forecasting models in development that could lead to more accurate predictions.
“Weather basically affects every aspect of our lives… it’s also one of the biggest challenges in science, predicting the weather,” says Ilan Price, senior scientist at DeepMind. “Google DeepMind is on a mission to advance artificial intelligence for the benefit of humanity. I think this is an important way and an important contribution in this regard.”
Price and his colleagues tested GenCast against ENS, one of the world’s best forecast models, which is run by the European Centre for Medium-Range Weather Forecasts (ECMWF). GenCast outperformed ENS 97.2 percent of the time, according to research published this week in the journal Nature.
GenCast is a machine learning weather forecasting model trained on weather data from 1979 to 2018. The model learns to recognize patterns in four decades of historical data and uses them to predict what might happen in the future. This is very different from how traditional models like ENS work, which still rely on supercomputers to solve complicated equations that simulate atmospheric physics. Both GenCast and ENS produce ensemble forecasts, which offer a range of possible scenarios.
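To illustrate what it means for a forecast to offer several possible scenarios, here is a toy sketch in Python. It is not how GenCast or ENS work internally: the function name, the starting temperature, and the random daily drift are all made up for illustration. The point is only that many slightly different runs can be summarized as an average, a spread, and a probability.

```python
import random
import statistics


def toy_ensemble(n_members=50, start_temp_c=15.0, days=15, seed=0):
    """Return toy temperature trajectories, one list per ensemble member.

    Hypothetical stand-in for a real forecast model: each member starts
    from a slightly perturbed state and then drifts randomly day by day.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        temp = start_temp_c + rng.gauss(0, 0.5)  # perturbed starting state
        path = []
        for _ in range(days):
            temp += rng.gauss(0, 1.0)  # random drift, not real physics
            path.append(temp)
        members.append(path)
    return members


members = toy_ensemble()
final_day = [path[-1] for path in members]

mean_temp = statistics.mean(final_day)    # the "best guess"
spread = statistics.stdev(final_day)      # how much the scenarios disagree
chance_above_20 = sum(t > 20.0 for t in final_day) / len(final_day)

print(f"Day 15 mean: {mean_temp:.1f} C, spread: {spread:.1f} C")
print(f"Share of scenarios above 20 C: {chance_above_20:.0%}")
```

A wide spread signals low confidence, while a tight cluster of scenarios signals that the members largely agree.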
For example, when it came to predicting the path of a tropical cyclone, GenCast was able to provide an additional 12 hours of warning on average. GenCast was generally better at predicting cyclone tracks, extreme weather and wind energy production up to 15 days in advance.
One caveat is that GenCast was tested against an older version of ENS, which now runs at a higher resolution. The peer-reviewed study compared GenCast forecasts with ENS forecasts for 2019, examining how close each model came to actual conditions that year. According to ECMWF machine learning coordinator Matt Chantry, the ENS system has improved significantly since 2019. That makes it hard to say how well GenCast would perform against ENS today.
Of course, resolution is not the only important factor when it comes to making accurate predictions. ENS was already running at a slightly higher resolution than GenCast in 2019, and GenCast still managed to beat it. DeepMind says it ran similar studies on data from 2020 to 2022 and found similar results, although these have not been peer-reviewed. However, it did not have the data to make comparisons for 2023, when ENS began running at a much higher resolution.
By dividing the world into a grid, GenCast operates at a resolution of 0.25 degrees, which means that each square of that grid is a quarter of a degree of latitude and a quarter of a degree of longitude. For comparison, in 2019 ENS used a resolution of 0.2 degrees and currently it is 0.1 degrees.
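A quick back-of-the-envelope calculation shows what those resolution figures mean in practice. The Python sketch below assumes a simple regular latitude-longitude grid and a spherical Earth with a mean radius of 6,371 km; real forecast systems may lay out their grids differently, so the numbers are only indicative. The function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def grid_cells(resolution_deg):
    """Number of cells in a regular lat-lon grid covering the globe."""
    n_lat = round(180 / resolution_deg)
    n_lon = round(360 / resolution_deg)
    return n_lat * n_lon


def cell_width_km(resolution_deg, latitude_deg=0.0):
    """East-west width of one grid cell at the given latitude, in km."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return resolution_deg * km_per_degree * math.cos(math.radians(latitude_deg))


# Resolutions mentioned in the article: GenCast, ENS in 2019, ENS today.
for res in (0.25, 0.2, 0.1):
    print(f"{res} degrees: {grid_cells(res):,} cells, "
          f"about {cell_width_km(res):.1f} km wide at the equator")
```

Under these assumptions, going from 0.25 to 0.1 degrees multiplies the number of grid cells by more than six, which gives a sense of why higher resolution costs more computing power.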
Nevertheless, the development of GenCast “represents a significant milestone in the evolution of weather forecasting,” Chantry said in an emailed statement. Alongside ENS, ECMWF says it also runs its own version of a machine learning system. Chantry says it “draws inspiration from GenCast.”
Speed is GenCast’s advantage. It can generate one 15-day forecast in just eight minutes using one Google Cloud TPU v5. Physics-based models like ENS can take several hours to do the same. GenCast bypasses all the equations that ENS must solve, so it takes less time and computational power to produce a forecast.
“From a computational standpoint, traditional forecasting is orders of magnitude more expensive to run compared to a model like GenCast,” Price says.
This efficiency could alleviate some concerns about the environmental impact of energy-intensive AI data centers, which have already contributed to Google’s greenhouse gas emissions in recent years. However, it’s challenging to assess how GenCast stacks up against physics-based models when it comes to sustainability without knowing how much energy is spent training the machine learning model.
There are still improvements that GenCast could make, including potentially scaling up to higher resolutions. Moreover, GenCast generates forecasts at 12-hour intervals, whereas traditional models typically do so at shorter intervals. This may affect how these forecasts are used in the real world (for example, to assess wind energy availability).
“You want to know what’s going to happen with the wind throughout the day, not just at 6 a.m. and 6 p.m.,” says Stephen Mullens, an assistant professor of meteorology at the University of Florida who was not involved in the GenCast study.
While there is growing interest in using artificial intelligence to improve forecasts, it has yet to prove itself. “People are looking at it. I don’t think the entire meteorological community is bought and sold on this,” Mullens says. “We are trained scientists who think in terms of physics… and since artificial intelligence is not really that, there is still an element where we wonder, is this a good thing? And why?”
Forecasters can check GenCast for themselves; DeepMind has released the code for its open-source model. Price says he sees GenCast and improved AI models being used in the real world alongside traditional models. “Once these models get into the hands of practitioners, they will further build trust,” Price says. “We really want this to have a broad social impact.”