A recent study from MIT suggests that the largest and most computationally intensive AI models may soon offer diminishing returns compared to smaller models. By mapping scaling laws against continued improvements in model efficiency, the researchers found that wringing further performance leaps out of gigantic models could become increasingly arduous, while efficiency gains could make models running on more modest hardware increasingly capable over the next decade.
“It’s very likely that things will start to narrow in the next five to 10 years,” says Neil Thompson, a computer scientist and MIT professor involved in the study.
Leaps in efficiency, like those demonstrated in January by DeepSeek’s ultra-low-cost model, have already served as a reality check for an artificial intelligence industry accustomed to burning massive amounts of computing power.
As things stand, a frontier model from a company like OpenAI is significantly better than a model trained with a fraction of the compute in an academic lab. The MIT team’s prediction may not hold if, for example, new training methods such as reinforcement learning produce surprising new results, but it suggests that the giant AI companies will have less of an edge in the future.
Hans Gundlach, the MIT research scientist who led the analysis, became interested in the issue because of the unwieldy nature of running state-of-the-art models. Together with Thompson and Jayson Lynch, another MIT researcher, he mapped out the future performance of frontier models compared with those built using more modest computational means. Gundlach says the predicted trend is particularly pronounced for the reasoning models currently in vogue, which rely more heavily on extra computation during inference.
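To see why such a gap might narrow, consider a back-of-the-envelope sketch in Python. This is not the study’s actual model; the power-law constants, compute budgets, and growth rates below are purely illustrative assumptions. But it captures the basic mechanism: scaling laws deliver diminishing returns, while algorithmic efficiency gains multiply everyone’s effective compute.

```python
# Toy illustration, NOT the MIT study's model: compare a frontier lab with a
# modest lab under a power-law scaling law, loss = A * C^(-alpha).
# Every constant below is an illustrative assumption.

A, ALPHA = 400.0, 0.05          # hypothetical scaling-law constants
FRONTIER_GROWTH = 4.0           # assumed frontier compute growth per year
MODEST_GROWTH = 2.0             # assumed modest-lab compute growth per year
ALGO_GAIN = 3.0                 # assumed algorithmic progress: 3x effective compute/year

frontier_c = 1e25               # illustrative starting training compute (FLOPs)
modest_c = 1e21

def loss(effective_compute: float) -> float:
    """Power-law scaling: more compute lowers loss, with diminishing returns."""
    return A * effective_compute ** (-ALPHA)

for year in range(11):
    eff = ALGO_GAIN ** year     # efficiency gains benefit both labs equally
    gap = loss(modest_c * MODEST_GROWTH ** year * eff) - \
          loss(frontier_c * FRONTIER_GROWTH ** year * eff)
    print(f"year {year:2d}: frontier lead (loss gap) = {gap:.2f}")
```

Under these toy assumptions the loss gap shrinks year over year even though the frontier lab’s compute budget grows faster, because each additional order of magnitude of compute buys a smaller improvement.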
Thompson says the results show the value of honing algorithms as well as scaling up compute. “If you’re spending a lot of money training these models, you should definitely allocate some of that money to developing more efficient algorithms, because that can make a huge difference,” he adds.
The study is especially intriguing given today’s AI infrastructure boom (or should we say “bubble”?), which shows no signs of slowing down.
OpenAI and other US technology companies have signed hundred-billion-dollar deals to build artificial intelligence infrastructure in the United States. “The world needs much more computing power,” OpenAI president Greg Brockman proclaimed this week while announcing a partnership between OpenAI and Broadcom for custom AI chips.
A growing number of experts question the soundness of these deals. Roughly 60 percent of the cost of building a data center goes to graphics processors, which depreciate quickly. Partnerships among the major players also appear circular and opaque.
