Thursday, March 19, 2026

AI is an “energy hog,” but DeepSeek could change that


DeepSeek stunned everyone last month with the claim that its AI model uses roughly one tenth the amount of computing power of Meta’s Llama 3.1 model, upending an entire worldview of how much energy and resources it will take to develop artificial intelligence.

Taken at face value, that claim could have tremendous implications for the environmental impact of AI. Tech giants are rushing to build out massive AI data centers, with plans for some to use as much electricity as small cities. Generating that much electricity creates pollution, raising fears about how the physical infrastructure underpinning new generative AI tools could exacerbate climate change and worsen air quality.

Reducing how much energy it takes to train and run generative AI models could alleviate much of that stress. But it is still too early to gauge whether DeepSeek will be a game changer when it comes to AI’s environmental footprint. Much will depend on how other major players respond to the Chinese startup’s breakthroughs, especially considering their plans to build new data centers.

“There’s a choice in the matter.”

“It just shows that AI doesn’t have to be an energy hog,” says Madalsa Singh. “There’s a choice in the matter.”

The fuss around DeepSeek began in December with the release of its V3 model, which required just 2.78 million GPU hours of training on Nvidia’s older H800 chips, at an estimated final training cost of $5.6 million, according to a technical report from the company. For comparison, Meta’s Llama 3.1 405B, despite using newer, more efficient H100 chips, took about 30.8 million GPU hours to train. (We don’t know the exact costs, but estimates for Llama 3.1 405B land around $60 million, and between $100 million and $1 billion for comparable models.)

DeepSeek then released its R1 model last week, which venture capitalist Marc Andreessen called “a profound gift to the world.” The company’s AI assistant quickly shot to the top of Apple’s and Google’s app stores. And on Monday, it sent competitors’ stock prices into a nosedive on the assumption that DeepSeek was able to create an alternative to Llama, Gemini, and ChatGPT for a fraction of the budget. Nvidia, whose chips enable all these technologies, saw its stock price plummet on news that DeepSeek’s V3 needed only about 2,000 chips to train, compared to the 16,000 or more that competitors needed.

DeepSeek says it was able to cut down on how much electricity it consumes by using more efficient training methods. In technical terms, Singh says, it boils down to being more selective about which parts of the model are trained; you don’t have to train the entire model at the same time. If you think of the AI model as a big customer service firm with many experts, Singh says, it’s more selective in choosing which experts to tap.
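The "customer service firm" analogy describes a mixture-of-experts design: a routing step picks a few specialist sub-networks per input, and the rest stay idle. The toy sketch below illustrates only that routing idea; the expert and gate functions, the 8-expert/top-2 split, and all numbers are hypothetical stand-ins, not DeepSeek's actual architecture.

```python
NUM_EXPERTS = 8
TOP_K = 2  # only 2 of the 8 experts run for any given input

def gate(x):
    """Score every expert for this input and pick the top-k.
    A stand-in for a learned routing network."""
    scores = [(x * (i + 1)) % 7 for i in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)
    return ranked[:TOP_K]

def expert(i, x):
    """A stand-in for one expert sub-network's computation."""
    return x + i

def moe_layer(x):
    chosen = gate(x)
    # Only the chosen experts do any compute; the other 6 cost nothing.
    output = sum(expert(i, x) for i in chosen) / TOP_K
    return output, chosen

out, used = moe_layer(5)
print(f"ran {len(used)} of {NUM_EXPERTS} experts")
```

The energy argument is visible in the structure: per input, compute scales with `TOP_K` rather than `NUM_EXPERTS`, so capacity can grow without a proportional growth in work done.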

The model also saves energy when it comes to inference, which is when the model is actually doing what it was designed to do, through what’s called key value caching and compression. If you’re writing a story that requires research, you can think of this method as being able to reference index cards with high-level summaries as you write, rather than having to reread the entire report that was summarized, Singh explains.
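The "index cards" analogy can be made concrete with a minimal key-value cache sketch. This is a generic illustration of caching during token-by-token generation, not DeepSeek's specific compression scheme; the `encode` function and the counter are hypothetical stand-ins for the expensive per-token work a real model would cache.

```python
class KVCache:
    """Toy key-value cache: each token already seen gets one 'index card'
    (a key/value pair) that later steps reuse instead of recomputing."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.encodings = 0  # counts the expensive per-token work

    def encode(self, token):
        # Stand-in for the costly projection a real model performs per token.
        self.encodings += 1
        return hash(token) % 100, len(token)

    def step(self, token):
        # Only the NEW token is encoded; all earlier tokens come from the cache.
        k, v = self.encode(token)
        self.keys.append(k)
        self.values.append(v)
        # "Attend" over all cached entries: a cheap lookup, no re-encoding.
        return sum(self.values)

cache = KVCache()
for tok in ["the", "quick", "brown", "fox"]:
    cache.step(tok)

# 4 tokens, each encoded exactly once. Without a cache, step t would
# re-encode all t tokens seen so far (1 + 2 + 3 + 4 = 10 encodings).
print(cache.encodings)
```

Compression, in this picture, would further shrink what each "index card" stores, which is where additional memory and energy savings come from.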

Singh is especially optimistic that DeepSeek’s models are largely open source, minus the training data. With this approach, researchers can learn from each other faster, and it opens the door for smaller players to enter the industry. It also sets a precedent for more transparency and accountability, so that investors and consumers can be more critical of what resources go into developing a model.

“There’s a double-edged sword to consider”

“If we’ve shown that these advanced AI capabilities don’t require such massive resource consumption, it will open up a little bit more breathing room for more sustainable infrastructure planning,” Singh says. “This can also incentivize the established AI labs, like OpenAI, Anthropic, Google Gemini, to develop more efficient algorithms and techniques and move beyond the sort of brute force of simply adding more data and computing power to these models.”

There’s certainly still skepticism around DeepSeek. “We’ve done some digging on DeepSeek, but it’s hard to find any concrete facts about the program’s energy consumption,” said Carlos Torres Diaz, head of energy research at Rystad Energy.

If what the company claims about its energy use is true, that could slash a data center’s total energy consumption, Torres Diaz writes. And while big tech companies have signed a flurry of deals to procure renewable energy, soaring electricity demand from data centers still risks siphoning limited solar and wind resources from power grids. Reducing AI’s electricity consumption “would in turn make more renewable energy available for other sectors, helping displace faster the use of fossil fuels,” according to Torres Diaz. “Overall, less power demand from any sector is beneficial for the global energy transition, as less fossil-fueled power generation would be needed in the long term.”

There’s a double-edged sword to consider with more energy-efficient AI models, though. Microsoft CEO Satya Nadella wrote on X about the Jevons paradox, in which the more efficient a technology becomes, the more likely it is to be used. So environmental damage can grow as a result of efficiency gains.
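The arithmetic behind the Jevons paradox worry is simple enough to sketch. The numbers below are purely hypothetical, chosen only to show how induced demand can swamp an efficiency gain; they are not forecasts.

```python
def total_energy(baseline, efficiency_gain, demand_multiplier):
    """Energy used after an efficiency gain, given how much extra
    demand that cheaper capability induces."""
    return baseline / efficiency_gain * demand_multiplier

BASELINE = 100.0  # arbitrary units of energy for today's AI workload

# Efficiency alone: energy per unit of AI work drops 100x.
print(total_energy(BASELINE, efficiency_gain=100, demand_multiplier=1))

# Jevons scenario: the same 100x gain, but builders respond by
# deploying 1,000x more AI. Net energy use is 10x the baseline.
print(total_energy(BASELINE, efficiency_gain=100, demand_multiplier=1000))
```

The crossover is the point to watch: whenever demand grows faster than efficiency improves, total consumption rises even as each individual model gets cheaper to run.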

“The question is, gee, if we could drop the energy use of AI by a factor of 100, does that mean that there’d be 1,000 data providers coming in and saying, ‘Wow, this is great. We’re going to build, build, build 1,000 times as much even as we planned’?” says Philip Krein, research professor of electrical and computer engineering at the University of Illinois Urbana-Champaign. “It’ll be a really interesting thing to watch over the next 10 years.” Torres Diaz also said that this question makes it too early to revise power consumption forecasts “significantly down.”

No matter how much electricity a data center uses, it’s important to look at where that electricity comes from to understand how much pollution it creates. China still gets more than 60 percent of its electricity from coal, and another 3 percent comes from gas. The US also gets about 60 percent of its electricity from fossil fuels, but most of that comes from gas, which creates less carbon dioxide pollution when burned than coal.

To make matters worse, energy companies are delaying the retirement of fossil fuel power plants in the US. Some are even planning to build out new gas plants. Burning more fossil fuels inevitably leads to more of the pollution that causes climate change, as well as local air pollutants that raise health risks for nearby communities. Data centers also guzzle a lot of water to keep hardware from overheating, which can lead to more stress in drought-prone regions.

Those are all problems that AI developers can minimize by limiting energy use overall. Traditional data centers have been able to do so in the past. Even as workloads nearly tripled between 2015 and 2019, power demand managed to stay relatively flat during that time, according to Goldman Sachs Research. Data centers then grew much more power-hungry around 2020 with advances in AI. They consumed more than 4 percent of electricity in the US in 2023, and that could nearly triple to around 12 percent by 2028, according to a December report from the Lawrence Berkeley National Laboratory. There’s more uncertainty about those kinds of projections now, but calling any shots based on DeepSeek at this point is still a shot in the dark.
