Saturday, March 14, 2026

Cutting cloud waste at scale: Akamai saves 70% using AI agents orchestrated by Kubernetes


Particularly in this era of generative AI, cloud costs are at an all-time high. But that is not only because enterprises are using more compute; they are also not using it efficiently. In fact, enterprises are expected to waste $44.5 billion on unnecessary cloud spending.

The problem is amplified for Akamai Technologies: The company has a large and complex cloud infrastructure spread across multiple clouds, not to mention numerous strict security requirements.

To solve this, the cybersecurity and content delivery provider turned to the Kubernetes automation platform Cast AI, whose AI agents help optimize cost, security and speed across cloud environments.

Ultimately, the platform helped Akamai cut between 40% and 70% of its cloud costs, depending on the workload.

“We needed a continuous way to optimize our infrastructure and reduce our cloud costs without sacrificing performance,” Dekel Shavit, senior director of cloud engineering at Akamai, told VentureBeat. “We're the ones processing security events. Delay is not an option. If we're not able to respond to a security attack in real time, we have failed.”

Specialized agents that monitor, analyze and act

Kubernetes manages the infrastructure that runs applications, making them easier to deploy, scale and manage, particularly in cloud-native and microservices architectures.

Cast AI has integrated into the Kubernetes ecosystem to help customers scale their clusters and workloads, select the best infrastructure and manage compute lifecycles, explained founder and CEO Laurent Gil. Its core platform is Application Performance Automation (APA), which operates through a team of specialized agents that continuously monitor, analyze and take action to improve application performance, security, efficiency and cost. Companies provision only the compute they need from AWS, Microsoft, Google or others.

APA is powered by several machine learning (ML) models with reinforcement learning (RL) based on historical data and learned patterns, enhanced by an observability stack and heuristics. It is coupled with infrastructure-as-code (IaC) tools on multiple clouds, which makes it a completely automated platform.

Gil explained that APA was built on the principle that observability is only a starting point; as he put it, observability is “a foundation, not a goal.” Cast AI also supports incremental adoption, so customers do not have to rip and replace; they can integrate it into existing tools and workflows. In addition, nothing ever leaves customer infrastructure; all analysis and actions occur within their dedicated Kubernetes clusters, providing greater security and control.

Gil also emphasized the importance of keeping humans at the center. “Automation complements human decision-making,” he said, with APA keeping humans in the workflow.

Akamai's unique challenges

Shavit explained that Akamai's large and complex cloud infrastructure powers content delivery network (CDN) and cybersecurity services delivered to “the most demanding customers and industries in the world,” while complying with strict service level agreements (SLAs) and performance requirements.

He noted that for some of the services Akamai consumes, it is probably the vendor's largest customer, adding that the company has done “a lot of core engineering and reengineering” together with its hyperscaler to support its needs.

In addition, Akamai serves customers of various sizes and industries, including large financial institutions and credit card companies. The company's services are directly tied to its customers' security posture.

Ultimately, Akamai had to balance all this complexity against cost. Shavit noted that real-life attacks on customers can drive capacity demand up 100X or 1,000X on specific components of its infrastructure. But “scaling our cloud capacity by 1,000X in advance just isn't financially feasible,” he said.

His team considered optimizing on the code side, but the inherent complexity of their business model required focusing on the core infrastructure itself.

Automatically optimizing the entire Kubernetes infrastructure

What Akamai really needed was a Kubernetes automation platform that could optimize the cost of running its entire core infrastructure in real time across several clouds, Shavit explained, and scale applications up and down based on constantly changing demand. But all this had to be done without sacrificing application performance.
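To make "scaling applications up and down based on demand" concrete, the sketch below reproduces the replica calculation that Kubernetes documents for its built-in Horizontal Pod Autoscaler. It is an illustration only: the article does not describe how Cast AI's agents decide to scale, and the function name, the floor of one replica and the sample numbers are assumptions made for this example.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Replica count following the documented HPA rule:
    ceil(current_replicas * current_metric / target_metric).

    The tolerance band (0.1 is the Kubernetes default) suppresses scaling
    when the metric is already close to the target. The floor of one
    replica is an assumption of this sketch.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: leave as is
    return max(1, math.ceil(current_replicas * ratio))


# Hypothetical numbers: 4 replicas at 90% average CPU against a 60% target.
scaled_up = desired_replicas(4, 90.0, 60.0)
# Hypothetical numbers: 10 replicas at 20% average CPU against a 60% target.
scaled_down = desired_replicas(10, 20.0, 60.0)
```

A rule like this reacts to load after it arrives, which is part of why a 100X or 1,000X spike is hard to absorb without either pre-provisioned capacity or very fast automation.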

Before implementing Cast, Shavit noted, Akamai's DevOps team manually tuned all of its Kubernetes workloads just a few times a month. Given the scale and complexity of the infrastructure, this was challenging and costly. By analyzing workloads only sporadically, they clearly missed any real-time optimization potential.

“Now, hundreds of Cast agents do the same tuning, except they do it every second of every day,” said Shavit.

The core APA features Akamai uses are Kubernetes automation with bin packing (minimizing the number of bins used), automatic selection of the most cost-efficient compute instances, workload rightsizing, spot instance automation throughout the entire instance lifecycle and cost analytics.
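Bin packing is a classic optimization problem, and a small sketch helps show what "minimizing the number of bins used" means for a cluster. The example below uses the textbook first-fit-decreasing heuristic, treating pod CPU requests as items and nodes as bins. It is not Cast AI's algorithm, which the article does not describe; the function name, the single-resource model and the request values are all assumptions for illustration.

```python
from typing import List


def first_fit_decreasing(requests: List[float], capacity: float) -> List[List[float]]:
    """Place CPU requests onto identical nodes, largest request first.

    Each request goes onto the first node with enough room; a new node is
    opened only when none fits. This is a heuristic, so the result is not
    guaranteed to be the true minimum number of nodes.
    """
    if any(r > capacity for r in requests):
        raise ValueError("a request exceeds node capacity")
    nodes: List[List[float]] = []  # requests placed on each node
    free: List[float] = []         # remaining capacity of each node
    for r in sorted(requests, reverse=True):
        for i, room in enumerate(free):
            if r <= room + 1e-9:   # small epsilon guards float rounding
                nodes[i].append(r)
                free[i] -= r
                break
        else:
            nodes.append([r])      # no existing node fits: open a new one
            free.append(capacity - r)
    return nodes


# Hypothetical pod CPU requests (in vCPUs) and a hypothetical 4-vCPU node size.
pod_cpu_requests = [2.0, 0.5, 1.5, 3.0, 1.0, 2.5, 0.5, 1.0]
packed = first_fit_decreasing(pod_cpu_requests, capacity=4.0)
```

Real schedulers have to pack along several dimensions at once (CPU, memory, affinity rules, instance pricing), which is one reason doing this continuously by hand does not scale.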

“We got insight into cost analytics two minutes into the integration, which is something we'd never seen before,” said Shavit. “Once the active agents were deployed, the optimization kicked in automatically and the savings started to come in.”

Spot instances, which let enterprises access unused cloud capacity at discounted prices, obviously made business sense, but they turned out to be complicated because of Akamai's complex workloads, particularly Apache Spark. This meant the team had to either overengineer workloads or put more working hands on them, which did not make financial sense.

With Cast AI, they could use spot instances on Spark with “zero investment” from the engineering or operations teams. The value of spot instances was “super clear”; they just needed to find the right tool to be able to use them. Shavit noted that this was one of the reasons they went with Cast.

While saving 2X or 3X on the cloud bill is great, Shavit pointed out that automation without manual intervention is “priceless.” It has resulted in “massive” time savings.
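The "2X or 3X" figure and the 40% to 70% range quoted earlier describe the same savings in two different units. A minimal sketch of the conversion follows, assuming that "saving 2X" means the bill shrinks to one half of its previous size; the article does not spell out that interpretation, and the function names are hypothetical.

```python
def reduction_from_factor(factor: float) -> float:
    """Fractional cost reduction when a bill shrinks by `factor` (2.0 means halved)."""
    return 1.0 - 1.0 / factor


def factor_from_reduction(reduction: float) -> float:
    """Shrink factor implied by a fractional cost reduction (0.5 means 50% lower)."""
    return 1.0 / (1.0 - reduction)


half = reduction_from_factor(2.0)    # 0.5, a 50% reduction
third = reduction_from_factor(3.0)   # about 0.667, a 67% reduction
low = factor_from_reduction(0.40)    # about 1.67X
high = factor_from_reduction(0.70)   # about 3.33X
```

Under that reading, a 2X to 3X saving corresponds to a 50% to 67% reduction, which sits inside the 40% to 70% range reported for different workloads.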

Before implementing Cast AI, his team was “constantly moving around knobs and switches” to ensure that their production environments and customers were up to par with the service they needed to invest in.

“The biggest benefit is the fact that we no longer have to manage our infrastructure,” said Shavit. “The team of Cast agents does it for us now. This has freed our team up to focus on what matters most: releasing features to our customers faster.”

Editor's note: At this month's VB Transform, Google Cloud CTO Will Grannis and Highmark Health SVP and chief analytics officer Richard Clarke will discuss the new AI stack in healthcare and the real-world challenges of deploying multi-model AI systems in a complex, regulated environment. Register today.
