Thursday, April 23, 2026

AI system learns to keep warehouse robots moving smoothly


In a giant autonomous warehouse, hundreds of robots zip through the aisles, collecting and distributing products to fulfill a constant stream of customer orders. In this busy environment, even small traffic jams or minor collisions can snowball into massive slowdowns.

To avoid this avalanche of inefficiencies, researchers at MIT and the technology company Symbotic have developed a new method that automatically keeps a fleet of robots moving smoothly. Their method learns which robots should go first at any given moment based on how congestion is forming, and adapts to prioritize robots that are about to get stuck. This way, the system can reroute robots in advance to avoid bottlenecks.

The hybrid system uses deep reinforcement learning, a powerful artificial intelligence method for solving complex problems, to figure out which robots should be prioritized. A fast and reliable planning algorithm then sends instructions to the robots, enabling them to respond quickly to constantly changing conditions.

In simulations inspired by real e-commerce warehouse layouts, this new approach achieved a gain of about 25 percent in throughput compared to other methods. Importantly, the system can quickly adapt to new environments with different numbers of robots or different warehouse layouts.

“There are many decision-making problems in manufacturing and logistics where companies rely on algorithms developed by human experts. But we have shown that with the power of deep reinforcement learning we can achieve superhuman efficiency. This is a very promising approach, because in these giant warehouses, even a 2 or 3 percent increase in throughput can have a huge impact,” says Han Zheng, a graduate student at MIT’s Laboratory for Information and Decision Systems (LIDS) and lead author of a paper on this new approach.

Zheng is joined on the paper by Yining Ma, a postdoc at LIDS; Brandon Araki and Jingkai Chen of Symbotic; and senior author Cathy Wu, the Class of 1954 Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS) at MIT, and a member of LIDS. The research appears today in .

Robot redirection

Coordinating hundreds of robots in an e-commerce warehouse at the same time is no easy task.

The problem is even more complicated because the warehouse is a dynamic environment, and robots constantly receive new tasks after reaching their goals. They need to be rerouted quickly as they leave and enter the warehouse floor.

Companies often use algorithms written by experts to determine where and when robots should move to maximize the number of packages they can handle.

However, if congestion or collisions occur, the company may have no choice but to shut down the entire warehouse for hours to manually resolve the issue.

“In this setting, we don’t have an accurate forecast of the future. We only know what the future may hold in terms of incoming packages or the distribution of future orders. The planning system must adapt to these changes as warehouse operations continue,” says Zheng.

The MIT researchers achieved this adaptability with machine learning. They started by designing a neural network model that observes the warehouse environment and decides how to prioritize the robots. They trained this model using deep reinforcement learning, a trial-and-error method in which the model learns to control robots in simulations that mimic real warehouses. The model is rewarded for making decisions that increase overall throughput while avoiding conflicts.
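As a rough illustration of that reward signal, the sketch below scores one simulation step by counting deliveries and subtracting a penalty for each conflict, then sums the steps into a discounted return of the kind reinforcement learning typically maximizes. The function names, the penalty weight, and the discount factor are hypothetical choices for illustration; the article does not give the researchers' actual reward formulation.

```python
def step_reward(deliveries: int, conflicts: int, conflict_penalty: float = 5.0) -> float:
    """Reward for one simulation step: deliveries earned minus a conflict penalty.

    conflict_penalty is an assumed weight, not a value from the study.
    """
    return float(deliveries) - conflict_penalty * conflicts


def discounted_return(rewards, gamma: float = 0.99) -> float:
    """Sum per-step rewards, weighting later steps by powers of gamma."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three hypothetical steps: the second one contains a conflict.
episode = [step_reward(2, 0), step_reward(3, 1), step_reward(4, 0)]
print(episode, discounted_return(episode, gamma=0.9))
```

With a weighting like this, a decision that delivers more packages but causes a conflict can score worse than a slower, conflict-free one, which is the trade-off the passage describes.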

Over time, the neural network learns to effectively coordinate the work of many robots.

“By interacting with simulations inspired by real warehouse layouts, our system receives feedback that we use to make more intelligent decisions. The trained neural network can then adapt to warehouses with different layouts,” explains Zheng.

The neural network is designed to capture long-term constraints and obstacles in each robot’s path, while also taking into account the dynamic interactions between robots as they move through the warehouse.

By predicting current and future robot interactions, the model plans to avoid congestion before it occurs.

Once the neural network decides which robots should receive priority, the system uses a tried-and-true planning algorithm to tell each robot how to get from one point to the next. This efficient algorithm helps robots respond quickly in a changing warehouse environment.
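The article does not name the planning algorithm, so the following is only a generic sketch of prioritized planning on a small grid. Robots are planned one at a time in a given order, and each later robot must avoid the cells and moves already reserved by earlier ones. The grid, the robot names, the hard-coded priority orders (standing in for the neural network's decision), and the simplification that a robot leaves the floor on reaching its goal are all assumptions made for illustration.

```python
from collections import deque

MOVES = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))  # wait, down, up, right, left


def plan_path(grid, start, goal, cells, edges, max_t=20):
    """Space-time BFS: shortest path that avoids reserved cells and moves."""
    queue = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while queue:
        cell, t, path = queue.popleft()
        if cell == goal:
            return path
        if t == max_t:
            continue
        for dr, dc in MOVES:
            nxt = (cell[0] + dr, cell[1] + dc)
            inside = 0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
            if not inside or grid[nxt[0]][nxt[1]] == 1:
                continue  # off the map or a shelf
            if (nxt, t + 1) in cells or (nxt, cell, t) in edges:
                continue  # cell already taken, or head-on swap with another robot
            if (nxt, t + 1) not in seen:
                seen.add((nxt, t + 1))
                queue.append((nxt, t + 1, path + [nxt]))
    return None  # no conflict-free path within max_t steps


def prioritized_plan(grid, robots, order):
    """Plan robots in priority order; earlier robots reserve space and time."""
    cells, edges, plans = set(), set(), {}
    for name in order:
        start, goal = robots[name]
        path = plan_path(grid, start, goal, cells, edges)
        plans[name] = path
        if path is None:
            continue
        for t, cell in enumerate(path):
            cells.add((cell, t))
            if t + 1 < len(path):
                edges.add((cell, path[t + 1], t))
    return plans


# A one-lane aisle with a side pocket below the left end (1 = shelf).
GRID = [[0, 0, 0],
        [0, 1, 1]]
ROBOTS = {"A": ((0, 0), (0, 2)), "B": ((0, 2), (0, 0))}

for order in (["B", "A"], ["A", "B"]):
    print(order, prioritized_plan(GRID, ROBOTS, order))
```

In this toy layout the two robots must swap ends of a single aisle. When B goes first, A can duck into the side pocket and both finish. When A goes first, B has nowhere to yield and no plan is found. The example is far simpler than a real warehouse, but it shows why the choice of who goes first can decide whether traffic flows or jams.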

This combination of methods is key.

“This hybrid approach builds on my group’s work on how to achieve the best of both worlds between machine learning and classical optimization methods. Pure machine learning methods still struggle to solve complex optimization problems, and yet it is extremely time- and labor-intensive for human experts to design effective methods. But together, using expert-designed methods in the right way can greatly simplify the machine learning task,” says Wu.

Overcoming complexity

After training the neural network, the researchers tested the system in simulated warehouses that were different from those it saw during training. Because industrial simulations were too inefficient for this complex problem, the researchers designed their own environments to mimic what happens in real warehouses.

Their hybrid learning approach achieved, on average, 25 percent higher throughput than traditional algorithms and a random search method, measured as the number of packages delivered per robot. Their approach could also generate feasible robot path plans that overcame the congestion caused by traditional methods.
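To make the metric concrete, the short sketch below computes packages delivered per robot and the relative gain between two methods. The package and robot counts are made-up numbers chosen only so the arithmetic lands on a 25 percent gain; they are not results from the study.

```python
def packages_per_robot(packages_delivered: int, num_robots: int) -> float:
    """Throughput expressed as packages delivered per robot."""
    return packages_delivered / num_robots


def relative_gain(new: float, baseline: float) -> float:
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# Hypothetical counts for a fleet of 100 robots over the same time window.
baseline = packages_per_robot(400, 100)  # traditional method
learned = packages_per_robot(500, 100)   # learned prioritization
print(f"{relative_gain(learned, baseline):.0%}")
```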

“Especially as the density of robots in a warehouse increases, the complexity increases exponentially and traditional methods quickly begin to fail. In such environments, our method is much more efficient,” says Zheng.

While their system is still a long way from real-world implementation, these demonstrations highlight the feasibility and benefits of using a machine learning approach to warehouse automation.

In the future, the researchers want to include task assignment in the problem formulation, because determining which robot will perform each task affects congestion. They also plan to scale their system up to larger warehouses with thousands of robots.

This research was funded by Symbotic.
