Monday, December 23, 2024

Precision Home Robots Learn with Real-to-Sim-to-Real Training


At the top of the wish list for many people looking for automation is a particularly time-consuming task: household chores.

The goal of many roboticists is to create the right combination of hardware and software so that a machine can learn “general-purpose” policies (the rules and strategies that guide the robot’s behavior) that work anywhere, under all conditions. Realistically, if you have a home robot, you probably don’t care much whether it works for your neighbors. With that in mind, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) set out to find a solution, called RialTo, that makes it simple to train a robot with resilient policies for one very specific environment.

“Our goal is to give robots exceptional performance in the face of disturbances, distractions, changing lighting conditions, and changes in object positions, all in one environment,” says Marcel Torne Villasevil, an MIT CSAIL research assistant in the Improbable AI lab and lead author of a recent paper about the work. “We propose a method for creating digital twins on the fly, using the latest advances in computer vision. With just a phone, anyone can capture a digital replica of the real world, and robots can train in a simulated environment much faster than in the real world, thanks to GPU parallelization. Our approach eliminates the need for extensive reward engineering by using a few real-world demonstrations to kick-start the training process.”
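For readers who want the shape of that pipeline in concrete terms, here is a minimal Python sketch of the real-to-sim-to-real loop the quote describes. Every function, file name, and parameter below is an illustrative stub invented for this sketch, not RialTo’s actual interface:

```python
# Illustrative stubs only; RialTo's real interfaces are not shown in the article.

def scan_to_digital_twin(phone_capture):
    """Stage 1: reconstruct the scene from a phone scan (e.g., a NeRFStudio export)."""
    return {"mesh": "scene.obj", "source": phone_capture}

def replicate_demos_in_sim(twin, real_demos):
    """Stage 2a: replay the few real-world demonstrations inside the digital twin."""
    return [{"twin": twin["mesh"], "demo": d} for d in real_demos]

def rl_finetune(sim_demos, num_parallel_envs=2048):
    """Stage 2b: GPU-parallel simulation makes RL fine-tuning far cheaper
    than collecting more trials on the physical robot."""
    return {"policy": "trained", "envs": num_parallel_envs,
            "seeded_with": len(sim_demos)}

# Stage 3 would deploy the trained policy back on the real robot.
twin = scan_to_digital_twin("kitchen_walkthrough.mp4")
policy = rl_finetune(replicate_demos_in_sim(twin, real_demos=["grab_cup"] * 5))
print(policy)
```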

Taking the robot home

RialTo, of course, is a bit more complicated than just waving your phone and (boom!) your home bot is at your service. It starts with using the device to scan the target environment using tools like NeRFStudio, ARCode, or Polycam. Once the scene is reconstructed, users can upload it to the RialTo interface to make detailed adjustments, add the necessary joints to the robots, and more.
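As a rough illustration of what “adding joints” to a scanned scene might look like in data terms, here is a hypothetical sketch in Python. The `SceneTwin` and `Joint` types are invented for illustration and are not RialTo’s actual scene format:

```python
from dataclasses import dataclass, field

# Hypothetical data model for an articulated digital twin; invented for
# illustration, not RialTo's actual scene format.

@dataclass
class Joint:
    name: str
    joint_type: str   # "revolute" (a hinge) or "prismatic" (a slider)
    lower: float      # joint limits, in radians or meters
    upper: float

@dataclass
class SceneTwin:
    mesh_file: str                           # geometry exported from the phone scan
    joints: list = field(default_factory=list)

    def add_joint(self, joint: Joint) -> None:
        """The user marks movable parts (drawers, cabinet doors) after scanning."""
        self.joints.append(joint)

# Example: a kitchen scan where the user annotates one drawer and one cabinet door.
scene = SceneTwin(mesh_file="kitchen_scan.obj")
scene.add_joint(Joint("drawer", "prismatic", lower=0.0, upper=0.4))       # slides 40 cm
scene.add_joint(Joint("cabinet_door", "revolute", lower=0.0, upper=1.9))  # opens ~110 degrees
```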

The refined scene is exported and fed into the simulator. Here, the goal is to develop a policy based on real-world actions and observations, such as grabbing a cup on a countertop. These real-world demonstrations are replicated in the simulation, providing valuable data for reinforcement learning. “This helps create a strong policy that works well both in simulation and in the real world. An improved algorithm using reinforcement learning helps guide this process to ensure that the policy is effective when applied outside of the simulator,” Torne says.
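The core idea, seeding reinforcement learning with a handful of demonstrations so the policy does not start from scratch, can be shown on a toy problem. The sketch below uses tabular Q-learning on a one-dimensional task; RialTo’s actual algorithm operates on far richer observations, and everything here is a simplified stand-in:

```python
import random

# Toy sketch of demonstration-seeded RL; a simplified stand-in for the
# demo-bootstrapped fine-tuning the article describes.

N_STATES, N_ACTIONS = 10, 2          # move along a line; the goal is state 9
GOAL = N_STATES - 1

def step(state, action):
    """Action 1 moves right, action 0 moves left; reward only at the goal."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
ALPHA, GAMMA, EPS = 0.5, 0.95, 0.1

def update(s, a, r, s2):
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])

# 1) Kick-start: replay a few "real-world demonstrations" as experience.
#    Replaying each demo last-step-first propagates value backward in one pass.
demos = [[(s, 1) for s in range(GOAL)] for _ in range(3)]
for demo in demos:
    for s, a in reversed(demo):
        s2, r, _ = step(s, a)
        update(s, a, r, s2)

# 2) Fine-tune with epsilon-greedy RL in the cheap simulated environment.
for _ in range(200):
    s, done, steps = 0, False, 0
    while not done and steps < 100:             # cap episode length
        if random.random() < EPS:
            a = random.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        update(s, a, r, s2)
        s, steps = s2, steps + 1

print("greedy action per state:",
      [max(range(N_ACTIONS), key=lambda i: Q[s][i]) for s in range(N_STATES)])
```

The demonstrations give the learner a nonzero value signal from the very first episode, which is what lets the fine-tuning proceed without hand-engineered rewards.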

Tests showed that RialTo produced strong policies for a variety of tasks, both in controlled lab conditions and in more unpredictable real-world environments, improving by 67 percent over imitation learning with the same number of demonstrations. The tasks included opening a toaster, placing a book on a shelf, putting a plate on a stand, placing a cup on a shelf, opening a drawer, and opening a cabinet. For each task, the researchers tested the system’s performance at three increasing levels of difficulty: randomizing object poses, adding visual distractors, and applying physical perturbations during the task. When combined with real-world data, the system outperformed classic imitation learning methods, especially in settings with heavy visual distractions or physical disturbances.
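A hedged sketch of how such a tiered evaluation might be organized follows; the perturbation functions and the success model below are invented for illustration, not the study’s actual protocol:

```python
import random

# Hypothetical evaluation harness mirroring the article's three difficulty
# levels; every function here is an illustrative stand-in.

def randomize_object_pose(env):
    env["cup_xy"] = (random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3))

def add_visual_distractors(env):
    env["distractors"] = random.randint(1, 5)       # clutter added to the scene

def apply_physical_perturbation(env):
    env["push_force_n"] = random.uniform(0.0, 5.0)  # mid-task nudge, in newtons

LEVELS = {
    "level 1: randomized object poses":  [randomize_object_pose],
    "level 2: + visual distractors":     [randomize_object_pose,
                                          add_visual_distractors],
    "level 3: + physical perturbations": [randomize_object_pose,
                                          add_visual_distractors,
                                          apply_physical_perturbation],
}

def evaluate(policy, trials=50):
    """Print the success rate per difficulty level."""
    for name, perturbations in LEVELS.items():
        successes = 0
        for _ in range(trials):
            env = {}
            for perturb in perturbations:
                perturb(env)
            successes += policy(env)                # policy returns 1 or 0
        print(f"{name}: {successes / trials:.0%}")

# Stand-in policy whose success odds drop as perturbations stack up.
evaluate(lambda env: int(random.random() < 0.9 ** len(env)))
```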

“These experiments show that if we want high resilience in one specific environment, it’s best to use digital twins instead of trying to achieve resilience through large-scale data collection across many environments,” says Pulkit Agrawal, director of the Improbable AI Lab, assistant professor of electrical engineering and computer science (EECS) at MIT, CSAIL principal investigator, and senior author of the paper.

As for limitations, RialTo currently takes three days to train fully. To speed this up, the team suggests improving the underlying algorithms and using foundation models. Training in simulation also has its limits: seamless sim-to-real transfer remains difficult, as does simulating deformable objects or fluids.

Next level

What’s next for RialTo’s journey? Building on previous efforts, the researchers are working to preserve robustness to various perturbations while improving the model’s ability to adapt to new environments. “Our next endeavor is an approach that uses pre-trained models, speeds up the learning process, minimizes human input, and achieves broader generalization capabilities,” Torne says.

“We are incredibly enthusiastic about our concept of ‘on-the-fly’ robot programming, where robots can autonomously scan their environment and learn how to solve specific tasks in simulation. While our current method has limitations—such as the need for a few initial human demonstrations and significant computational time to train these policies (up to three days)—we see it as a significant step toward achieving ‘on-the-fly’ robot learning and deployment,” says Torne. “This approach brings us closer to a future in which robots do not need pre-existing policies that cover every scenario. Instead, they can quickly learn new tasks without extensive real-world interaction. In my opinion, this advance could bring robotics to practical application much sooner than relying solely on a universal, all-encompassing policy.”

“To implement robots in the real world, researchers have traditionally relied on methods such as imitation learning from expert data, which can be expensive, or reinforcement learning, which can be dangerous,” says Zoey Chen, a doctoral student in computer science at the University of Washington who was not involved in the paper. “RialTo directly addresses both the safety constraints of real-world RL [robot learning], and the efficient-data constraints of data-driven learning methods, with a novel real-to-sim-to-real pipeline. This novel pipeline not only provides safe and robust simulation training before real-world deployment, but also significantly improves data collection efficiency. RialTo has the potential to significantly scale up robot learning and allows robots to adapt to complex real-world scenarios much more effectively.”

“Simulation has demonstrated impressive capabilities on real robots, providing inexpensive, potentially infinite data for policy learning,” adds Marius Memmel, a doctoral student in computer science at the University of Washington, who was not involved in the work. “However, these methods are limited to a few specific scenarios, and constructing simulations to match them is expensive and labor-intensive. RialTo provides an easy-to-use tool for reconstructing real-world environments in minutes instead of hours. Furthermore, it makes extensive use of collected demonstrations during policy learning, minimizing operator burden and closing the gap between simulation and reality. RialTo is robust to object positions and perturbations, demonstrating incredible real-world performance without the need for extensive simulator construction and data collection.”

Torne wrote the paper with senior authors Abhishek Gupta, an assistant professor at the University of Washington, and Agrawal. Four other CSAIL members are also listed: EECS graduate student Anthony Simeonov SM ’22, research assistant Zechu Li, undergraduate student April Chan, and Tao Chen PhD ’24. Members of the Improbable AI Lab and WEIRD Lab also provided valuable feedback and support in developing this project.

The work was supported in part by a Sony Research Award, the U.S. government, and Hyundai Motor Co., with support from the WEIRD (Washington Embodied Intelligence and Robotics Development) Lab. The researchers presented their work at the Robotics: Science and Systems (RSS) conference earlier this month.
