The saying “practice makes perfect” is usually reserved for humans, but it’s also a great maxim for robots just starting out in an unfamiliar environment.
Imagine a robot arriving at a warehouse. It comes equipped with a skill from its training, such as placing objects, and now it must pick items from a shelf it has never seen. At first, the machine struggles, since it needs to get acquainted with its new surroundings. To improve, the robot must figure out which skills within the overall task need work, then specialize (or parameterize) that action.
A human on site could program the robot to optimize its performance, but researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and The AI Institute have developed a more effective alternative. Unveiled at the Robotics: Science and Systems conference last month, the “Estimate, Extrapolate, and Situate” (EES) algorithm lets these machines practice on their own, potentially helping them excel at useful tasks in factories, homes, and hospitals.
Assessment of the situation
To help robots get better at tasks like sweeping floors, EES works with a vision system that locates and tracks the machine’s surroundings. The algorithm then estimates how reliably the robot executes a skill (sweeping, for example) and whether practicing it further would be worthwhile. EES predicts how well the robot could perform the overall task if it refined that particular skill, and finally, it practices. The vision system checks whether the skill was executed correctly after each attempt.
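To make that loop concrete, here is a minimal sketch of how an estimate, extrapolate, and practice cycle like the one described above could be organized. The class names, the Beta-prior competence model, and the optimistic-improvement heuristic are illustrative assumptions, not the authors’ implementation.

```python
import random

class Skill:
    """One practicable skill, with a running estimate of its success rate."""

    def __init__(self, name):
        self.name = name
        self.successes = 1  # Beta(1, 1) prior over the success probability
        self.failures = 1

    def estimated_competence(self):
        # Estimate: posterior mean success rate from practice outcomes so far
        return self.successes / (self.successes + self.failures)

    def record(self, succeeded):
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

def task_success_probability(skills):
    # The task succeeds only if every skill in its plan succeeds
    p = 1.0
    for s in skills:
        p *= s.estimated_competence()
    return p

def expected_gain(skill, skills):
    # Extrapolate: how much would overall task success improve if practice
    # pushed this one skill toward an optimistic ceiling (assumed ~0.95)?
    current = task_success_probability(skills)
    saved = (skill.successes, skill.failures)
    skill.successes, skill.failures = 95, 5  # temporarily assume competence 0.95
    improved = task_success_probability(skills)
    skill.successes, skill.failures = saved
    return improved - current

def practice_loop(skills, attempt_skill, n_rounds=50):
    for _ in range(n_rounds):
        # Situate: practice the skill whose improvement helps the task most
        target = max(skills, key=lambda s: expected_gain(s, skills))
        succeeded = attempt_skill(target)  # a vision system labels the outcome
        target.record(succeeded)
    return skills

if __name__ == "__main__":
    # Toy stand-in for real practice: each skill has a hidden true success rate
    true_rates = {"sweep": 0.4, "grasp": 0.8, "place": 0.9}
    skills = [Skill(n) for n in true_rates]
    practice_loop(skills, lambda s: random.random() < true_rates[s.name])
    for s in skills:
        print(s.name, round(s.estimated_competence(), 2))
```

In this toy version, practice naturally concentrates on the weakest skill (sweeping), since improving it yields the largest predicted jump in whole-task success.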
EES could come in handy in places like a hospital, a factory, a home, or a coffee shop. For example, if you want a robot to clean your living room, it would need help practicing skills like sweeping. According to Nishanth Kumar SM ’24 and his colleagues, however, EES could help the robot improve without human intervention, using only a few practice trials.
“When we started this project, we wondered whether such specialization would be feasible in a reasonable number of samples on a real robot,” says Kumar, a doctoral student in electrical engineering and computer science, CSAIL member, and co-author of a paper describing the work. “We now have an algorithm that allows robots to significantly improve at a given skill in a reasonable amount of time using tens or hundreds of data points, an improvement over the thousands or millions of samples required by a standard reinforcement learning algorithm.”
Sweet spot search
EES’s ability to learn efficiently was evident when it was deployed on Boston Dynamics’ Spot quadruped during research trials at The AI Institute. The robot, which has an arm attached to its back, completed manipulation tasks after just a few hours of practice. In one demonstration, the robot learned how to safely place a ball and a ring on a tilted table in about three hours. In another, the algorithm guided the machine to improve at sweeping toys into a bin in about two hours. Both results appear to be improvements over previous frameworks, which would likely have taken more than 10 hours per task.
“We wanted the robot to learn from its own experiences so that it could better choose strategies that would work well in its deployment,” says co-author Tom Silver SM ’20, PhD ’24, an electrical engineering and computer science (EECS) alumnus and CSAIL affiliate who is now an assistant professor at Princeton University. “By focusing on what the robot knows, we tried to answer a key question: What skills does the robot have that would be most useful to practice right now?”
EES could eventually help robots practice autonomously in new deployment environments, but for now, it has a few limitations. To start, the researchers used tables that were low to the ground, making it easier for the robot to see its objects. Kumar and Silver also 3D-printed a handle that made the brush easier for Spot to grab. The robot sometimes failed to detect objects or identified them in the wrong places, so the researchers counted those errors as failures.
Giving robots homework
The researchers note that practice in physical experiments could be sped up even further with a simulator. Instead of working on each skill autonomously in the real world, the robot could eventually combine real and virtual practice. They also hope to make their system faster, with less latency, by engineering EES around the imaging delays they experienced. In the future, they might explore an algorithm that reasons over sequences of practice attempts rather than planning which skills to improve.
“Enabling robots to learn on their own is both incredibly useful and incredibly challenging,” says Danfei Xu, an assistant professor in the School of Interactive Computing at Georgia Tech and a research scientist at NVIDIA AI, who was not involved in the work. “In the future, home robots will be sold to a variety of households and expected to perform a wide range of tasks. We can’t program everything they need to know in advance, so it’s important that they can learn on the fly. But letting robots explore and learn without guidance can be incredibly slow and can lead to unintended consequences. Silver and his colleagues’ research introduces an algorithm that lets robots practice their skills autonomously in a structured way. This is a big step toward creating home robots that can continually evolve and improve on their own.”
Silver and Kumar’s co-authors include AI Institute researchers Stephen Proulx and Jennifer Barry, as well as four CSAIL members: Northeastern University doctoral student and visiting researcher Linfeng Zhao, MIT EECS doctoral student Willie McClinton, and MIT EECS professors Leslie Pack Kaelbling and Tomás Lozano-Pérez. Their work was supported in part by the AI Institute, the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, the U.S. Office of Naval Research, the U.S. Army Research Office, and MIT Quest for Intelligence, using high-performance computing resources from MIT SuperCloud and the Lincoln Laboratory Supercomputing Center.