Imagine you want to carry a large, heavy box up a flight of stairs. You might spread your fingers and lift the box with both hands, then hold it on your forearms and balance it against your chest, using your whole body to manipulate the box.
Humans are generally good at whole-body manipulation, but robots struggle with such tasks. To a robot, every spot where the box could touch any point on the carrier's fingers, arms, or torso is a contact event it must reason about. With billions of potential contact events, planning for this task quickly becomes intractable.
Now, MIT researchers have found a way to simplify this process, known as contact-rich manipulation planning. They use an AI technique called smoothing, which summarizes many contact events into a smaller number of decisions, enabling even a simple algorithm to quickly identify an effective manipulation plan for the robot.
While it's still in its early days, this method could potentially enable factories to use smaller, mobile robots that manipulate objects with their entire arms or bodies, rather than large robotic arms that can only grasp with their fingertips. This could help reduce energy consumption and drive down costs. Additionally, the technique could be useful for robots sent on exploration missions to Mars or other bodies in the solar system, since they could adapt quickly to their surroundings using only an onboard computer.
“Rather than thinking about it as a black-box system, if we can leverage the structure of these types of robotic systems through models, there’s an opportunity to speed up the whole process of making these decisions and developing rich-interaction plans,” says HJ Terry Suh, a graduate student in electrical engineering and computer science (EECS) and co-author of a paper on this technique.
Suh was joined on the paper by co-authors Tao Pang PhD ’23, a roboticist at the Boston Dynamics AI Institute; Lujie Yang, an EECS graduate student; and senior author Russ Tedrake, the Toyota Professor of EECS, Aeronautics and Astronautics, and Mechanical Engineering and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research appears this week in IEEE Transactions on Robotics.
Learning about learning
Reinforcement learning is a machine-learning technique in which an agent, like a robot, learns to complete a task through trial and error, receiving a reward as it gets closer to the goal. Researchers often describe it as a black-box approach because the system must learn everything about the world through trial and error.
It has been used effectively for contact-rich manipulation planning, in which the robot seeks to learn the best way to move an object in a specified manner.
However, because the robot must reason over billions of potential touchpoints to determine how to use its fingers, hands, arms, and body to interact with an object, this trial-and-error approach requires a great deal of computation.
“To actually learn a policy, you might need to simulate a reinforcement learning process that takes millions of years,” Suh adds.
On the other hand, if researchers design a physics-based model using their knowledge of the system and the task the robot needs to perform, that model incorporates structure about the world that makes it more efficient.
But physics-based approaches aren’t as effective as reinforcement learning when it comes to planning manipulations that require a vast number of contacts—and Suh and Pang wondered why.
They conducted a detailed analysis and found that a technique known as smoothing enabled reinforcement learning to work effectively.
Many of the decisions a robot might make when determining how to manipulate an object are irrelevant in the grand scheme of things. For example, any infinitesimal change in a single finger, whether it makes contact with an object or not, doesn’t really matter. Smoothing averages out many of these irrelevant, intermediate decisions, leaving a few critical ones.
Reinforcement learning performs smoothing implicitly, by trying many touchpoints and then computing a weighted average of the results. Drawing on this insight, the MIT researchers designed a simple model that performs a similar type of smoothing, enabling it to focus on core robot-object interactions and predict long-term behavior. They showed that this approach could be just as effective as reinforcement learning at generating intricate plans.
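To make the idea concrete, here is a minimal, hypothetical Python sketch of randomized smoothing on a toy contact problem (it is not the researchers' implementation): a hard, on/off contact reward is averaged over random perturbations of a fingertip position, turning an abrupt contact event into a smooth slope that a simple planner can follow.

```python
import numpy as np

def contact_reward(x):
    """Toy, non-smooth objective: the fingertip at position x only 'makes
    contact' (and earns reward) once it crosses the object surface at x = 0."""
    return 1.0 - x if x > 0.0 else 0.0

def smoothed_reward(x, sigma=0.1, n_samples=2000, seed=0):
    """Randomized smoothing: average the reward over random perturbations of
    the decision variable, so the abrupt contact event becomes a gradual
    slope that gradient-based or greedy planners can follow."""
    rng = np.random.default_rng(seed)
    samples = x + rng.normal(0.0, sigma, size=n_samples)
    return float(np.mean([contact_reward(s) for s in samples]))

if __name__ == "__main__":
    for x in (-0.2, -0.05, 0.0, 0.05, 0.2):
        print(f"x = {x:+.2f}  raw = {contact_reward(x):.3f}  "
              f"smoothed = {smoothed_reward(x):.3f}")
```

Note that the raw reward jumps from zero to its maximum exactly at the contact point, while the smoothed version rises gradually as the fingertip approaches the surface, giving a planner useful information even before contact is made.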
“If you know a little more about your problem, you can design more efficient algorithms,” Pang says.
Winning combination
Although smoothing greatly simplifies decision-making, searching through the remaining decisions can still be a difficult problem. So the researchers combined their model with an algorithm that can quickly and efficiently search through all the possible decisions the robot could make.
With this combination, the computation time is cut to about a minute on a standard laptop.
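As a loose illustration of that combination (again a hypothetical sketch, not the team's algorithm), the snippet below enumerates a handful of discrete contact choices and, for each one, runs a coarse search over a continuous pushing force, scoring every candidate with a randomized-smoothing objective.

```python
import numpy as np

def push_outcome(contact_face, force):
    """Toy model of where an object ends up after pushing on one of a few
    discrete faces with a continuous force. It is non-smooth: pushing on
    the wrong face simply does nothing."""
    gains = {"left_face": 1.0, "top_face": 0.0, "right_face": -1.0}
    return gains[contact_face] * max(force, 0.0)

def smoothed_cost(contact_face, force, goal, sigma=0.05, n_samples=256, seed=0):
    """Squared distance to the goal position, averaged over perturbed forces
    (randomized smoothing of the continuous part of the decision)."""
    rng = np.random.default_rng(seed)
    forces = force + rng.normal(0.0, sigma, size=n_samples)
    outcomes = np.array([push_outcome(contact_face, f) for f in forces])
    return float(np.mean((outcomes - goal) ** 2))

def plan(goal=0.5):
    """Tiny 'search plus smoothing' combination: enumerate the few remaining
    discrete contact choices, and for each one run a coarse line search over
    force, keeping the candidate with the lowest smoothed cost."""
    best = None
    for face in ("left_face", "top_face", "right_face"):
        for force in np.linspace(0.0, 1.0, 21):
            cost = smoothed_cost(face, float(force), goal)
            if best is None or cost < best[0]:
                best = (cost, face, float(force))
    return best

if __name__ == "__main__":
    cost, face, force = plan()
    print(f"best plan: push the {face} with force {force:.2f} (cost {cost:.4f})")
```

The discrete search handles the few decisions that actually matter (which face to push), while smoothing turns the continuous part of each decision into a well-behaved objective that a coarse search can optimize quickly.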
They first tested their approach in simulations where robotic hands were given tasks like moving a pen to a desired configuration, opening a door, or picking up a plate. In each case, their model-based approach achieved the same performance as reinforcement learning, but in a fraction of the time. They saw similar results when they tested their model in hardware on real robotic arms.
“The same ideas that make whole-body manipulation possible also work for planning in dexterous, human hands. Previously, most researchers had argued that reinforcement learning was the only approach that scaled to dexterous hands, but Terry and Tao showed that by taking the key idea of (randomized) smoothing from reinforcement learning, they could make more traditional planning methods work extremely well as well,” Tedrake says.
However, their model is based on a simpler approximation of the real world, so it cannot handle highly dynamic motions, such as objects falling. While effective for slower manipulation tasks, their approach cannot generate a plan that would enable a robot to, say, toss a can into a trash bin. In the future, the researchers plan to enhance their technique so it can tackle these highly dynamic motions.
“If you study your models carefully and really understand the problem you’re trying to solve, there are definitely benefits to be had. There are benefits to doing things outside of the black box,” Suh says.
This work is funded in part by Amazon, MIT Lincoln Laboratory, the National Science Foundation, and Ocado Group.