Friday, May 9, 2025

Helping robots grasp the unpredictable


When robots encounter unfamiliar objects, they struggle to account for a simple truth: appearances aren't everything. They may try to grab a block, only to discover that it's literally a piece of cake. The object's misleading appearance can cause the robot to miscalculate physical properties such as its mass and center of mass, use the wrong grip, and apply more force than necessary.

To see through this illusion, researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) designed the Grasping Neural Process, a predictive physics model that infers these hidden traits in real time for more intelligent robotic grasping. Built on limited interaction data, their deep-learning system can assist robots in settings such as warehouses and households at a fraction of the computational cost of previous algorithmic and statistical models.

The Grasping Neural Process is trained to infer invisible physical properties from a history of grasp attempts, and it uses those inferred properties to predict which future grasps will succeed. Previous models often identified robot grasps from visual data alone.
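To make the idea concrete, here is a minimal sketch of a neural-process-style grasp predictor, assuming a PyTorch encoder/decoder design; the class, dimensions, and names are hypothetical illustrations, not the authors' implementation. An encoder pools past (grasp, outcome) pairs into a latent summary of the object's hidden physics, and a decoder scores candidate grasps conditioned on that latent.

```python
# Hypothetical sketch (not the authors' code) of a neural-process-style
# grasp predictor: condition on a history of grasp attempts, then score
# candidate grasps against the inferred latent physics.
import torch
import torch.nn as nn

class GraspNeuralProcessSketch(nn.Module):
    def __init__(self, grasp_dim=6, latent_dim=32):
        super().__init__()
        # Encode each (grasp, outcome) pair from the interaction history.
        self.encoder = nn.Sequential(
            nn.Linear(grasp_dim + 1, 64), nn.ReLU(), nn.Linear(64, latent_dim)
        )
        # Decode a candidate grasp plus the latent into a success score.
        self.decoder = nn.Sequential(
            nn.Linear(grasp_dim + latent_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, past_grasps, past_outcomes, candidate_grasps):
        # past_grasps: (N, grasp_dim); past_outcomes: (N, 1) in {0, 1}.
        pairs = torch.cat([past_grasps, past_outcomes], dim=-1)
        # Mean-pool so the latent is permutation-invariant over the history.
        latent = self.encoder(pairs).mean(dim=0, keepdim=True)
        latent = latent.expand(candidate_grasps.shape[0], -1)
        logits = self.decoder(torch.cat([candidate_grasps, latent], dim=-1))
        return torch.sigmoid(logits)  # success probability per candidate
```

The mean-pooling step is one common neural-process design choice: it lets the model accept any number of past grasp trials without retraining.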

Typically, methods that infer physical properties rely on traditional statistical approaches, which require many known grasps and a great deal of computation time to work well. The Grasping Neural Process lets these machines execute good grasps more efficiently, using far less interaction data, and finishes its computation in under a tenth of a second, versus the seconds (or minutes) required by traditional methods.

The researchers note that the Grasping Neural Process thrives in unstructured environments such as homes and warehouses, since both contain an abundance of unpredictable objects. For example, a robot powered by the MIT model could quickly learn to handle tightly packed boxes filled with varying amounts of food, without seeing inside the box, and then place them where needed. In a fulfillment center, items with different physical properties and geometries could be sorted into the appropriate boxes and shipped to customers.

Trained on 1,000 unique geometries and 5,000 objects, the Grasping Neural Process achieved stable grasps in simulation on novel 3D objects from the ShapeNet repository. The CSAIL-led group then tested their model in the physical world with two weighted blocks, where its results beat a baseline that considered only object geometry. Allowed just 10 exploratory grasps beforehand, the robotic arm successfully lifted the blocks in 18 and 19 out of 20 trials each, while the unadapted baseline managed only eight and 15 stable grasps.

While robots performing inference tasks are less theatrical than an actor, they likewise go through a three-part act: training, adaptation, and testing. During training, robots practice on a fixed set of objects and learn to infer physical properties from a history of successful (or failed) grasps. The new CSAIL model amortizes the inference of object physics, meaning it trains a neural network to predict the output of an expensive statistical algorithm. Only a single pass through a neural network with limited interaction data is needed to simulate and predict which grasps will work best on different objects.
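The sketch below illustrates the amortization idea under stated assumptions: a slow estimator is used offline as a teacher, and a small network learns to reproduce its output, so deployment needs only one cheap forward pass. The estimator here is a toy stand-in, not the statistical algorithm the paper amortizes.

```python
# Hypothetical sketch of amortized inference (not the authors' code):
# train a fast network offline to mimic an expensive estimator, so that
# run-time inference is a single forward pass instead of a slow routine.
import torch
import torch.nn as nn

def expensive_estimator(history):
    # Toy stand-in for a slow statistical routine that infers a hidden
    # property (e.g., mass) from features of past grasp outcomes.
    return history.mean(dim=-1, keepdim=True)

amortized_net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(amortized_net.parameters(), lr=1e-3)

for step in range(1000):  # offline training phase
    history = torch.randn(32, 16)           # simulated interaction data
    target = expensive_estimator(history)   # slow "ground truth" inference
    loss = nn.functional.mse_loss(amortized_net(history), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At deployment, inference collapses to one cheap forward pass.
with torch.no_grad():
    fast_estimate = amortized_net(torch.randn(1, 16))
```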

Then, in the adaptation phase, the robot is introduced to an unfamiliar object. During this stage, the Grasping Neural Process helps the robot experiment and update its predictions accordingly, learning which grips will work best. This tinkering phase prepares the machine for the final step: testing, when the robot formally executes a task on an object, now with a fresh understanding of its properties.
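One way to picture the adaptation loop, reusing the hypothetical GraspNeuralProcessSketch from above: each attempted grasp and its outcome are appended to the history, so the latent estimate of the object's physics is re-conditioned before the next grasp is chosen. The candidate sampling and outcome here are placeholders.

```python
# Hypothetical adaptation loop (assumes GraspNeuralProcessSketch above).
import torch

model = GraspNeuralProcessSketch()
history_grasps, history_outcomes = [], []

for trial in range(10):
    candidates = torch.randn(64, 6)  # toy sample of candidate grasp poses
    if history_grasps:
        with torch.no_grad():
            probs = model(torch.stack(history_grasps),
                          torch.stack(history_outcomes), candidates)
        best = candidates[probs.argmax()]  # most promising grasp so far
    else:
        best = candidates[0]  # no history yet: pick arbitrarily
    outcome = (torch.rand(1) > 0.5).float()  # stand-in for a real grasp
    history_grasps.append(best)              # grow the interaction history
    history_outcomes.append(outcome)
```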

“As an engineer, it is unwise to assume the robot knows all the necessary information it needs to grasp successfully,” says lead author Michael Noseworthy, an MIT doctoral student in electrical engineering and computer science (EECS) and CSAIL affiliate. “Without humans labeling the properties of an object, robots have traditionally needed to use a costly inference process.” According to fellow lead author, EECS PhD student, and CSAIL affiliate Seiji Shaw, their Grasping Neural Process could be a streamlined alternative: “Our model helps robots do this much more efficiently by enabling the robot to imagine which grasps will yield the best result.”

“To get robots out of controlled spaces like the lab or warehouse and into the real world, they must be better at coping with the unknown and less likely to fail at the slightest variation from their programming. This work is a critical step toward realizing the full transformative potential of robotics,” says Chad Kessens, an independent robotics researcher at the U.S. Army's DEVCOM Army Research Laboratory, which sponsored the work.

While their model can help a robot efficiently infer hidden static properties, the researchers would like to extend the system to adjust grasps in real time for multiple tasks and for objects with dynamic traits. They envision their work eventually assisting with several tasks within a long-horizon plan, such as picking up and chopping a carrot. Moreover, their model could adapt to shifts in mass distribution in less static objects, such as a bottle being filled.

Nicholas Roy, an MIT professor of aeronautics and astronautics and CSAIL member, joined the researchers on the paper and is its senior author. The group recently presented this work at the IEEE International Conference on Robotics and Automation (ICRA).
