Imagine that a robot is helping you clean the dishes. You ask it to grab a bowl out of the sink, but its gripper slightly misses the mark.
With a new framework developed by MIT and NVIDIA researchers, you could correct that robot's behavior with simple interactions. The method lets you point to the bowl, trace a trajectory to it on a screen, or simply nudge the robot's arm in the right direction.
Unlike other approaches to correcting robot behavior, this technique does not require users to collect new data and retrain the machine-learning model that powers the robot's brain. Instead, it allows the robot to use intuitive, real-time human feedback to choose a feasible sequence of actions that gets as close as possible to satisfying the user's intent.
When the researchers tested their framework, its success rate was 21 percent higher than that of an alternative method that did not leverage human interventions.
In the long run, this framework could enable a user to easily guide a factory-trained robot through a wide variety of household tasks, even if the robot has never seen their home or the objects in it.
“We can't expect laypeople to collect data and fine-tune a neural network model. Consumers will expect the robot to work straight out of the box, and if it doesn't, they will want an intuitive mechanism to customize it. That is the challenge we tackled in this work,” says Felix Yanwei Wang, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this method.
His co-authors include Lirui Wang PhD ’24 and Yilun Du PhD ’24; senior author Julie Shah, an MIT professor of aeronautics and astronautics and director of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL); as well as Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Perez-D’Arpino PhD ’19, and Dieter Fox of NVIDIA. The research will be presented at the International Conference on Robotics and Automation.
Mitigating misalignment
Recently, researchers have begun using pre-trained generative AI models to learn a “policy,” or a set of rules, that a robot follows to complete a task. Generative models can solve many complex tasks.
During training, the model sees only feasible robot motions, so it learns to generate valid trajectories for the robot.
While these trajectories are valid, that doesn't mean they always align with a user's intent in the real world. The robot might have been trained to grab boxes off a shelf without knocking them over, but it could fail to reach the box on top of someone's bookshelf if that shelf is oriented differently from those it saw during training.
To overcome such failures, engineers typically collect data demonstrating the new task and retrain the generative model, a costly and time-consuming process that requires machine-learning expertise.
Instead, the MIT researchers wanted to allow users to steer the robot's behavior during deployment, when it makes a mistake.
But if a human interacts with the robot to correct its behavior, that could inadvertently cause the generative model to choose an invalid action. The robot might reach the box the user wants, but knock books off the shelf in the process.
“We want to allow the user to interact with the robot without introducing those kinds of mistakes, so we get behavior that is much more aligned with user intent during deployment, but that is also valid and feasible,” Wang says.
Their framework accomplishes this by giving the user three intuitive ways to correct the robot's behavior, each of which offers certain advantages.
First, the user can point to the object they want the robot to manipulate in an interface that shows its camera view. Second, they can trace a trajectory in that interface, allowing them to specify how they want the robot to reach the object. Third, they can physically move the robot's arm in the direction they want it to follow.
“When you map a 2D image of the environment to actions in 3D space, some information is lost. Physically nudging the robot is the most direct way of specifying the user's intent without losing any of the information,” says Wang.
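The information loss Wang describes can be made concrete with a standard pinhole-camera model (the intrinsics below are illustrative values, not from the paper): every 3D point along a viewing ray collapses onto the same pixel, so a click on the screen cannot distinguish depths without extra information.

```python
# Hypothetical pinhole-camera intrinsics for a 640x480 image:
# focal lengths in pixels and principal point at the image center.
fx, fy, cx, cy = 525.0, 525.0, 320.0, 240.0

def project(x, y, z):
    """Map a 3D point in the camera frame to a 2D pixel coordinate."""
    return (fx * x / z + cx, fy * y / z + cy)

def backproject(u, v, depth):
    """Recover the 3D point from a pixel -- only possible if depth is known."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

# Two distinct 3D points along the same viewing ray...
near = (0.1, 0.0, 0.5)
far = (0.2, 0.0, 1.0)

# ...project to the same pixel, so a 2D click alone is ambiguous,
# whereas physically moving the arm specifies the full 3D target.
assert project(*near) == project(*far)
```

A physical nudge of the arm, by contrast, directly supplies the full 3D pose, which is why Wang calls it the most direct way to convey intent.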
Sampling for success
To ensure these interactions don't cause the robot to choose an invalid action, such as colliding with other objects, the researchers use a specific sampling procedure. This technique lets the model choose, from its set of valid actions, the one that most closely aligns with the user's goal.
“Rather than just imposing the user's will, we give the robot an idea of what the user intends but let the sampling procedure oscillate within its own set of learned behaviors,” Wang explains.
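The paper's actual procedure steers sampling inside a learned generative policy; a much-simplified sketch of the underlying idea is to draw many candidate actions from the policy, discard infeasible ones, and keep the candidate nearest the user's indicated target. Everything below (the random candidate generator, the obstacle region, the 2D action space) is made up for illustration.

```python
import random

def sample_policy_actions(n):
    """Stand-in for a learned generative policy: returns n candidate
    end-effector targets. A real system would sample these from a
    trained model; here they are random 2D points for illustration."""
    rng = random.Random(0)
    return [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(n)]

def is_valid(action):
    """Stand-in feasibility check (e.g., collision-free).
    Here we simply exclude a hypothetical obstacle region x > 0.8."""
    return action[0] <= 0.8

def align_to_intent(candidates, user_point):
    """Choose the valid candidate closest to the user's indicated point,
    so the correction never pushes the robot outside its learned,
    feasible set of behaviors."""
    valid = [a for a in candidates if is_valid(a)]
    return min(valid, key=lambda a: (a[0] - user_point[0]) ** 2
                                    + (a[1] - user_point[1]) ** 2)

# Even if the user points into the obstacle region, the chosen action
# stays within the feasible set while moving toward their intent.
chosen = align_to_intent(sample_policy_actions(50), user_point=(0.9, 0.5))
assert is_valid(chosen)
```

The design choice this illustrates: the user's input biases the selection but never overrides the policy's notion of what is feasible, which is how the framework avoids the knocked-over-books failure mode described above.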
This sampling method enabled the framework to outperform the other approaches they compared it to in simulations and in experiments with a real robot arm in a toy kitchen.
While their method might not always complete the task right away, it offers users the advantage of being able to correct the robot immediately if they see it doing something wrong, rather than waiting for it to finish and then giving it new instructions.
Moreover, after the user nudges the robot a few times until it picks up the correct bowl, it could log that corrective action and incorporate it into its behavior through future training. Then, the next day, the robot could pick up the correct bowl without needing a nudge.
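One plausible way to capture such corrections for later training is an append-only log pairing what the robot did with what the user corrected it to; the article does not specify the paper's storage format, so the JSONL layout and 2D actions below are purely hypothetical.

```python
import json
import os
import tempfile

# Hypothetical correction log: each entry pairs the robot's original
# action with the user's corrected target, for later fine-tuning.
LOG_PATH = os.path.join(tempfile.mkdtemp(), "corrections.jsonl")

def record_correction(original_action, corrected_action):
    """Append one user correction as a JSON line."""
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps({"original": original_action,
                            "corrected": corrected_action}) + "\n")

def load_corrections():
    """Read the full correction history back for a future training pass."""
    with open(LOG_PATH) as f:
        return [json.loads(line) for line in f]

# The user nudges the gripper toward the correct bowl twice.
record_correction([0.40, 0.20], [0.50, 0.20])
record_correction([0.50, 0.20], [0.55, 0.20])
assert len(load_corrections()) == 2
```

A training job could later treat the `corrected` entries as additional demonstrations, which is the continuous-improvement loop the next quote refers to.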
“But the key to that continuous improvement is having a way for the user to interact with the robot, which is what we have shown here,” Wang says.
In the future, the researchers want to boost the speed of the sampling procedure while maintaining or improving its performance. They also want to experiment with robot policy generation in novel environments.