Researchers from MIT and Stanford University have developed a new machine learning approach that can be used to more effectively and efficiently control a robot, such as a drone or autonomous vehicle, in dynamic environments where conditions can change rapidly.
The technique could help autonomous vehicles learn to compensate for slippery surfaces to avoid skidding, enable free-flying robots to tow objects through space, or allow a drone to follow a downhill skier despite powerful gusts of wind.
The researchers’ approach incorporates some structure from control theory into the model learning process in a way that leads to an effective method for controlling complicated dynamics, such as those caused by wind effects on the trajectory of a flying vehicle. One way to think about this structure is as a clue that can help determine how to control the system.
“Our work focuses on understanding the internal structure of the system dynamics, which can be used to design more effective, stabilizing controllers,” says Navid Azizan, the Esther and Harold E. Edgerton Assistant Professor in the MIT Department of Mechanical Engineering and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS). “By jointly learning the dynamics of the system and these unique control-oriented structures from data, we are able to naturally create controllers that perform much more efficiently in the real world.”
By using this structure in a learned model, the researchers’ technique immediately extracts an effective controller from the model, unlike other machine learning methods that require a controller to be derived or trained separately in additional steps. Because of this structure, their approach is also able to learn an effective controller from less data than other approaches. This could help their learning-based control system achieve better performance faster in rapidly changing environments.
“In this work, we try to find a balance between identifying structure in the system and simply learning a model from data,” says lead author Spencer M. Richards, a graduate student at Stanford University. “Our approach is inspired by how roboticists use physics to derive simpler models of robots. Physical analysis of these models often yields a useful structure for control purposes, one that you might not notice if you just naively tried to fit a model to the data. Instead, we try to identify a similarly useful structure from the data that indicates how to implement the control logic.”
Additional authors of the paper are Jean-Jacques Slotine, professor of mechanical engineering and of brain and cognitive sciences at MIT, and Marco Pavone, associate professor of aeronautics and astronautics at Stanford. The research will be presented at the International Conference on Machine Learning (ICML).
Learning a controller
Determining the best way to control a robot to complete a given task can be a difficult problem, even when researchers know how to model everything about the system.
The controller is the logic that enables a drone to follow a desired trajectory, for example. This controller would tell the drone how to adjust its rotor forces to compensate for the effects of wind that can push it off a stable path toward its destination.
This drone is a dynamical system: a physical system that evolves over time. In this case, its position and velocity change as it flies through its environment. If such a system is simple enough, engineers can derive a controller by hand.
Modeling a system by hand inherently captures some structure from the physics of the system. For example, if a robot were modeled manually using differential equations, those equations would capture the relationship between velocity, acceleration, and force: acceleration is the rate of change of velocity over time, determined by the mass of the robot and the forces applied to it.
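To make this concrete, here is a minimal sketch in Python of what such a hand-derived model can look like; the one-dimensional point-mass “drone,” its parameter values, and the Euler integration step are illustrative assumptions, not the model used in the paper:

```python
import numpy as np

def point_mass_dynamics(state, thrust, mass=1.0, g=9.81):
    """Hand-derived dynamics for a 1-D point mass (a toy stand-in for a drone).

    state = [altitude, velocity]; Newton's second law gives
    acceleration = (thrust - mass * g) / mass.
    """
    velocity = state[1]
    acceleration = (thrust - mass * g) / mass
    return np.array([velocity, acceleration])  # time derivative of the state

def euler_step(state, thrust, dt=0.01):
    """Advance the hand-derived model one small time step."""
    return state + dt * point_mass_dynamics(state, thrust)
```

Here the structure is baked in by hand: force divided by mass sets the rate of change of velocity, so it is obvious how the thrust input steers the state.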
But often the system is too complicated to model accurately by hand. Aerodynamic effects, such as the way swirling wind pushes a flying vehicle, are notoriously difficult to derive manually, Richards explains. Instead, researchers take measurements of the drone’s position, velocity, and rotor speeds over time and use machine learning to fit a model of this dynamical system to the data. But these approaches typically don’t learn a control-oriented structure, the kind of structure that is useful for determining how best to set rotor speeds to guide the drone’s motion over time.
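A rough sketch of that generic fitting step might look like the following; the synthetic arrays standing in for logged measurements and the simple linear least-squares model are illustrative assumptions, since the actual paper fits far more expressive models:

```python
import numpy as np

# Synthetic stand-ins for logged flight data; in practice these would be
# measured states (position, velocity), rotor commands, and successor states.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                  # states x_t
U = rng.normal(size=(500, 1))                  # controls u_t
X_next = X + 0.01 * rng.normal(size=X.shape)   # states x_{t+1}

# Fit a one-step linear model x_{t+1} ≈ [x_t, u_t, 1] @ W by least squares.
features = np.hstack([X, U, np.ones((len(X), 1))])
W, *_ = np.linalg.lstsq(features, X_next, rcond=None)

def predict_next(x, u):
    """Predict the next state. This black-box fit is useful for simulation,
    but it carries no structure telling us how to choose u."""
    return np.concatenate([x, u, [1.0]]) @ W
```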
Once a model of the dynamical system is built, many existing approaches also use data to learn a separate controller for the system.
“Other approaches that try to learn the dynamics and the controller from the data as separate entities are a bit philosophically divorced from the way we typically do it for simpler systems. Our approach is more like deriving models by hand from physics and combining that with control,” Richards says.
Structure identification
The MIT and Stanford researchers’ technique instead uses machine learning to learn a model of the dynamics, but in such a way that the model carries a specific structure that is useful for controlling the system.
With this framework, they can extract the controller directly from the dynamics model, rather than using the data to learn a completely separate controller model.
“We found that, in addition to learning the dynamics, it is also essential to learn the control-oriented structure that supports effective controller design. Our approach of learning state-dependent coefficient factorizations of the dynamics outperformed the baselines in data efficiency and tracking capability, proving successful in efficiently and effectively controlling the system’s trajectory,” says Azizan.
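To give a feel for this idea, the sketch below uses a hypothetical, hand-written state-dependent coefficient factorization of toy pendulum dynamics, dx/dt = A(x)x + B(x)u, standing in for the learned networks, with a Riccati-equation gain (an SDRE-style rule) as one classic way to extract a feedback controller directly from such a factorization. It is not the authors’ exact parameterization or control rule:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def A(x):
    """Hypothetical drift factor, so that dx/dt = A(x) @ x + B(x) @ u.

    One valid factorization of pendulum-like dynamics uses
    sin(theta) = (sin(theta) / theta) * theta; np.sinc handles theta = 0.
    """
    return np.array([[0.0, 1.0],
                     [-np.sinc(x[0] / np.pi), -0.1]])

def B(x):
    """Hypothetical input factor (here constant)."""
    return np.array([[0.0],
                     [1.0]])

def feedback(x, Q=np.eye(2), R=np.eye(1)):
    """Extract a stabilizing input directly from the factored model by
    solving a Riccati equation at the current state."""
    P = solve_continuous_are(A(x), B(x), Q, R)
    K = np.linalg.solve(R, B(x).T @ P)  # K = R^{-1} B(x)^T P
    return -K @ x
```

Because the gain is recomputed from A(x) and B(x) at each state, no separate controller model ever has to be trained once the factorization is learned.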
When they tested this approach, their controller closely followed desired trajectories, outperforming all the baseline methods. The controller extracted from their learned model nearly matched the performance of a ground-truth controller built using the exact dynamics of the system.
“By making simpler assumptions, we ended up with something that actually performed better than other complicated baseline approaches,” Richards adds.
The researchers also found that their method was data-efficient, meaning it achieved high performance even with little data. For example, it could effectively model a highly dynamic rotor-driven vehicle using just 100 data points. Methods that relied on multiple learned components saw their performance degrade much faster as the data sets shrank.
Such efficiency could make their technique particularly useful in situations where a drone or robot must quickly learn to adapt to rapidly changing conditions.
Moreover, their approach is general and could be applied to many types of dynamical systems, from robotic arms to free-flying spacecraft operating in low-gravity environments.
In the future, researchers are interested in developing models that are more physically interpretable and that can identify very detailed information about a dynamical system, Richards says. That could lead to better controllers.
“Despite its ubiquity and importance, nonlinear feedback control remains an art form, which makes it particularly suitable for data-driven and learning-based methods. This paper makes a significant contribution to the field by proposing a method that jointly learns the dynamics of a system, a controller, and a control-oriented structure,” says Nikolai Matni, assistant professor in the Department of Electrical and Systems Engineering at the University of Pennsylvania, who was not involved in this work. “What I found particularly exciting and compelling was the integration of these components into a joint learning algorithm, so that the control-oriented structure acts as an inductive bias in the learning process. The result is a data-efficient learning process that generates dynamic models that have an internal structure that enables efficient, stable, and robust control. While the technical contribution of the paper is excellent in itself, it is the conceptual contribution that I find most exciting and significant.”
This research is supported in part by the NASA University Leadership Initiative and the Natural Sciences and Engineering Research Council of Canada.