As the generative capabilities of AI models become more powerful, you’ve probably seen how they can transform simple text prompts into hyper-realistic images and even lifelike video clips.
More recently, generative AI has shown potential in helping chemists and biologists study static molecules such as proteins and DNA. Models like AlphaFold can predict molecular structures to accelerate drug discovery, for example, and MIT-supported research can help design novel proteins. The challenge, however, is that molecules are constantly moving and vibrating, and capturing this motion is crucial when designing novel proteins and drugs. Simulating these movements on a computer using physics — a technique known as molecular dynamics — can be very expensive, requiring billions of time steps on supercomputers.
As a step toward simulating these behaviors more efficiently, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Department of Mathematics have developed a generative model that learns from prior data. The team’s system, called MDGen, can take a frame of a 3D molecule and simulate what happens next like a video, connect separate still frames, and even fill in missing frames. By pressing the “play button” on molecules, the tool could help chemists design new molecules and study precisely how well their drug prototypes for cancer and other diseases would interact with the molecular structure they are intended to affect.
Co-author Bowen Jing SM ’22 says MDGen represents an early proof of concept, but suggests it marks the beginning of an exciting new line of research. “In the beginning, generative AI models created fairly simple videos, such as a person blinking or a dog wagging its tail,” says Jing, a PhD student at CSAIL. “Fast forward a few years, and now we have amazing models like Sora and Veo that can be useful in many interesting ways. We hope to instill a similar vision of the molecular world, in which dynamic trajectories are the movies. For example, you can give the model the first and tenth frames and it will animate what’s in between, or it can remove noise from a molecular video and guess what was hidden.”
The researchers say MDGen represents a paradigm shift from comparable previous work on generative AI, one that enables much broader applications. Earlier approaches were “autoregressive,” meaning they relied on the previous frame to build the next, starting from the first frame to create the video sequence. In contrast, MDGen generates all frames in parallel using diffusion. This means MDGen can be used, for example, to connect frames at the endpoints or “upsample” a low-frame-rate trajectory, in addition to pressing play on an initial frame.
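The distinction between the two approaches can be sketched in code. The snippet below is a deliberately simplified toy, not MDGen’s actual architecture: the “models” are one-line placeholders for learned networks, and the function names are hypothetical. It shows why sequential generation can only roll forward from a starting frame, while joint diffusion-style refinement can clamp any subset of frames (endpoints, keyframes) as conditioning.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FRAMES, N_FEATURES = 8, 4  # toy trajectory: 8 frames, 4 coordinates each

def autoregressive_rollout(first_frame, n_frames, step_model):
    """Earlier approach: each frame is predicted from the previous one,
    so generation is inherently sequential -- it can only 'press play'."""
    frames = [first_frame]
    for _ in range(n_frames - 1):
        frames.append(step_model(frames[-1]))
    return np.stack(frames)

def diffusion_style_generation(known, known_mask, denoise_model, n_steps=50):
    """Diffusion-style approach: all frames are refined jointly from noise,
    while frames fixed by the mask (e.g. both endpoints) are clamped at
    every step -- this is what enables interpolation and upsampling."""
    x = rng.normal(size=known.shape)
    for _ in range(n_steps):
        x = denoise_model(x)                  # refine every frame in parallel
        x[known_mask] = known[known_mask]     # keep conditioning frames fixed
    return x

# Stand-in "models" (placeholders for learned networks):
step_model = lambda frame: frame + 0.1        # drift one step forward
denoise_model = lambda x: 0.9 * x             # shrink noise each step

first = np.zeros(N_FEATURES)
traj_ar = autoregressive_rollout(first, N_FRAMES, step_model)

known = np.zeros((N_FRAMES, N_FEATURES))
mask = np.zeros(N_FRAMES, dtype=bool)
mask[[0, -1]] = True                          # condition on first and last frame
traj_diff = diffusion_style_generation(known, mask, denoise_model)
```

The key design difference: in the second function, conditioning is just a mask, so the same generative machinery serves forward simulation, interpolation, and upsampling.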
The results of this work were presented in a paper at the Conference on Neural Information Processing Systems (NeurIPS) last December. Last summer, the work was recognized for its potential commercial impact at the ML4LMS workshop at the International Conference on Machine Learning (ICML).
Some small steps forward in molecular dynamics
In experiments, Jing and his colleagues found that MDGen simulations were similar to direct physical simulations while creating trajectories 10 to 100 times faster.
The team first tested their model’s ability to take a 3D frame of a molecule and generate the next 100 nanoseconds of motion. Their system chained together successive 10-nanosecond blocks to reach that timescale. The team found that MDGen could match the accuracy of the baseline model, completing the video-generation process in about a minute — a fraction of the three hours the baseline needed to simulate the same dynamics.
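The chaining of short blocks into a long trajectory can be illustrated with a toy sketch. This is an assumption-laden simplification — `generate_block` stands in for one generative call, and the toy “dynamics” are arbitrary — but it shows the bookkeeping: each new block starts from the final frame of the previous one, and the duplicated seam frame is dropped when concatenating.

```python
import numpy as np

def generate_block(start_frame, n_frames, step):
    """Stand-in for one generative call that produces a short block of
    frames beginning at `start_frame` (hypothetical toy dynamics)."""
    block = [start_frame]
    for _ in range(n_frames - 1):
        block.append(step(block[-1]))
    return np.stack(block)

def stitch_trajectory(first_frame, n_blocks, frames_per_block, step):
    """Chain successive blocks into one long trajectory: each block starts
    from the last frame of the previous one; the repeated seam frame is
    skipped so it appears only once in the result."""
    blocks = [generate_block(first_frame, frames_per_block, step)]
    for _ in range(n_blocks - 1):
        nxt = generate_block(blocks[-1][-1], frames_per_block, step)
        blocks.append(nxt[1:])  # drop duplicated first frame of the block
    return np.concatenate(blocks)

step = lambda f: f + 1.0                     # toy "dynamics"
traj = stitch_trajectory(np.zeros(3), n_blocks=10, frames_per_block=10, step=step)
# 10 blocks of 10 frames, minus 9 seam duplicates -> 91 frames total
```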
Given the first and last frames of a one-nanosecond sequence, the team also used MDGen to model the steps in between. Across more than 100,000 different predictions, the system generated more plausible molecular trajectories than its baselines on clips shorter than 100 nanoseconds. In these tests, MDGen also demonstrated an ability to generalize to peptides it had not seen before.
MDGen’s capabilities also include generating frames between frames, “upsampling” the steps within each nanosecond to more accurately capture faster molecular phenomena. It can even “inpaint” molecular structures, restoring information that was deleted from them. Researchers could eventually use these features to design proteins based on specifications for how different parts of the molecule should move.
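Under the masked-conditioning view described above, each of these tasks is just a different pattern of “which frames are given” versus “which frames must be generated.” The sketch below makes that concrete with boolean masks; the task names and the `make_mask` helper are hypothetical illustrations, not MDGen’s real interface.

```python
import numpy as np

def make_mask(task, T=10, stride=4):
    """Conditioning masks for a T-frame window (True = frame supplied to
    the model, False = frame to be generated). Hypothetical naming."""
    m = np.zeros(T, dtype=bool)
    if task == "forward":        # press play from the first frame
        m[0] = True
    elif task == "interpolate":  # fill in between the two endpoints
        m[0] = m[-1] = True
    elif task == "upsample":     # densify a low-frame-rate trajectory
        m[::stride] = True
    elif task == "inpaint":      # restore a deleted middle segment
        m[:] = True
        m[T // 3 : 2 * T // 3] = False
    return m

for task in ("forward", "interpolate", "upsample", "inpaint"):
    print(task, make_mask(task).astype(int))
```

The appeal of this formulation is that one trained model serves all four tasks — only the mask changes at generation time.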
Playing with protein dynamics
Jing and co-author Hannes Stärk say MDGen is an early sign of progress toward generating molecular dynamics more efficiently. Still, these models currently lack the data needed to be applied immediately to designing drugs or molecules that induce the movements chemists want to see in a target structure.
The researchers’ goal is to scale MDGen from modeling molecules to predicting how proteins change over time. “We currently use toy systems,” says Stärk, also a PhD student at CSAIL. “To enhance MDGen’s predictive capabilities for modeling proteins, we’ll need to build on the current architecture and available data. We don’t yet have a YouTube-scale repository for these kinds of simulations, so we hope to develop a separate machine-learning method that can speed up data collection for our model.”
For now, MDGen represents an encouraging path forward in modeling molecular changes invisible to the naked eye. Chemists could also use these simulations to further understand the behavior of drug prototypes in treating diseases such as cancer and tuberculosis.
“Machine learning methods that learn from physical simulation represent a burgeoning new frontier in AI for science,” says Bonnie Berger, the Simons Professor of Mathematics at MIT, a CSAIL principal investigator, and a senior author of the paper. “MDGen is a versatile, multipurpose modeling framework that connects these two domains, and we are very excited to share our early models in this direction.”
“Sampling realistic transition paths between molecular states is a significant challenge,” says senior author Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science at MIT and the Institute for Data, Systems, and Society, and a CSAIL principal investigator. “This early work shows how we might begin to address such challenges by shifting generative modeling toward full simulation runs.”
Bioinformatics researchers have praised the system for its ability to simulate molecular transformations. “MDGen models molecular dynamics simulations as a joint distribution over structural snapshots, capturing molecular movements between discrete time steps,” says Simon Olsson, an associate professor at Chalmers University of Technology, who was not involved in the research. “Using a masked learning objective, MDGen enables innovative applications such as sampling transition paths, drawing an analogy to inpainting trajectories that connect metastable states.”
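The masked learning objective Olsson mentions can be sketched as follows. This is a heavily simplified illustration under stated assumptions — a trivial stand-in model, a plain squared-error loss, and a hypothetical `masked_reconstruction_loss` helper — meant only to show the idea: hide some frames of a trajectory, ask the model to reconstruct them, and score only the hidden positions, so that different masks at training time yield the different generation tasks described earlier.

```python
import numpy as np

rng = np.random.default_rng(1)

def masked_reconstruction_loss(model, traj, mask):
    """Toy masked training objective: zero out hidden frames, run the
    model on the visible ones, and compute reconstruction error only on
    the frames the model must fill in. One objective covers forward
    simulation, interpolation, and upsampling -- each is just a mask."""
    visible = np.where(mask[:, None], traj, 0.0)  # hide the masked frames
    pred = model(visible)
    err = (pred - traj) ** 2
    return err[~mask].mean()  # score only the hidden positions

traj = rng.normal(size=(10, 3))      # toy 10-frame trajectory
mask = np.ones(10, dtype=bool)
mask[4:7] = False                    # hide three middle frames
identity_model = lambda x: x         # trivial stand-in "model"
loss = masked_reconstruction_loss(identity_model, traj, mask)
```

A model that perfectly reconstructs the hidden frames would drive this loss to zero; the identity stand-in cannot, since the hidden frames were zeroed out before it ever saw them.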
The researchers’ work on MDGen was supported in part by the National Institute of General Medical Sciences, the U.S. Department of Energy, the National Science Foundation, the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium, the Abdul Latif Jameel Clinic for Machine Learning in Health, the Defense Threat Reduction Agency, and the Defense Advanced Research Projects Agency.