Sunday, December 22, 2024

MIT researchers present Boltz-1, a fully open model for predicting biomolecular structures

Scientists at MIT have released a powerful open-source AI model called Boltz-1 that could significantly accelerate biomedical research and drug development.

Developed by a team of researchers at the MIT Jameel Clinic for Machine Learning in Health, Boltz-1 is the first fully open-source model to achieve cutting-edge performance on par with AlphaFold3, a model from Google DeepMind that predicts the 3D structures of proteins and other biological molecules.

The lead developers of Boltz-1 were MIT graduate students Jeremy Wohlwend and Gabriele Corso, along with MIT Jameel Clinic Research Affiliate Saro Passaro and MIT electrical engineering and computer science professors Regina Barzilay and Tommi Jaakkola. Wohlwend and Corso presented the model on December 5 at MIT’s Stata Center, where they said their ultimate goal is to foster global collaboration, accelerate discovery and provide a strong platform for improving biomolecular modeling.

“We hope this will be a starting point for the community,” Corso said. “There’s a reason we call it Boltz-1 and not Boltz. This is not the end of the line. We want as much community input as we can get.”

Proteins play an essential role in almost all biological processes. The shape of a protein is closely related to its function, so understanding protein structure is crucial when designing new drugs or engineering new proteins with specific functions. However, because the process by which a protein’s long chain of amino acids folds into a 3D structure is extremely complex, accurately predicting this structure has been a major challenge for decades.

DeepMind’s AlphaFold2, which won Demis Hassabis and John Jumper the 2024 Nobel Prize in Chemistry, uses machine learning to rapidly predict the three-dimensional structures of proteins with such accuracy that they are indistinguishable from those obtained experimentally by scientists. This open-source model is used by academic and commercial research teams around the world and has contributed to many advances in drug development.

AlphaFold3 improves on its predecessors by introducing a generative artificial intelligence model, known as a diffusion model, that better handles the uncertainty associated with predicting extremely complex protein structures. However, unlike AlphaFold2, AlphaFold3 is neither fully open source nor available for commercial use, which prompted criticism from the scientific community and started a global race to build a commercially available version of the model.

In their work on Boltz-1, the MIT researchers started from the same approach as AlphaFold3, but after studying the underlying diffusion model, they explored potential improvements. They incorporated the changes that improved the model’s accuracy the most, including new algorithms that improve prediction performance.

In addition to the model itself, they have open-sourced their entire pipeline for training and fine-tuning so that other scientists can build on Boltz-1.

“I am extremely proud of Jeremy, Gabriele, Saro and the rest of the Jameel Clinic team who contributed to this release. This project required many days and nights of work, and it took unwavering determination to get to this point. There are many exciting ideas for further improvements and we look forward to sharing them in the coming months,” says Barzilay.

It took the MIT team four months of work and a lot of experimentation to develop Boltz-1. One of the biggest challenges was overcoming the ambiguity and heterogeneity contained in the Protein Data Bank – the collection of all biomolecular structures that thousands of biologists have solved over the last 70 years.

“I spent many long nights struggling with this data. A lot of it is pure domain knowledge that you simply need to acquire. There are no shortcuts,” says Wohlwend.

Ultimately, their experiments showed that Boltz-1 achieved the same level of accuracy as AlphaFold3 on a diverse set of complex biomolecular structure predictions.

“What Jeremy, Gabriele and Saro have achieved is simply extraordinary. Their hard work and persistence on this project has made biomolecular structure prediction more accessible to the broader community and will revolutionize progress in molecular science,” says Jaakkola.

The researchers plan to further improve Boltz-1’s performance and reduce the time needed for prediction. They also invite researchers to try Boltz-1 through its GitHub repository and to connect with other Boltz-1 users on its Slack channel.

“We believe that we still have many, many years of work ahead of us to improve these models. We really want to collaborate with others and see what the community does with this tool,” adds Wohlwend.

Mathai Mammen, CEO and president of Parabilis Medicines, calls Boltz-1 a “game-changer” model. “By open-sourcing these advances, the MIT Jameel Clinic and its collaborators are democratizing access to cutting-edge structural biology tools,” he says. “This groundbreaking effort will accelerate the creation of life-changing medicines. Thank you to the Boltz-1 team for making this profound step forward!”

“Boltz-1 will be extremely useful to my lab and the community at large,” adds Jonathan Weissman, a professor of biology at MIT and a member of the Whitehead Institute for Biomedical Research, who was not involved in the research. “We will see a whole wave of discoveries made possible by the democratization of this powerful tool.” Weissman adds that he anticipates that the open nature of Boltz-1 will lead to a wide range of creative new applications.

This work was supported by a U.S. National Science Foundation Expeditions grant; the Jameel Clinic; the U.S. Defense Threat Reduction Agency’s Discovery of Medical Countermeasures Against New and Emerging (DOMANE) Threats program; and the MATCHMAKERS project supported by the Cancer Grand Challenges partnership funded by Cancer Research UK and the U.S. National Cancer Institute.
