Thursday, May 8, 2025

A smarter way to improve drug discovery

The use of artificial intelligence to improve drug discovery is exploding. Scientists are deploying machine-learning models to help them identify, among billions of options, molecules that may have the properties they are looking for when developing new drugs.

But there are so many variables to consider, from the price of materials to the risk of something going wrong, that even when scientists use AI, weighing the costs of synthesizing the best candidates is no easy task.

The myriad challenges of identifying the best and most cost-effective molecules to test are one reason new drugs take so long to develop, and a key factor in the high prices of prescription drugs.

To help scientists make cost-conscious choices, MIT researchers have developed an algorithmic framework that automatically identifies optimal molecular candidates, minimizing synthesis cost while maximizing the likelihood that the candidates have the desired properties. The algorithm also identifies the materials and experimental steps needed to synthesize these molecules.

Their quantitative framework, known as Synthesis Planning and Rewards-based Route Optimization Workflow (SPARROW), considers the cost of synthesizing a batch of molecules at once, since multiple candidates can often be derived from some of the same chemical compounds.

Moreover, this unified approach draws key information on molecular design, property prediction, and synthesis planning from online repositories and widely used AI tools.

In addition to helping pharmaceutical companies discover new drugs more efficiently, SPARROW could be used for applications such as inventing new agricultural chemicals or discovering specialized materials for organic electronics.

“Choosing compounds is now largely an art, and sometimes a very successful one. But because we have all these other models and prediction tools that give us information about how molecules might perform and how they might be synthesized, we can and should use that information to guide the decisions we make,” says Connor Coley, the Class of 1957 Career Development Assistant Professor in the MIT departments of Chemical Engineering and Electrical Engineering and Computer Science, and senior author of a paper on SPARROW.

Coley is joined on the paper by lead author Jenna Fromer SM ’24. The research appears today in .

Complex cost considerations

In some ways, whether a scientist should synthesize and test a particular molecule comes down to a question of the cost of synthesis versus the value of the experiment. However, determining cost or value is itself a difficult problem.

For example, an experiment may require expensive materials or may carry a high risk of failure. On the value side, one might consider how useful it would be to know the properties of the molecule, or whether the predictions about it carry a high level of uncertainty.

At the same time, pharmaceutical companies are increasingly using batch synthesis to improve efficiency. Instead of testing molecules individually, they use combinations of chemical building blocks to test multiple candidates at once. However, this means that all chemical reactions must require the same experimental conditions. This makes estimating costs and value even more challenging.

SPARROW addresses this challenge by analyzing common intermediates involved in molecule synthesis and incorporating this information into a cost versus value function.

“When you think about this optimization game of designing a batch of molecules, the cost of adding a new structure depends on the molecules you’ve already selected,” Coley says.
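
The point about marginal cost can be made concrete with a toy calculation. The sketch below is not SPARROW's code; the molecule names, routes, and prices are all invented for illustration, and a route is simplified to the set of materials and intermediates it needs. It only shows why the cost of adding a candidate depends on what is already in the batch.

```python
# Toy illustration only: every name and price here is hypothetical.

# Assumed cost of each starting material or intermediate.
ITEM_COST = {
    "amine_A": 40.0,
    "acid_B": 25.0,
    "intermediate_AB": 60.0,
    "aryl_C": 30.0,
    "aryl_D": 35.0,
}

# Assumed routes: the set of items each candidate molecule needs.
ROUTES = {
    "mol_1": {"amine_A", "acid_B", "intermediate_AB", "aryl_C"},
    "mol_2": {"amine_A", "acid_B", "intermediate_AB", "aryl_D"},
    "mol_3": {"aryl_C", "aryl_D"},
}


def batch_cost(molecules):
    """Cost of a batch, paying for each shared item only once."""
    needed = set()
    for name in molecules:
        needed |= ROUTES[name]
    return sum(ITEM_COST[item] for item in needed)


def marginal_cost(candidate, selected):
    """Extra cost of adding `candidate` to an already selected batch."""
    return batch_cost(set(selected) | {candidate}) - batch_cost(selected)


# The same molecule costs far less to add once its intermediates are in the batch.
alone = marginal_cost("mol_2", [])
after_mol_1 = marginal_cost("mol_2", ["mol_1"])
print(alone, after_mol_1)
```

In this made-up example, the second molecule reuses the intermediate already needed for the first, so only its one unshared building block adds to the bill.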

The framework also takes into account factors such as the costs of starting materials, the number of reactions occurring in each synthesis route, and the likelihood that these reactions will be successful on the first try.
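
As a rough sketch of how these three factors might be summarized for a single route, the hypothetical helper below adds up starting-material costs, counts the reactions, and multiplies per-step success estimates. Treating the steps as independent, so that the probabilities simply multiply, is an assumption made here for illustration; the article does not say how SPARROW combines them.

```python
import math


def route_summary(material_costs, step_success_probs):
    """Summarize one hypothetical synthesis route.

    material_costs: cost of each starting material in the route.
    step_success_probs: estimated chance that each reaction works on the first try.
    """
    total_material_cost = sum(material_costs)
    n_reactions = len(step_success_probs)
    # Assumption: steps succeed or fail independently of one another.
    p_all_succeed = math.prod(step_success_probs)
    return total_material_cost, n_reactions, p_all_succeed


# Invented numbers: two starting materials and a three-step route.
cost, steps, p_success = route_summary([40.0, 25.0], [0.9, 0.8, 0.95])
print(cost, steps, round(p_success, 3))
```

Under this assumption, longer routes are penalized twice: each extra step adds work and also lowers the chance that the whole route succeeds.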

To use SPARROW, a scientist provides a set of molecular compounds they intend to test and a definition of the properties they hope to discover.

From this, SPARROW collects information about molecules and their synthetic pathways, and then compares the value of each with the cost of synthesizing a batch of candidates. It automatically selects the best subset of candidates that meet the user’s criteria and finds the most cost-effective routes to synthesizing these compounds.
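
A minimal sketch of this kind of batch selection follows. It is not SPARROW's implementation: exhaustive enumeration stands in for whatever optimizer the framework actually uses, the data are invented, and expressing each molecule's value in the same units as cost is a simplifying assumption.

```python
from itertools import combinations

# Toy data only: every name, price, and value below is hypothetical.
ITEM_COST = {
    "amine_A": 40.0,
    "acid_B": 25.0,
    "intermediate_AB": 60.0,
    "aryl_C": 30.0,
    "aryl_D": 35.0,
}
ROUTES = {
    "mol_1": {"amine_A", "acid_B", "intermediate_AB", "aryl_C"},
    "mol_2": {"amine_A", "acid_B", "intermediate_AB", "aryl_D"},
    "mol_3": {"aryl_C", "aryl_D"},
}
# Assumed value of testing each molecule, in the same units as cost.
VALUE = {"mol_1": 200.0, "mol_2": 160.0, "mol_3": 40.0}


def batch_cost(molecules):
    """Cost of a batch, paying for each shared item only once."""
    needed = set()
    for name in molecules:
        needed |= ROUTES[name]
    return sum(ITEM_COST[item] for item in needed)


def best_batch():
    """Try every subset and keep the one with the highest value minus cost."""
    names = sorted(ROUTES)
    best, best_score = (), 0.0  # the empty batch scores zero
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            score = sum(VALUE[m] for m in subset) - batch_cost(subset)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


batch, score = best_batch()
print(batch, score)
```

With these invented numbers, mol_3 would lose money if made on its own, yet it belongs in the best batch because the other two molecules already pay for everything it needs. Enumerating subsets only works for a handful of candidates; it is used here purely to keep the example short.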

“It does all the optimization in one step, so it can really take into account all competing goals at once,” Fromer says.

A versatile framework

SPARROW is unique because it can include molecular structures hand-designed by humans, those that exist in virtual catalogs, or never-before-seen molecules invented using generative artificial intelligence models.

“We have different sources of ideas. Part of the appeal of SPARROW is that you can take all of these ideas and put them on a level playing field,” adds Coley.

The researchers evaluated SPARROW in three case studies. The case studies, based on real problems faced by chemists, were designed to test SPARROW’s ability to find cost-effective synthesis plans when working with a wide range of input molecules.

They found that SPARROW effectively captured the marginal costs of batch synthesis and identified common experimental steps and chemical intermediates. Additionally, it can be scaled to handle hundreds of potential molecular candidates.

“There are many models in the chemistry machine learning community that are good at things like retrosynthesis or predicting molecular properties, but how do we actually use them? Our framework aims to extract the value of this prior work. Hopefully, by creating SPARROW, we will help other researchers think about complex down-selection using their own cost and utility functions,” Fromer says.

In the future, the researchers want to incorporate additional complexity into SPARROW. For example, they would like the algorithm to account for the fact that the value of testing one compound may not always be constant. They also want to include more elements of parallel chemistry in the cost versus value function.

“Fromer and Coley’s work better adapts algorithmic decision-making to the practical realities of chemical synthesis. When existing computational design algorithms are used, determining the best synthesis for a set of designs is left to the medicinal chemist, resulting in less optimal choices and additional work for the medicinal chemist,” says Patrick Riley, senior vice president of artificial intelligence at Relay Therapeutics, who was not involved in this research. “This paper presents a principled path that includes consideration of joint synthesis, which I hope will result in higher quality and more accepted algorithmic designs.”

“Identifying which compounds to synthesize in a way that carefully balances time, cost, and the potential to make progress toward goals while providing new, actionable information is one of the most difficult tasks facing drug discovery teams. Fromer and Coley’s SPARROW approach does this in an efficient and automated manner, providing a useful tool for human medicinal chemistry teams and taking important steps toward a fully autonomous approach to drug discovery,” adds John Chodera, a computational chemist at Memorial Sloan Kettering Cancer Center, who was not involved in this work.

This research was supported in part by DARPA’s Accelerated Molecular Discovery Program, the Office of Naval Research, and the National Science Foundation.
