Creating a unique and promising research hypothesis is an essential skill for any scientist. It can also be time-consuming: new PhD students may spend the first year of their program trying to decide exactly what to study in their experiments. What if artificial intelligence could help?
MIT researchers have created a way to autonomously generate and evaluate promising research hypotheses across fields through human-AI collaboration. In a new paper, they describe how they used this framework to create evidence-based hypotheses that address unmet research needs in the field of bioinspired materials.
The study, published on Wednesday, was co-authored by Alireza Ghafarollahi, a postdoc in the Laboratory for Atomistic and Molecular Mechanics (LAMM), and Markus Buehler, the Jerry McAfee Professor of Engineering in MIT’s departments of Civil and Environmental Engineering and Mechanical Engineering and director of LAMM.
The framework, which the researchers call SciAgents, consists of multiple AI agents, each with specific capabilities and access to data, that use “graph reasoning” methods, in which AI models employ a knowledge graph that organizes and defines relationships between different scientific concepts. The multi-agent approach mimics the way biological systems organize themselves as groups of elementary building blocks. Buehler notes that this “divide and conquer” principle is a prominent paradigm in biology at many levels, from materials to insect swarms to civilizations, all examples in which the total intelligence is much greater than the sum of the individuals’ abilities.
“By using multiple AI agents, we are trying to simulate the process by which communities of scientists make discoveries,” Buehler says. “At MIT, we do this by having a group of people from different backgrounds collaborate and run into each other at coffee shops or on MIT’s Infinite Corridor. But it’s very random and slow. Our goal is to simulate the discovery process by seeing whether AI systems can be creative and make discoveries.”
Automating good ideas
As recent developments have shown, large language models (LLMs) have demonstrated an impressive ability to answer questions, summarize information, and perform simple tasks. However, when it comes to generating new ideas from scratch, they are quite limited. The MIT researchers wanted to design a system that would enable AI models to perform a more sophisticated, multi-step process that goes beyond recalling information learned during training, to extrapolating and creating new knowledge.
The basis of their approach is an ontological knowledge graph that organizes and draws connections between various scientific concepts. To produce the graphs, the researchers feed a set of research articles into a generative AI model. In previous work, Buehler used a field of mathematics known as category theory to help an AI model develop abstractions of scientific concepts in the form of graphs, grounded in defining relationships between components in a way that can be analyzed by other models through a process called graph reasoning. This focuses AI models on developing a more principled way of understanding concepts; it also allows them to generalize better across domains.
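The core idea of turning papers into an ontological knowledge graph can be sketched in a few lines. In this hypothetical example, `extract_relations` stands in for the generative-model call that pulls (concept, relation, concept) triples out of a paper's text; here it is stubbed with hand-written triples, and the graph is held in `networkx`. None of these names come from the paper itself.

```python
# Minimal sketch: build an ontological knowledge graph from papers.
# `extract_relations` is a stand-in for an LLM call that extracts
# (head concept, relation, tail concept) triples from raw text.
import networkx as nx

def extract_relations(paper_text):
    # Placeholder for a generative-model call; a real system would
    # prompt an LLM with the paper text and parse its triples.
    return [
        ("silk", "exhibits", "high tensile strength"),
        ("silk", "processed via", "aqueous spinning"),
        ("aqueous spinning", "reduces", "energy use"),
    ]

def build_graph(papers):
    graph = nx.DiGraph()
    for text in papers:
        for head, relation, tail in extract_relations(text):
            # Concepts become nodes; the relation labels the edge.
            graph.add_edge(head, tail, relation=relation)
    return graph

kg = build_graph(["full text of paper 1 ..."])
print(kg.number_of_nodes(), kg.number_of_edges())
```

With the three stub triples above, the graph ends up with four concept nodes connected by three labeled edges; scaling to roughly 1,000 papers, as in the study, just means feeding more texts into `build_graph`.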
“Creating science-oriented AI models is really important to us because scientific theories are typically based on generalizable principles, not just knowledge recall,” Buehler says. “By focusing AI models on ‘thinking’ in this way, we can move beyond conventional methods and explore more creative applications of AI.”
In the new paper, the researchers used about 1,000 research studies on biological materials, but Buehler says knowledge graphs could be generated from many more or fewer research articles in any given field.
Individual agents within the framework interact with each other to collectively solve a complex problem that none of them could solve alone. Their first task is to generate a research hypothesis. LLM interactions begin once a subgraph has been defined from the knowledge graph, which can happen randomly or by manually entering a pair of keywords discussed in the articles.
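One simple way to seed the agents from a pair of keywords, sketched here under the assumption that the knowledge graph is a `networkx` graph and both keywords appear as nodes, is to keep every concept along a path connecting them. The toy graph and node names below are illustrative, not from the paper.

```python
# Sketch: select a subgraph between two user-supplied keywords.
import networkx as nx

# Toy knowledge graph; a real one would come from ~1,000 papers.
kg = nx.Graph()
kg.add_edges_from([
    ("silk", "spinning"),
    ("spinning", "processing energy"),
    ("processing energy", "energy-intensive"),
    ("silk", "beta sheets"),
])

def seed_subgraph(graph, key_a, key_b):
    # Keep every concept on the shortest path between the keywords;
    # this path-based selection mimics seeding the LLM discussion.
    path = nx.shortest_path(graph, key_a, key_b)
    return graph.subgraph(path)

sub = seed_subgraph(kg, "silk", "energy-intensive")
print(sorted(sub.nodes))
```

Random seeding is the same operation with the keyword pair drawn at random from the node set.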
The first language model, which the researchers dubbed the “Ontologist,” is tasked with defining the scientific terms in the articles and exploring the connections between them, fleshing out the knowledge graph. A model called “Scientist 1” then prepares a research proposal based on factors such as the potential to discover unexpected properties and novelty. The proposal includes a discussion of potential outcomes, research impact, and conjectures about the underlying mechanisms. The “Scientist 2” model expands on the idea by suggesting specific experimental and simulation approaches and making other improvements. Finally, the “Critic” model highlights the proposal’s strengths and weaknesses and suggests further refinements.
“It’s about building a team of experts who don’t all think the same way,” Buehler says. “They need to think differently and have different capabilities. The Critic agent is deliberately programmed to criticize the others, so not everyone agrees and says it’s a great idea. You have an agent who says, ‘Here’s a weakness, can you explain it better?’ This makes the results significantly different from those of individual models.”
Other agents in the system can search the existing literature, giving the system the ability not only to assess feasibility but also to generate and evaluate the novelty of each idea.
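A novelty check of this kind can be approximated by comparing a generated hypothesis against abstracts from the literature. The sketch below uses bag-of-words cosine similarity purely for illustration; the actual system's retrieval and scoring are not described in this article, and a realistic version would use literature search plus embeddings.

```python
# Toy novelty score: 1 minus the similarity to the closest known abstract.
import math
from collections import Counter

def cosine(a, b):
    # Bag-of-words cosine similarity between two texts.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def novelty_score(hypothesis, corpus):
    # High when even the closest existing abstract is dissimilar.
    return 1.0 - max(cosine(hypothesis, doc) for doc in corpus)

corpus = [
    "silk fibers show high tensile strength and toughness",
    "graphene composites improve electrical conductivity",
]
score = novelty_score("silk combined with dandelion pigments", corpus)
print(round(score, 2))
```

Feasibility would need a separate check, since a hypothesis can be maximally novel precisely because it is impossible.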
Strengthening the system
To validate their approach, Buehler and Ghafarollahi generated a subgraph from the keywords “silk” and “energy-intensive.” Using this framework, “Scientist 1” proposed integrating silk with dandelion-based pigments to create biomaterials with improved optical and mechanical properties. The model predicted that the material would be much stronger than traditional silk materials and would require less energy to process.
“Scientist 2” then suggested, for example, using specialized molecular dynamics simulation tools to study the interactions of the proposed materials, adding that a good application for the material would be a bioinspired adhesive. The “Critic” model then highlighted several strengths of the proposed material along with areas for improvement, such as its scalability, long-term stability, and the environmental impact of solvent use. To address these concerns, the Critic suggested conducting pilot studies to validate the process and rigorously analyzing the material’s durability.
The researchers also conducted other experiments with randomly selected keywords, which led to various original hypotheses on more efficient biomimetic microfluidic chips, enhancing the mechanical properties of collagen-based scaffolds, and interactions between graphene and amyloid fibrils to create bioelectronic devices.
“The system was able to develop these new, rigorous ideas based on a path through the knowledge graph,” Ghafarollahi says. “In terms of novelty and applicability, the materials seemed robust and innovative. In future work, we will generate thousands or tens of thousands of new research ideas, which we can then categorize to better understand how these materials are created and how they could be improved further.”
In the future, the researchers hope to incorporate new tools for information retrieval and simulation into their framework. They can also easily swap the foundation models in the framework for more advanced ones, allowing the system to adapt to the latest AI innovations.
“Because of the way these agents interact, improving one model, even if it’s small, has a huge impact on the overall behavior and performance of the system,” Buehler says.
Since releasing a preprint detailing their open-source approach, the researchers have been contacted by hundreds of people interested in applying the framework in various scientific fields, and even in areas such as finance and cybersecurity.
“A lot of things can be done without having to go to the lab,” Buehler says. “Basically, you want to go to the lab at the very end of the process. The lab is expensive and takes a long time, so you need a system that can analyze the best ideas in great detail, formulate the best hypotheses, and accurately predict emergent behaviors. Our vision is to make it easy to use, so you can use the app to bring in other ideas or drag in datasets to really challenge the model to make new discoveries.”