Monday, December 23, 2024

Despite its impressive results, generative AI lacks a coherent understanding of the world


Large language models can do impressive things, such as write poetry or generate viable computer programs, even though these models are trained to predict the words that come next in a piece of text.

Such surprising capabilities can make it seem as if the models are implicitly learning some general truths about the world.

But according to a recent study, that’s not necessarily the case. Researchers found that a popular type of generative AI model can provide turn-by-turn driving directions in New York City with near-perfect accuracy – without having formed a precise internal map of the city.

Despite the model’s amazing ability to navigate efficiently, when researchers closed some streets and added detours, its performance dropped dramatically.

As they dug deeper, the researchers discovered that the model’s implicitly generated maps of New York City contained many non-existent streets weaving between the grid and connecting distant intersections.

This could have serious implications for generative AI models deployed in the real world, since a model that appears to work well in one context may break down if the task or environment changes slightly.

“One hope is that, because LLMs can accomplish all these amazing things in language, maybe we could use these same tools in other areas of science as well. But the question of whether LLMs learn coherent world models is very important if we want to use these techniques to make new discoveries,” says senior author Ashesh Rambachan, assistant professor of economics and a principal investigator in MIT’s Laboratory for Information and Decision Systems (LIDS).

Rambachan is joined on a paper about the work by lead author Keyon Vafa, a postdoctoral fellow at Harvard University; Justin Y. Chen, a graduate student in electrical engineering and computer science (EECS) at MIT; Jon Kleinberg, Tisch University Professor of Computer Science and Information Science at Cornell University; and Sendhil Mullainathan, an MIT professor of EECS and economics and a member of LIDS. The research will be presented at the Conference on Neural Information Processing Systems.

New metrics

The researchers focused on a type of generative AI model, called a transformer, that forms the backbone of LLMs such as GPT-4. Transformers are trained on vast amounts of linguistic data to predict the next item in a sequence, such as the next word in a sentence.

However, if scientists want to determine whether an LLM has formed an accurate model of the world, measuring the accuracy of its predictions is not sufficient, the researchers say.

For example, they found that a transformer could predict valid moves in the game Connect 4 nearly every time without understanding any of the rules.

Therefore, the team developed two new metrics that can test a transformer’s world model. The researchers focused their evaluations on a class of problems called deterministic finite automata (DFAs).

A DFA is a problem with a sequence of states, such as the intersections that must be traversed to reach a destination, and a concrete way of describing the rules that must be followed along the way.
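
To make this concrete, here is a minimal sketch of a DFA in Python; the state names and transitions are toy assumptions chosen purely for illustration, not the researchers’ actual formulation.

```python
# A minimal DFA sketch: states, a transition table, and a step function.
class DFA:
    def __init__(self, transitions, start):
        # transitions maps (state, symbol) -> next state
        self.transitions = transitions
        self.state = start

    def step(self, symbol):
        # Deterministic: each (state, symbol) pair has at most one successor.
        key = (self.state, symbol)
        if key not in self.transitions:
            raise ValueError(f"no transition from {self.state!r} on {symbol!r}")
        self.state = self.transitions[key]
        return self.state

# Toy "street grid": two intersections, A and B, joined by a single street.
city = DFA({("A", "east"): "B", ("B", "west"): "A"}, start="A")
city.step("east")  # now at intersection B
```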

They chose two problems that they formulated as DFA: navigating the streets of New York and playing the board game Othello.

“We needed test beds where we know what the world model is. Now we can think rigorously about what it means to recover that world model,” explains Vafa.

The first metric they developed, called sequence distinction, says that a model has formed a coherent world model if it sees two different states, like two different Othello boards, and recognizes how they differ. Sequences, or ordered lists of data points, are what transformers use to generate outputs.
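
A rough sketch of how such a check could look in code is below; the helper names (model_valid_next, true_state) are hypothetical stand-ins, not the paper’s actual evaluation code.

```python
def sequence_distinction(model_valid_next, true_state, prefix_a, prefix_b):
    # Hypothetical check: if two sequences end in *different* true states,
    # a model with a coherent world model should propose different sets of
    # valid next moves for them.
    if true_state(prefix_a) == true_state(prefix_b):
        return None  # not a distinguishing pair; this metric does not apply
    return model_valid_next(prefix_a) != model_valid_next(prefix_b)
```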

The second metric, called sequence compression, says that a transformer with a coherent world model should know that two identical states, like two identical Othello boards, have the same sequence of possible next steps.
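
Under the same assumptions, a companion sketch of this second check might look like the following.

```python
def sequence_compression(model_valid_next, true_state, prefix_a, prefix_b):
    # Hypothetical check: if two sequences end in the *same* true state,
    # a model with a coherent world model should propose the same set of
    # valid next moves after either one.
    if true_state(prefix_a) != true_state(prefix_b):
        return None  # different end states; this metric does not apply
    return model_valid_next(prefix_a) == model_valid_next(prefix_b)
```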

They used these metrics to test two common classes of transformers, one trained on data generated from randomly produced sequences and the other on data generated by following strategies.

Incoherent models of the world

Surprisingly, the researchers found that transformers which made choices at random formed more accurate world models, perhaps because they saw a wider range of potential next steps during training.

“In Othello, if you see two random computers playing rather than championship players, in theory you’d see the full set of possible moves, even the bad moves championship players wouldn’t make,” Vafa explains.

Even though the transformers generated accurate directions and valid Othello moves in nearly every case, the two metrics revealed that only one generated a coherent world model for Othello moves, and neither did well at forming coherent world models in the wayfinding example.

The researchers demonstrated the consequences of this by adding detours to the map of New York City, which caused all the navigation models to fail.

“I was surprised how quickly performance deteriorated as soon as we added the detour. If we close just 1 percent of possible streets, the accuracy immediately drops from almost 100 percent to just 67 percent,” Vafa says.
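
One way such a stress test might be sketched, assuming the street network is represented as a simple edge list and the model exposes a route-proposing function, is shown below; all names here are illustrative assumptions, not the study’s code.

```python
import random

def close_streets(edges, fraction=0.01, seed=0):
    # Illustrative helper: randomly drop a small fraction of street segments.
    rng = random.Random(seed)
    return [e for e in edges if rng.random() >= fraction]

def route_validity(propose_route, open_edges, trips):
    # Fraction of trips whose proposed route uses only streets still open.
    # propose_route(origin, dest) stands in for the model's directions.
    open_set = set(open_edges)
    valid = sum(
        all(step in open_set for step in propose_route(origin, dest))
        for origin, dest in trips
    )
    return valid / len(trips)
```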

When they recovered the city maps the models implicitly generated, they looked like an imagined New York City with hundreds of streets crisscrossing on top of the grid. The maps often contained random overpasses above other streets or multiple streets with impossible orientations.

These results show that transformers can perform certain tasks surprisingly well without understanding the rules. If scientists want to build LLMs that can capture accurate world models, they need to take a different approach, the researchers say.

“We often see these models doing impressive things and think they must have understood something about the world. I hope we can convince people that this is a question that needs to be thought through very carefully, and that we don’t have to rely on our own intuition to answer it,” Rambachan says.

In the future, the researchers want to tackle a more diverse set of problems, such as those where some rules are only partially known. They also want to apply their evaluation metrics to real-world scientific problems.

This work is supported in part by the Harvard Data Science Initiative, a National Science Foundation Graduate Research Fellowship, a Vannevar Bush Faculty Fellowship, a Simons Collaboration grant, and a MacArthur Foundation grant.
