Wednesday, December 25, 2024

Using ideas from game theory to improve the reliability of language models

Imagine that you and a friend are playing a game where your goal is to convey secret messages to each other using only cryptic sentences. Your friend’s job is to guess the secret message behind your sentences. Sometimes you give clues directly, and other times your friend has to guess the message by asking yes-or-no questions about the clues you have given. The challenge is that you both want to make sure you understand each other correctly and agree on the secret message.

Researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) have created a similar “game” to improve the way artificial intelligence understands and generates text. It is called a “consensus game,” and it involves two parts of an AI system: one part tries to generate sentences (like giving clues), and the other part tries to understand and evaluate those sentences (like guessing the secret message).

The researchers found that by treating this interaction as a game in which both parts of the AI work together under certain rules to agree on the right message, they could significantly improve the AI’s ability to answer questions correctly and consistently. They tested this new game-like approach on a variety of tasks, such as reading comprehension, math problem solving, and conversation, and found that it helped the AI perform better across the board.

Traditionally, large language models answer in one of two ways: generating answers directly from the model (generative querying) or using the model to score a set of predefined answers (discriminative querying), and the two can produce different and sometimes inconsistent results. With the generative approach, “Who is the president of the United States?” might yield a straightforward answer like “Joe Biden.” A discriminative query, however, could incorrectly dispute this fact when evaluating the same answer, favoring, say, “Barack Obama.”
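
To make the distinction concrete, here is a minimal sketch of the two querying modes. The helpers `lm_generate` and `lm_logprob` are hypothetical stand-ins for whatever language model API is available, and the prompt format is purely illustrative:

```python
# Hedged sketch of the two querying modes. `lm_generate` and `lm_logprob`
# are hypothetical stand-ins: the first returns a free-form completion of
# a prompt, the second returns the log-probability the model assigns to a
# given completion of a prompt.

def generative_query(lm_generate, question):
    # Ask the model to produce an answer directly.
    return lm_generate(f"Q: {question}\nA:")

def discriminative_query(lm_logprob, question, candidates):
    # Ask the model to score predefined candidates instead; this ranking
    # can disagree with what generative_query would have produced.
    scores = {c: lm_logprob(f"Q: {question}\nA:", c) for c in candidates}
    return max(scores, key=scores.get)
```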

So how can we reconcile mutually incompatible scoring procedures to produce consistent and effective predictions?

“Imagine a new way to help language models understand and generate text, like a game. We have developed a training-free, game-theoretic method that treats the whole process as a complex game of clues and signals, in which a generator tries to send the right message to a discriminator using natural language; instead of chess pieces, they are using words and sentences,” says Athul Jacob, an MIT PhD student in electrical engineering and computer science and a CSAIL affiliate. “The way we navigate this game is by finding an ‘approximate equilibrium,’ which leads to a new decoding algorithm called ‘equilibrium ranking.’ It’s a pretty exciting demonstration of how bringing game-theoretic strategies into the mix can tackle the serious challenge of making language models more reliable and consistent.”

Tested on multiple tasks such as reading comprehension, commonsense reasoning, math problem solving, and dialogue, the team’s algorithm consistently improved the performance of these models. Applying the equilibrium-ranking algorithm to the LLaMA-7B model even eclipsed the results obtained with much larger models. “Given that these models are already competitive, and that people have been working on them for a while, the level of improvement we saw, being able to outperform a model 10 times the size, was a pleasant surprise,” Jacob says.

The game continues

“Diplomacy,” the strategic board game set in pre-World War I Europe in which players negotiate alliances, betray friends, and conquer territories without the use of dice, relying solely on skill, strategy, and interpersonal manipulation, recently enjoyed a second act. In November 2022, computer scientists including Jacob developed “Cicero,” an artificial intelligence agent that achieves human-level capabilities in this mixed-motive, seven-player game, which requires the same skills mentioned above, but with natural language. The math behind Cicero partially inspired the consensus game.

While the history of AI agents long predates OpenAI’s software entering the chat in November 2022, it is well documented that such agents can still pose as a well-meaning, if pathological, friend.

The consensus game system reaches equilibrium in the form of an agreement, ensuring both accuracy and fidelity to the model’s original insights. To achieve this, the method iteratively adjusts the interactions between the generative and discriminative components until they reach a consensus on an answer that accurately reflects reality and is consistent with their initial beliefs. This approach effectively bridges the gap between the two querying methods.
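
The iterative procedure can be illustrated with a simplified sketch. The version below is not the paper’s exact no-regret dynamics; it uses KL-regularized best responses to the opponent’s average policy over a finite candidate set, and the `lam` and `iters` values are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

def kl_regularized_br(avg_util, log_p0, lam):
    # Closed-form KL-regularized best response:
    #   argmax_p <p, u> - lam * KL(p || p0)  ==>  p proportional to p0 * exp(u / lam)
    return softmax(avg_util / lam + log_p0)

def consensus_ranking(log_pg0, log_pd0, lam=0.1, iters=1000):
    """log_pg0[i]: generator's initial log-score for candidate i;
       log_pd0[i]: discriminator's initial log-score that i is correct."""
    avg_pg, avg_pd = softmax(log_pg0), softmax(log_pd0)
    for t in range(2, iters + 2):
        # Consensus payoff: each player is rewarded for choosing the
        # candidate the other tends to choose, so each best-responds to
        # the opponent's average policy while staying near its own prior.
        pg = kl_regularized_br(avg_pd, log_pg0, lam)
        pd = kl_regularized_br(avg_pg, log_pd0, lam)
        avg_pg += (pg - avg_pg) / t
        avg_pd += (pd - avg_pd) / t
    # Rank candidates by combining the two near-equilibrium policies.
    return avg_pg * avg_pd

scores = consensus_ranking(
    np.log([0.5, 0.3, 0.2]),  # generator prior over 3 candidate answers
    np.log([0.2, 0.6, 0.2]),  # discriminator prior over the same 3
)
print(int(np.argmax(scores)))  # index of the consensus answer
```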

In practice, implementing the consensus game approach to querying language models, especially for question answering tasks, involves significant computational challenges. For example, when using datasets such as MMLU, which contain thousands of multiple-choice questions, the model must apply the consensus mechanism to each query, reaching agreement between the generative and discriminative components for every question and each of its possible answers.
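
As a rough illustration of that per-question cost, the hypothetical loop below reuses the `lm_logprob` stand-in and the `consensus_ranking` sketch from above; the dataset layout and prompt wording are assumptions, not the paper’s setup:

```python
import numpy as np

def answer_dataset(lm_logprob, dataset):
    # dataset: iterable of (question, candidates) pairs, e.g. 4 choices each.
    preds = []
    for question, candidates in dataset:
        # Generative side: how likely is the model to produce each candidate?
        log_pg0 = np.array([lm_logprob(f"Q: {question}\nA:", c)
                            for c in candidates])
        # Discriminative side: how likely does the model judge each correct?
        log_pd0 = np.array([lm_logprob(
            f"Q: {question}\nA: {c}\nIs this answer correct? ", "Yes")
            for c in candidates])
        # Every question pays for scoring all candidates twice, plus the
        # iterative consensus computation itself.
        scores = consensus_ranking(log_pg0, log_pd0)
        preds.append(candidates[int(np.argmax(scores))])
    return preds
```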

The system did stumble over one grade-school rite of passage: math word problems. It could not generate wrong answers, which is a key component of understanding the process of arriving at the right one.

“Over the last few years, we have seen truly impressive progress in both strategic decision-making and language generation from AI systems, but we are just starting to figure out how to combine the two. Equilibrium ranking is a first step in this direction, but I think there is a lot we will be able to do to scale this up to more complex problems,” says Jacob.

Directions for future work include improving the base model by integrating the outputs of the current method. This is particularly promising because it could yield more factual and consistent answers across a variety of tasks, including factuality and open-ended generation. The potential for such a method to significantly improve the base model’s performance is high, which could mean more reliable and fact-grounded results from ChatGPT and similar language models that people use every day.

“Even though modern language models such as ChatGPT and Gemini have made it possible to solve various tasks via chat interfaces, the statistical decoding process that generates a response from such models has remained unchanged for decades,” says Google research scientist Ahmad Beirami, who was not involved in the work. “The MIT researchers’ proposal is an innovative game-theoretic framework for decoding from language models by solving the equilibrium of a consensus game. The significant performance gains reported in the research paper are promising, opening the door to a potential paradigm shift in language model decoding that could fuel a wave of new applications.”

Jacob wrote the paper with MIT-IBM Watson AI Lab researcher Yikang Shen and Gabriele Farina and Jacob Andreas, assistant professors in MIT’s Department of Electrical Engineering and Computer Science; Andreas is also a member of CSAIL. Earlier this month, they presented their work at the International Conference on Learning Representations (ICLR), where it was highlighted as a spotlight paper. The study also received a best-paper award at the NeurIPS R0-FoMo Workshop in December 2023.
