Thursday, March 19, 2026

A better method for identifying overconfident large language models

Large language models (LLMs) can produce answers that sound reliable but are misleading, so researchers have developed uncertainty quantification methods to test how trustworthy their predictions are. One popular method involves prompting the model with the same question multiple times to see whether it produces the same answer.
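
As a rough illustration, here is a minimal Python sketch of that consistency check, assuming a hypothetical query_model helper that returns one sampled completion from the LLM:

from collections import Counter

def query_model(prompt: str) -> str:
    # Hypothetical stand-in for one sampled LLM completion.
    raise NotImplementedError

def consistency_score(prompt: str, n_samples: int = 10) -> float:
    # Fraction of samples that agree with the majority answer.
    # A score near 1.0 means the model is self-consistent; a low
    # score means its answers vary from sample to sample.
    answers = [query_model(prompt) for _ in range(n_samples)]
    _, majority_count = Counter(answers).most_common(1)[0]
    return majority_count / n_samples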

But this method only measures the model's self-confidence, and even the most capable LLM can be confidently wrong. Overconfidence can mislead users about how accurate a prediction really is, which can have serious consequences in high-stakes areas such as health care and finance.

To address this shortcoming, MIT researchers have introduced a new method for measuring a different type of uncertainty that more reliably identifies confident but incorrect LLM responses.

Their method compares the responses of the target model with those of a group of similar LLMs. They found that measuring the disagreement between models captures this type of uncertainty more accurately than standard approaches.

They combined their approach with a standard measure of the LLM's internal consistency to create a total uncertainty metric and evaluated it on 10 realistic tasks such as question answering and mathematical reasoning. This measure of total uncertainty consistently outperformed other measures and was better at identifying unreliable predictions.

“Many different approaches to quantifying uncertainty use the principle of consistency, but if the uncertainty estimate is based only on the output of a single model, it is not necessarily reliable. We went back to the beginning to understand the limitations of current approaches and used them as a starting point to design a complementary method that can empirically improve the results,” says Kimia Hamidieh, a graduate student in electrical engineering and computer science (EECS) at MIT and lead author of a paper on the technique.

She was joined on the paper by Veronika Thost, a research scientist at the MIT-IBM Watson AI Lab; Walter Gerych, a former MIT postdoc who is now an assistant professor at Worcester Polytechnic Institute; Mikhail Yurochkin, a research scientist at the MIT-IBM Watson AI Lab; and senior author Marzyeh Ghassemi, an associate professor in EECS and a member of the Institute for Medical Engineering and Science and the Laboratory for Information and Decision Systems.

Understanding overconfidence

Many popular methods for quantifying uncertainty involve asking the model for a confidence score or testing how consistent its responses to the same prompt are. These methods estimate aleatoric uncertainty, that is, the model's internal confidence in its own predictions.
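
The article does not name a specific consistency measure; one common formalization, sketched below under that assumption, uses the entropy of the sampled answers, where zero entropy means the model always answers the same way:

import math
from collections import Counter

def answer_entropy(answers: list[str]) -> float:
    # Shannon entropy (in nats) of the sampled answers. Zero means
    # perfect self-consistency; higher values mean the model's
    # answers are spread across more alternatives.
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

For example, answer_entropy(["Paris", "Paris", "Lyon"]) is about 0.64, while a fully consistent model scores 0.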

However, LLMs can be certain even when they are completely wrong. Research has shown that epistemic uncertainty, or uncertainty about whether the right model is being used, may be a better way to assess true uncertainty when the model is overconfident.

The MIT researchers estimate epistemic uncertainty by measuring disagreement among a group of similar LLMs.

“If I ask ChatGPT the same question multiple times and it keeps giving me the same answer, that doesn’t mean the answer is necessarily correct. But if I switch to Claude or Gemini, ask them the same question, and get a different answer, that gives me a sense of the epistemic uncertainty,” Hamidieh explains.
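
The simplest version of that cross-model check might look like the sketch below; exact string matching is a crude stand-in (the researchers compare meanings, as described in the next section), and the function name is ours:

def epistemic_disagreement(target_answer: str, ensemble_answers: list[str]) -> float:
    # Fraction of ensemble models whose answer differs from the
    # target model's answer. Exact matching is a crude proxy; the
    # researchers compare responses by semantic similarity.
    mismatches = sum(a != target_answer for a in ensemble_answers)
    return mismatches / len(ensemble_answers)

For instance, if the target model answers "Paris" and two of three other models agree, the disagreement score is 1/3.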

Epistemic uncertainty aims to capture how far the target model deviates from the ideal model for the task. But because it is impossible to build a perfect model, researchers rely on surrogates or approximations, which are often based on flawed assumptions.

To improve uncertainty quantification, the MIT researchers needed a more accurate way to estimate epistemic uncertainty.

Team approach

The method they developed measures the divergence between a target model and a small set of models of similar size and architecture. They found that comparing the semantic similarity of responses, that is, how closely their meanings match, provides a better estimate of epistemic uncertainty.
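
Here is a sketch of what such a semantic-divergence estimate could look like, assuming a hypothetical embed helper backed by any sentence-embedding model (the paper's exact similarity measure may differ):

import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a sentence-embedding model.
    raise NotImplementedError

def semantic_divergence(target_response: str, ensemble_responses: list[str]) -> float:
    # Average of one-minus-cosine-similarity between the target
    # model's response and each ensemble model's response: higher
    # values mean the ensemble disagrees more with the target.
    t = embed(target_response)
    scores = []
    for response in ensemble_responses:
        v = embed(response)
        cosine = float(np.dot(t, v) / (np.linalg.norm(t) * np.linalg.norm(v)))
        scores.append(1.0 - cosine)
    return float(np.mean(scores))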

To obtain the most accurate estimate, the researchers needed a set of LLMs that produced a variety of responses, were not too similar to the target model, and could be weighted by reliability.

“We found that the easiest way to meet all these properties was to use models trained by different companies. We tried many different, more complex approaches, but this very simple approach turned out to be the best,” says Hamidieh.

After developing this method for estimating epistemic uncertainty, they combined it with a standard approach that measures aleatoric uncertainty. The resulting measure of total uncertainty (TU) most accurately reflected whether the model’s confidence was trustworthy.

“Uncertainty depends on the inherent uncertainty of the given prompt and on how close our model is to the optimal model. Summing these two uncertainty metrics therefore gives us the best estimate,” Hamidieh says.
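
Under that description, combining the two scores is a plain sum; in the sketch below, the flagging threshold is an illustrative assumption, not a value from the paper:

def total_uncertainty(aleatoric: float, epistemic: float) -> float:
    # Total uncertainty (TU) as the sum of the two components.
    return aleatoric + epistemic

def flag_unreliable(aleatoric: float, epistemic: float, threshold: float = 0.5) -> bool:
    # Flag a prediction as potentially unreliable when TU exceeds
    # an (assumed) threshold; in practice this would be calibrated.
    return total_uncertainty(aleatoric, epistemic) > threshold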

TU could be especially effective at identifying situations in which an LLM is hallucinating, because high epistemic uncertainty can flag erroneous results that aleatoric uncertainty misses. It could also enable researchers to reinforce correct LLM responses during training, which could improve performance.

They tested TU using multiple LLMs on 10 common tasks, including question answering, summarization, translation, and mathematical reasoning. Their method was more effective at identifying unreliable predictions than either measure alone.

Measuring total uncertainty often required fewer queries than calculating aleatoric uncertainty, which could reduce computational costs and save energy.

Their experiments also showed that epistemic uncertainty is most effective on tasks with a unique correct answer, such as answering factual questions, but may perform worse on tasks that are more open-ended.

In the future, the researchers may adapt their technique to improve its performance on open-ended queries. They could also build on this work by exploring other forms of aleatoric uncertainty.

This work is funded in part by the MIT-IBM Watson AI Lab.
