Machine-learning models can make mistakes and be difficult to use, so scientists have developed explanation methods to help users understand when and how much they should trust a model's predictions.
But these explanations are often complex, perhaps containing information about hundreds of model features. And they are sometimes presented as multifaceted visualizations that can be difficult for users who lack machine-learning expertise to fully understand.
To help people make sense of AI explanations, MIT researchers used large language models (LLMs) to transform plot-based explanations into plain language.
They developed a two-part system that translates machine learning explanations into a paragraph of human-readable text, then automatically rates the quality of the narrative so the end user knows whether to trust it.
By giving the system a few sample explanations, researchers can tailor its narrative descriptions to user preferences or specific application requirements.
In the longer term, the researchers hope to build on this technique by enabling users to ask a model follow-up questions about how it arrived at its predictions in real-world settings.
“Our goal with this research was to take the first step toward allowing users to have full-blown conversations with machine-learning models about why they made certain predictions, so they can make better decisions about whether to use the model,” says Alexandra Zytek, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this technique.
She is joined on the paper by Sara Pido, an MIT postdoc; Sarah Alnegheimish, an EECS graduate student; Laure Berti-Équille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems (LIDS). The research will be presented at the IEEE Big Data Conference.
Elucidating explanations
The researchers focused on a popular type of machine-learning explanation called SHAP. In a SHAP explanation, a value is assigned to each feature the model uses to make a prediction. For example, if a model predicts house prices, one feature might be the location of the house. Location would be assigned a positive or negative value that represents how much that feature modified the model's prediction.
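As a concrete illustration of what a SHAP explanation contains, here is a minimal sketch using the open-source `shap` library with a toy house-price model; the dataset, feature names, and model choice are illustrative assumptions, not those used in the paper.

```python
# A minimal sketch of producing a SHAP explanation with the open-source
# `shap` library. The toy house-price data, feature names, and model are
# illustrative assumptions, not the datasets used in the paper.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "location_score": rng.uniform(0, 10, 200),
    "square_feet": rng.uniform(500, 4000, 200),
    "num_bedrooms": rng.integers(1, 6, 200),
})
y = 50_000 * X["location_score"] + 150 * X["square_feet"] + rng.normal(0, 10_000, 200)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# SHAP assigns each feature a positive or negative value describing how much
# that feature pushed this particular prediction up or down.
explainer = shap.Explainer(model, X)
explanation = explainer(X.iloc[[0]])
for name, value in zip(X.columns, explanation.values[0]):
    print(f"{name}: {value:+.0f}")
```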
SHAP explanations are often presented as bar plots that show which features are most and least important. But for a model with more than 100 features, that plot quickly becomes unwieldy.
“As researchers, we have to make a lot of choices about what we present visually. If we choose to show only the top 10, people might wonder what happened to another feature that isn’t in the plot. Using natural language unburdens us from having to make those choices,” says Veeramachaneni.
But rather than using a large language model to generate an explanation from scratch, the researchers use the LLM to transform an existing SHAP explanation into a readable narrative.
By having the LLM handle only the natural-language part of the process, the opportunity to introduce inaccuracies into the explanations is limited, Zytek explains.
Their system, called EXPLINGO, is divided into two pieces that work together.
The first piece, called NARRATOR, uses an LLM to create narrative descriptions of SHAP explanations that meet user preferences. By initially feeding NARRATOR three to five written examples of narrative explanations, the LLM mimics that style when generating text.
“Rather than forcing the user to specify what kind of explanation they are looking for, it is easier to simply ask them to write what they want to see,” Zytek says.
This allows NARRATOR to be easily adapted to new use cases by showing it a different set of hand-written examples.
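The sketch below shows the general few-shot prompting pattern this describes; it is not the authors' actual prompt or code, and the example narratives, the SHAP text, and the commented-out `call_llm` function are assumptions for illustration only.

```python
# A sketch of the few-shot prompting idea behind NARRATOR, not the authors'
# actual prompt or code. The example narratives, the SHAP text, and the
# commented-out `call_llm` function are assumptions for illustration.
EXAMPLE_NARRATIVES = [
    "The high location score (+45,000) was the main driver of the predicted "
    "price, while the small square footage (-12,000) pulled it down.",
    "This prediction was raised mostly by square footage (+30,000); the "
    "number of bedrooms had little effect (+2,000).",
]

def build_narrator_prompt(shap_text: str, examples: list[str]) -> str:
    """Combine a few user-written example narratives with a new SHAP
    explanation so the LLM imitates the examples' style."""
    shots = "\n\n".join(f"Example narrative:\n{e}" for e in examples)
    return (
        "Rewrite the following SHAP explanation as a short narrative, matching "
        "the style of the examples. Do not add information that is not in the "
        "explanation.\n\n"
        f"{shots}\n\n"
        f"SHAP explanation:\n{shap_text}\n\nNarrative:"
    )

prompt = build_narrator_prompt(
    "location_score: +45000, square_feet: -12000, num_bedrooms: +3000",
    EXAMPLE_NARRATIVES,
)
print(prompt)
# narrative = call_llm(prompt)  # hypothetical call to whichever LLM API is used
```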
After NARRATOR creates a plain-language explanation, the second piece, GRADER, uses an LLM to rate the narrative on four metrics: conciseness, accuracy, completeness, and fluency. GRADER automatically prompts the LLM with the text from NARRATOR and the SHAP explanation it describes.
“We find that, even when an LLM makes a mistake doing a task, it often won’t make a mistake when checking or validating that task,” she says.
Users can also customize GRADER to give different weights to each metric.
“One could imagine, in high-stakes cases, weighting accuracy and completeness much higher than fluency, for example,” she adds.
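Here is a rough sketch of how a GRADER-style check with user-adjustable metric weights might look; the prompt wording, the 0-to-1 score scale, and the placeholder LLM call are assumptions rather than the paper's exact protocol.

```python
# A sketch of a GRADER-style check: an LLM is prompted to score the narrative
# on four metrics, and the scores are combined with user-chosen weights.
# The prompt wording, 0-to-1 scale, and placeholder LLM call are assumptions.

GRADER_PROMPT = (
    "Given the SHAP explanation:\n{shap_text}\n\n"
    "and the narrative describing it:\n{narrative}\n\n"
    "Rate the narrative from 0 to 1 on conciseness, accuracy, completeness, "
    "and fluency. Answer as 'metric: score' lines."
)

def weighted_grade(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-metric scores; in high-stakes settings,
    accuracy and completeness can be weighted more heavily than fluency."""
    return sum(scores[m] * w for m, w in weights.items()) / sum(weights.values())

# scores = parse_scores(call_llm(GRADER_PROMPT.format(...)))  # hypothetical
scores = {"conciseness": 0.9, "accuracy": 1.0, "completeness": 0.8, "fluency": 0.7}
weights = {"conciseness": 1.0, "accuracy": 3.0, "completeness": 3.0, "fluency": 0.5}
print(f"overall grade: {weighted_grade(scores, weights):.2f}")
```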
Analyzing narratives
For Zytek and her colleagues, one of the biggest challenges was adjusting the LLM so it generated natural-sounding narratives. The more guidelines they added to control style, the more likely the LLM was to introduce errors into the explanations.
“A lot of prompt tuning went into finding and fixing each mistake one at a time,” she says.
To test their system, the researchers took nine machine learning datasets with explanations and asked different users to write a narrative for each dataset. This allowed them to evaluate the NARRATOR’s ability to emulate unique styles. The GRADER tool was used to rate each narrative explanation across all four metrics.
Ultimately, the researchers found that their system could generate high-quality narrative explanations and effectively mimic a variety of writing styles.
Their results show that providing a few handwritten example explanations greatly improves the narrative style. However, those examples must be written carefully – including comparative words such as “larger” can cause GRADER to mark accurate explanations as incorrect.
Building on these results, the researchers want to explore techniques that could help their system better handle comparative words. They also want to expand EXPLINGO by adding rationalization to the explanations.
In the long run, they hope to use this work as a stepping stone toward an interactive system where a user can ask a model follow-up questions about an explanation.
“It would help with decision-making in many ways. If people disagree with the model’s predictions, we want them to be able to quickly find out whether their intuition is correct or whether the model’s intuition is correct and where that difference comes from,” says Zytek.