# Entry
Large language models (LLMs) tend to use flowery, often verbose language in their answers. Ask a straightforward question and you'll likely be inundated with paragraphs of overly detailed, enthusiastic, and complicated prose. This behavior has its roots in training: models are optimized to be as helpful and conversational as possible.
Unfortunately, verbosity is more than a stylistic quirk: it arguably correlates with an increased likelihood of a more serious problem, hallucinations. The more words a response contains, the greater the chance of drifting away from established knowledge and into fabrication.
In short, robust guardrails are needed to address this twofold problem, starting with controlling verbosity. This article shows you how to use Textstat, a Python library for measuring readability, to detect overly complicated responses before they reach the end user and force the model to refine its answer.
# Setting a complexity budget with Textstat
The Textstat Python library can calculate scores such as the Automated Readability Index (ARI), which estimates the grade level (years of schooling) needed to understand a piece of text, such as a model's answer. If the complexity score exceeds a budget, or threshold (for example, 10.0, which corresponds to a 10th-grade reading level), a re-prompting loop can automatically be triggered, requesting a more concise and simpler response. This strategy not only trims flowery language but can also help reduce the risk of hallucinations, because the model ends up sticking more closely to the basic facts.
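To make the budget concrete, here is a rough, self-contained sketch of the formula behind ARI (4.71 × characters/words + 0.5 × words/sentences − 21.43). In practice you would simply call textstat.automated_readability_index; Textstat's tokenization differs slightly from this simplified version:

```python
import re

def ari(text: str) -> float:
    """Rough Automated Readability Index:
    4.71*(chars/words) + 0.5*(words/sentences) - 21.43.
    Simplified tokenization; textstat's implementation is more thorough."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    chars = sum(len(w) for w in words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / sentences) - 21.43

simple = "The cat sat on the mat. It was warm."
dense = ("The inextricably intertwined permutations of cognitive computational "
         "arrays frequently obfuscate the foundational semantic payload.")
print(f"simple: {ari(simple):.2f}")  # well below a 10.0 budget
print(f"dense:  {ari(dense):.2f}")   # well above a 10.0 budget
```

Long words per sentence drive the score up, which is exactly why flowery LLM prose tends to blow past a grade-level budget.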
# LangChain pipeline implementation
Let's see how to implement the strategy described above and integrate it into a LangChain pipeline that can easily be run in a Google Colab notebook. You will need a Hugging Face API token, which can be obtained for free at https://huggingface.co/settings/tokens. Create a new secret named HF_TOKEN in Colab by clicking the "Secrets" icon (it looks like a key) in the left-hand menu. Paste the generated API token into the "Value" field and you're done!
First, install the necessary libraries:
!pip install textstat langchain_huggingface langchain_community
The code below is specific to Google Colab and may need to be adapted if you are working in a different environment. It retrieves the stored API token:
from google.colab import userdata
# Obtain the Hugging Face API token saved in your Colab session's Secrets
HF_TOKEN = userdata.get('HF_TOKEN')
# Verify token retrieval
if not HF_TOKEN:
    print("WARNING: The token 'HF_TOKEN' wasn't found. This may cause errors.")
else:
    print("Hugging Face Token loaded successfully.")
In the following code snippet we perform several actions. First, we configure the components for local text generation using the pre-trained Hugging Face model distilgpt2. The model is then wrapped for use in a LangChain pipeline.
import textstat
import torch
from langchain_core.prompts import PromptTemplate
# Importing the classes needed for local Hugging Face pipelines
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface import HuggingFacePipeline
# Initializing a free, lightweight LLM suitable for local text generation
model_id = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Creating a text-generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    device=0 if torch.cuda.is_available() else -1  # Use GPU if available, else CPU
)
# Wrapping the pipeline so LangChain can use it as an LLM
llm = HuggingFacePipeline(pipeline=pipe)
Our core verbosity measurement and control mechanism is implemented next. The following function generates a summary of the text passed to it (assumed to be an LLM response) and ensures that the summary does not exceed a complexity threshold. Note that, given an appropriate prompt template, generative models such as distilgpt2 can be used to produce text summaries, although their quality may not be on par with heavier summarization-focused models. We chose this model because it runs reliably in a constrained local environment.
def safe_summarize(text_input, complexity_budget=10.0):
    print("\n--- Starting Summary Process ---")
    print(f"Input text length: {len(text_input)} characters")
    print(f"Target complexity budget (ARI score): {complexity_budget}")

    # Step 1: Initial Summary Generation
    print("Generating initial comprehensive summary...")
    base_prompt = PromptTemplate.from_template(
        "Provide a comprehensive summary of the following: {text}"
    )
    chain = base_prompt | llm
    summary = chain.invoke({"text": text_input})
    print("Initial Summary generated:")
    print("-------------------------")
    print(summary)
    print("-------------------------")

    # Step 2: Measure Readability
    ari_score = textstat.automated_readability_index(summary)
    print(f"Initial ARI Score: {ari_score:.2f}")

    # Step 3: Enforce Complexity Budget
    if ari_score > complexity_budget:
        print("Budget exceeded! Initial summary is too complex.")
        print("Triggering simplification guardrail...")
        simplification_prompt = PromptTemplate.from_template(
            "The following text is too verbose. Rewrite it concisely "
            "using simple vocabulary, stripping away flowery language:\n\n{text}"
        )
        simplify_chain = simplification_prompt | llm
        simplified_summary = simplify_chain.invoke({"text": summary})
        new_ari = textstat.automated_readability_index(simplified_summary)
        print("Simplified Summary generated:")
        print("-------------------------")
        print(simplified_summary)
        print("-------------------------")
        print(f"Revised ARI Score: {new_ari:.2f}")
        summary = simplified_summary
    else:
        print("Initial summary is within complexity budget. No simplification needed.")

    print("--- Summary Process Finished ---")
    return summary
Note also that in the code above, ARI scores are calculated both for the initial summary and, when the guardrail fires, for the simplified one, so you can compare the two.
The last part of the sample code tests the function defined above by passing in sample text and a complexity budget of 10.0, then printing the final result.
# 1. Providing some highly verbose, complicated sample text
sample_text = """
The inextricably intertwined permutations of cognitive computational arrays within the
realm of Vast Language Models often precipitate a cascade of unnecessarily labyrinthine
lexical structures. This propensity for circumlocution, whilst seemingly indicative of
profound erudition, frequently obfuscates the foundational semantic payload, thereby
rendering the generated discourse significantly less accessible to the quintessential layperson.
"""
# 2. Calling the function
print("Running summarizer pipeline...\n")
final_output = safe_summarize(sample_text, complexity_budget=10.0)
# 3. Printing the final result
print("\n--- Final Guardrailed Summary ---")
print(final_output)
The resulting printed messages can be quite long, but when the pre-trained model is asked to simplify its summary, you may notice a drop in the ARI score. Don't expect miraculous results, though: the chosen model, while lightweight, is not particularly good at summarizing text, so the reduction in ARI score is rather modest. You can try other models, e.g. google/flan-t5-small, to see how they handle summarization, but beware: these models are heavier and more demanding to run.
# Summary
This article showed how to implement a guardrail that measures and controls overly verbose LLM responses, checking their complexity and re-prompting for a simpler version before they are released. In many scenarios, hallucinations are a byproduct of excessive verbosity. Although the implementation shown here focuses on verbosity, dedicated checks can also be used to detect hallucinations, such as semantic consistency checks, natural language inference (NLI) cross-encoders, and LLM-as-a-judge approaches.
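Those other checks can reuse the same measure-and-retry skeleton as the verbosity guardrail: generate, score, and re-prompt while the score is over budget. A minimal, model-agnostic sketch (all names here are hypothetical, and the toy "model" below just shortens text so the loop is observable):

```python
from typing import Callable

def guardrailed_generate(generate: Callable[[str], str],
                         score: Callable[[str], float],
                         budget: float,
                         prompt: str,
                         revise_template: str = "Rewrite more simply:\n\n{text}",
                         max_retries: int = 2) -> str:
    """Generic measure-and-retry guardrail: re-prompt until score(output)
    falls within budget or retries are exhausted. `generate` and `score`
    are pluggable: an ARI check, an NLI consistency score, a judge model..."""
    output = generate(prompt)
    for _ in range(max_retries):
        if score(output) <= budget:
            break
        output = generate(revise_template.format(text=output))
    return output

# Toy demo: a "model" that halves the text it is given, scored by length
fake_llm = lambda p: p.split("\n\n")[-1][: max(1, len(p.split("\n\n")[-1]) // 2)]
result = guardrailed_generate(fake_llm, len, budget=10, prompt="a" * 40)
print(len(result))  # → 10, shrunk until within budget
```

Swapping in textstat.automated_readability_index as `score` and a LangChain chain as `generate` recovers the verbosity guardrail built in this article.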
Ivan Palomares Carrascosa is a thought leader, writer, speaker, and advisor in the fields of artificial intelligence, machine learning, deep learning, and LLMs. He trains and advises others on using artificial intelligence in the real world.
