Thursday, March 12, 2026

Debugging and Tracing LLMs Like a Professional


Image by the author | Canva

# Introduction

Traditional debugging with print() or logging works, but it is slow and clumsy for LLM applications. Phoenix gives you a timeline view of every step, inspection of prompts and responses, error detection with retry insight, visibility into latency and cost, and a complete visual picture of your application. Phoenix, by Arize AI, is a powerful open-source observability and tracing tool designed specifically for LLM applications. It helps you visually monitor, debug, and trace everything that happens in your LLM pipelines. In this article we will walk through what Phoenix does and why it matters, how to integrate Phoenix with LangChain step by step, and how to visualize traces in the Phoenix UI.

# What is Phoenix?

Phoenix is an open-source observability and debugging tool built for large language model applications. It captures detailed telemetry from LLM workflows, including prompts, responses, latencies, errors, and tool usage, and presents this information in an intuitive, interactive dashboard. Phoenix lets developers understand deeply how their LLM pipelines behave internally, identify problems with prompt outputs, analyze bottlenecks, monitor token usage and the associated costs, and trace error/retry logic during execution. It supports seamless integration with popular frameworks such as LangChain and LlamaIndex, and also offers OpenTelemetry support for more customized setups.

# Step by step configuration

// 1. Install the required libraries

Make sure you have Python 3.8+ and install dependencies:

pip install arize-phoenix langchain langchain-together openinference-instrumentation-langchain langchain-community

// 2. Starting Phoenix

Add this code to start the Phoenix dashboard:

import phoenix as px
px.launch_app()

This starts the local dashboard at http://localhost:6006.
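If you are scripting the setup, a small helper can confirm that the dashboard is reachable before you continue. This helper is not part of Phoenix; it is a plain standard-library sketch:

```python
import urllib.request

def dashboard_ready(url: str = "http://localhost:6006", timeout: float = 2.0) -> bool:
    """Return True if the Phoenix UI responds at the given URL."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except Exception:
        return False

print(dashboard_ready())  # True once px.launch_app() has the UI running
```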

// 3. Building the LangChain pipeline with Phoenix tracing

To see Phoenix in action, we will build a straightforward LangChain chatbot. We want to:

  • Debug whether the prompt is working
  • Monitor how long the model takes to respond
  • Trace the prompt structure, model, and output
  • See all of this visually instead of logging everything by hand

// Step 1: Start the Phoenix dashboard in the background

import threading
import phoenix as px

# Launch Phoenix app locally (access at http://localhost:6006)
def run_phoenix():
    px.launch_app()

threading.Thread(target=run_phoenix, daemon=True).start()

// Step 2: Register Phoenix with OpenTelemetry and instrument LangChain

from phoenix.otel import register
from openinference.instrumentation.langchain import LangChainInstrumentor

# Register OpenTelemetry tracer
tracer_provider = register()

# Instrument LangChain with Phoenix
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)

// Step 3: Initialize the LLM (Together API)

from langchain_together import Together

llm = Together(
    model="meta-llama/Llama-3-8b-chat-hf",
    temperature=0.7,
    max_tokens=256,
    together_api_key="your-api-key",  # Replace with your actual API key
)

Do not forget to replace "your-api-key" with your actual API key. You can get one by signing up with Together AI.
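Rather than hardcoding the key, it is safer to read it from an environment variable. A minimal sketch; the variable name TOGETHER_API_KEY is the conventional choice, but confirm it for your setup:

```python
import os

# Fall back to the placeholder so a missing key is obvious rather than silent
together_api_key = os.environ.get("TOGETHER_API_KEY", "your-api-key")
if together_api_key == "your-api-key":
    print("Warning: TOGETHER_API_KEY is not set; using the placeholder.")
```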

// Step 4: Define a prompt template

from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])
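Conceptually, a chat prompt template is just a list of (role, text) messages with variables substituted at call time. A plain-Python sketch of that idea (an illustration, not LangChain's actual implementation):

```python
messages = [
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
]

def fill(messages, **variables):
    # Substitute template variables into each message body
    return [(role, text.format(**variables)) for role, text in messages]

filled = fill(messages, question="What is the capital of France?")
print(filled)
```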

// Step 5: Connect the prompt and the model in a chain

chain = prompt | llm

// Step 6: Ask a few questions and print the answers

questions = [
    "What is the capital of France?",
    "Who discovered gravity?",
    "Give me a motivational quote about perseverance.",
    "Explain photosynthesis in one sentence.",
    "What is the speed of light?",
]

print("Phoenix running at http://localhost:6006\n")

for q in questions:
    print(f" Question: {q}")
    response = chain.invoke({"question": q})
    print(" Answer:", response, "\n")

// Step 7: Keep your application alive for monitoring

import time

try:
    while True:
        time.sleep(1)  # Sleep instead of busy-waiting so the loop does not burn CPU
except KeyboardInterrupt:
    print("Exiting.")

# Understanding Phoenix traces and metrics

Before looking at the output, we should first understand Phoenix's core concepts: traces and spans.
Trace: each trace represents one full run of the LLM pipeline. For example, every question such as "What is the capital of France?" generates a new trace.
Span: each trace is composed of multiple spans, each representing a stage in the chain:

  • ChatPromptTemplate.format: prompt formatting
  • TogetherLLM.invoke: the LLM call
  • Any custom components you add
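Conceptually, a trace is just a collection of timed spans. A minimal pure-Python sketch of that data model (an illustration of the idea, not Phoenix's internals):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Span:
    """One timed step in a pipeline (e.g. prompt formatting or an LLM call)."""
    name: str
    start: float
    end: float

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000

@dataclass
class Trace:
    """One full run of the pipeline, made up of ordered spans."""
    question: str
    spans: list = field(default_factory=list)

    def record(self, name, fn):
        start = time.perf_counter()
        result = fn()  # run the step and time it
        self.spans.append(Span(name, start, time.perf_counter()))
        return result

# Simulate one run: format the prompt, then "call" the model
trace = Trace("What is the capital of France?")
prompt_text = trace.record("prompt.format", lambda: f"Q: {trace.question}")
answer = trace.record("llm.invoke", lambda: "Paris")

for span in trace.spans:
    print(span.name, f"{span.duration_ms:.3f} ms")
```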

Metrics shown for a trace

| Metric | Meaning |
| --- | --- |
| Latency (ms) | Measures the total time for the full LLM chain, including prompt formatting, the LLM response, and final processing. Helps identify bottlenecks and debug slow responses. |
| Input tokens | Number of tokens sent to the model. Critical for monitoring input size and controlling API costs, since most pricing is token-based. |
| Output tokens | Number of tokens generated by the model. Useful for understanding verbosity, response quality, and cost impact. |
| Prompt template | Displays the full prompt with variables filled in. Helps confirm that prompts are structured and populated correctly. |
| Input / output text | Shows both the user's input and the model's response. Useful for checking interaction quality and spotting hallucinations or incorrect answers. |
| Duration | Breaks down the time taken by each step (such as prompt creation or model invocation). Helps identify bottlenecks in the chain. |
| Chain name | Identifies which part of the pipeline a span belongs to (e.g. prompt.format or TogetherLLM.invoke). Helps isolate where problems occur. |
| Tags / metadata | Additional information such as model name, temperature, etc. Useful for filtering runs, comparing results, and analyzing parameter impact. |
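Input and output token counts map directly to spend. A toy cost calculator to make the connection concrete; the per-million-token prices below are made up, so check your provider's pricing page:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_m: float = 0.20, price_out_per_m: float = 0.20) -> float:
    """Rough USD cost of one call, given hypothetical per-million-token prices."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000

# e.g. 1,200 input tokens and 300 output tokens at the hypothetical default prices
print(f"${estimate_cost(1200, 300):.6f}")
```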

Now visit http://localhost:6006 to open the Phoenix dashboard. You will see something like this:

The Phoenix dashboard

Open the first trace to display its details.

The first trace in Phoenix

# Wrapping Up

To sum up, Arize Phoenix makes it remarkably simple to debug, trace, and monitor LLM applications. You no longer have to guess what went wrong or dig through logs. Everything is right there: prompts, responses, timings, and more. It helps you catch problems, understand performance, and simply build better AI experiences with much less stress.

Kanwal Mehreen is a machine learning engineer and a technical writer with a deep passion for data science and the intersection of AI with medicine. She co-authored the ebook "Maximizing Productivity with ChatGPT". As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.
