# Introduction
JSON is great for APIs, storage, and application logic. However, in large language model (LLM) pipelines, it often carries token overhead that adds little value for the model: braces, quotes, commas, and repeated field names on every line. TOON, short for Token-Oriented Object Notation, is a newer format designed to keep the same JSON data model while using fewer tokens and giving models clearer structural cues. The official TOON docs describe it as a compact, lossless representation of JSON for LLM input, especially effective for uniform arrays of objects.
In this article, you’ll learn what TOON is, when you should use it, and how to start using it step by step in your own LLM workflow. We’ll also be honest about the trade-offs, because TOON is useful in some cases, but not all.
# Why JSON wastes tokens in LLM pipelines
JSON becomes expensive in prompts because it repeats structure over and over again. LLMs don’t care that JSON is the standard; all they see is tokens.
If you send 100 support tickets, product lines, or user records to the model, the same field names appear in every object. TOON removes this repetition by declaring the fields once and then streaming the row values in a compact tabular form. Here’s a simple example.
JSON:
{
"users": [
{ "id": 1, "name": "Alice", "role": "admin" },
{ "id": 2, "name": "Bob", "role": "user" },
{ "id": 3, "name": "Charlie", "role": "user" }
]
}
TOON:
users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
Same data, less clutter.
The structure is still clear, but the repeating keys are gone. This is where TOON derives most of its value.
# What exactly is TOON and when is it worth using it?
TOON is a serialization format for the JSON data model. That means it can represent objects, arrays, strings, numbers, booleans, and nulls – but in a form that is more compact as model input. The TOON project presents it as lossless with respect to JSON, which means you can convert JSON to TOON and back without losing information. The crucial thing to understand is this:
You don’t need to replace JSON in your application.
A better approach is to keep JSON in your backend, APIs, and storage, and convert to TOON only at the point where you send structured data to the LLM.
TOON is most useful when the prompt contains repeated records with the same structure and the same fields. Good examples are retrieved support tickets, catalog rows, analytics records, tool output, CRM entries, or memory snapshots for agent systems. However, if your structure is deeply nested, very irregular, completely flat, or very small, the benefits shrink or disappear.
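To make the shape of the format concrete before we touch the tooling, here is a minimal TypeScript sketch of the tabular encoding for a uniform array of objects. It is a toy illustration only: the helper name is made up for this article, and the official TOON SDK and CLI remain the source of truth for details such as escaping, nesting, indentation, and non-uniform data.

```typescript
// Toy illustration of TOON's tabular encoding for a uniform array of objects.
// Not the official encoder: it skips escaping, nesting, and non-uniform rows.
type Row = Record<string, string | number | boolean | null>;

function toTabularToon(key: string, rows: Row[]): string {
  if (rows.length === 0) return `${key}[0]:`;
  const fields = Object.keys(rows[0]); // declare the field names once in the header
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  const lines = rows.map((row) => fields.map((f) => String(row[f])).join(",")); // one compact line per record
  return [header, ...lines].join("\n");
}

const users = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
  { id: 3, name: "Charlie", role: "user" },
];

console.log(toTabularToon("users", users));
// users[3]{id,name,role}:
// 1,Alice,admin
// 2,Bob,user
// 3,Charlie,user
```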
# First steps with TOON
## Step 1: Installing the TOON command line interface
The easiest way to try TOON is to use the official command line interface (CLI) from the TOON project. The TOON site links directly to the CLI, and the main repository presents the format as part of a broader ecosystem of SDKs and tools.
Install the package globally:
npm install -g @toon-format/cli
## Step 2: Convert a JSON file to TOON
First, let’s create a folder:
mkdir toon-test
cd toon-test
Now create a file named users.json in that folder and paste this into it:
[
{ "id": 1, "name": "Alice", "role": "admin" },
{ "id": 2, "name": "Bob", "role": "user" },
{ "id": 3, "name": "Charlie", "role": "user" }
]
Now convert it to TOON:
npx @toon-format/cli users.json -o users.toon
You should get a compact result similar to this:
[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
This is the basic TOON pattern: declare the shape once, then print the values line by line. This is consistent with the official design goal of tabular arrays for uniform objects.
## Step 3: Using TOON as model input
The best place to use TOON is on the input side of the pipeline. Instead of pasting a large JSON blob into the prompt, pass the TOON version and keep the instructions simple.
For example:
The following data is in TOON format.
users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
Summarize the user roles and point out anything unusual.
This works well because TOON is designed to help the model read repeating structure with less effort. This is also how the official project frames its benchmarks: as tests of how well models understand different structured input formats.
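As a sketch of what this looks like in code, here is one way to pass the same TOON block as prompt context. It assumes the OpenAI Node SDK and a placeholder model name purely for illustration; any chat-style client can be used in the same way.

```typescript
// Sketch: pass TOON-encoded data as prompt context.
// Assumes the OpenAI Node SDK and a placeholder model name; adapt to your own client.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const toonData = `users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user`;

const prompt = `The following data is in TOON format.

${toonData}

Summarize the user roles and point out anything unusual.`;

const response = await client.chat.completions.create({
  model: "gpt-4o-mini", // placeholder model name
  messages: [{ role: "user", content: prompt }],
});

console.log(response.choices[0].message.content);
```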
## Step 4: Keep JSON for model outputs
This is one of the most important practical decisions. TOON is very useful for data input, but JSON is still usually the better choice for output when another system needs to parse the model’s response. JSON has far better tooling support, and modern APIs can enforce structured JSON output using schemas.
In practice, the safest pattern is:
- JSON in your application.
- TOON for large, structured prompt context.
- JSON again for machine-parsable model responses.
This ensures efficiency on the input side and reliability on the output side.
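Here is a hedged sketch of that round trip, again assuming the OpenAI Node SDK and its JSON mode as one example of constrained output; other providers offer similar schema-based options.

```typescript
// Sketch: TOON on the input side, JSON on the output side.
// Assumes the OpenAI Node SDK's JSON mode; the model name and field names are placeholders.
import OpenAI from "openai";

const client = new OpenAI();

const toonData = `users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user`;

const response = await client.chat.completions.create({
  model: "gpt-4o-mini", // placeholder model name
  response_format: { type: "json_object" }, // force a JSON reply
  messages: [
    {
      role: "user",
      content:
        `The following data is in TOON format.\n\n${toonData}\n\n` +
        `Return a JSON object with a "roles" field mapping each role to a count.`,
    },
  ],
});

// The reply is plain JSON, so downstream code can parse it as usual.
const parsed = JSON.parse(response.choices[0].message.content ?? "{}");
console.log(parsed);
```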
## Step 5: Benchmarking in your own pipeline
Don’t change formats based on hype alone.
Run a small benchmark in your own workflow (a token-counting sketch follows this list):
- Count input tokens for JSON.
- Count input tokens for TOON.
- Compare latency.
- Compare the quality of the answers.
- Compare the total cost.
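Below is the token-counting sketch mentioned above. It assumes the js-tiktoken package and the cl100k_base encoding as a rough approximation; the exact tokenizer depends on your model, so treat the counts as relative rather than absolute.

```typescript
// Sketch: compare token counts for the same data as JSON and as TOON.
// Assumes the js-tiktoken package; cl100k_base only approximates your model's tokenizer.
import { getEncoding } from "js-tiktoken";

const users = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
  { id: 3, name: "Charlie", role: "user" },
];

const asJson = JSON.stringify({ users }, null, 2);
const asToon = `users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user`;

const enc = getEncoding("cl100k_base");
console.log("JSON tokens:", enc.encode(asJson).length);
console.log("TOON tokens:", enc.encode(asToon).length);
```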
The official TOON project claims token savings as one of its main benefits, and third-party accounts echo these claims, but community discussion also shows that the results are highly dependent on the shape of the data. So the best question isn’t “Is TOON better than JSON?”
A better question is: “Is TOON better for this particular LLM stage?”
# Final thoughts
TOON is not something that needs to be used everywhere.
It’s a targeted optimization for one specific problem: wasted tokens on repeated JSON structure in LLM prompts. If your pipeline feeds a lot of repeating structured records into the model, it’s worth testing TOON. If your payloads are small, irregular, or deeply nested, JSON may still be the better choice.
The smartest way to approach it is simple: keep JSON where JSON already works well, use TOON where you pack large, structured inputs into prompts, and compare the results on your own tasks before committing.
Kanwal Mehreen is a machine learning engineer and technical writer with a deep passion for data science and the intersection of artificial intelligence and medicine. She is co-author of the e-book “Maximizing Productivity with ChatGPT”. As a 2022 Google Generation Scholar for APAC, she promotes diversity and academic excellence. She is also recognized as a Teradata Diversity in Tech Scholar, a Mitacs Globalink Research Scholar, and a Harvard WeCode Scholar. Kanwal is a staunch advocate for change and founded FEMCodes to empower women in STEM fields.
