Thursday, March 12, 2026

Samsung AI Researcher’s Novel Open TRM Reasoning Model Outperforms Models 10,000 Times Larger – For Specific Problems

The trend of artificial intelligence researchers developing new, small open-source generative models that outperform much larger proprietary models continued this week with another striking advance.

Alexia Jolicoeur-Martineau, a senior AI researcher at the Samsung Advanced Institute of Technology (SAIT) in Montreal, Canada, has introduced the Tiny Recursion Model (TRM), a neural network so compact that it contains just 7 million parameters (internal model settings), yet it matches or outperforms state-of-the-art language models 10,000 times larger in parameter count, including OpenAI's o3-mini and Google's Gemini 2.5 Pro, on some of the most challenging benchmarks in artificial intelligence research.

The goal is to show that highly capable new AI models can be built inexpensively, without the massive investment in graphics processing units (GPUs) and power needed to train the larger, multi-trillion-parameter flagship models that power many LLM chatbots today. The results are described in a research paper published on the open-access site arxiv.org, titled "Less is More: Recursive Reasoning with Tiny Networks."

"The idea that to solve difficult tasks you have to rely on huge foundational models trained for millions of dollars by some big corporation is a trap," Jolicoeur-Martineau wrote on the social network X. "Too much emphasis is currently placed on exploiting LLMs, rather than on devising and developing new methods."

Jolicoeur-Martineau added: "Through recursive reasoning, it turns out that 'less is more.' A tiny model, trained from scratch, that recursively refines and updates its answers over time can accomplish a lot without breaking the bank."

The TRM code is now available on GitHub under the enterprise-friendly, commercially permissive MIT License, which means anyone from researchers to companies can download it, modify it, and deploy it for their own purposes, including commercial applications.

One big caveat

However, readers should be aware that TRM was designed specifically for solving structured, visual, grid-based problems such as Sudoku, mazes, and puzzles from the ARC (Abstraction and Reasoning Corpus)-AGI benchmark. The latter offers tasks that should be easy for a human but difficult for AI models, such as recoloring a grid based on a prior, related but not identical, example solution.

From hierarchy to simplicity

The TRM architecture is a radical simplification.

It builds on a technique called the Hierarchical Reasoning Model (HRM), introduced earlier this year, which showed that small networks could solve logic puzzles such as Sudoku and mazes.

HRM relied on two cooperating networks, one operating at high frequency and the other at low frequency, supported by biologically inspired arguments and mathematical justifications involving fixed-point theorems. Jolicoeur-Martineau found this unnecessarily complicated.

TRM removes these elements. Instead of two networks, it uses a single two-layer model that recursively refines its own predictions.

The model starts with an embedded question x, an initial answer y, and a latent reasoning state z. Through a series of reasoning steps, it updates the latent state z and refines the answer y until it converges on a stable output. Each iteration corrects potential errors from the previous step, yielding a self-improving inference process without additional hierarchy or mathematical overhead.
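The refinement loop described above can be sketched in a few lines of PyTorch. This is an illustrative reconstruction, not the official TRM code; the network sizes, step counts, and class name are placeholder assumptions.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative sketch of TRM-style recursion (not the official code).
    A single small network repeatedly updates a latent state z, then the answer y."""

    def __init__(self, dim: int = 64):
        super().__init__()
        # one tiny two-layer network, reused for every refinement step
        self.net = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x, y, z, n_latent_steps: int = 6):
        # refine the latent reasoning state z given the question x and current answer y
        for _ in range(n_latent_steps):
            z = self.net(torch.cat([x, y, z], dim=-1))
        # then refine the answer y from the updated latent state
        y = self.net(torch.cat([x, y, z], dim=-1))
        return y, z
```

In the paper's full training setup, this inner refinement is wrapped in multiple supervision steps, with a loss applied to each intermediate answer so that every pass is pushed to improve on the last.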

How recursion replaces scale

The basic idea behind TRM is that recursion can substitute for depth and size.

By iterating on its own outputs, the network effectively simulates a much deeper architecture without the associated memory and computational overhead. This recursive cycle, with up to sixteen supervision steps, lets the model make progressively better predictions, in a similar spirit to how large language models use multi-step chain-of-thought reasoning, but here achieved with a compact feedforward design.

Simplicity pays off in both performance and generalization. The model uses fewer layers, no fixed-point approximations, and no two-network hierarchy. A lightweight halting mechanism decides when to stop refining, avoiding wasted computation while maintaining accuracy.
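The article does not detail the halting rule, but a common design for such a mechanism is a small learned head that predicts whether another refinement pass is worthwhile. The sketch below is a hedged illustration of that general pattern, not the published TRM mechanism; all names and the threshold are assumptions.

```python
import torch
import torch.nn as nn

class HaltingHead(nn.Module):
    """Hypothetical lightweight halting head: scores the current answer and
    stops refinement once the predicted 'continue' probability falls below
    a threshold. Illustrative only, not the published TRM mechanism."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar logit per answer

    def should_stop(self, y: torch.Tensor, threshold: float = 0.5) -> bool:
        # estimated probability that a further refinement step is still needed
        p_continue = torch.sigmoid(self.score(y)).mean().item()
        return p_continue < threshold
```

The appeal of this kind of head is its cost: a single linear layer per check, negligible next to even a two-layer refinement network.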

Performance that punches above its weight

Despite its small size, TRM delivers benchmark results that rival or exceed those of models thousands of times larger. In tests, the model achieved:

  • 87.4% accuracy on Sudoku-Extreme (up from 55% for HRM)

  • 85% accuracy on Maze-Hard

  • 45% accuracy on ARC-AGI-1

  • 8% accuracy on ARC-AGI-2

These results exceed or come very close to the performance of several high-end large language models, including DeepSeek R1, Gemini 2.5 Pro, and o3-mini, even though TRM uses less than 0.01% of their parameters.

Such results suggest that recursive reasoning, rather than scale, may be the key to solving abstract and combinatorial reasoning problems – domains where even the highest-end generative models often stumble.

Design philosophy: less is more

TRM's success stems from deliberate minimalism. Jolicoeur-Martineau found that reducing complexity leads to better generalization.

When the researcher increased the number of layers or the size of the model, performance dropped due to overfitting on the small datasets.

However, a two-layer structure, combined with recursive depth and deep supervision, achieved the best results.

The model also performed better when self-attention was replaced with a simpler multilayer perceptron for tasks with small, fixed contexts, such as Sudoku.

For larger grids, such as ARC puzzles, self-attention remained valuable. These findings underline that model architecture should match the structure and scale of the data, rather than defaulting to maximum capacity.
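As a rough illustration of that trade-off: over a small, fixed grid such as a 9x9 Sudoku (81 cells), information can be mixed across positions with a plain MLP over the flattened sequence instead of self-attention. The function name and dimensions below are illustrative assumptions, not taken from the TRM code.

```python
import torch
import torch.nn as nn

def mlp_token_mixer(seq_len: int = 81, dim: int = 8) -> nn.Module:
    """Sketch: mixes information across all positions of a fixed-length
    sequence with one MLP, in place of self-attention. Only viable when
    seq_len is small and fixed, e.g. the 81 cells of a Sudoku grid."""
    return nn.Sequential(
        nn.Flatten(start_dim=1),                  # (batch, seq_len * dim)
        nn.Linear(seq_len * dim, seq_len * dim),  # global mixing across cells
        nn.ReLU(),
        nn.Unflatten(1, (seq_len, dim)),          # back to (batch, seq_len, dim)
    )
```

The catch, and the reason self-attention survives for ARC, is that this layer is hard-wired to one sequence length, whereas attention handles variable-sized grids without changing its weights.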

Train small, think big

TRM is now officially available as open source under the MIT license on GitHub.

The repository includes full training and evaluation scripts, tools for creating datasets for Sudoku, Maze and ARC-AGI, and reference setups for reproducing published results.

It also documents computational requirements, from a single NVIDIA L40S GPU for Sudoku training to multi-GPU H100 setups for ARC-AGI experiments.

The open release confirms that TRM was designed specifically for structured, grid-based reasoning tasks rather than general-purpose language modeling.

Each benchmark (Sudoku-Extreme, Maze-Hard, and ARC-AGI) uses small, well-defined input-output grids, well suited to the model's recursive supervision process.

The training relies on heavy data augmentation (such as color permutations and geometric transformations), underscoring that TRM's efficiency lies in its parameter count rather than in its total computational demand.
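A minimal sketch of the kind of grid augmentation described, assuming ARC-style grids of integer color codes 0-9; the function name and details are illustrative, not taken from the repository.

```python
import random
import numpy as np

def augment_grid(grid: np.ndarray, n_colors: int = 10) -> np.ndarray:
    """Apply a random color permutation plus a random rotation/reflection
    (a dihedral transform) to an ARC-style integer grid. Illustrative sketch."""
    # relabel the colors consistently across the whole grid
    perm = np.random.permutation(n_colors)
    out = perm[grid]
    # random geometric transform: one of four rotations, optionally mirrored
    out = np.rot90(out, k=random.randrange(4))
    if random.random() < 0.5:
        out = np.fliplr(out)
    return out.copy()
```

In practice, the same permutation and geometric transform must be applied to both the input and output grid of a training pair, so that the underlying input-to-output mapping is preserved.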

The simplicity and clarity of the model make it more accessible to researchers outside large corporate laboratories. Its code base builds directly on the prior structure of the Hierarchical Reasoning Model, but removes HRM’s biological analogies, multiple network hierarchies, and fixed-point dependencies.

In this way, TRM offers a repeatable reference point for examining recursive inference in small models – a counterpoint to the dominant “all you need is scale” philosophy.

Community reaction

The release of TRM and its open-source codebase sparked immediate debate among AI researchers and practitioners on X. While many praised the achievement, others questioned how broadly its methods could generalize.

Supporters hailed TRM as proof that small models can outperform giants, calling it “10,000 times smaller, but smarter” and a potential step towards architectures that think, not just scale.

Critics countered that TRM's domain is narrow, focused on small, grid-based puzzles, and that the computational savings come mainly from model size rather than total runtime.

Researcher Yunmin Cha noted that TRM's training relies on heavy augmentation and many recursive passes, amounting to "more compute, same model."

Cancer geneticist and data analyst Loveday emphasized that TRM is a solver, not a chat model or a text generator: it excels at structured reasoning, but not at open-ended language.

Machine learning researcher Sebastian Raschka positioned TRM as a valuable simplification of HRM rather than a new form of general intelligence.

He described the process as “a two-step loop that updates the state of internal reasoning and then refines the answer.”

Several researchers, including Augustine Nabele, agreed that the model's strength lies in its clear reasoning structure, but noted that future work will need to demonstrate transfer to less constrained problem types.

The consensus emerging online is that TRM may be narrow, but its message is broad: careful recursion, not ever-greater scale, may fuel the next wave of reasoning research.

Looking to the future

Although TRM currently applies to supervised reasoning tasks, its recursive structure opens several future directions. Jolicoeur-Martineau has suggested exploring generative or multi-answer variants, where the model could produce several possible solutions rather than a single deterministic one.

Another open question concerns scaling laws for recursion – determining how far “less is more” can go as model complexity or data size increases.

Ultimately, the study offers both a practical tool and a conceptual reminder: progress in artificial intelligence does not have to depend on increasingly larger models. Sometimes, teaching a compact network to think carefully—and recursively—can be more effective than getting a gigantic network to think once.
