
Discovering Novel Algorithms with AlphaTensor

Authors

Alhussein Fawzi, Matej Balog, Bernardino Romera-Paredes, Demis Hassabis, Pushmeet Kohli

AlphaZero’s First Extension to Mathematics Opens New Possibilities for Research

Algorithms have helped mathematicians perform fundamental operations for thousands of years. The ancient Egyptians created an algorithm to multiply two numbers without requiring a multiplication table, and the Greek mathematician Euclid described an algorithm for computing the greatest common divisor that is still in use today.

During the Islamic Golden Age, the Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms for solving linear and quadratic equations. In fact, al-Khwarizmi’s name, rendered in Latin as Algoritmi, led to the term algorithm. Yet, despite today’s familiarity with algorithms – used throughout society from classroom algebra to cutting-edge scientific research – the process of discovering new algorithms is extremely difficult, and exemplifies the remarkable reasoning abilities of the human mind.

In our paper, published today in the journal Nature, we present AlphaTensor, the first artificial intelligence (AI) system to discover novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. It sheds light on a 50-year-old open question in mathematics: finding the fastest way to multiply two matrices.

This paper is a stepping stone in DeepMind’s mission to advance science and solve the most fundamental problems using AI. Our system, AlphaTensor, builds on AlphaZero, an agent that has shown superhuman performance at board games such as chess, Go, and shogi, and this work charts AlphaZero’s journey from playing games to tackling unsolved mathematical problems for the first time.

Matrix multiplication

Matrix multiplication is one of the simplest operations in algebra, commonly taught in high school math classes. But outside the classroom, this humble operation has had a profound impact on today’s digital world and is ubiquitous in modern computing.

Example of the multiplication process of two 3×3 matrices.

This operation is used to process images on smartphones, recognize speech commands, generate graphics for computer games, run simulations to predict the weather, compress data and videos for sharing on the internet, and much more. Companies around the world spend large amounts of time and money developing computing hardware to multiply matrices efficiently. So even small improvements in the efficiency of matrix multiplication can have a broad impact.

For centuries, mathematicians believed that the standard matrix multiplication algorithm was the most efficient one achievable. But in 1969, the German mathematician Volker Strassen shocked the mathematical community by showing that better algorithms do exist.

The standard algorithm, compared to Strassen’s algorithm, which uses one fewer scalar multiplication (7 instead of 8) to multiply two 2×2 matrices. Multiplications matter much more than additions for overall efficiency.

By studying very small matrices (of size 2×2), Strassen discovered an ingenious way of combining the entries of the matrices to yield a faster algorithm. Despite decades of research following his breakthrough, larger versions of this problem have remained unsolved – so much so that it is not known how efficiently two matrices as small as 3×3 can be multiplied.
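To make the trick concrete, here is a minimal NumPy sketch of Strassen’s 2×2 scheme (the function name and layout are ours, for illustration): it trades one multiplication for several extra additions, which pays off because multiplications dominate the cost when the scheme is applied to larger matrices.

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    instead of the 8 used by the standard algorithm."""
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]

    # Strassen's seven products
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)

    # Recombine using only additions and subtractions
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)
```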

In our paper, we explored how modern AI techniques could automate the discovery of new matrix multiplication algorithms. Building further on human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery.

The process and progress of automating algorithmic discovery

We first transformed the problem of finding efficient matrix multiplication algorithms into a single-player game. In this game, the board is a three-dimensional tensor (an array of numbers) capturing how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, the result is a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is captured by the number of steps taken to zero out the tensor.
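As a rough illustration of this game board (our own toy construction, not the paper’s code), the sketch below builds the matrix multiplication tensor and applies moves, each of which subtracts a rank-1 tensor u⊗v⊗w from the board; reaching the all-zero tensor certifies a correct algorithm whose multiplication count equals the number of moves played.

```python
import numpy as np

def matmul_tensor(n):
    """The n x n matrix multiplication tensor: entry
    T[i*n + j, j*n + k, i*n + k] = 1 encodes C[i,k] += A[i,j] * B[j,k]."""
    T = np.zeros((n * n, n * n, n * n), dtype=int)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + j, j * n + k, i * n + k] = 1
    return T

def apply_moves(T, moves):
    """Each move (u, v, w) subtracts the rank-1 tensor u (x) v (x) w.
    An all-zero result proves the moves form a correct algorithm."""
    for u, v, w in moves:
        T = T - np.einsum('i,j,k->ijk', u, v, w)
    return T

# The standard algorithm as 8 one-hot moves for n = 2: each scalar
# product A[i,j] * B[j,k] contributing to C[i,k] is one rank-1 move.
n, I = 2, np.eye(4, dtype=int)
moves = [(I[i * n + j], I[j * n + k], I[i * n + k])
         for i in range(n) for j in range(n) for k in range(n)]
assert not apply_moves(matmul_tensor(n), moves).any()
```

Strassen’s algorithm corresponds to zeroing out this same tensor in only 7 moves; AlphaTensor’s task is to find even shorter sequences for larger tensors.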

This game is incredibly challenging – the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which remained a challenge for AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (above 10³³ for one of the settings we considered).

Essentially, to play this game well, one has to identify the tiniest of needles in a gigantic haystack of possibilities. To tackle the challenges of this domain, which departs significantly from traditional games, we developed multiple crucial components, including a novel neural network architecture that incorporates problem-specific inductive biases, a procedure for generating useful synthetic data, and a recipe for leveraging the symmetries of the problem.

We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge of existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improves over time, rediscovering historical fast matrix multiplication algorithms such as Strassen’s, and eventually surpassing the realm of human intuition by discovering algorithms faster than those previously known.

A single-player game played by AlphaTensor, where the goal is to find a correct matrix multiplication algorithm. The game state is a cubic array of numbers (shown as gray for 0, blue for 1, and green for -1), representing the remaining work to be done.

For example, while the traditional algorithm taught in school multiplies a 4×5 matrix by a 5×5 matrix using 100 multiplications (one for each of the 4×5 entries of the result, times the 5 terms in each dot product), and this number was reduced to 80 with human ingenuity, AlphaTensor found algorithms that perform the same operation using just 76 multiplications.

An algorithm discovered by AlphaTensor that uses 76 multiplications, an improvement over state-of-the-art algorithms.
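To show how such a discovered algorithm is actually executed (a minimal sketch in our own illustrative format, storing the factors as (u, v, w) triples rather than the U, V, W matrices used in the released code), the routine below runs any tensor decomposition as a matrix multiplication algorithm, spending exactly one data-dependent scalar multiplication per factor:

```python
import numpy as np

def multiply_with_factors(A, B, factors, out_shape):
    """Execute a matrix multiplication algorithm given as a tensor
    decomposition: each factor (u, v, w) contributes one scalar
    product m = (u . vec(A)) * (v . vec(B)), and the result is the
    weighted sum vec(C) = sum_r w_r * m_r."""
    a, b = A.ravel(), B.ravel()
    c = np.zeros(out_shape[0] * out_shape[1])
    for u, v, w in factors:
        c += w * ((u @ a) * (v @ b))  # the one multiplication for this factor
    return c.reshape(out_shape)

# With the 8 one-hot factors of the standard algorithm for 2x2
# matrices, this reproduces the ordinary matrix product.
n, I = 2, np.eye(4)
factors = [(I[i * n + j], I[j * n + k], I[i * n + k])
           for i in range(n) for j in range(n) for k in range(n)]
A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(multiply_with_factors(A, B, factors, (2, 2)), A @ B)
```

A 76-factor decomposition run through such a routine would multiply a 4×5 matrix by a 5×5 matrix using exactly 76 such multiplications.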

Beyond this example, AlphaTensor’s algorithm improves on Strassen’s two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives to multiply much larger matrices of arbitrary size.
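As a minimal sketch of how a small-matrix algorithm scales up (our own illustration; a power-of-two size is assumed for simplicity), the recursion below reuses Strassen’s seven products on half-size blocks instead of scalars, so the primitive’s multiplication count drives the cost of multiplying matrices of any size:

```python
import numpy as np

def strassen(A, B):
    """Recursive Strassen: the 7-multiplication 2x2 primitive applied
    to half-size blocks instead of scalars (n must be a power of 2)."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    a, b, c, d = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    e, f, g, k = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    m1 = strassen(a + d, e + k)
    m2 = strassen(c + d, e)
    m3 = strassen(a, f - k)
    m4 = strassen(d, g - e)
    m5 = strassen(a + b, k)
    m6 = strassen(c - a, e + f)
    m7 = strassen(b - d, g + k)
    # Reassemble the four output blocks into the full product
    return np.block([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A, B = np.random.randn(8, 8), np.random.randn(8, 8)
assert np.allclose(strassen(A, B), A @ B)
```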

Moreover, AlphaTensor discovers a diverse set of algorithms with state-of-the-art complexity – up to thousands of matrix multiplication algorithms for each size – showing that the space of matrix multiplication algorithms is richer than previously thought.

Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on a given hardware, such as the Nvidia V100 GPU and Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, showcasing AlphaTensor’s flexibility in optimizing arbitrary objectives.

AlphaTensor with an objective corresponding to the runtime of the algorithm. When a correct matrix multiplication algorithm is discovered, it is benchmarked on the target hardware, and the measurement is fed back to AlphaTensor in order to learn more efficient algorithms on that hardware.
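A toy sketch of this feedback loop (the function names and the selection step are ours; in the actual system the measured runtime serves as a reward signal during training): each provably correct candidate is timed on the target device, and the measurement steers the search toward hardware-friendly algorithms.

```python
import time
import numpy as np

def benchmark(algorithm, A, B, repeats=50):
    """Median wall-clock runtime of one candidate algorithm on this
    machine -- the signal fed back to guide the search."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        algorithm(A, B)
        times.append(time.perf_counter() - start)
    return float(np.median(times))

# Among several provably correct candidates, keep the one that runs
# fastest on the hardware at hand.
A, B = np.random.randn(512, 512), np.random.randn(512, 512)
candidates = [lambda X, Y: X @ Y, lambda X, Y: np.dot(X, Y)]
fastest = min(candidates, key=lambda alg: benchmark(alg, A, B))
```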

Exploring implications for future research and applications

On a mathematical level, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms more effectively than previous approaches, AlphaTensor deepens our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results that help determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science.

Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, the algorithms discovered by AlphaTensor could make computations in these fields significantly more efficient. AlphaTensor’s flexibility in considering any kind of objective could also spur new applications for designing algorithms that optimize metrics such as energy usage and numerical stability, helping to prevent small rounding errors from snowballing as an algorithm runs.

Although we focused here on the particular problem of matrix multiplication, we hope that our paper will inspire others to use AI to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building on our research, we hope to spur a greater body of work – applying AI to help society solve some of the most important challenges in mathematics and across the sciences.

More information can be found in AlphaTensor’s GitHub repository.
