Friday, March 6, 2026

FermiNet: Quantum Physics and Chemistry from First Principles


Unfortunately, an error of 0.5% is still far too large to be useful to the working chemist. Molecular bond energies are only a small fraction of a system's total energy, and correctly predicting whether a molecule is stable can often depend on as little as 0.001% of the system's total energy, or about 0.2% of the remaining "correlation" energy.
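The relationship between these percentages can be checked with a line of arithmetic. This is just a sketch using the fractions quoted above, not part of any quantum chemistry code:

```python
# Fractions of the total energy, taken from the figures above:
mean_field_error = 0.005        # mean-field methods miss about 0.5% of the total energy
required_precision = 0.00001    # stability can hinge on about 0.001% of the total energy

# Expressed as a fraction of the missing "correlation" energy:
fraction_of_correlation = required_precision / mean_field_error
print(fraction_of_correlation)  # ≈ 0.002, i.e. about 0.2%
```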

For example, while the total energy of the electrons in a butadiene molecule is almost 100,000 kilocalories per mole, the energy difference between the different possible shapes of the molecule is only about 1 kilocalorie per mole. This means that if you want to correctly predict butadiene's natural shape, you need the same level of precision as measuring the width of a football field to the nearest millimeter.
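As a quick sanity check on the analogy, the two relative precisions can be compared directly. The roughly 100-meter field length is an assumption made for this illustration:

```python
# Figures from the text, in kilocalories per mole:
total_energy = 100_000.0   # total electronic energy of butadiene
shape_difference = 1.0     # energy gap between the molecule's possible shapes

# Relative precision required of the calculation:
energy_precision = shape_difference / total_energy

# Measuring a ~100 m football field to the nearest millimeter:
field_precision = 0.001 / 100.0

print(energy_precision)  # ≈ 1e-05
print(field_precision)   # ≈ 1e-05
```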

With the advent of digital computing after World War II, scientists developed a wide range of computational methods that go beyond the mean-field description of electrons. Although these methods come with an alphabet soup of abbreviations, they all generally fall somewhere on an axis that trades accuracy against efficiency. At one extreme are essentially exact methods that scale worse than exponentially with the number of electrons, making them impractical for all but the smallest molecules. At the other extreme are methods that scale linearly but are not very accurate. These computational methods have had a profound impact on the practice of chemistry: the 1998 Nobel Prize in Chemistry was awarded to the creators of many of these algorithms.

Fermionic neural networks

Despite the wide range of existing computational quantum mechanics tools, we felt a new method was needed to tackle the problem of efficient representation. There is a reason the largest quantum chemistry calculations involve only tens of thousands of electrons, even with the most approximate methods, while classical chemistry techniques such as molecular dynamics can handle millions of atoms.

The state of a classical system is easy to describe: just track the position and momentum of each particle. Representing the state of a quantum system is far harder. A probability must be assigned to every possible configuration of electron positions. This is encoded in the wave function, which assigns a positive or negative number to every electron configuration; the square of the wave function gives the probability of finding the system in that configuration.
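As a toy illustration of how a wave function encodes probabilities, here is a one-particle, one-dimensional sketch. The Gaussian wave function is a hypothetical choice, vastly simpler than any molecular wave function:

```python
import numpy as np

# Hypothetical one-particle wave function on a 1D grid: a Gaussian, psi(x) ∝ exp(-x²/2).
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2)

# Normalize so the probabilities sum to one.
psi /= np.sqrt(np.sum(psi**2) * dx)

# The squared wave function is the probability density of finding
# the particle at each position (the Born rule).
prob_density = psi**2
total_probability = np.sum(prob_density) * dx
print(total_probability)  # ≈ 1.0
```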

The space of all possible configurations is enormous: if you tried to represent it as a grid with 100 points along each dimension, the number of possible electron configurations for a silicon atom would exceed the number of atoms in the universe. This is where we thought deep neural networks could help.
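The counting argument behind that claim is easy to reproduce: silicon has 14 electrons, each with three spatial coordinates, and a commonly quoted order-of-magnitude estimate puts the number of atoms in the observable universe around 10^80. Both numbers are assumptions for this back-of-the-envelope sketch:

```python
# Silicon: 14 electrons × 3 spatial coordinates each = 42 dimensions.
n_electrons = 14
n_dimensions = 3 * n_electrons

# A grid with 100 points along each dimension:
n_configurations = 100 ** n_dimensions   # 100**42 = 10**84

# Common order-of-magnitude estimate of atoms in the observable universe:
atoms_in_universe = 10 ** 80

print(n_configurations > atoms_in_universe)  # True
```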

Over the past few years, there has been tremendous progress in representing complex, high-dimensional probability distributions with neural networks, and we now know how to train these networks efficiently and scalably. Since these networks have already proven their ability to fit high-dimensional functions in artificial intelligence problems, we suspected they could also be used to represent quantum wave functions.

Researchers such as Giuseppe Carleo, Matthias Troyer, and others showed how modern deep learning can be used to solve idealized quantum problems. We wanted to apply deep neural networks to more realistic problems in chemistry and condensed matter physics, and that meant including electrons in our calculations.

There is just one wrinkle when dealing with electrons. Electrons must obey the Pauli exclusion principle, which means they cannot be in the same place at the same time. This is because electrons are a type of particle known as fermions, which include the building blocks of most matter: protons, neutrons, quarks, neutrinos, etc. Their wave function must be antisymmetric: if you swap the positions of two electrons, the wave function is multiplied by -1. This means that if two electrons sit on top of each other, the wave function (and the probability of that configuration) is zero.

This meant we had to develop a new type of neural network that is antisymmetric in its inputs, which we called FermiNet. In most quantum chemistry methods, antisymmetry is introduced using a function called the determinant. The determinant of a matrix has the property that if you swap two rows, the result is multiplied by -1, just like the wave function of fermions.
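The row-swap property of the determinant is easy to verify numerically. This is a minimal sketch using a random matrix, not FermiNet code:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))

# Swap the first two rows.
B = A.copy()
B[[0, 1]] = B[[1, 0]]

# The determinant picks up a factor of -1, just like a fermionic
# wave function under exchange of two electrons.
print(np.isclose(np.linalg.det(B), -np.linalg.det(A)))  # True
```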

So you can take a set of one-electron functions, evaluate them at the position of every electron in your system, and pack all the results into one matrix. The determinant of that matrix is then a properly antisymmetric wave function. The main limitation of this approach is that the resulting function, known as a Slater determinant, is not very general.
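A minimal sketch of this construction, assuming two hypothetical Gaussian-type orbitals; the orbitals and electron positions below are made up for illustration and have no physical significance:

```python
import numpy as np

def slater_wavefunction(orbitals, positions):
    """Antisymmetric wave function built from one-electron functions.

    M[i, k] = orbitals[k](positions[i]); the wave function is det(M).
    """
    M = np.array([[phi(r) for phi in orbitals] for r in positions])
    return np.linalg.det(M)

# Two hypothetical one-electron orbitals (Gaussian-type, for illustration only).
orbitals = [
    lambda r: np.exp(-np.sum(r**2)),
    lambda r: r[0] * np.exp(-np.sum(r**2)),
]

# Positions of two electrons in 3D.
positions = np.array([[0.1, 0.0, 0.0],
                      [0.5, 0.2, 0.0]])

psi = slater_wavefunction(orbitals, positions)
psi_swapped = slater_wavefunction(orbitals, positions[[1, 0]])

# Swapping the two electrons flips the sign of the wave function.
print(np.isclose(psi_swapped, -psi))  # True
```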

The wave functions of real systems are usually far more complicated. A common way to improve on this is to take a huge linear combination of Slater determinants, sometimes millions or more, and add a few simple corrections based on pairs of electrons. Even then, this may not be enough to compute the energy accurately.
