We should regard the present state of the universe as the effect of its former state and as the cause of the state which is to come. An intelligence knowing all the forces acting in nature at a given moment, as well as the momentary positions of all things in the universe, would be able to embrace in one pattern the movements of the largest bodies as well as the lightest atoms in the world, provided its intellect were sufficiently powerful.
The converse implication is that to someone without sufficient intellect, processes such as coin tossing would appear random. The language of computation allows us to formalize this connection.
Earlier this year, Avi Wigderson received the Turing Award, the “Nobel Prize of Computer Science,” in part for formally connecting randomness with mathematical functions that are hard to compute. He and his colleagues have created a process that takes a suitably complicated function and outputs “pseudorandom” bits that cannot be effectively distinguished from truly random bits. Randomness, it seems, is simply computation that we cannot predict.
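To make the idea of pseudorandom bits concrete, here is a minimal Python sketch. It is not the construction Wigderson and his colleagues developed. As a stand-in, it assumes SHA-256 behaves like a function that is hard to predict (a widely held belief, not a proven fact), and the function name `pseudorandom_bits` and the seed are made up for illustration. The program is fully deterministic, yet its output looks like a run of coin tosses to anyone who does not know the seed.

```python
import hashlib


def pseudorandom_bits(seed: bytes, n_bits: int) -> list[int]:
    """Deterministically stretch a short seed into n_bits bits.

    Assumption: SHA-256 stands in for a "suitably complicated function".
    """
    bits = []
    counter = 0
    while len(bits) < n_bits:
        # Hash the seed together with a running counter to get a fresh block.
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(8):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:n_bits]


bits = pseudorandom_bits(b"coin toss", 8000)
# Fraction of ones; for coin-like output this should be close to one half.
print(sum(bits) / len(bits))
```

Running it twice with the same seed gives exactly the same bits, which is the point: nothing here is random, only hard to predict.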
Do we have a way to manage this randomness and complexity? The recent advances we’ve seen in AI through machine learning give us some insight into what that would mean. Information can be divided into a structured part and a random part. Take English, for example. There’s an underlying complicated structure that describes language, and the sentences that society has produced over time are really a random sampling of that structure. Recent advances in machine learning have allowed us to take those random samples and recover a significant amount of the underlying structure. Often, that structure remains hidden from us, but we can still use it to simulate random samples, generating new English sentences on demand.
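A toy version of this split between structure and randomness can be sketched in a few lines of Python. The sketch assumes a far simpler model than anything used in modern machine learning: it only records which word was seen to follow which (a bigram table), and the three sample sentences and the names `learn_structure` and `generate` are invented for illustration. The table is the recovered structure, and the random number generator supplies the random part.

```python
import random
from collections import defaultdict


def learn_structure(sentences):
    """Record which words were observed to follow each word."""
    followers = defaultdict(list)
    for sentence in sentences:
        # "<s>" and "</s>" are made-up markers for sentence start and end.
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, rng, max_words=20):
    """Sample a new sentence by walking the learned table at random."""
    word, output = "<s>", []
    while len(output) < max_words:
        word = rng.choice(followers[word])
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)


# Hypothetical "random samples" of English.
samples = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat saw the dog",
]
model = learn_structure(samples)
print(generate(model, random.Random(0)))
```

The generated sentence may never have appeared in the samples, but every pair of neighboring words in it did.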
Consider the problem of translation. Imagine a woman, Sophie, who grew up speaking English and French and now works as a translator. She can easily take an English text, fully understand it, and produce an equivalent in French. Computationally speaking, the machine in this case is Sophie’s brain, because it has to follow some process that converts English into French. Sophie probably doesn’t understand the whole process, or even think about it as a process, but it happens anyway.
Now suppose we want to translate a text on a computer. Simply using a French-English dictionary to translate word by word doesn’t work, because different languages have different structures and words have different meanings in different contexts. Using linguistic tools is not enough; the computational process of understanding a language goes beyond what we can describe.
Sophie understands languages because she grew up bilingual, exposed to both languages and all their complexities. Machine learning takes a similar approach, training language models on vast amounts of data. These models consist of a complicated neural network, a collection of artificial neurons cleverly connected to each other, and those connections have associated weights that change the signals that go through the system. Once trained, the neural network predicts the probability of each possible next word as it translates a sequence from English to French.
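The last step, turning weighted signals into probabilities for the next word, can be sketched in Python. This is a single layer with hand-picked numbers, not a trained translation model: the three-word French vocabulary, the input features, and the weights are all assumptions chosen for illustration. Each output neuron sums its weighted inputs, and a softmax converts those sums into probabilities that add up to one.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_probabilities(features, weights, vocabulary):
    """One layer: each candidate word's neuron sums its weighted inputs."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return dict(zip(vocabulary, softmax(scores)))


# Hypothetical values; a real model would learn the weights from data.
vocabulary = ["chat", "chien", "maison"]
features = [1.0, 0.0, 0.5]   # stand-in for the signal reaching this layer
weights = [
    [2.0, 0.1, 0.3],         # connections into the "chat" neuron
    [0.5, 1.5, 0.2],         # connections into the "chien" neuron
    [0.1, 0.2, 0.4],         # connections into the "maison" neuron
]

probs = next_word_probabilities(features, weights, vocabulary)
for word, p in probs.items():
    print(f"{word}: {p:.3f}")
```

Training amounts to adjusting the numbers in `weights` until the probabilities match what bilingual text actually contains.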
