Monday, December 23, 2024

This is your brain. This is your brain in code


Functional magnetic resonance imaging (fMRI), which measures changes in blood flow in the brain, has been used over the past few decades for a variety of applications, including “functional anatomy” – a way to determine which areas of the brain are turned on when a person performs a specific task. fMRI has been used to study people’s brains as they perform a wide variety of activities – solving math problems, learning foreign languages, playing chess, improvising on the piano, completing crosswords, and even watching TV shows like “Curb Your Enthusiasm.”

One activity that has received little attention is computer programming – both writing code and the equally demanding task of trying to understand a piece of code that has already been written. “Given the role that computer programs play in our daily lives,” says Shashank Srikant, a graduate student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), “it’s certainly worth pursuing. These days, so many people interact with code – reading, writing, designing, debugging – but no one really knows what’s going on in their heads when that happens.” He has now made some progress in this direction in a paper – written with MIT colleagues Benjamin Lipkin (the paper’s other lead author, along with Srikant), Anna Ivanova, Evelina Fedorenko, and Una-May O’Reilly – that was presented earlier this month at the Conference on Neural Information Processing Systems in New Orleans.

The new paper builds on a 2020 study, written by many of the same authors, which used fMRI to monitor programmers’ brains as they “understood” small fragments, or snippets, of code. (Understanding in this case means looking at a snippet and correctly determining the result of the computation it performs.) The 2020 work found that code comprehension did not consistently activate the language system, the areas of the brain responsible for language processing, explains Fedorenko, a professor of brain and cognitive sciences (BCS) and co-author of the earlier study. “Instead, the multiple demand network – a brain system associated with general reasoning that supports domains such as mathematical and logical thinking – was highly active.” The current work, which also uses fMRI scans of programmers, takes “a deeper dive,” she says, trying to extract more fine-grained information.

While the previous study involved 20 to 30 people and aimed to determine which brain systems, on average, are used to comprehend code, the new study looks at the brain activity of individual programmers as they process specific elements of a computer program. Suppose, for instance, there is a one-line piece of code involving word manipulation and a separate piece of code involving a mathematical operation. “Can I go from the activity we see in brains, the actual brain signals, and try to reverse-engineer it and figure out what, exactly, the programmer was looking at?” Srikant asks. “This would reveal what information pertaining to programs is uniquely encoded in our brains.” To neuroscientists, he notes, a physical property is considered “encoded” if they can infer that property from a person’s brain signals.
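As a hedged illustration (these are not the study’s actual stimuli, just hypothetical stand-ins), two such one-line snippets in Python might be:

```python
# A one-line snippet involving word (string) manipulation.
greeting = "hello".upper() + "!"

# A one-line snippet involving a math operation.
result = (7 * 6) % 10
```

The question the researchers pose is whether brain signals alone could distinguish a participant reading the first line from one reading the second.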

For example, consider a loop—an instruction in a program that repeats a specific operation until a desired result is achieved—or a branch, another type of programming instruction that can cause a computer to switch from one operation to another. Based on the observed patterns of brain activity, the group could tell whether someone was evaluating a piece of code that included a loop or a branch. Researchers could also determine whether the code referred to words or mathematical symbols and whether someone was reading the actual code or just a written description of the code.
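To make the two constructs concrete – as a sketch, not the study’s actual stimuli – a loop and a branch in Python look like this:

```python
# A loop: repeats an operation until a desired result is reached.
total = 0
for n in [2, 4, 6]:
    total += n  # runs once per element; total ends up as 12

# A branch: switches the computer from one operation to another
# depending on a condition.
if total > 10:
    label = "big"
else:
    label = "small"
```

Distinguishing these two patterns of control flow from brain activity alone is the kind of decoding the group reports.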

This addressed the first question a researcher might ask: is a given property actually encoded? If the answer is yes, the next question might be: where is it encoded? In the cases cited above – loops or branches, words or mathematics, code or its description – brain activation levels were found to be comparable in both the language system and the multiple demand network.

A noticeable difference was observed, however, for code properties related to so-called dynamic analysis.

Programs can have “static” properties – such as the number of digits in a sequence – that do not change over time. “But programs can also have a dynamic aspect, such as the number of times a loop runs,” says Srikant. “I can’t always read a piece of code and know, in advance, what the program’s runtime will be.” The MIT researchers found that for dynamic analysis, information is encoded much better in the multiple demand network than in the language processing center. This finding was one clue in their quest to see how code comprehension is distributed in the brain – which parts are involved and which play a bigger role in certain aspects of the task.
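The static/dynamic distinction can be sketched with a small hypothetical function (not taken from the study). A static property – say, the tokens that appear in the source – can be read straight off the text, while a dynamic property – how many times the loop runs – depends on the input and generally must be discovered by executing the code:

```python
def collatz_steps(n):
    """Count how many times the loop body runs before n reaches 1.
    This count is a dynamic property: it cannot be read off the
    source text and must, in general, be found by running the code."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

# A static property, by contrast, is fixed in the text itself:
# e.g. the set of operators this function uses never changes,
# no matter what input it receives.
```

For instance, `collatz_steps(6)` happens to run the loop eight times, while `collatz_steps(1)` never enters it – same source text, very different dynamic behavior.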

The team conducted a second set of experiments, which incorporated machine learning models called neural networks that were specifically trained on computer programs. These models have been successful, in recent years, in helping programmers complete pieces of code. The group wanted to find out whether the brain signals seen in their study when participants were examining pieces of code resembled the activation patterns observed when neural networks analyzed the same pieces of code. The answer they arrived at was a resounding yes.

“If you put a piece of code into a neural network, it will generate a list of numbers that will somehow tell you what the program is about,” Srikant says. Brain scans of people studying computer programs similarly produce a list of numbers. For example, when branching dominates a program, “you see a clear pattern of brain activity,” he adds, “and you see a similar pattern when a machine learning model tries to understand the same piece.”
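As a toy stand-in – this is not one of the neural network models from the study, only an illustration of the input/output shape Srikant describes (code goes in, a list of numbers comes out) – one can hand-craft a tiny feature vector:

```python
import re

def embed(code):
    """Map a code snippet to a fixed-length list of numbers.
    A real neural network learns its features from data; here we
    simply count a few hand-picked ones to illustrate the idea."""
    tokens = re.findall(r"\w+|\S", code)
    return [
        len(tokens),                                   # snippet length
        tokens.count("for") + tokens.count("while"),   # loop markers
        tokens.count("if") + tokens.count("else"),     # branch markers
        sum(t.isdigit() for t in tokens),              # numeric literals
    ]

vec = embed("for i in range(3):\n    if i % 2:\n        print(i)")
```

A snippet dominated by branching would yield a larger branch count in such a vector, loosely mirroring the “clear pattern of brain activity” the researchers describe for branch-heavy programs.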

Mariya Toneva of the Max Planck Institute for Software Systems finds such findings “particularly exciting. They open up the possibility of using computational models of code to better understand what happens in our brains as we read programs,” she says.

The MIT researchers are clearly intrigued by the connections they have uncovered, which shed light on how discrete pieces of computer programs are encoded in the brain. But they don’t yet know what these newly gleaned insights can tell us about how people carry out more elaborate plans in the real world. Completing tasks of this sort – such as going to the movies, which requires checking showtimes, arranging transportation, purchasing tickets, and so on – could not be handled by a single unit of code and just one algorithm. Successful execution of such a plan would instead require “composition” – stringing together various snippets and algorithms into a sensible sequence that leads to something new, just like assembling individual bars of music to make a song or even a symphony. Creating models of code composition, says O’Reilly, a principal research scientist at CSAIL, is beyond their grasp at the moment.

Lipkin, a BCS doctoral student, sees this as the next logical step – figuring out how to “combine simple operations to build complex programs, and use those strategies to effectively address general reasoning tasks.” He further believes that some of the progress toward that goal is owed to the team’s interdisciplinary makeup. “We were able to draw on individual expertise in program analysis and neural signal processing, as well as combined work in machine learning and natural language processing,” says Lipkin. “These types of collaborations are becoming increasingly common as neuroscientists and computer scientists join forces in the pursuit of understanding and building general intelligence.”

This project was funded by grants from the MIT-IBM Watson AI Laboratory, MIT Quest for Intelligence, the National Science Foundation, the National Institutes of Health, the McGovern Institute for Brain Research, the MIT Department of Brain and Cognitive Sciences, and the Simons Center for the Social Brain.
