A single photograph gives insight into the creator’s world – his interests and feelings about a subject or space. But what about the creators of the technologies that make such images possible?
MIT Department of Electrical Engineering and Computer Science Associate Professor Jonathan Ragan-Kelley is one such person, having designed everything from visual effects tools in movies to the Halide programming language, widely used in the industry for photo editing and processing. As a researcher at the MIT-IBM Watson AI Lab and the Computer Science and Artificial Intelligence Laboratory, Ragan-Kelley specializes in high-performance, domain-specific programming languages and machine learning that enable 2D and 3D graphics, visual effects, and computational photography.
“The biggest goal of a lot of our research has been to develop new programming languages that make it easier to write programs that run really efficiently on the increasingly complex hardware that’s in computers today,” Ragan-Kelley says. “If we want to continue to increase the computing power that we can actually use in real-world applications – from graphics and visual computing to artificial intelligence – we need to change the way we program.”
Finding the golden mean
Over the past two decades, chip designers and software engineers have witnessed the slowdown of Moore’s Law and a clear shift from general-purpose computing on CPUs to more diverse and specialized computing and processing units such as GPUs and accelerators. This transition comes with a trade-off: giving up the ability to run general-purpose code reasonably well on one processor in exchange for faster, more capable hardware that requires the code to be heavily tailored to it and mapped to it with customized programs and compilers. Newer hardware with improved software can better support applications such as broadband cellular radio interfaces, decoding highly compressed videos for streaming, and graphics and video processing in power-limited cellphone cameras, to name just a few.
“Our work is all about unlocking the power of the best hardware we can build to deliver the greatest computational performance and efficiency for these types of applications in ways that traditional programming languages cannot,” says Ragan-Kelley.
To achieve this, Ragan-Kelley divides his work into two directions. The first sacrifices generality in order to capture the structure of particular, important computational problems and exploit it for better computational efficiency. This can be seen in the Halide image processing language, which he co-created and which has helped transform the image editing industry through programs like Photoshop. Because Halide was specifically designed to quickly handle dense, regular arrays of numbers (tensors), it also performs well in neural network computations. The second direction focuses on automation, specifically on how compilers map programs to hardware. One such project at the MIT-IBM Watson AI Lab uses Exo, a language developed in Ragan-Kelley’s group.
Over the years, researchers have worked doggedly to automate coding with compilers, which can be a black box; however, there is still a great need for engineers to directly control and tune performance. Ragan-Kelley and his group develop methods that combine both approaches, balancing the trade-offs to achieve effective and resource-efficient programming. At the core of many high-performance programs, such as video game engines or camera processing in mobile phones, are state-of-the-art systems that are largely hand-optimized by experts in low-level, detailed languages such as C, C++, and assembly. This is where engineers make specific choices about how the program will run on the hardware.
Ragan-Kelley notes that developers can opt for “very tedious, very unproductive and very unsafe low-level code” that can introduce bugs, or for “safer, more productive, higher-level programming interfaces” that don’t allow fine-grained control over how the compiler runs the program and usually deliver lower performance. That’s why his team is trying to find a happy medium. “We’re trying to figure out how to provide control over the key issues that human performance engineers want to control,” says Ragan-Kelley, “so we’re trying to build a new class of languages that we call user-schedulable languages that give safer, higher-level handles to control what the compiler does or how the program is optimized.”
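To make the idea of a user-schedulable language concrete, here is a hedged toy sketch in Python, not the syntax of any real language such as Halide or Exo: a “split” scheduling directive rewrites one loop into two nested loops while covering exactly the same iterations, so the programmer controls loop structure without being able to break the computation:

```python
# Toy sketch of a "user-schedulable" primitive (hypothetical, not a real
# language's API): the programmer directs optimization via a safe, high-level
# transformation instead of hand-writing low-level loops.

def iterations(extent):
    """The original loop: i = 0 .. extent-1."""
    return [i for i in range(extent)]

def split(extent, factor):
    """Scheduling directive: split one loop into outer/inner nested loops.
    Extents that are not a multiple of factor are handled by clamping,
    so the directive is safe by construction."""
    out = []
    for outer in range(0, extent, factor):
        for inner in range(min(factor, extent - outer)):
            out.append(outer + inner)
    return out

# The directive changed the loop structure, not the work performed:
assert split(10, 4) == iterations(10)
```

A real user-schedulable language applies the same principle at scale: each scheduling operation is checked (or correct by construction), so performance engineers get the control of hand-tuned code with the safety of a high-level interface.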
Unlocking hardware: high-level and undervalued approaches
Ragan-Kelley and his research group are tackling this problem along two fronts. One uses machine learning and modern artificial intelligence techniques to automatically generate optimized schedules, an interface to the compiler, to achieve better compiler performance. The other uses “exocompilation,” which he is working on with the lab. He describes this method as a way to “turn the compiler inside out”: a compiler framework with controls that allow humans to target and customize it. In addition, his team can add its own schedulers on top, which can help target specialized hardware such as IBM Research’s machine learning accelerators. This work spans a wide range of applications: computer vision, object recognition, speech synthesis, image synthesis, speech recognition, text generation (large language models), and more.
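A hedged sketch of what “turning the compiler inside out” could look like (hypothetical code, not the actual Exo API): the compiler consults a user-extensible table of rewrite rules, so an engineer can register a rule mapping a multiply-add pattern onto a made-up accelerator instruction without modifying the compiler itself:

```python
# Toy sketch of the exocompilation idea (hypothetical, not Exo's real API):
# optimizations live *outside* the compiler, in user-registered rewrite
# rules, rather than being hard-coded inside it.

REWRITES = []  # user-extensible rule table, consulted by the compiler

def rule(fn):
    """Register a user-defined rewrite rule with the compiler."""
    REWRITES.append(fn)
    return fn

def compile_expr(expr):
    """Tiny 'compiler': try each registered rewrite, else emit as-is."""
    for rw in REWRITES:
        result = rw(expr)
        if result is not None:
            return result
    return expr

@rule
def fuse_multiply_add(expr):
    # expr is a nested-tuple AST, e.g. ("add", ("mul", "a", "b"), "c").
    # Map a*b + c onto a made-up accelerator's fused multiply-add.
    if isinstance(expr, tuple) and expr[0] == "add" and \
       isinstance(expr[1], tuple) and expr[1][0] == "mul":
        _, (_, a, b), c = expr
        return ("fma", a, b, c)  # hypothetical accelerator instruction
    return None

# The user's rule fires on matching patterns; everything else is untouched:
assert compile_expr(("add", ("mul", "a", "b"), "c")) == ("fma", "a", "b", "c")
assert compile_expr(("add", "x", "y")) == ("add", "x", "y")
```

Targeting a new accelerator then means writing new rules and schedulers on top of the framework, rather than rebuilding the compiler for each chip.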
A larger lab project goes a step further, approaching the work from a systems perspective. In work led by his student and lab intern William Brandon, in collaboration with lab scientist Rameswar Panda, the Ragan-Kelley team is rethinking large language models (LLMs), finding ways to slightly change the computation and the model’s programming architecture so that transformer-based models can run more efficiently on AI hardware without losing accuracy. Their work, Ragan-Kelley says, diverges significantly from standard thinking, with potentially big payoffs in cutting costs, improving capabilities, and/or shrinking LLMs to require less memory and run on smaller computers.
This more avant-garde thinking about computational performance and hardware is where Ragan-Kelley excels and where he sees value, especially in the long term. “I think there are [research] areas that are worth pursuing, but are so well-established, obvious, or conventional that many people are already pursuing them or will pursue them,” he says. “We try to find ideas that have a lot of practical impact in the world, but at the same time are things that wouldn’t necessarily happen or, in my opinion, are undervalued relative to their potential by the rest of the community.”
An example of this is the course he currently teaches, 6.106 (Software Performance Engineering). About 15 years ago, there was a shift from a single processor to multiple processors on a device, which resulted in parallelism being taught in many academic programs. However, as Ragan-Kelley explains, MIT realized how important it was for students to understand not only parallelism, but also memory optimization and using specialized hardware to achieve the best possible performance.
“By changing the way we program, we can unlock the computational potential of new machines and enable people to continue to rapidly develop new applications and new ideas that can take advantage of this increasingly complex and demanding hardware,” says Ragan-Kelley.