Neural network artificial intelligence models used in applications such as medical image processing and speech recognition perform operations on extremely complex data structures, which require enormous amounts of computation to process. This is one reason deep learning models consume so much energy.
To improve the efficiency of AI models, MIT researchers created an automated system that enables developers of deep learning algorithms to simultaneously take advantage of two types of data redundancy. This reduces the amount of computation, bandwidth, and memory needed for machine-learning operations.
Existing techniques for optimizing algorithms can be cumbersome, and typically allow programmers to capitalize on only one of two types of redundancy that exist in deep learning data structures: sparsity or symmetry.
By enabling a developer to build an algorithm from scratch that takes advantage of both redundancies at once, the MIT researchers' approach boosted the speed of computations by nearly 30 times in some experiments.
Because the system uses a user-friendly programming language, it can optimize machine-learning algorithms for a wide range of applications. The system could also help scientists who are not experts in deep learning but want to improve the efficiency of the AI algorithms they use to process data. In addition, the system could have applications in scientific computing.
"For a long time, capturing these data redundancies has required significant implementation effort. Instead, a scientist can tell our system what they would like to compute in a more abstract way, without telling the system exactly how to compute it," says Willow Ahrens, an MIT postdoc and co-author of a paper on the system, which will be presented at the International Symposium on Code Generation and Optimization.
She is joined on the paper by lead author Radha Patel '23, SM '24, and senior author Saman Amarasinghe, a professor in the Department of Electrical Engineering and Computer Science (EECS) and a principal researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL).
Cutting out calculations
In machine learning, data are often represented and manipulated as multidimensional arrays known as tensors. A tensor is like a matrix, which is a rectangular array of values arranged on two axes, rows and columns. But unlike a two-dimensional matrix, a tensor can have many dimensions, or axes, which makes tensors more difficult to manipulate.
Deep learning models perform operations on tensors using repeated matrix multiplication and addition; this process is how neural networks learn complex patterns in data. The sheer volume of calculations that must be performed on these multidimensional data structures requires an enormous amount of computation and energy.
But because of the way data are arranged in tensors, engineers can often boost the speed of a neural network by cutting out redundant computations.
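To make the distinction concrete, here is a small illustrative sketch (not taken from the paper): nested Python lists stand in for the multidimensional arrays a framework such as NumPy or PyTorch would actually use, and a hypothetical `shape` helper reports the number of axes.

```python
# A matrix has exactly two axes (rows and columns); a tensor can have many.
matrix = [[1, 2, 3],
          [4, 5, 6]]            # 2 axes: shape 2 x 3

tensor = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]     # 3 axes: shape 2 x 2 x 2

def shape(a):
    # Walk down the nesting, recording the length of each axis.
    s = []
    while isinstance(a, list):
        s.append(len(a))
        a = a[0]
    return s

print(shape(matrix))  # [2, 3]
print(shape(tensor))  # [2, 2, 2]
```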
For instance, if a tensor represents user review data from an e-commerce site, since not every user reviewed every product, most of the values in that tensor are likely zero. This type of data redundancy is called sparsity. A model can save time and computation by storing and operating only on the non-zero values.
In addition, sometimes a tensor is symmetric, meaning the top half and bottom half of the data structure are identical. In this case, the model only needs to operate on one half, reducing the amount of computation. This type of data redundancy is called symmetry.
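The savings from sparsity can be sketched in a few lines of plain Python (this is an illustration of the general idea, not SySTeC's generated code): only the non-zero review values are stored, and a matrix-vector product then touches just those stored entries.

```python
# Hypothetical ratings matrix: rows are users, columns are products.
# Most entries are zero because most users reviewed few products.
dense = [
    [0, 5, 0, 0],
    [0, 0, 0, 3],
    [4, 0, 0, 0],
]

# Sparse representation: keep only the non-zero entries,
# keyed by their (row, column) position.
sparse = {(i, j): v
          for i, row in enumerate(dense)
          for j, v in enumerate(row) if v != 0}

def sparse_matvec(sparse, n_rows, vec):
    # Multiply by a vector while visiting only stored (non-zero) entries,
    # skipping every zero in the dense layout.
    out = [0] * n_rows
    for (i, j), v in sparse.items():
        out[i] += v * vec[j]
    return out

print(len(sparse))                             # 3 stored values instead of 12
print(sparse_matvec(sparse, 3, [1, 1, 1, 1]))  # [5, 3, 4]
```

Here 12 dense entries collapse to 3 stored values, and the multiply does 3 multiply-adds instead of 12.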
“But when you try to capture both of these optimizations, the situation becomes quite complex,” says Ahrens.
To simplify this process, she and her collaborators built a new compiler, which is a computer program that translates complex code into a simpler language that a machine can process. Their compiler, called SySTeC, can optimize computations by automatically taking advantage of both sparsity and symmetry in tensors.
They began building SySTeC by identifying three key optimizations they could perform using symmetry.
First, if the algorithm's output tensor is symmetric, then it only needs to compute one half of it. Second, if the input tensor is symmetric, then the algorithm only needs to read one half of it. Finally, if intermediate results of tensor operations are symmetric, the algorithm can skip redundant computations.
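The first of these optimizations can be illustrated with a hedged sketch (assumed example, not SySTeC output): a Gram matrix G = X · Xᵀ is always symmetric, so only the upper triangle needs to be computed, and the lower triangle is filled in by mirroring rather than recomputing.

```python
# Hypothetical input: 3 rows of 2 features each.
X = [
    [1, 2],
    [3, 4],
    [5, 6],
]

def gram_upper(X):
    # Compute G = X @ X.T, exploiting the fact that the output is
    # symmetric: only the diagonal and upper triangle are computed,
    # roughly halving the multiply-adds.
    n = len(X)
    G = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):           # upper triangle (incl. diagonal) only
            G[i][j] = sum(a * b for a, b in zip(X[i], X[j]))
            G[j][i] = G[i][j]           # mirror instead of recomputing
    return G

print(gram_upper(X))
# [[5, 11, 17], [11, 25, 39], [17, 39, 61]]
```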
Simultaneous optimization
To use SySTeC, a developer inputs their program, and the system automatically optimizes the code for all three types of symmetry. Then a second phase of SySTeC performs additional transformations to store only non-zero data values, optimizing the program for sparsity.
In the end, SySTeC generates ready-to-use code.
"In this way, we get the benefits of both optimizations. And the interesting thing about symmetry is, as your tensor has more dimensions, you can get even more savings on computation," says Ahrens.
The researchers demonstrated speedups of nearly a factor of 30 with code generated automatically by SySTeC.
Because the system is automated, it could be especially useful in situations where a scientist wants to process data using an algorithm they are writing from scratch.
In the future, the researchers want to integrate SySTeC into existing sparse tensor compiler systems to create a seamless interface for users. In addition, they would like to use it to optimize code for more complicated programs.
This work is funded, in part, by Intel, the National Science Foundation, the Defense Advanced Research Projects Agency, and the Department of Energy.