Saturday, April 19, 2025

Making AI-generated code more accurate in any language


Programmers can now use large language models (LLMs) to generate computer code more quickly. But that only makes programmers' lives easier if the code follows the rules of the programming language and doesn't cause a computer to crash.

Some methods exist for ensuring that LLMs conform to the rules of whatever language they are generating text in, but many of them either distort the model's intended meaning or are too time-consuming to be feasible for complex tasks.

A new approach developed by researchers at MIT and elsewhere automatically guides an LLM to generate text that adheres to the rules of the relevant language, such as a particular programming language, and is also error-free. Their method lets an LLM allocate effort to outputs that are most likely to be valid and accurate, while discarding unpromising outputs early in the process. This probabilistic approach boosts computational efficiency.

Thanks to these efficiency gains, the researchers' architecture enabled small LLMs to outperform much larger models at generating accurate, properly structured outputs for several real-world use cases, including molecular biology and robotics.

In the long run, this new architecture could help nonexperts control AI-generated content. For instance, it could allow business users to write complex queries in SQL, a language for manipulating databases, using only natural-language prompts.

“This work has implications beyond research. It could improve programming assistants, AI-powered data analysis, and scientific discovery tools by ensuring that AI-generated outputs remain both useful and correct,” says João Loula, an MIT graduate student and co-lead author of a paper on this framework.

Loula is joined on the paper by co-lead authors Benjamin LeBrun, a research assistant at the Mila-Quebec Artificial Intelligence Institute, and Li Du, a graduate student at Johns Hopkins University; co-senior authors Vikash Mansinghka ’05, MEng ’09, PhD ’09, a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences; Alexander K. Lew SM ’20, an assistant professor at Yale University; Tim Vieira, a postdoc at ETH Zurich; and Timothy J. O’Donnell, an associate professor at McGill University and a Canada CIFAR AI Chair at Mila, who led the international team; as well as several others. The research will be presented at the International Conference on Learning Representations.

Enforcing structure and meaning

One common approach for controlling structured text generated by LLMs is to check an entire output, such as a block of computer code, to make sure it is valid and will run without errors. If it isn't, the user must start over, racking up computational cost.

Alternatively, a programmer could stop to check the code along the way. While this can ensure the code adheres to the programming language and is structurally valid, incrementally correcting the code may cause it to drift from the meaning the user intended, hurting its accuracy in the long run.

“It is much easier to enforce structure than meaning. We can quickly check whether something is in the right programming language, but to check its meaning you have to execute the code. Our work is also about dealing with these different types of information,” says Loula.
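To make that contrast concrete, here is a minimal Python sketch of the check-the-whole-output baseline described above; it is illustrative only, not the researchers' method. The names `generate_code` and `run_tests` are hypothetical stand-ins for an LLM call and a user-supplied semantic check:

```python
import ast

# Illustrative sketch of the "generate everything, then check" baseline.
# `generate_code(prompt)` stands in for any LLM call and `run_tests(code)`
# for a user-supplied semantic check; both are hypothetical placeholders.

def generate_valid_python(prompt, generate_code, run_tests, max_attempts=5):
    for _ in range(max_attempts):
        code = generate_code(prompt)  # one complete LLM generation

        # Structural check: cheap, just try to parse the output.
        try:
            ast.parse(code)
        except SyntaxError:
            continue  # not valid Python, so the whole attempt is thrown away

        # Semantic check: expensive, because the code has to be executed.
        if run_tests(code):
            return code

    return None  # every attempt failed; all of that computation was wasted
```

Every failed attempt discards an entire generation, which is exactly the wasted computation the new approach is designed to avoid.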

The researchers' approach involves engineering knowledge into the LLM to steer it toward the most promising outputs. These outputs are more likely to satisfy the structural constraints defined by a user and to carry the meaning the user intends.

“We are not trying to train an LLM to do this. Instead, we are engineering some knowledge that an expert would have and combining it with the LLM's knowledge, which offers a very different approach to scaling than you see in deep learning,” Mansinghka adds.

They accomplish this with a technique called sequential Monte Carlo, which lets multiple generations from an LLM run in parallel and compete with one another. The model dynamically allocates resources to different threads of parallel computation based on how promising their outputs appear.

Each output is given a weight that represents how likely it is to be structurally and semantically valid. At each step of the computation, the model focuses on the outputs with higher weights and discards the rest.

In a sense, it is as if the LLM has an expert looking over its shoulder to ensure it makes the right choices at each step, while keeping it focused on the overall goal. The user specifies the desired structure and meaning, as well as how to check the output, and the researchers' architecture then guides the LLM to do the rest.
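For illustration only, the following Python sketch shows the general shape of a sequential Monte Carlo loop over competing generations; it is not the researchers' implementation. The functions `propose_token` (sampling a next token from the LLM) and `constraint_score` (how compatible a partial output is with the user's structural and semantic checks) are hypothetical placeholders for pieces the user or the framework would supply:

```python
import random

# Illustrative sketch of a sequential Monte Carlo loop over LLM generations.
# `propose_token(prefix)` samples a next token from the LLM, and
# `constraint_score(prefix)` returns a nonnegative number reflecting how
# compatible the partial output is with the user's checks (0 = ruled out).
# Both are hypothetical stand-ins, not part of any real library.

def smc_generate(propose_token, constraint_score, num_particles=8, max_steps=50):
    # Each particle is one competing generation thread with a weight.
    particles = [{"text": "", "weight": 1.0} for _ in range(num_particles)]

    for _ in range(max_steps):
        # Extend every thread by one token and reweight it by how well
        # the extended prefix still satisfies the constraints.
        for p in particles:
            p["text"] += propose_token(p["text"])
            p["weight"] *= constraint_score(p["text"])

        total = sum(p["weight"] for p in particles)
        if total == 0:
            break  # no thread satisfies the constraints anymore

        # Resample: promising threads are copied more often, unpromising
        # ones are discarded, so computation concentrates where it pays off.
        chosen = random.choices(
            particles, weights=[p["weight"] for p in particles], k=num_particles
        )
        particles = [{"text": p["text"], "weight": 1.0} for p in chosen]

    # Return the candidate that scores best under the user's checks.
    return max(particles, key=lambda p: constraint_score(p["text"]))["text"]
```

The resampling step is what focuses computation: threads whose partial outputs keep passing the checks are copied and extended, while low-weight threads are dropped early in the process.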

“We have worked out the hard math so that, for any kind of constraint you would like to incorporate, you get the proper weights. In the end, you get the right answer,” says Loula.

Boosting small models

To test their approach, the researchers applied the framework to LLMs tasked with generating four kinds of outputs: Python code, SQL database queries, molecular structures, and plans for a robot to follow.

Compared with existing approaches, the researchers' method was more accurate while requiring less computation.

For instance, in Python code generation, the researchers' architecture enabled a small, open-source model to outperform a specialized, commercial closed-source model more than twice its size.

“We are very excited that we can allow these small models to punch well above their weight,” says Loula.

Moving forward, the researchers want to use their technique to control larger chunks of generated text, rather than working on one small piece at a time. They also want to combine their method with learning, so that as it controls a model's outputs, the model learns to become more accurate.

In the long run, this project could have broader applications for non-technical users. For instance, it could be combined with systems for automated data modeling and for querying generative models of databases.

The approach could also enable machine-assisted data analysis systems, in which the user converses with software that accurately models the meaning of the data and of the questions the user asks, Mansinghka adds.

“One of the fundamental questions of linguistics is how the meaning of words, phrases, and sentences can be grounded in models of the world, accounting for uncertainty and vagueness in meaning and reference,” says O’Donnell.

This research is funded, in part, by the Canada CIFAR AI Chairs Program and by the Siegel Family Foundation through a gift to the MIT Siegel Family Quest for Intelligence.
