Responsibility and security
Using advanced AI to fix critical software vulnerabilities
Today we are sharing early results from our research on CodeMender, a new AI-powered agent that automatically improves code security.
Software vulnerabilities are notoriously difficult and time-consuming for developers to find and fix, even with traditional automated methods such as fuzzing. Our AI-based efforts, such as Big Sleep and OSS-Fuzz, have demonstrated AI's ability to find new zero-day vulnerabilities in well-tested software. As we achieve more breakthroughs in AI-powered vulnerability discovery, it will become increasingly difficult for humans alone to keep up.
CodeMender helps solve this problem by taking a comprehensive approach to code security that is both reactive, instantly patching new vulnerabilities, and proactive, rewriting and securing existing code and eliminating entire classes of vulnerabilities in the process. Over the past six months of building CodeMender, we have already upstreamed 72 security fixes to open source projects, including some as large as 4.5 million lines of code.
By automatically creating and applying high-quality security patches, CodeMender's AI-powered agent helps developers and maintainers focus on what they do best: building good software.
CodeMender in action
CodeMender leverages the thinking capabilities of recent Gemini Deep Think models to produce an autonomous agent capable of debugging and fixing complex security vulnerabilities.
To achieve this, the CodeMender agent is equipped with robust tools that let it reason about code before making changes and automatically validate those changes to make sure they are correct and do not cause regressions.
Animation showing CodeMender's process for fixing vulnerabilities.
While large language models are improving rapidly, mistakes in code security can be costly. CodeMender's automatic validation process ensures that code changes are correct across many dimensions by surfacing for human review only high-quality patches that, for example, fix the root cause of the issue, are functionally correct, cause no regressions and follow style guidelines.
As part of our research, we have also developed new techniques and tools that let CodeMender reason about code and validate changes more effectively. This includes:
- Advanced program analysis: We have developed tools based on advanced program analysis, including static analysis, dynamic analysis, differential testing, fuzzing and SMT solvers. Using these tools to systematically scrutinize code patterns, control flow and data flow, CodeMender can better identify the root causes of security flaws and architectural weaknesses.
- Multi-agent systems: We have developed special-purpose agents that enable CodeMender to tackle specific aspects of an underlying problem. For example, CodeMender uses a critique tool based on large language models that highlights the differences between the original and modified code, in order to verify that the proposed changes do not introduce regressions and to self-correct as needed.
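As a rough illustration of one technique from the list above, differential testing, the following minimal sketch runs an original and a rewritten function on the same random inputs and reports every input on which they disagree. The functions `clamp_original` and `clamp_patched` and the helper `differential_test` are hypothetical names invented for this example; nothing here is taken from CodeMender itself.

```python
import random
from typing import Callable, List, Tuple


def clamp_original(value: int, low: int, high: int) -> int:
    """Hypothetical pre-patch function: clamp value into [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def clamp_patched(value: int, low: int, high: int) -> int:
    """Hypothetical post-patch rewrite that is meant to behave identically."""
    return max(low, min(value, high))


def differential_test(
    original: Callable[[int, int, int], int],
    patched: Callable[[int, int, int], int],
    trials: int = 1000,
    seed: int = 0,
) -> List[Tuple[int, int, int]]:
    """Return every generated input on which the two functions disagree."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    mismatches = []
    for _ in range(trials):
        low = rng.randint(-100, 100)
        high = rng.randint(low, 200)  # keep low <= high so inputs are valid
        value = rng.randint(-500, 500)
        args = (value, low, high)
        if original(*args) != patched(*args):
            mismatches.append(args)
    return mismatches


if __name__ == "__main__":
    diverging = differential_test(clamp_original, clamp_patched)
    print(f"{len(diverging)} diverging inputs found")
```

An empty result does not prove equivalence; it only means no counterexample was found among the sampled inputs, which is why such a check would typically be combined with the other analyses mentioned above.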
Fixing vulnerabilities
To effectively patch a vulnerability and prevent it from re-emerging, CodeMender uses a debugger, a source code browser and other tools to pinpoint root causes and devise patches. We have added two examples of CodeMender patching vulnerabilities to the video carousel below.
Example No. 1: Identifying the root cause of a vulnerability
Here is a snippet of the agent's reasoning about the root cause for a CodeMender-generated patch, after analyzing the debugger output and the results of a code search tool.
Although the final patch in this example changed only a few lines of code, the root cause of the vulnerability was not immediately clear. In this case, the crash report showed a heap buffer overflow, but the actual problem lay elsewhere: incorrect stack management of Extensible Markup Language (XML) elements during parsing.
Example No. 2: The agent can create non-trivial patches
In this example, the CodeMender agent was able to come up with a non-trivial patch that deals with a complex object lifetime issue.
The agent was not only able to figure out the root cause of the vulnerability, but also to modify a completely custom system for generating C code within the project.
Proactively rewriting existing code for better security
We also designed CodeMender to proactively rewrite existing code to use more secure data structures and APIs.
For example, we deployed CodeMender to apply -fbounds-safety annotations to parts of a widely used image compression library called libwebp. When -fbounds-safety annotations are applied, the compiler adds bounds checks to the code to prevent an attacker from exploiting a buffer overflow or underflow to execute arbitrary code.
A few years ago, a heap buffer overflow vulnerability in libwebp (CVE-2023-4863) was used by a threat actor as part of a zero-click iOS exploit. With -fbounds-safety annotations, this vulnerability, along with most other buffer overflows in the parts of the project where we have applied annotations, would have been rendered permanently unexploitable.
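The bounds-checking annotations described above are a C compiler feature. Purely as a conceptual model, and not as a depiction of how the compiler or libwebp implements anything, the following Python sketch shows the underlying idea: a buffer travels together with its element count, and every access is checked against that count, so an out-of-range read or write stops the program instead of touching neighbouring memory. The class `CountedBuffer` and the exception `BoundsTrap` are hypothetical names made up for this illustration.

```python
class BoundsTrap(Exception):
    """Hypothetical stand-in for the trap raised by a failed bounds check."""


class CountedBuffer:
    """Models a buffer that is always paired with its element count."""

    def __init__(self, count: int) -> None:
        self._data = bytearray(count)
        self._count = count

    def _check(self, index: int) -> None:
        # Reject both overflow (index >= count) and underflow (index < 0).
        if not 0 <= index < self._count:
            raise BoundsTrap(f"index {index} outside [0, {self._count})")

    def write(self, index: int, value: int) -> None:
        """Store one byte (0..255) after validating the index."""
        self._check(index)
        self._data[index] = value

    def read(self, index: int) -> int:
        """Load one byte after validating the index."""
        self._check(index)
        return self._data[index]


if __name__ == "__main__":
    buf = CountedBuffer(4)
    buf.write(3, 7)            # last valid slot
    try:
        buf.write(4, 1)        # one past the end: rejected, not written
    except BoundsTrap as trap:
        print("trapped:", trap)
```

The design point the sketch tries to capture is that a failed check turns a potential memory-corruption bug into a controlled stop, which is why an overflow becomes a crash rather than a route to arbitrary code execution.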
In the video carousel below, we show examples of the agent's decision-making process, including the validation steps.
Example No. 1: The agent's reasoning steps
In this example, the CodeMender agent is asked to address the following -fbounds-safety error on the bit_depths pointer:
Example No. 2: The agent automatically corrects errors and test failures
Another key feature of CodeMender is its ability to automatically correct new errors and any test failures that arise from its own annotations. Here is an example of the agent recovering from a compilation error.
Example No. 3: The agent validates the changes
In this example, the CodeMender agent modifies a function and then uses the LLM judge tool, configured for functional equivalence, to verify that the functionality remains intact. When the tool detects a failure, the agent self-corrects based on the LLM judge's feedback.
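The validate-then-self-correct pattern described in this example can be sketched in a few lines. This is a simplified illustration under stated assumptions, not CodeMender's implementation: the judge here is an ordinary function that compares outputs on a handful of fixed inputs rather than an LLM, the candidate rewrites are supplied up front instead of being generated from the feedback, and all names (`reference_abs`, `equivalence_judge`, `repair_loop`) are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[int], int]
# A judge returns feedback text on failure, or None when it accepts.
Judge = Callable[[Candidate], Optional[str]]


def reference_abs(x: int) -> int:
    """Hypothetical original function whose behaviour must be preserved."""
    return x if x >= 0 else -x


def equivalence_judge(candidate: Candidate) -> Optional[str]:
    """Stand-in judge: compare the candidate with the reference on fixed inputs."""
    for x in (-3, -1, 0, 2, 10):
        if candidate(x) != reference_abs(x):
            return f"mismatch for input {x}"
    return None


def repair_loop(
    candidates: List[Candidate], judge: Judge
) -> Tuple[Optional[Candidate], List[str]]:
    """Try candidates in order until one is accepted, collecting feedback."""
    feedback_log: List[str] = []
    for candidate in candidates:
        feedback = judge(candidate)
        if feedback is None:
            return candidate, feedback_log
        feedback_log.append(feedback)  # a real agent would use this to revise
    return None, feedback_log


if __name__ == "__main__":
    attempts = [lambda x: x, lambda x: abs(x)]  # first attempt is wrong
    accepted, log = repair_loop(attempts, equivalence_judge)
    print("accepted:", accepted is not None, "| feedback seen:", log)
```

The loop only surfaces a candidate once the judge has accepted it, which mirrors the idea that rejected rewrites are corrected before anything reaches a human reviewer.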
Making software secure for everyone
Although our early results with CodeMender are promising, we are taking a cautious approach and focusing on reliability. Currently, all patches generated by CodeMender are reviewed by human researchers before they are submitted upstream.
Using CodeMender, we have already begun submitting patches to various critical open source libraries, many of which have already been accepted and upstreamed. We are ramping up this process gradually to ensure quality and to systematically address feedback from the open source community.
We will also gradually reach out to interested maintainers of critical open source projects with CodeMender-generated patches. By iterating on feedback from this process, we hope to release CodeMender as a tool that all software developers can use to keep their codebases secure.
We will have a number of techniques and results to share, which we intend to publish as technical papers and reports in the coming months. With CodeMender, we have only just begun to explore AI's incredible potential to enhance software security for everyone.
Acknowledgements
Credits (listed in alphabetical order):
Alex Rebert, Arman Hasanzadeh, Carlo Lemos, Charles Sutton, Dongge Liu, Gogul Balakrishnan, Hiep Chu, James Zern, Koushik Sen, Lihao Liang, Max Shavrick, Oliver Chang and Petros Maniatis.
