Researchers at Amazon Web Services and the University of Wisconsin-Madison recently conducted a study of large language models of code (Code-LLMs). These models assist with code completion, but they can falter when the surrounding code context contains bugs.
Programming bugs are common and can be tough to locate and fix. The study examines how well Code-LLMs complete code when the given context contains potential bugs, with the goal of making these assistants more reliable for programmers.
The research aims to deepen our understanding of how Code-LLMs generate functional implementations from a code context that contains potential bugs. Two datasets, buggy-HumanEval and buggy-FixEval, were constructed for the evaluation. The study revealed significant performance degradation across models, with test-case pass rates dropping below 5%.
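To make the setup concrete, here is a hypothetical illustration of completing code from a buggy prefix (this toy example is not drawn from either dataset). The prefix contains an inverted comparison; a completion that blindly continues preserves the bug and fails the tests, while a bug-aware completion can still realize the intended behavior.

```python
# Hypothetical bCC-style example (not from buggy-HumanEval/buggy-FixEval):
# the prefix is meant to compute a minimum, but the comparison is inverted.

buggy_prefix = """
def min_of_list(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:   # potential bug: comparison inverted for a minimum
"""

# A naive continuation keeps the bug, so the function returns the maximum:
naive_completion = """
            best = x
    return best
"""

# A bug-aware completion works around the faulty branch:
aware_completion = """
            pass           # ignore larger values
        else:
            best = x       # track the smaller value instead
    return best
"""

def run(prefix, completion):
    namespace = {}
    exec(prefix + completion, namespace)   # assemble and execute the function
    return namespace["min_of_list"]([3, 1, 2])

print(run(buggy_prefix, naive_completion))  # → 3 (the bug survives)
print(run(buggy_prefix, aware_completion))  # → 1 (intended minimum)
```

This mirrors how the benchmarks score completions: the assembled program is executed against test cases, and only functionally correct completions count toward the pass rate.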
Several post-hoc mitigation methods were explored to address this degradation, including removal-then-completion and rewriting-then-completion, but a notable performance gap persists even with these methods. This work highlights the need for further research on code completion in the presence of potential bugs.
The work presented in the study can be summarized in the following points:
1. Introduction of a new task, buggy-code completion (bCC): generating functional implementations from a code context with potential bugs.
2. Evaluation on two datasets named buggy-HumanEval and buggy-FixEval.
3. Significant degradation of Code-LLMs’ performance, with test-case pass rates dropping below 5%.
4. Exploration of post-mitigation methods, including completion-then-rewriting and rewriting-then-completion.
5. Proposal of ways to improve code completion in the presence of potential bugs.
For more details, check out the paper.