Training Autoregressive Language Models with Fill-in-the-Middle for Improved Performance

Autoregressive Language Models and Text Infilling

In recent years, there has been growing interest in data augmentation techniques for autoregressive language models. One such technique moves a span of text from the middle of a document to its end, so that a model trained left to right learns to generate the missing middle after seeing the text on both sides of it. This simple transformation has shown promising results on text infilling tasks.
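The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the sentinel strings `<PRE>`, `<SUF>`, and `<MID>`, the function name `fim_transform`, and the character-level span indices are all assumptions made for the example. A real pipeline would likely use dedicated special tokens in the tokenizer.

```python
# Hypothetical sentinel markers (assumed for illustration only).
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def fim_transform(doc: str, start: int, end: int) -> str:
    """Move the span doc[start:end] from the middle of the document to its end.

    The result has the layout: prefix, then suffix, then the moved middle,
    with a sentinel in front of each piece so the pieces can be told apart.
    """
    if not 0 <= start <= end <= len(doc):
        raise ValueError("span indices out of range")
    prefix, middle, suffix = doc[:start], doc[start:end], doc[end:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


if __name__ == "__main__":
    print(fim_transform("The cat sat on the mat.", 4, 11))
```

Because the middle span now comes last, an ordinary left-to-right model trained on such text predicts the middle only after it has seen both the prefix and the suffix.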

The good news is that this data augmentation technique does not harm the models' original left-to-right generative capability. Extensive evidence indicates that models trained with a large fraction of their data transformed in this way still perform well on perplexity and sampling evaluations across different model scales.

Considering the usefulness, simplicity, and efficiency of training models with this technique, it is recommended that future autoregressive language models be trained with text infilling by default.

To find the best settings, we ran a series of ablations on key hyperparameters: how often the transformation is applied, how the transformed document is structured, and how the infill span is selected. Based on the findings, we have established strong default settings and best practices for training models with text infilling.
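Two of the hyperparameters mentioned above, how often the transformation is applied and how the span is chosen, can be illustrated with a short Python sketch. Everything here is assumed for the example: the name `maybe_fim`, the `fim_rate` parameter, the sentinel strings, and the choice of a uniformly random character-level span. The article does not state which values or selection method the ablations favored, so none are implied.

```python
import random

# Hypothetical sentinel markers (assumed for illustration only).
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def maybe_fim(doc: str, fim_rate: float, rng: random.Random) -> str:
    """Apply the middle-to-end transformation with probability fim_rate.

    When applied, two distinct cut points are drawn uniformly at random
    (an assumed span-selection rule) and the text between them is moved
    to the end of the document.
    """
    if len(doc) < 2 or rng.random() >= fim_rate:
        return doc  # leave the document as ordinary left-to-right text
    start, end = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:start], doc[start:end], doc[end:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


if __name__ == "__main__":
    rng = random.Random(0)
    docs = ["first example document", "second example document"]
    # 0.5 is an arbitrary illustrative rate, not a recommended setting.
    for d in docs:
        print(maybe_fim(d, fim_rate=0.5, rng=rng))
```

Passing the random generator in explicitly keeps the augmentation reproducible, which makes it easier to compare runs when varying a single hyperparameter.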

We are excited to announce that we have released our best infilling model, trained with these recommended practices, in our API. Additionally, we have made our infilling benchmarks available to assist researchers in their future work on this topic.

Benefits of Autoregressive Language Models and Text Infilling

By incorporating text infilling into the training process, autoregressive language models gain the ability to fill in missing text alongside ordinary left-to-right generation. This ability can be particularly useful in applications such as automated summarization, content generation, and dialogue systems.

The straightforward approach of moving a span of text from the middle to the end of a document lets the models condition on the text both before and after a gap, so the generated span fits its surrounding context.

Training autoregressive language models with text infilling is efficient: the technique only changes how training documents are arranged, and it adds a useful capability to the language generation process.


In conclusion, text infilling has proven to be a valuable addition to autoregressive language models, giving them a new capability without harming their original generative capability. The strong default settings and best practices described above are intended to help others train such models well.

We are excited about the future prospects of this technique and its potential for further advancements in natural language processing. Through our released infilling model and benchmarks, we aim to contribute to the ongoing research in this field and empower others to build on our work.
