Introducing RA-DIT: Enhancing Language Models with Retrieval Capabilities
Researchers from Meta have developed Retrieval-Augmented Dual Instruction Tuning (RA-DIT), a lightweight fine-tuning methodology designed to equip any LLM with efficient retrieval capabilities. It addresses two limitations of large language models (LLMs): their difficulty capturing less common, long-tail knowledge, and the high computational cost of retrieval-specific pre-training.
Key Features of RA-DIT
RA-DIT is a two-stage fine-tuning method that enhances LLMs with retrieval capabilities. The first stage optimizes the LM’s use of retrieved information, while the second stage refines the retriever to return more contextually relevant results preferred by the LLM. By combining these two stages, RA-DIT outperforms existing retrieval-augmented models on knowledge-intensive zero- and few-shot learning benchmarks.
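The first stage can be pictured as simple data augmentation of the instruction-tuning set. A minimal sketch, using a hypothetical `build_lm_ft_examples` helper (the function name and prompt template are illustrative assumptions, not the paper's exact format): each (instruction, answer) pair is expanded into one training example per retrieved chunk, with the chunk prepended as background context, and the LM is then fine-tuned on the answer as usual.

```python
def build_lm_ft_examples(instruction, answer, retrieved_chunks):
    """Expand one (instruction, answer) pair into per-chunk training examples.

    Hypothetical sketch of RA-DIT stage 1: the retrieved text is prepended
    to the instruction so the LM learns to use (or ignore) the context.
    """
    examples = []
    for chunk in retrieved_chunks:
        # Prepend the retrieved chunk as background context (template is an assumption).
        prompt = f"Background: {chunk}\n\n{instruction}"
        examples.append({"prompt": prompt, "completion": answer})
    return examples

# Usage: three retrieved chunks yield three augmented training examples.
chunks = [
    "RA-DIT is a fine-tuning recipe.",
    "DRAGON is a dense retriever.",
    "LLaMA is a pretrained LM.",
]
examples = build_lm_ft_examples(
    "What is RA-DIT?", "A dual instruction tuning method.", chunks
)
```

Training on these per-chunk examples teaches the LM both to exploit relevant retrieved passages and to fall back on its parametric knowledge when a passage is unhelpful.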
How RA-DIT Works
RA-DIT involves two key fine-tuning stages. The first enhances the pre-trained LLM’s utilization of retrieved information; the second refines the retriever so that its results are more contextually relevant and preferred by the LLM. The approach uses the LLaMA language model, pretrained on an extensive dataset, as the base LM, together with a dual-encoder retriever architecture initialized from the DRAGON model. Additionally, parallel in-context retrieval augmentation is used to compute LLM predictions more efficiently across retrieved chunks.
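The second stage, refining the retriever toward passages the LLM prefers, can be sketched as a KL-divergence objective: the LM's likelihood of the correct answer given each chunk defines a target distribution, and the retriever's score distribution is pulled toward it. The sketch below is a minimal pure-Python illustration of this idea; the function names and the exact direction/temperature of the objective are simplifying assumptions, not the paper's verbatim formulation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def lsr_loss(retriever_scores, lm_log_likelihoods):
    """KL(p_target || p_retriever): pull the retriever toward chunks
    under which the LM assigns high likelihood to the correct answer.
    Illustrative sketch, not the paper's exact loss."""
    p_r = softmax(retriever_scores)        # retriever's distribution over chunks
    p_t = softmax(lm_log_likelihoods)      # LM-preferred target distribution
    return sum(t * math.log(t / q) for t, q in zip(p_t, p_r) if t > 0)
```

When the retriever already ranks chunks exactly as the LM prefers, the loss is zero; any disagreement yields a positive penalty that gradient updates would reduce.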
Performance and Results
RA-DIT achieves notable performance gains, with RA-DIT 65B setting new benchmarks on knowledge-intensive zero- and few-shot learning tasks. It surpasses existing in-context Retrieval-Augmented Language Models (RALMs) by a significant margin and also outperforms base LLaMA models on commonsense reasoning evaluation datasets. Ablation analysis and parallel in-context retrieval augmentation further highlight RA-DIT’s effectiveness in enhancing retrieval-augmented language models.
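Parallel in-context retrieval augmentation, mentioned above, combines predictions made separately with each retrieved chunk rather than stuffing all chunks into one long prompt: the answer probability is a retriever-weighted mixture, p(y|x) = Σ_c p_R(c|x) · p_LM(y | c, x). A minimal sketch (the function name is illustrative):

```python
def parallel_rag_probability(retriever_probs, per_chunk_answer_probs):
    """Mix per-chunk LM answer probabilities, weighted by retriever scores.

    Implements p(y|x) = sum_c p_R(c|x) * p_LM(y | c, x), where each
    per-chunk probability comes from a separate forward pass with only
    that chunk prepended to the prompt.
    """
    return sum(w * p for w, p in zip(retriever_probs, per_chunk_answer_probs))

# Usage: three chunks with retriever weights 0.5/0.3/0.2 and the LM's
# answer probability under each chunk.
prob = parallel_rag_probability([0.5, 0.3, 0.2], [0.9, 0.4, 0.1])
# 0.5*0.9 + 0.3*0.4 + 0.2*0.1 = 0.59
```

Because each chunk is processed in its own (shorter) context, the per-chunk forward passes can run in parallel, which is where the efficiency gain comes from.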
In conclusion, RA-DIT offers a lightweight fine-tuning method to enhance the performance of pre-trained language models with retrieval capabilities. It achieves state-of-the-art results in zero and few-shot evaluations on knowledge-intensive benchmarks, showcasing its superiority in incorporating external knowledge into LLMs for improved performance.
For more information, you can read the research paper.