Introducing ResLoRA: An Enhanced Framework for LLMs
Large language models (LLMs) with hundreds of billions of parameters have significantly improved performance on a wide range of tasks. Fine-tuning these models on task-specific datasets yields better results than prompting at inference time, but full fine-tuning is costly because of the sheer number of parameters involved. This is where low-rank adaptation (LoRA) comes in as a popular parameter-efficient fine-tuning method for LLMs.
Understanding Low-rank Adaptation (LoRA)
LoRA freezes the parameters of the original model and trains only a small number of new ones: each LoRA block places a pair of low-rank matrices in parallel with a frozen linear layer, and after training these matrices can be merged into that layer so inference costs nothing extra. However, unlike ResNet, LoRA blocks lack shortcut connections, so gradients must flow through a long backward path during training, which can hinder optimization.
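To make the mechanics concrete, here is a minimal PyTorch sketch of a LoRA-style linear layer. It is an illustrative example only, not the reference implementation; the class name and the rank/scaling parameters (`r`, `alpha`) are our own choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        # Trainable low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: training starts at the base model
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

    @torch.no_grad()
    def merge(self):
        """Fold the low-rank update into the frozen weight so inference uses a single matmul."""
        self.base.weight += self.scaling * (self.lora_B @ self.lora_A)
```

Only `lora_A` and `lora_B` receive gradients, so optimizer state and checkpoints stay small, and after `merge()` the layer behaves exactly like an ordinary linear layer at inference time.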
ResLoRA: An Advanced Framework
Researchers from the School of Computer Science and Engineering, Beihang University, Beijing, China, and Microsoft have introduced ResLoRA, an improved framework of LoRA. ResLoRA adds residual paths to LoRA blocks during training and employs merging approaches to convert ResLoRA to LoRA blocks during inference. This innovative approach outperforms original LoRA and other baseline methods across natural language generation (NLG) and understanding (NLU) tasks.
ResLoRA introduces three block designs inspired by ResNet—input-shortcut, block-shortcut, and middle-shortcut—each adding residual paths to LoRA blocks. These structures improve gradient flow during training and are crucial for efficient parameter tuning. Because the residual paths prevent ResLoRA blocks from being folded directly into adjacent linear layers, the researchers also design merging approaches that rely on the weights or inputs of previous blocks to convert trained ResLoRA blocks back into plain LoRA blocks with minimal loss of accuracy.
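As a rough sketch of the input-shortcut variant, the LoRA branch can also receive the previous block's input, and at merge time the residual term is approximated by a scalar so the block collapses back into a plain merged layer. The class below is our own illustration under those assumptions, not the authors' code; `alpha_hat` is a hypothetical stand-in for the merge factor, which the paper estimates from tracked activation norms or previous block weights.

```python
import torch
import torch.nn as nn

class ResLoRAInputShortcut(nn.Module):
    """Illustrative input-shortcut ResLoRA block (not the official code):
    the LoRA branch also sees the previous block's input, giving gradients
    a shorter residual path back through the network."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x, prev_x=None):
        # Residual path: add the previous block's input to this LoRA branch.
        lora_in = x if prev_x is None else x + prev_x
        return self.base(x) + (lora_in @ self.lora_A.T @ self.lora_B.T) * self.scaling

    @torch.no_grad()
    def merge(self, alpha_hat):
        """Convert back to a plain merged layer for inference.
        alpha_hat (hypothetical here) approximates ||prev_x|| / ||x||, so
        B A (x + prev_x) is treated as roughly (1 + alpha_hat) * B A x."""
        delta = self.scaling * (self.lora_B @ self.lora_A)
        self.base.weight += (1.0 + alpha_hat) * delta
```

Once merged, the block is an ordinary linear layer again, which is why ResLoRA adds no trainable parameters or inference cost relative to LoRA.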
Enhanced Performance and Results
In extensive experiments across NLG, NLU, and text-to-image tasks, ResLoRA consistently outperforms LoRA and its variants such as AdaLoRA, LoHA, and LoKr, with accuracy improvements ranging from 10.98% to 36.85%, faster training, and higher-quality image generation than LoRA.
Overall, ResLoRA proves to be an effective framework for fine-tuning LLMs, offering superior outcomes with fewer training steps and no additional trainable parameters. It sets a new standard in the field, combining the benefits of LoRA with innovative approaches for enhanced performance across various AI tasks.