The Revolution of Diffusion Models in Generative Modeling
Diffusion models have reshaped generative modeling across many data types. Out of the box, however, text-to-image diffusion models do not reliably produce aesthetically pleasing images from text descriptions, so fine-tuning is often necessary. Existing text-to-image models rely on techniques such as classifier-free guidance and on curated datasets like LAION Aesthetics to improve alignment and image quality.
The image above shows results from Direct Reward Fine-Tuning (DRaFT) using human preference reward models. The authors propose a simple and efficient method for gradient-based reward fine-tuning: differentiate the reward through the diffusion sampling process itself. DRaFT backpropagates through the entire sampling chain, which is typically unrolled into a computation graph of 50 steps. To keep memory and compute costs manageable, the method uses gradient checkpointing and optimizes LoRA weights rather than the full set of model parameters.
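One way to write down what is being optimized, using notation assumed here for illustration rather than taken from the article: let sample(θ, c, x_T) be the image produced by a T-step sampler with LoRA parameters θ, prompt c, and initial noise x_T, and let r be a differentiable reward. Assuming the sampler is deterministic once x_T is fixed, the objective and its gradient can be sketched as:

```latex
J(\theta) = \mathbb{E}_{c,\; x_T \sim \mathcal{N}(0, I)}
  \big[\, r(\mathrm{sample}(\theta, c, x_T),\, c) \,\big],
\qquad
\nabla_\theta J(\theta) = \mathbb{E}_{c,\; x_T}
  \left[ \frac{\partial r}{\partial x_0} \, \frac{\mathrm{d} x_0}{\mathrm{d} \theta} \right],
\quad x_0 = \mathrm{sample}(\theta, c, x_T)
```

The total derivative of x_0 with respect to θ expands by the chain rule through every one of the T sampling steps, which is why memory-saving measures such as gradient checkpointing matter here.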
Enhancements to DRaFT for Improved Efficiency and Performance
In addition to the basic DRaFT method, the authors introduce two enhancements to further improve efficiency and performance. The first is DRaFT-K, a variant that limits backpropagation to only the last K steps of sampling when computing the gradient for fine-tuning. With the same number of training steps, it outperforms full backpropagation, which can suffer from exploding gradients. The second is DRaFT-LV, a variation of DRaFT-1 (DRaFT-K with K = 1) that computes lower-variance gradient estimates by averaging over multiple noise samples, which makes training more efficient still.
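The truncation idea can be illustrated with a deliberately tiny toy, sketched here under stated assumptions and not drawn from the authors' code. The scalar "sampler" x_{t-1} = x_t - step_size * theta * x_t, the quadratic reward, and the function names are all hypothetical. The derivative is tracked by hand in forward mode, which for this scalar case yields the same number that truncated backpropagation would. The low-variance variant re-noises the final sample several times and averages the one-step gradients; the exact re-noising scheme shown is an assumption.

```python
import random


def draft_k_gradient(theta, x_T, num_steps, k, step_size=0.1, target=0.0):
    """Toy DRaFT-K: return (reward, d reward / d theta) for a scalar sampler.

    Hypothetical sampler: x_{t-1} = x_t - step_size * theta * x_t.
    Hypothetical reward:  r(x_0) = -(x_0 - target) ** 2.
    Only the last k steps contribute to the gradient; earlier steps are
    treated as constants, which mimics a stop-gradient.
    """
    x = x_T
    dx_dtheta = 0.0  # derivative of the current x with respect to theta
    for t in range(num_steps, 0, -1):
        if t <= k:
            # Inside the differentiated window: product rule on x * (1 - a * theta).
            dx_dtheta = dx_dtheta * (1.0 - step_size * theta) - step_size * x
        # Outside the window dx_dtheta stays 0, i.e. the gradient is stopped.
        x = x - step_size * theta * x
    reward = -(x - target) ** 2
    d_reward = -2.0 * (x - target) * dx_dtheta
    return reward, d_reward


def draft_lv_gradient(theta, x_T, num_steps, n_noise, sigma=0.1,
                      step_size=0.1, target=0.0, rng=None):
    """Toy low-variance estimate: average one-step gradients over noise draws.

    Assumed scheme: run the sampler with no gradient, re-noise the result
    n_noise times, apply one differentiable denoising step to each noisy
    copy, and average the resulting reward gradients.
    """
    rng = rng or random.Random(0)
    x0 = x_T
    for _ in range(num_steps):
        x0 = x0 - step_size * theta * x0  # no gradient tracked here
    grads = []
    for _ in range(n_noise):
        x1 = x0 + sigma * rng.gauss(0.0, 1.0)  # re-noised copy, a constant
        x0_hat = x1 - step_size * theta * x1   # single differentiable step
        grads.append(-2.0 * (x0_hat - target) * (-step_size * x1))
    return sum(grads) / len(grads)


if __name__ == "__main__":
    for k in (1, 10, 50):
        print(k, draft_k_gradient(theta=0.5, x_T=2.0, num_steps=50, k=k))
    print("LV", draft_lv_gradient(theta=0.5, x_T=2.0, num_steps=50, n_noise=4))
```

Setting k equal to num_steps recovers the full-chain gradient, while k = 0 stops the gradient entirely. In a real model the same effect would come from detaching the sampler state before the last K steps in an autodiff framework rather than from hand-written derivatives.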
The authors applied DRaFT to Stable Diffusion 1.4 and evaluated it with various reward functions and prompt sets. Compared with RL-based fine-tuning baselines, their methods were substantially more efficient. For example, when maximizing scores from the LAION Aesthetics Classifier, their approach was more than 200 times faster than RL algorithms.
DRaFT-LV, one of the proposed variations, showed exceptional efficiency, learning approximately twice as fast as ReFL, a prior gradient-based fine-tuning method. The authors also demonstrated the versatility of DRaFT by combining or interpolating DRaFT models with pre-trained models, which can be achieved by adjusting LoRA weights through mixing or scaling.
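Mixing and scaling LoRA weights can be pictured on a single weight matrix. The sketch below is a minimal illustration in plain Python, assuming the standard LoRA form in which each fine-tuned layer is the pretrained weight plus a low-rank product up @ down; the helper names and the tiny 2x2 matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def apply_loras(base_weight, loras):
    """Return base_weight + sum(scale * up @ down) over (up, down, scale) triples.

    A scale of 0 leaves the pretrained weight unchanged, 1 applies the full
    fine-tuned update, and values in between interpolate. Passing several
    triples mixes multiple LoRA updates into one weight matrix.
    """
    merged = [row[:] for row in base_weight]  # copy so the base is untouched
    for up, down, scale in loras:
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Hypothetical pretrained weight and two rank-1 LoRA updates.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = ([[1.0], [0.0]], [[0.5, 0.5]])    # (up, down) factors
lora_b = ([[0.0], [1.0]], [[0.2, -0.2]])

half_strength = apply_loras(base, [(*lora_a, 0.5)])
mixed = apply_loras(base, [(*lora_a, 0.5), (*lora_b, 0.5)])
```

Because the update is additive, the scale acts as a dial between the pretrained and fine-tuned behavior, and no retraining is needed to change it.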
Directly fine-tuning diffusion models on differentiable rewards presents a promising avenue for improving generative modeling techniques for images, text, and more. The efficiency, versatility, and effectiveness of the DRaFT approach make it a valuable tool for researchers and practitioners in the field of machine learning and generative modeling.