Recent advances in Artificial Intelligence have highlighted the potential of Large Language Models (LLMs) such as GPT, PaLM, and LLaMA. While fine-tuning has long been considered essential for turning base LLMs into useful assistants, a recent hypothesis challenged this widely adopted practice. The Superficial Alignment Hypothesis, proposed in the LIMA study, suggests that alignment tuning mainly teaches these models which data formats and linguistic styles to use when engaging with users, rather than imparting new knowledge; on this view, alignment tuning chiefly serves to assimilate the stylistic conventions of AI assistants.
In response to these findings, a team of researchers explored how far base LLMs can be aligned without the commonly adopted practice of alignment tuning, and proposed a new tuning-free alignment method, URIAL (Untuned LLMs with Restyled In-context ALignment). URIAL achieves effective alignment solely through in-context learning with base LLMs: a fixed prompt consisting of a system-style preamble and a few carefully restyled example exchanges is prepended to the user's query, with no fine-tuning involved. The team's evaluation demonstrated that URIAL can perform on par with, or better than, LLMs aligned through traditional tuning-based strategies.
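To make the idea concrete, here is a minimal Python sketch of how a URIAL-style prompt could be assembled. It assumes the general recipe described above: a system-style preamble plus a few fixed, restyled question-answer demonstrations are prepended to the user's query, and the resulting string is sent to an untuned base model as a plain completion prompt. The preamble wording, example pairs, and the `build_urial_prompt` helper are all illustrative assumptions, not the paper's verbatim prompt.

```python
# Sketch of URIAL-style in-context alignment: no fine-tuning, just a fixed
# prompt prefix fed to a base LLM. All text below is illustrative, not the
# exact prompt used in the paper.

SYSTEM_PREAMBLE = (
    "Below is a conversation between a curious user and a helpful, honest "
    "AI assistant. The assistant answers clearly and politely, and declines "
    "unsafe requests."
)

# A handful of hand-curated demonstrations of the desired answer style.
STYLISTIC_EXAMPLES = [
    ("What is the boiling point of water at sea level?",
     "Water boils at 100 degrees Celsius (212 degrees Fahrenheit) at sea level."),
    ("Can you help me write a short thank-you note?",
     "Of course! Here is one: \"Thank you so much for your thoughtfulness; "
     "it truly made my day.\""),
]

def build_urial_prompt(user_query: str) -> str:
    """Assemble a completion prompt: preamble + few-shot examples + query."""
    parts = [SYSTEM_PREAMBLE, ""]
    for question, answer in STYLISTIC_EXAMPLES:
        parts += [f"# Query:\n{question}", f"# Answer:\n{answer}", ""]
    # End with an unanswered query so the base model completes the answer.
    parts += [f"# Query:\n{user_query}", "# Answer:\n"]
    return "\n".join(parts)

if __name__ == "__main__":
    prompt = build_urial_prompt("Explain why the sky is blue in one paragraph.")
    print(prompt)  # Pass this string to any base-model completion endpoint.
```

Because the prefix is identical for every query, the only per-request work is appending the user's question, which is what makes this approach tuning-free.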
These findings underscore the shallow nature of alignment tuning and show that an aligned model's capabilities depend largely on the knowledge the base LLM already acquired during pretraining. For more information on this research, check out their Paper and Project.