Symbol Tuning: Improving In-Context Learning in Language Models
Introduction
Human intelligence allows us to learn new tasks from just a few examples. Large language models show a similar ability: given a handful of input-label pairs in a prompt, they can perform new tasks via in-context learning. However, these models remain sensitive to how prompts are written, which suggests they are not reasoning about the prompt in a robust way. Careful prompt engineering and phrasing tasks as instructions are often required, and models can behave unexpectedly, for example performing no worse when in-context examples are given incorrect labels. To address these limitations, we propose symbol tuning, a simple fine-tuning procedure that emphasizes input-label mappings and improves in-context learning in language models.
Motivation
Instruction tuning is a common fine-tuning method that improves how well models respond to instructions and in-context examples. However, it does not force models to learn from those examples, because the instruction and the natural language labels redundantly define the task: a model can ignore the exemplars and still answer correctly. Symbol tuning, in contrast, removes instructions and replaces natural language labels with unrelated symbols, so the task can only be recovered by reasoning over the in-context input-label pairs. Symbol-tuned models should therefore perform better on tasks that require reasoning over in-context examples and their labels.
Symbol-Tuning Procedure
We selected 22 publicly available natural language processing datasets for symbol tuning. All are classification-style tasks with discrete labels. For each tuning example, the task instruction was removed and the natural language labels were remapped to random symbols drawn from a pool of approximately 30,000 arbitrary labels. We applied symbol tuning to Flan-PaLM models of several sizes.
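The remapping step itself is simple. The sketch below is a minimal, illustrative version rather than the actual training pipeline: the toy dataset, the tiny symbol pool, and the function name are placeholders, and the real procedure samples from the pool of roughly 30,000 arbitrary symbols mentioned above.

```python
import random

# Illustrative sketch of the label-remapping step (not the actual pipeline).
# SYMBOL_POOL and the toy dataset are placeholders; the described procedure
# samples from a pool of roughly 30,000 arbitrary symbols.
SYMBOL_POOL = ["foo", "XJ-7", "102", "zeta", "qv", "bar"]

def remap_labels(examples, label_names, seed=0):
    """Replace each natural language label with a randomly chosen symbol."""
    rng = random.Random(seed)
    symbols = rng.sample(SYMBOL_POOL, k=len(label_names))
    mapping = dict(zip(label_names, symbols))
    return [(text, mapping[label]) for text, label in examples], mapping

examples = [
    ("The movie was a delight.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
remapped, mapping = remap_labels(examples, ["positive", "negative"])
# mapping might be {"positive": "XJ-7", "negative": "foo"}
```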
Experimental Setup
To evaluate the models' ability to perform unseen tasks, we held out 11 NLP datasets that were not used during fine-tuning. Symbol-tuned models were evaluated on these held-out datasets, with a focus on in-context learning and algorithmic reasoning.
In-Context Learning
In-context learning evaluations present the model with input-label exemplars in the prompt, and the model must reason over these examples to perform the task. Symbol tuning improved performance across the evaluated settings, with the largest gains on tasks where relevant natural language labels were unavailable. Smaller symbol-tuned models even outperformed larger instruction-tuned models, pointing to significant savings in inference compute.
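To make this setup concrete, here is a hedged sketch of what a few-shot prompt might look like once instructions are dropped and labels are replaced with symbols. The "Input"/"Label" template and the sentiment examples are assumptions for illustration, not the exact format used in the evaluations.

```python
def build_symbol_prompt(exemplars, query):
    """Format in-context exemplars with symbol labels and no task instruction.

    Because the prompt contains no instruction and no meaningful label names,
    the task can only be inferred from the input-label pairs themselves.
    """
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in exemplars]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)

exemplars = [
    ("The movie was a delight.", "XJ-7"),         # stands in for "positive"
    ("Two hours I will never get back.", "foo"),  # stands in for "negative"
]
print(build_symbol_prompt(exemplars, "A tedious, joyless slog."))
```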
Algorithmic Reasoning
Symbol tuning also improved performance on algorithmic reasoning tasks, where the input-label mapping is defined only by the examples, yielding significant gains over the baselines. Here too, smaller symbol-tuned models reached higher performance than larger baselines, reducing the inference compute needed for a given level of accuracy.
Flipped Labels
We also tested whether models can override prior knowledge by evaluating on flipped labels, where the labels of the in-context examples are swapped (for example, exemplars expressing positive sentiment are labeled as negative). Symbol-tuned models were much better than instruction-tuned models at following flipped labels presented in context, suggesting that symbol tuning strengthens the model's ability to learn from, and defer to, the information in the prompt.
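As a rough illustration of the flipped-label setup, the sketch below swaps the two labels in every in-context exemplar. The sentiment data and the helper name are hypothetical, but the idea carries over: a model that truly learns from the prompt should follow the flipped mapping rather than its prior knowledge.

```python
def flip_labels(exemplars, label_a, label_b):
    """Swap two labels in every in-context exemplar (illustrative helper)."""
    swap = {label_a: label_b, label_b: label_a}
    return [(text, swap.get(label, label)) for text, label in exemplars]

exemplars = [
    ("The movie was a delight.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
flipped = flip_labels(exemplars, "positive", "negative")
# A model that follows the in-context evidence should now answer "negative"
# for clearly positive inputs, overriding what it learned in pretraining.
```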
Conclusion
Symbol tuning is a simple fine-tuning method that improves in-context learning in language models by emphasizing input-label mappings. By removing instructions and replacing natural language labels with arbitrary symbols, it forces models to reason over the in-context examples. Symbol-tuned models showed improved performance on unseen tasks and algorithmic reasoning, as well as a stronger ability to override prior knowledge with information presented in context. Symbol tuning is therefore a promising way to make the reasoning and in-context learning of language models more robust.