Researchers at the Beijing Academy of Artificial Intelligence and the Gaoling School of Artificial Intelligence, Renmin University of China have developed a new method called Activation Beacon, which extends the context length of large language models (LLMs) such as Llama-1 and Llama-2. Activation Beacon condenses raw activations so that the LLM can grasp a broad context within its short attention window, effectively extending the context length with high quality, supporting diverse context lengths, and remaining compatible with existing LLMs.
Using special tokens called beacons, Activation Beacon condenses the raw activations of an interval of L tokens into k beacon activations, achieving a condensing ratio α = L/k (with k ≪ L) and thereby packing far more information into the attention window. The method considers three attention schemes for the beacons, with stepwise expansion proving the most effective. Activation Beacon is a plug-and-play LLM module that introduces long contextual information while preserving the LLM's short-context capabilities. Experimental results confirm that it is an effective, efficient, and low-cost method for extending LLM context length.
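To make the condensing ratio concrete, below is a minimal, hypothetical sketch (not the authors' implementation): it condenses an interval of L raw activations into k condensed activations, giving α = L/k. Simple mean pooling stands in for the learned beacon mechanism, in which k appended beacon tokens attend over the interval; names such as `condense_chunk` and the tensor sizes are assumptions for illustration only.

```python
import torch

def condense_chunk(raw_activations: torch.Tensor, k: int) -> torch.Tensor:
    """Condense an (L, d) interval of activations into k condensed activations.

    Illustrative stand-in: mean pooling over groups of alpha = L/k activations.
    The actual method instead uses k learned beacon tokens that attend to the
    interval and contributes their activations to the context window.
    """
    L, d = raw_activations.shape
    assert L % k == 0, "assume L is divisible by k for this sketch"
    alpha = L // k  # condensing ratio alpha = L / k, with k << L
    return raw_activations.view(k, alpha, d).mean(dim=1)  # shape (k, d)

# Usage: a 1024-token interval condensed into 8 activations (alpha = 128).
chunk = torch.randn(1024, 4096)   # hypothetical hidden size of 4096
beacons = condense_chunk(chunk, k=8)
print(beacons.shape)              # torch.Size([8, 4096])
```

Under this ratio, each condensed activation stands in for α raw tokens, so a short attention window that holds condensed activations alongside the current tokens can cover an input many times longer than the window itself.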
Activation Beacon excels at long-context language modeling, matching or outperforming other long-context LLMs and fine-tuning-free methods, and it proves effective in diverse real-world applications without compromising the LLM's original capabilities. It also improves efficiency in both training and inference.