Researchers at the Beijing Academy of Artificial Intelligence, the Gaoling School of Artificial Intelligence, and Renmin University of China have developed a new method called Activation Beacon, which extends the context length of large language models (LLMs) like Llama-1 and Llama-2. Activation Beacon condenses raw activations so an LLM can perceive a broad context within its short context window, effectively extending the usable context length, supporting diverse context lengths, and remaining compatible with existing LLMs.
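To make the idea concrete, here is a minimal sketch of how a long input could be processed window by window while condensed "beacon" activations stand in for the raw history. The function names, the mean-pooling used as a stand-in for the learned beacon attention, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch

def condense_window(activations: torch.Tensor, num_beacons: int) -> torch.Tensor:
    """Condense a window of raw activations into `num_beacons` beacon activations.

    In Activation Beacon the condensing is performed by learned beacon tokens that
    attend to the raw activations; the per-chunk mean below is a hypothetical
    stand-in used only to illustrate the shape of the operation.
    """
    chunks = activations.chunk(num_beacons, dim=0)        # split the window into k chunks
    return torch.stack([c.mean(dim=0) for c in chunks])   # one "beacon" activation per chunk

def stream_long_context(hidden: torch.Tensor, window: int = 1024, ratio: int = 8) -> torch.Tensor:
    """Process a long sequence window by window, carrying condensed activations forward.

    `ratio` plays the role of the condensing ratio alpha = L / k.
    """
    condensed = []                                        # beacon activations accumulated so far
    for start in range(0, hidden.shape[0], window):
        raw = hidden[start:start + window]
        # The model would attend over [condensed beacons | raw window] here,
        # so its attention span never exceeds one window plus a short beacon memory.
        k = max(1, raw.shape[0] // ratio)
        condensed.append(condense_window(raw, k))
    return torch.cat(condensed)                           # compact memory of the full context

long_hidden = torch.randn(4096, 64)                       # toy "activations" for a 4k-token input
memory = stream_long_context(long_hidden, window=1024, ratio=8)
print(memory.shape)                                       # torch.Size([512, 64]) -> 8x shorter
```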
Using special tokens called beacons, Activation Beacon condenses the activations of L raw tokens into k beacon tokens, yielding a condensing ratio α = L/k (k ≪ L) that controls how much information each beacon absorbs. The method considers three attention schemes for the beacons, with stepwise expansion proving the most effective. It is a plug-and-play LLM module that introduces long contextual information while preserving the LLM's short-context capabilities. Experimental results confirm that Activation Beacon is an effective, efficient, and low-cost method for extending LLM context length.
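The three attention schemes differ only in which raw tokens each beacon is allowed to attend to: its own segment only (segmentation), everything up to the end of its segment (stepwise expansion), or the whole window (full coverage). The sketch below builds the corresponding boolean masks under that reading of the paper; the function name and tensor layout are assumptions made for illustration.

```python
import torch

def beacon_attention_mask(L: int, k: int, scheme: str = "stepwise") -> torch.Tensor:
    """Build a (k, L) boolean mask: entry [i, j] is True if beacon i may attend to raw token j."""
    alpha = L // k                        # condensing ratio: raw tokens per beacon
    mask = torch.zeros(k, L, dtype=torch.bool)
    for i in range(k):
        if scheme == "segmentation":      # beacon i sees only its own segment
            mask[i, i * alpha:(i + 1) * alpha] = True
        elif scheme == "stepwise":        # beacon i sees everything up to the end of its segment
            mask[i, : (i + 1) * alpha] = True
        elif scheme == "full":            # every beacon sees the whole window
            mask[i, :] = True
    return mask

print(beacon_attention_mask(L=8, k=4, scheme="stepwise").int())
```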
Activation Beacon excels at long-context language modeling, outperforming other long-context LLMs and fine-tuning-free methods across diverse real-world tasks without compromising the LLM's original short-context capabilities, and it does so while remaining efficient in both training and inference.
If you want to learn more, check out the researchers' paper.