What Are Large Language Models (LLMs) in AI?
Large Language Models (LLMs) have foundational capabilities such as commonsense reasoning and coherent language generation. They can also be fine-tuned for specific domains, where they perform well on tasks such as code generation and logical reasoning.
Combining AI Models to Introduce Novel Capabilities
Researchers have asked whether an anchor model can be combined with a domain-specific augmenting model to introduce new capabilities, such as merging one model’s code understanding with another’s language generation. The traditional approach is to further pre-train or fine-tune the anchor model on the data originally used to train the augmenting model. This is often impractical: training a large model is computationally expensive, and the augmenting model’s original data may not be available.
What is Composition to Augment Language Models (CALM)?
To address these training-cost and data limitations, researchers at Google Research and Google DeepMind have introduced a framework called Composition to Augment Language Models (CALM). CALM adds a small set of trainable parameters over the intermediate layer representations of both the augmenting and anchor models. The aim is to learn a fusion of the two models that handles new, complex tasks better than either model alone.
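To make the idea of trainable parameters sitting between two models' layers more concrete, here is a minimal NumPy sketch. It assumes a cross-attention-style fusion with a residual connection, which the article itself does not spell out, and the class name `CompositionLayer`, the weight names, and the toy dimensions are all hypothetical. It illustrates the general pattern and is not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class CompositionLayer:
    """Hypothetical bridge between one augmenting-model layer and one anchor-model layer.

    Only the four matrices below would be trained; both base models are
    assumed to stay unchanged (an assumption of this sketch).
    """

    def __init__(self, d_aug, d_anchor, rng):
        scale = 0.02
        # Maps augmenting-model hidden states into the anchor model's width.
        self.w_proj = rng.normal(0.0, scale, (d_aug, d_anchor))
        # Query, key, and value weights for the cross-attention step.
        self.w_q = rng.normal(0.0, scale, (d_anchor, d_anchor))
        self.w_k = rng.normal(0.0, scale, (d_anchor, d_anchor))
        self.w_v = rng.normal(0.0, scale, (d_anchor, d_anchor))

    def __call__(self, h_anchor, h_aug):
        """Fuse hidden states: h_anchor is (seq_a, d_anchor), h_aug is (seq_b, d_aug)."""
        projected = h_aug @ self.w_proj          # (seq_b, d_anchor)
        q = h_anchor @ self.w_q                  # queries come from the anchor model
        k = projected @ self.w_k                 # keys come from the augmenting model
        v = projected @ self.w_v                 # values come from the augmenting model
        scores = q @ k.T / np.sqrt(q.shape[-1])  # (seq_a, seq_b)
        attn = softmax(scores, axis=-1)
        # Residual connection: the anchor representation is kept and enriched.
        return h_anchor + attn @ v

    def num_params(self):
        """Total number of trainable values in this layer."""
        return sum(w.size for w in (self.w_proj, self.w_q, self.w_k, self.w_v))


# Toy stand-ins for intermediate activations of the two models.
rng = np.random.default_rng(0)
h_anchor = rng.normal(size=(5, 16))  # 5 tokens, anchor width 16
h_aug = rng.normal(size=(5, 8))      # 5 tokens, augmenting width 8

layer = CompositionLayer(d_aug=8, d_anchor=16, rng=rng)
fused = layer(h_anchor, h_aug)       # same shape as h_anchor
```

In this sketch the fused output has the same shape as the anchor model's hidden state, so it could be passed on to the anchor model's next layer in place of the original. The only new parameters are the four small matrices, which is what keeps this kind of composition far cheaper than retraining either model.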