Enhancing Language Models: The Power of Model Composition Unleashed

What Are Large Language Models (LLMs) in AI?

Large Language Models (LLMs) are known for foundational capabilities such as commonsense reasoning and coherent language generation, and they can be fine-tuned for specific tasks. For instance, fine-tuned models excel at code generation, logical reasoning, and other domain-specific work.

Combining AI Models to Introduce Novel Capabilities

Researchers have asked whether an anchor model can be combined with a domain-specific augmenting model to introduce new capabilities, such as merging one model's code understanding with another's language generation. Traditionally, this would mean pre-training or fine-tuning the anchor model on the data used to train the augmenting model. However, that is often impractical due to computational cost, and the augmenting model's training data may not even be available.

What is Composition to Augment Language Models (CALM)?

To address these training and data limitations, researchers at Google Research and Google DeepMind have introduced a framework called Composition to Augment Language Models (CALM). CALM adds a small set of trainable parameters over the intermediate layer representations of the augmenting and anchor models, while keeping both models themselves frozen. The aim is to learn an effective fusion of the two models so that the composition handles new, complex tasks better than either model alone.
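The idea of learning a small fusion module over two frozen models' intermediate representations can be sketched as follows. This is a minimal, single-layer illustration, assuming a learned linear projection into the anchor's hidden width followed by single-head cross-attention with a residual connection; all shapes, variable names, and the single-head simplification are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compose_layer(h_anchor, h_aug, W_proj, W_q, W_k, W_v):
    """Fuse augmenting-model states into anchor-model states at one layer.

    h_anchor: (n, d)      anchor hidden states (used as queries)
    h_aug:    (m, d_aug)  augmenting-model hidden states
    W_proj:   (d_aug, d)  learned projection into the anchor's width
    W_q, W_k, W_v: (d, d) learned attention weights -- in this sketch these
                          are the only trainable parameters; both base
                          models stay frozen.
    """
    h_aug_p = h_aug @ W_proj                         # project to anchor width
    q = h_anchor @ W_q
    k = h_aug_p @ W_k
    v = h_aug_p @ W_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (n, m) cross-attention
    return h_anchor + attn @ v                       # residual add -> (n, d)

# Toy dimensions: 4 anchor tokens, 6 augmenting tokens, widths 8 and 5.
n, m, d, d_aug = 4, 6, 8, 5
out = compose_layer(
    rng.normal(size=(n, d)),
    rng.normal(size=(m, d_aug)),
    rng.normal(size=(d_aug, d)),
    rng.normal(size=(d, d)),
    rng.normal(size=(d, d)),
    rng.normal(size=(d, d)),
)
print(out.shape)  # (4, 8): anchor-shaped states enriched with augmenting info
```

Because only the projection and attention weights would be trained, the cost of composing two models this way is far smaller than re-training or fine-tuning the anchor model itself, which is the practical motivation behind CALM.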

Check out the paper for the full details of this research.

