Advancing AI’s Ability for Compositional Generalization
In the late 1980s, the philosophers and cognitive scientists Jerry Fodor and Zenon Pylyshyn argued that artificial neural networks, the driving force behind artificial intelligence (AI) and machine learning, were incapable of “compositional generalization”: systematically combining known concepts to understand and produce novel combinations. Scientists have since worked to build this capability into neural networks and related technologies, fueling a debate that continues to this day.
Advancements in Compositional Generalizations
Researchers from New York University and Pompeu Fabra University in Spain have made progress in enhancing the ability of AI tools, such as ChatGPT, to make compositional generalizations. They developed a technique called Meta-learning for Compositionality (MLC), which outperforms existing approaches and performs on par with, and sometimes better than, humans. MLC trains neural networks, like those underlying ChatGPT, to improve their compositional generalization skills through practice.
Whereas previous developers either hoped that compositional generalization would emerge from standard training methods or relied on special-purpose architectures, MLC demonstrates that explicitly practicing these skills can unlock new capabilities in AI systems.
Achieving Human-Like Systematic Generalization
Brenden Lake, an assistant professor in NYU’s Center for Data Science and Department of Psychology and a co-author of the research paper, explains: “For 35 years, researchers have debated whether neural networks can achieve human-like systematic generalization. We have shown, for the first time, that a generic neural network can mimic or exceed human systematic generalization in a head-to-head comparison.”
To improve compositional learning in neural networks, the researchers developed MLC, a novel learning procedure in which a neural network is continuously updated over a series of episodes. In each episode, the network receives a new word and is challenged to use it compositionally; for example, given the word “jump,” it must produce new combinations such as “jump twice” or “jump around right twice.” Because every episode features a different word, the network’s compositional skills improve through practice.
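To make the episode structure concrete, here is a minimal, hypothetical sketch in Python of how such training episodes could be generated (the vocabulary, grammar, and function names are invented for illustration; this is not the authors’ actual code or task setup). The key idea is that each episode reassigns word meanings at random, so a network trained across many episodes can only succeed by inferring meanings from the episode’s study examples and composing them:

```python
import random

# Hypothetical MLC-style episode generator (illustrative only, not the
# authors' code). Primitive words get fresh random meanings each episode,
# so a learner must infer meanings from study examples and then apply
# modifiers like "twice" compositionally, rather than memorizing fixed
# word-action pairs.

PRIMITIVES = ["jump", "run", "walk", "look"]   # meanings change per episode
ACTIONS = ["JUMP", "RUN", "WALK", "LOOK"]      # output symbols

def interpret(command, meaning):
    """Compositionally interpret a command under this episode's meanings."""
    tokens = command.split()
    outputs = [meaning[tokens[0]]]
    if "twice" in tokens:
        outputs = outputs * 2
    if "thrice" in tokens:
        outputs = outputs * 3
    return outputs

def make_episode(num_study=3):
    # Fresh random word -> action mapping, valid for this episode only.
    actions = ACTIONS[:]
    random.shuffle(actions)
    meaning = dict(zip(PRIMITIVES, actions))

    # Study examples show how words behave in this episode...
    study = []
    for word in random.sample(PRIMITIVES, num_study):
        cmd = random.choice([word, f"{word} twice"])
        study.append((cmd, interpret(cmd, meaning)))

    # ...and the query demands a novel composition of those words.
    query = f"{random.choice(PRIMITIVES)} thrice"
    return study, (query, interpret(query, meaning))

if __name__ == "__main__":
    study, (query, target) = make_episode()
    print("Study examples:", study)
    print("Query:", query, "-> expected:", target)
```

In an actual MLC-style setup, a sequence-to-sequence network would presumably receive the study examples together with the query as input and be optimized to produce the target output, episode after episode, until inferring and composing novel word meanings becomes a practiced skill.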
Evidence of Success
To gauge the effectiveness of MLC, the researchers ran experiments with human participants on tasks identical to those given to MLC. The participants had to learn the meanings of both real and nonsensical terms defined by the researchers. MLC performed as well as, and sometimes better than, the human participants. Both MLC and the humans also outperformed ChatGPT and GPT-4, which struggled with this learning task despite their impressive general abilities.
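For a sense of what such a task looks like from the learner’s perspective, here is a hypothetical sketch of the kind of few-shot prompt one might give a language model (the pseudowords and their meanings are invented for illustration and are not the study’s exact items):

```python
# Hypothetical few-shot prompt for a compositional learning task.
# "dax" and "wif" are made-up primitives; "fep" is a made-up modifier
# meaning "repeat three times" (all invented here for illustration).

study_examples = [
    ("dax", "RED"),
    ("wif", "GREEN"),
    ("dax fep", "RED RED RED"),
    ("wif fep", "GREEN GREEN GREEN"),
]
query = "dax wif"  # a novel combination never shown in the study examples

prompt = "\n".join(f"{cmd} -> {out}" for cmd, out in study_examples)
prompt += f"\n{query} -> "
print(prompt)

# A compositional learner should answer "RED GREEN", generalizing from the
# study examples rather than treating "dax wif" as an unknown whole.
```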
Marco Baroni, a researcher at Pompeu Fabra University and co-author of the study, notes that “MLC can further improve the compositional skills of large language models,” indicating there is still room to grow the compositional generalization abilities of AI systems like ChatGPT.