Research towards AI models that can generalize, scale, and accelerate science
The International Conference on Learning Representations (ICLR) will kick off next week in Kigali, Rwanda. It is the first major artificial intelligence (AI) conference to be held in Africa and the first in-person ICLR since the start of the pandemic.
Researchers from around the globe will convene to share their cutting-edge work in deep learning, spanning AI, statistics, data science, and applications such as machine vision, gaming, and robotics. DeepMind is proud to support the conference as a Diamond sponsor and champion for diversity, equity, and inclusion.
Open questions on the path to AGI
Although AI has shown remarkable performance in text and image tasks, further research is necessary for systems to generalize across different domains and scales. Achieving this is a vital step towards the development of artificial general intelligence (AGI), which has the potential to bring transformative changes to our daily lives.
We propose a new approach in which AI models learn by solving two related problems simultaneously. Training models to approach a problem from multiple perspectives teaches them to reason about and solve analogous problems, which enhances their ability to generalize. We also explore whether neural networks can generalize by situating them within the Chomsky hierarchy of formal languages. Rigorous testing of 2,200 models on 16 different tasks reveals that certain models struggle to generalize, but that augmenting them with external memory improves their performance.
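To give intuition for why external memory matters, here is a minimal, non-neural sketch. Recognizing balanced parentheses is a context-free task on the Chomsky hierarchy: a finite-state recognizer, whose memory saturates at some fixed depth, fails on deep nesting, while the same recognizer augmented with an external stack succeeds. The function names and saturation depth are illustrative, not from the paper.

```python
def finite_state_balanced(s, max_depth=3):
    """Finite-state approximation: the depth counter saturates at max_depth,
    mimicking a model with only finitely many internal states."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth = min(depth + 1, max_depth)  # memory saturates here
        elif ch == ")":
            if depth == 0:
                return False
            depth -= 1
    return depth == 0

def stack_balanced(s):
    """With an external stack, nesting depth is unbounded, so the
    recognizer handles the full context-free language."""
    stack = []
    for ch in s:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False
            stack.pop()
    return not stack
```

Both recognizers agree on shallow inputs; only the stack-augmented one stays correct as nesting exceeds the finite-state model's capacity.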
Another challenge we address is how to make progress on long-term tasks that offer sparse rewards. We introduced a new approach and a publicly available dataset to train models to explore in a manner similar to humans, even over extended periods of time.
As AI capabilities advance, it is crucial to ensure that current methods function as intended and effectively in real-world scenarios. For instance, although language models can generate impressive responses, many lack the ability to explain their reasoning. To address this, we present a method that leverages the logical structure underlying language models to solve multi-step reasoning problems. This allows for explanations that can be comprehended and verified by human users.
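The idea of exposing a verifiable logical structure can be sketched without any language model at all. In this hedged toy (not DeepMind's method; all names are hypothetical), rules are (premise, conclusion) pairs, and each reasoning step derives exactly one new fact while recording which rule fired, so a human can check the chain step by step.

```python
def reason(facts, rules, goal):
    """Forward-chain over (premise, conclusion) rules, recording each
    derivation so the full reasoning trace can be inspected."""
    facts = set(facts)
    trace = []                      # human-verifiable list of steps
    changed = True
    while changed and goal not in facts:
        changed = False
        for premise, conclusion in rules:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append((premise, conclusion))
                changed = True
    return goal in facts, trace
```

Because every derived fact is paired with the rule that produced it, the output is not just an answer but an explanation a user can audit.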
Adversarial attacks, by contrast, test the limits of AI models by attempting to elicit incorrect or harmful outputs. Training models on adversarial examples makes them more robust to attacks but can degrade their performance on regular inputs. To navigate this tradeoff, we demonstrate that adding adapters yields models that let us control the tradeoff in real time.
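The knob-like control can be sketched in one line. In this hedged toy (not the paper's architecture; the linear model and parameter names are invented for illustration), a clean model's weights are combined with an adapter's contribution scaled by alpha: alpha = 0 recovers the clean model, alpha = 1 the fully robustness-tuned one, and intermediate values interpolate at inference time without retraining.

```python
def blended_logit(x, w_clean, w_adapter, alpha):
    """Toy linear model: the adapter's weight delta is scaled by alpha,
    a runtime knob trading clean accuracy against robustness."""
    return x * (w_clean + alpha * w_adapter)
```

Because alpha only scales a stored weight delta, it can be changed per request, which is what makes the tradeoff controllable in real time.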
Reinforcement learning (RL) has proven successful in addressing various real-world challenges. However, RL algorithms typically struggle to generalize to new tasks beyond their training scope. We propose algorithm distillation as a way for a single model to generalize efficiently to new tasks: a transformer is trained to imitate the learning histories of RL algorithms across diverse tasks. Additionally, RL models learn by trial and error, which can be computationally intensive and time-consuming. We outline a novel approach that achieves human-level performance across 57 Atari games using significantly less data, substantially reducing computing and energy costs.
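Algorithm distillation's data pipeline can be sketched in miniature. In this hedged illustration (not the paper's code), each learning history is a sequence of (observation, action, reward) triples from one training run, and a frequency-count predictor stands in for the causal transformer that would be trained to imitate the next action given the history.

```python
from collections import Counter, defaultdict

def distill(histories):
    """Imitate the source algorithm's behavior: for each observation,
    pick the action it chose most often across all learning histories.
    A stand-in for training a sequence model on the concatenated runs."""
    counts = defaultdict(Counter)
    for history in histories:               # one history per task/run
        for obs, action, reward in history:
            counts[obs][action] += 1
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}
```

The real method conditions on the entire history rather than a single observation, which is what lets the distilled transformer reproduce the source algorithm's *improvement* over time, not just its final policy.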
AI for science
AI offers powerful tools for researchers to analyze complex data and gain insights into the world around us. Several papers presented at the conference highlight how AI is accelerating scientific progress and how science, in turn, is advancing AI.
Accurately predicting a molecule’s properties based on its 3D structure is critical for drug discovery. Our researchers have developed a denoising method that achieves state-of-the-art performance in molecular property prediction. This method enables large-scale pre-training and generalization across different biological datasets. Furthermore, we introduce a transformer that can improve the accuracy of quantum chemistry calculations using only atomic positions.
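The denoising setup can be sketched as a self-supervised data transform. In this hedged toy (illustrative only; function names and the noise scale are invented), each atom's 3D coordinates are perturbed with Gaussian noise, and the regression target is the noise itself, so pre-training needs no property labels.

```python
import random

def make_denoising_example(positions, sigma=0.1, rng=None):
    """positions: list of (x, y, z) atom coordinates.
    Returns (noisy_positions, noise): the model sees the noisy structure
    and is trained to predict the per-atom noise that was added."""
    rng = rng or random.Random(0)
    noise = [tuple(rng.gauss(0.0, sigma) for _ in p) for p in positions]
    noisy = [tuple(x + n for x, n in zip(p, nv))
             for p, nv in zip(positions, noise)]
    return noisy, noise
```

Since the target is constructed from the input, examples can be generated at scale from any collection of 3D structures, which is what enables large-scale pre-training across datasets.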
Lastly, our team has developed FIGnet, a physics-inspired simulator for modeling collisions between complex shapes, such as teapots and doughnuts. This simulator has potential applications in robotics, graphics, and mechanical design.
For a complete list of DeepMind papers and the event schedule at ICLR 2023, visit the DeepMind website.