Understanding the Physical World: Assessing AI’s Ability to Grasp Physics
The ability to comprehend the physical world is essential for artificial intelligence (AI) systems to function effectively. However, measuring AI’s understanding of physics poses a significant challenge. To address this, we turned to developmental psychologists who have studied what infants know about the physical world. Their insights have been instrumental in creating a concrete framework of physical concepts.
In our recent publication in Nature Human Behaviour, we introduced the Physical Concepts dataset, an open-source synthetic video dataset. It is built on the violation-of-expectation (VoE) paradigm and assesses five key physical concepts: solidity, object persistence, continuity, “unchangeableness,” and directional inertia. With this benchmark, we aimed to evaluate AI’s knowledge of the physical world.
Drawing inspiration from the work of developmental psychologists, we developed PLATO (Physics Learning through Auto-encoding and Tracking Objects). PLATO represents and reasons about the world as a collection of objects. It predicts the future positions of objects based on their past locations and their interactions with other objects.
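To make the combination of object-based prediction and VoE testing more concrete, here is a minimal sketch in Python. It is not PLATO: a hand-written constant-velocity rule stands in for a learned dynamics model, and the sketch assumes that surprise is measured as prediction error on object positions. All function names and trajectories are hypothetical and chosen only for illustration.

```python
# Toy illustration only: a constant-velocity rule stands in for a learned
# dynamics model. None of these names come from PLATO or its dataset.

def predict_next(prev, curr):
    """Extrapolate the next (x, y) position assuming constant velocity."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def surprise(trajectory):
    """Sum of squared prediction errors over one object's trajectory."""
    total = 0.0
    for t in range(2, len(trajectory)):
        px, py = predict_next(trajectory[t - 2], trajectory[t - 1])
        ax, ay = trajectory[t]
        total += (px - ax) ** 2 + (py - ay) ** 2
    return total


def voe_effect(possible, impossible):
    """Positive when the impossible video is more surprising than the possible one."""
    return surprise(impossible) - surprise(possible)


# A ball moving right at constant speed (physically possible).
possible = [(float(t), 0.0) for t in range(6)]

# The same ball, but it jumps ahead between frames 2 and 3,
# which violates continuity.
impossible = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
              (6.0, 0.0), (7.0, 0.0), (8.0, 0.0)]

print(voe_effect(possible, impossible))
```

In this toy setup, a model counts as passing a probe when the effect is positive, meaning the impossible video surprised it more than the matched possible one.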
After training PLATO on videos depicting simple physical interactions, we found that it passed the tests in our Physical Concepts dataset. We also trained “flat” models that were as large as PLATO, or even larger, but lacked object-based representations. These models did not perform as well on the tests, which indicates that objects are important for learning intuitive physics.
We also explored how much training is required to develop this understanding. Infants as young as two and a half months show evidence of physical knowledge, so we wanted to see how much experience PLATO would need. By varying the amount of training data, we discovered that PLATO could learn our physical concepts with just 28 hours of visual experience. Although our dataset is limited and synthetic, this finding suggests that intuitive physics can be acquired with relatively little experience, as long as there is an inductive bias for representing the world as objects.
Additionally, we tested PLATO’s ability to generalize. In the Physical Concepts dataset, all the objects in the test set were also present in the training set. To challenge PLATO further, we used a subset of another synthetic dataset created by MIT researchers. This dataset featured new objects and visual appearances that PLATO had never encountered before. Surprisingly, PLATO performed well on these tests without any re-training, highlighting its ability to generalize to new stimuli.
We believe that the Physical Concepts dataset will be invaluable for researchers seeking a comprehensive understanding of their models’ grasp of the physical world. In the future, this dataset can be expanded to include additional physical concepts and more diverse visual stimuli, such as new object shapes or real-world videos.