Curiosity-driven exploration is a strategy in which an AI agent actively seeks out new information to better understand its surroundings. The agent learns a model of the world that predicts future events from past ones, and it treats the discrepancy between predicted and actual events as a reward for exploring and gathering new information. That new information, in turn, is used to improve the accuracy of the world model.
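The prediction-error-as-reward loop above can be sketched in a few lines. This is a minimal toy illustration, not the actual agent: it assumes a simple linear world model and uses the squared prediction error on each transition as the intrinsic reward, then takes a gradient step so the same transition surprises the agent less next time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative names): a linear model predicts the next
# observation from the current one.
obs_dim = 4
W = rng.normal(scale=0.1, size=(obs_dim, obs_dim))  # world-model weights
lr = 0.01

def intrinsic_reward_and_update(obs, next_obs):
    global W
    pred = W @ obs                # model's predicted next observation
    err = next_obs - pred         # prediction error
    reward = float(err @ err)     # intrinsic reward = squared error
    # Improve the model on this transition (one gradient step on ||err||^2),
    # so familiar transitions yield less and less reward over time.
    W += lr * np.outer(err, obs)
    return reward
```

Calling the function repeatedly on the same transition yields a shrinking reward, which is the key property: novelty pays, and it pays less once the world model has absorbed it.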
Building on the successful “bootstrap your own latent” (BYOL) approach used in computer vision, graph representation learning, and representation learning in RL, we introduce BYOL-Explore: a simple yet effective AI agent that uses curiosity-driven exploration to solve challenging tasks. BYOL-Explore learns a representation of the world by predicting its own future representations, and it uses the resulting prediction errors as intrinsic rewards to train its exploration policy. It therefore learns a world representation, world dynamics, and an exploration policy simultaneously, all by optimizing a single prediction-error objective.
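A BYOL-style prediction target can be sketched as follows. This is a hypothetical linear toy, not the paper's architecture: an online encoder plus a predictor tries to match the representation produced by a slowly moving target encoder (an exponential moving average of the online one), and the mismatch serves as the intrinsic reward.

```python
import numpy as np

rng = np.random.default_rng(1)
obs_dim, rep_dim = 8, 4

# Illustrative linear networks; all names are assumptions for this sketch.
online_enc = rng.normal(scale=0.1, size=(rep_dim, obs_dim))
target_enc = online_enc.copy()       # target network starts as a copy
predictor = rng.normal(scale=0.1, size=(rep_dim, rep_dim))
lr, ema = 0.05, 0.99

def byol_explore_step(obs, next_obs):
    """One training step; returns the intrinsic reward for this transition."""
    global online_enc, predictor, target_enc
    z = online_enc @ obs              # online representation of current obs
    pred = predictor @ z              # predicted representation of next obs
    target = target_enc @ next_obs    # target representation (treated as fixed)
    err = target - pred
    reward = float(err @ err)         # prediction error = intrinsic reward
    # Gradient step on predictor and online encoder (loss = ||err||^2).
    predictor += lr * np.outer(err, z)
    online_enc += lr * np.outer(predictor.T @ err, obs)
    # Update the target network as an exponential moving average.
    target_enc = ema * target_enc + (1 - ema) * online_enc
    return reward
```

The same loss both shapes the representation and generates the exploration signal, which is what lets a single objective drive representation, dynamics, and policy learning together.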
Despite its simplicity, BYOL-Explore outperforms other curiosity-driven exploration methods such as Random Network Distillation (RND) and the Intrinsic Curiosity Module (ICM) on challenging 3D exploration tasks. It does so with a single network trained across all tasks, whereas prior methods made progress only when given expert demonstrations.
Furthermore, BYOL-Explore achieves superhuman performance on the ten hardest-exploration Atari games, outperforming agents such as Agent57 and Go-Explore while using a simpler design.
Looking ahead, BYOL-Explore could be generalized to highly stochastic environments by learning a probabilistic world model. This would let the agent anticipate stochastic events and plan its exploration accordingly.