Artificial intelligence (AI) and machine learning (ML) are increasingly used to build decision-making agents for a wide range of tasks. Training such agents remains difficult, however, because environments vary widely and reward functions are hard to specify. To address these issues, a team from Google Research proposed a Universal Policy (UniPi) in the paper “Learning Universal Policies via Text-Guided Video Generation.” UniPi uses text as a universal interface for describing tasks and video as a universal interface for communicating the actions to be taken: given a text instruction, it generates a video of the task being carried out, and control actions are then inferred from the generated frames. The approach has four components: trajectory consistency through tiling, hierarchical planning, flexible behavior modulation, and task-specific action adaptation. By building on text-guided video generation, UniPi supports combinatorial generalization, multi-task learning, and real-world transfer. The researchers evaluated UniPi on several language-conditioned tasks and found that it generalizes well relative to the baselines it was compared against. The work highlights the potential of generative models and abundant data for building versatile decision-making systems.
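The text-to-video-to-action flow summarized above can be illustrated with a toy sketch. This is not the UniPi implementation: the "video generator" here is plain linear interpolation rather than a learned generative model, a "frame" is just an (x, y) position, and every name (`GOALS`, `generate_video`, `inverse_dynamics`, `plan_actions`, `rollout`) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Frame = Tuple[float, float]   # toy "image": only the (x, y) position of an object
Action = Tuple[float, float]  # toy action: displacement applied to the object

# Hypothetical lookup table standing in for language understanding.
GOALS: Dict[str, Frame] = {
    "move to the top right": (1.0, 1.0),
    "move to the bottom left": (0.0, 0.0),
}


def generate_video(first_frame: Frame, text: str, num_frames: int = 5) -> List[Frame]:
    """Stand-in for a text-conditioned video generator.

    Assumption: we simply interpolate from the first frame to the goal
    named by the text. A real system would use a learned model.
    """
    goal = GOALS[text]
    steps = num_frames - 1
    return [
        (
            first_frame[0] + (goal[0] - first_frame[0]) * t / steps,
            first_frame[1] + (goal[1] - first_frame[1]) * t / steps,
        )
        for t in range(num_frames)
    ]


def inverse_dynamics(frame: Frame, next_frame: Frame) -> Action:
    """Stand-in for task-specific action adaptation.

    Recovers the action that turns one frame into the next; in this toy
    world that is just the difference between the two positions.
    """
    return (next_frame[0] - frame[0], next_frame[1] - frame[1])


def plan_actions(first_frame: Frame, text: str, num_frames: int = 5) -> List[Action]:
    """Text -> video plan -> one action per pair of consecutive frames."""
    video = generate_video(first_frame, text, num_frames)
    return [inverse_dynamics(a, b) for a, b in zip(video, video[1:])]


def rollout(state: Frame, actions: List[Action]) -> Frame:
    """Apply the actions in the toy environment and return the final state."""
    for dx, dy in actions:
        state = (state[0] + dx, state[1] + dy)
    return state


if __name__ == "__main__":
    actions = plan_actions((0.0, 0.5), "move to the top right")
    print(actions)
    print(rollout((0.0, 0.5), actions))
```

The point of the split is that the planner is expressed entirely in terms of text and frames, while only `inverse_dynamics` depends on a particular environment's action space. Under that assumption, the same video plan could be reused across environments by swapping in a different action-extraction step.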