Introducing a Framework for Creating AI Agents That Understand Human Instructions and Perform Actions
Human behavior is complex, and interpreting human instructions is difficult: even simple requests require a deep understanding of intent and language. AI researchers have therefore focused on learning these interactions from data rather than hand-writing rules. To explore this approach, we created a research framework set in a video game environment.
Today, we are announcing the publication of a paper and collection of videos showcasing our progress in building video game AIs that can understand human concepts and interact with people. Instead of optimizing game scores, we ask people to invent tasks and judge progress themselves. This approach allows us to improve agent behavior through interactions with humans.
Our framework begins with human-human interactions in a video game world. Agents learn from these interactions through imitation learning and human feedback. We built a simple video game world called “the playhouse” where humans and agents can collaborate on various activities. This environment provided a safe setting for interactions and allowed us to collect large amounts of data.
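To make the imitation step concrete, here is a minimal, hypothetical sketch of behavioral cloning over recorded playhouse episodes. It assumes each demonstration pairs an (observation, instruction) context with the action the human took, and builds a policy that replays the most frequent demonstrated action; the data format and names (`clone_policy`, the toy contexts) are illustrative, not the framework's actual API.

```python
from collections import Counter, defaultdict

def clone_policy(demonstrations):
    """Build a lookup policy from (context, action) demonstration pairs.

    For each context, the cloned policy predicts the action humans
    chose most often in that context.
    """
    counts = defaultdict(Counter)
    for context, action in demonstrations:
        counts[context][action] += 1
    return {ctx: ctr.most_common(1)[0][0] for ctx, ctr in counts.items()}

# Toy demonstrations: humans mostly grasp when told to lift the duck.
demos = [
    (("near_table", "lift the duck"), "grasp"),
    (("near_table", "lift the duck"), "grasp"),
    (("near_table", "lift the duck"), "look_around"),
    (("at_door", "go to the bedroom"), "move_forward"),
]
policy = clone_policy(demos)
print(policy[("near_table", "lift the duck")])  # "grasp"
```

Real agents use neural policies over pixels and text rather than a lookup table, but the objective is the same: predict the demonstrator's action given the context.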
The agents we trained are capable of a wide range of tasks, some of which we did not anticipate. This is because language allows for endless possibilities, and humans invented tasks on the fly during interactions. We used three steps to create our AI agents: imitation learning, training a reward model from human feedback, and optimizing agent behavior with reinforcement learning against that reward model.
Imitation learning allowed agents to reproduce basic human interactions, but learning efficient, goal-directed behavior required reinforcement learning: agents tried different actions, and those that improved performance were reinforced. To provide the training signal, we trained a reward model on human judgments of whether an agent's behavior made progress toward the goal or constituted an error.
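The reward-model step can be sketched as a tiny classifier fit to binary human judgments ("made progress" = 1, "error" = 0). The perceptron below is a hypothetical stand-in for the learned model, and the two-element feature vectors are invented for illustration; the real model scores rich observations, not hand-built features.

```python
def train_reward_model(examples, epochs=20, lr=0.1):
    """Fit a linear scorer to (feature_vector, human_label) pairs."""
    n = len(examples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, label in examples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score > 0 else 0
            err = label - pred          # perceptron update on mistakes only
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def reward(w, b, x):
    """Score a behavior snippet; higher means more goal progress."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy judgments: feature[0] = moved toward goal, feature[1] = dropped object.
data = [([1, 0], 1), ([1, 0], 1), ([0, 1], 0), ([0, 1], 0)]
w, b = train_reward_model(data)
print(reward(w, b, [1, 0]) > reward(w, b, [0, 1]))  # True
```

The essential property is only that behaviors humans judged as progress score higher than behaviors they judged as errors, so the score can serve as an RL reward.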
Once the reward model was trained, we used it to optimize agent behavior with RL: agents answered questions and followed instructions in the simulator, and the reward model scored their behavior. We also explored two ways of generating the task instructions and questions: reusing those from the human dataset, and training agents to set tasks by mimicking how humans posed them.
To evaluate our agents, we used several mechanisms, including hand-scripted tests and offline human scoring, and we also had people interact with the agents in real time and judge their performance.
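A hand-scripted test can be sketched as a probe that issues a fixed instruction, runs the agent for a step budget, and checks a success predicate on the final simulator state. The toy environment, state dictionary, and `scripted_agent` below are invented stand-ins for the real simulator and policy.

```python
def run_probe(agent, instruction, success_check, max_steps=10):
    """Run one scripted evaluation episode; return True on success."""
    state = {"holding": None, "room": "living_room"}
    for _ in range(max_steps):
        action = agent(state, instruction)
        # Minimal toy dynamics for the two actions the stub agent emits.
        if action == "grasp_duck":
            state["holding"] = "duck"
        elif action == "go_bedroom":
            state["room"] = "bedroom"
        if success_check(state):
            return True
    return False

def scripted_agent(state, instruction):
    return "grasp_duck" if "duck" in instruction else "go_bedroom"

ok = run_probe(scripted_agent, "pick up the duck",
               lambda s: s["holding"] == "duck")
print(ok)  # True
```

Scripted probes like this are cheap and repeatable, which is why they complement, rather than replace, live human judgments of open-ended behavior.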
In conclusion, our framework allows us to create AI agents that can understand human instructions and perform actions. We have made significant progress in training video game AIs that can interact with humans. The possibilities for AI in open-ended settings are endless, and we will continue to improve our agents using this framework. Stay tuned for more updates on our research and advancements in the field of AI.