Empowering users to teach robots new tricks and tasks is essential for integrating robots into real-world applications. Recent advances in large language models (LLMs) have brought progress in this area: LLMs let users instruct robots in natural language. However, they struggle to translate language directly into low-level robot commands.
To solve this problem, we have developed a language-to-reward system that bridges the gap between language and robot actions. We use reward functions as an interface: user instructions are translated into code that specifies the desired behavior. We then use optimization techniques to find the robot actions that maximize the generated reward function.
Our language-to-reward system consists of two main components: the Reward Translator and the Motion Controller. The Reward Translator maps user instructions to reward functions represented as Python code. The Motion Controller optimizes the reward function to find the optimal robot actions.
To make the system more reliable, we split the Reward Translator into two sub-modules: the Motion Descriptor and the Reward Coder. The Motion Descriptor interprets user input and expands it into a detailed description of the desired robot motion. This makes the reward coding task more stable and provides a more interpretable interface for users. The Reward Coder then translates the motion description into a reward function written in Python.
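To make the Reward Coder's output concrete, here is a minimal sketch of what a generated reward function might look like for a motion description such as "keep the torso 0.3 m above the ground and level". The `RobotState` fields, the `reward` signature, the target values, and the weight are all hypothetical names chosen for illustration; they are not the system's actual interface.

```python
from dataclasses import dataclass


@dataclass
class RobotState:
    """Hypothetical observation fields used only for this illustration."""
    torso_height: float  # meters
    torso_pitch: float   # radians


def reward(state: RobotState,
           target_height: float = 0.3,
           target_pitch: float = 0.0,
           pitch_weight: float = 0.5) -> float:
    """Return a reward that peaks at 0 when the torso matches both targets.

    Each term is a negative squared error, so any deviation from the
    described motion lowers the reward.
    """
    height_error = state.torso_height - target_height
    pitch_error = state.torso_pitch - target_pitch
    return -(height_error ** 2) - pitch_weight * (pitch_error ** 2)
```

Writing the reward as a sum of weighted error terms keeps it readable: each term corresponds to one sentence of the motion description, so a user can check the code against the description.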
Once we have the reward function, the Motion Controller synthesizes a controller that maps robot observations to low-level robot actions. We use different strategies to solve this problem, including reinforcement learning and model predictive control.
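As a rough illustration of how a controller can be obtained by optimizing a reward, the sketch below uses random-shooting model predictive control on a toy one-dimensional system. This is a simplified stand-in, not the system's actual optimizer: the dynamics in `step`, the toy `reward`, and the parameters `horizon` and `num_samples` are all assumptions made for this example.

```python
import random


def step(position: float, action: float, dt: float = 0.1) -> float:
    """Toy dynamics (assumed): the action is a velocity command."""
    return position + action * dt


def reward(position: float, target: float = 1.0) -> float:
    """Toy reward (assumed): negative squared distance to the target."""
    return -(position - target) ** 2


def plan_action(position: float, horizon: int = 10,
                num_samples: int = 200, rng=None) -> float:
    """Random-shooting MPC: sample action sequences, simulate each one,
    and return the first action of the highest-return sequence."""
    rng = rng or random.Random(0)
    best_return, best_first_action = float("-inf"), 0.0
    for _ in range(num_samples):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        pos, total = position, 0.0
        for a in actions:
            pos = step(pos, a)
            total += reward(pos)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action


def run_controller(start: float = 0.0, num_steps: int = 50,
                   seed: int = 0) -> float:
    """Replan at every step and apply only the first planned action."""
    rng = random.Random(seed)
    position = start
    for _ in range(num_steps):
        position = step(position, plan_action(position, rng=rng))
    return position
```

Calling `run_controller()` drives the position from 0.0 toward the target of 1.0. Because the controller replans at every step from the latest observation, it only needs the reward function and a model of the dynamics, not a separately trained policy.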
We have tested our system on various robotic control tasks in simulation and on a physical robot manipulator. Our system allows users to teach robots new skills and tasks with concise natural language instructions. It is especially useful for tasks that go beyond pre-designed primitives.
In conclusion, our language-to-reward system enables users to teach robots novel actions through natural language input. It bridges the gap between language and low-level robot actions using reward functions as an interface. With this system, users can easily teach robots new tasks and skills, opening up possibilities for real-world applications.