A robotic ‘chef’ has been trained by researchers at the University of Cambridge to learn from cooking videos and recreate the dishes.
The team programmed the robotic chef with a ‘cookbook’ of eight simple salad recipes. By watching a video of a human preparing one of the recipes, the robot could identify which recipe was being made and replicate it.
The videos also helped the robot expand its cookbook. At the end of the experiment, the robot even came up with its own ninth recipe. The findings, published in the journal IEEE Access, demonstrate the potential of using video content as a valuable source of data for automated food production. This could ultimately lead to the easier and cheaper deployment of robot chefs.
Although robotic chefs have been featured in science fiction for years, cooking remains a challenging task for robots in reality. While several commercial companies have developed prototype robot chefs, none of them are currently available on the market. They still have a long way to go to match the skills of human chefs.
Humans can learn new recipes through observation, such as by watching cooking videos on YouTube, but teaching a robot to make a variety of dishes is expensive and time-consuming.
“We wanted to teach our robot chef to learn in the same incremental way that humans do — by identifying the ingredients and how they come together in a dish,” explained Grzegorz Sochacki, the paper’s first author and a PhD candidate at Cambridge’s Department of Engineering.
The Technique Behind Teaching the Robot Chef
To train their robot chef, Sochacki and his colleagues created eight simple salad recipes and filmed themselves making them. They then used a publicly available neural network to train the robot. The neural network had already been trained to recognize different objects, including the fruits and vegetables used in the salad recipes (broccoli, carrot, apple, banana, and orange).
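The paper does not name the specific detector the team used, but standard networks pretrained on the COCO dataset happen to cover all five ingredients as well as the knife. A minimal sketch of this detection step, assuming a COCO-pretrained model from torchvision (the frame filename and confidence threshold are illustrative, not from the paper):

```python
# A rough sketch of per-frame object detection, assuming a COCO-pretrained
# detector from torchvision (the paper does not specify its network).
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor
from PIL import Image

# COCO category ids for the objects relevant here (91-class COCO indexing)
SALAD_CLASSES = {49: "knife", 52: "banana", 53: "apple",
                 55: "orange", 56: "broccoli", 57: "carrot"}

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def detect_objects(frame_path, threshold=0.7):
    """Return the salad-relevant objects visible in one video frame."""
    frame = to_tensor(Image.open(frame_path).convert("RGB"))
    with torch.no_grad():
        output = model([frame])[0]
    return [SALAD_CLASSES[label.item()]
            for label, score in zip(output["labels"], output["scores"])
            if score > threshold and label.item() in SALAD_CLASSES]

# 'frame_0042.jpg' is a hypothetical filename for one extracted video frame
print(detect_objects("frame_0042.jpg"))  # e.g. ['knife', 'carrot']
```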
The robot analyzed each frame of the video using computer vision techniques, identifying the objects and features in view (such as a knife and the ingredients) as well as the human demonstrator’s arms, hands, and face. Both the recipes and the videos were converted to vectors, and the robot performed mathematical operations on these vectors to measure how closely a demonstration matched each recipe in its cookbook.
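As a rough illustration of this matching step, the sketch below assumes recipes are encoded as ingredient-count vectors and compared with cosine similarity; the recipe contents and the similarity measure are assumptions for clarity, not details from the paper:

```python
# An illustrative sketch of vector-based recipe matching: recipes become
# ingredient-count vectors, and the closest one (by cosine similarity) wins.
import numpy as np

INGREDIENTS = ["apple", "banana", "broccoli", "carrot", "orange"]

def to_vector(counts):
    """Encode ingredient counts in a fixed order."""
    return np.array([counts.get(i, 0) for i in INGREDIENTS], dtype=float)

def cosine_similarity(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical recipes; the paper's eight salads are not listed here
cookbook = {
    "apple-carrot salad": to_vector({"apple": 2, "carrot": 2}),
    "fruit salad": to_vector({"banana": 1, "orange": 1, "apple": 1}),
}

# Counts accumulated from the video's per-frame detections
observed = to_vector({"apple": 3, "carrot": 3})

best = max(cookbook, key=lambda name: cosine_similarity(cookbook[name], observed))
print(best)  # 'apple-carrot salad'
```

Because cosine similarity compares the direction of two vectors rather than their magnitude, a larger portion of the same ingredients still yields the same best match, which anticipates the scale-invariance behavior reported below.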
By accurately identifying the ingredients and actions of the human chef, the robot could determine which recipe was being prepared. For example, if the human demonstrator held a knife in one hand and a carrot in the other, the robot inferred that the carrot would be chopped.
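A toy version of such an inference rule might look like the following; the function and its logic are invented here for clarity and are not taken from the paper:

```python
# A hypothetical co-occurrence rule: a knife held together with an
# ingredient suggests that ingredient is about to be chopped.
def infer_action(objects_in_hands):
    """Guess the demonstrator's action from the objects they are holding."""
    if "knife" in objects_in_hands:
        targets = [obj for obj in objects_in_hands if obj != "knife"]
        if targets:
            return f"chop {targets[0]}"
    return "unknown"

print(infer_action(["knife", "carrot"]))  # 'chop carrot'
```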
Impressive Results and Future Potential
Out of the 16 videos it watched, the robot successfully recognized the correct recipe 93% of the time, even though it only detected 83% of the human chef’s actions. The robot also recognized slight variations within a recipe, such as a double portion or normal human error, as variations rather than new recipes. Moreover, the robot identified a new, ninth salad recipe, added it to its cookbook, and successfully made it.
Sochacki stated, “It’s remarkable how well the robot could pick up on the details. While these recipes are not complex — they mainly consist of chopped fruits and vegetables — the robot was surprisingly effective at recognizing that two chopped apples and two chopped carrots are the same recipe as three chopped apples and three chopped carrots.”
Limitations and Future Possibilities
The videos used to train the robot chef differed from popular food videos on social media, which often feature fast cuts and visual effects and shift rapidly between the person preparing the food and the dish being created. The robot also needed a clear, unobstructed view of each ingredient: if the human demonstrator held a carrot with their hand wrapped around it, for example, the robot would struggle to identify it. To work around this, the demonstrator had to hold up the carrot so the robot could see the entire vegetable.
Sochacki added, “Our robot isn’t interested in the type of food videos that go viral on social media — those are simply too difficult to follow. However, as robot chefs improve and become faster at identifying ingredients in food videos, they may be able to use platforms like YouTube to learn a wide range of recipes.”
The research was partly supported by Beko plc and the Engineering and Physical Sciences Research Council (EPSRC), which is a part of UK Research and Innovation (UKRI).