AI-powered humanoid robot can serve you food, stack the dishes — and have a conversation with you

Artificial intelligence robot (left) hands an apple to a man (right).

A self-correcting humanoid robot that learned to make a cup of coffee just by watching footage of a human doing it can now answer questions thanks to an integration with OpenAI's technology.

In the new promotional video, a technician asks Figure 01 to perform a range of simple tasks in a minimalist test environment resembling a kitchen. He first asks the robot for something to eat and is handed an apple. Next, he asks Figure 01 to explain why it handed him an apple while it is picking up some trash. The robot answers all the questions in a robotic but friendly voice.

Related: Watch scientists control a robot with their hands while wearing the Apple Vision Pro

The company said in its video that the conversation is powered by an integration with technology made by OpenAI — the name behind ChatGPT. It's unlikely that Figure 01 is using ChatGPT itself, however, because that AI tool does not normally use pause words like "um," which this robot does.

Should everything in the video work as claimed, it would mark advances in two key areas of robotics. As experts previously told Live Science, the first is the mechanical engineering behind dexterous, self-correcting movements like those people perform. That means very precise motors, actuators and grippers inspired by joints and muscles, as well as the motor control to manipulate them, carry out a task and hold objects delicately.

Even picking up a cup, something people barely think about consciously, requires intensive onboard processing to drive the actuators in a precise sequence.

RELATED STORIES

This video of a robot making coffee could signal a huge step in the future of AI robotics. Why?

Human-like robot tricks people into thinking it has a mind of its own

Robot hand exceptionally 'human-like' thanks to new 3D printing technique

The second advancement is real-time natural language processing (NLP), the field of computer science that aims to give machines the capacity to understand and produce language, thanks to the addition of OpenAI's engine. The engine needs to be as immediate and responsive as ChatGPT is when you type a query into it, and it also needs text-to-speech software to turn the generated text into spoken audio.
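The pipeline described above, speech in, a language model in the middle, speech out, can be sketched in a few lines. This is a toy illustration with stubbed-out stages; none of the function names or canned responses come from Figure or OpenAI, and a real system would replace each stub with calls to actual speech and language models:

```python
# A minimal sketch of the listen-reason-speak loop.
# All bodies are stubs: a real robot would call a speech-to-text model,
# a large language model and a text-to-speech engine at these steps.

def transcribe(audio: bytes) -> str:
    """Speech-to-text stub; a real system would run a recognition model here."""
    return "Can I have something to eat?"

def reason(prompt: str) -> str:
    """Language-model stub; a real system would query an LLM here."""
    canned = {"Can I have something to eat?": "Sure thing."}
    return canned.get(prompt, "I'm not sure.")

def speak(text: str) -> bytes:
    """Text-to-speech stub; a real system would synthesize audio samples."""
    return text.encode("utf-8")

def interaction_loop(audio_in: bytes) -> bytes:
    """One full conversational turn: hear a question, generate a reply, voice it."""
    return speak(reason(transcribe(audio_in)))
```

Each of the three stages adds latency, which is why making the whole round trip feel as conversational as it does in the video is a genuine engineering challenge.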

Although the footage appears impressive, Live Science remains skeptical so far. Listen at 0:52 and again at 1:49, when Figure 01 starts a sentence with a quick "uh" and repeats the word "I," just like a human taking a split second to order their thoughts before speaking. Why (and how) would an AI speech engine include such random, humanlike tics of diction? The inflection, too, is suspiciously imperfect, closely matching the natural, unconscious cadence of human speech.

We suspect the demonstration may have been pre-recorded to showcase what Figure Robotics is working on rather than being a live field test. But if, as the video caption claims, everything really is the result of a neural network and really shows Figure 01 responding in real time, we've just taken another giant leap toward the future.