Robotics

Figure's humanoids start doing tasks they weren't trained for
Having never seen these kitchen items before, Helix-powered Figure 02 robots were able to put away random groceries when simply asked, "Can you put these away?"

Only weeks after Figure.ai announced the end of its collaboration deal with OpenAI, the Silicon Valley startup has unveiled Helix – a commercially ready AI "hive-mind" that lets its humanoid robots do almost anything you tell them to.

Figure has made headlines in the past with its Figure 01 humanoid robot. The company is now on version 02 of its premier robot, and it's received more than just a few design changes: it's been given an entirely new AI brain, a model called Helix.

It's not just any ordinary AI, either. Helix is the first generalist Vision-Language-Action (VLA) model ever put into a humanoid robot. The key word is "generalist": it can see the world around it, understand natural language, act in the real world, and learn just about anything.

"Here, have some cookies." If Short Circuit had been made this century, I imagine Johnny Five would look more like these guys

Unlike most AI models, which can require thousands of hours of training data or PhD-level manual programming for every single new behavior, Helix combines the semantic knowledge of a language model with a vision language model (VLM) and translates that understanding into actions in meatspace.

"Pick up the cassette tape from the pile of stuff over there." What if Helix had never actually seen a cassette tape before (granted, they're pretty rare these days)? By combining the general knowledge of large language models (LLMs) like ChatGPT with Figure's own VLM, the robot can identify and pick out the cassette. Whether it'll appreciate Michael Jackson's Thriller as much as we did is another question.

It gets better. Helix can span two robots simultaneously and have them work collaboratively – and I don't mean simply two bots digging through a pile of stuff to find Michael Bolton's greatest hits. Each robot carries two onboard GPUs: one runs high-level latent planning at 7-9 Hz (System 2), the other runs low-level motor control at 200 Hz (System 1). That means System 2 can really think a problem through while System 1 acts on whatever plan System 2 has already produced.

Running at 200 Hz means System 1 can take quick physical action, since its moves have already been planned for it by System 2.

And 7-9 Hz means seven to nine times per second – no slouch, but slow enough to leave System 2 time to really deep-think. After all, the human expressions have always been "two heads are better than one" and "let's get another set of eyes on it." Except here, Helix is a single AI brain controlling two bots simultaneously.
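Figure hasn't published its implementation details, but the split it describes resembles a classic dual-rate control loop: a slow deliberative planner publishes its latest plan, and a fast control loop consumes it asynchronously without ever waiting. Here's a minimal, hypothetical Python sketch of that pattern (the class and method names are mine, not Figure's):

```python
import threading
import time

class DualRateController:
    """Hypothetical sketch of the System 2 / System 1 split:
    a slow planner publishes its latest plan, while a fast
    control loop acts on whatever plan is currently available."""

    def __init__(self):
        self._latest_plan = None
        self._lock = threading.Lock()
        self.actions_taken = 0

    def system2_loop(self, observations, hz=8):
        # Slow deliberation (~7-9 Hz in Figure's description).
        for obs in observations:
            plan = f"plan-for:{obs}"  # stand-in for latent planning
            with self._lock:
                self._latest_plan = plan
            time.sleep(1.0 / hz)

    def system1_step(self):
        # Fast control tick (200 Hz in Figure's description):
        # never blocks on the planner, just acts on the most
        # recently published plan.
        with self._lock:
            plan = self._latest_plan
        if plan is not None:
            self.actions_taken += 1
        return plan

ctrl = DualRateController()
planner = threading.Thread(target=ctrl.system2_loop,
                           args=(["cup", "cassette"],))
planner.start()
for _ in range(100):       # ~0.5 s of fast control ticking
    ctrl.system1_step()
    time.sleep(0.005)
planner.join()
```

The point of the pattern is that the fast loop never stalls waiting for the slow one; it simply reuses the last published plan until a fresh one arrives.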

Figure and OpenAI had been collaborating for about a year before Figure.ai founder Brett Adcock decided to pull the plug with a post on X: "Figure made a major breakthrough on fully end-to-end robot AI, built entirely in-house. We're excited to show you in the next 30 days something no one has ever seen on a humanoid." It's been just 16 days since that February 4th post ... over-achiever?

Over-achieve indeed. This doesn't feel like just another small step in robotics. This feels more like a giant leap. AI now has a body and can do stuff in the real world.

Figure 01 robots have already demonstrated the ability to handle simple, repeatable tasks at the BMW Manufacturing Co. plant in Spartanburg, South Carolina. The Figure 02 robots represent an entirely new generation of capability – and they're commercial-ready, right out of the box, batteries included.

Unlike previous iterations, Helix learns with a single set of neural network weights – think "hive-mind." Once one robot has learned a task, they all know how to do it, which bodes well for bringing these bots into the home.

When you think about it, a household is actually quite chaotic and complicated compared to a well-organized, controlled factory setting. Dirty laundry on the floor next to the hamper (you know who you are), your kids' foot-murdering Legos strewn about, cleaning supplies under the kitchen sink, fine china in the curio cabinet (what's that!?). The list goes on.

Figure reckons the 02 can pick up nearly any small household object, even one it's never seen before. The Figure 02 robot has 35 degrees of freedom, including human-like wrists, hands, and fingers. Pairing its generalist knowledge with its vision model lets it grasp abstract concepts: in one demo, "Pick up the desert item" led the robot to pluck a toy cactus it had never seen before from a pile of random objects on the table in front of it.
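The "desert item" trick is semantic grounding: matching an abstract word in the prompt against general knowledge about each visible object. Here's a deliberately toy Python sketch of the idea – the knowledge table is hand-written for illustration, whereas a real VLA learns these associations from data:

```python
# Toy illustration only – not Figure's actual model. A hand-written
# "general knowledge" table stands in for what an LLM/VLM learns.
KNOWLEDGE = {
    "toy cactus": {"desert", "plant", "spiky", "green"},
    "cassette tape": {"music", "retro", "plastic"},
    "coffee mug": {"kitchen", "drink", "ceramic"},
}

def resolve(prompt_concepts, detected_objects):
    """Pick the detected object whose known attributes best
    overlap the concepts extracted from the language prompt."""
    def overlap(obj):
        return len(KNOWLEDGE.get(obj, set()) & prompt_concepts)
    return max(detected_objects, key=overlap)

# "Pick up the desert item" -> concept {"desert"} -> toy cactus
choice = resolve({"desert"}, ["coffee mug", "toy cactus", "cassette tape"])
```

The robot never needs a training example of a cactus; it only needs to know, in general, that cactuses are desert things and that the green spiky object in front of it is a cactus.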

This is all absolutely jaw-dropping stuff. The Figure 02 humanoid robot is the closest thing to I, Robot we've seen to date, and it marks the beginning of something far more than just "smart machines." Helix bridges the gap between something we control – or at least try to control – on our screens and real-world autonomy, real-world physical actions ... and real-world consequences. It's as terrifying as it is mesmerizing.

"Here ya go, bud. I got you." Running a single Helix instance across two robots allows levels of bot-to-bot collaboration never before seen

Privacy? Not anymore.

I can't help but wonder ... who controls all this data? Engineers at Figure? Is it hackable (does a bear sleep in the woods?) and is some teenager living in mom's basement going to start sending ransomware out to every owner? Are trade secrets going to be revealed from factory worker-bots? Corporate sabotage is a real thing, and I really want to know what the supposed 23 flavors found in Dr Pepper are ...

The whole hive-mind concept means that on a server somewhere, there is a complete, highly detailed walk-through of your house. Figure knows your entire family, what's in your sock drawer and your nightstand. It knows the exact dollar amount of cash stuffed in your mattress. And if you're not careful, it might even know what you look like in your birthday suit.

Source: Figure.ai

8 comments
Faint Human Outline
One flavor I have detected thus far: wintergreen
Skipjack
This is really good. Now, they just have to do it 10 times as fast in order for it to be actually useful.
pete-y
gob-smacking - but not sure I would want to let one loose in my house. It will know too much and potentially share too much.
michael_dowling
Can a real life version of Bicentennial Man be coming to homes soon?
Daishi
@Skipjack I am not sure they have to be fast to be useful. Before online shopping (or Netflix DVD's by mail) a lot of people would have said 2 day shipping is too slow to compete with in person shopping. Even slow automation that doesn't need a break or much manual help is useful. It doesn't matter if it takes 10 minutes to put away groceries if you are doing something else in a different room.
essecj
Are they programmed to look at each other? That’s the creepiest part to me.
JS
@Daishi _ I can't help but to agree with this point. Every time humans begin to adopt a new tech, it's inevitably painfully slow at first (dial-up internet, your example of Netflix ... even cars). It's only going to get better/faster. And maybe creepier. :D
bwana4swahili
Another step up the ladder to making humans redundant... But I can see a HUGE number of places these critters could be put to work.