Why Today's Humanoids Won't Learn Dexterity (rodneybrooks.com)
74 points by chmaynard 1 day ago | hide | past | favorite | 68 comments




> No sense of touch. Human hands are packed absolutely full of sensors. [...] We store energy in our tendons and reuse it on the next step

Side-rant: As cool as some cyberpunk/sci-fi ideas are, I can't imagine a widespread elective mechanical limb replacement within the lifetime of anyone here. We dramatically under-estimate how amazing our normal limbs are. I mean, they're literally swarms of nanobots beyond human comprehension. To recycle an old comment against mechanical limbs:

________

[...] just remember that you're sacrificing raw force/speed for a system with a great deal of other trade-offs which would be difficult for modern science to replicate.

1. Supports a very large number of individual movements and articulations

2. Meets certain weight-restrictions (overall system must be near-buoyant in water)

3. Supports a wide variety of automatic self-repair techniques, many of which can occur without ceasing operation

4. Is entirely produced and usually maintained by unskilled (unconscious?) labor from common raw materials

5. Contains a comprehensive suite of sensors

6. Not too brittle, flexes to store and release mechanical energy from certain impacts

7. Selectively reinforces itself when strain is detected

8. Has areas for the storage of long-term energy reserves, which double as an impact cushion

9. Houses small fabricators to replenish some of its own operating fluids

10. Subsystems for thermal management (evaporative cooling, automatic micro-activation)

_______________

I predict the closest thing we might see instead will be just growing replacement biological limbs, followed by waldoes where you remotely control an arm without losing your own.


Per 5, it says here "Human hands are packed absolutely full of sensors. Getting anywhere near that kind of sensing out of robot hands and usable by a human puppeteer is not currently possible."

Then another quote, "No one has managed to get articulated fingers (i.e., fingers with joints in them) that are robust enough, have enough force, nor enough lifetime, for real industrial applications."

So (3) and (7) are relevant to lifetime, but another point, related to sensors, is that humans will stop hurting themselves if finger strain occurs, such as by changing their grip or crying off the task entirely. Hands are robust because they can operate at the edge of safe parameters by sensing strain and strategizing around risk. Humans know to come in out of the rain, so to speak.


I have come to realize that we barely understand complexity. I've read a lot on information theory, thermodynamics, many takes on entropy. Not to mention literature on software development, because a lot of this field is managing complexity.

We severely underestimate how complex natural systems are. Autonomous agents seem like something we should be able to build. The idea is as old as digital computers. Turing famously wrote about that.

But an autonomous complex system is complex to an astronomical degree. Self-driving vehicles, let alone autonomous androids, are several orders of magnitude more complex than we can even model.



Yes! Thank you!

I have read Wiener and Ashby to reach this conclusion. I've used this argument before: a piece of software capable of creating any possible software would be infinitely complex. It's also the reason I don't buy the claim that "20 W general intelligence exists". The wattage for generally intelligent humans would be the entire energy input to the biosphere up to the evolution of humans.

Planetary biospheres show general intelligence, not individual chunks of head meat.


That knowledge held in evolution equates to "training" for an AGI, I guess. Mimicking 4 billion years of evolution shouldn't take that long ... but it does sound kind of expensive now you mention it.

>> I have come to realize that we barely understand complexity.

> Mimicking 4 billion years of evolution shouldn't take that long

Thanks for illustrating my point.


Now I'm imagining a brain in a jar, but with every world-mimicking evolved aspect of the brain removed. Like, it has no implicit knowledge of sound waves or shapes or - well, maybe those low-level things are processed in the ears and retinas, but it has no next-stage anticipation of audio or visual data, either, and no body plan that relates to the body's nerves, and no relationship to digestion or hormones or gravity or jump scares or anything else that would prepare it for being monkey-shaped and living in the world. But, it has the key thing for intelligence, the secret sauce, whatever that is. So it can sit there and be intelligent.

Then you can connect it up to some input and output, and ... it exhibits intelligence somehow. Initially by screaming like a baby. Then it adapts to the knowledge implicit in its input and output systems ... and that's down to the designer. If it has suction cup end effectors and a CCD image sensor array doobrie ... I guess it's going to be clumsy and bewildered. But would it be noticeably intelligent? Could it even scream like a baby, actually? I suppose our brains are pre-evolved to learn to talk. Maybe this unfortunate person would only be able to emit a static hiss. I can't decide if I think it would ever get anywhere and develop appreciable smarts or not.


I'm just gonna quote myself again

> Planetary biospheres show general intelligence, not individual chunks of head meat.


Also to prevent breaking other things or hurting others. That’s also why robots will have tons of safety issues for a while

Yeah, it's cool and all, but I was more than once frustrated that it can't rotate freely, has only one elbow joint, and can't extend.

You want telescopic rotary jazz hands?

Sure. I even had a simulated experience of having extendable arms in my dream. So, the control machinery is probably there for some reason.

One of the things that is true of humans is that we have an extremely mutable body plan and sensorium.

https://plasticity-lab.com/body-augmentation

https://www.carlosterminel.com/wearable-compass

https://www.madsci.org/posts/archives/mar97/858984531.Ns.r.h...

https://www.sciencedirect.com/science/article/pii/S096098220...

Bolting on extra senses, tools, limbs is no big deal.

Humans are also some of the most physically adaptable animals on the planet, in terms of being able to remodel our bodies to serve new tasks. "specific adaptation to imposed demand" is one of the things that really sets us (and a few other animals) apart in a remarkable way. Few animals can practice and train their bodies like we can.

In addition, I understand research shows that people with amputations very quickly adapt both practically and psychologically, as a general principle (some unfortunate folks are stuck with phantom pain and other adaptive issues).

The old discussion about "adding 20 minutes to your commute is worse than losing a leg below the knee" takes into account the fact that most people underestimate how large a negative effect commuting has, but also overestimate how large a negative effect losing a portion of a limb has.


In any case, it seems like a "simple" problem to solve. An accelerometer chip costs a few cents and the data rates can be handled by a very light wiring harness, e.g. I2C.

So embedding such a sensor in every rigid component, wiring a single data line to all of them (using the chassis as electrical ground) and feeding the data back to the model seems a trivial way to work around this problem without any kind of real pressure sensitivity. The model knows the inputs it gives to the actuators/servos, so it will quickly learn to predict the free mechanical behavior of the body, and use any deviation to derive data equivalent to pressure and force feedback.

Another possible source of data is the driving current of the motors/actuators which is proportional to the mechanical resistance the limb encounters. All sorts of garbage sources of data that were almost useless noise in the classical approach become valuable with a model large enough.
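To make the residual idea concrete, here's a minimal sketch (function names, the placeholder forward model, and all numbers are mine): compare what the model predicts for free motion against what the embedded sensors report, and treat the gap as a contact signal.

```python
import numpy as np

def predict_free_accel(command: np.ndarray) -> np.ndarray:
    # Placeholder forward model: in free space, assume acceleration tracks the command.
    return command

def contact_residual(command: np.ndarray, measured_accel: np.ndarray) -> float:
    """Norm of (measured - predicted); large values suggest external contact."""
    return float(np.linalg.norm(measured_accel - predict_free_accel(command)))

def motor_load_estimate(current_a: float, free_current_a: float, k: float = 1.0) -> float:
    """Crude load proxy: excess motor current over the free-running draw."""
    return k * max(0.0, current_a - free_current_a)

cmd = np.array([0.0, 0.0, 1.0])
# Free motion: residual stays near zero.
print(contact_residual(cmd, np.array([0.0, 0.0, 1.0])))
# Blocked motion: measured acceleration falls short of the prediction.
print(contact_residual(cmd, np.array([0.0, 0.0, 0.2])))
```

A real system would learn the forward model rather than hard-code it, but the signal the model consumes is the same residual.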


> they will likely have to collect both the right data, and learn the right thing.

The "bitter lesson" says to stop trying to find simple rules for how to do things - stop trying to understand - and instead to use massive data and massive search to deal with all the incredibly fussy and intractable details magically.

But the article here is saying that the lesson is false at its root, because in fact lots of understanding is applied at the point of choosing and sanitising the data. So just throwing noise at the model won't do.

This doesn't seem to match experience, where information can be gleaned from noise and "garbage sources of data ... become valuable with a model large enough", but maybe there's something illusory about that experience, IDK.


Isn’t one of the main functions of the brain / nervous system to “filter” noisy sensory input data to provide a coherent “signal” to perception? Perhaps a smaller or more specialized model could do that if we ended up packing the “skin” of the humanoid with various sensors.

From the article: a human hand has about 17,000 low-threshold mechanoreceptors in the glabrous skin (where hair doesn’t grow) of the hand, with about 1,000 of them right at the tip of each finger, but with much lower density over the rest of each finger and over the palm. These receptors come in four varieties (slow vs fast adapting, and a very localized area of sensitivity vs a much larger area) and fire when they sense pressure applied or released.

Where can you buy the artificial equivalent?
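The four varieties can be caricatured in a few lines (my own simplification, not the article's): slow-adapting receptors track sustained pressure, fast-adapting ones fire on change.

```python
def slow_adapting(pressure: list) -> list:
    # Responds to the pressure level itself, for as long as it is applied.
    return [p for p in pressure]

def fast_adapting(pressure: list) -> list:
    # Responds to the derivative: press and release events only.
    return [abs(b - a) for a, b in zip(pressure, pressure[1:])]

step = [0, 0, 1, 1, 1, 0, 0]      # press ... hold ... release
print(slow_adapting(step))        # tracks the whole hold
print(fast_adapting(step))        # spikes only at press and release
```

The receptive-field size axis (localized vs broad) would be a spatial convolution on top of this, which is exactly the kind of preprocessing the article says we don't yet have a standard for.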


Naturalistic fallacies will only carry you so far. For example, my 12 year old car has none of the incredibly adapted limbs and muscles of a cheetah, but can still easily exceed the animal land speed.

The article makes a compelling case that a certain kind of sensory input and learning is necessary to crack robotic movement in general; it remains to be seen whether an array of sensors as fine as the human hand's is useful outside very specific use-cases. A robot that can stock shelves reliably would still be immensely useful and very generalizable, even if it can't thread the needle due to limited fine sensory abilities.


You are moving the goalpost.

Title of the article you're commenting: Why Today’s Humanoids Won’t Learn Dexterity

Thesis the article is contradicting: The idea is that humanoid robots will share the same body plan as humans, and will work like humans in our built for human environment. This belief requires that instead of building different special purpose robots we will have humanoid robots that do everything humans can do.

You are now arguing that a specialized robot lacking dexterity would still be immensely useful. Nobody is disputing that. It's just not what the article is about.


If you asked someone 300 years ago what an automated dishwashing machine would look like, they'd describe something a lot more like a person than the wet cupboard we have now. I'm assuming many tasks will be like that -- it's a lack of imagination that makes us say we need a humanoid robot to solve them. I'd guess it'll be the minority of tasks where a humanoid form actually makes sense.

This needs to be some sort of maxim: “The most useful robots are the ones that don’t look like (or try to be) humanoids.”

For some reason (judging by Fritz Lang, Gundam, etc.) humanity has some deep desire or curiosity for robots to look like humans. I wonder if cats want robot cats?


> humanity has some deep desire or curiosity for robots to look like humans.

I don't think you can draw that conclusion. Most people find humanoid robots creepy. I think we have a desire for "Universal Robotics". As awesome as my dishwasher is, it's disappointing that it's nearly useless for any other task. Yeah, it washes my dishes, but it doesn't wash my clothes, or put away the dishes. Our desire for a humanoid robot, I think, largely grows out of our desire for having a single machine capable of doing anything.


The vast majority of “universal robots” are portrayed as humanoids in science fiction. Perhaps part of the reason is that true “universality” includes socializing, companionship, human emotions, and of course love.

Or, alternatively, general-purpose robots tend to be human-shaped because the world as it already is has already been fully designed for humans. Single doors are tall because that's the size of a human. Tools are designed to be held in something like a hand, sometimes two of them. Stairs are designed to be walked on, and basically any other traversal method just falls apart.

Of course, there is also the thing where authors and artists tend to draw anything with human intelligence as humans, from robots to aliens. Maybe it's the social reason you mention, or they just unconsciously have assumed humans to be the greatest design to ever exist. But even despite this, in a human world, I expect the first true general-purpose robots to be "standing" upright, with one or several arm-like limbs.


Brooks describes how speech is preprocessed by chopping it up into short time segments and converting the segments to the frequency domain. He then bemoans the fact that there's no similar preprocessing for touch data. OK.

But then he goes on to vision, where the form that goes into vision processing today is an array of pixels. That's not much preprocessing. That's pretty much what existed at the image sensor. Older approaches to vision processing had feature extractors, with various human-defined feature sets. That was a dead end. Today's neural nets find their own features to extract.

Touch sensing suffers from sensor problems. A few high-detail skin-like sensors have been built. Ruggedness and wear are a big problem.

Consider, though, a rigid tool such as an end wrench. Humans can feel out the position of a bolt with an end wrench, get the wrench around the bolt, and apply pressure to tighten or loosen a nut. Yet the total information available is position plus six degrees of freedom of force. If the business end of your tool is rigid, the amount of info you can get from it is quite limited. That doesn't mean you can't get a lot done. (I fooled around with this idea pre-LLM era, but didn't get very far.) That's at least a way to get warmed up on the problem.
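As a toy illustration of how much that limited channel carries (numbers and the single-point-contact assumption are mine): with a force/torque sensor at the handle, a point contact at position r produces torque tau = r x F, so the contact location along a rigid shaft falls out directly.

```python
# Rigid tool, x-aligned shaft: lateral force f_y at distance r_x from the
# handle produces tau_z = r_x * f_y at the handle sensor, so r_x is recoverable.

def contact_along_shaft(tau_z: float, f_y: float) -> float:
    """Contact distance along an x-aligned shaft from lateral force + torque."""
    return tau_z / f_y

f_y = 2.0          # N, lateral force where the wrench meets the bolt
r_x = 0.15         # m, where the bolt actually is
tau_z = r_x * f_y  # what the handle sensor would read
print(contact_along_shaft(tau_z, f_y))  # recovers the bolt position, 0.15 m
```

That's the whole information budget of a rigid tool: one 6-vector per instant, yet enough to feel out a bolt.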

Here's a video of a surgeon practicing by folding paper cranes with small surgical tools.[1] These are rigid tools, so the amount of touch information available is limited. That's a good problem to work on.

[1] https://www.youtube.com/watch?v=5q-HHoqzQi0


As you tighten a bolt the angle you need to apply force changes. So it's not just a fixed position plus force in 6 directions, it's force in 6 directions at each position. You can learn quite a bit about something from such interactions, such as an object's weight, center of mass, etc.

Further, robots generally have more than a single rigid manipulator.
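For instance, weight and centre of mass fall out of static wrist readings alone. A sketch with invented numbers, assuming gravity along -z and the object held still:

```python
import numpy as np

g = np.array([0.0, 0.0, -9.81])

def estimate_mass(force: np.ndarray) -> float:
    # Static hold: measured force is just the object's weight.
    return float(np.linalg.norm(force) / np.linalg.norm(g))

def estimate_com_xy(torque: np.ndarray, force_z: float) -> tuple:
    # tau = r x F with F = (0, 0, Fz) gives tau_x = r_y*Fz, tau_y = -r_x*Fz.
    return (-torque[1] / force_z, torque[0] / force_z)

m_true = 0.5                            # kg
r_true = np.array([0.1, 0.02, 0.0])     # m, COM offset from the wrist
F = m_true * g                          # what the wrist force sensor reads
tau = np.cross(r_true, F)               # what the wrist torque sensor reads
print(estimate_mass(F))
print(estimate_com_xy(tau, F[2]))
```

Add a second reading at a different orientation and the z-component of the COM becomes observable too.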


Yes, it's time-varying data, but there are not that many channels. And the sensors are off the shelf items, although overpriced.

Very interesting point that while we've figured out how to digitize images, text and sounds we haven't digitized touch. At best we can describe in words what a touch sensation was like. Smell is in a similar situation. We haven't digitized it at all.

I'm not sure describing it in words is very helpful, and there's probably a good amount of such data available already.

I would think the way to do it is build the touch sensors first (and it seems they're getting pretty close) then just tele-operate some robots and collect a ton of data. Either that, or put gloves on humans that can record. Pay people to live their normal lives but with the gloves on.


> Artificial Intelligence researchers have been trying to get [X] to [Y] for over 65 years

For 10,000 different problems. A great many of which have been solved in recent years.

Robotics is improving at a very fast clip, relative to most tech. I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.

I think the primary difference between AI software models and services, and robotic AI, is economics.

The cost per task for AI software is .... very small. And the cost per task for a robot with AI is ... many orders of magnitude over that.

The marginal costs of serving one more customer are completely incomparable.

It's just a push of a button to replace the "fleet" of chatbots a million customers are using. Something unthinkable in the hardware world.

The seemingly lower level of effort and progress is because hardware that could operate in our real world with the same dexterity that ChatGPT/Claude can converse online, will be extremely expensive at first.

Robotics companies are not just focused on dexterity. They are focused on improvements to dexterity that stay within a very tight economic envelope. Inexpensive dexterity is going to take a while.


> I am unaware of any barrier, or any reason to infer there is one, for dextrous robots.

Pretraining data?


So much easier in many ways.

You can train in stages.

First stage: either digitally generate (synthetic) basic movements, or record basic movements from a human model. The former is probably better and can generate endless variation.

But the model is only trying to control joint angles, positions, etc., with no worries about controlling power. The simulated system has no complications like friction.

Then you train with friction, joint viscosity, power deviating from demand (ramp-up and ramp-down times, fade), etc.

Then train in a complex simulated environment.

Then train for control.

Etc.

The point being, robotic control is easy to break down into small steps of capability.

That massively improves training speed and efficiency, even potentially smaller models.

It is also a far simpler task, by many orders of magnitude, than learning the corpus of the written internet.

Comparable to that would be training an AI to operate any land, sea, or air device, which nobody today is trying (AFAIK).
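The staging could be sketched as a simple curriculum loop (stage names, flags, and the placeholder train_stage are all made up for illustration):

```python
# Hypothetical curriculum: each stage adds physics the previous one ignored.

CURRICULUM = [
    {"name": "kinematics only",  "friction": False, "motor_dynamics": False},
    {"name": "friction + load",  "friction": True,  "motor_dynamics": True},
    {"name": "full environment", "friction": True,  "motor_dynamics": True,
     "cluttered_scene": True},
]

def train_stage(policy, stage: dict) -> str:
    # Stand-in for an RL or imitation-learning loop under the stage's physics.
    return f"trained: {stage['name']}"

policy = object()  # placeholder for whatever model is being trained
for stage in CURRICULUM:
    print(train_stage(policy, stage))
```

The same policy weights carry over between stages, which is where the claimed training-efficiency win would come from.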


Synthetic data works better for robots since you can generate endless scenarios based on real physical laws.

Do you know of success stories here? Success of transferring models learned in physics simulation to the real world.

When we (ZenRobotics) tried this 15 years ago a big problem was the creation of sufficiently high-fidelity simulated worlds. Gathering statistics and modelling the geometry, brittleness, flexibility, surface texture, friction, variable density etc of a sufficiently large variety of objects was harder than gathering data from the real world.
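For what it's worth, the common workaround nowadays is domain randomization: rather than one painstakingly high-fidelity world, sample the uncertain parameters per episode so the policy has to cover the real values somewhere in the distribution. A toy sketch (parameter names and ranges invented):

```python
import random

def sample_world(rng: random.Random) -> dict:
    """Draw one episode's physics parameters from broad, deliberately sloppy ranges."""
    return {
        "friction":    rng.uniform(0.2, 1.2),
        "density":     rng.uniform(300, 3000),   # kg/m^3
        "restitution": rng.uniform(0.0, 0.5),
    }

rng = random.Random(0)
for _ in range(3):
    print(sample_world(rng))  # each episode gets a different world
```

Whether that sidesteps the fidelity problem or just smears it out is, as you say, still debatable for contact-rich manipulation.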


Google has done training in simulation: https://x.company/projects/everyday-robots/#:~:text=other%20...

I believe this is the most popular tool now: https://github.com/google-deepmind/mujoco


Thanks for the links.

AFAICT these have not resulted in any shipping products.


One very important task to solve is the ability to select a box from a shelf and set it neatly on a pallet, as well as the reverse. People have been working very hard on this problem for a long time, there are impressive demos out there, yet still nobody is ready to set their best box manipulating robots loose in a real warehouse environment.

How hard can it be to consistently pick up boxes and set them down again in a different location? Pretty hard, apparently.


> Before too long (and we already start to see this) humanoid robots will get wheels for feet, at first two, and later maybe more, with nothing that any longer really resembles human legs in gross form. But they will still be called humanoid robots.

Totally agree. Wheels are cheaper, more durable and more effective than legs.

Humans would have wheels if there were an evolutionary pathway to wheels.


The world is full of curbs, stairs, lips, rugs, vehicles, etc. If you're a human-scale robot then your wheels need a really wide base to not tip over all the time, so you are extremely awkward in any kind of moderately constrained space. I wouldn't exchange my legs for wheels. Wheelchair users have to fight all the time for oversights to be corrected. I can maybe see a wheel-based humanoid robot, but only as a compromise.

On the other hand there is not much reason to constrain ourselves to the unstable and tricky bipedal platform or insist on having a really top-heavy human-like torso. You could have 3-4 little legs on a dog scale body with several extra long upwards reaching arms for eg.


What would be the evolutionary pressure to grow wheels? They are useless without roads

Roads. I think you answered your own question: roads would, hypothetically speaking, be an evolutionary pressure.

I sometimes imagine wheeled creatures evolving in a location with a big flat hard surface like the Utah salt flats or the Nazca desert, but I guess there's not much reward for being able to roll around since those places are empty as well as flat. Tumbleweed found some success that way though, maybe?

The golden wheel spider lives in the sand dunes of the Namib Desert. When confronted by a spider-hunting wasp, it can perform a "cartwheeling" maneuver to escape. By tucking in its legs and turning onto its side, it can roll down a sand dune.

Are there any biological examples of freely rotating power systems? We have nice rotating joints with muscles to provide power, but I can't think of any joint that would allow the sort of free rotation, while also producing torque, that a wheeled animal would require.

Something internal to some shellfish, I believe, a kind of mixing rod that rotates. Hold on, I'll check if it's powered. (Also rotifers but they're tiny.)

Hmm, no, it sounds like it's externally powered:

> The style consists of a transparent glycoprotein rod which is continuously formed in a cilia-lined sac and extends into the stomach. The cilia rotate the rod, so that it becomes wrapped in strands of mucus.

https://en.wikipedia.org/wiki/Rotating_locomotion_in_living_...

Or maybe the cilia ( = wiggly hairs) could be seen as a kind of motor. Depends how you count it and exactly what the set-up is, I can't tell from this.


I think I would count internal power created by the rotating component itself. I hadn't thought of that possibility, since human-made machinery usually has the power-producing component located in the main body, and transferring that power to a freely rotating component is quite hard. Biological systems wouldn't necessarily look like that, and could feasibly be powered by the wheels themselves deforming, as if the wheels were a separate, but connected, biological system.

That's quite interesting.


Wheels are great until the robot encounters uneven surfaces, such as a stairway, or a curb. So some kind of stepping functionality would still be necessary.

Power to wheel sensors are the paws. Where is the CNS/brain going?

Not sure if I buy the argument that touch sensitivity is a prerequisite for dexterity.

I can put on a thick glove (losing touch and pressure sensitivity all together) and grab a fragile glass without breaking it.


Because you have learnt it already and you can make predictions. And you don't lose pressure sensitivity: you still feel the pressure of your hand against the glove. A better example would be using an exoskeleton or robotic arm, or inactivating certain nerves. Still, you risk breaking it more, IMO, and you have to be more careful in the beginning until you learn again.

Edit: and you probably are not gonna be as fast doing it


You don't lose pressure or touch sensitivity from wearing even thick welding gloves. You can still feel how hard you are gripping the rod quite easily.

Depends heavily on the use case. Indeed many tasks humans carry out are done without touch feedback - but many also require it.

An example of feed-forward manipulation is lifting a medium-sized object. The classic example is lifting a coffee cup. If you misjudge a full cup for empty you may spill the contents before your brain manages to replan the action based on sensory input. It takes around 300ms for that feedback loop to happen. We do many things faster than that would allow.

The linked article has a great example of a task where a human needs feedback control: picking up and lighting a match.

Sibling comments also make a good point on that touch may well be necessary to learn the task. Babies do a lot of trial-and-error manipulation and even adults will do new tasks slower first.
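The coffee-cup point can be put in numbers (all of them illustrative, mine): plan the lift force for an empty cup, meet a full one, and see where the cup is by the time the first sensory correction could even arrive.

```python
DT = 0.01            # 10 ms control ticks
DELAY_TICKS = 30     # ~300 ms of sensorimotor latency
G = 9.81

m_assumed, m_actual = 0.1, 0.3       # kg: empty vs full cup
f_planned = m_assumed * (G + 1.0)    # feed-forward force for a gentle 1 m/s^2 lift

v = 0.0
for _ in range(DELAY_TICKS):
    a = f_planned / m_actual - G     # actual acceleration under the real mass
    v += a * DT
print(round(v, 3))  # negative: the cup is heading down before feedback can act
```

Anything that has to happen inside that 300 ms window has to be pre-planned or reflexive, which is the article's case for rich touch sensing feeding fast low-level loops.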


What about full control of a claw machine to pick up said glass?

The scientists at Oak Ridge National Labs develop a lot of dexterity working with robotic manipulators in the radioactive hot cells there

https://youtu.be/B-Lj7xAXJpc


If I can see the said glass, absolutely. I can even do it remotely.

This is a good point, but I’m not convinced it negates the author’s argument.

Could you pick up that same fragile glass with your eyes closed? I'd wager you could, as you'd still receive (diminished) tactile feedback despite the thick gloves.


Only because of your training otherwise.

A cool approach to digitizing touch that I read about a few days ago: https://www.wired.com/story/this-clever-robotic-finger-feels...

(I'm not disagreeing with the author, just sharing an article that is interesting/relevant.)


> When an instability is detected while walking and the robot stabilizes after pumping energy into the system all is good, as that excess energy is taken out of the system by counter movements of the legs pushing against the ground over the next few hundred milliseconds. But if the robot happens to fall, the legs have a lot of free kinetic energy, rapidly accelerating them, often in free space. If there is anything in the way it gets a really solid whack of metal against it. And if that anything happens to be a living creature it will often be injured, perhaps severely.

Fails Clark's law?

I misread the title and I thought it was about humans.

And I could see it. With the prevalence of screens, kids already don't learn a lot of the dexterity that previous generations did. Their grip strength is weaker, and their capacity for fine 3D motions is probably underdeveloped as well.

Last week I saw an intelligent and normally developing 7-year-old kid ask his mum to operate a small screwdriver to get to the battery compartment of a toy, because that was apparently beyond his competence.

Now, with recent developments in robotics, fully neural controllers, and training in simulated environments, it could be that today's babies will have very few tasks requiring dexterity left when they grow up.


Quick googling turns up smaller-than-a-penny strain/pressure sensors for under $10. That is enough of a sense of touch for most tasks.

The article points out that the human hand has over 10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.

Nature limited us to just 2 hands for all tasks and purposes. Humanoids have no such limitation.

>10000 sensors with specific spatial layout and various specialised purposes (pressure / vibration / stretching / temperature) that require different mechanical connections between the sensor and the skin.

mechanical connection wouldn't be an issue if we lithograph the sensors right onto the "skin" similarly to chips.


Sorry, I meant to emphasize _different_ mechanical connections. That a sensor that detects pressure has a different mechanical linkage than the one detecting vibration. So you need multiple different manufacturing techniques to replicate that at correspondingly higher cost.

The “more than 10000” also has a large impact in size (sensors need to be very small) and cost (you are not paying for one sensor but 10000).

Of course some applications can do with much less. IIUC the article is all about a _universal_ humanoid robot, able to do _all_ tasks.


How can I invest in humanoid robot companies now? I do believe people will try to make this into the next hype cycle.

Pshaw, that's nothing, you need to invest in the companies promising to make a Greater Fool robotic investor, now that's where the market'll take off. :P


