
Learning Dexterity - gdb
https://blog.openai.com/learning-dexterity/
======
pmohun
Fascinating:

"We observed that for precision grasps, such as the Tip Pinch grasp, Dactyl
uses the thumb and little finger. Humans tend to use the thumb and either the
index or middle finger instead. However, the robot hand’s little finger is
more flexible due to an extra degree of freedom, which may explain why Dactyl
prefers it. This means that Dactyl can rediscover grasps found in humans, but
adapt them to better fit the limitations and abilities of its own body."

The learning of "emergent" behavior, specifically when it creates improvements
to natural human motion is one of the main reasons why this type of work is so
important. Similar to the way that we imitate design from nature (e.g. wings,
suction cups), we can now accelerate development by observing how the bots
perform the task in a variety of environments

~~~
sgillen
It's a really cool phenomenon; it also means we have to make our simulations
better. These sorts of RL algorithms are so good at finding "exploits" in the
physics engine they are in that help them "cheat" sometimes, compared to what
the researcher wanted.

~~~
TaylorAlexander
I agree, but FWIW in the linked work they did randomize some of the engine
parameters during training to avoid fitting too much to a specific set of
assumptions. Certainly though more accurate simulations, as long as they were
still fast to compute, would be very useful!
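
As a rough illustration of the randomization mentioned above, here is a minimal sketch that draws new physics parameters for every training episode. The parameter names, nominal values, and the 30% range are assumptions made for this example; they are not taken from the linked work.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    friction: float     # surface friction coefficient (hypothetical)
    object_mass: float  # cube mass in kg (hypothetical)
    motor_gain: float   # actuator gain multiplier (hypothetical)


# Nominal values and the relative range are invented for illustration.
NOMINAL = PhysicsParams(friction=1.0, object_mass=0.08, motor_gain=1.0)
RELATIVE_RANGE = 0.3  # each parameter is scaled by a factor in [0.7, 1.3]


def sample_params(rng: random.Random) -> PhysicsParams:
    """Draw a fresh set of physics parameters for one training episode."""

    def scale(value: float) -> float:
        return value * rng.uniform(1.0 - RELATIVE_RANGE, 1.0 + RELATIVE_RANGE)

    return PhysicsParams(
        friction=scale(NOMINAL.friction),
        object_mass=scale(NOMINAL.object_mass),
        motor_gain=scale(NOMINAL.motor_gain),
    )


# One parameter set per episode, so the policy never sees a single fixed world.
rng = random.Random(0)
episodes = [sample_params(rng) for _ in range(1000)]
```

A policy trained across all of these sampled worlds cannot rely on one exact friction or mass value, which is the intuition behind avoiding overfitting to a specific set of simulator assumptions.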

~~~
sgillen
yeah, I think a big reason why they needed to do randomization in the first
place is because of inaccuracies in the simulation compared to the real world.
And the extra randomization required almost two orders of magnitude more
training time!

------
dokem
Does anyone know a good graduate program/route for this kind of work? My
undergrad was CS with some experience in (dumb) robotics and mechanical design
but no ML. I am interested in applying ML/CV to physical systems like this;
however, I am a bit wary of going back to a CS program. I have seen some
Mechanical programs with an emphasis on control that let you 'build your own
degree'. If I could take a mix of ML/CV, control systems, kinematics I would
be happy. Just looking for some input from people in this field.

~~~
gdb
(I work at OpenAI.)

Worth noting: it's a well-supported route to join OpenAI without any special
graduate training. Many of our teams (including our robotics team!) hire
experienced software engineers, teaching them whatever ML they need to know,
or our Fellows program lets people do a more formal curriculum
([https://blog.openai.com/openai-fellows/](https://blog.openai.com/openai-fellows/)).
We also have a number of software engineers who focus on what
looks like traditional software engineering: see for example
[https://www.youtube.com/watch?v=UdIPveR__jw](https://www.youtube.com/watch?v=UdIPveR__jw).

See our open positions here: [http://openai.com/jobs](http://openai.com/jobs)!

------
chubot
Honest question: In the video, it looks like it works, but performs worse than
about 90% of humans at the task of rotating a cube.

On the other hand, Alpha Go or even a rudimentary chess program does better
than 99.99% of all humans.

So is it fair to say that deep learning is fundamentally missing something
that humans do? Or that chess and Go are "easy" problems in some sense?

(It seems like with "unlimited" training hours it could eventually be better
than a human? Or is that a hardware issue?)

~~~
dgreensp
In 2015, it was commonly thought that it would still be decades before a
computer could beat a top human player at Go. Now, you are calling it “easy,”
because it’s been done.

The first chess program was written by Alan Turing on paper between 1948 and
1950. He didn’t have a computer to run it, but he could still play a game with
it by stepping through the algorithm by hand. In 1997, Deep Blue beat
Kasparov, using traditional algorithms and not deep learning.

Clearly there are differences between these problems and dexterity. Chess, for
example, can be described relatively simply using logic, and there is no
dynamic or physical element; a rudimentary player can be written using pencil
and paper; a winning player just needs enough compute power, apparently.

More importantly, there is a technology curve. You are asking about the
ultimate limits of a technique moments after its first success puts it at the
low end of the spectrum of human ability. Give it a decade or two.

I am just shocked the video was real-time and not sped up like so many of
these videos are (eg watch a robot arm fold a shirt in thirty seconds when you
play it at 5x speed).

~~~
YeGoblynQueenne
>> In 2015, it was commonly thought that it would still be decades before a
computer could beat a top human player at Go

This needs a citation and it needs it badly.

It was widely reported in the popular press, to the dismay of many scientists
working in game-playing AI, who had very different opinions about how close or
far beating a professional human at Go was at the time of AlphaGo. The
majority of them in fact did not make predictions- they just pointed out that
Go was the last of the traditional board games to remain unconquered by AI. Not
that it would take X years to get there. Most AI researchers are loath to make
such predictions, knowing well that they tend to be very inaccurate (in either
direction).

~~~
dgreensp
All I know is what the articles and commenters were saying then, as an
interesting contrast to this comment now. Every article on AlphaGo described a
general state of shock at achieving something that (even if at a purely
psychological level) seemed at least 10 years away.

[https://www.technologyreview.com/s/546066/googles-ai-masters...](https://www.technologyreview.com/s/546066/googles-ai-masters-the-game-of-go-a-decade-earlier-than-expected/)

> Just a couple of years ago, in fact, most Go players and game programmers
> believed the game was so complex that it would take several decades before
> computers might reach the standard of a human expert player.

~~~
YeGoblynQueenne
>> All I know is what the articles and commenters were saying then, as an
interesting contrast to this comment now.

I understand, but in such cases (when an opinion of experts is summarised in
the popular press, rather than by experts themselves) it may be a good idea to
dig a bit further before repeating what may be a misunderstanding on the part
of reporters.

For example, my experience is very different than what you report. In an AI
course during my data science Master's and in the context of a discussion on
game-playing AI, the tutor pointed to Go as the only traditional board game
that was not yet conquered by adversarial AI, without offering any predictions
or comments about its hardness, other than to say that the difficulty of AI
systems with Go is sometimes explained by saying that "intuition" is needed to
play well. And I generally don't remember being surprised when I first heard
of the AlphaGo result (I have some background in adversarial AI, though I'm
not an expert), and in fact thinking that it was bound to happen eventually,
one way or another.

A similar discussion can be found in _AI: A Modern Approach_ (3d ed) in the
"Bibliographical and Historical Notes" section of chapter 5. Adversarial AI,
where recent (at the time) successes are noted, but again no prediction about
the timeframe of beating a human master is attempted and no explanation of the
hardness of the game is given, other than its great branching factor. In fact,
the relevant paragraph notes that "Up to 1997 there were no competent Go
programs. Now the best programs play _most_ [sic] of their moves at the master
level; the only problem is that over the course of a game they usually make at
least one serious blunder that allows a strong opponent to win" \- a summary
that, given the year is 2010, and in my opinion, strongly contradicts the
assumption that most experts considered Go to be out of reach of an AI player.
It looks like in 2010 experts understood then-current programs to be quite
strong players already.

In general, I would be very surprised to find many actual experts (e.g.
authors of Go playing systems) predicting that beating Go would take "at least
10 years", let alone "several decades" (!). Like I say, most AI researchers
these days are very conservative with their predictions, precisely because
they (and others) have been burned in the past. Stressing "most".

------
kingbirdy
> Learning to rotate an object in simulation without randomizations requires
> about 3 years of simulated experience

It's interesting to me that this is about the same amount of time it takes
humans to develop similar levels of motor control. I don't know enough about
AI or neuroscience to say whether it's likely to be a coincidence or not,
though.

~~~
Ajedi32
Interesting observation. I suspect it's probably coincidence though. Other
tasks which humans are able to learn (such as [playing Dota][1]) have taken
OpenAI much longer to master. OpenAI Five spends 180 years of training per
day, per hero in order to learn Dota, and it still isn't at the level of
professional players (though that may change soon).

Though I suppose you could argue that Dota benefits more from high-level
reasoning, whereas basic motor control is a more intuitive skill. (And
therefore better suited for this type of AI.)

[1]: [https://blog.openai.com/openai-five/](https://blog.openai.com/openai-five/)

~~~
mortenjorck
Also, the time referenced in the article is presumably three years of non-stop
training – given that an infant has a calendar packed with other things like
sleeping, crying, and other non-motor activities, total human time logged on
learning fine motor control is probably half that, if not less.

------
runesoerensen
Very cool. There's also a Times article about Dactyl:
[https://www.nytimes.com/interactive/2018/07/30/technology/ro...](https://www.nytimes.com/interactive/2018/07/30/technology/robot-hands.html)

------
0x8BADF00D
> Rapid used 6144 CPU cores and 8 GPUs to train our policy, collecting about
> one hundred years of experience in 50 hours.

That seemed an order of magnitude higher than I expected. Is training usually
this computationally expensive?
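
To put the quoted figures in perspective, the arithmetic can be worked through directly. This is only a back-of-the-envelope sketch: it uses the numbers from the quote (6144 CPU cores, about one hundred years of experience, 50 hours) and assumes, purely for illustration, that the simulation load is spread evenly across the CPU cores.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours in an average year

simulated_hours = 100 * HOURS_PER_YEAR  # "about one hundred years of experience"
wall_clock_hours = 50                   # "in 50 hours"
cpu_cores = 6144                        # from the quote

# How much faster than real time experience is collected overall.
speedup = simulated_hours / wall_clock_hours

# Assumption: work is split evenly, so this is the per-core speed versus real time.
per_core = speedup / cpu_cores

# Total CPU time spent, in core-hours.
core_hours = cpu_cores * wall_clock_hours

print(f"overall speedup: {speedup:.0f}x real time")
print(f"per-core speed:  {per_core:.2f}x real time")
print(f"CPU budget:      {core_hours} core-hours")
```

Under that even-split assumption, each core simulates only a few times faster than real time; the large total comes from running thousands of cores in parallel.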

~~~
LeanderK
haha I just read it and thought that's a magnitude less than I expected.
Pretty often it is. A lot of papers from high profile institutions have a lot
of computing power available.

First, it seems to be a lot and be really expensive, but think of it in man-
hours. It quickly diminishes.

~~~
visarga
Not to mention that evolution has had millions of years of optimizing this
stuff.

------
Animats
Nice.

Take a look at position 44, where it seems to get stuck, with no move to make
forward progress, and two fingers straight out. Did it lack image recognition
to tell it what block rotation was needed?

It doesn't seem to work by discovering strategies for rotating the block one
face at a time, then combining those. It's solving the problem as a whole.
That has both good and bad implications.

------
YeGoblynQueenne
>> We’ve trained a human-like robot hand to manipulate physical objects with
unprecedented dexterity.

To be precise, the "physical objects" appear to invariably be cubes of the
same dimensions. Not arbitrary "physical objects". Which is probably the best
that can be done by training only in a simulated environment.

------
aeleos
I am continually impressed by OpenAI, whenever we think that something is too
difficult for our current understanding of AI. With their Dota AI and this
they have shown that more can be done with a lot less than previously thought.

~~~
andreyk
Not to be too negative, it's cool work, but I'd argue unlike the OpenAI result
it is not so surprising this was doable with the techniques they used ; see eg
this paper from Google
[http://www.roboticsproceedings.org/rss14/p10.pdf](http://www.roboticsproceedings.org/rss14/p10.pdf)
and this one from Stanford/DeepMind
[http://www.roboticsproceedings.org/rss14/p09.pdf](http://www.roboticsproceedings.org/rss14/p09.pdf)
. Yes there is the additional aspects of an object in hand, but fundamentally
the techniques are the same.

Of course these works are cited in related works of paper as they should be;
perhaps the OpenAI blog should also provide more context on where this stands
wrt prior work, as many non-researchers may read this and it may be quite
misleading...

~~~
j2kun
OpenAI has not exactly had the best reputation with their press releases.

------
hellofunk
Holy cow, the robots are definitely coming. We really are at the ground floor
of a technology that is going to change humanity, I am certain of that.
Changes greater than any changes we've seen before.

~~~
deviationblue
Well, if you're thinking sentient, AI beings, then no-- they are still a long
way off. Unless we can give any meaning behind why a robot should do
something, for example, have and use this kind of dexterity, it's all
mechanical tricks. Cool tricks, nonetheless.

~~~
criddell
I agree with the person you replied to but I'm not thinking about intelligent
machines. I'm just thinking about the automation of everything. Look how close
we are to self driving cars without needing a sentient robot behind the wheel.
Mechanical tricks are going to eliminate the jobs of a lot of people.

~~~
hk__2
> Look how close we are to self driving cars without needing a sentient robot
> behind the wheel. Mechanical tricks are going to eliminate the jobs of a lot
> of people.

Let’s say we’re close to self-driving cars, i.e. it’ll happen in 10 years or
so. How much will it cost? How much will the maintenance cost? How many years
will be needed until everybody owns a self-driving car? Unless more than a
handful of people have that kind of car you won’t kill a lot of jobs.

------
tomxor
I guess someone has to be the negative one: I can't help feeling its route to
the correct face looks entirely accidental (and I don't mean that in a good
way)... I'm sure it's "learned" some methods, but they don't look that
efficient, reliable, purposeful or controlled. In a more noisy and dynamic
environment I'd expect them to fail. Granted, it's possible these could be more
due to training conditions than an inherent limitation of the underlying
model.

~~~
superfx
It looks that way because they're moving rapidly from one face configuration
to another. But there's no way that's happening at random. I would guess that
even just holding the cube constant in a dynamic grip is quite difficult.

------
andreyk
Link to paper ( why no Arxiv :/ ):
[https://d4mucfpksywv.cloudfront.net/research-covers/learning...](https://d4mucfpksywv.cloudfront.net/research-covers/learning-dexterity/learning-dexterity-paper.pdf)

TLDR (quick-ish skim, feel free to correct) they train a deep neural network
to control a robot hand to choose desired joints state changes (binned into 11
discrete values; eg rotate this joint by 10 degrees) for a 20-joint hand given
low-level (non-visual; so, current and desired 3D orientation of the object
and exact numeric state of the joints) input of the state of a particular
object and the hand. They also train a network to extract the 3D pose of a
given object given RGB input. All this training is done in simulation with a
ton of computation, and they use a technique called domain randomization
(changing colors, textures, friction coefficients, and so on) to make
these learned models pretty much work in the real world despite being trained
only in simulation.
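
The binned action space in the summary above can be sketched as follows. The 20 joints and 11 bins come from the summary; the 10 degree maximum step and the linear mapping from bin index to angle change are assumptions for illustration, not the paper's exact scheme.

```python
from typing import List

NUM_JOINTS = 20       # from the summary above
NUM_BINS = 11         # from the summary above
MAX_DELTA_DEG = 10.0  # assumed largest per-step joint change


def bin_to_delta(bin_index: int) -> float:
    """Map a bin index in [0, NUM_BINS - 1] to a joint-angle change in degrees."""
    if not 0 <= bin_index < NUM_BINS:
        raise ValueError(f"bin index {bin_index} out of range")
    center = (NUM_BINS - 1) / 2  # the middle bin means "no change"
    return (bin_index - center) / center * MAX_DELTA_DEG


def decode_action(bins: List[int]) -> List[float]:
    """Turn one discrete choice per joint into per-joint angle changes."""
    if len(bins) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} bins, got {len(bins)}")
    return [bin_to_delta(b) for b in bins]


# Example: hold every joint still except the first, which moves by the maximum step.
action = [10] + [5] * (NUM_JOINTS - 1)
deltas = decode_action(action)
```

With this layout the network picks one of 11 classes per joint, so the policy output is 20 small classification problems rather than 20 continuous values.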

It's pretty cool work, but if I may put my reviewer hat on, not that
interesting in terms of new ideas - still, it's cool OpenAI is continuing to
demonstrate what can be achieved today with established RL techniques and nice
distributed compute.

~~~
Eridrus
It's pretty amusing/amazing how well domain randomization works, but it seems
like they train the pose detector in the real world and not in simulation:

"To transfer to the real world, we predict the object pose from 3 real camera
feeds with the CNN, measure the robot fingertip locations using a 3D motion
capture system, and give both of these to the control policy to produce an
action for the robot."

------
bambax
> _a human-like robot hand_

But why?? Why should robots' hands resemble human hands? They could have any
number of fingers, or tentacles, or magnets, why should they be like human
hands??

It seems "AI" really means "as close as possible to human behavior", even if
we're not really that clever in said behavior.

Also, human intelligence being at least debatable, it's not obvious that the
obsessive imitation of humans is the best way to attain "AI".

~~~
bytematic
We have built this world for human hands so that has become the best shape
overall. It does depend on the situation but I'm guessing for many, an
improved human shape is best.

~~~
mattigames
Yep, backwards compatibility, not just hands but bodies overall, if we ever
make a robot that can drive existing cars it will pretty much resemble a human
body, and our cities were created for human bodies/interactions (walking up
stairs, etc)

------
dclowd9901
The hand itself is an incredible piece of machinery.

------
shady-lady
Has the pricing come down on these robotic hands? Anybody have ballpark cost
for the Shadow Dexterous Hand - 100k, 300k?

~~~
elsewhen
Pricing on robotic hands:
[http://www.androidworld.com/prod76.htm](http://www.androidworld.com/prod76.htm)

------
preparedzebra
This is a great example of why AI innovation is not moving at the pace we are
told to believe. This is using the same basic algorithms we've known about for
decades, just more compute and differently formulated problems. We need a
paradigm shift!

------
habosa
Any comments on why it seems to basically not use the middle finger at all?

~~~
Menerve
I find it strangely correlated with the way the camera is set up. If it uses
the middle finger then the camera might not correctly see the cube's face. You
can see it using it as a last resort. But I don't see why this would matter
in the simulation phase.

------
Symmetry
That's very impressive. Robotic grasping is getting pretty good[1] but in-hand
manipulation is a whole 'nother kettle of fish and this is really exciting.

[1] He said, tooting his employer's horn.

------
Giho
They should set up an accelerometer and gyroscope in each fingertip instead of
pressure sensors. Could then maybe control without a camera.

~~~
sgillen
They already have a good estimate of the finger tip pose from the angular
measurements of all the internal degrees of freedom (presumably from angular
encoders?). And I'm not sure any IMU small enough to fit into the fingertip
will be accurate enough to provide really useful additional pose information.

------
kylek
I'd like to see it roll a coin on its knuckles. Or maybe some card tricks.

------
sidcool
I immediately thought of the robotic arm from Terminator 2. It's pretty cool.

------
ryanmercer
Terrifyingly amazing.

------
tevlon
This is a perfect example of how AI is taking over the world by storm. I don't
know how people don't realize that there will be no jobs left for billions of
people. Yes, billions. Not Millions.

I don't quite get the "New Jobs will be created" fallacy.

Let me explain: What is a job? An abstract way of looking at it: A job is
something that requires a set of skills to accomplish a task. What most
politicians don't get: Researchers like OpenAI teach machines SKILLS not jobs.

A little thought experiment: Let's say humans are capable of 100 skills.
Skills can be anything from: driving, seeing, hearing, reading, walking,
carrying, drawing etc.

Usually, a low-paying job requires little to no training. For example: Someone
in a warehouse that picks the stuff you have ordered. The skills that are
required are: walking, picking and using a device. A high-paying job usually
requires more skills and/or experience.

We train machines to see better, hear better, sort faster etc. Any new job
will require some sort of skills out of the set of skills that can be trained.
But the moment you create this job, it will be automated, because a machine
can do it better and faster.

We need to address this now, otherwise I don't see a bright future for the
generations to come.

~~~
visarga
> I don't quite get the "New Jobs will be created" fallacy.

Nobody 20 years ago would have imagined the job of mobile app developer. The
mobile phone replaced many devices and probably many jobs in manufacturing,
but also created new domains that we couldn't have imagined and empowered
people in the developing world (and elsewhere).

> We need to address this now, otherwise i don't see a bright future for the
> generations to come.

You look at things the wrong way. Humans have always had a job which can't be
taken away by corporations - the job of caring for oneself and one's needs. If
we don't have corporate jobs, then we can become self reliant at individual,
community and country level and find ways to support ourselves. We can build
houses, teach children, provide medical care and many other things with
jobless people for jobless people. We can even use automation for our own
benefit, like we do with open source software.

~~~
wvenable
What you fail to address is that the success of software is that it does so
much more with so much fewer people. A mobile app developer might be a new job
but it actually replaces (indirectly) the jobs of dozens to hundreds of
people. That's why it's a success in the first place.

My job as a developer is, in a real sense, to eliminate as much work from
people as possible. If I wasn't doing that, there'd be no point to my job. We
don't just make software for the hell of it.

~~~
visarga
No, what you're failing to understand is that the cell phone has created new
opportunities even for the poorest people of the world. It opens up commerce,
payments, short time borrowing, education, hiring, finding a spouse and many
other things that lead to a successful life. On the whole it was a boon for
humanity - in other words, it was worse for all of us, rich or poor, before it
existed.

~~~
wvenable
The world may very well be a better place and frankly saving people from work
is actually a good thing. But I disagree with the idea that the
computer/mobile/AI revolution is adding more jobs than it takes away. There
are not more app developers than there were factory laborers.

Most people would be surprised that US manufacturing is at the highest level
in history. And they are surprised because manufacturing employment is at the
lowest levels.

Is it good that Americans are not doing highly physically demanding
manufacturing jobs? Sure. But what are the long term consequences of
productivity without people? Every industry is more productive with
drastically fewer people and they're improving on that equation every day. Not
just factories but also white collar office work too. When self-driving
vehicles become the norm, a massive amount of people will no longer have jobs.

------
techVentureStar
Nice work. I see a future of robots ruling homo sapiens vividly.

~~~
kochikame
Dexterity is their secret weapon!

------
madeuptempacct
Let's be honest - the only thing we care about is "are the programming jobs
safe?!" Well, are they?

P.S. I am trying to help a newer dev atm, and I realize I always have only one
question for them while basically doing their work "What. the. hell. are. the.
business. requirements?"

Suddenly, this makes me feel much more like a business analyst than a code
monkey, though being a decent code monkey is definitely a pre-req.

~~~
hk__2
> Let's be honest - the only thing we care about is "are the programming jobs
> safe?!" Well, are they?

As a software developer myself, I’d love to ever see a world where my work is
not needed anymore. Wouldn’t that be awesome to have worked so hard toward
automation that you can automate your own job?

