
Solving the Rubik’s cube with a robot hand - gdb
https://openai.com/blog/solving-rubiks-cube/
======
daenz
>We’ve trained a pair of neural networks to solve the Rubik’s Cube with a
human-like robot hand. The neural networks are trained entirely in simulation

Simulated training is so cool. Related, is anyone interested in a plugin for
Blender that allows you to easily build physically-accurate simulation
environments for robots and then apply reinforcement learning to the virtual
robots? I have a hodge-podge amount of code for doing exactly this, and I'm
curious if anyone else would be interested in it?

~~~
sgillen
Personally I think it would be useful to focus on the "easily build physically
accurate simulation environments for robots" using blender part. IMO it makes
the most sense to try and make this created environment into an OpenAI gym
environment so that way most of the existing RL algorithms can be applied to
the robots. If you do want to spin your own RL this approach does not stop you
from doing so.
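
To make the suggestion concrete, here is a minimal sketch of the classic Gym-style interface (`reset()` returns an observation, `step(action)` returns `(observation, reward, done, info)`) that a Blender-exported simulation could be wrapped in. Everything here is hypothetical: `ToyReachEnv`, the 1-D "robot", the reward, and `greedy_policy` are stand-ins for illustration, and the class deliberately does not import `gym` so it runs on its own.

```python
import random


class ToyReachEnv:
    """Hypothetical 1-D 'robot' that must move its end effector to a target.

    Mirrors the classic Gym calling convention without depending on gym.
    """

    def __init__(self, target=5, max_steps=20, seed=None):
        self.target = target
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start each episode from a random position left of the target.
        self.position = self.rng.randint(-3, 3)
        self.steps = 0
        return self.position

    def step(self, action):
        # action: 0 = move left, 1 = stay, 2 = move right
        self.position += action - 1
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -float(distance)  # closer to the target is better
        done = distance == 0 or self.steps >= self.max_steps
        return self.position, reward, done, {"distance": distance}


def greedy_policy(obs, target=5):
    # Hand-written stand-in for a learned policy.
    if obs < target:
        return 2
    if obs > target:
        return 0
    return 1


def run_episode(env, policy):
    # The same loop works for any environment exposing reset()/step().
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return obs, total_reward
```

The point of the design is that `run_episode` never looks inside the environment, so an off-the-shelf RL algorithm written against this interface would not care whether the physics came from Blender or anywhere else.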

AFAIK there are a lot of publicly available RL algorithms out there, but not
many (any) Blender-like interfaces to make physically accurate simulations.

------
ZhuanXia
Gwern on the luck of the last mover:

"Launching too early means failure, but being conservative & launching later
is just as bad because regardless of forecasting, a good idea will draw
overly-optimistic researchers or entrepreneurs to it like moths to a flame:
all get immolated but the one with the dumb luck to kiss the flame at the
perfect instant, who then wins everything, at which point everyone can see
that the optimal time is past."

Robotics has been a money pit for startups and corporations for a long time.
Think of the billions Toyota has spent on home robotics research, to little
avail.

But at some point it won't be. Some entity will "kiss the flame" at the right
moment. The wealth they create will be beyond any company ever, by an almost
incomparable margin.

~~~
carapace
Part of the problem is that the popular conception of robots tends to be a
kind of fetish. What I mean is, the things that are easy for robots to do are
already addressed. You can buy off-the-shelf robots that work really well.
They're not cheap though.

But those don't _look_ like "robots", they look like arms with tools on the
end of them.

The kind of humanoid servant robot from books and movies, however, is still
pretty much fictional. The required capabilities are mostly _really_ hard,
even after you factor in the recent advances in ML et al.

I remember when Sony made that little humanoid robot that danced. I was like,
"Big deal! I _like_ to dance. Make a robot that does the dishes."

\- - - -

To make it big with robots ( _per se_ as opposed to just building an automated
factory, or toys) you have to find the economic niches.

~~~
DuskStar
> But those don't look like "robots", they look like arms with tools on the
> end of them.

> I was like, "Big deal! I like to dance. Make a robot that does the dishes."

From these comments, I think you're missing a really huge category of robots -
appliances. Why does a dishwasher or laundry machine not qualify as a robot,
after all?

~~~
carapace
I sometimes do call them robots, but you could exclude them on the basis of
lack of mobility, or, better yet, lack of decision making. (Although a friend
of mine has a laundry dryer with a moisture sensor.)

In a sense, anything with a PID controller or even just a _governor_ could be
considered a "robot", or at least "automation", eh?

[https://en.wikipedia.org/wiki/Centrifugal_governor](https://en.wikipedia.org/wiki/Centrifugal_governor)
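
For anyone who hasn't met one, a rough sketch of a PID controller holding a speed setpoint, loosely in the spirit of a governor. The gains, the setpoint, and the first-order "engine" model in `simulate` are all made-up assumptions for illustration, not taken from any real device.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        # Accumulate error over time (integral term).
        self.integral += error * dt
        # Rate of change of the error (derivative term); zero on the first call.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=200, dt=0.1):
    # Hypothetical plant: speed rises with drive and decays with drag.
    pid = PID(kp=2.0, ki=1.0, kd=0.0, setpoint=10.0)
    speed = 0.0
    for _ in range(steps):
        drive = pid.update(speed, dt)
        speed += (drive - 0.5 * speed) * dt
    return speed
```

With these assumed numbers the simulated speed settles close to the setpoint. The "decision making" is just three weighted terms, which is sort of the point.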

------
sytelus
Many caveats but impressive progress in manipulation, especially sim2real:

\- Only 20% of attempts successful on hardest configs with 26+ moves

\- Solving steps are not generated by RL (but could be[1])

\- Cube is modified internally to transmit additional state via bluetooth

\- Highly calibrated and fine tuned environment+MuJoCo based sim to match
simulation to reality as much as possible

\- OpenAI Five algorithm is pretty much reused as-is

\- Cumulative training time = 13 thousand years, same order of magnitude as
the 40 thousand years

\- 32+64 V100 GPUs per training cycle

[1] [https://arxiv.org/abs/1805.07470](https://arxiv.org/abs/1805.07470)

------
hinkley
Some of Vernor Vinge's books deal with the 'alien' in alien intelligence in
ways that were quite illuminating/shocking for me at the time. They weren't
just humanoids with animal instincts. He created intelligent spiders that were
believable. And the only sympathetic treatment of a hive mind I've yet
encountered (Card's are pale in comparison)

But one of my favorite inventions of his was a creature that had somehow
evolved wheels. With veins and nerves and such there is hardly a creature on
earth that can rotate a limb much farther than 200°, and the ones that
can, like owls, we treat with a certain reverence.

Developing an artificial wrist that can spin arbitrarily would be, I'd think,
a quite compelling compensation for someone having to use a prosthetic arm. It
would also make for some wicked Rubik's solving skills. I wonder how
proprioception would deal with that though...

~~~
chewxy
On that note, I also highly recommend Adrian Tchaikovsky's Children of Time.

------
minimaxir
I appreciate the plush giraffe perturbation. Reinforcement learning needs to
account for all eventualities, including giraffes.

------
perl4ever
"What people don't appreciate, when they picture Terminator-style automatons
striding triumphantly across a mountain of human skulls, is how hard it is to
keep your footing on something as unstable as a mountain of human skulls."

...I'm not feeling so confident now.

------
gapo
It's great that OpenAI has continued to exist as a technological organization
with no clear revenue expectations. At the same time I am not sure how long
they can sustain doing what they are doing OR whether there is this new found
feasibility for private research organizations to exist in this space provided
they produce clear high-quality output like OpenAI is doing.

~~~
hos234
I'll believe they are doing something useful the day they set up a Burger shack
opposite a McDonalds and outcompete on inventory or queuing or something
practical. Nobody in industry cares about Rubik's cubes and Go.

~~~
sytelus
I think you are underestimating the power of such progress. Look around at all
the objects you have, from iPhones to laptops to pencil sharpeners. They were
made in some factory, and very likely human hands played some role there. Now
imagine you can throw in $100 human-like hands that operate as dexterously as
a human's, guided by cameras just like a human, without taking rest or
vacations or requiring medical insurance. What do you think the impact of this
will be? People call it Industrial Revolution 4.0. It will change the world
beyond billions or trillions of dollars. The investment in places like OpenAI
is the bargain of a lifetime.

------
jefft255
As a roboticist, it's really clear to me that this sort of transfer in a
controlled environment is hard but doable. I think it's already been
demonstrated many times and I'm not that convinced that there is anything new
in there except more GPU + fancier robot.

I'll be impressed by RL if a) they manage to do sim2real in open environments,
think Doom -> office building or b) they manage to get data efficient enough
that sim2real is still necessary but you don't have to do real data collection
with 10 parallel robots for days on end.

As someone in mobile robotics as opposed to pure manipulation, I read these
papers and I'm like: "How the hell am I supposed to get this to work on a
robot moving in the real world???". I don't see anyone being close to this
right now.

~~~
ilaksh
As a roboticist what do you think of my theory that what's missing is more
biomimetic artificial muscles with greater power-to-weight ratio?

~~~
jefft255
Honestly I don’t know; you’re out of my area here. I’m really into
perception/slam/planning. Greater power to weight ratio is always good. I
never really cared about biomimetism for the sake of it. If the way to get
better power to weight ratio is biomimetism then great but if you can get it
without trying to imitate nature then it’s great too.

------
est31
This is cool. I wonder about the hardware. Why does the mount for the hand
have a fan? Does it contain the inference computer? Power transformers?

------
askytb
Does anyone have any experience with soft robotics? For example these guys:
[https://www.youtube.com/watch?v=X6CRe2ieuYE](https://www.youtube.com/watch?v=X6CRe2ieuYE)
advertise their gripper as supposedly being able to handle weight/size variety
with no training at all, just with the use of different materials in the
gripper

------
breck
A few weeks back I was at a program synthesis conference and gave a short
lightning talk where I said deep learning so far has been used to solve the
easy computer chess, and the easy computer go, etc...not to take away from
those accomplishments at all, I was just saying that having a robot beat
grandmasters at real world physics chess where you have to move the pieces
with many degrees of freedom is a harder problem, but trivial for a 7 year
old.

I thought we were still a decade away from having machines beat humans at real
chess and real go, but this makes me think maybe it’s just 5 years out. Very
impressive.

~~~
PeterisP
Manipulating chess pieces is trivial for e.g. a pick and place robot, which
are quite widely used for industrial activities that are quite close to moving
chess or go pieces.

In particular, far from being "just 5 years out", robot hands that execute
chess moves have been already demoed many times, including by hobbyists with
very limited resources. Reliable computer vision was a bit trickier a
decade ago, but that's not a problem now. Having a robot beat grandmasters at
"real chess" (i.e. the same thing as "virtual chess" but also manipulating the
physical pieces) would not be considered a hard problem nor a valuable
achievement, it's a nifty parlor trick that could make a cute demo 10 years
ago, and could be used as a homework project for engineering students nowadays
- however that's likely to be two separate projects, as the mechanical
manipulation and visual recognition are likely to be different skillsets and
thus different students.

Here's a random article from 2010
[https://newatlas.com/chess-terminator-robot-takes-on-kramnik...](https://newatlas.com/chess-terminator-robot-takes-on-kramnik-in-match/16996/)

Here's a hobbyist project from 2013
[https://www.robotshop.com/community/blog/show/a-chess-playin...](https://www.robotshop.com/community/blog/show/a-chess-playing-robotic-arm)

Here's a tutorial from 2017 on how to make the chess piece manipulation
yourself -
[https://www.youtube.com/watch?v=NefiXZ7BCsE](https://www.youtube.com/watch?v=NefiXZ7BCsE)

Here's a student project, replacing the vision with sensors -
[https://www.instructables.com/id/Chess-Robot/](https://www.instructables.com/id/Chess-Robot/)

~~~
breck
Great links, thanks very much for bringing me up to speed on this domain. The
Chess Terminator is the sort of thing I'm talking about.

> Manipulating chess pieces is trivial for e.g. a pick and place robot,

Perhaps in a sterile, well-known, controlled environment; but not in a real
world, novel, potentially adversarial environment.

I guess my point about AGI is that I would bet a 7-year-old could currently
beat the best AI in the world at real, physical chess, played in a randomly
chosen park. Kids can quickly figure out strategies in the real world, which
has more degrees of freedom than the digital world of computer chess.
In other words, perhaps a kid may figure out that if they place a piece in a
certain position, the computer is unable to "see" or "execute" the desired
move, perhaps because of the angle of the sun or some line of sight obstruction.
While an adult might be generous and offer help, a lot of children will take
advantage of the robot's weaknesses.

~~~
PeterisP
IMHO that's not chess anymore, as that explicitly violates the rules of
chess - if you manage to get an advantage by distracting your opponent and
obscuring line of sight to the pieces, that's simply violating the laws of
chess (specifically, FIDE "12.6 It is forbidden to distract or annoy the
opponent in any manner whatsoever.") and appropriately punishable by the
arbiters.

Chess is a well-defined, strict game not only from the "on-board" perspective
but also regarding how the opponents can behave - e.g. it's explicitly
specified that if your phone makes a sound during a match, then you lose the
game; the rules of chess IMHO are _exactly_ a sterile, well-known, controlled
environment, and attempting to transform it to a novel, potentially adversarial
environment would generally be a violation of both the spirit and letter of
the laws of chess.

E.g.
[https://en.m.wikipedia.org/wiki/Chess_boxing](https://en.m.wikipedia.org/wiki/Chess_boxing)
is a fine physical, adversarial form of sports, but it's not chess.

~~~
breck
Haha, thank you! I stand corrected. (Next time I play with my nieces and
nephews, I'm going to be stricter about rules :) )

~~~
PeterisP
Hah, I have a niece that will assert that I violate that "forbidden to annoy
opponent" rule because I annoy her simply by existing.

------
imtringued
This is a pretty pathetic result and is damning for the progress of AI.
Instead of focusing on efficiency, AI researchers simply throw more resources
at the problem with the hope that it's enough. The end result after 13000
years of training is a robot hand that can do nothing but solve rubik's cubes
and fails 40% of the time.

------
yCloser
world record one handed is 6.88", average of 5 is 9.48"
[https://www.worldcubeassociation.org/results/rankings/333oh/...](https://www.worldcubeassociation.org/results/rankings/333oh/average)

non-world-class: doing <30" one handed is very doable, anyone can do less than
1 min (yes, if you know how to solve and you trained one handed. ofc not if
you never solved a cube in your life)

that said... I really don't understand how the hand keeps the cube "floating"
around. In one handed the technique is pretty much to keep the cube fixed
holding front/back centers with thumb and index. Something like
[https://www.youtube.com/watch?v=mUF3aPDTO-4](https://www.youtube.com/watch?v=mUF3aPDTO-4)

I understand the achievement, but wow, this solve is HORRIBLE. What did they
train the network with to get this?!

------
throwaway07Ju19
Around 1988, I read a book that claimed the ideal robot hand would have
fingers that repeatedly bifurcate until it has a digit so small it can
manipulate matter at the atomic level. Implausible but fun to think about.

------
ilaksh
I think artificial muscles that are more biomimetic with better
power-to-weight ratios are going to make a huge improvement in robot
capabilities at some point. Especially for humanoids.

------
____Sash---701_
Any YC companies going after the robotics industry?

------
a13n
Would be cool to see it done with two hands (or one), solved faster than the
human world record. It's still pretty clumsy looking.

