Hacker News
OpenAI's blog post about "solving the Rubik's cube" and what they actually did (twitter.com/garymarcus)
111 points by bratao on Oct 20, 2019 | 50 comments

How is OpenAI misleading? The entire OpenAI post is about the physics of the problem (different cube sizes, materials, etc.)

When I saw the press release I understood the demonstration was about hand dexterity and not trying to use AI to solve a Rubik’s Cube pattern. That would be overkill IMHO. You don’t need a neural net to solve it and I never thought OpenAI was trying to mislead.

Side note: One of the commenters on the Twitter thread referred to Marcus as the James Randi of AI in jest. I worked for Randi for several years handling the Million-Dollar Paranormal Challenge and investigating unusual claims. I can tell you a lot about misleading claims...

Because this has become a pattern in OpenAI's MO.

Take a very impressive research achievement (a large LSTM for byte-level language modeling; GPT-2). Present it in a hyperbolic manner ("we've discovered a single neuron that captures sentiment", "the full GPT-2 is too dangerous to release"). Wait for the press to eat it up, and if the technical press calls them out on misleading claims, even better, because it'll get even more traction. Wait for defenders to show up stating that the original research achievement was impressive. Make no effort to clarify the misleading claims.

The misleading word here is "solve", which can have two meanings: to derive a solution for a Rubik's cube, or to manipulate a Rubik's cube into a solved state. The casual reader absolutely assumes the former (which also reads as a challenging, intellectual task), whereas the technical achievement here is the latter. But of course, a press release titled "Solving Rubik's Cube with a Robot Hand" sounds much more impressive than "Manipulating Rubik's Cube with a Robot Hand".

I say this as a person who has benefited from their great research output and models: please stop playing this terrible PR game. You do your research work a disservice by muddying the waters like this.

"Too dangerous to release" itself seems like a somewhat misleading summary of OpenAI's position? Figuring out how to do AI research responsibly is a core part of their mission, not a PR game they're doing just to make headlines.

GPT-2 had no LSTM; it's a Transformer.

> Because this has become a pattern in OpenAI's MO.

I wonder if this is because it's Elon Musk's MO. Exaggeration and hyperbole seem to be what he does.

Elon Musk hasn't been involved in OpenAI for a while now.

And landing rockets on boats before refitting and relaunching them into orbit, but yeah... totally hyperbole.

I think he's more likely referring to Elon Musk's predictions about Autopilot and Tesla. Elon Musk has been promising "full self-driving" capability for years now (by which I mean he has already missed his first public deadline by years). I also find The Boring Company to be a little overly ambitious, as well as many other public comments Musk has made over the years.

I don't think there are many people saying Elon Musk hasn't done some absolutely amazing things. But he's famous for being late to deliver on just about everything, and some of his comments about self-driving cars are considered truly pie-in-the-sky by experts.

More like a parabola, right?

> How is OpenAI misleading? ...When I saw the press release I understood the demonstration was about hand dexterity and not trying to use AI to solve a Rubik’s Cube pattern.

Uh... the headline of the article was literally "Solving Rubik’s Cube with a Robot Hand".

Kudos to you for apparently reading past the headline and understanding that the demo was actually about hand dexterity. But come on. The average layman reading the headline is going to believe that what's newsworthy is that OpenAI solved a Rubik's cube.

If OpenAI didn't intend for that to be the case, they should have used different words. For example, "Using a neural net to achieve a breakthrough in robot hand dexterity" would actually describe what the demo is about.

Unfortunately that is about 1000x less interesting than "AI robot solves Rubik's cube". Which is why OpenAI didn't choose that headline, and why people like Gary Marcus are criticizing them for being misleading.

> Kudos to you for apparently reading past the headline and understanding that the demo was actually about hand dexterity. But come on. The average layman reading the headline is going to believe that what's newsworthy is that OpenAI solved a rubik's cube.

I criticize OpenAI's aggressive marketing as much as the next guy, but I don't actually feel this was one such case. I only ever assumed from the headline + video that they were using neural nets to control the hand, not to solve the Rubik's cube.

I'm not an average layman, though, so YMMV.

> Uh... the headline of the article was literally "Solving Rubik’s Cube with a Robot Hand".

Which part of that statement isn't true?

You read that headline one way. Myself another. When someone actually reads the article (a lost art these days), they get the full context in case there was any confusion.

I'd argue that calling OpenAI misleading (in this instance) is even more misleading. From Marcus's Tweet I assumed that he had problems with OpenAI's actual claims. Nope. He just didn't like the headline.

More like 1/1000x; there are a bunch of Rubik's cube solvers on GitHub.

Dexterity to manipulate a Rubik's cube is really incredible, especially through the entire sequence of solving it. It's a very well-chosen dexterity challenge.

This whole criticism is bizarre

The claim "I solved a Rubik's cube using a computer" is totally uninteresting.

The claim "I solved a Rubik's cube using a neural network" is different, and much more interesting than the first claim.

It really isn't. It's quite obviously possible and not worth the effort if you happen to know anything about Rubik's cubes and neural networks.

In fact, I would be much more interested in "I solved a Rubik's cube using a computer" because you can then talk about the mathematics of a Rubik's cube (presumably the algorithm used is a human-comprehensible one), while for "I solved a Rubik's cube using a neural network" the only sensible question is "and how badly did you have to overfit to do that?"

You can assume the "computer" solution is overfitted, too. There's no reason to, because general methods are well-known, but it's even easier to just hardcode a cube and the list of moves that solves it than it is to implement one of those general methods.
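A toy sketch of the "hardcode a cube and the list of moves that solves it" point (the permutations below are illustrative stand-ins, not real cube moves): if you model a state as a permutation, "solving" a known scramble is just replaying the inverses of the scramble moves in reverse order.

```python
# Toy model: a "cube state" is a permutation tuple; a "move" is a
# permutation applied to it. These 6-element moves are stand-ins for
# real 54-facelet face turns, kept tiny for illustration.
def apply(state, move):
    return tuple(state[i] for i in move)

def inverse(move):
    inv = [0] * len(move)
    for i, j in enumerate(move):
        inv[j] = i
    return tuple(inv)

solved = tuple(range(6))
m1 = (1, 2, 0, 3, 4, 5)
m2 = (0, 1, 2, 4, 5, 3)

scrambled = apply(apply(solved, m1), m2)

# The hardcoded "solution": inverses of the scramble, in reverse order.
state = scrambled
for mv in [inverse(m2), inverse(m1)]:
    state = apply(state, mv)

assert state == solved  # back to the solved state
```

That's the whole trick: for a known scramble there is nothing to search for, which is why hardcoding is easier than implementing a general method.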

Why assume that "I solved a Rubik's cube using a neural network" guarantees that I cheated?

Yeah, because what they did is actually 1000x less interesting than actually solving the task, not just that it "sounds" less interesting.

It really feels to me like Gary Marcus is being a pedantic contrarian, particularly after reading that reddit thread.

From what I can gather, he's a proponent of some other approach to AI (symbolic, maybe?) with a long-running grudge against deep learning.

Both of them (openai and marcus) routinely make more noise than needed

I felt that the communication in the PR was not clear. On top of this, the cube was heavily instrumented.

(I work at OpenAI.)

Per https://news.ycombinator.com/item?id=21306452, we have results for both instrumented and uninstrumented cubes!

Our cube variants are listed in the blog post ("Behind the scenes: Rubik’s Cube prototypes"), and results are in the paper in Table 6.

According to Table 6, the success rate for the uninstrumented cube is 20% when applying half of a fair scramble and 0% for a full scramble. Right?

Yes — but note that "success" is a not-very-granular metric (which we set for ourselves!), as it means fully unscrambling the cube without dropping it once.

To be clear, that means executing up to 100 moves without a drop. If you put the cube back in the robot's hand, without any additional effort it'll continue solving unfazed.

There were no uninstrumented cubes. What OpenAI claims to be a "regular Rubik's cube" is in fact not regular; it has its color stickers cut. OpenAI couldn't get a regular Rubik's cube working.

The video they released hinted at instrumenting but I thought it was for validation purposes only. Interesting, thanks.

Can you point to some particularly amusing writeups, please? I've found the existence of the prize has helped persuade some of my less skeptical friends against pseudoscience.


There used to be a newsletter called SWIFT where Randi would write up various attempts. I left the organization over a decade ago and have no idea what happened to those articles since then. The current website only goes back five years. Yikes.

Relevant thread on /r/MachineLearning: https://www.reddit.com/r/MachineLearning/comments/dkd4vz/d_g...

Greg Brockman commented there:

> We ping journalists to ask them to correct factual errors in reporting when we see them (though they may not always agree with our corrections). For example, the Washington Post article (https://www.washingtonpost.com/technology/2019/10/18/this-ro...) feels misleading, so we've emailed them and linked them to the relevant sections in our blog post (namely, that we use Kociemba's algorithm as you mention).

If they had to instrument the cube internally, that makes the result much less interesting. And if it's only succeeding 20% of the time, that's not good either.

Robot manipulation is hard. There are lots of systems that work some of the time. Few work well enough in an uncontrolled environment to be useful. Amazon is still looking for a robot picking system, and nothing works well enough yet.

(I work at OpenAI.)

We also have results with an uninstrumented cube (as described in section 7 in the paper, or "Behind the scenes: Rubik's Cube prototypes" in the blog post), which are slightly weaker (see Table 6 in the paper). The 20% number is an example of critics cherry-picking their facts — the success rate is 60% under normal conditions, but 20% with a maximally-scrambled cube (which would happen randomly with probability less than 10^-20).

Also note: success here means that the robot was able to perfectly unscramble the cube — which requires perfectly performing up to 100 moves — without dropping it once. What it means, in practical terms, is that you need to wait for a long time in order to witness even a single failure. If you pick up the cube and place it back in the hand, it'll get right back to solving.

Note that like with OpenAI Five, the success rate is more a function of how far we've had time to push the system than of something fundamental. We're not building commercial robots; we're in the business of making new AI breakthroughs. So the next step for our robotics team will be finding a new task that feels impossible today, and seeing what it takes to make it no longer feel impossible.

From the blog post:

> Our method currently solves the Rubik’s Cube 20% of the time when applying a maximally difficult scramble that requires 26 face rotations. For simpler scrambles that require 15 rotations to undo, the success rate is 60%.

And looking at the data in http://cube20.org/qtm/ : for a random cube, the probability of getting a maximally scrambled cube that needs 26 quarter-turns is about 10^-20, but most (~75%) of random cubes need 20 or 21 quarter-turns. Most algorithms don't use the most efficient path to solve the cube, so if the best path has 20 steps, the actual path will have a hundred or more steps.

For the cube to be solvable in 15 steps, it must start nearly solved. That's not what people usually call "normal conditions".
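For reference, the total number of reachable 3x3x3 positions can be computed directly from the standard counting argument; with only a handful of positions sitting at the 26-quarter-turn maximum, that's where the ~10^-20 ballpark comes from (quick stdlib sketch):

```python
from math import factorial

# Total reachable states of a 3x3x3 cube:
# 8 corners (permutations * 3 orientations each, one orientation constrained),
# 12 edges (permutations * 2 orientations each, one orientation constrained),
# divided by 2 for the parity constraint linking corner and edge permutations.
total = (factorial(8) * 3**7 * factorial(12) * 2**11) // 2
print(total)  # 43252003274489856000, i.e. ~4.3e19

# Only a handful of positions are known at the 26-quarter-turn maximum,
# so a random scramble lands on one with probability on the order of
# 1e-19 or less, consistent with the "10^-20" ballpark above.
```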

Where do you get these numbers from?

Under "normal conditions", my reading is that the success rate is 20%, and 0% for a maximally scrambled cube. I don't feel the Giiker Cube can be considered "normal conditions".

> What was learned was object manipulation, not cube solving

Which is almost certainly a harder problem

I wonder if tech people are assuming this is super obvious (the object manipulation is the hard part; the cube solving part is just an extraneous detail since that's been solved since forever) while communicating about this project, and it just happens to be less obvious to non-tech people and therefore accidentally misleading. I could guess that non-tech people when seeing the project barely notice the robotic hand part ("robots have been in popular culture for decades, I guess this is just one of them"), figure that thinking about a Rubik's cube is a smart-person thing, and therefore it's a new smart AI thing too now. The whole case seems like a nice example of how it's necessary to think about how to present the assumptions and goals of a project.

The point is it wasn't ad hoc object manipulation + cube solving; the hand had its hand held regarding which macro-action it was supposed to take next.

The PR I've seen about it doesn't make this clear and is happy to leave the ambiguity because it means more publicity, rather than demonstrating a better understanding of editorial ethics.

As much as I'd love to give OpenAI the benefit of the doubt, I have to agree that they are purposefully being misleading to generate more interest rather than be as straight-forward as possible about their accomplishment. After the confusion over whether or not they're really "open" after taking MS investment, and what I personally consider terrible mismanagement of the GPT-2 announcement and release, I've just lost faith in the organization to be honest. They never directly lie, they always ambiguously lie, so I rarely even bother to explain to people why I don't like them and just recommend my friends look for work elsewhere.

> I've just lost faith in the organization to be honest

:( sorry to hear that.

Note that we have a publicly-available Charter which spells out how we operate: http://openai.com/charter. We all use it to guide our actions, and we have not changed a single word in it since publication. I hope that as time goes on, you'll increasingly see the consistency between our actions and the words in the Charter, and that we'll be able to win back your support.

FWIW, just watching you interact with your critics on here has done a lot for you to earn my respect. I hope I've been too quick to judge and that OpenAI is able to make me reconsider my stance in the future.

The part they didn't do was the trivial part. My phone has an app that solves Rubik's cubes from a photograph.

That's correct, and also not the concern.

I think the blog title's use of "solving" is misleading, since OpenAI didn't create the solving algorithm. It's a neat but limited use of transfer learning in an environment that requires heavy instrumentation. I could tell this immediately from the photo. What I didn't know was that the Rubik's cube also had to be modified and that the system had a high failure rate. While I'm sure this is state of the art in something, it does seem overblown.

This tweet by Woj Zaremba is bewildering. If we all agree it's about physics and manipulation, how do you get to claim that a failure in manipulation doesn't matter?


> It’s 20% success rate to solve the most difficult configuration of Rubik’s cube. However, on average the success rate is 60%. Moreover, the failure is by dropping the cube. The hand always solves the Rubik’s cube if you put it back the cube after the drop.

I think it just points out that the hardest part is not dropping the cube. The hard problem they solved completely is orienting and manipulating the cube in an intended way, provided they managed not to drop it.

Determining how to manipulate the cube to get it to a solved state wasn't part of the problem, as it is trivially easy for a machine, or for any human who has learned how to solve a Rubik's cube.

The takeaway from that comment is that it's not like the robot is just unable to solve certain cubes or has an equipment failure, etc.

If anyone reads Gary's new book Rebooting AI: his whole thesis is that current neural nets are too narrow to create really good generalized intelligence.

And here OpenAI made a claim about solving Rubik's cube with a robot hand. From the headline, one would assume they found a general system that handles Rubik's solving from both the physical and the algorithmic perspective. In actuality, OpenAI made a demo with a very specific Rubik's cube that broadcast its state over Bluetooth (not pure vision like humans use). The solving algorithm was pre-programmed, not learned. Only hand manipulation of that specific cube was learned.

And we don’t know how general the hand manipulation was. Does it work with different sized Rubik’s cube? What about a non-bluetooth one? Can the same system also fold clothes? Assemble lego blocks into some fixture?

Basically, OpenAI has taken a fuck ton of VC funding, so the headlines are hyperbolic when reported. Whether intentionally or not, I don't know. To the layman it's sending the wrong message and creating unnecessary fear.

OpenAI, DeepMind, AAMFG need to always explicitly say how narrow their AI is when they make claims, i.e., here are 10 things it's good at, and these are the boundaries; if you change things slightly in the following ways, it will fail.

Solving it analytically seems to use deep nets and Monte Carlo tree search [1]. Some additional structure to deep learning and you're done. Manipulation is the harder problem.

It's like complaining that a juggler can't count to three.

[1] https://arxiv.org/abs/1805.07470

> instrumented cube as vision has not been adequately solved.

This speaks to Elon's stubbornness in avoiding the use of lidar.


I mean, sure, what better way to present your arguments than a picture on Twitter, asking people to zoom in.

That is it, I'm adding a uBlock filter that blocks Twitter thread posts here.

OP violates HN guidelines about charitable interpretations and middlebrow dismissals.
