The Spider of Doom (thedailywtf.com)
94 points by powertower on Oct 12, 2011 | 21 comments


  > ... management didn't quite see what was wrong with
  > that.  Instead, they told the client to NEVER copy
  > paste content from other pages.
I'm reminded of the story about Feynman exposing the security problems with the combination locks used on safes at Los Alamos and Oak Ridge. Instead of fixing the problem, management just told people not to let Feynman near their safes.

  > He tells the story of being in Oak Ridge, and delivering
  > a report to a colonel there. He reports that the colonel
  > felt himself far too important to have an ordinary safe
  > -- he ordered a special multi-ton safe. Feynman was
  > delighted to discover that this big, important safe used
  > the exact same type of lock as their little safes did,
  > and just to be sure, he took the last two numbers off it
  > while standing in the colonel's office. After the colonel
  > closed the safe, Feynman told him the safes weren't
  > secure, and proved it by opening the safe, then
  > explaining how he did it. He told the colonel that the
  > vulnerability was in leaving the safe open while he
  > worked. "I see, very interesting," replied the colonel.

  > Several months later, Feynman was again at Oak Ridge, and
  > was surprised at all the secretaries telling him, "Don't
  > come in the office! Don't come in here!"

  > It developed that the colonel had immediately sent around
  > a memo asking everyone, "During his last visit here, was
  > Professor Feynman in your office?" Those that answered
  > yes received another memo: "Change your safe combination."

  > That was his solution - Feynman himself was the danger.
  > Meanwhile, of course, people still continued to work with
  > their safes open ...


To me, the problem isn't that authentication depends on cookies and JS. The problem is that the system does the exact opposite of what it should.

An authentication system should allow access if and only if the client presents valid credentials in the expected manner (e.g. sending a cryptographic cookie back with the HTTP request). In other words, the auth system should use the presence of valid credentials as the criterion for access.

Instead, this system used the absence of something as the criterion. Imagine if a secure building did this. Everyone arrives at the security checkpoint, and if the guards don't recognize you, they give you a badge that says "I'm not allowed in." Anyone who's not wearing the badge is allowed in. That's what this website was doing.

Moral of the story: Your authentication system can depend on cookies, JavaScript, and any other technology being enabled on the client side as a precondition for access. If you do that, and someone has disabled the technology, they're locked out, and there's no security breach--just a frustrated user. But your system should never trust the absence of some marker as proof that the client is allowed in.
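
In code, the contrast looks roughly like this. A minimal sketch, assuming a hypothetical request object with a cookies dict and a made-up signing scheme; it is not the site's actual code, just the "presence of valid credentials" idea:

    import hmac

    SECRET = b"server-side secret"   # hypothetical; would be loaded from config

    def is_authenticated(request):
        """Allow access only if a valid, signed session cookie is PRESENT."""
        token = request.cookies.get("session")
        if token is None:
            return False                      # no cookie -> no access
        try:
            user, signature = token.rsplit(".", 1)
        except ValueError:
            return False
        expected = hmac.new(SECRET, user.encode(), "sha256").hexdigest()
        return hmac.compare_digest(signature, expected)

    # The broken site effectively did the opposite:
    # def is_authenticated(request):
    #     return request.cookies.get("not_allowed") is None   # absence == access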


Local legend has it that the guy didn't have a backup, but senior Googler Matt Cutts wrote a custom MapReduce to process the spidering results and make him a tarball of the content GoogleBot recorded as it deleted the site.

Supposedly he sent it with a note along the lines of, "This one's on the house, but we're not doing it again."


> He brought up the root cause -- that security could be beaten by disabling cookies and javascript

Wouldn't the root cause be allowing GET requests to perform destructive actions?


I think the authentication problem is more dangerous. If you fix the GET issue, you're still allowing any savvy stranger to delete your articles by hand.


There are two problems.

1: disabling cookies bypasses security checks

2: a GET request is not side-effect free

The root cause is the combination of both issues.
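
A hedged sketch of fixing both at once, using Flask purely as an illustration (the article doesn't say what stack the site ran on, and delete_page_from_db is a made-up stand-in for the persistence layer):

    from flask import Flask, session, abort

    app = Flask(__name__)
    app.secret_key = "change-me"   # hypothetical; load from config in practice

    def delete_page_from_db(page_id):
        ...  # stand-in for the real persistence layer

    # Fix 1: access requires the PRESENCE of a valid session, checked on
    # every destructive request, not the absence of some marker.
    # Fix 2: the destructive action is only reachable via POST, so a crawler
    # following plain <a href> links can never trigger it.
    @app.route("/pages/<int:page_id>/delete", methods=["POST"])
    def delete_page(page_id):
        if "user_id" not in session:
            abort(403)
        delete_page_from_db(page_id)
        return "deleted", 200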


There is no more security in DELETE than there is in GET. Using DELETE over GET would only decrease accidental changes.


Haha, blame the client for copying and pasting between pages, that sounds fair.

This is also a pretty good case study of why using authentication frameworks is a very good idea.


Also why a GET request should never delete, or possibly even modify, data.


Indeed, although it isn't enough anymore: https://news.ycombinator.com/item?id=3100239


For sure. Combine the 2 mistakes and you're pretty fucked


[deleted]


> Why should a GET never modify data?

Because that's what the HTTP spec says. Wander off-spec, and you will eventually have problems, like the unfortunate gov department in the article.

http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Saf...

Relatedly, idempotency is a concept that is still too poorly understood.
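
For reference, roughly how the spec classifies the common methods (safe implies idempotent, but not the other way around), written out as a small table:

    # safe: no side effects expected; idempotent: repeating the request
    # has the same effect as making it once
    HTTP_METHODS = {
        "GET":     {"safe": True,  "idempotent": True},
        "HEAD":    {"safe": True,  "idempotent": True},
        "OPTIONS": {"safe": True,  "idempotent": True},
        "PUT":     {"safe": False, "idempotent": True},
        "DELETE":  {"safe": False, "idempotent": True},
        "POST":    {"safe": False, "idempotent": False},
    }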


> That would stop half the web working.

What do you mean?


Probably that half the web relies on GET requests to do things they are totally not intended for, like deleting a resource.

Therefore, if you suddenly found a way to actually forbid GET from ever triggering that, then that half wouldn't work "properly" any more.


I ran into this with my crawler, too. See http://blog.mischel.com/2008/08/08/hey-you-deleted-my-files/


Actually expected a story about id software's spider of doom...


So did I.


Smells like 2006 in here :-D

I'm not a web-dev guy at all--is this still a problem today?


This is also why nonces exist and should be used for admin actions.
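
A minimal sketch of that idea, assuming a dict-like session store; all names are illustrative, not from the article:

    import secrets

    def issue_nonce(session):
        """Attach a one-time token to the admin form when it is rendered."""
        nonce = secrets.token_urlsafe(32)
        session["admin_nonce"] = nonce
        return nonce   # embed in a hidden <input> in the form

    def check_nonce(session, submitted):
        """Reject the action unless the submitted token matches and is fresh."""
        expected = session.pop("admin_nonce", None)
        return expected is not None and secrets.compare_digest(expected, submitted)

Since the nonce is tied to a rendered form and consumed on use, a crawler blindly following links (or a cross-site request) can't trigger the action.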


I love how it's spun at the end to make the client look stupid.

Josh should have been fired on the spot for such a ridiculous security oversight.


The article doesn't say that this was Josh's design decision, or that he was even involved in the project until the post-launch incident-handling.



