
The Wiki is in French (2011) - luu
http://www.hccp.org/the-wiki-is-in-french.html
======
klodolph
It could have been worse, and it has been.

"The Spider of Doom" (2006)

A crawler deleted all of the content in a CMS, because the delete button used
a GET request, without authentication, and with the URLs embedded in the HTML.

[https://thedailywtf.com/articles/The_Spider_of_Doom](https://thedailywtf.com/articles/The_Spider_of_Doom)

~~~
Drdrdrq
I know of someone who tried to back up a company wiki with wget. There were
links that would delete pages, "protected" by some JavaScript which didn't get
triggered. Fortunately they had real backups, but the mistake seems to be
quite common.
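The failure mode generalizes: a link-following crawler extracts every href and issues a plain GET to each one, and any JavaScript "are you sure?" guard on the page simply never runs. A toy sketch (hypothetical markup, not the actual wiki):

```python
# Toy crawler sketch: like wget -r, it collects every href and would
# issue a plain GET to each one. The onclick confirm() guard on the
# delete link never executes, because a crawler doesn't run JavaScript.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, as a recursive crawler would."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

page = """
<a href="/wiki/Home">Home</a>
<a href="/wiki/delete?page=Home"
   onclick="return confirm('Really delete?')">Delete</a>
"""

extractor = LinkExtractor()
extractor.feed(page)
# Both URLs get crawled with GET; the confirm() is invisible here.
print(extractor.links)  # → ['/wiki/Home', '/wiki/delete?page=Home']
```

The delete link is indistinguishable from a navigation link at the HTTP level, which is exactly why it gets followed.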

~~~
mort96
Part of the problem is that we don't really have a nice way to send a POST
request from something that looks like a link, from what I can tell.

An <a> tag can only ever send a GET request. It can't be made to submit a form
(without JS) afaik. The only simple way to give the user something to click
that sends a POST request is a <button> or <input type="submit"> inside a
form, and styling that to look like a link is absolute hell.

I'm sure that if we had proper HTML tools from the start to say, "I want this
thing to look like a link, but I want it to send a POST request", we'd at
least have fewer such issues.

~~~
klodolph
Sure, but styling is possible
([http://jsfiddle.net/adardesign/5vHGc/](http://jsfiddle.net/adardesign/5vHGc/))
even though it’s not perfect, and requiring JS is completely acceptable for
>99% of people. If you want it both ways, you can always use JS to replace the
button with a link: without JS you get a functional button that looks like a
link (but is slightly off), with JS you get a well-styled link, and either way
you get zero GET requests that delete resources. Progressive enhancement!

It’s a mess, but as far as I’m concerned, “not messy” HTML+CSS+JS was never an
option to begin with, at least for most people.

~~~
mort96
Sure, it's possible, and every web application worth its salt should make sure
it doesn't use GET requests to mutate state. I'm just saying that if it weren't
so unintuitive (i.e. having to know all the styles browsers tend to apply to
buttons, and how to override them to match how browsers tend to style links),
people would have made that mistake less often.

The thought process should have been as straightforward as, "I want this link
to modify state, so I'll use <a method='post'>".

I agree that using JS is probably okay for most tasks these days, but 1) a lot
of web platforms were written at a time before JS was widespread, and 2) just
throwing together some HTML is a lot easier than adding JavaScript to every
link. (If you don't think that speed/ease of programming matters, you'd be
surprised to learn how many of the internal tools various places use were just
thrown together and aren't polished products.)

------
1337shadow
One day I got a similar call from an old customer I had not heard from for six
months or so: "Please help, my customer's website has been hacked! All content
has been deleted!" A simple grep for "delete" in the Apache logs showed that a
user agent calling itself GoogleBot had visited all the delete pages. Looking
at those pages' source code, the developers who had built the website had
forgotten to check authentication on a number of administration pages. And of
course, the list page showed links to delete each content item, guarded only
by a JavaScript callback doing a confirm(), which meant nothing to GoogleBot.

------
inflatableDodo
Reminds me of something that happened to a mate. He created a database for a
project and, while it was still in testing, he generated a very long random
number and made it into a URL that would reset everything. It was not linked
from anywhere and he sent it once to the client in a Skype message. A few days
later he got a call from the client asking if there was a problem, as the
database had emptied itself. He found one visit to the page in his logs, from
an IP located in Redmond. He reckons they were scanning URLs from Skype and
using them to help populate Bing.

~~~
ambicapter
Wow, that's...Redmond's fault in this case? You can't guarantee every URL you
scan is going to be idempotent. That's a hell of an oversight on their part.

~~~
lucideer
I completely agree with your comment but... automatically hitting URLs to
create link previews in messenger apps is a common practice, and one that
seems to have been (frustratingly) accepted by the mainstream of developers
and users alike.

This might be MS' fault, but it's also Facebook's, Slack's, pre-FB-WhatsApp's,
to name just a small few of many.

Aside: on the other hand, if you're putting up a GET endpoint that's not only
non-idempotent but unauthenticated and wipes your whole DB, even on a test
site, then blaming MS for your troubles is a bit much.

~~~
Kalium
This is a practice that exposes the tension between protecting users from
malicious use and privacy at scale. A _lot_ of malicious links get sent over
messenger systems, and loading them is often the only good way to find out
which ones are malicious and which aren't.

But wait! Surely there are other possible approaches. Can't users just report
spam? Surely there are clear usage patterns that can be detected too!

These are valuable and useful approaches, but have historically proven unequal
to the task. It can be tricky to spot a spam campaign spread over N accounts -
only the most basic forms of spam are really obvious. URLs are easily
disguised and changed. Loading the URLs and checking the contents that come
back is one of the better approaches. Users also don't tolerate a bunch of
spam. They _will_ migrate away.

So, yes, anyone who makes a GET endpoint into an unauthenticated change-things
URL deserves all the misery they get.

------
pierlu
Modern crawlers can easily follow JavaScript or some basic POSTs. So, quite
apart from the insane state-changing GET requests in the code, I would not
give a crawler admin credentials unless you are deliberately spidering your
own application. Anyway, a very nice anecdote from the 2000s. Never ever
forget the basics.

------
skrebbel
Pretty lucky that the "delete everything, really, also the backups!" button
needed a POST.

------
collyw
I have seen a lot of web apps doing the opposite - using POST requests for
everything. Has this ever caused anyone problems?

~~~
vollmond
For one example, it makes it impossible to bookmark a specific search result
(either the page of results or a specific item), because the bookmarked URL
doesn't carry any of the search/ID metadata that a GET-loaded page would.
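The difference is visible in the URL itself. A quick sketch (hypothetical endpoint names) of why a GET search is bookmarkable and a POST search is not:

```python
# With GET, the search parameters live in the URL's query string, so the
# result page can be bookmarked and shared. With POST, the same data
# travels in the request body, leaving a bare URL behind.
from urllib.parse import urlencode, urlsplit, parse_qs

params = {"q": "wiki backup", "page": 2}

# GET: the query survives in the URL.
get_url = "https://example.com/search?" + urlencode(params)

# POST: the URL a user would bookmark carries no search metadata at all.
post_url = "https://example.com/search"
post_body = urlencode(params)  # goes in the body, not the URL

assert parse_qs(urlsplit(get_url).query) == {"q": ["wiki backup"], "page": ["2"]}
assert urlsplit(post_url).query == ""
```

That is the flip side of the thread's theme: GET is right for reads precisely because it is repeatable and addressable.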

------
thanatropism
Once somewhere in 2016 on Facebook I clicked a slate.fr link. Ever since, my
Facebook has been in French.

------
robbrit
And this is why GET requests should not modify state.

------
lol768
I don't understand how a crawler could change the language of the wiki without
being authenticated.

~~~
trampi
> After determining the source IP of the crawler (one of the admins was
> experimenting with Nutch as a supplement to the wiki's impoverished search
> capabilities and had authenticated the crawler using their admin
> credentials)

Quoting the article.

~~~
lol768
Thanks, I missed this on first read! The lines are quite wide.

Now wondering what wiki software implemented this as a GET (which of course
also opens it up to CSRF if there's no token parameter) _and_ shipped poor
search functionality.
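The token parameter mentioned above is the standard CSRF defense. A minimal sketch of the idea (all names hypothetical, using stdlib HMAC):

```python
# CSRF-token sketch: each state-changing request must carry a token
# derived from the user's session, and the server rejects requests
# whose token doesn't match. A third-party page (or a naive crawler
# given someone's cookies) can't produce a valid token.
import hmac
import hashlib
import secrets

SECRET_KEY = secrets.token_bytes(32)  # server-side secret, never sent out

def csrf_token(session_id: str) -> str:
    """Derive a token bound to this particular session."""
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def is_valid(session_id: str, token: str) -> bool:
    """Constant-time comparison against the expected token."""
    return hmac.compare_digest(csrf_token(session_id), token)

sid = "session-1234"
token = csrf_token(sid)
assert is_valid(sid, token)         # genuine request passes
assert not is_valid(sid, "forged")  # forged or missing token is rejected
```

Of course a token only helps on POST endpoints; a state-changing GET is exposed regardless, since the URL itself is the whole request.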

------
ncmncm
Funny, the original author appears unaware of the derivation of the expression
"all too".

------
walrus01
Company wikis should not be accessible outside a VPN or without being
physically on the office LAN.

~~~
trampi
While your comment is correct, this would not have prevented the issue stated
in the article.

> After determining the source IP of the crawler (one of the admins was
> experimenting with Nutch as a supplement to the wiki's impoverished search
> capabilities and had authenticated the crawler using their admin
> credentials)

The real problem is that a GET request is meant to be side-effect free. A
crawler only issuing GET requests should not be able to modify e.g. global
settings, even when using an admin token.
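That separation can live in the router itself. A minimal dispatch sketch (hypothetical wiki, not the software from the article): authentication decides *who* may act, but the HTTP method decides *whether* this request may change anything at all.

```python
# Minimal routing sketch: the delete handler refuses anything but POST,
# so an authenticated crawler that only issues GET/HEAD cannot change
# state, no matter whose credentials it carries.
def handle_delete(method: str, authenticated: bool):
    if method != "POST":
        return 405, "Method Not Allowed"  # crawlers only send GET/HEAD
    if not authenticated:
        return 403, "Forbidden"
    return 200, "Page deleted"

# Even with admin credentials, a GET-only crawler gets turned away:
assert handle_delete("GET", authenticated=True) == (405, "Method Not Allowed")
assert handle_delete("POST", authenticated=False) == (403, "Forbidden")
assert handle_delete("POST", authenticated=True) == (200, "Page deleted")
```

Most web frameworks express this declaratively (e.g. routing a handler to POST only), which makes the safe behavior the default rather than something to remember.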

~~~
avmich
> GET request is meant to be side-effect free

Meant, yes, by somebody, but a web service creator can decide otherwise, for
reasons like simplicity.

If I remember correctly, there was a good story about Viaweb, how they figured
out that requests to follow links could double as commands - and I wasn't sure
they didn't use GET for that... but maybe I'm wrong.

~~~
cesarb
> but a web service creator can decide otherwise

This is like "undefined behavior" in C. Sure, you can make a GET have side
effects in your application, but everything else will still assume that HEAD
and GET are free of side effects, and might repeat requests, omit requests
(using cached data), or even make requests speculatively in advance.
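The "omit requests" case is easy to demonstrate with a toy cache (hypothetical code, not any particular proxy): once an intermediary assumes GET is side-effect free, the handler runs an unpredictable number of times.

```python
# Toy cache sketch: intermediaries that assume GET is safe may answer a
# repeated GET from cache, so a handler with side effects fires once
# instead of twice (or, with speculative prefetch, once too many).
calls = {"count": 0}

def get_handler(url: str) -> str:
    calls["count"] += 1  # imagine this deleted a page or opened a door
    return f"response for {url}"

cache = {}

def cached_get(url: str) -> str:
    if url not in cache:  # the cache may skip the real request entirely
        cache[url] = get_handler(url)
    return cache[url]

cached_get("/garage/open")
cached_get("/garage/open")  # served from cache: handler not called again
assert calls["count"] == 1  # the side effect fired once, not twice
```

The same logic run in reverse (a prefetcher issuing the GET before the user asks) is exactly the garage-door story below.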

~~~
csunbird
This reminds me of the story of an internet connected garage door opener that
uses GET for opening and closing

[https://twitter.com/rombulow/status/990684453734203392?lang=...](https://twitter.com/rombulow/status/990684453734203392?lang=en)

TL;DR: Safari figures out that the user visits this page very frequently, so
whenever it opens, it prefetches the garage door "page" to cache the response
before the user navigates anywhere. Which immediately opens the garage door.

