
The Twelve Days of Crisis – A Retrospective on Linode’s Holiday DDoS Attacks - alexforster
https://blog.linode.com/2016/01/29/christmas-ddos-retrospective/
======
kyledrake
I strongly believe it's not possible to safely run a site without DDoS
protection for all servers anymore. Anyone with $20 can take down anything on
Digital Ocean, Linode, Hetzner, and many others. Or they can run up a huge
bill for you on AWS. I would love to use Cloudflare but I can't afford
$6000/mo for DDoS protection on my servers with the wildcard requirements we
need. Linode may have solved their DDoS problems with their own stuff, but
what about their customers' VPSes?

I really wish people would start taking DDoS more seriously. It's really not
something we can just null route servers for anymore. It's becoming a very
serious problem. It's not going away, it's amplifying and getting far worse.

I'm also not sure how effective it would be, but it would be nice to see the
FBI, NSA or whomever spend at least as much time fighting these DDoS warlords
as they did persecuting whistleblowers and trying to shove backdoors into
cryptography.

~~~
click170
I think that effort would be better spent encouraging (see Forcing) ISPs to
start dropping forged traffic at their borders.

IMO there should be significant penalties for network operators who do not
drop obviously forged traffic. How long has that RFC been around now and how
little adoption has it seen?
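
As a rough illustration of what "dropping forged traffic at the border" means, here is a minimal Python sketch of source-address validation in the spirit of BCP 38 ingress filtering. The prefix list and the `source_is_plausible` helper are hypothetical names made up for this example; a real network would do this in router ACLs or with reverse-path filtering, not in Python.

```python
import ipaddress

# Hypothetical prefixes an ISP has assigned to one customer-facing port.
# These are documentation ranges, chosen only for illustration.
CUSTOMER_PREFIXES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8:1234::/48"),
]


def source_is_plausible(src, allowed=CUSTOMER_PREFIXES):
    """Return True if the packet's source address lies in a prefix
    that was actually assigned to the customer sending it."""
    addr = ipaddress.ip_address(src)
    # Membership is simply False when the address and network
    # are of different IP versions, so mixed lists are fine.
    return any(addr in net for net in allowed)


for src in ("198.51.100.7", "203.0.113.9", "2001:db8:1234::1"):
    verdict = "forward" if source_is_plausible(src) else "drop (forged source)"
    print(src, "->", verdict)
```

The check only works at the edge closest to the sender, where the operator knows which prefixes are legitimate; that is why it depends on adoption by every access network rather than by the victim.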

~~~
ryanlol
>I think that effort would be better spent encouraging (see Forcing) ISPs to
start dropping forged traffic at their borders.

The importance of spoofed traffic to attackers is greatly exaggerated, I could
personally easily send 500+ Gbit (probably terabit) sized attacks by spending
a couple of weeks building a router botnet. No need to spoof IPs and at that
point diminishing returns would make amplification attacks useless. Not only
that, but most amplified attacks are particularly inexpensive to filter.

>IMO there should be significant penalties for network operators who do not
drop obviously forged traffic. How long has that RFC been around now and how
little adoption has it seen?

Who would penalize them? Why?

And I'm not entirely sure if you understand what RFCs are, that RFC (which
hasn't even been around for very long) is - like most other RFCs - completely
meaningless.

~~~
cft
In practice, most volumetric attacks have spoofed IPs- amplified UDP
reflection attacks and even SYN floods with no amplification.

Getting rid of UDP amplification reflection attacks will get rid of 90% of
volumetric attacks.

~~~
ryanlol
We had volumetric attacks every day long before reflection attacks became
common, the biggest attacks these days aren't reflected but from router nets.

And you simply cannot solve IP spoofing without rebuilding the entire
internet, not to mention the fact that it does have legitimate use cases.

Also, if IP spoofing is making filtering difficult for you then you're doing
filtering wrong.

~~~
cft
How big were the volumetric attacks that you saw that involved real IPs? The
amplification factor is 1x for the real IPs. With NTP reflection and DNS
reflection, you get 50x amplification, so 1Gbps botnet (trivial bandwidth)
will cause a 50Gbps DDoS (non-trivial bandwidth). This is why filtering is
desirable.
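
The arithmetic above can be sketched in a few lines of Python. The `attack_gbps` helper is a hypothetical name, and the 50x factor is just the figure quoted in this comment, not a measured value.

```python
def attack_gbps(botnet_gbps, amplification_factor):
    """Traffic arriving at the victim, given the bandwidth the attacker
    controls and the amplification factor of the reflector protocol."""
    return botnet_gbps * amplification_factor


# Real (unspoofed) source IPs: the victim sees only what the botnet sends.
direct = attack_gbps(1, 1)

# Spoofed requests bounced off NTP/DNS reflectors at an assumed 50x.
reflected = attack_gbps(1, 50)

print(f"direct flood:    {direct} Gbps")
print(f"reflected flood: {reflected} Gbps")
```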

~~~
ryanlol
I saw a 500Mpps SYN flood just last week, followed by about 500Gbps of UDP
packets. All from real IPs.

Botnets have far surpassed amplification attacks at this point.

>This is why filtering is desirable.

Come up with a way to implement it that actually works and doesn't break
legitimate use cases. spamsolutions.txt is starting to seem relevant here.

~~~
cft
We work with Verisign, which specializes in DDoS mitigation; they have state of
the art scrubbing centers on four continents and they are the leading mitigation
provider to the banks and schools. They told me they almost never see anything
above 300gig. You must be special.

~~~
ryanlol
Yes, I would imagine I have far more experience dealing with large attacks
than verisign.

>they are the leading mitigation provider to the banks and schools

:)

------
rdl
Layer 7 attacks are the new hotness in DDoS. If you have a big enough botnet
(either conventional botnet, or hijacked browsers), you can do them, and
they're often quite effective.

Fundamentally, layer 3/4 are usually amplification. Those are still effective,
and very efficient for the attacker, but they will someday (5y? 10y?) be
blocked by closing up sources of amplification. Address spoofing at layer
3/4 might get addressed by BCP 38, Vixie's good fight, etc., but not holding
my breath.

By the time all that happens, attackers will have moved on to layer 7 attacks.
Those can target the weakest parts of your stack, and with a large botnet,
even the act of blocking the IPs in the wrong place can add enough overhead to
hurt. With a huge botnet of hijacked browsers, blocking everyone affected
becomes a DoS vector in itself, since some of those are your own legitimate
users.

The big problem for DDoS mitigation is that this requires much deeper
knowledge of the protected application. It's hard to just put a box inline, or
an unmodified cloud service, and have it block the attacks. There's both good
science and great engineering to be done, by developers, platform vendors, and
specialty anti-DDoS providers, to block this emerging kind of attack.

~~~
ryanlol
>Layer 7 attacks are the new hotness in DDoS

Maybe 10 years ago.

>Fundamentally, layer 3/4 are usually amplification. Those are still
effective, and very efficient for the attacker, but they will someday (5y?
10y?) be blocked by closing up sources of amplification. Address spoofing
at layer 3/4 might get addressed by BCP 38, Vixie's good fight, etc., but not
holding my breath.

Sending "raw" UDP floods from bots still has several benefits over
amplification attacks provided you can amass enough bandwidth, which isn't
that difficult these days.

>By the time all that happens, attackers will have moved on to layer 7
attacks. Those can target the weakest parts of your stack, and with a large
botnet, even the act of blocking the IPs in the wrong place can add enough
overhead to hurt. With a huge botnet of hijacked browsers, blocking everyone
affected becomes a DoS vector in itself, since some of those are your own
legitimate users.

Unlikely, it'll take a fundamental change in how networks work for network
layer attacks to become irrelevant. Especially considering volumetric attacks
have the added benefit of potentially getting your target kicked off by their
hosts and causing added damage in BW bills. And as internet connections become
faster, DDoS attacks become bigger.

~~~
rdl
Browser JS insertion on HTTP is pretty novel; it was done with ad networks a
few times, but never at the scale of the GitHub/Great Cannon attack. Using an
existing botnet certainly was fine, but for most sites, blocking botnet IPs
doesn't cause as much collateral damage as blocking compromised-browser IPs.

If you did a watering hole attack, doing JS injection on a really popular
"show HN" post on HN, against HN, you'd be effective in getting HN to block
the IPs of a large percentage of real users, which would hurt, even if HN
could repel the attack entirely. Blocking 50k random botnet IPs wouldn't
really affect many regular HN readers.

~~~
ryanlol
>Browser JS insertion on HTTP is pretty novel; it was done with ad networks a
few times, but never at the scale of the GitHub/Great Cannon attack. Using an
existing botnet certainly was fine, but for most sites, blocking botnet IPs
doesn't cause as much collateral damage as blocking compromised-browser IPs.

But the "great cannon attack" was absolutely minuscule compared to the stuff
that happens every day, attacks of similar sizes were already reasonably
common years ago (See:
[http://i.imgur.com/0quYBdV.png](http://i.imgur.com/0quYBdV.png) a graph of an
attack on reddit, from 2013).

Today we're talking about Mrps, not Krps.

>If you did a watering hole attack, doing JS injection on a really popular
"show HN" post on HN, against HN, you'd be effective in getting HN to block
the IPs of a large percentage of real users, which would hurt, even if HN
could repel the attack entirely. Blocking 50k random botnet IPs wouldn't
really affect many regular HN readers.

That'd just be sloppy filtering, there's no need to drop L7 attacks on IP
basis.

------
dantiberian
Far more concerning to me than this outage were the security incidents
([https://news.ycombinator.com/item?id=10845278](https://news.ycombinator.com/item?id=10845278))
that Linode seem to continually have once every year or so. The most recent
one seems to have happened in July, but they didn't notify customers or reset
passwords for another six (!) months.
[https://news.ycombinator.com/item?id=10845619](https://news.ycombinator.com/item?id=10845619)

------
oliwarner
Wow, still a lot of people fighting over whether or not Linode is a good
company. It's a shame we don't get to see how <hipster hosting company of the
month> copes with 80gbps of DDoS on a single DC.

I'm personally happy with Linode. They have a seriously tough technical issue
to deal with —as much working out what's happening as how to stop it— and they
seem to be doing a fairly top job at staying afloat. My servers haven't gone
down. Any downtime in the last four years has been my fault.

So even if they are targets of some ludicrously powerful botnet, I'd rather
stay with them than let the bastards doing this win. The attack isn't hurting
my business or my clients and each incident we go through, the lower the
chances of it _ever_ being a problem in the future.

On a more serious note, governments keep moaning on about encryption but
botnets are still a much greater direct threat to national security.

------
larrymcp
Uh-oh, the attacks started again a few minutes ago:

[http://status.linode.com/incidents/mkcgnmjmnnln](http://status.linode.com/incidents/mkcgnmjmnnln)

~~~
jsmthrowaway
Which is why you do not blog about them and communicate quietly with your
customers.

~~~
rdl
Not really as possible with a consumer scale web service, but you can
certainly modify your communications strategy to be less of a red flag for the
bull. :(

------
staunch
Where's the Linode founder(s) in all this, and why couldn't they have kept
customers informed? It seems like a lone network engineer has been left to
deal with a potentially company destroying event.

~~~
jsmthrowaway
I was personally in the room, and in agreement, when running a real grown-up
AS with carrier transit was proposed to Chris Aker as early as 2010, maybe
2009, to avoid this very scenario and many others like it. It's not really
news. I have tremendous respect for the engineer who proposed it and fully
believe he could have executed on this when Linode still had four facilities
and 360 MB Linodes were the norm. I'm not saying that to toot my own horn
(really, I'm not "I told you so" or arrogant like that), but there are very
specific reasons that this wasn't done for as long as possible. I lack recent
context, but the Linode decision-making culture was for many years completely
driven by one individual who worked to spend as little on infrastructure as
humanly possible.

Even once growth really took off and revenue started making these big shifts
in strategy viable, the mindset was still to be lean and scrappy. The minimal
capital expenditure strategy had benefits early on and allowed Linode to
maintain an incredible margin and support explosive growth, but they were too
slow to start thinking like a grown-up company when it started to matter, and
it's coming back to bite both on security (with almost zero investment; just
enough to pass PCI-DSS) and things like this.

When I heard they bought the Philadelphia building, for example, I was very
surprised because that's not the Chris I knew. We lobbied for a Philadelphia
office for _years_. Could be a good sign regarding decision-making culture for
the future, but hard to say.

Don't read me as bad blood or anything, as I wish Linode no ill will (I
actually hope they can turn this perceived slump around), it's just
educational to see the consequences of choices and mindset catch up with a
company. I learned a lot about management style while working there and
contrasting with subsequent employers.

~~~
mskaldlmk
> just enough to pass PCI-DSS

nope, the only way Linode "pass" now is being below the self-assessment
questionnaire threshold. Once they have to move to an actual external audit
they are fucked.

------
jph
> Our longest outage by far... can be directly attributed to frequent
> breakdowns in communication

I have direct experience with Linode staff breakdowns in communication because
of a security problem before the December attacks.

The problem affected many Linode customers and included risks to confidential
information such as billing.

The Linode staff communication was terrible. The problem was severe and ended
up with Linode on a blacklist of companies that are not suitable for hosting.

I have to agree with tptacek: do not use Linode for anything, and if you do
now, make plans to switch to a new provider.

To end on a happy note, I migrated the project to Rackspace, and the Rackspace
staff communication is excellent.

~~~
click170
I'd like to learn more about these blacklists so I can factor that in when
choosing a vendor, do you have a link? Google is just showing me pages of
vendors trying to sell me hosting when I search.

~~~
ryanlol
I think this is more of a mental blacklist.

------
ryanlol
>Layer 7 (“400 Bad Request”) attacks toward our public-facing websites

I really wonder what that is supposed to mean, Linode has mentioned it
multiple times but not elaborated on what sort of an attack this is.

I personally haven't ever heard of a "400 bad request"-attack.

Edit: Yeah, I know what Layer 7 floods are :), but I'm pretty sure "400 bad
request" floods are something Linode came up with, so that could use some
elaboration by them.

~~~
sandstrom
I think it's just another name for Layer 7 DDoS. I.e. crafted HTTP requests,
designed to be 'expensive' to compute/process.

~~~
guelo
Right, it seems pretty clear to me.

~~~
ryanlol
There's a lot of different types of Layer 7 attacks, what sort of a payload is
a "400 bad request"?

~~~
guelo
Any HTTP request that causes the webserver to respond with HTTP status 400?

~~~
ryanlol
But that's not descriptive at all.

~~~
guelo
To me what it means is that the attacker figured out some custom call to the
application that is probably expensive for the app to deal with and can easily
cause a denial of service.

~~~
ryanlol
It'd be pretty rare for the application to return error 400, generally that's
something that the server would be spitting out when it fails to parse the
HTTP request.

~~~
guelo
That could suggest random urls. But it could be anything really depending on
the app. I'm coming around to your pov that it is not descriptive. Something
about the layer 7 flood was causing the app to respond with 400s and that's
what Linode started calling it. But it doesn't help us understand anything
about it.

~~~
ryanlol
It wouldn't suggest a flood of random urls, error 400 is generally a response
from the webserver when it receives a request it can't understand (e.g. not
HTTP).

A request like that would never hit the web application as the server wouldn't
know what to do with it.

see:

    
    
      echo ":P"|nc linode.com 80
      <html>
      <head><title>400 Bad Request</title></head>
      <body bgcolor="white">
      <center><h1>400 Bad Request</h1></center>
      <hr><center>nginx</center>
      </body>
      </html>
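
The same behaviour can be reproduced locally without touching anyone's site. This is a hedged sketch using Python's standard `http.server` rather than nginx, so the error page looks different, but the point carries over: the malformed line is rejected by the HTTP parser, and the handler standing in for the web application never runs. `AppHandler`, `app_hits` and `send_raw` are names invented for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

app_hits = []  # every request path that actually reaches the "application"


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only well-formed GET requests get this far.
        app_hits.append(self.path)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

    def log_message(self, *args):
        pass  # keep the demo quiet


def send_raw(port, payload):
    """Send raw bytes to the local server and read until it closes."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as s:
        s.sendall(payload)
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the OS for any free port on the loopback interface.
server = HTTPServer(("127.0.0.1", 0), AppHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

garbage_reply = send_raw(port, b":P\n")  # same payload as the nc example
hits_after_garbage = list(app_hits)      # still empty: parser rejected it
valid_reply = send_raw(port, b"GET /hello HTTP/1.0\r\n\r\n")

server.shutdown()
server.server_close()

print("400 in reply to garbage:", b"400" in garbage_reply)
print("app hits after garbage: ", hits_after_garbage)
print("app hits after valid GET:", app_hits)
```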

~~~
dgemm
What exactly are you asking here? You seem to already have a solid
understanding of what type of attack would generate HTTP 400.

~~~
ryanlol
Not really, there's a wide variety of different requests that would cause such
an error.

Just flooding someone with ":P" shouldn't take them down.

------
virtuallynathan
I am pretty amazed Linode didn't have their own IP Transit up to this point.
Their colo provider in Newark charges some pretty high prices from what I've
seen.

------
tptacek
My plan is to keep saying this on Linode threads, just in case there are
people who have missed it. Take my advice or leave it:

Please don't use Linode. If you are using it now, make immediate plans to
switch. If you have friends who have things built on Linode servers, tell them
to switch.

~~~
yomism
My plan is to keep saying this on Starfighter threads, just in case there are
people who have missed it. Take my advice or leave it: Please don't use
Starfighter. If you are using it now, make immediate plans to switch. If you
have friends who have used Starfighter, tell them to switch.

Feel how dickish this sounds without giving any reason whatsoever?

Please explain why.

~~~
ryanlol
There's a pretty clear difference here.

Saying that would make you a dick.

The linode comment doesn't make tptacek a dick.

Why?

Because it's trivial to figure out why he's saying that by simply typing
"linode" into google, but as of right now googling starfighter doesn't
immediately bring up any reasons to avoid them.

Edit: Didn't mean to imply that parent was a dick.

~~~
yomism
It's dickish saying that without explanation. Like this it sounds like
baseless smearing.

If he had argued the reasons before, he could have just put a link. Sincerely,
is it too much to ask?

~~~
hobs
To be honest, the number of "linode screwed up" posts on hacker news the last
few years would be educational to you, and if I remember correctly, ryanlol
even got a slap on the wrist due to one of those situations.

At this point, I am bored of people asking for citations on hacker news for
things that should be part of our tribal knowledge.

[https://www.google.com/search?q=linode+hacks&ie=utf-8&oe=utf...](https://www.google.com/search?q=linode+hacks&ie=utf-8&oe=utf-8#q=linode+hacks+site:news.ycombinator.com)
About 2,810 results (0.34 seconds)

~~~
yomism
Tribal is the right word here with all the blind faith in medicine men and
cargo cultism.

~~~
hobs
I meant it in the way of shared knowledge, just like we all know how to bypass
a NYT filter, or that someone is going to complain about the lack of native
scrolling in an article, especially on a Show HN.

I definitely agree that there is a huge amount of that type of thinking on HN
(of course), reading the amount of people who used github but didn't know the
difference between it and git and were commenting today was a personal
education.

------
radialbrain
Slightly related question:

They mention segregating their customers into separate /24s, and consequently
having to assign an IP from every one of these subnets to the router for use
by the customer as a gateway.

Is there any reason why they couldn't get rid of these by having customers set
up a static route to the "primary" IP of the router (migration / configuration
issues aside)?

~~~
caf
The static route has to be via a gateway that is on a locally connected
subnet.

~~~
radialbrain
You can have a static route to any local device, this is essentially what
subnet membership does.

For example, if I have IP 192.168.0.2/24 assigned to eth0, my routing table
will have:

    
    
      192.168.0.0/24 dev eth0 proto static scope link
    

I'm free to add a local route to a device outside 192.168.0.0/24 though:

    
    
      192.168.10.1 dev eth0 proto static scope link
    

This just indicates that I should be able to resolve the MAC address
associated with 192.168.10.1 through an ARP query, same as other devices on my
subnet.
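
A small Python sketch of the subnet-membership point, using the standard `ipaddress` module. The `needs_explicit_link_route` helper is a hypothetical name for this example; it only models the membership test, not what the kernel does with the route.

```python
import ipaddress

# The address assigned to eth0 in the example above.
iface = ipaddress.ip_interface("192.168.0.2/24")

# The router's "primary" IP, which lives in a different subnet.
gateway = ipaddress.ip_address("192.168.10.1")


def needs_explicit_link_route(iface, target):
    """True if `target` is not covered by the connected-subnet route,
    so reaching it directly on the wire needs its own on-link route."""
    return target not in iface.network


print("connected network:", iface.network)
print("gateway needs its own on-link route:",
      needs_explicit_link_route(iface, gateway))
```

With the extra on-link route in place, the host ARPs for 192.168.10.1 on eth0 just as it would for any neighbour inside its own /24.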

------
thomaslutz
We are currently getting DDoSed at Hetzner and they are clueless as well.

~~~
ryanlol
If you're planning on getting DDoSd you probably should pick a provider that
offers DDoS protection.

Not sure why you'd choose a budget provider for production infra anyway.

~~~
FigmentEngine
> If you're planning on getting DDoSd

This, don't do it, far better to not get DDoSd ;-)

~~~
ryanlol
Surely any org that depends on internet presence would plan against DDoS
attacks. That's like internet 101.

------
brownbat
No guess at motive? Did someone ask for ransom before these started? Is one of
the Linode subscribers hosting censorship-evasion technologies? Or is this one
just some very determined kids having fun over holiday break?

~~~
nodamage
Could it have been Bitcoin related? The timing seems to roughly match up with
the Bitcoin XT DDoS attacks mentioned here:
[https://medium.com/@octskyward/the-resolution-of-the-bitcoin...](https://medium.com/@octskyward/the-resolution-of-the-bitcoin-experiment-dabb30201f7)

------
ancarda
> Our nameservers are now protected by Cloudflare

How? I thought CloudFlare only protected HTTP? Can you have it reverse proxy a
DNS server or is Linode using CloudFlare as the host for ns1.linode.com now?

~~~
rdl
Yeah, it's called Virtual DNS (vDNS); essentially a DNS application proxy.

(email me if you want more info; it's not really ideal for small sites, it's
better to just use cf for hosted DNS then, since it's free, but we're happy to
do vDNS for people who can't do hosted. Mainly providers, but also some
enterprise customers with special DNS needs. It's a pretty cool technology.)

[https://www.cloudflare.com/virtual-dns/](https://www.cloudflare.com/virtual-dns/)

------
wereHamster
> after some stubborn transit providers finally acknowledged that their
> infrastructure was under attack and successfully put measures in place to
> stop the attacks.

Care to elaborate why it took them so long to ack? And name them so I know who
to avoid in the future (or route around)!

------
tim333
>The pervasiveness of these types of attacks has caused hundreds of billions of
dollars in economic loss globally.

Is it really $100bn+ ? If so we could do with some government funded research
/ countermeasures.

------
jakeogh
Thread on IPFS/DDoS:
[https://news.ycombinator.com/item?id=10329195](https://news.ycombinator.com/item?id=10329195)

------
bjano
> blackholing is a blunt but crucial weapon in our arsenal, giving us the
> ability to ‘cut off a finger to save the hand’ – that is, to sacrifice the
> customer who is being attacked in order to keep the others online

There is something very ironic about this. They have a policy which instead of
addressing the problem actively assists anyone wanting to attack their
customers. No surprise that these customers have been complaining about this
practice for a long time. But until now it was Somebody Else's Problem so they
didn't bother figuring out some proper (or at least less terrible) solution.
Now this lack of preparedness bit them in the ass...

~~~
jsmthrowaway
I'd posit that 98% of providers from whom you can acquire budget VPS will do
the same thing. The practice is not unique to Linode; why should a network
you're paying $20-$100 do everything they can to keep a target online and
threaten other customers?

Contrary to popular opinion, if you're getting DoS attacked, you're either (a)
popular enough to start thinking about adult-size pants for your transit
strategy or (b) inviting the attention by your choice of content or
activities. In years of hosting, I started to know the targets of DoS attacks
by name. You have to own at least a little bit of responsibility, and mitigate
on your own end if you're going to be inviting that kind of attention; IRC and
controversial blogs are the usual suspects here, but that's probably changed
recently as I've been out of the hosting game for a while.

Linode has few options for reacting other than the one they use. I know that
sucks, but it's how it is.

~~~
bjano
Yes, customers of other budget VPS providers are complaining about this too.

> why should a network you're paying $20-$100 do everything they can to keep a
> target online and threaten other customers?

I am not a network engineer and I know that this is a very difficult problem.
But when the provider doesn't even _seem_ to try, it only encourages further
attacks.

------
brandon272
Does the buck stop with this network admin? Where's the CEO?

~~~
db7a11196
Riding his motorcycle across Europe, occasionally sharing photos from vacation
with his plebeian workers in #linode-staff who are earning $38k and can't
afford to take any vacation.

------
gauravphoenix
I wonder what will happen if Linode routes their traffic through CloudFlare...

~~~
jsmthrowaway
It wouldn't work. That's not what CloudFlare does (right? they didn't do BGP
last I heard). You'd need something like Black Lotus, now owned by Level3, for
that.

~~~
rdl
You're technically correct as of Fri Jan 29 15:48:27 PST 2016.

(email me if you'd like more detail; you seem smart/interesting but don't have
email in your hn profile)

~~~
X-Istence
Transparent scrubbing would be awesome, have CloudFlare advertise my networks,
then CloudFlare on the backend sends the clean traffic to me...

Can do it with peering with CloudFlare directly, or even if that is not
possible MPLS on the backend to direct the traffic to where it needs to go.

