
An update from Linode about the recent DDoS attacks - alexforster
http://status.linode.com/incidents/mmdbljlglnfd
======
silverlight
Glad to hear something official on this...5 or 6 days is way too long to go
without something more than "We're working on it" and some light details. I
understand that it's likely an all-hands-on-deck hair-on-fire situation over
there, but those of us who rely on Linode for our own businesses have been
largely left in the dark.

When our customers are emailing and tweeting us and they just want to know
when we are going to be up, and all we can say is "We have no idea, we don't
know why this is happening or what's really going on", that's pretty much the
definition of a worst case scenario from a customer service standpoint.

As someone whose business relies on Linode currently to function, I am
sympathetic to Linode's plight...this is the equivalent of someone coming and
setting off a bomb in your factory; not exactly something that you can always
plan for even if you have prevention measures in place. But they would have
kept a lot more of my sympathy long-term if they had communicated
better with their customers in the first place...

EDIT: And it looks like the attackers decided to start things back up again,
as Linode.com is unavailable...

~~~
alexforster
We know that we've dropped the ball here. To be frank, it's just been
extremely difficult to take our people off of mitigation long enough to write
something more coherent than "they're attacking our webservers", "they're
attacking our core routers", etc.

> And it looks like the attackers decided to start things back up again, as
> Linode.com is unavailable...

They're watching our status page for updates and starting new attacks when we
resolve previous ones. There's been an almost 1:1 correlation lately.

~~~
erikpukinskis
> it's just been extremely difficult to take our people off of mitigation long
> enough to write something more coherent

This always rings hollow to me, and yet I hear it over and over.

A company like Linode surely has at least a dozen people who can be on call in
a situation like this, probably many more. All it takes is for one engineer or
even a product person... Heck a technically-minded support person could listen
in on the war room meetings and get enough information to post something
better than "we're fixing it".

5 minutes of blogging every six hours would be plenty.

And yet, people always claim it's impossible and there's no time. Frankly I
find it frightening... If you are coding so fast that no one on your team
has five minutes to step aside, take a breather after six hours of coding,
and summarize what the team just spent those six hours doing, I shudder to
think what kind of panicked, alarmist interventions you are making.

There's no excuse for silence. There just isn't. It's a gross failure on the
part of management to prioritize the responsibilities they have to your
customers' customers. Full stop. Sure, the engineers can't be expected to
remember to tap out to blog. But if that's the extent of the accountability
structures you can assemble during a crisis, that is a serious organizational
failing, particularly for an organization the size of Linode.

~~~
collyw
I would say that 5 minutes of blogging every 6 hours is just as likely to
cause more harm than good. Wording something badly could lead to far more fear
and confusion than the present situation if taken out of context.

------
alexandrerond
I can't believe people are criticising Linode:

1 - Attack mitigation was mostly successful. As I thought and they have
confirmed, the attack vectors evolved continuously.

2 - They had to deal with this over Xmas. Anyone familiar with such a job
knows what this means in terms of human resources, knowledge distribution,
organization of technical response and communication with 3rd parties.

3 - Linode is not Nagios. If you don't monitor your own infrastructure don't
expect Linode to SMS you because your site might be down. Linode resources
were focused on fighting the DDoS, as they should, and provided regular
updates through their status site, as is expected. Everything else is
nice-to-have, but not a must-have.

4 - In line with what others said, I had 7 hours downtime in my London VPS.
That is an uptime of 96% in the last 7 days. Considering restless DDoS ongoing
over holidays, I'd say that is pretty good.

I'm sorry, what happened to Linode sucks, but it is an eventuality that
anyone with assets depending on this service should have accounted for,
because it can happen anywhere. You cannot blame Linode if your HA strategy
does not exist, or if you never thought of a way to gracefully fail over to a
second provider when your business depends on >96% availability.
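The uptime arithmetic in point 4 is easy to sanity-check; a quick sketch using the numbers from the comment above:

```python
def uptime_percent(downtime_hours: float, window_hours: float) -> float:
    """Percentage of the window during which the service was up."""
    return (1 - downtime_hours / window_hours) * 100

# 7 hours of downtime over a 7-day (168-hour) window:
print(round(uptime_percent(7, 7 * 24), 1))  # 95.8, i.e. roughly 96%
```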

~~~
scurvy
1) They were caught with their pants down. Their DDoS playbook was probably
either outdated or not fully fleshed out. This is somewhat excusable if you're
a content provider. It's not excusable if you're the cloud/colo/datacenter
provider. This is literally your raison d'être. Cue the "you had one job"
memes.

2) Netops people don't get Christmas off (1). Our teams are working all year,
all day, every day. Netops isn't HR or marketing. It's not super difficult to
arrange a conference call with your transit providers, DDoS mitigators, etc.
This is old hat to them.

3) Completely agree.

4) The downtime was a lot worse if you had multiple instances in more
datacenters. Or put another way, the bigger the Linode customer you were, the
worse it was.

(1) I've had a router die every Christmas or New Year's for the past 6 years.
Never had one die outside those windows. They must be angry at me for what
I've put them through.

------
rebootthesystem
My message to the attackers (in case they happen to read HN):

Fuck you. I will continue to be a Linode customer. Not sure what your goals
might be but you will not succeed.

Frankly, and I am going to be politically incorrect here, these are the kinds
of cases where I wish there was a "special forces" kind of task force to hunt
down these pieces of shit and put them out of their misery.

This amounts to financial terrorism of the worst kind. It affects small and
large companies and creates untold losses across the board. It is entirely
unproductive. The world would be a better place if the pieces of shit who
engage in this sort of financial terrorism simply didn't exist.

Happy New Year.

Linode folks: I'm renting another server next week. Don't need it. Just want
to support your effort and, in a tiny way, help mitigate losses. I might just
give it to the kids in the robotic team I mentor so they can play around in a
real server environment.

~~~
mschuster91
> Frankly, and I am going to be politically incorrect here, these are the
> kinds of cases where I wish there was a "special forces" kind of task force
> to hunt down these pieces of shit and put them out of their misery.

Oh, it's certainly in the capability of the FBI and the NSA to hunt them down.
Even child fuckers on the dark web got busted.

The problem is priority: unless either child porn or a huge US company is
involved, the three-letter-agencies don't give a shit about this kind of
crime.

~~~
prawn
Which is a shame, because some action would probably dissuade future attempts.

------
atom_enger
Thanks for the update and the hard work. You all work a job that requires a
lot of sweat and tears that goes unappreciated from many levels of our
society. Making the internet work is hard work.

Know that you're in good company here and that we're rooting for you.

~~~
alexforster
We appreciate this more than you know. Many of us have had the holidays ruined
by these utterly relentless attacks, and it's a difficult thing to try and
explain to our loved ones. Support from the community really helps.

~~~
Zancarius
When I noticed this on Christmas Day (and saw my suspicions confirmed on the
status page), my first thoughts were along the lines of "Wow, that's going to
ruin Christmas for a lot of Linode families this year." My heart sank for you
guys. I can't even begin to imagine the stress it's placing on friends,
relatives, and immediate family members. (The inevitable "But it's
Christmas..." comes to mind.)

I just want to say that I switched this summer to Linode from DO for my
personal sites, and I've been very happy with your services. I plan on
launching a few more things in the coming year, and these attacks have
galvanized my resolve to continue supporting you through all of this. Thanks
for everything you guys are doing, and keep up the good work!

I know it's unlikely to ever happen given the nature of these attacks, but I
really do hope the perpetrators are found, brought to justice, and locked away
for a long, long, long time.

------
jafingi
My London Linodes' stats from the past 7 days: 40 outages. 6h25m downtime.

What upsets me the most is that customers haven't heard a single word from
Linode. If you weren't watching their status page (and have server monitoring
on your servers), you'd be clueless.

The least they could do is to email affected customers about what's happening,
and the time frame they need to fix it.

But one week of continuous issues is just not good enough.

~~~
jafingi
That being said, I love Linode, and will not use any other primary provider
for my servers. They have been dead stable for years before this happened,
and I never hit any bottlenecks (e.g. from overselling servers like other
providers). And I feel sorry for the network engineers trying their best to
fix this. The missing information is a Linode customer relations issue, not
the engineers'.

Happy new year to everyone. And looking forward to a great 2016 at Linode.

------
scurvy
1) Why in the world are you exposing your router control planes to the outside
world? That should be ACL'd off (in stateless firewall rules and the routing
engine) to only allow access from a few IPs.

2) Your transit providers should be defending their infrastructure. I've never
seen a transit provider allow an attacker to take out their /30 serials or IX
addresses. This is their network after all. If attackers try to hit the serial
between customer and provider, you just readdress the serial to RFC1918 space.
You don't really need a routable address there other than to make traceroutes
easy to read. If they attack farther upstream in the provider's network, you
just add ACL's at the provider edge. Nothing external will ever need to reach
a provider's core. This is basic, basic stuff.

Next time, don't only run your network on house bandwidth (HE, TelX, etc). Or
in other words, caveat emptor.
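The readdressing trick in point 2 works because RFC 1918 (and, for IPv6, RFC 4193 ULA) space is not globally routable, so outside attackers can't target those link addresses directly. A small sketch of the range check using Python's stdlib:

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
RFC4193 = ipaddress.ip_network("fc00::/7")  # IPv6 unique local addresses

def is_globally_unroutable(addr: str) -> bool:
    """True if addr sits in RFC 1918 (IPv4) or RFC 4193 (IPv6) space."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return any(ip in net for net in RFC1918)
    return ip in RFC4193

print(is_globally_unroutable("10.0.0.1"))  # True: usable for a p2p serial link
print(is_globally_unroutable("8.8.8.8"))   # False: reachable from anywhere
```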

~~~
adamzoz
I want to cut Linode some slack; they have been great. But those were my
thoughts when the employee above divulged that info. It all seems a little
like using cold fusion to run your whole shop.

People don't need to jump ship; they need plans in place to deal with
problems like this, even just an rsync.net account.

~~~
scurvy
Judging by the downvotes I got, there are people on HN who must think you
should expose your router's control plane to the Internet. That, or they
expect transit providers not to protect their network.

There's no intelligent discussion left here. It's devolved into a sympathy
vote. "You said something mean about Linode. I love Linode. Downvotes for
you."

If Linode had posted the update to NANOG, it would have been more productive.
I don't often say that.

~~~
alexforster
You may want to go pinging around some of your own tier 1 crossconnects. I bet
you'll be surprised.

~~~
scurvy
They all respond because I asked them for it. About half didn't at turn-up.
It's super trivial for them to null-route them or readdress into RFC
1918/4193 space.

Or are you referring to xconnects inside their network? That's up to them to
work out and I've never seen a provider just abandon their network while under
attack.

~~~
alexforster
You may want to reconsider. CoPP only goes so far, as we've learned the hard
way.

~~~
scurvy
I don't run Cisco, but thanks for worrying about that. Also, stateless ACL's
should protect against overrunning the control plane.

------
pbowyer
Link to original thread, which has been getting comments up until today:
[https://news.ycombinator.com/item?id=10806686](https://news.ycombinator.com/item?id=10806686)

------
bm98
> a bad actor is purchasing large amounts of botnet capacity in an attempt to
> significantly damage Linode’s business.

I wonder what size investment this is taking, and what the end-game is for the
bad actor. Unless Linode's mitigation tactics are increasing the bad actor's
costs, what's to stop the bad actor from continuing the attacks until Linode
goes out of business?

~~~
scurvy
Stolen credit cards go a long, long way in AWS and GCE capacity.

~~~
nickpsecurity
Exactly. Plus, botnet capacity was cheaper than AWS last I checked, at
around $1 a machine. It's so easy to pull this off on the cheap that it
could be anyone, for any reason.

------
psxhacker
Greetings.

I know I might be coming late to this, but being one of your very satisfied
customers, and having experienced this type of issue multiple times with
other providers that didn't even bother to acknowledge that there was an
issue, I can say that I will remain with you regardless.

Also, to those saying "I have all my business running at Linode so this is
unacceptable", I say only this: you get what you pay for, and for a VPS
service you won't find better than Linode. If you have something critical
running for YOUR clients, then it is YOUR responsibility to ensure resiliency
against this type of situation. Linode is a VPS provider, after all, and the
reason you are making money is that someone doesn't know enough to go to the
VPS host themselves.

Good luck making a profitable business and milking your customers running on
AWS or Azure. You'd be broke and in debt after the first DDoS and
over-bandwidth charge from either of them.

I work at a service provider myself, and I understand what you guys have had
to deal with over the last 10 days. You have my full support.

------
switch007
I'm a bit surprised to see the update from someone who is not senior
management/C-level (as far as I can tell). Where is the communication from the
CEO/CTO?

It seems a bit unfair to have this fall on Alex's shoulders. I could be way
off base, happy to be put right. I'm sorry to hear about your ruined holidays.
Hopefully you'll get some time off soon :)

------
tunesmith
That's crazy... I have a linode box with several low-traffic websites on it,
old projects I've wanted to keep around for archival purposes. I picked linode
because I wanted root access and they were cheap but really just because I
wasn't sure of better options. I suppose there is always t2.nano.

~~~
atom_enger
You get what you pay for. t2.nano isn't as fun as it sounds. Just wait until
someone soaks up all the available instances surrounding yours and starts
generating CPU steal on your box, thus making everything in your OS take
50-90% more time to complete. I've seen this happen on t* and m* instance
types and it's especially the case when they're the smaller sizes in that
instance class.

------
circuit_breaker
Linode has always been a great host. Sure, they've had their growing pains,
but I've never been happier with a virtual hosting provider, even with their
support. But yes, days without communication are not a good thing. Let's hope
they learn from this.

------
vox_mollis
ISPs already spend plenty of money on DPI and HTTP injection gear. It would
cost next to nothing to do basic egress filtering and detecting+throttling
known compromised customers.
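The "basic egress filtering" mentioned here is essentially BCP 38 (RFC 2827): an ISP forwards an outbound packet only if its source address belongs to one of the ISP's own prefixes, which stops its customers from spoofing. A minimal sketch of that check (the prefix list is illustrative):

```python
import ipaddress

# Illustrative: prefixes this hypothetical ISP actually originates
OUR_PREFIXES = [ipaddress.ip_network(p)
                for p in ("198.51.100.0/24", "203.0.113.0/24")]

def egress_permits(src_addr: str) -> bool:
    """BCP 38-style check: only forward packets sourced from our own space."""
    ip = ipaddress.ip_address(src_addr)
    return any(ip in net for net in OUR_PREFIXES)

print(egress_permits("198.51.100.7"))  # True: legitimate customer source
print(egress_permits("192.0.2.1"))     # False: spoofed source, drop it
```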

And yet, we still get DDoS attacks. Why?

~~~
mike_hearn
Because these days "botnet" can easily mean "botnet of compromised Linux
servers" or "botnet of WiFi routers". The biggest DoS attacks are often based
on exploiting UDP based protocols and so you end up being attacked by
ISP/university-sized DNS or NTP servers.
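Those UDP protocols are attractive because responses are much larger than requests, so each spoofed request multiplies the attacker's bandwidth. Rough arithmetic, using approximate amplification factors from US-CERT alert TA14-017A:

```python
# Approximate bandwidth amplification factors (US-CERT alert TA14-017A):
# DNS can reach roughly 54x; NTP's monlist command roughly 556x.
FACTORS = {"dns": 54, "ntp_monlist": 556}

def reflected_mbps(attacker_mbps: float, protocol: str) -> float:
    """Traffic the victim sees when attacker_mbps of spoofed requests
    are bounced off open reflectors for the given protocol."""
    return attacker_mbps * FACTORS[protocol]

print(reflected_mbps(10, "dns"))          # 540.0: 10 Mb/s becomes ~540 Mb/s
print(reflected_mbps(10, "ntp_monlist"))  # 5560.0: or ~5.6 Gb/s via NTP
```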

------
danieltillett
I think the lesson here is don't rely on one supplier. I have my tiny
infrastructure spread over three different suppliers in different geographical
locations. Plan for the worst and hope for the best.

Edit. This is in no way a criticism of linode. The worst outcome is if we all
end up with one monopoly supplier. I have deliberately avoided using the big
player in this space as I want support diversity. This makes my job harder,
but it is better for us all if we don't put all our eggs in the one basket.

~~~
pbowyer
Have you done a write-up of how you've managed to keep a real-time sync of
assets/database? Or are you hosting static websites?

~~~
mikegioia
This is what I'd like to know too. It's easy to suggest mirroring your
infrastructure on multiple providers but in practice it's very difficult.
Multi-master SQL in particular is non-trivial and that's just one piece of the
puzzle. We have MongoDBs, CouchDBs, Redis, and Elasticsearch -- each with
varying difficulty and costs to cluster.

~~~
atom_enger
You're definitely not mirroring your infrastructure any time soon :). Not to
say that it's impossible because I don't believe in impossible, but that's
quite a stack to replicate.

We have much the same stack at Reverb.com and I focus more on creating a
disaster recovery plan than tackling a mirror of our infra. I'm focusing on
bringing up our stack in another region of AWS if we ever need to, rather
than mirroring everything.

Sometimes a mitigation plan is far more useful than building in excessively
complicated redundancy.

~~~
mikegioia
That's interesting and probably more in line with what I need to do. It
wouldn't be difficult for me to bring the entire stack online in another
datacenter, but if all of your customer data resides on (for simplicity) a
master SQL node in Linode-Dallas and that datacenter goes offline, how can you
bring that backup node online when you can no longer access the Dallas
machines?

Are you suggesting to have a slave ready to become master in the 2nd
datacenter? If that's the case, my main area of uncertainty is: if you promote
a slave to master in a 2nd datacenter and that new master accepts writes, you
then have the brutal problem of propagating those writes back to the old
master when it comes back online.

The whole thing may exceed my capability right now, but I think having a
disaster recovery plan is far better than a mirror, now that I think it
through.
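One common way around the write-propagation problem described above is to fence the old master during failover, so it can never accept writes again until it has been rebuilt as a replica of the new master. A toy sketch of that state change (names are illustrative, not a real database API):

```python
class Node:
    """Toy stand-in for a database server's replication role."""
    def __init__(self, name: str, role: str = "replica"):
        self.name = name
        self.role = role
        self.writable = role == "master"

def failover(old_master: Node, standby: Node) -> Node:
    """Promote the standby and fence the old master: it must be rebuilt
    from the new master before rejoining, so stray writes can't diverge."""
    old_master.role, old_master.writable = "fenced", False
    standby.role, standby.writable = "master", True
    return standby

dallas = Node("dallas-db1", role="master")
newark = Node("newark-db1")
new_master = failover(dallas, newark)
print(new_master.name, dallas.role)  # newark-db1 fenced
```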

------
rast-a
Out of curiosity: why is it so hard to track the real origin of DDoS attack,
who's behind them and what they are after?

~~~
kevinbowman
Imagine the scenario: person A sends a DNS request with a spoofed source
address to a bunch of open DNS resolvers, so they send their (much larger)
responses to person B. Now imagine that person A is actually part of a large
botnet, being controlled by person C via some smoke and mirrors.

If you're person B (under attack) it's pretty difficult to track through all
of that to person C. You'd need a lot of cooperation from people (likely in
many different countries) who really just want to go back to their normal
business. They're likely also charging for the traffic, so they're not really
that bothered, and they're each only seeing a small proportion of what person
B is seeing so they don't see it as much of a problem (so aren't likely to be
inclined to get involved).
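A toy model of why tracing is hard: the resolver answers whichever address the packet claims to come from, so the only party the victim ever hears from is the resolver (all names illustrative):

```python
def resolver_reply(request: dict) -> dict:
    """An open resolver answers the *claimed* source; UDP has no handshake
    to verify the claim, so the reply goes to the spoofed address."""
    return {"to": request["src"], "body": "large DNS answer"}

# Person C directs bot "A" to spoof victim "B" as the source:
spoofed_request = {"src": "B", "real_sender": "bot-A", "controller": "C"}
reply = resolver_reply(spoofed_request)

print(reply["to"])             # B: the victim absorbs the response
print("real_sender" in reply)  # False: nothing in it points back to A or C
```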

------
elinchrome
How do botnets still exist? Wasn't that a problem that should have gone away
with Windows XP?

~~~
zepto
Windows XP hasn't gone away.

~~~
elinchrome
So are the botnets that still exist mainly running on the legacy XP boxes?

It says here that it's roughly 10% of machines. It's safe to assume at least
half are infected. But aren't they mostly third world?

[https://www.netmarketshare.com/operating-system-market-share...](https://www.netmarketshare.com/operating-system-market-share.aspx?qprid=10&qpcustomd=0)

~~~
zepto
Android too.

[https://encrypted.google.com/search?hl=en&q=android%20botnet...](https://encrypted.google.com/search?hl=en&q=android%20botnets)

------
geuis
jsonip.com is hosted on Linode. It's been averaging roughly 6 Mb/s inbound
for months, but in the last week it's been about 8.5 Mb/s. I'm not sure
whether the uptick has anything to do with the DDoS attacks or not.
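For the curious, a what's-my-IP service like this can be tiny. A minimal sketch of a JSON endpoint using only Python's stdlib (not jsonip.com's actual implementation):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class IPHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Echo the connecting client's address back as JSON
        body = json.dumps({"ip": self.client_address[0]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To run: HTTPServer(("0.0.0.0", 8080), IPHandler).serve_forever()
```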

~~~
brobinson
Did you also run jsonip.org? I used that occasionally (not programmatically,
just when I wanted to see my IP) and it started returning a 502 a few months
ago.

