
Dyn Statement on 10/21/2016 DDoS Attack - QUFB
http://hub.dyn.com/dyn-blog/dyn-statement-on-10-21-2016-ddos-attack
======
kev009
Does anyone know details on the throughput or packets per second?

To me, this smells of gross negligence on the part of Dyn's executives and of
all the unicorn web executive teams that single-source their infrastructure.

As a counterexample, I encourage people to go research the architecture of
Verisign. I attended a talk by Verisign, which runs .com and .net as well as
root servers. They are a constant DDoS target. They are necessarily a single
point of failure, appointed by ICANN to perform these services for the most
common TLD in the world. If they mess up, they will probably lose that status.
They've had over 16 years of uptime.

Every layer of the stack is dual or triple sourced. Two server makers, two
generations, two router vendors, two switch vendors, two network
architectures, POP diversity, peering diversity. Services and capacity always
added in pairs. Two separate NOC and Ops teams. FreeBSD, Linux and Solaris.
nsd, bind, and an internally developed userspace server on netmap. Code
upgrades are deployed in halves. Everything is structured to keep running on
half its capacity for The Big One zero-day that I don't think we've really
seen yet.

They were doing 10 Gbps of DNS on a single commodity server three years ago.
That headroom makes it easy to absorb a DDoS and gradually clamp it at the
public peering points.

The above should be standard operations structure for any breakaway success
web business. It's not that hard, but you have to claw the charlatans out of
the management chain and put in professionals who take the career seriously.
Professionals.

What's really appalling is hearing that some unicorn web biz uses one cloud
vendor and one DNS provider like Dyn. This is absolutely trivial stuff to
multi-source, and it has almost no effect on OpEx.
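
It's even easy to audit. Here's a rough sketch in Python (the third-party
dnspython package, function name, and zone are my own illustrative choices)
of checking whether a delegation actually spans more than one provider:

    # rough sketch: does a zone's delegation span more than one DNS provider?
    # requires the third-party dnspython package (>= 2.0 for resolve())
    import dns.resolver

    def ns_providers(zone):
        providers = set()
        for rdata in dns.resolver.resolve(zone, "NS"):
            # crudely group nameservers by registrable domain,
            # e.g. ns1.p01.dynect.net -> dynect.net
            labels = str(rdata.target).rstrip(".").split(".")
            providers.add(".".join(labels[-2:]))
        return providers

    providers = ns_providers("example.com")  # zone name is illustrative
    if len(providers) < 2:
        print("single-sourced DNS:", providers)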

Last I heard from a Dyn insider, they were eliminating FreeBSD and had
suffered a brain drain a few years ago. Overall, a pretty unsurprising
outcome.

Internet reliability is no more complex, and certainly much cheaper, than
analogous critical infrastructure like electrical generation for a region. I
am constantly disappointed in this field for its failure to recognize
professionals and install them in management. We've been doing Internet
architecture in its modern form for 30 years. Get your shit together.

~~~
drieddust
Top management doesn't care because they have no downside. By cutting costs
they can collect additional bonuses. Even if they sink the company, they
still walk away rich and ready to screw someone else.

Middle managers are too busy saving their own asses to ask questions. If one
poor guy musters the courage to speak up, he is immediately shut down in the
name of falling in line. This is the usual response from top management when
they don't have an answer.

"This isn't a Democracy."

~~~
kev009
This mirrors my experience. And for public companies, the Board of Directors
is not attuned to these outcomes. For the most part B2B customers are in the
same boat, so everybody feigns an uncomfortable laugh or some moderate
outrage, puts out a silly postmortem with no intent to rectify the culture
that enabled it, and continues collecting an outsized salary. I adore
computers, but I really wish I could lateral into a career like construction
management, where the layman is much more aware of success and failure, and
of who is good and who is bad at their profession.

------
paradite
The style of this post-mortem is quite different from a typical post-mortem
of an attack on a large tech company such as AWS, Heroku, or GitHub.

There are very few technical details on the investigation and mitigation
process beyond phrases like "the NOC team was able to mitigate the attack and
restore service to customers" and "...but was mitigated in just over an hour".

Why was this written by a Chief Strategy Officer rather than by someone with
more technical knowledge and insight?

~~~
mino
Because it's not a post-mortem but a "statement", which they felt like
publishing to thank all the actors involved.

------
nmjohn
> We observed 10s of millions of discrete IP addresses associated with the
> Mirai botnet that were part of the attack. (linked article)

> ... the Mirai botnet was at about 550,000 nodes, and that approximately 10
> percent were involved in the attack on Dyn (from Level 3 CISO) [0]

Something really doesn't add up there. Even if it turned out that 100% of the
infected hosts in the Mirai botnet were targeting Dyn (i.e. 550,000 nodes),
that is still a small fraction of the number Dyn is claiming.

[0]: [https://threatpost.com/mirai-fueled-iot-botnet-behind-ddos-a...](https://threatpost.com/mirai-fueled-iot-botnet-behind-ddos-attacks-on-dns-providers/121475/)

~~~
dmourati
I believe Dyn's numbers are conflating two things:

1. The number of source IPs seen
2. The size of the botnet

They are most certainly not equal: UDP source addresses are trivial to spoof,
so a single host can show up as many distinct IPs.

------
helthanatos
I couldn't use Twitter or GitHub from noon till 4. If they mitigated the third
attack and no customers were affected, why were GitHub and Twitter still down
for me?

~~~
herlitzj
Same for us. We're backed by Shopify, and it was down until 4pm. Not sure I
buy that they quashed this by 1pm.

~~~
Zancarius
Can confirm as well. GitHub was inaccessible until about 5PM, and I'm in the
southwest. My provider peers with Level3, so it looks like nearly everything
gets routed through Los Angeles.

------
mirimir
> We observed 10s of millions of discrete IP addresses associated with the
> Mirai botnet that were part of the attack.

OK, so why don't Dyn staff identify owners of all IP subnets represented, and
provide each with a list of participating IP addresses? That's a trivial
exercise, right? And maybe they could publicly shame ISPs that didn't act on
the information.
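
The lookup half of that really is scriptable. A rough sketch in Python,
assuming the third-party dnspython package and using Team Cymru's public
IP-to-ASN DNS service (the IP is just an example):

    # rough sketch: map an attacking IPv4 address to its originating AS via
    # Team Cymru's public DNS lookup service (origin.asn.cymru.com)
    # requires the third-party dnspython package
    import dns.resolver

    def ip_to_asn(ip):
        rev = ".".join(reversed(ip.split(".")))
        answer = dns.resolver.resolve(rev + ".origin.asn.cymru.com", "TXT")
        # TXT record looks like: "15169 | 8.8.8.0/24 | US | arin | 1992-12-01"
        return str(answer[0]).strip('"').split(" | ")

    asn, prefix, cc, registry, allocated = ip_to_asn("8.8.8.8")
    print(asn, prefix)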

~~~
ssharp
If they are residential or business IPs, it would be nice for the ISPs to let
them know something on their local network is part of a massive botnet.

~~~
lilott8
Why would they? With how they're charging by the GB these days, they have a
financial incentive to _not_ tell you your refrigerator (or any device) is
part of a botnet; that's free money for them.

~~~
totalZero
Assuming you mean this seriously and not just as a tongue-in-cheek dig at
internet carriers...

Even if every refrigerator were on a pay-as-you-go plan, handling all of the
data may ruin throughput for other customers, especially at peak usage.

------
CaveTech
The attack was surely not mitigated by 1 PM ET. We were experiencing issues
until well after 3.

~~~
miken123
Dyn states their problems were fixed at 17:00 UTC, but I and most other
people in the Netherlands were seeing issues the whole evening. Something
about their story does not add up...

------
Shank
I would really like it if Dyn could confirm the 1.2 Tbps number, or say
whether it was actually higher.

As these attacks grow in scale, it becomes more and more important to know
whether this was an attack of record capacity or whether Dyn simply had
lower-capacity hardware/links in place. If we're already facing, for example,
2 Tbps attacks, a lot has to be done to make mitigating them easier, whether
through hardware or strategic upgrades.

~~~
jimjimjim
No.

What you get then is groups trying to outdo each other for bragging rights as
the largest.

------
pragone
Entertainingly, this page 503s currently.

~~~
Crosseye_Jack
[https://web.archive.org/web/20161022220033/http://hub.dyn.co...](https://web.archive.org/web/20161022220033/http://hub.dyn.com/dyn-blog/dyn-statement-on-10-21-2016-ddos-attack)

------
nodesocket
Overall a great write-up, however:

"Again, at no time was there a network-wide outage, though some customers
would have seen extended latency delays during that time."

That can't be true. Here on the west coast, I know of over 10 "name-brand"
sites that were absolutely down for over an hour. The east coast was
apparently hit even harder and for longer.

~~~
joatmon-snoo
Are you sure that it was Dyn that was down, and not a break somewhere else in
the nameserver chain?

I know that the digs I was running periodically were getting SERVFAIL
responses from Google DNS even though my local nameservers were actually
succeeding.
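
That kind of comparison is easy to reproduce. A rough sketch in Python,
assuming the third-party dnspython package (dnspython raises NoNameservers
once every upstream has failed, which is how a SERVFAIL surfaces):

    # rough sketch: compare answers from Google DNS vs. the system resolver
    # requires the third-party dnspython package
    import dns.resolver

    def try_resolve(name, nameserver=None):
        r = dns.resolver.Resolver()  # uses /etc/resolv.conf by default
        if nameserver:
            r.nameservers = [nameserver]
        try:
            return [a.to_text() for a in r.resolve(name, "A")]
        except dns.resolver.NoNameservers as e:
            return e  # raised when every upstream fails, e.g. with SERVFAIL

    print("google:", try_resolve("github.com", "8.8.8.8"))
    print("local: ", try_resolve("github.com"))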

------
dronemallone
Does anyone here know the actual technical details of the attack? How exactly
did the DDoS occur? Just a ton of bots making DNS requests over UDP??

Links to articles with tech details would be greatly appreciated.

------
ademarre
I was hoping to read a more detailed account of the attack and specific
mitigation strategies. Have such details emerged anywhere?

~~~
rayvd
The blog post indicates additional details are forthcoming.

~~~
ademarre
Sure. But they also say, understandably:

> _It is worth noting that we are unlikely to share all details of the attack
> and our mitigation efforts to preserve future defenses._

------
laluser
This should be a warning to everyone who was affected. Even the dependencies
you don't normally think about go down at some point. For DNS, make sure you
are using multiple DNS providers.

~~~
buro9
> For DNS, make sure you are using multiple DNS providers.

In this case, that would have made it worse.

If your domain CNAMEs off to another provider (as nearly all SaaS solutions
do, as do mappings to AWS servers, etc.), then you would have been affected by
the attack on Dyn regardless of whether you could change your nameservers in
time and had multiple providers at that level.
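
A quick way to surface those hidden dependencies is to walk the CNAME chain.
A rough sketch in Python, assuming the third-party dnspython package (the
hostname is hypothetical):

    # rough sketch: follow a name's CNAME chain to see which third parties
    # its availability silently depends on
    # requires the third-party dnspython package
    import dns.resolver

    def cname_chain(name, max_hops=10):
        chain = [name]
        for _ in range(max_hops):
            try:
                answer = dns.resolver.resolve(chain[-1], "CNAME")
            except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                break  # reached an address record, or a dead end
            chain.append(str(answer[0].target).rstrip("."))
        return chain

    # e.g. a support. subdomain CNAME'd to a SaaS vendor (name is hypothetical)
    print(cname_chain("support.example.com"))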

If you don't have any CNAMEs, then I think the better choice is to go
same-origin, same-provider for everything.

Which may sound bizarre... my personal sites (a lot of forums) would either be
up or down. I would argue that availability having a binary nature is a lot
better for end users than a constantly broken half-state that is frustrating
or impossible to use, even though a graph might show partial availability. A
hard, sudden failure shows up better in monitoring, triggers end-user feedback
faster, and is in many ways easier to debug and solve.

With so much CNAME'd, and SaaS still growing, replicating and failing over the
DNS that sits above those services wouldn't buy you anything.

A lot of sites lost Zendesk (their support.), StatusPage (their status.), and
PagerDuty (hey, they were quiet because no one could let them know). They
couldn't even help the end users who were having a bad time.

~~~
eslater
> In this case, that would have made it worse.

It might have made little difference, depending on the specifics of the
records in question, but it would not have made things worse.

------
paulddraper
Is Dyn any less vulnerable to this attack than 48 hours ago?

That's kind of the point of a retrospective, but I missed the part where they
say what they're doing now that they failed to do last week.

------
diegorbaquero
I haven't seen a comment from any IP-transit/peering company that provides
service to them. In fact, I haven't seen any of their infrastructure
providers comment at all. This is starting to seem more like a marketing
campaign, or human error (action?) blamed on a DDoS.

------
willvarfar
_Why_ might the Mirai botnet attack Dyn? What might their motives be?

------
cagenut
that's a killer IP list to license out

------
libeclipse
> It is worth noting that we are unlikely to share all details of the attack
> and our mitigation efforts to preserve future defenses.

Security through obscurity.

~~~
labster
Obscurity is a valid defense layer, so long as it is not the only layer, and
so long as the obscurity is not just a way of covering up garbage.

~~~
dllthomas
But it can also be substantially more costly than it's worth, and a lot of the
cost isn't very visible (people finding it marginally harder to get their jobs
done, being unable to learn from informed discussion on HN, &c).

