
Why Dropbox decided to build its own infrastructure and network - zachperret
https://techcrunch.com/2017/09/15/why-dropbox-decided-to-drop-aws-and-build-its-own-infrastructure-and-network/
======
antoncohen
Dropbox hasn't dropped AWS, they moved things off AWS as it made sense to. The
article is talking about two things, the move of file storage and a network
backbone. Neither of which were done recently.

The file storage move from S3 to Magic Pocket is detailed in these blog posts:

[https://blogs.dropbox.com/tech/2016/03/magic-pocket-infrastr...](https://blogs.dropbox.com/tech/2016/03/magic-pocket-infrastructure/)

[https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pock...](https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pocket/)

[https://blogs.dropbox.com/tech/2016/07/pocket-watch/](https://blogs.dropbox.com/tech/2016/07/pocket-watch/)

The network backbone is talked about here:

[https://blogs.dropbox.com/tech/2017/09/infrastructure-update...](https://blogs.dropbox.com/tech/2017/09/infrastructure-update-evolution-of-the-dropbox-backbone-network/)

~~~
bsenftner
The 'value' cloud services provide is purely for experimental services an
organization does not want to commit physical assets to. Those deluded into
believing cloud services provide any value beyond test service deployments are
propaganda poster boys for today's tech sucker awards.

~~~
luckydata
You are extremely wrong. The Dropbox story shows that if the bulk of your
value comes from selling a commodity (storage) then you need to improve your
margins by moving away from a provider that also makes the bulk of their money
selling the same commodity.

For companies where the value lies in the utility of a service that can't be
easily replicated, you have pricing power to make the convenience of AWS worth
the expense.

~~~
user5994461
Pretty sure that the value of Dropbox doesn't come from the storage. It comes
from their software. The magic thing that lets you replicate all your data
seamlessly between as many computers and phones as you want, without the need
to click any button or understand what the word storage even means.

------
epa
This means their EBITDA now probably shows profitability (servers they own are
amortizable [the A in EBITDA], whereas AWS is expensed.)

Edit: Amortization/Depreciation can _generally_ be used interchangeably.

~~~
lobsterloga
Maybe but OpEx is tax deductible. CapEx is not.

~~~
qaq
I am pretty sure if they desire they can convert CapEx into OpEx.

~~~
toomuchtodo
Sure, if they sell the storage system to someone and arrange a leaseback.

~~~
0xbear
That's how Microsoft manages its campus real estate. :-)

~~~
dx034
Nearly every company does that. It's amazing how little many companies
actually own. I've worked in offices where everything from building over
furniture to even the plants was leased. The company ran with close to zero
assets (services business). Pretty sure that this wouldn't make any sense if
it wasn't for tax purposes.

------
swampthinker
I'll be 100% honest, I didn't realize Dropbox was still on AWS. You figure at
a certain scale it makes more sense to run your own solution.

~~~
xbmcuser
Yeah, and that's why it amazes me that Netflix uses Amazon, a direct
competitor that uses the money it makes from Netflix to fund Amazon Prime,
which competes with Netflix.

~~~
StreamBright
Netflix is probably in a better position to do the math than random people on
HN. I think they have done their homework and decided that it is overall
cheaper to keep _some_ of their infra on AWS.

~~~
dna_polymerase
Yes! Thank you! Every time some piece like this is posted 'experts' arise here
and know everything better. Like there is a bunch of idiots working at Netflix
who are happily burning money...

------
mc32
What does AWS do when someone moves 500PB of storage off of their systems? Do
they sit idle till some other big customer comes along or is their growth so
phenomenal that a 500PB move off doesn't even slow new (storage) deployments
at all and they just keep up their pace?

~~~
late2part
with no actual facts, i'll speculate.

Let's assume they have 12 buildings at each region with 500 racks in each:
6k racks.

Let's assume that S3 is 1/4 of their infrastructure, so 1500 racks.

Let's assume you can put 60 * 12TB disks in 4U and they use 52U racks. That's
13 * 60 * 12TB per cabinet, or 9.4PB per rack.

1500 racks at 9.4PB raw capacity with RF=2.1, so best case that's 1500 *
9.4 / 2.1 = 6700PB at a site.

So 500PB would be 7% of the space.

They're probably growing 5-15% per month like most hot companies now, so
that's 1-2 months' growth.

Of course these numbers are certainly off. But probably in the magnitudinal
ballpark.
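
For anyone who wants to poke at the guesses, here is a minimal Python sketch of the same arithmetic. Every constant is one of the assumptions stated above, not a known AWS figure; "52 racks" is read as 52U racks, and the function names are made up for illustration. The months-of-growth helper also assumes the site is full and that growth is a simple percentage of site capacity per month.

```python
"""Back-of-envelope check of the S3 capacity guess. Every input is a guess."""

BUILDINGS_PER_REGION = 12   # assumed buildings per region
RACKS_PER_BUILDING = 500    # assumed racks per building
S3_SHARE = 0.25             # assumed fraction of racks used by S3
RACK_UNITS = 52             # assumed rack height in U
CHASSIS_UNITS = 4           # one storage chassis occupies 4U
DISKS_PER_CHASSIS = 60      # assumed disks per 4U chassis
DISK_TB = 12                # assumed disk size in TB
REPLICATION_FACTOR = 2.1    # raw bytes stored per usable byte (RF)


def raw_pb_per_rack() -> float:
    """Raw capacity of one rack in PB (1 PB = 1000 TB)."""
    chassis_per_rack = RACK_UNITS // CHASSIS_UNITS  # 13 chassis in 52U
    return chassis_per_rack * DISKS_PER_CHASSIS * DISK_TB / 1000


def usable_pb_per_site() -> float:
    """Usable S3 capacity at one site after replication overhead."""
    s3_racks = BUILDINGS_PER_REGION * RACKS_PER_BUILDING * S3_SHARE
    return s3_racks * raw_pb_per_rack() / REPLICATION_FACTOR


def share_of_site(moved_pb: float) -> float:
    """Fraction of one site's usable capacity that moved_pb represents."""
    return moved_pb / usable_pb_per_site()


def months_of_growth(moved_pb: float, monthly_growth: float) -> float:
    """Months of growth (as a fraction of site capacity) equal to moved_pb."""
    return share_of_site(moved_pb) / monthly_growth


if __name__ == "__main__":
    print(f"raw PB per rack:    {raw_pb_per_rack():.2f}")
    print(f"usable PB per site: {usable_pb_per_site():.0f}")
    print(f"500PB share:        {share_of_site(500):.1%}")
    for growth in (0.05, 0.15):
        months = months_of_growth(500, growth)
        print(f"at {growth:.0%}/month: {months:.1f} months")
```

Without the intermediate rounding this gives 9.36PB per rack and about 6,686PB usable per site, which lines up with the rounded 9.4PB and 6700PB figures. Changing any single constant shows how sensitive the result is to that guess.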

~~~
mc32
My guess (and it's a terrible guess) is this may be about 4mo growth... and
they might take the opp to retire older lower capacity drives and just keep
up the build-out.

But who knows...

~~~
chis
If you're ordering 500 petabytes you probably sign a contract with Amazon, and
have to let them know in advance if you want out.

~~~
mc32
That's a good point --but they didn't start with 500PB, when they first went
to AWS they probably only used a few PB to begin with and grew "organically"
over the years. But likely still had some contractual stuff for this kind of
scenario.

------
ChicagoDave
I had a chance to compare all of the file services a few years ago in a large
corporation and I thought out of all of them, the DropBox engineers were the
strongest. Microsoft second.

------
qaq
People making a sane choice is very refreshing. I wish my company would make
the same decision.

~~~
mrb
Make the sane choice of applying to work for Dropbox =:)

------
raides
I am curious about the hardware build-out used for the storage nodes. Some of
the major issues I have seen with all the storage appliances out there are the
following:

1. Network throughput on the appliance is fast, but at enterprise scale the
10GigE cards used become a bottleneck for transactions because of how the
software hypervisor scales the data.

2. Power consumption of the appliances in a rack-mount environment is too
high, so rack space has to stay empty because of the facility's per-rack power
limits.

3. The software hypervisor scales the stack vertically and relies on the
software to load balance horizontally. Performance in a high-transaction
environment becomes dependent on the software to scale instead of the natural
horizontal distribution that can be set up on the hardware out of the box.
Standard multi-purpose storage arrays scale horizontally with very little
overhead from traditional software storage management. I only found one
company whose software does not force the stack to be vertical, but they fail
to meet reasonable network/power performance.

Streaming petabytes of data to keep a dynamic constant (a static overall
storage requirement that changes its data life cycle via retention rules)
becomes very hard with premade hardware.

Does anyone have any recommendations, or has anyone attempted a similar exodus
from S3 that they can share?

------
justonepost
Is 500 petabytes really that epic? 1500 tapes? I could probably stuff that in
the back of my minivan.

~~~
jsemrau
Great comment for 2017. I would like to see the 1987 version and the 2037
version of it.

~~~
jerf
I recently misplaced one of my 32GB SD Micro cards I was using to have an
alternate image for my Raspberry Pi. So I just bought another one for the
price of a half-decent lunch.

And I did have a flashback to my first hard drive, all 220 glorious megabytes
of it, and a great deal more expensive than "a decent lunch", and got a sweet
hit of that _yeah... I am in the future, aren't I?_ feeling.

------
iUsedToCode
Dropbox is cool and all, but i hate their pricing model. I just want to keep
about 50-100 GB there (i don't hoard stuff). I don't wanna pay $10 / month for
that. At backblaze i pay less than $0.5 / month. I could pay quadruple, since
Dropbox is a lot better service (and quite a different one, too). But not 20
times as much.

I know that Dropbox doesn't care about my tiny dollars and all. But why not
let customers pay for what they use? This "constant growth" bullshit is
probably the reason they don't care.

~~~
mgkimsal
what bugged me about their pricing model (perhaps it's changed?) is that
things people share with me count to my cap. I had a client share stuff with
me via dropbox. So I signed up. I get "2 gig free!" Yay. Another client shares
stuff with me. Then another. Then another. Then I'm at my 2gig cap with just
other people's stuff. They all think it's great, because none of them are
paying for it. _I_ needed to pay money to support their free use of the
service, and I declined. Just one more $x/month service I didn't need to get
hooked to. Maybe the pricing model has changed what they count now?

~~~
kungito
How do you have "clients" and yet can't afford $10 a month to not have to
think about it?

~~~
mgkimsal
It was largely the principle of the thing. It's _yet another_ $x/month, on top
of other services being paid for to help service/manage those clients. Example
- PM tools. I'm already paying for project management tools - put the files
there, damnit. "oh, but I like dragging to my dropbox!". Oh, and someone else
uses box.com, and so on.

Again, what gets my goat is partially the payment, but mostly the "it's free
for them, but I end up needing to pay to accommodate their use of the service"
angle. If they - other parties - were paying for it first, it would bother me
less. But raving about how 'free' a tool is which _I_ have to pay for to work
with them (and they're using it precisely because it's free) bugs me.

Getting sucked into a few more $x/month things here and there 1) dilutes
where stuff is supposed to live (wrt files) and 2) just becomes a drag on
finances. A handful of services can end up going from several hundred dollars
into the thousands if you don't keep tabs on things.

Maybe I'm an HN failure because saving several hundred dollars per year
matters to me? I guess my skills must suck - most people can apparently rustle
up $300/hr Rust/Go/Elixir project work simply by starting to formulate the
idea that they're considering taking project work. I'm not that skilled.

------
shevy
> “While cost is always something that we consider, our goal with our network
> expansion was to improve performance, reliability, flexibility and control
> for our users — which we have succeeded in doing,” a company spokesperson
> told TechCrunch.

Who believes this?

It was, as anyone could easily notice, about reducing cost. Which is
fine, everyone does so, so why not admit that it was the primary impetus?

I would not want to outsource control over any larger company that I were to
run to other, even bigger companies.

~~~
maccam94
Performance and reliability definitely affect subscription rates. Revenue
growth is actually more important than decreasing costs (as long as your costs
don't start increasing disproportionately).

------
sandworm101
Three data centers "built" by only the dozen people on the infrastructure
team? Not possible. I wish articles like this wouldn't hype small teams where
it is obvious that almost all of the work was outsourced. It would seem that
Dropbox here was still operating as a customer: ordering rather than physically
building much of anything. Those dozen Dropbox people were not running cables.

~~~
iscoelho
The article is unfortunately not very specific and I cannot find much
information with a quick search, but:

It's very possible that they did this all on their own provided they rented
cabinets/cages in an existing facility like Equinix, which is extremely
common. Then they do not need to manage power/generators, fiber into the
building, or any other data center necessities.

It does not take very many people to do the wiring inside a cage, especially
considering Dropbox has been doing this over the span of a few years. If
you've ever been physically in a data center (personally visited a few Equinix
facilities myself), it's mostly one person from a company working in a cage
that is wiring everything. I've rarely seen multiple people do wiring, and
that same person will come back every day thereafter to continue working until
the job is done.

What they are doing sounds entirely feasible. If money is no object in regards
to equipment, with 3 data centers, I'd honestly say you only need 4 competent
people to get the job done.

3 on-site. 1 remote/office.

From there, the more the merrier. A guy to lift the equipment up as well is
nice sometimes - can be pretty heavy!

------
kibwen
The article is dated today, but isn't this news from a few years ago?
[https://www.wired.com/2016/03/epic-story-dropboxs-exodus-ama...](https://www.wired.com/2016/03/epic-story-dropboxs-exodus-amazon-cloud-empire/)

------
paxy
They did this a while ago.

Here's an article from last year with a lot more detail -
[https://www.wired.com/2016/03/epic-story-dropboxs-exodus-ama...](https://www.wired.com/2016/03/epic-story-dropboxs-exodus-amazon-cloud-empire/)

------
notyourday
(a) CAF ran out

(b) Without CAF, GC and AWS pricing is ridiculous - somewhat akin to paying
$9.99/lb for a Perdue chicken a day before the expiration

(c) Without the scale of AWS and GC one can get vendor prices at Google/Amazon
MFN + 10% any day or +5% after schmoozing.

------
knodi
I'll save you a click. Cost savings.

------
goptimize
"We’re talking about a company that had 1500 employees, with just around a
dozen on the infrastructure team" - what is the remaining 99% of the company
doing, marketing?

~~~
vacri
I was also thinking that those numbers are really weird, for a company which
does infrastructure(fileservers)-in-the-cloud. Obviously some devs doing
software for various platforms, business-side folks and so forth. But 1500:12
is a really weird ratio, given the core function of the business. If they're
not spending on infra staff... what staff are they spending on?

~~~
zaroth
Seriously, 12 people controlling all of that sounds like a liability. Are they
allowed to meet all together?

------
flamedoge
did Dropbox really only have 12 people in the infra team at the time?

------
uiri
Can we remove /amp/ from the end of the URL? Some people still use HN on a
laptop or a desktop.

Also, as mentioned elsewhere in the comments here, it should be retitled to
"Why Dropbox decided to drop AWS"

~~~
driverdan
I'm against Google hosted amp but what's the issue with self hosted? This page
is _so_ much nicer than the overly busy normal TC site.

~~~
mulmen
The problem is you making that decision for me.

~~~
rev_bird
Wait, but using the non-AMP version is a decision too. Why do _you_ get to
make that one instead? Shouldn't the person posting the link get some
discretion?

~~~
mulmen
Principle of least surprise. techcrunch.com has a default experience, amp is
secondary. Since we can't all read each other's minds the default behavior
should be to use the ...default.

------
mr_tristan
Um, what is this crap? When did they actually move? How long did it take?

The title of this HN link leads you to believe this is recent, but the article
makes it seem like a multi-year effort, that in fact, could have finished a
long time ago.

~~~
dandr01d
I agree. The title is misleading and should be changed.

~~~
mr_tristan
But the blog itself lacks a lot of information.

The timeline is really, really significant here. Did they initiate this back
in 2013? 2015?

My company has seen a remarkable shift in trust for online services over the
last two years. Was Dropbox ahead of this curve? Were they someone who brought
this change in mentality around?

This blog article is crap because without a basic sense of _when_, it could be
that they were riding the coattails of other industry leaders. Or they were
leading the charge.

~~~
manigandham
It took several years and they finished early last year, much better article
on Wired:
[https://www.wired.com/2016/03/epic-story-dropboxs-exodus-ama...](https://www.wired.com/2016/03/epic-story-dropboxs-exodus-amazon-cloud-empire/)

TechCrunch is only useful for headlines today, go elsewhere for actual
content.

------
tcptraceroute
The title here isn't quite accurate.

Dropbox has moved user data from S3 to its own colocated data centers over the
past few years, and is doing compute in those data centers too. The
compute actually existed for quite a long time - in the past you'd be talking
to a Dropbox-run server which would connect back to S3 to retrieve data.

Dropbox is definitely still an AWS customer, just not a major S3 or EC2
customer anymore. For example, all transactional email uses SES, and DNS is
hosted on Route 53.

