
Is it possible to host Facebook on AWS? - vinnyglennon
http://blog.sqlizer.io/posts/facebook-on-aws/
======
discodave
I don't understand why people think that companies can get special pricing
from AWS but not drive feature development. Large customers like Netflix, GE &
friends get to drive feature development. Heck, if you're the CIA and give AWS
$600 MM they will build a private region for you!

This post has two contradictory quotes in my opinion:

"It’s worth noting that this AWS price wouldn’t be what Facebook paid in this
hypothetical situation. Much like Snapchat and Netflix, Facebook would be a
heavy and influential user"

"Facebook’s years of specialization for running Facebook are in contrast to
AWS, whose storage is designed for multi-purpose (albeit-heavy) use." -
Amazon already has several tiers of storage (S3, S3 - Infrequent Access,
Glacier, CloudFront)... who is to say that they wouldn't be motivated to
introduce more classes by a large customer like FB.

~~~
BurningFrog
Sidenote: It's remarkable that Amazon happily provides server services to
Netflix, who is a huge competitor!

~~~
19eightyfour
It's not remarkable; it's Bezos' business strategy.

The way he thinks about it, having competitors in his space is not a threat to
his business, not just because he is prepared to undercut them on cost but
because he knows that his true focus is on customer satisfaction. And
competition, whether through providing customers with a choice or encouraging
his own systems to provide better services, is good for his business.

But in this case it's really just because Netflix is too insignificant to
matter to Amazon overall. It's just a tiny fraction of their revenue and what
they're trying to do.

And if you want to be Machiavellian, of course it's good, because then Amazon
has leverage over Netflix.

Seriously, how would it look if Amazon turned down a company because they were
competing? It would be an admission of defeat and weakness for Amazon. That's
not how an "apex predator" thinks.

Keep your friends close, and your competitors....

~~~
danielsamuels
Keep your friends close, and your competitors hosted

~~~
19eightyfour
I love that, seriously. Machiavelli's Guide to PaaS.

------
jzelinskie
I think a lot of the smaller details this article glosses over contribute to
FB's ability to run on so few machines. For example, comparing my anecdotes of
FB's internal network (they probably have the most organized/mature IPv6
network on the planet) with those of my AWS public cloud experience, I'd
reckon that a lot of FB applications would require architectural redesigns to
operate in that kind of environment.

The real question isn't "can FB be hosted on AWS?", it's "why isn't FB
competing with AWS?" because what they've already got is much better for the
range of applications that they deploy.

~~~
adventured
They went from $3.6 billion to $10.2 billion in net income in a year. The $3.6
billion is probably more net income than Amazon has generated combined in the
last two decades. The $10 billion is more than Amazon will earn a decade from
now.

Their focus should be solely on what is going to likely end up being one of
the three or four greatest money producing machines in world history (of those
owned by a traditional corporation that is).

Cloud computing profits are - and will remain - a joke compared to the $20
billion in net income they'll yield in just a few more years from what they're
already doing. Their focus should remain fairly narrow, they're a mere 13
years in at this point. Diving into cloud computing would be making the same
exact mistake that Microsoft made with Bing, and that Google made with all
their laughable social attempts, and so on. Facebook would split the market
further and likely still end up #2 or #3 at best. Their best engineering
talent should be solely focused on the golden goose social monopoly, which
nobody else has and nobody else can compete with.

~~~
bamboozled
"Their focus should be solely on what is going to likely end up being one of
the three or four greatest money producing machines in world"

Companies don't "produce money", that would just devalue currency.

I guess you had a good go at making it sound cool.

~~~
umanwizard
Is this your first time reading figurative language?

------
OJFord
> If we take the 2012 estimate to be true, Facebook’s server capacity
> outstrips Moore’s Law.

As well as inflation, and all manner of other totally irrelevant concepts.

~~~
tyingq
It could be relevant if there is some fixed refresh cycle of servers. That is,
newer servers can serve more end users because of the faster cpu, drives, etc.
The author seems to have skipped the concept though.

~~~
puzzle
Yeah, I looked through the article, but never does he mention that one rack of
today's server vintage might have an extra order of magnitude more RAM, if not
cores, than one from 5 years ago.

Ten years ago, four-core Barcelona Opterons were so hot and in high demand
that, if you wanted to be on the latest servers, you had to show a performance
boost worth the premium. Sometime this year you'll be able to buy dual
processor Naples systems with 64 cores (128 threads).

~~~
jjeaff
I assume that their assumption is that the Facebook application has grown in
complexity to keep pace with better servers. There is certainly a lot more
going on than there was several years ago.

------
TheChetan
In 2012 Facebook had 1BN users and 180K servers. 1,000,000 users / 180,000
servers = 5,556 users per server In 2017 Facebook has close to 2BN users.
2,000,000 users / 5556 users per server = 360,000 servers

Notice how the 1B has only 6 zeros. The awesome part is that this mistake
evens itself out later, yielding a correct number for "The of servers".
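
For anyone who wants to check the arithmetic, here is a minimal Python sketch.
The figures are the ones quoted above; the function and variable names are made
up for illustration. It also shows one way to read "evens itself out": if three
zeros are dropped from both user counts, their ratio is unchanged, so the
server estimate comes out the same.

```python
# Figures as quoted from the article; the names are illustrative only.
SERVERS_2012 = 180_000

def servers_needed(users_then, users_now, servers_then=SERVERS_2012):
    """Scale the server count linearly with users (the article's assumption)."""
    users_per_server = users_then / servers_then
    return users_now / users_per_server

# With the correct number of zeros (1BN and 2BN users).
correct = servers_needed(1_000_000_000, 2_000_000_000)

# With three zeros dropped from both user counts, as in the article's typo.
typo = servers_needed(1_000_000, 2_000_000)

print(f"users per server (2012): {1_000_000_000 / SERVERS_2012:,.0f}")
print(f"servers, correct zeros:  {correct:,.0f}")
print(f"servers, with the typo:  {typo:,.0f}")
```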

------
p0rkbelly
Keep in mind, you can bet your ass that prices would be drastically lower than
the public prices posted.

Large enterprises of any significance, and especially flagship/strategic
customers, don't pay list prices.

Every cloud provider has private pricing and enterprise discount programs.
That goes for all hardware vendors (e.g. Cisco, Palo Alto Networks, Oracle,
etc selling to any company) as well.

I'm sure Netflix pays nowhere near public pricing on AWS and I'm sure AWS pays
less than anyone in the world for an Intel CPU.

~~~
zeep
wholesale vs retail costs...

------
virtuallynathan
I'd have to assume Facebook is using well over 500TB/mo of outbound traffic.
I'd assume on the order of 1-2Tbps or more on average.

Disclaimer: I work for AWS.

~~~
kondro
Well over. 500TB a month is only 250KB per active user.

~~~
kondro
Oh. I see this might've just been a typo. Agree with the 1Tbps+ figure for
external bandwidth (and maybe 300%+ on that internally and between regions).
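
A rough sketch of the unit conversions behind those two figures, in Python.
The assumptions are mine, not the parent's: roughly 2BN active users (the
figure quoted elsewhere in this thread), decimal units (1 TB = 10^12 bytes),
and a 30-day month.

```python
# All constants below are assumptions for illustration.
ACTIVE_USERS = 2_000_000_000          # ~2BN users
SECONDS_PER_MONTH = 30 * 24 * 3600    # 30-day month

def bytes_per_user(monthly_bytes, users=ACTIVE_USERS):
    """Monthly outbound bytes divided evenly across active users."""
    return monthly_bytes / users

def monthly_bytes_at(avg_bits_per_second):
    """Total bytes moved in a month at a sustained average bit rate."""
    return avg_bits_per_second / 8 * SECONDS_PER_MONTH

# 500 TB per month, spread over all users, in KB per user.
kb_per_user_500tb = bytes_per_user(500 * 10**12) / 1e3

# A sustained 1 Tbps, expressed as PB per month and MB per user per month.
pb_per_month_1tbps = monthly_bytes_at(1e12) / 1e15
mb_per_user_1tbps = bytes_per_user(monthly_bytes_at(1e12)) / 1e6

print(f"500 TB/mo -> {kb_per_user_500tb:.0f} KB per user per month")
print(f"1 Tbps    -> {pb_per_month_1tbps:.0f} PB per month")
print(f"1 Tbps    -> {mb_per_user_1tbps:.0f} MB per user per month")
```

Under those assumptions, 500 TB a month is about 250 KB per user, while a
sustained 1 Tbps is a few hundred PB a month, which is why the TB figure looks
like a typo.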

------
drawkbox
Reddit and Netflix are hosted on AWS so yes you could.

You wouldn't run it the same way as on your own infrastructure, though; you'd
probably run it more horizontally and with more caching, but it could be done
much more cheaply than this estimate. In your own environment you can use
larger servers and dbs, and sharding or clustering is closer. The cloud is
different: you'd be able to run it, just differently. The cloud definitely
influences architecture.

~~~
jsmthrowaway
> Reddit and Netflix are hosted on AWS so yes you could.

Those are read-heavy properties (yes, even Reddit, a miracle of aggressive
caching). Don't underestimate Facebook's _write_ load, which is the bloody
difficult thing to scale.

~~~
sk1pper
Not sure if I buy that Reddit is read-heavy compared to Facebook, got any data
on that?

~~~
crispytx
Reddit doesn't let people upload videos and pictures to their site. You have
to post pictures and videos on other people's sites and then link to them. Sort
of the same thing with Netflix. Netflix doesn't have a bunch of people
uploading stuff to their servers all the time.

~~~
adamors
Reddit has allowed (and encouraged) people to upload images to their site for
a year now:
[https://www.reddit.com/r/announcements/comments/4p5dm9/image...](https://www.reddit.com/r/announcements/comments/4p5dm9/image_hosting_on_reddit/)

------
philip1209
Some big players with their own hardware host overflow capacity in cloud data
centers, e.g. so that during a DDoS attack they can dip into additional
capacity. I don't know if Facebook does this, but I expect that many of the
big players already peer with cloud data centers.

------
jaypaulynice
"We estimate that Facebook has 830,000 servers in 2017."

No way this can be true. Facebook scaled primarily using CDNs like Akamai.
Akamai is the largest CDN in the world and has no more than 300k-350k servers
all over the world. It's one of the largest distributed systems in the world. I
worked there, so I know this for sure.

Facebook has recently been moving their data in house, but they do not have
830,000 servers. That's almost 3 times Akamai. At this point, Facebook would
be more profitable letting people use their server infrastructure instead of
using it for themselves.

Running Facebook over AWS is a CDN problem that was solved by Akamai!

~~~
origami777
<insert "when you're a hammer..." joke here>

Not to be a jerk, but, running Facebook over AWS is not a CDN problem. That's
crazy talk. There are far more Facebook requirements that AWS can meet that
Akamai cannot. You have to look beyond their content delivery reqs.

Also, they could absolutely have 800k servers. I don't think comparing their
size to Akamai is fair. The requirements of each company are drastically
different.

Do I think the 830k is accurate? No, but only because these estimates are
usually wrong (unless informed by an inside source).

~~~
jaypaulynice
Akamai handles 15-30% of all web traffic on any given day with 200k-300k
servers...does Facebook have that kind of load in a day?

The problem is mostly a CDN one, crunching data and writing to the servers are
second to actually serving users fast.

For example, I don't necessarily need to see the latest posts, comments, etc.
so you can queue up the writing and sync data later, but when I hit fb.com, I
have to see stuff...

~~~
ADefenestrator
I think working for Akamai you're looking at things in a rather CDN-centric
sort of way, but shuffling bits around is a relatively small percentage of
what Facebook does. Actually doing things with the data beyond serving it up
takes a lot more servers. The "CDN-equivalent" servers in PoPs etc for FB are
a pretty small percentage of the total, and the bulk are things like
web/cache/search/DB servers.

------
ivan_ah
Since we're guesstimating numbers a lot, it would be interesting to also
guesstimate the human resources costs required to run the infrastructure: X
electrical engineers, Y cooling engineers, Z SREs, W infrastructure devops
ppls, etc.

Using the cloud, you don't need X, Y, or Z, and presumably will need fewer of
W. Assuming a 100+k/year salary, that's like 200+k/year total cost of
employment. If X+Y+Z-ΔW ~= 1000, then the cloud provides a 200M/year savings
on that front. The real number is probably 2x-3x higher, but I'm not going to
bother going back and updating the guesstimates.
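
The guesstimate above, written out as a small Python sketch so the inputs are
easy to change. The headcount and cost per head are the rough numbers from this
comment, not real data, and the function name is made up for illustration.

```python
# Guesstimate inputs from the comment above; none of these are real figures.
COST_PER_HEAD = 200_000   # total yearly cost of employment, in dollars
HEADCOUNT_SAVED = 1_000   # X + Y + Z - ΔW, roughly

def annual_staff_savings(headcount, cost_per_head=COST_PER_HEAD):
    """Yearly savings if `headcount` roles are no longer needed in-house."""
    return headcount * cost_per_head

base = annual_staff_savings(HEADCOUNT_SAVED)
print(f"base guess: ${base / 1e6:,.0f}M per year")

# The comment suspects the real number is 2x-3x higher.
for multiplier in (2, 3):
    print(f"{multiplier}x guess:   ${multiplier * base / 1e6:,.0f}M per year")
```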

~~~
p0rkbelly
You are also thinking about this in a perfect vacuum. Global scale presents an
entirely different --non-technical-- challenge.

Take for example, Dropbox stated they are moving off AWS -- except in Europe
(and other non-US areas I'm sure). Having boots on the ground abroad,
operating 24/7, dedicated to the mission of the company where they have not
even visited ... near impossible.

[https://blogs.dropbox.com/business/2016/09/making-european-i...](https://blogs.dropbox.com/business/2016/09/making-european-infrastructure-available-to-our-customers/)

Why? Hiring people around the globe who are going to remove that HDD 24/7 is
pretty hard. Hiring lawyers to understand foreign law is hard. Keeping your
company focused abroad and building a culture abroad is very hard.

How long does it take for a company to expand to India on their own? In AWS,
you just change your CloudFormation template to another region.

------
deepGem
Minor nitpick,

1,000,000 users / 180,000 servers = 5,556 users per server

Should be

1,000,000,000 users / 180,000 servers = 5,556 users per server

------
vacri
> _We need to take into account that Facebook not only has double the users
> but also more data created per person - photos, videos, live streams etc.
> Plus it now hosts Instagram. So let’s double the number._

... but apparently no efficiency gains in 5 years to reduce the number again?
Faster processors, more cores, more RAM, better storage, more
senior-developer-hours, bespoke datacentres... ?

------
crispytx
I think you could easily build the NEXT facebook on AWS or any of the big
cloud computing platforms. But obviously moving facebook to AWS at its current
size wouldn't be practical, and no one would probably ever do something like
that.

------
johnsmith21006
From what I have heard about Facebook's infrastructure, it would do better on
Google Cloud than on Amazon.

------
luord
A mostly good thought experiment, IMO, if pointless since it's never going to
happen.

------
oneplane
The answer is no. This is because you can't compare commodity hosting like AWS
with specialized hosting like Facebook's data centers and infrastructure
software. This question of possibility has nothing to do with 'number of
servers' or 'amount of money'. It's architecturally incompatible.

~~~
accountyaccount
it's still possible, you just have to change the architecture

~~~
oneplane
With that concept, everything is always possible. "Can you do X with Y and Z?"
-> "Yes but change Y and Z and then X is possible".

------
curiousgal
Tech's own Fermi problems.

------
pier25
> Its application code is still developed using PHP

This is surprising since 1) Facebook is most likely using Hack and 2) it could
save a lot of money by moving to Go.

> Facebook’s entire site runs on HHVM (desktop, API and mobile)

Isn't Facebook running mostly on React? I'm guessing HHVM only really powers
the API, business logic, etc.

~~~
cbhl
I imagine the lifetime savings of having a backend in Go is dwarfed by the
one-time cost of such a migration.

That's why they invented Hack and HHVM in the first place -- it was a cheap-
enough compromise that didn't require rewriting the whole code base.

~~~
closeparen
IIRC, Facebook uses (their derivative of) PHP to collect data from a number of
services in different languages over Thrift, which they developed.

PHP is the presentation layer. Much of the actual logic is in C++, Java, and
maybe even Go by now.

~~~
rbranson
(Former Facebook employee)

The VAST majority of online serving code at Facebook is written in Hack (PHP)
and is part of a gigantic codebase called fbwww. Backend services that do
heavy lifting (mostly search, feed ranking, ad matching, TAO, various caches,
proxies, etc) are fairly light on business logic and written in C++.

It isn't a microservices architecture at all: they only move stuff to C++ for
latency and efficiency reasons. For instance most of the ad stack is written
in Hack, but certain key pieces use C++ for better performance. I'm not aware
of any Java or Go code that is part of the online serving stack.

~~~
geggam
Do you know of any place actually using microservices at scale (400 million+
users)?

~~~
closeparen
Your definition of "at scale" doesn't seem to leave room for Amazon, Netflix,
or Uber (all in tens to low hundreds of million monthly active users according
to a cursory Google search) but... those.

~~~
geggam
Guess my perspective of scale is different.

It's just an observation from having worked on systems at scale, and I don't
see microservices / containers being used.

I am curious if anyone else has.

The more I am around containers and the network complexity/issues they create,
the less I am convinced they are the optimal way, and I am looking for
opinions that differ from mine so I can learn.

~~~
closeparen
In my experience, these issues are minor when your large company has an
infrastructure team dedicated to the service substrate (CI, self-service
incremental deploys, service discovery and routing, standard RPC layer,
canonical repository of IDL files, standard messaging layer, persistent
storage as a service, time series collection, dashboarding, alerting,
distributed tracing, safe internal and external nginx configs, etc).

Microservices are probably a bad idea (for now) if you aren't large enough for
such a team.

But you've constructed your question to exclude the companies that work this
way, which I think any reasonable person would call "at scale."

