
What Powers Instagram: Hundreds of Instances, Dozens of Technologies - systrom
http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances-dozens-of
======
jphackworth
I'm a little surprised so many machines are used to run Instagram. TechCrunch
mentioned their peak has been 50 photo uploads per second (which they say go
directly to S3, so Instagram's servers only need to pass a token). Of course
there are other forms of requests, but just back of the envelope it seems like
it should not require anywhere near "hundreds" of machines.

Not to be too harsh - it's just three engineers, so it makes sense if the
setup is still evolving.

~~~
rdouble
I was surprised they had so few... I once worked on a site with 1/6th the
users and 3.5 times the number of instances.

They could do better but they'd have to manage their own datacenter and write
portions of the app in C++. It's probably not worth it at this point unless
they hire someone with that specific expertise.

~~~
jphackworth
I once worked on a site with 1/6th the users and only one machine. ;-)
Counting users often doesn't match across sites, especially when an Instagram
user is someone who has downloaded the app, and might never come back. That's
why the 50 photo uploads per second peak is a useful benchmark.

~~~
mikeyk
Hey, author here. To clarify, our uploads go to our servers first (where we
resize for thumbnails, etc) then go to S3.

~~~
WALoeIII
Do you use GraphicsMagick or ImageMagick? Shell out or Python bindings?

What settings do you use for: MAGICK_MEMORY_LIMIT MAGICK_MAP_LIMIT
MAGICK_DISK_LIMIT

Do you tune them dynamically or just set and forget?
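
For readers unfamiliar with the "shell out, set and forget" option, here is a
minimal sketch in Python. It assumes ImageMagick's `convert` is on the PATH. The
limit values and the helper names (`build_resize_command`, `resize`) are
hypothetical; the thread never says what Instagram actually uses.

```python
import os
import subprocess

# Hypothetical "set and forget" limits. These values are illustrative only;
# nothing in the thread states what Instagram configures.
MAGICK_LIMITS = {
    "MAGICK_MEMORY_LIMIT": "256MiB",
    "MAGICK_MAP_LIMIT": "512MiB",
    "MAGICK_DISK_LIMIT": "1GiB",
}


def build_resize_command(src, dst, size="150x150"):
    """Build the argument list for an ImageMagick resize."""
    return ["convert", src, "-resize", size, dst]


def resize(src, dst, size="150x150", limits=MAGICK_LIMITS):
    """Shell out to `convert` with the resource limits set in its environment."""
    env = {**os.environ, **limits}
    subprocess.run(build_resize_command(src, dst, size), env=env, check=True)
```

Tuning dynamically would amount to computing `limits` per call (for example
from current free memory) instead of using a fixed dictionary.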

~~~
scottostler
I'm curious about this as well.

------
armandososa
I like posts like this a lot. I'm just a web designer, but I find scaling web
sites fascinating, like some kind of dark art or secret craft.

Where do you learn this stuff? Do you need a CS degree from Stanford or
something? I like the black magic aura, it's romantic, but I'd really like to
understand how to scale websites doing stuff like the OP describes.

~~~
ippisl
<http://highscalability.com/>

------
d_r
I know that this is probably a recruiting-inspired post, but detailed posts
like this genuinely benefit the community. Thanks for specifically mentioning
the reasons for choosing particular technologies (e.g. why you switched to
Gunicorn from mod_wsgi) -- this makes the already excellent post even more
helpful for someone trying to build things.

------
latchkey
I guess my question is, how do they make money? I really like Instagram
images. I've used the site myself, but it certainly isn't something I'd feel
the need to pay money for.

~~~
gallerytungsten
Funding Total: $7.5M (per TechCrunch)

Server bill: $35k/month, $420k/year, per estimates in other comments.

Personnel, overhead, other expenses: $1.5M/year (guess).

Runway: 3.9 years to figure it out.
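
The arithmetic behind that runway figure can be sketched as follows. All inputs
are the estimates quoted above, and the function name `runway_years` is made up
for illustration. It assumes a constant burn rate and no revenue.

```python
# Figures are the commenter's estimates, not audited numbers.
FUNDING_TOTAL = 7_500_000         # USD, per TechCrunch
SERVER_BILL_PER_MONTH = 35_000    # USD, estimate from other comments
OTHER_COSTS_PER_YEAR = 1_500_000  # USD, the commenter's guess


def runway_years(funding, monthly_servers, yearly_other):
    """Years until funding runs out at a constant burn rate, ignoring revenue."""
    yearly_burn = monthly_servers * 12 + yearly_other
    return funding / yearly_burn


years = runway_years(FUNDING_TOTAL, SERVER_BILL_PER_MONTH, OTHER_COSTS_PER_YEAR)
print(f"Runway: {years:.1f} years")
```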

~~~
Ecio78
Dumb question: is there a way to pay Amazon for AWS fees other than by credit
card? I was wondering how you can build big infrastructures on Amazon if you
can't pay by wire transfer or some other kind of link to a bank account (like
phone and gas bills).

~~~
camwest
American Express has no limit as long as you pay it back ASAP.

~~~
hboon
Generally yes, but a country's central bank may prohibit that. Singapore's,
for example, does.

------
latchkey
Those Quadruple Extra Large instances are $2/hr. The 24 of them used for
Postgres would be about $35k/month just for that part alone. I'm guessing they
are spending >$100k/month on just hosting 100+ instances. Not to mention disk,
bandwidth, DNS, S3, public IPs, etc.
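
As a rough check of the Postgres figure, here is the on-demand arithmetic as a
small sketch. It assumes a 30-day month and the flat $2/hr rate quoted above;
the function name is hypothetical.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a 30-day billing month


def monthly_on_demand_cost(instances, hourly_rate_usd, days=DAYS_PER_MONTH):
    """On-demand cost in USD for instances running around the clock."""
    return instances * hourly_rate_usd * HOURS_PER_DAY * days


postgres_cost = monthly_on_demand_cost(instances=24, hourly_rate_usd=2.00)
print(f"${postgres_cost:,.0f} per month")
```

That works out to $34,560, which rounds to the $35k quoted.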

~~~
foobarbazetc
Every time I see numbers like this, I wonder why everyone seems to think you
_have to_ use AWS or else you've failed at scaling.

They could run their operation for 10-20% of their AWS costs at a dedicated
server host. And everything would be much, much faster.

~~~
ww520
Using AWS is not just for its instances. S3 is a big factor. It's hard to
replicate the S3 functionality in your own hosting without much more effort
and cost. Granted that the AWS instances can be used more efficiently.

~~~
Ecio78
Can't you just upload to S3 from your own dedicated machines? Or does it add
too much delay to operations? The author posted that images are first loaded
onto their system, resized and so on, and then loaded onto S3, so at least for
image upload it shouldn't be such a great problem.

Disclaimer: I have no smartphone and have never used their app :)

~~~
ConstantineXVI
Besides latency, you don't pay for internal data transfer within AWS services.
If you did the image processing on your own machines, you'd be paying for
bandwidth on every operation, whereas if you do it in EC2, your only outbound
transfer is viewing the images.

------
apu
Is there a collection of this kind of blog post somewhere? i.e., for
comparing the stacks of different sites?

~~~
cadr
The site highscalability.com has some good descriptions (look under the 'REAL
LIFE ARCHITECTURES' topic).

------
geuis
One thing about Instagram's load balancing that I don't like is that they
rate-limit their proxies on image requests. In my recent testing, it's roughly
5-6 requests every 3 seconds or so. Any requests more frequent than that
return 503 status codes. I don't entirely understand why they do this, since
their load balancer simply does 302 redirects to the S3-hosted image resource.

I can guess at some of the reasons, such as they didn't foresee a user loading
more than a few images at once. Perhaps they perceive rate limiting as a
protective measure.

However, I've done testing on Twitpic, imgur, and yfrog and haven't run into
the same issues. Twitpic, for example, generates a _lot_ more traffic than
Instagram and they don't have the same rate-limiting.
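
To illustrate the behavior described (roughly 6 requests per 3 seconds, then
503s), here is a minimal sliding-window limiter sketch. This is not Instagram's
implementation, which the thread does not describe. The class name and
parameters are assumptions chosen to match the observed numbers.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` (illustrative only)."""

    def __init__(self, max_requests=6, window_seconds=3.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently allowed requests

    def status_for(self, now):
        """Return 302 (redirect to the image) if allowed, else 503."""
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return 503
        self._timestamps.append(now)
        return 302


limiter = SlidingWindowLimiter()
# Eight requests 100 ms apart: the first six pass, the rest are rejected.
print([limiter.status_for(i * 0.1) for i in range(8)])
```

In practice a limiter like this would be keyed per client IP; the sketch tracks
a single client to keep it short.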

~~~
ceejayoz
> I don't entirely understand why they do this, since their load balancer
> simply does 302 redirects to the S3-hosted image resource.

S3 accesses cost money, so it makes sense that they'd rate limit access to
them. A botnet hitting an S3 URL could incur large fees for the owner of the
file very rapidly.

------
mkjones
Glad to see other people using vmtouch. It's also great for keeping large
codebases in the filesystem cache on [shared] dev machines.

------
cagenut
With that big a monthly AWS bill, I could pretty easily justify my salary and
the costs of building out a 4 - 10 rack colo setup, with room left over for a
DBA consultant on retainer and a pro-serv budget for ad-hoc stuff.

------
sant0sk1
That's a lot of instances! It'd be interesting to run the numbers and get an
idea of what their monthly AWS bill looks like.

~~~
clarkni5
By my math, the bill for their app and database servers would be approaching
$30,000 per month. That doesn't include storage costs, bandwidth, or any of
the other aspects of their infrastructure.

That's crazy, if you ask me.

~~~
simonw
Is that calculation taking reserved instances into account?

~~~
rkalla
No, I don't think so. Latchkey did the same calculation, using on-demand
prices and came up with $35k[1]

3-year reserved is roughly 48% cheaper than on-demand, so real hosting cost would
be around $18,200 for those servers.

[1] <http://news.ycombinator.com/item?id=3306394>
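
The reserved-instance adjustment is a single multiplication, sketched here. It
applies the 48% figure quoted above as a flat discount and ignores how any
upfront reservation fee would be amortized. The function name is hypothetical.

```python
def reserved_monthly_cost(on_demand_monthly_usd, discount):
    """Apply a flat fractional discount to an on-demand monthly bill."""
    return on_demand_monthly_usd * (1 - discount)


cost = reserved_monthly_cost(on_demand_monthly_usd=35_000, discount=0.48)
print(f"${cost:,.0f} per month")
```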

------
mcginleyr1
For their load balancers, why aren't they assigning an Elastic IP? Then they
wouldn't have to wait for DNS, just reassign the IP...

------
vidar
What was your take on Gunicorn over uWSGI?

