What Powers Instagram: Hundreds of Instances, Dozens of Technologies (instagram-engineering.tumblr.com)
335 points by zio99 on July 10, 2012 | 77 comments

I love that companies like Instagram are happy to share this kind of information. I don't have the problem of scaling yet, but it's reassuring to know that when I do, there are good models like this one to follow.

The Instagram founders are exceptionally cool. After the $1B Facebook acquisition Mike Krieger kept right on answering questions on the instagram-api mailing list without missing a beat, as if nothing had happened.

Mike has been an example of good citizenship in the Redis community as well. Always helpful, kind, and providing important feedback.

I might be wrong, but as far as I know they haven't sold yet (it's still got to clear regulators?), so although he technically could have walked away, I don't think it would have been sensible, as he isn't going to be able to cash out yet?

Indeed - it hasn't cleared, and if it doesn't Facebook still owes them $200 million for the failed buy.

excuse my ignorance of US corporate law, but why would FB owe them $200M for a failed buy?

It's essentially a retainer. What if they go through the motions, and then Facebook backs out at the last second? It could have thrown Instagram for a loop, and sometimes an acquisition takes enough resources that it could cost the company money or disrupt their growth or operation. It essentially could prevent frivolous offers from "attacking" competitors by pretending to buy them out, and then not.

I guess you could call it "attempted merger".

I imagine it's not US corporate law per se, but rather that as part of the agreement, FB still has to pay the $200m even if the merger isn't cleared. At roughly 20% of the purchase price, though, it seems a little steep as compensation for Instagram's cost of having to reverse the transaction (not that I'm particularly familiar with the deal terms and associated costs!).

It's fairly common in acquisitions for one or both sides to agree to a "break-up" fee to protect against one side getting cold feet, regulatory risks, financing risks, and more. http://blogs.wsj.com/deals/2012/04/25/facebook-ipo-whats-wit...

It's not a matter of law. It's a "breakup fee" negotiated by Instagram in the event that the acquisition falls through.

The purpose of a breakup fee is to compensate the target for costs associated with the failed acquisition and various other losses or expenses. Mostly, however, a breakup fee compensates the target for business opportunities it did not pursue because the acquisition was being attempted.

In cases where the acquirer is a direct competitor to the target, the breakup fee guarantees that the acquirer can't simply look at the target's books and assets, walk away, and use that knowledge to clone the target's business, without paying for it.

Meant to upvote, fat-finger downvoted instead. My bad. Hopefully someone can compensate.

Solid copy!

Some sources (the NY Post for example) say the deal may take up to 6 months with a 50% chance of going through. There are some concerns as to whether Instagram may prevent sharing with social networks other than Facebook (Flickr, for example). I'm already booted off Instagram without upgrading to the new app, but fear that I'll be forced to share all my Instagram pictures on my Facebook timeline.

To prevent sharing out except to Facebook, Instagram would have to disable its API and stop saving photos to Camera Roll. I don't expect either of those to happen.

Even if the deal might fall through, I could easily imagine Mike taking a day or two to celebrate instead of jumping back in and answering questions, particularly on the instagram-api list, since nurturing a slowly-growing ecosystem is not Instagram's best path to a successful exit. Logically, I think he did it because he's a nice guy.

Also worth mentioning is a post that Donna Kline penned: http://www.donnaklinenow.com/investigation/instagram-scam

Don't forget to copy it locally, in case they (or their acquirers...) ever take it down. The Evernote web-clipper plugin is great for that.

I also did the same with all the folks who posted technical details on their experience dealing with the AWS outage. Great stuff to hold on to for future reference.

This is a fairly old blog post, and it was originally posted to HN by Kevin Systrom, co-founder of Instagram. Here's his post:

http://news.ycombinator.com/item?id=3306027 (223 points, 220 days ago) so I'd like to say this is a repost. Though, I have no idea how the same link can be reposted to HN.

I don't understand how this topic comes up so frequently, but HN is configured to allow reposts after a certain amount of time. That isn't really a problem for old posts with excellent information, so I think a repost is warranted for this.


Guess they figured that one out in less than 221 days.

I don't see Amazon SimpleDB in this list, rather Postgres and Redis.

Can someone who knows more about various DBs opine on this? Is it better to run your own DB as Instagram seems to be doing, or is relying on SimpleDB good enough if you don't need such high performance?

Also, as happens with many startups, how easy or difficult is data migration when startups try to scale and need to scale fast?

AWS has more than one DB option.


SimpleDB is a non-relational store that automatically indexes everything, but can only store up to 10GB. It's very flexible but limited, good for prototyping.


DynamoDB is like SimpleDB's grown-up version. At greater cost and with more up-front configuration, it scales automatically to huge workloads. It's still a non-relational store, with the drawbacks that implies.


Relational Database Service is literally managed MySQL instances. Amazon spins them up for you, manages configuration, backups and restoration. One of RDS's primary value-adds is that it can automatically partition your data across multiple EBS volumes (like hard drives). This helps get around the relatively low I/O performance of EBS volumes.

So pick your poison -- if you don't need high performance, SimpleDB or a small RDS instance will work; it depends whether you want relational data or not. I can't speak for the difficulty of migrating; we stuck with running our own MySQL instances from the start.

For a lot of reasons (that I can enumerate if you'd like), we think running your own DBs is the best option.

I would very much appreciate it if you elaborate on why managing your own MySQL database is the best option. I'm currently moving from back end proprietary systems development to web development and would like to hear your considerations.

The short answer is that we believe running your own instances gives better performance and reliability than RDS. However, the cost is complexity: I'm a relatively experienced DBA, and we've since hired a second person with deep MySQL experience. If MySQL admin is something you don't want to spend much time doing, you may be willing to make the performance sacrifice of RDS.

The longer answer is that we don't use RDS because it relies on EBS, and we do not trust EBS for any critical applications. Instead, we put our data on instance storage (aka "ephemeral" storage).

This has two big disadvantages:

a) portability: you can't detach the drive and move it to a new instance like you can with EBS -- to clone or backup, you have to copy over the network, which is much slower (and obviously, if you kill the instance, you lose the data).

b) storage: you are limited in how big your DB can be. An AWS large instance these days gives you nearly 1TB of instance storage, but if you have a single DB larger than that, you need to use EBS if you're on Amazon. (Of course, if you care about performance and your database is > 1TB, you should probably be looking at sharding across multiple machines anyway)

However, using instance storage has two big advantages that we think outweigh those:

a) performance. EBS is basically a network drive. Total I/O operations per second (iops) is punishingly low. If you have a high transaction rate on your database you're going to really hate it. As I mentioned, RDS tries to mitigate this by using multiple EBS drives, but we consider that a band-aid on a pretty fundamental problem with EBS. Instance storage on the other hand is physically local to the VM's host machine, and is therefore much faster.

b) reliability. After 3 years on AWS, our trust in EBS is zero. It fails too often, and its failure pattern is awful: you tend to lose big batches of EBS drives at the same time, and whenever there has been a major EBS failure, the API used to launch replacement volumes has failed at the same time, making replacement impossible. Again, we think this is a fundamental problem with the nature of EBS and unlikely to change.

Thanks seldo, interesting. Just been looking at a client's Amazon dashboard where they have a small set-up running, not something I normally deal with but I see their RDS is billing over 2e9 I/Os/month and ends up being a significant part of the non-fixed bit of their bill. I suspect their MySQL queries are doing table scans and building temporary tables for some of the queries; these would both up the I/O count as all RDS storage is EBS, even temporaries?
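For a sense of scale, the 2e9 I/Os per month figure can be turned into an average rate with some quick arithmetic. This is only a rough sketch: it assumes a 30-day month, and the per-million-request price passed to `io_charge` is a placeholder assumption, not a quoted AWS rate, so check the actual bill for the real number.

```python
# Back-of-the-envelope conversion of a monthly I/O count into an average rate.
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


def average_iops(ios_per_month):
    """Average I/O operations per second implied by a monthly total."""
    return ios_per_month / SECONDS_PER_MONTH


def io_charge(ios_per_month, price_per_million):
    """Monthly I/O charge for a given price per million requests."""
    return ios_per_month / 1_000_000 * price_per_million


monthly_ios = 2e9  # figure taken from the comment above
print(round(average_iops(monthly_ios)))  # 772 I/Os per second on average
print(io_charge(monthly_ios, 0.10))      # 0.10 is a placeholder price, not a quoted rate
```

An average in the high hundreds of I/Os per second, sustained around the clock, is consistent with the suspicion that table scans or on-disk temporary tables are involved, though the number alone does not prove it.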

So if your MySQL storage is ephemeral how do you cope with outage? Replicate it off AWS?

I believe MySQL's working directory is on EBS, so yes, even temporary tables would be on EBS -- don't quote me on that, though.

Re: outages, we use multiple replicated servers in different availability zones -- an outage is usually (though not always!) limited to a single zone. For a region-wide outage, we have emergency backups being sent to a different AWS region (east -> west), and if shit completely hits the fan we have off-AWS backups.

Just out of curiosity, do you have any thoughts on DynamoDB, or have you played with it? Not as a "would you replace what you're doing with DynamoDB" but more a "here's a niche where we think it would work really well"?

I know it's very new so I haven't seen any advice on it, whereas I don't think I've ever seen a pro-EBS point of view from people with non-trivial experience with it.

Speaking theoretically, Dynamo is some really clever tech built by some very smart people -- it's clear that Amazon are using something very similar internally, so it must work in practice. Beyond that I've no direct experience with it or its performance profile.

If I had a very large, rapidly-growing key-value application and a shortage of experienced ops engineers that made maintaining my own solution impractical (e.g. a cassandra cluster) I would look hard at dynamo.

However, as a matter of principle I am very suspicious of the lock-in that comes with proprietary solutions, no matter how clever. We try not to buy cloud services that only have one vendor.

If you run your own stack, you get lower prices and more flexibility. On the flip side, if you don't know how to administer MySQL, you'll have some learning to do.

Dedicated server pricing is 1/2 or less of what Amazon offers you, and you get better performance to boot. Seems like a no brainer to me (but then again I've been doing "dev ops" stuff since the late 90s and learned many lessons the hard way).

Seldo said they run their own DBs, not that they run their own MySQL instances. Being able to run other engines (e.g. PostgreSQL) is no doubt a benefit of running your own DBs.

Cost might be a reason. Amazon has competition for MySQL hosting, keeping prices relatively low. DynamoDB seems expensive in comparison.

Mike gave a great talk to the San Francisco PostgreSQL User Group in which he discussed their PostgreSQL setup.

Slides: http://media.postgresql.org/sfpug/instagram_sfpug.pdf

Video: http://www.ustream.tv/recorded/21929154

SimpleDB is not an RDBMS. It has very specific use-cases where it makes sense, and most database use does not fit into those use-cases. Perhaps you're thinking of Amazon RDS, which is their managed MySQL/Oracle/SQL Server deployments. RDS is very expensive compared to running your own database, even on the very same EC2/EBS platform -- but you get reliability, replication and backups that some employee would have to be managing otherwise. In theory, at least. RDS didn't live up to its promises when I used it, so I moved back to running databases on physical servers at half the cost.

Please elaborate on how RDS didn't live up to its promises?

When EBS failed in US-East a year or two ago, my RDS instance did not remain available and did not automatically fail over to its replica, despite my paying for these benefits. Even when it was working correctly, the RDS instances had low capacity compared to physical servers, and there were often spikes of significant latency that would lead to overloaded web servers as DB queries backed up unanswered.

Good to know. I read here on HN that during the storm event in June, the same thing happened (no auto fail over for those who paid for it).

I'm pretty sure that "dozens of technologies" is a meaningless thing to say. Instagram is powered by thousands of technologies; it's just that a few dozen of those happen to have trendy buzzwords attached.

True, there's a good blog post regarding that scrappy attitude most hackers have and why we need to simplify our architecture: http://www.zemanta.com/fruitblog/i-bet-you-over-engineered-y...

Instagram! Powered by the polio vaccine and domestication of wheat.

Instagram! Powered by a lack of earth destroying quantum vacuum metastability!

I'm pretty sure that one's not a technology, but just a happy and not very unlikely set of circumstances.

To sum up: it's gonna fall over at any minute without a continuous injection of money.

My take-away was "this is why when EC2 went down it wasn't just a case of rebooting the server". My other thought was "and how did all of this help you when EC2 went down?"

That was my thought too, the multiple mentions of "stuff" in different zones had me wondering why the recent EC2 outage was able to get them.

I don't feel very good about the idea of DR just being duplicate servers in a different zone. You don't know what kind of problem could ripple out and affect more zones, or all of them. A completely different host/cloud/colo provider seems like a safer bet.

When you come down to it, Earth is a single point of failure; it's about how big a disaster you expect to occur and how big a one you expect to recover from.

Yes but it's a lot easier and cheaper to protect against e.g. a major bug surfacing in amazon's cloud management api vs protecting against an asteroid hitting earth.

All the zones are geographically apart, so it's equivalent to putting servers in different clouds/colos, and the probability that all of them get "blown" away simultaneously is very low.

And yet all it took was a single AWS availability zone going away for a short while for them to have a major outage.

In Netflix's case, per their last blog post, the major issues were due to bugs in their environment that kept it from properly failing away from dead ELBs. There were also API backups caused by everyone rushing to launch new instances in a new AZ, but existing services in other AZs continued to work fine.

My reading of the Netflix announcement was that it wasn't just a bug, but that they made the conscious decision to include manual intervention in the process (of releasing dead instances) but grossly underestimated the time required to do this across an entire zone.

DR is all about cost, though. It's likely to cost double (or more) to have two (essentially) identical systems in two different cloud providers -- whereas duplicating the same system in two different zones of one provider is nearly free.

Could someone plug these into the AWS monthly calculator and try to figure out the monthly cost? When I try I get around $10,000 per month, but I think I am wildly underestimating the bandwidth usage. AWS Calculator: http://calculator.s3.amazonaws.com/calc5.html

The idea of mdadm on top of EBS volumes makes me cringe.

Mark Mayo's thoughts on abstracted block storage are spot on: http://joyeur.com/2011/04/24/magical-block-store-when-abstra...

A couple of comments as one of the authors of Solr's spatial code:

- I don't think any of us have compared it to PostgreSQL, but I can tell you we have clients doing 500 queries per second+ with it including text search. I've yet to see a DB do good text-based fuzzy matching and combining two systems (DB + search) via a join is usually slow. YMMV.

- The main goal of the implementation is to add point-based search to text search. It is not a general purpose replacement for an r-tree, etc.

- You are not required to use haversine. The distance function is pluggable and we have other options implemented. Also, in many cases, you have other clauses in your query that restrict down the set of documents that need to be scored by distance.

"The photos themselves go straight to Amazon S3"... Is there no image resizing involved? It's pretty hard to find material about scaling image resizing. It's hugely CPU intensive, that's for sure.

I'd imagine it's easy enough to do before transmission.

Actually they have to do the compression on the mobile clients before uploading, otherwise it will take forever and cost a real fortune in mobile data usage.

Well explained, and an awesome way to say "We are hiring a DevOps"

Do they have any application monitoring stuff a la NewRelic? I saw stuff like munin but that isn't really the same.

> For our geo-search API, we used PostgreSQL for many months, but once our Media entries were sharded, moved over to using Apache Solr. It has a simple JSON interface, so as far as our application is concerned, it’s just another API to consume.

Does anyone have particular insight to share on this? Last I checked, Solr's geospatial searching methods are rather inefficient -- haversine across all documents, bounding boxes that rely on haversine and Solr4 was looking into geohashes (better but have some serious edge-case problems where they fall apart).

Meanwhile PostgreSQL offers r-tree indexing for spatial queries and is blazing fast.

Am I missing some hidden power about Solr's geospatial lookups that make it faster/better than an r-tree implementation?

It probably was the database sharding. If the Solr setup could handle the geo-search-related data without the need for sharding it probably can beat out Postgres with sharding.

Having this exposed through an api that is standardized and maintained by someone else is also nothing to sneeze at. I'd trade a bit of performance for that kind of standardization and turnkey use in the right scenario.

The reason we use Solr for this specific task is because PostgreSQL cannot efficiently and quickly merge two index queries (time & space). It can do this to a limited degree, but both of these dimensions potentially match 10s of millions of documents, and PG falls over at this.
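To illustrate what "merging two index queries" involves: each index independently yields the ids matching its own condition, and the final result is the intersection of the two lists. The sketch below is a toy illustration with hypothetical ids, not how PostgreSQL or Solr actually implement it.

```python
def intersect_sorted(a, b):
    """Intersect two ascending id lists with a linear two-pointer merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


time_matches = [2, 3, 5, 8, 13, 21]   # hypothetical ids matching the time range
space_matches = [1, 3, 4, 8, 9, 21]   # hypothetical ids matching the area
print(intersect_sorted(time_matches, space_matches))  # [3, 8, 21]
```

The work done by this merge is proportional to the combined length of both lists, so when each side matches tens of millions of documents the merge is expensive even if the final intersection is small.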

So you make the r-tree 3 dimensions (lat,lng,time). PostgreSQL supports this.

I dunno I can't envision Solr being more efficient than a properly designed RDBMS for these situations. If you were integrating a full-text search I'd absolutely believe that to be the case but...

We need independent time & geo searches as well. The indexes are vastly smaller in Solr. We use PostgreSQL extensively and prefer it, so it's not a matter of simply wanting to use something different.

That's very interesting. Could you share your story with the mailing list pgsql-hackers a little bit? The guys who work on indexing are quite active on those lists.

Also, there's some new thing I don't understand super well, sp-gist, do you have any thoughts on that?

I'm no Solr expert, but bug SOLR2155 has a patch [1] that does a geospatial search using geohash prefixes [2].

As far as I can tell, you take the point's latitude and longitude and interleave the binary bits - so if your record's latitude is 11111111 and your longitude is 10000000 your geohash is 1110101010101010. You index on that, then when you do a spatial search for the point nearest to 11111110,10000011 you look up key 1110101010101101 and a prefix search finds the closest value in the index is the record you inserted earlier. Presumably then you realize there could be an even closer record at 11111111,01111111 which would have got stuck at 1011111111111111 in the index so you look there too just in case, take the closer of the two search results, and bob's your mother's brother.

[1] https://issues.apache.org/jira/browse/SOLR-2155?focusedComme... [2] http://en.wikipedia.org/wiki/Geohash
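The bit interleaving in that walkthrough can be reproduced in a few lines. This sketch follows the comment's latitude-first ordering on 8-bit strings; the standard Geohash encoding differs in details such as bit order and base-32 output, and the function names here are made up for illustration.

```python
def interleave(lat_bits, lon_bits):
    """Interleave two equal-length bit strings, latitude bit first."""
    if len(lat_bits) != len(lon_bits):
        raise ValueError("bit strings must have the same length")
    return "".join(a + b for a, b in zip(lat_bits, lon_bits))


def common_prefix_len(x, y):
    """Number of leading characters the two strings share."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


stored = interleave("11111111", "10000000")
query = interleave("11111110", "10000011")
neighbor = interleave("11111111", "01111111")

print(stored)    # 1110101010101010
print(query)     # 1110101010101101
print(neighbor)  # 1011111111111111
print(common_prefix_len(stored, query))    # 13
print(common_prefix_len(neighbor, query))  # 1
```

The last two lines show the edge-case problem: the neighbor is only one longitude step away from the stored point's cell boundary, yet it shares just one leading bit with the query key, so a pure prefix search would never find it.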

Hrm, so for a proximity search it basically has to take a combination of all potential encompassing geohashes and then do a second-pass (substantially reduced data set) using a haversine approach or something.

I suppose that might work pretty well.
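A minimal sketch of that two-pass idea, with a plain bounding-box check standing in for the geohash cell lookup in the first pass and haversine as the exact second pass. It ignores the poles and the antimeridian, and the point data and function names are assumptions for illustration, not anything from Solr.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(points, lat, lon, radius_km):
    """Two-pass search: cheap bounding-box prefilter, then exact distance check.

    points is a list of (name, lat, lon) tuples.
    """
    ang = radius_km / EARTH_RADIUS_KM  # search radius as an angle in radians
    dlat = math.degrees(ang)
    ratio = math.sin(ang) / max(math.cos(math.radians(lat)), 1e-12)
    dlon = math.degrees(math.asin(min(1.0, ratio)))

    # First pass: coarse filter, cheap comparisons only.
    candidates = [p for p in points
                  if abs(p[1] - lat) <= dlat and abs(p[2] - lon) <= dlon]
    # Second pass: exact distance on the much smaller candidate set.
    return [p[0] for p in candidates
            if haversine_km(lat, lon, p[1], p[2]) <= radius_km]


points = [
    ("sf", 37.7749, -122.4194),
    ("oakland", 37.8044, -122.2712),
    ("la", 34.0522, -118.2437),
]
print(nearby(points, 37.7749, -122.4194, 20))  # ['sf', 'oakland']
```

The trigonometry only runs on the candidates that survive the first pass, which is the "substantially reduced data set" mentioned above.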

Color me surprised: No NoSQL? Instagram seems to be a prime candidate for such a storage method.

You don't consider Amazon/S3 or Redis NoSQL databases?

It's weird that things as different as S3 and Redis all fall under the umbrella of "NoSQL". We should probably replace the SQL-NoSQL dichotomy with terms that more accurately reflect the real differences.

Since when is S3 a 'database'?

It's got an API, you can store data in it and retrieve it later by key, since when was S3 not a database? It's a key/value store that lets you store very large values.

Not really a "distinctive" one (a la mongo, couch, etc.), but I see your point.

They use Redis.

Redis is a key-value store. S3 is a distributed file system.

Can we stop labeling the set of "not a rdbms" data storage mechanisms with the stupid fucking "NoSQL" moniker.

To be fair I've hardly seen an article on the benefits of NoSQL on the frontpage for a while.

First it was No SQL. Then it was Not Only SQL. Then it was, we have to get some real work done, Need Our SQL.
