How Google makes Google(+) fast (plus.google.com)
157 points by jvandenbroeck on Nov 15, 2011 | hide | past | web | favorite | 32 comments

"On a side note, you may have noticed that we load our CSS via a XHR instead of a style tag - that is not for optimization reasons, that’s because we hit Internet Explorer’s max CSS selector limit per stylesheet!"

Can anyone go into more detail about what they're talking about here? I didn't realize IE had a "CSS selector limit." Seems like a funny thing to mention when you're talking about how optimized your site is.

Yes, the magic number is 4095 selectors. And it's still true, even in IE9.
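For context, hitting that limit means a build step has to count selectors and split the stylesheet before it crosses the cap. Here's a rough sketch of such a counter — assumed illustrative code, not Google's, and it doesn't handle every CSS corner case (e.g. rules nested in @media blocks):

```javascript
// Rough sketch: count the selectors in a stylesheet so a build step can
// split it before hitting IE's 4095-selectors-per-stylesheet cap.
const IE_SELECTOR_LIMIT = 4095;

function countSelectors(cssText) {
  // Strip comments first so commas inside them aren't counted.
  const noComments = cssText.replace(/\/\*[\s\S]*?\*\//g, '');
  // Grab the text before each '{' (the selector list of each rule).
  const ruleHeads = noComments.match(/[^{}]+(?=\{)/g) || [];
  let count = 0;
  for (const head of ruleHeads) {
    // Each comma-separated entry counts as one selector.
    count += head.split(',').filter(s => s.trim().length > 0).length;
  }
  return count;
}

function needsSplit(cssText) {
  return countSelectors(cssText) > IE_SELECTOR_LIMIT;
}
```

So `a { } b, c { }` counts as 3 selectors, and a stylesheet only needs splitting once the total crosses 4095.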

Do they really need that many selectors? It can't be good for performance.

The little red box that shows how many notifications you have is anything but fast...

Edit: It reminds me of the eBay Motors data we got a few weeks ago. A lot of it is making the page appear to be loading (i.e. flush the buffer for 'above the fold' content and then AJAX everything else in).

That little box was one of the delights of previous social websites I've toyed with; G+ flipped it into an anguish box... amazing.

Are people outside of Google using closure to develop apps?

I used the closure compiler with Jammit but I've never tried the library.

Many people are using it, they're just not the types writing "10 Great Form Validation Plugins for jQuery" posts, so it might not seem like it. The same can be said of Dojo, but in my opinion Closure is Dojo done right.

Cloudkick does, I got to work with it this summer.

ClojureScript uses Closure as part of compiling Clojure to JS. I'm not sure if any significant projects are using it, though.


Google Closure Tools is not the same thing as Clojure.

A little confusing, I'm sure, since Closure Templates is a templating system that dynamically generates HTML in Java and JavaScript, while Clojure is a language that targets the JVM.

I have no experience with Closure Tools, but Clojure is definitely awesome.

Either the author of the comment I replied to ninja-edited his comment or I need to work on my reading comprehension. I think it was the first one, though...

Does anyone know why they've chosen to use the iframe approach to loading JS instead of attaching script tags dynamically, a la RequireJS?
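For reference, the script-tag alternative the question mentions looks roughly like this — an illustrative sketch of the general pattern, not RequireJS's or Google's actual code:

```javascript
// Dynamic script-tag loading: append a <script> element per module and
// track what's already been injected so nothing is fetched twice.
const loaded = new Set();

// Pure helper: filter out URLs we've already injected.
function scriptsToLoad(moduleUrls) {
  return moduleUrls.filter(url => !loaded.has(url));
}

// Browser-only part: inject a script tag and fire a callback when it loads.
function loadScript(url, onLoad) {
  if (loaded.has(url)) { if (onLoad) onLoad(); return; }
  loaded.add(url);
  const s = document.createElement('script');
  s.src = url;
  s.async = true;
  s.onload = onLoad;
  document.head.appendChild(s);
}
```

Compared to this, the iframe approach loads and evaluates code in a separate document, but the thread doesn't settle why Google picked it.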

Google+ is certainly fast, I just wish it were quicker about showing me valid/updated state.

I've never used any website where the data I was seeing was so obviously out of date across the board as it is on Google+. Yes, eventually it coalesces towards correctness, but I don't think this model works well for things like notifications and such.

"Oh look I have a new notification... oh wait, no I don't, that's from like an hour ago and I already looked at it."

Speed is a feature, but so is not making your users think your software is just plain buggy. Too much local state caching is basically indistinguishable from a bug for many users, and in many contexts it can be annoying even for those who know roughly why it's happening.

That's because most distributed data stores are eventually consistent. This works fabulously in a world where data needs to exist in a handful of places (say, a user's email inbox, which should be replicated so it can be accessed quickly from different geographical areas and for the sake of disaster recovery) but those copies need not all agree with each other at the same time (if I send you an email, it's OK if you don't get it for 5 or 10 minutes; in fact, that's generally expected).

The problem with social networking is that such silos don't exist anymore. You can't even group users into larger silos with their cliques, because most social circles are overlapping (for some reason people feel compelled to keep making friends throughout their lives). So instead of the trivial email case of caching, you're fighting this battle where you have to constantly trade off consistency (do we synchronously write to every cache in every cluster in every region?) against performance (obviously not; round-trip time to Sweden is like 200 ms).

And you're thinking, "Well, a consistent user experience is the most important thing, so just block the user's notification until the data is available to them." Well, how do you know the data is available to them? They might see the notification on their cell phone, which is routed to the east coast for DNS reasons, while the write hasn't made it over to the west coast yet for whatever reason, and they're accessing it on their PC. If you wait for it to appear everywhere, two users sitting beside each other in a dorm room think that Facebook sucks because one guy posted on the other's wall ten minutes ago and it's not there yet. To the end user, this isn't a logical consequence of a backhoe cutting a fiber line somewhere in Oregon.

To summarize, it's a super hard problem, which makes it incredibly interesting to work on :) https://www.facebook.com/careers
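The trade-off Alex describes can be seen in a toy model — this is purely illustrative JavaScript, nothing to do with Facebook's real stack:

```javascript
// Toy model of eventual consistency: a write lands in one region's store
// immediately and only reaches the other region when replication runs.
class Region {
  constructor(name) {
    this.name = name;
    this.store = new Map();
  }
  read(key) { return this.store.get(key); }
  write(key, value) { this.store.set(key, value); }
}

const east = new Region('east');
const west = new Region('west');
const replicationQueue = [];

// Fast local write; propagation to other regions is deferred.
function writeThrough(region, key, value) {
  region.write(key, value);
  replicationQueue.push({ key, value });
}

// Drain the queue into a lagging region ("eventually").
function replicate(target) {
  while (replicationQueue.length) {
    const { key, value } = replicationQueue.shift();
    target.write(key, value);
  }
}
```

After `writeThrough(east, 'wallPost', 'hi')`, a read against `east` sees the post immediately, while `west` keeps returning nothing until `replicate(west)` runs — exactly the stale-notification experience described above, compressed into a few lines.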

Wow, thanks for the writeup Alex! I never thought about the DNS routing issues on mobile that would send the same packets on two completely different paths (all to end up in the same dorm room). All of this on top of the ticker / chat / live connections between 750 million people is pretty insane. I don't think Facebook gets enough technical credit for overcoming all of those challenges.

Makes me excited for my interview on Friday!

Guess FB updates the user's UI right away and lets the change propagate "eventually"?

Seems like a good idea, and in my experience most "normal people" think FB is instant.

Facebook is pretty amazing in that respect. They managed to engineer a scalable infrastructure without making it too obvious that everything is being cached. There are times when it takes a minute or two for my profile picture to change everywhere after I change it in my profile, but aside from minute-or-two scenarios, Facebook's interface (and the real-time notification system) is just amazing in comparison.

I've heard (from a coworker that knows some people that work at FB; don't take this as gospel) they use some clever trickery to get this to work. Basically, there's a localized cache of information just for you, storing information entered by you that you might see. Since this is a limited set and is easily sharded by user, you can put everything in memcached, have the frontends hit that record and merge it with the rest of the data, and pretty much guarantee immediate consistency - as long as it's just you looking at the profile.

When other people look at your profile, they have to go through the normal eventual-consistency mechanisms, deal with the normal replication lag and message-passing delays, etc. But they have no idea that the information is out of date, because they're not the ones who inputted it. As long as you don't go ask someone through backchannels "did you see my FB update?", they'll never be the wiser.
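A rough sketch of that rumored read-your-own-writes trick — it's second-hand info, so all names and structure here are guesswork for illustration only:

```javascript
// Per-user cache of a user's own recent writes (stand-in for memcached).
const ownWrites = new Map(); // userId -> Map(field -> value)

function recordOwnWrite(userId, field, value) {
  if (!ownWrites.has(userId)) ownWrites.set(userId, new Map());
  ownWrites.get(userId).set(field, value);
}

function readProfile(viewerId, profileOwnerId, replicatedProfile) {
  // Everyone starts from the (possibly stale) replicated data...
  const view = { ...replicatedProfile };
  // ...but the owner's own recent writes win, so the owner never sees
  // stale versions of their own data. Other viewers just get the
  // eventually-consistent view and can't tell it's behind.
  if (viewerId === profileOwnerId && ownWrites.has(viewerId)) {
    for (const [field, value] of ownWrites.get(viewerId)) {
      view[field] = value;
    }
  }
  return view;
}
```

The key property: after `recordOwnWrite('u1', 'photo', 'new.jpg')`, user `u1` reading their own profile sees `new.jpg` even while the replicated store still says `old.jpg`, whereas any other viewer sees the replicated value.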

FWIW, I've seen data-consistency problems on my FB wall before where the same entry will appear multiple times, usually due to pagination bugs.

Facebook runs the largest memcached cluster in the world and the largest sharded MySQL installation in the world (which they use as a key-value store, since it's hard to evolve the schema otherwise). So, essentially, quick updates are a tremendous amount of duct tape and hand-crafted code making it all work more or less correctly.

I've always wondered why alternative datastores, like MongoDB, get so much good press when a RDBMS like MySQL or PostgreSQL can be used for storing non-relational data in a way that scales tremendously.

The only article I'm seeing for this issue is this one from 2009, by Bret Taylor: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

Other than that, crickets, which is a shame as these RDBMSes are proven to be more reliable than the fad du jour.

There's a video (I don't have the link handy) where a Facebook engineer says they actually had to make code modifications to MySQL to make it do what they want.

They're using the tool in a way that it wasn't designed to be used. Therefore it's probable that there's an opportunity for specialized software to fill the role better.

Except for the iPhone app, with multi-hour delays required for a refresh to actually load new notifications, comments, etc. It'd be fine if push notifications could simply purge the cache since those arrive instantly even if it takes the better part of a day for new comments to actually be displayed on the post.

In my experience (and many friends' experiences), Facebook is and always has been unbelievably buggy on that front. I don't know why, and I personally think Facebook is better than G+ for general social networking (I see G+ as more of a fusion of LinkedIn and Tumblr with some FB-like features), but it's still a pain to use.

Relevant: http://gigaom.com/cloud/facebook-trapped-in-mysql-fate-worse...

I'm not so sure that using mysql at this scale is in the end such a good decision...

Considering the unprecedented scale they are operating on, their track record is not too shabby, and because of that I would take any opinions to the contrary with a big grain of salt. Also, that article is filled with hyperbole and meaningless words, with sentences that contradict each other. In one place he's saying ...

    MySQL ... wasn’t built for webscale applications
But in another he's saying ...

    Facebook has split its MySQL database into 
    4,000 shards
Well, if the author had any clue whatsoever, he would know that splitting a database into multiple shards effectively goes against the relational model (Facebook is effectively using MySQL as a storage engine, not as an RDBMS), and so the inherent limitations that make people say MySQL wasn't built for webscale applications no longer apply.

     SQL databases ... consume too many resources for
     overhead tasks (e.g., maintaining ACID compliance
     and handling multithreading)
Again, ACID compliance and multithreading become local issues, relevant to a single shard. ACID compliance does not apply to the whole cluster of shards they have, and it is still freakishly useful even if applied locally, because you want the guarantee that the user's submission has been saved somewhere in a consistent state, from which you can safely replicate.
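To make the "ACID per shard" point concrete, here's a minimal illustration of deterministic shard routing. The hash function and shard count are illustrative only; Facebook's real scheme is not public:

```javascript
// Route each user to one of 4,000 shards deterministically, so every
// transaction on that user's data touches a single shard and ACID
// guarantees only need to hold locally on that shard.
const SHARD_COUNT = 4000;

// Simple FNV-1a-style string hash (illustrative, not Facebook's).
function hashId(userId) {
  let h = 2166136261;
  for (let i = 0; i < userId.length; i++) {
    h ^= userId.charCodeAt(i);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h;
}

function shardFor(userId) {
  return hashId(userId) % SHARD_COUNT;
}
```

Because the mapping is deterministic, every frontend agrees on which shard owns a given user, and a transaction like "save this user's submission" commits atomically on exactly one shard — which is the sense in which local ACID stays useful even without cluster-wide consistency.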

At their scale maybe they could have chosen something better, more suitable to their needs. However, the question is: how can you make a choice when nobody else has dealt with a social network handling submissions from hundreds of millions of users, one that's expected to reach 1 billion?

You can't. On the other hand it is easy to blabber about webscale (WTF does that mean anyway?)

PS: I'm not going against the opinion of the great Michael Stonebraker, but he's not the author of the article and his views must be taken in context ;) He's not wrong to say that MySQL will be problematic for Facebook. What I'm saying here is that MySQL was as good as any of the available alternatives, especially considering how Facebook uses it.

I agree with the notification problem, although I don't think it's to do with caching local state; it just doesn't seem to register that the notification has been seen until I actually click on the red square. It's more annoying than I would have thought, really.

Facebook has the same problem, of course; they have some increasing counter which seems to tell me nothing useful at all.

Actually, I see the same notification issue in Facebook. I don't think I ever log in to Facebook without my number of notifications being exactly 1 more than it should be. The last "new" notification is always the one I most recently checked. So in this sense, Facebook is also very fast, and also does not have an entirely flawless system.

I'm rather more grateful for all the work Google have done here after using Diaspora; the one thing that nags me about that more than anything is how much slower than G+ or Facebook it is. A couple of seconds on each click really breaks the whole experience.

There's a major downside to all of this complexity, however, because it degrades really poorly: I routinely see 30+ second page loads in Google Plus because they load a ton of code and you won't see anything until it completes 80-130 requests (warm/cold cache respectively) to load almost 5MB of resources! An ugly, non-interactive 1994-style simple HTML page would be far more useful to me at such times…

In practice, this happens to me multiple times a day on a normal connection and more frequently on congested or crappy hotel WiFi.

Is this also why the new Google Reader is so slow (they're using a more general framework that doesn't match the old Reader's performance when rapidly paging through article previews)?

Cliff notes: By not having any users.
