Can anyone go into more detail about what they're talking about here? I didn't realize IE had a "CSS selector limit." Seems like a funny thing to mention when you're talking about how optimized your site is.
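For context: old IE (through IE9) silently ignores any selectors past a hard limit of 4,095 per stylesheet, so big sites really did have to count them. A rough way to check a stylesheet against that limit (this is a heuristic sketch, not a real CSS parser; the function name is mine):

```python
import re

IE_SELECTOR_LIMIT = 4095  # selectors per stylesheet in IE 6-9

def rough_selector_count(css: str) -> int:
    # Strip comments, then count the comma-separated selectors that
    # precede each declaration block. A rough heuristic, not a parser.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    count = 0
    for rule in re.findall(r"([^{}]+)\{", css):
        if rule.strip().startswith("@"):  # skip at-rules like @media
            continue
        count += len([s for s in rule.split(",") if s.strip()])
    return count

print(rough_selector_count("h1, h2 { color: red } .nav a:hover { color: blue }"))  # 3
```

Anything a real-world build tool does is more involved (nested at-rules, hacks), but the idea is the same: if the count exceeds 4,095, IE drops the rest on the floor.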
Edit: It reminds me of the eBay Motors data we got a few weeks ago. A lot of it is about making the page appear to load quickly (i.e. flush the buffer for the 'above the fold' content and then ajax everything else in).
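The "flush above the fold, ajax the rest" trick can be sketched as a chunked-HTML generator; all markup and the `/rest` endpoint here are made up for illustration:

```python
# Sketch: yield the page in two chunks so the browser can start
# painting before the slow content is ready.
def render_page():
    # First chunk: the shell and above-the-fold content, flushed
    # immediately so the browser starts rendering right away.
    yield "<html><body><div id='top'>above-the-fold content</div>"
    # Second chunk: an empty stub plus a script that fetches the
    # slow content asynchronously once the page is interactive.
    yield ("<div id='rest'></div>"
           "<script>fetch('/rest').then(r => r.text()).then(html => {"
           "document.getElementById('rest').innerHTML = html;});"
           "</script></body></html>")
```

A web framework with streaming responses (Rails, Flask, raw WSGI, etc.) would flush each yielded chunk to the socket as it's produced.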
I used the closure compiler with Jammit but I've never tried the library.
I have no experience with Closure Tools, but Clojure is definitely awesome.
I've never used any website where the data I was seeing was so obviously out of date across the board as it is on Google+. Yes, eventually it coalesces towards correctness, but I don't think this model works well for things like notifications and such.
"Oh look I have a new notification... oh wait, no I don't, that's from like an hour ago and I already looked at it."
Speed is a feature, but so is not making your users think your software is just plain buggy. Too much local state caching is basically indistinguishable from a bug for many users, and in many contexts it's annoying even for those who know roughly why it's happening.
The problem with social networking is that such silos don't exist anymore. You can't even group users into larger silos with their cliques, because most social circles are overlapping (for some reason people feel compelled to keep making friends throughout their lives). So instead of the trivial email case of caching, you're fighting this battle where you have to constantly trade off consistency (do we synchronously write to every cache in every cluster in every region?) with performance (obviously not, round trip time to sweden is like 200 ms).
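That consistency/performance tradeoff can be made concrete with a toy model: synchronous write-through to every regional cache pays the sum of round trips before acking, while writing locally and replicating asynchronously acks fast but leaves remote caches stale for a while. Region names and RTTs here are made up:

```python
# Toy model of sync write-through vs. async replication across regions.
class Region:
    def __init__(self, name, rtt_ms):
        self.name, self.rtt_ms, self.cache = name, rtt_ms, {}

def sync_write(regions, key, value):
    # Consistent: every regional cache is updated before we ack,
    # so total latency is the sum of the round trips.
    latency = 0
    for r in regions:
        r.cache[key] = value
        latency += r.rtt_ms
    return latency

def async_write(regions, key, value):
    # Fast: update the local region only, queue the rest for
    # asynchronous replication. Remote caches are stale until then.
    local, remote = regions[0], regions[1:]
    local.cache[key] = value
    pending = [(r, key, value) for r in remote]
    return local.rtt_ms, pending

regions = [Region("us-east", 1), Region("us-west", 70), Region("sweden", 200)]
print(sync_write(regions, "post:1", "hello"))  # 271 (ms), all consistent
```

The real thing involves invalidation protocols rather than pushing values around, but the latency arithmetic is the heart of the tradeoff.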
And you're thinking, "Well, a consistent user experience is the most important thing, so just hold back the user's notification until the data is available to them." But how do you know when the data is available to them? They might see the notification on their cell phone, which is routed to the east coast for DNS reasons, while the write hasn't yet made it to the west coast, where their PC's requests land. If you wait for it to appear everywhere, two users sitting beside each other in a dorm room think Facebook sucks because one posted on the other's wall ten minutes ago and it's still not there. To the end user, none of this looks like the logical consequence of a backhoe cutting a fiber line somewhere in Oregon.
To summarize, it's a super hard problem, which makes it incredibly interesting to work on :) https://www.facebook.com/careers
Makes me excited for my interview on Friday!
Seems like a good idea, and in my experience most "normal people" think fb is instant.
When other people look at your profile, they have to go through the normal eventual-consistency mechanisms, deal with the normal replication lag and message-passing delays, etc. But they have no idea that the information is out of date, because they're not the ones who inputted it. As long as you don't go ask someone through backchannels "did you see my FB update?", they'll never be the wiser.
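The pattern described here is usually called "read your own writes": the author reads from a per-user overlay of their recent writes, while everyone else reads the (possibly lagging) replica. A minimal sketch, with all names illustrative:

```python
# Sketch of read-your-own-writes on top of an eventually
# consistent store.
class Store:
    def __init__(self):
        self.replica = {}   # lags behind the primary
        self.overlay = {}   # recent writes: (user, key) -> value

    def write(self, user, key, value):
        self.overlay[(user, key)] = value
        # Replication into self.replica happens asynchronously
        # elsewhere; it's omitted from this sketch.

    def read(self, user, key):
        # The author sees their own write immediately...
        if (user, key) in self.overlay:
            return self.overlay[(user, key)]
        # ...everyone else gets whatever the replica has.
        return self.replica.get(key)
```

In practice the overlay is bounded (session-scoped, or expired once replication catches up), but the effect is exactly what's described above: the one person who could notice the lag never sees it.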
FWIW, I've seen data-consistency problems on my FB wall before where the same entry will appear multiple times, usually due to pagination bugs.
The only article I'm seeing for this issue is this one from 2009, by Bret Taylor: http://bret.appspot.com/entry/how-friendfeed-uses-mysql
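The core idea in that article is "schemaless MySQL": store each entity as an opaque blob keyed by id, and maintain separate index tables for the properties you query on, which can be rebuilt or updated asynchronously. A rough sketch of the pattern (sqlite3 stands in for MySQL; the schema and function names are mine, not FriendFeed's):

```python
# Schemaless-on-SQL sketch: opaque entity blobs plus index tables.
import json, sqlite3, uuid

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT)")
db.execute("CREATE TABLE index_user (user_id TEXT, entity_id TEXT)")

def put(entity):
    eid = entity.setdefault("id", uuid.uuid4().hex)
    db.execute("INSERT INTO entities VALUES (?, ?)", (eid, json.dumps(entity)))
    # Index tables could be updated asynchronously; done inline here.
    db.execute("INSERT INTO index_user VALUES (?, ?)", (entity["user_id"], eid))
    return eid

def by_user(user_id):
    rows = db.execute(
        "SELECT e.body FROM entities e "
        "JOIN index_user i ON e.id = i.entity_id WHERE i.user_id = ?",
        (user_id,))
    return [json.loads(b) for (b,) in rows]

put({"user_id": "u1", "text": "hello"})
```

You get schema flexibility (add fields without ALTER TABLE) while keeping the battle-tested storage engine and replication underneath.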
Other than that, crickets, which is a shame, as these relational databases are proven to be more reliable than the fad du jour.
They're using the tool in a way it wasn't designed to be used. Therefore it's probable that there's an opportunity for specialized software to fill the role better.
I'm not so sure that using MySQL at this scale is, in the end, such a good decision...
"MySQL ... wasn’t built for webscale applications"

"Facebook has split its MySQL database into"

"SQL databases ... consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading)"
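For what it's worth, the "split its MySQL database" part is just sharding, and the usual mechanism is simple: hash a stable key (e.g. the user id) to pick which database instance a row lives on. A minimal sketch; the shard count and function name are illustrative:

```python
# Hash-based sharding sketch: map a user id to one of N databases.
import hashlib

N_SHARDS = 16  # illustrative; large deployments use thousands

def shard_for(user_id: str) -> int:
    # Stable hash, so the same user always maps to the same shard.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % N_SHARDS
```

The catch, and the reason it's hard, is everything this sketch leaves out: cross-shard queries, resharding when you grow, and keeping related data co-located.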
At their scale maybe they could have chosen something better, more suited to their needs. But the question is: how can you make that choice when nobody else has dealt with a social network handling submissions from hundreds of millions of users, one that's expected to reach a billion?
You can't. On the other hand it is easy to blabber about webscale (WTF does that mean anyway?)
PS: I'm not going against the opinion of the great Michael Stonebraker, but he's not the author of the article and his views must be taken in context ;) He's not wrong to say that MySQL will be problematic to Facebook. What I'm saying here is that MySQL was as good as any other available alternatives, especially considering how Facebook uses it.
Facebook has the same problem, of course; they have some increasing counter which seems to tell me nothing useful at all.
In practice, this happens to me multiple times a day on a normal connection, and even more often on congested networks or crappy hotel WiFi.