Hacker News
Twitter responds to Techcrunch about its scaling issues (blog.twitter.com)
32 points by lyime on June 1, 2008 | 31 comments


Good to see the Twitter team reply to that disgusting Blaine-bashing that TechCrunch had engaged in: The folks at TechCrunch singled out a former employee of Twitter by name in their questions but Twitter is a team—we share responsibility for our victories as well as our mistakes.


Wow, that was a perfect response. Couldn't have been classier. Kudos to the Twitter team for not only responding well but also for taking the edge off Arrington's personal attack on Blaine.



I agree, and the contrast between the Mars landing and Twitter's operations was flawlessly executed. You couldn't ask for a more appropriate response.


"Our new architecture will move our reliance to a simple, elegant filesystem-based approach, rather than a collection of database."

Finally! Using a database for Twitter was a huge mistake. Anyone developing an IM system should not use a database as general message storage, and especially not for a system that's multicasting at such a high rate.


No one in their right mind ever would. Does IRC use a database for messages? Yahoo/MSN/AOL/ICQ? No.


Yes, but as a counterpoint: they don't store anything anyway. They're simply protocols, a network and a login system.


Most of them support offline messages as well.


Yes, but that's more a temporary thing.


Well, if you also need permanent storage (logs, for example), a DB is clearly not the right thing to use.


Logging, in such a case, is usually done locally, not remotely. You're still not making much sense here, axod.


> We currently use one database for writes with multiple slaves for read queries. As many know, replication of MySQL is no easy task, so we've brought in MySQL experts to help us with that immediately.

This really makes me wonder if they know what they're doing. MySQL replication really isn't that hard at all. They're hitting performance bottlenecks with only 3 servers and don't appear to be looking into sharding/partitioning to split up writes?
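For readers unfamiliar with the term, here is a minimal sketch of what splitting writes by sharding could look like. It assumes tweets are partitioned by the author's user ID with a stable hash; the shard count and function names are hypothetical and say nothing about how Twitter's system is actually built.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of write databases


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it
    # to a shard index so the same user always lands on the same shard.
    return int.from_bytes(digest[:8], "big") % num_shards


def route_writes(user_ids, num_shards: int = NUM_SHARDS):
    """Group pending writes by shard so each database only sees its own users."""
    buckets = {i: [] for i in range(num_shards)}
    for uid in user_ids:
        buckets[shard_for_user(uid, num_shards)].append(uid)
    return buckets
```

Each shard then takes only a fraction of the write load, at the cost of making cross-user queries (such as building a timeline from many followed users) touch several databases.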


On the contrary, I'd say that they must really know what they're doing. How many sites with a million daily active users do you know that run on only 3 db servers?

The twitter team deserves a lot of respect for managing to stretch their hardware so far.


*smirk* The company I work for serves volumes that dwarf those of Twitter. We don't drop anything, because we are handling actual money, not tweets. And we get by with 3 database servers, one of which is for failover.

OK admittedly they are wardrobe-sized Suns running Oracle, but that's not the point. Yes it takes skill to do a lot with a little, but it takes skill not to paint yourself into a corner too, and you need both skills to succeed.


I'd point out that 3 wardrobe-sized Sun servers running Oracle probably cost about as much as 300 MySQL servers like Twitter is running, but some might say I'm the one being a smartass then :-P


Twitter definitely does not have millions of daily active users. There are only 824M total tweets. Comparing IDs from posts 24 hours apart shows there were ~500,000 tweets in the past 24 hours.
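The ID comparison can be sketched as follows. This is only an illustration and assumes tweet IDs are handed out sequentially with no gaps; the two sample IDs below are hypothetical, chosen to match the figures above rather than taken from real posts.

```python
def tweets_between(earlier_id: int, later_id: int) -> int:
    """Estimate how many tweets were posted between two sampled IDs.

    Assumes IDs are sequential, so the difference equals the tweet count.
    """
    if later_id < earlier_id:
        raise ValueError("later_id must not be smaller than earlier_id")
    return later_id - earlier_id


def rate_per_second(count: int, hours: float) -> float:
    """Convert a tweet count over a time window into an average rate."""
    return count / (hours * 3600)


# Hypothetical IDs sampled 24 hours apart.
daily_tweets = tweets_between(823_500_000, 824_000_000)
average_rate = rate_per_second(daily_tweets, 24)
```

With those sample values the difference is 500,000 tweets, which averages out to a little under 6 tweets per second.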


I didn't say "millions", I said "a million". That's according to their own reports, and if there are 500,000 tweets, it's quite likely there are a million people sitting there with a Twitter client open (not everyone tweets every day).

Daniel


Your guess is still way high. It's around 200,000 active per week. Probably close to that every day as well since it's a dedicated group.

http://www.techcrunch.com/2008/04/29/end-of-speculation-the-...


It looks to me like he is saying they are planning on ditching MySQL altogether. That is probably why they don't want to invest in redoing/expanding the MySQL replication.


Twitter isn't that complicated a website; it shouldn't take them more than a few days to migrate only the high-load sections (tweets) to a sharded architecture. All they need is money for more database servers (and they have at least $15 million) and know-how. And yet they're planning to ditch MySQL in favour of... text files? And they can't even make a small MySQL cluster highly available?


Here's an interesting proposed architecture that involves ditching MySQL: http://randomfoo.net/blog/id/4182


Does anyone else think TC is handling this situation like a reality TV show where you always suspect they create the drama on purpose?


Yes, TC is a click generator fueled by drama.

... and this game of "let's play Twitter engineer from outside Twitter" the entire space is playing is pathetic. Twitter will grow Twitter. Ev has been through this before and will guide the team and the service to stability. For the rest of us there are better things to do than trying to scale Twitter from an armchair.


At least we know the number of "stories about Twitter scaling" scales. There has been a big increase in the volume of these stories, but they just keep coming.


The original TC article made me feel rather upset. What kind of journalist takes that kind of tone and makes personal attacks against anyone? Arrington has done this repeatedly over the last few months.

Startups have to strike a balance between getting the architecture right and getting a product out the door. The bright and lucky ones come up with a minimally correct infrastructure which is approximately equal to what they can hammer out with their existing resources in the least amount of time. Most of the time, teams aren't experienced or lucky enough to pull that off, in which case it makes sense to try to fail early --- this means not spending time on infrastructure. They get the concept out, and see if anyone wants it at all. There's no point in releasing a product which scales to 100 million concurrent users if no one wants it. In its early days, Twitter must have done what it could to get its service up and working, and used MySQL for the same reason you (yes, you!) use it: the team was familiar with it, and was used to either stuffing code with SQL strings or using an ORM.

After the service took off, a lot of things could have happened. Has Arrington bothered to ask Blaine Cook if he had the resources to do The Right Thing? Did Blaine have the time, or did his company tell him to sit in front of a glowing screen and do CPR while the CEO went out to raise cash? A journalist should have at least indicated that he attempted to hear the other side of the story.

http://en.wikipedia.org/wiki/Yellow_journalism


Journalist?


What is it with some people whining about the (supposed) "damage that Twitter has done to the community"? Not really bright, IMHO. Nobody is being forced to use Twitter, and so far no one has apparently come up with an alternative that is viable enough for droves of people to switch. I don't want to play the uncritical Twitter advocate, but I like the concept, the service and this answer to TechCrunch in particular.


Why is Twitter, being a glorified IM, not moving to an IM architecture like http://www.ejabberd.im? Telecoms have solved their availability issues with Erlang. Why reinvent the wheel?


I heard that Twitter does use ejabberd for a bunch of things, including its "fire hose" stream of all messages. Anyone care to confirm?


Even if they defend their lead architect, he still failed at scaling Twitter.


Thanks for downmodding me. But it still holds. If he is the lead architect, it is his job to design a system that works! If it doesn't work, then it's his fault and not anyone else's.



