
Twittering About Architecture - raghus
http://dev.twitter.com/2008/05/twittering-about-architecture.html
======
ricardo
There has been a lot of speculation and analysis about the challenges that
this type of site will face but I think twitter's problems are much simpler.
It was built by a group of people who had never scaled a site to that level
before. They've been trying to cope with their success but it sounds like they
don't have the right people to write scalable software and don't have enough
operations people to keep their systems up.

I attended Railsconf 2007 and two of twitter's developers held an impromptu
session to discuss the scaling challenges they faced. Twitter had just
experienced several days of downtime and the room was filled with people
interested in hearing all about it. It was pretty obvious from that one-hour
session that their issues are more than just technical.

At one point they showed some code that was used to balance some of their load
among a group of servers. Unfortunately they wrote the function in such a way
that it was limited to a maximum of 10 servers. Once they exceeded the capacity
of those 10 servers, the whole site came crashing down. They seemed like nice
guys and were excited about their work but it was obvious based on their
comments and the crowd's reaction that they were out of their league. I got
the impression that they are doing whatever it takes to keep the site up and
running today, even if it brings the site down tomorrow.
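
Purely to illustrate the kind of bug described above: this is a hypothetical
sketch, not the code shown at the session. The function names, the CRC32 hash,
and the "last decimal digit" trick are all assumptions, picked only because
they are one simple way a balancer can end up capped at 10 servers no matter
how many are in the pool.

```python
import zlib


def pick_server_capped(key: str, servers: list[str]) -> str:
    """Hypothetical buggy balancer: only the last decimal digit of the hash is used."""
    # Only 10 possible values (0-9), so servers beyond the tenth never get traffic.
    digit = zlib.crc32(key.encode("utf-8")) % 10
    return servers[digit % len(servers)]


def pick_server(key: str, servers: list[str]) -> str:
    """Same idea without the cap: the modulus follows the size of the pool."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


if __name__ == "__main__":
    pool = [f"app{i:02d}" for i in range(25)]    # made-up server names
    keys = [f"user{i}" for i in range(1000)]     # made-up request keys

    # The capped version can never touch more than 10 distinct servers.
    print(len({pick_server_capped(k, pool) for k in keys}))
    # The uncapped version spreads keys over the whole pool.
    print(len({pick_server(k, pool) for k in keys}))
```

With the capped version, adding an eleventh server does nothing for capacity,
which matches the failure mode described: everything is fine until those 10
machines are saturated.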

Slides from the session I referenced:
[http://www.slideshare.net/al3x/scaling-twitter-railsconf-200...](http://www.slideshare.net/al3x/scaling-twitter-railsconf-2007)

------
gruseom
The post says little that's specific. What it does say - our problem is
different, it's hard, we're making progress - might all be true, but these are
also the kinds of things people say when a software project is in trouble.
It's hard to tell anything at all from reading the post.

~~~
joshwa
They did call out one of the many "why is it so hard to scale twitter" posts
as being "one of their favorites":

[http://www.hueniverse.com/hueniverse/2008/03/on-scaling-a-mi...](http://www.hueniverse.com/hueniverse/2008/03/on-scaling-a-mi.html)

~~~
nirmal
Everyone should read this blog post and the 2 followups. It brings a lot of
insight into the parts of Twitter that bring it down and it has nothing to do
with the message pushing part of Twitter.

I have a Jabber bot running that is hooked into the entire public timeline and
it continues to run even when the main website has gone down. I don't think
you can directly compare Twitter's issues with issues of Meebo, AIM, GTalk,
etc.

------
KirinDave
It's hard for me to imagine why they're choosing Scala over Erlang. It seems
to me like one of their major requirements is to have task distribution.

Erlang makes distribution and concurrent execution semantically identical;
Scala doesn't. And there doesn't seem to be a good reason to base it on the
JVM: they aren't already neck deep in Java libraries.

I'd really like to know what they're thinking over there.

~~~
jksmith
This task does seem to have Erlang written all over it, but of course we're
not intimately familiar with all the details.

The mindset for selecting tools to build apps which need to scale seems to be
similar to the one we all had back in the early DOS days, when the company
programmers were having to write all the internal software. We were either
unaware of better solutions or we simply felt compelled to write all the stuff
ourselves, from spreadsheets to a GL module. Now, with all the customizable
apps around, why would anyone want to write a custom GL module?

By extension, why try to reinvent solutions for scaling problems when Erlang
already does a bunch of the work for you?

------
jbyers
Nicely written. The authors of nearly all the myriad "how to fix twitter"
posts I've read have trivialized scaling a site that has different properties
from most online services. Scaling _anything_ to twitter-like traffic levels
is hard, and you won't know exactly why it's hard until you do it. I'm deeply
skeptical of any armchair architect who says otherwise.

~~~
jrockway
_Scaling anything to twitter-like traffic levels is hard_

But we've had e-mail working fine for years, and Twitter is just e-mail.

~~~
jawngee
Not sure why you got modded down because it's essentially true and I took
their post to be alluding to as much. My intuition tells me that though they
are huge, they aren't remotely in the same area as AIM, MSN, GMail, Yahoo
Mail, etc. Twitter sits somewhere between an instant messaging service and an
email service.

I'll leave the armchair architecting to others.

------
axod
Why is there so much media coverage of twitter when the average person has no
clue what it is, and probably never will?

~~~
nreece
I agree. Meebo, for example, works on nearly as complex an architecture as
Twitter, but they seem to be scaling well.

~~~
axod
I'd say far more complex.

------
thomasfl
They're hiring people who can "Code using primarily Java, Ruby, C/C++ and
Scala" <http://twitter.com/help/jobs>

~~~
michaelneale
An interesting collection of tech. I noted there was someone from Twitter at
the Scala Lift Off day.

------
brianlash
I admit their "Something's technically wrong here" error was getting old, and
they needed to address the issues that have made that message so pervasive
lately. So they did (if indirectly).

But people like to act like Twitter owes them something. Fact is, it's a free
service.

In the spirit of building their business they clearly want -- in fact they
need -- to build the best product they can; the fact that it's taking too long
by _our_ estimation doesn't matter. They're not going to leave money on the
table so you can believe they're working hard to fix the problem and start
working on their revenue model.

Kudos to the Twitter team for returning critical volleys from some in the tech
community without losing their cool.

------
redorb
Great post. Basically: put up or shut up! (Code, that is.) ... I hope they get
resumes from the post of people who can help.

- Re-writing their whole system incrementally will be expensive; I think they
might need Google money and brains to complete the whole thing... or a lot
more than 15mm.

~~~
ruslan
Rewriting any system does not take much money; it takes a lot of time and a
pair of _good_ devoted hackers. If you're stuck with bad ones, even money won't
help. I believe twitter's hackers are good enough to deal with the problem,
and I don't think their system is so complex. It's too young to get covered
with all the dust and rust :-).

------
wave
I am sure MA's account will be the first one that will be moving into the new
reliable architecture :)

------
atog
That's a well written post IMO.

