
Twitter's problems have never been Rails problems - sant0sk1
http://blog.evanweaver.com/articles/2008/07/10/a-statement/
======
blader
The only problem with Rails is all the armchair scalability experts and
backseat developers who read TechCrunch and use Twitter and think they know a
thing or two about scaling. When Julia Allison goes on web radio and starts
holding court on it, you know that whole meme has really jumped the shark.

Complaining about Rails' scalability is a really, really good signal that
you've never had to scale anything. If you had, then you'd already know what a
small part your application framework plays in a scalable architecture.

~~~
j2d2
Can you explain a little about the problem? It makes sense to me to store
everything in the database and scale that. A solution to scaling the database
seems to be horizontal partitioning where possible. Is this hard to do?

Am I missing something about why scaling frameworks like rails or django is
hard?

On the topic of twitter scaling, I can see the need for heavy database
activity here causing scaling issues. Can anyone provide insight into how they
solved this if it wasn't something like horizontal partitioning?

~~~
mechanical_fish
_It makes sense to me to store everything in the database and scale that... Is
this hard to do?_

Is Larry Ellison a multibillionaire?

I think it's impossible to explain database scaling and performance tuning in
a handful of paragraphs, and I'm certainly not the one to try. If I _were_ to
try and tell you exactly what it is about Twitter that makes engineers cry,
I'd point out that every single page is dynamic, every user's main page
requires a giant JOIN, there are lots of writes coming in all the time from
every direction and writes are harder to scale, low latency is a requirement
for many people, and there's no obvious axis along which to "partition"
Twitter. For example, the PlentyOfFish guy talked about how he could split up
his databases based on geography -- it's overwhelmingly likely that people in
South Bend, Indiana want to look for dates within fifty miles of South Bend,
Indiana, rather than in Spokane. But on Twitter I can follow anyone and anyone
can follow me, so the giant JOIN that builds my homepage has to span the
entire dataset. And, sure, you can build a cache for every user, but then
every user who sends a tweet to 1000 followers triggers the update of one
thousand caches, with one thousand internal messages to one thousand event
queues... and when one machine full of user caches goes down, what then? It's
not acceptable to drop the message on the floor for a subset of users.
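
Roughly, the fan-out-on-write side of that tradeoff looks like this toy
sketch (plain Ruby with made-up names; the real system is surely nothing
this simple):

    require 'set'

    # followers[author] => set of user ids subscribed to that author
    followers = Hash.new { |h, k| h[k] = Set.new }
    # timelines[user] => newest-first list of tweet ids (the per-user cache)
    timelines = Hash.new { |h, k| h[k] = [] }

    def publish(tweet_id, author, followers, timelines)
      # One write by the author becomes N cache updates -- one per
      # follower, so 1000 followers means 1000 internal messages.
      followers[author].each do |user|
        timelines[user].unshift(tweet_id)
      end
    end

    1000.times { |i| followers[:celebrity] << :"user_#{i}" }
    publish(42, :celebrity, followers, timelines)
    puts timelines[:user_0].inspect  # => [42]

And when a machine holding those timelines dies, rebuilding them means
running the giant read-time JOIN anyway, which is why neither side of the
tradeoff is comfortable.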

Some folks will solve this, but they will be better hackers than I, they will
spend a lot of money on hardware, and they will drink a lot of coffee. And it
will take _time_.

~~~
j2d2
_If I were to try and tell you exactly what it is about Twitter that makes
engineers cry, I'd point out that every single page is dynamic, every user's
main page requires a giant JOIN, there are lots of writes coming in all the
time from every direction and writes are harder to scale, low latency is a
requirement for many people, and there's no obvious axis along which to
"partition" Twitter._

That's more along the lines of what I was looking for. Thanks. This sounds
tricky...

Just came across this link now: [http://highscalability.com/scaling-twitter-
making-twitter-10...](http://highscalability.com/scaling-twitter-making-
twitter-10000-percent-faster)

------
Hoff
Twitter is a multicast-capable software-based crossbar switch, with a
necessary form of message journaling and replay.

The destination, or target, of the message is what selects and pulls the
messages.

And with sparse addressing.

Twitter reverses the norms of routing; the receivers control the routing. It's
a distributed forest of packet-cloning and routing servers.

And the routing tables involved here are stonking huge.

I'd not look to apply or use Rails and typical databases here (first); I'd
approach this from a completely different direction. This application just
doesn't fit into a classic database. You're not even sharding the right pieces
here, if you're looking at the database(s).
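
If you want the shape of that in code, a toy sketch (illustrative names
only): the receivers write the routing table, publish consults it, and a
journal makes replay possible.

    class Switchboard
      def initialize
        @routes  = Hash.new { |h, k| h[k] = [] }  # source => receivers
        @journal = []                             # message journal, for replay
      end

      # The receiver, not the sender, edits the routing table.
      def follow(receiver, source)
        @routes[source] << receiver
      end

      # Journal first, then clone the message once per subscribed receiver.
      def publish(source, message)
        @journal << [source, message]
        @routes[source].each { |r| puts "#{r} <- #{message}" }
      end

      def replay(source)
        @journal.select { |s, _| s == source }.map { |_, m| m }
      end
    end

    sb = Switchboard.new
    sb.follow(:alice, :bob)
    sb.publish(:bob, "hello")  # alice <- hello
    p sb.replay(:bob)          # => ["hello"]

The hard part isn't this logic; it's that the routing table has millions of
rows and the journal never stops growing.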

------
jonknee
Humorously, he updated the post to highlight a comment that's "right on": a
comment about a popular Rails site (insiderpages.com) that apparently handles
high traffic with it. Except that the site is unavailable at the moment. Right
on indeed!

~~~
aditya
works for me.

------
jawngee
Are you sure it isn't a rails mindset problem?

~~~
mechanical_fish
Yes, indeed -- if by "rails mindset" you mean "build the minimum amount that
will work - YAGNI; launch early to gauge interest and get feedback; use
established, standard tools, methods, architectures and libraries when
possible; premature optimization is the root of all evil; when you do hit a
scaling problem, it will almost certainly be architectural in nature."

Of course, once your app explodes in popularity and you realize that it's more
of a messaging service than a CRUD app, you've got a problem -- the Rails
Mindset problem. But this is the problem that the Rails user _wants to have_.
The problem the Rails user _doesn't_ want to have is the one where you spend
months learning how to engineer a scalable messaging app, launch it, and find
that nobody cares and that it can't be marketed.

(And that was the _obvious_ risk for Twitter at its inception. _Everyone_
thought Twitter was a waste of time at first. _I_ certainly wouldn't have
wasted more than a few weeks building it, and I would have never imagined that
I'd wind up struggling to move millions of messages a day.)

If you want a world where you don't have _any problems at all_... I don't know
what to do, but don't found a startup. ;)

(Sorry you got downmodded, BTW... my advice is "be less terse". I tried the
Zen Koan method of posting a few times when I started out here; it's
unreliable.)

~~~
swombat
I wish I had more than one mod point to give you.

Scalability issues are a very nice problem to have.

~~~
sant0sk1
I gave him one on your behalf. Now if you upmod this comment you will no
longer owe me one ;)

~~~
swombat
Done! :-P

Now up-mod me again. Let me know when you've done it, and if we keep this up
we might find out whether pg imposed any limit on commenting depth ;-)

~~~
rapind
now you both owe me.

------
bprater
Twitter isn't your standard Rails application. It was hacked out in Rails
because Rails made it easy. Nobody clearly understood that Twitter wasn't a
traditional Rails app until it was too late.

The biggest scaling issue with frameworks (like Rails) that obscure the
database operations, which tend to be the most expensive, is that
inexperienced developers won't spend time picking through the logs, trying to
find ways to eliminate or speed up queries.
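
Picking through the logs doesn't have to be glamorous, either. A rough
sketch (the regex matches old Rails-style log lines and will need adjusting
for your setup) surfaces the queries worth attacking first:

    threshold_ms = 100.0
    slow = []

    # Rails logs print timings like "User Load (152.3ms)  SELECT ..."
    File.foreach("log/production.log") do |line|
      if line =~ /\((\d+(?:\.\d+)?)ms\)\s+(SELECT|INSERT|UPDATE|DELETE)\b/i
        ms = $1.to_f
        slow << [ms, line.strip] if ms >= threshold_ms
      end
    end

    slow.sort_by { |ms, _| -ms }.first(20).each do |ms, line|
      printf("%8.1fms  %s\n", ms, line[0, 120])
    end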

~~~
gaius
That's very true. I mean, the very same people who use Rails probably sneer
at VB, but it's the same principle: you can trade development speed for
runtime efficiency.

The important thing is to be aware when and why you are doing this.

------
icey
Who is Evan Weaver? How does he know what Twitter's problem is?

~~~
sant0sk1
Evan Weaver is a Ruby programmer whom Twitter hired a while back to help fix
their scaling problems.

~~~
icey
Thank you; that wasn't immediately evident by looking at his bio on his blog.

~~~
mhartl
It isn't? <http://blog.evanweaver.com/about/> says "I currently work at
Twitter" and has a link to his resume, which says he started there in May.

~~~
icey
Bizarre; I didn't see that there when I looked earlier.

------
ericb
One problem is that startups don't usually approach scaling like mature
enterprises. Your users should not be your _load test_.

Twitter has money now. It's time for them to grow up and use a test
environment that mirrors their live site, use something like LoadRunner,
simulate the next level of traffic, and remove bottlenecks _before_ that
traffic level hits their live site.
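
You don't even need LoadRunner money to start. A crude sketch (the staging
URL and numbers are placeholders) hammers a mirror and reports the latency
percentiles you'd watch while turning up the dial:

    require 'net/http'

    target      = URI("http://staging.example.com/")
    concurrency = 50    # simulated simultaneous users
    requests    = 200   # requests per simulated user

    threads = concurrency.times.map do
      Thread.new do
        Array.new(requests) do
          started = Time.now
          Net::HTTP.get_response(target)
          Time.now - started
        end
      end
    end

    all = threads.flat_map(&:value).sort
    puts "median: #{(all[all.size / 2] * 1000).round(1)}ms"
    puts "p99:    #{(all[all.size * 99 / 100] * 1000).round(1)}ms"

Run it at today's traffic, then at 2x and 10x, and fix whatever falls over
before real users find it for you.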

~~~
LogicHoleFlaw
Most startups, if they approached scaling like mature enterprises, would never
have the opportunity to become one.

~~~
ericb
I agree with you--my point was not that Twitter should have initially
approached scaling like large enterprises do--only that, as a natural
consequence of smallness, small companies don't and aren't in the habit.

Twitter now has the resources and real user load. The window where performance
problems are permissible closes eventually--see Friendster. _Premature_
optimization is the root of all evil. But when your userbase is threatening to
leave and dreaming up alternatives on their blogs daily, it's hard to call it
_premature_.

~~~
swombat
You're making it sound like so-called "mature enterprises" approach scaling in
a mature way. Perhaps "enterprises" like Yahoo, Amazon and Google do, but I
worked in a very mature investment bank for 4 years and though they had some
load testing, none of the systems I ever saw built were prepared to handle a
load that might grow by a factor of 10 in a year. In fact, most of them
struggled to keep up even with the current load. That's true for both
internal bank systems and external client-facing ones.

~~~
ericb
So your company _does_ load testing. I didn't claim every big company tests
adequately, or even has a process in place, so let's put the straw men away.

Consider the major hosted tax apps on tax day (Quicken, H&R Block, etc.).
Consider the load they are under--a substantial portion of the US files
their taxes in a 12-hour period. It makes Twitter look _wimpy_. And they
release what is essentially a rewrite _every year_.

How do they make something so critical work under a massive load their app has
yet to undergo, in advance, while Twitter fails? Load that grows 100x in a
year can be modeled, estimated, and simulated, and the software can be tuned
and hardware scaled _in advance_. They manage it. We have the technology.
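
The back-of-envelope modeling isn't even hard. With made-up but plausible
numbers:

    # All figures are illustrative assumptions, not real filing statistics.
    filers         = 30_000_000  # hypothetical last-day filers
    window_hours   = 12
    pages_per_user = 40          # a multi-page tax interview

    rps = (filers * pages_per_user) / (window_hours * 3600.0)
    puts "average: #{rps.round} req/s"        # => average: 27778 req/s
    puts "peak (3x avg): #{(rps * 3).round} req/s"

Once you have that number, you rent the hardware and replay that rate
against staging until it stops falling over.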

------
BrandonM
As of now, this submission has 49 points. That means its point total is
2.5 times the number of words and half the number of characters that the
original post had (before the "postscript" was added). Does anyone else
think that's a bit absurd, especially considering that it's not some
timeless piece of advice or pearl of wisdom?

------
volida
"I just want to go on record saying that none of Twitter's problems have ever
been Rails problems."

The way the phrase is written demonstrates a lack of organization.

~~~
j2d2
It also doesn't say anything. It'd be nice if he explained a little more of
his position. A common thing I hear is that Rails simply puts the problem of
scale into the database. The problem may not be a Rails problem, but the
design of Rails doesn't solve the problem either. It just puts it somewhere
else.

I have not worked with Rails, though. This is just what I've read. I'd prefer
if an insider like Evan Weaver would elaborate a little more on what problems
they had and how they solved them. All I know is that some people say Rails
is a problem and have evidence for it, while this guy says it's not and
offers nothing more.

~~~
nostrademons
Many folks would say the database is where scaling problems belong. Lots and
lots of programmer hours have already been expended in figuring out how to
scale databases; no sense duplicating that effort in scaling your web
framework.

I'd say Twitter's problem is that they're a multicast messaging app with a
database backend, a combination that tends not to work so well. As an intern
in college, I worked on a financial J2EE app that tried to do everything
through JMS messaging queues backed by Oracle. It had similar scalability and
performance problems.

In Twitter's defense, they didn't know they'd be a multicast messaging app
when they started, and a database is a logical choice for what they did know
they'd be (a website). They'll figure things out; they just need time and
resources to rearchitect.

~~~
j2d2
_Many folks would say the database is where scaling problems belong._

This makes sense but I've found the db to be a rather large bottleneck. On one
hand, it's nice to store everything in a single blackbox and the apps don't
have to care about data. On the other, databases can be slow and I figure some
kind of caching is involved between them.

I could see that a site would simply use the db to generate static pages which
would then be cached, but as you said, multicast messaging with a database
backend seems like a tricky combination.

Edit: I realize I've kinda thought out loud here. I've summed up the general
idea with the question below.

I take the problem to be that going to the actual database is necessary. What
_are_ the possible solutions for something like this?

Horizontal partitioning seems like the obvious answer. Is it _that_ easy,
though?

~~~
LogicHoleFlaw
Horizontal partitioning depends on your data. Do you have lots of joins and
interdependent data? Partitioning will be tricky. Do you just need to store a
lump of data relevant to one customer, many times over? Horizontal
partitioning is easy.

Relational databases are really cool. They're not the right technology for
all apps, though. The big wave-makers right now are things like Google's BigTable
and Amazon's SimpleDB. They're implicitly horizontal but with reduced querying
abilities compared to traditional SQL.

It's a hard problem to scale a traditional relational DB across multiple
independent commodity boxes like we see at Google or with AWS. The "stuff
everything in a giant hashtable" approach scales nicely across a bunch of
cheap boxes.
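
The easy case really is that easy. A sketch of key-based sharding (the shard
count and hashing scheme are illustrative; real systems lean on consistent
hashing so that adding a box doesn't remap every key):

    require 'digest'

    SHARDS = 8

    # A stable hash of the customer key picks the box that owns that
    # customer's lump of data; no cross-shard joins are ever needed.
    def shard_for(customer_id)
      Digest::MD5.hexdigest(customer_id.to_s).to_i(16) % SHARDS
    end

    puts shard_for("user_1234")  # same customer, same shard, every time

The hard case is exactly the Twitter-shaped one, where the follow graph
joins rows that live on different shards.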

