
Twitter Shifting More Code to JVM, Citing Performance and Encapsulation - colin_jack
http://www.infoq.com/articles/twitter-java-use
======
raganwald
I like this. I recall a startup some years back with a vision to build
something complex. The founder sat down with everyone and said that we had an
important decision to make: Do we build something that grows or do we build
something that evolves?

Neither way forward was presented as an obvious win. Building something that
grows meant overengineering and trying to forecast the future. Building
something that evolves meant constantly rewriting things, taking time away
from new development.

Twitter seems to have taken the "evolution" route, and it has served them
well. I'm not at all sure they would be this successful if they'd tried to
build on the JVM right from the start using the technologies available at the
time.

~~~
bborud
The problem with Java is mainly that people think they have to use all the
horribly complex, intrusive, badly written, underperforming frameworks that
exist in the Java sphere. And then, of course, you will inevitably be screwed.

To this day it takes _real_ effort to convince Java programmers that a lot of
"best practices" are anything but.

For instance I still see Java programmers use frameworks that require them to
maintain mountains of flimsy XML configs for things that they ought to have
done in code. Both to get the benefit of letting the compiler do the work of
weeding out boneheaded errors and to get rid of unnecessary flexibility that
just leads to more work and more confusing, hard to read code.

Java is a great language in which you can be very productive. But productivity
means that you have to crack down on people who drag along J2EE-crap, or
whatever crap was invented to make J2EE-crap slightly less crap. It also means
you have to mentor people actively and harass anyone who even thinks of doing
in XML what can be accomplished much faster in testable, compiled Java code.
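
To make the "wiring in code" point concrete, here is a minimal sketch. The names `TweetStore`, `TimelineService`, and `wire` are hypothetical and come from neither the article nor any real framework. The wiring is ordinary Java, so a misspelled class name or a wrong constructor argument is a compile error instead of a failure at startup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not from any real framework.
public class WiringInCode {

    static final class TweetStore {
        private final List<String> tweets = new ArrayList<>();

        void save(String tweet) {
            tweets.add(tweet);
        }

        // Returns a copy of the last n tweets, oldest first.
        List<String> latest(int n) {
            int from = Math.max(0, tweets.size() - n);
            return new ArrayList<>(tweets.subList(from, tweets.size()));
        }
    }

    static final class TimelineService {
        private final TweetStore store;
        private final int pageSize;

        TimelineService(TweetStore store, int pageSize) {
            this.store = store;
            this.pageSize = pageSize;
        }

        void post(String tweet) {
            store.save(tweet);
        }

        List<String> timeline() {
            return store.latest(pageSize);
        }
    }

    // The "configuration": plain code that the compiler type-checks
    // and that a unit test can call directly.
    static TimelineService wire() {
        TweetStore store = new TweetStore();
        return new TimelineService(store, 2);
    }

    public static void main(String[] args) {
        TimelineService service = wire();
        service.post("first");
        service.post("second");
        service.post("third");
        System.out.println(service.timeline());
    }
}
```

Running `main` prints `[second, third]`. Passing a `String` where `wire()` expects an `int` page size would be rejected at compile time, which is the check an XML descriptor cannot give you.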

~~~
nerd_in_rage
Yep, I saw this same thing. Mountains of XML deployment descriptors /
"component wiring" (some "Spring" garbage), needlessly verbose and usually
unneeded mapper classes, useless interfaces (my rule is if you are only going
to have ONE implementation save the interface for later after you know you got
it right and need it), extra layers that do nothing (except "map", of course),
etc.

kill me. Makes me want to kill myself.

~~~
GrooveStomp
This exactly describes my very brief experience working with Java
professionally. Back in university it was no big deal - the differences
between C++ and Java weren't _that_ big. In a production environment, though...
XML, XML, XML!

All the XML-wrapper tools for Java are supposedly the reason why there is so
much XML config stuff going on - but the result is that you end up writing so
much code in XML instead of the actual language that you're supposed to be
using.

It's like somebody looked quickly at the MVC pattern, got it all wrong, and
now Java has very tight coupling between XML and Java code for any production
environment that hopes to leverage existing tools.

~~~
bborud
And the really bad part is that in Java XML has always been extremely painful
to work with.

------
fizx
One of the key things that was kinda skimmed over: You should check out
<http://github.com/twitter/finagle>.

It takes async network programming with netty into a functional programming
paradigm. Programming scala/finagle network services is much nicer IMO than
coffeescript, ruby/em/fibers, raw netty. I can't wait until we release our
finagle-based cassandra client. It's been really nice to work with.

Here's some sample code:

<https://github.com/twitter/finagle/tree/master/finagle-example/src/main/scala/com/twitter/finagle/example>
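
For anyone who has not seen the style being described, here is a rough Java analogue using only the JDK's `CompletableFuture`. This is not Finagle's API and not code from the linked examples. `Service`, `fetchUser`, `fetchTimeline`, and `pipe` are hypothetical names, and the "network calls" are simulated. It only illustrates the idea of treating an async service as a function from a request to a future, and composing such functions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class FutureComposition {

    // Hypothetical: a service is just a function from a request to a future response.
    interface Service<Req, Rep> extends Function<Req, CompletableFuture<Rep>> {}

    // Simulated remote lookups; a real service would do non-blocking I/O here.
    static final Service<Integer, String> fetchUser =
        id -> CompletableFuture.supplyAsync(() -> "user" + id);

    static final Service<String, String> fetchTimeline =
        user -> CompletableFuture.supplyAsync(() -> user + ": 3 tweets");

    // Compose two services into one; the second starts when the first completes,
    // and the caller's thread is never blocked in between.
    static <A, B, C> Service<A, C> pipe(Service<A, B> first, Service<B, C> second) {
        return request -> first.apply(request).thenCompose(second);
    }

    public static void main(String[] args) {
        Service<Integer, String> timelineById = pipe(fetchUser, fetchTimeline);
        // join() blocks only here, at the edge of the program.
        System.out.println(timelineById.apply(42).join());
    }
}
```

Running `main` prints `user42: 3 tweets`. The composed `timelineById` is itself a `Service`, so it can be passed to `pipe` again.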

~~~
jorgeortiz85
I've been using Finagle over the last couple of weeks to build a small little
side-project. It's been almost a religious experience for me. This is how
network programming should be done.

------
clintjhill
In my opinion, the best part of their infrastructure is their willingness to
specialize in languages within each stack. It isn't simply "Java everywhere".

"To allow developers to choose the best language for the job, Twitter has
invested a lot of effort in writing internal frameworks which encapsulate
common concerns."

The single best investment companies can make is to allow developers to choose
their specialty, and their language. Otherwise you have a huge overhead of
skill set mismatch. And your talent pool can be bigger if you're open to more
than one language.

It's a refreshing view.

~~~
badmash69
Twitter can afford to have multiple languages in their stack, but most
start-ups simply don't have that luxury. If you have an early-stage start-up
with about 10 developers and you need to quickly build a set of server-side
components or services for your product, would you want to use multiple
languages to develop that, or one programming language? Having developers
choose the best language for each job would be disastrous if you end up with
your stack written in 5 different languages. I would not discount the importance
of keeping the codebase easy to manage.

~~~
j_baker
In my experience, languages don't make a codebase more complex. What makes a
codebase complex is how many subcodebases you have. In particular, I avoid
touching my company's JavaScript code not because I can't do JavaScript but
because it's difficult to learn an entirely new set of APIs.

If you can allow multiple languages to share common code like you can on the
JVM, then I say it's ok to go crazy.

------
lemming
This sounds slightly misleading to me:

 _The primary driver is honestly encapsulation, so we can iterate faster as a
company. Having a single, monolithic application codebase is not amenable to
quick movement on a per-team basis. So when we decide to encapsulate
something, then because of our performance concerns, its better to rewrite it
in the JVM for most systems, than to write a new Ruby system._

It sounds from that like their primary driver for using the JVM is actually
performance, but that they only decide to rewrite components when
encapsulation drives them to do so. I can't see how the JVM provides any
encapsulation benefits over Ruby for new systems.

------
sehugg
Netty is a godsend. Java threads are extremely memory-hungry, so async I/O is
a must for handling many connections. We routinely handle 200K simultaneous
connections on our push servers without breaking a sweat.

~~~
fictorial
Mind if I ask what the specs are for those servers?

~~~
sehugg
AWS m1.large. We don't use nearly the entire CPU or memory footprint, but
m1.small is just a wee bit too tiny and there's nothing in between.

------
russellperry
"...static typing becomes a big convenience in enforcing coherency across all
the systems. You can guarantee that your dataflow is more or less going to
work, and focus on the functional aspects...But as we move into a light-weight
Service Oriented Architecture model, static typing becomes a genuine
productivity boon."

A 'productivity boon'? I don't understand. At the risk of invoking the ancient
static vs. dynamic religious war, this statement makes no sense to me.

I get that if your codebase is tangled enough, and your unit test suite is
inadequate to "guarantee that your dataflow is more or less going to work"
that maybe _rewriting_ significant portions of it in a type-safe system makes
sense. I guess. But without specific code examples it's hard to say exactly
what he's talking about.

Myself, I've spent many years in both static and dynamic environments and I
know exactly where I'm more productive -- and it's not wrestling complex
parameterized types to the ground, pulling up abstract classes or interfaces,
and/or configuring IOC containers, abstract factories and the like.

I wonder though -- this has echoes of Alex Payne's criticisms a couple of
years ago, which I think Obie Fernandez addressed pretty well:

<http://blog.obiefernandez.com/content/2009/04/my-reasoned-response-about-scala-at-twitter.html>

~~~
wpietri
_A 'productivity boon'? I don't understand. At the risk of invoking the
ancient static vs. dynamic religious war, this statement makes no sense to
me._

I don't know what they mean either, but my first guess has to do with company
size and mobility of staff.

I love dynamic languages most when I'm coding solo or with small teams. I
don't need to express a lot of things to the computer when they're so clear in
my head. But if I'm going to take over an adequately maintained code base, I'd
rather it be in a static language, because more of the intent is explicit.

At this point Twitter has a lot of engineers and is still growing, and they're
in a very dynamic business. It's plausible to me that they get a global
productivity boost even though static languages could feel like a productivity
hit to each individual engineer.

~~~
colin_jack
> But if I'm going to take over an adequately maintained code base, I'd rather
> it be in a static language, because more of the intent is explicit

Out of interest would you feel the same if both codebases had adequate test
coverage?

In the post one of the reasons given was that with a static language you can
pretty much guarantee that a dataflow is going to work, that you won't be
caught out by getting the wrong type. I'd see this being most useful at the
edges of the system, and in those cases incoming data would normally go
through some validation anyway (including through a schema in many cases)
which would normally make clear the types involved.

Having said that, I do think in those cases being able to specify types can
make things slightly easier for newcomers. I'm just surprised it's seen as a
big enough advantage to be one of the key motivators for switching languages.

~~~
prodigal_erik
A decent static type system ensures every object will be compatible with the
types of all its references, no matter how the program may manipulate them.
Only 100% path coverage could replace that guarantee, and that's generally
regarded as infeasible.
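
A small Java sketch of the difference. The names `UserId`, `TweetId`, `lookupAuthor`, and `lookupAuthorUntyped` are hypothetical and do not come from the article. The commented-out call is rejected by the compiler on every path, including ones no test ever runs. The untyped version fails only if something actually reaches the bad call.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedIds {

    // Distinct wrapper types so the compiler can tell the two kinds of id apart.
    static final class UserId {
        final long value;
        UserId(long value) { this.value = value; }
    }

    static final class TweetId {
        final long value;
        TweetId(long value) { this.value = value; }
    }

    // Toy data: tweet 1 was written by user 100.
    static final Map<Long, UserId> AUTHORS = new HashMap<>();
    static {
        AUTHORS.put(1L, new UserId(100L));
    }

    // Statically typed: only a TweetId is accepted.
    static UserId lookupAuthor(TweetId id) {
        return AUTHORS.get(id.value);
    }

    // "Dynamic-style": accepts anything and checks at run time.
    static Object lookupAuthorUntyped(Object id) {
        if (!(id instanceof TweetId)) {
            throw new IllegalArgumentException(
                "expected TweetId, got " + id.getClass().getSimpleName());
        }
        return AUTHORS.get(((TweetId) id).value);
    }

    public static void main(String[] args) {
        System.out.println(lookupAuthor(new TweetId(1L)).value);

        // lookupAuthor(new UserId(100L)); // does not compile: UserId is not a TweetId

        try {
            // Compiles fine; the mistake surfaces only when this line actually runs.
            lookupAuthorUntyped(new UserId(100L));
        } catch (IllegalArgumentException e) {
            System.out.println("caught at run time: " + e.getMessage());
        }
    }
}
```

Running `main` prints `100` and then `caught at run time: expected TweetId, got UserId`. Without the `try` block exercising that call, the untyped mistake would ship unnoticed, which is the path-coverage problem in miniature.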

No project I've worked on in twenty years had test coverage I could call
"adequate", though I realize this is partly my fault. Hard-core TDD from day
one might get you as far as "mediocre", and the industry average is much worse
than that.

~~~
colin_jack
Sure thing, but I was really thinking more of the "more of the intent is
explicit" issue (and its effect on developer productivity), not guarantees
regarding compatibility.

------
jrockway
jrockway's law: add enough developers to a project and it eventually becomes
Java.

~~~
aristidb
Except it seems to be Scala here?

~~~
aphexairlines
"In the case of the search team, since they do a lot of work on Lucene, which
is Java-based, they have a lot of experience in writing Java code. As such it
is more convenient for them to work in Java than Scala or another language."

------
SoftwareMaven
One of the important lessons, IMO, is that you should always be making
pragmatic decisions that work today and into the _near_ future. You can't
predict how your system will change over time, so engineer for today's needs
and let tomorrow take care of itself.

A failure to be pragmatic inevitably leads to analysis paralysis. Just worry
about getting stuff done. :)

------
petenixey
I wonder if the reason Twitter never sees any significant evolution in product
is because they've weighed themselves down with too much iron cladding on
their services or if they've had more time to iron clad their services because
they never evolve the product.

The two may be unrelated, but for a large company they seem to produce
astoundingly little product.

~~~
akronim
They may have very little functionality, but do it at scale. e.g. even
searching tweets (which doesn't seem to work that great...) is a massive
undertaking (not _astoundingly little_ ) that they're obviously still working
at.

~~~
petenixey
That's a fair point and I don't trivialise the engineering. However, I am taken
aback by just how little product they produce. In a similar timeframe
Facebook had created vast swathes of product and dealt with far greater user
numbers than Twitter.

------
rockarage
This is similar to Facebook going from PHP to C++ via HipHop (
<http://developers.facebook.com/blog/post/358/> ). Very few companies will ever
reach the scale of Twitter and Facebook. Building with Ruby, PHP, Python, or
whatever your team is comfortable with is still OK.

------
msie
Holy name-soup, Batman! Stuff that was new to me: Gizzard, Finagle, Blender,
Netty and Earlybird.

------
gary4gar
I wonder how the story would have gone if Twitter had opted for Python instead
of Ruby.

Would they still have had these problems, or is Python better than Ruby in
these respects?

~~~
hkarthik
Given the huge growth of the site in the first few years and the relative
inexperience of the founding team with problems of this magnitude, I expect
the story would have been the same with Python, Java, .NET, etc.

The simple Web Server -> ORM -> Relational database architecture that most
modern web frameworks utilize can easily break down under tremendous
concurrent load, especially if you attempt to run it on commodity hardware.

------
riprock
As a JRuby fan, I have my fingers crossed that they will switch to JRuby and
help it become "more mature." :)

------
cheez
Hopefully they don't have any Java experts on board.

