
How We Moved Our API From Ruby to Go - spimmy
http://blog.parse.com/uncategorized/how-we-moved-our-api-from-ruby-to-go-and-saved-our-sanity/
======
aikah
Go is a good target to move something to when one knows one's business domain
and performance issues.

But I wouldn't recommend it to beginners trying to build a secure web
application, because they will have to roll their own code when it comes to
user registration, authentication, authorization management, and I'm not even
talking about writing thread-safe libraries where one has to deal with mutexes
and shared memory access.
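
To make the mutex point concrete, here is a minimal sketch of what "thread-safe" means for even a trivial shared structure in Go. The `SafeCounter` type, its methods, and `countConcurrently` are hypothetical names made up for illustration; nothing here comes from the article or from Parse's code.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter is a hypothetical example type: a map guarded by a mutex so
// that it can be shared between goroutines.
type SafeCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

// NewSafeCounter returns a counter with an initialized map.
func NewSafeCounter() *SafeCounter {
	return &SafeCounter{counts: make(map[string]int)}
}

// Inc increments the count for key while holding the lock.
func (c *SafeCounter) Inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[key]++
}

// Get returns the current count for key while holding the lock.
func (c *SafeCounter) Get(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[key]
}

// countConcurrently starts n goroutines that each increment the same key
// once, waits for all of them, and returns the final count.
func countConcurrently(n int) int {
	c := NewSafeCounter()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc("signup")
		}()
	}
	wg.Wait()
	return c.Get("signup")
}

func main() {
	fmt.Println(countConcurrently(1000)) // 1000
}
```

Without the lock, concurrent writes to the map are a data race, which is the kind of foot-gun a beginner has to know about up front.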

If you need something highly concurrent and you know what you're doing, by all
means. But Go is definitely not RAD for classic website development, and it's
easy to shoot oneself in the foot with Go. It's by no means a silver bullet.

I'm surprised Parse is coded in Rails; I assumed it was built on Node.js, since
it uses JS for cloud code.

~~~
spimmy
Good points. Technically we still use rails for our website, and have no plans
to move off of it. Rails is _great_ for websites! It's just our API that we
rewrote in Go.

~~~
beat
So if you had to do it all over again, would you write the API from the start
in Go, or would you start with Ruby and transition to Go again?

~~~
spimmy
Hard to say. Ruby really did let us move and ship products insanely fast,
without having to sink precious engineering time into boilerplate and standard
libraries. Most startups fail, and it's often because they couldn't move fast
enough. So I don't think it was a bad choice at the time. We were able to do
the rewrite once we had grown up a bit, gotten acquired, hired more engineers,
and had a proven business model that was worth investing in.

~~~
supster
Did you guys consider Erlang or Elixir? Elixir might have especially been a
good fit given the history with Ruby.

------
mattknox
Twitter never used Jruby in any significant capacity. Note that the article
she cited ( [http://www.infoq.com/articles/twitter-java-use](http://www.infoq.com/articles/twitter-java-use) ) said we were evaluating
jruby, but chose not to use it because the tooling around MRI was much better
(in substantial part due to the efforts of twitter's backend team at the
time).

We probably could have made a great jruby memcached/MySQL/thrift client, but
it wasn't clear that doing so would have much performance win, as jruby itself
wasn't dramatically faster than MRI. It would have, however, made it really
easy for us to offload intense bits of code to java code, which probably would
have been a faster upgrade path than rewriting in scala as we did.

~~~
fizx
Our production thrift code generator (now at
[https://github.com/twitter/scrooge](https://github.com/twitter/scrooge)) was
JRuby for about a year, before Robey decided it was an abomination and rewrote
it in scala.

JRuby was easy because you can Maven require it from a Java project. Ruby
already has a Thrift IDL parser, so I just stole the AST from that, and used
an ERb template to write out corresponding scala. The whole thing was maybe
200LOC.

But yeah, that's the only JRuby that ever did anything production related at
Twitter.

------
waterside81
I'll add another data point. We're in the process of moving our entire API
stack to Go from Python. First it was in Django, then Falcon, now more & more
pieces are in pure Go, with a little cgo sprinkled for good measure. Apart
from being a language that's easy to pick up if you're familiar with Python,
Go is obviously a heckuva lot faster and way easier to deploy.

We cut down our EC2 instance usage by 2/3 with more improvements yet to come.
One machine alone can handle 1000 API calls / second - and our API calls are
performing complex calculations, not just disk I/O.

It also allows us to deploy our API within customers' networks if they choose,
which we previously accomplished using Virtual Machines - which sucked.

------
bsaul
Afaik, the jvm is often praised for being the gold standard in terms of vm,
both performance wise and in terms of tools available to instrument it. I'm
curious as to why the parse engineers were not looking forward to using it.
I've always thought using it was a good point for a language...

~~~
mahmoudimus
In all seriousness, I also wonder the same thing. The JVM is often praised for
being the gold standard, because it is. It is superior in every way to
anything else, with Erlang's a distant second.

The article mentioned that due to the asynchronous nature of goroutines,
instrumentation and metrics publishing was not a problem. I do not think this
applies just to Go, but to any primarily async language -- like node.js, etc.
However, this is still not close to the JVM's incredibly sophisticated support
for metrics and insights for your running applications.
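
As a rough illustration of why metrics publishing tends to be cheap with goroutines, here is a minimal sketch in which the request path hands samples to a background goroutine over a buffered channel. The names `sample`, `startCollector`, and `record` are hypothetical, and this is only one possible design, not how Parse's instrumentation works.

```go
package main

import (
	"fmt"
	"time"
)

// sample is a hypothetical metric record: a name and a measured duration.
type sample struct {
	name  string
	value time.Duration
}

// startCollector starts a background goroutine that sums samples per name.
// It returns the channel to send samples on and a stop function that closes
// the channel and returns the accumulated totals.
func startCollector(buffer int) (chan<- sample, func() map[string]time.Duration) {
	in := make(chan sample, buffer)
	done := make(chan map[string]time.Duration)
	go func() {
		totals := make(map[string]time.Duration)
		for s := range in {
			totals[s.name] += s.value
		}
		done <- totals
	}()
	stop := func() map[string]time.Duration {
		close(in)
		return <-done
	}
	return in, stop
}

// record never blocks the caller: if the buffer is full the sample is
// dropped and false is returned.
func record(out chan<- sample, s sample) bool {
	select {
	case out <- s:
		return true
	default:
		return false
	}
}

func main() {
	metrics, stop := startCollector(64)
	for i := 0; i < 3; i++ {
		record(metrics, sample{"api.request", 5 * time.Millisecond})
	}
	fmt.Println(stop()["api.request"]) // 15ms
}
```

Dropping samples when the buffer is full is a deliberate trade-off in this sketch: the request never waits on the metrics pipeline.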

Re: Go, I can see the value in having concurrency built-in, but I see
[http://www.paralleluniverse.co/](http://www.paralleluniverse.co/)'s
libraries like Quasar and Comsat ultimately becoming the de facto standard in
modern day Java programming.

With Java8 in rapid adoption and the upcoming changes in Java9, alongside
pragmatic languages like [http://kotlinlang.org](http://kotlinlang.org) with
incredible tooling out of the box, Java is looking like a mighty fine
ecosystem to get started with.

That being said, it is extremely frustrating developing on a mature and
advanced eco-system. As an engineer transitioning to Java from Python, I
constantly have a lingering feeling that I'm not going to deliver idiomatic
code and I don't know the best way to do things. Since I'm not aware of why
design decisions were made in a certain way, probably to preserve backwards
compatibility, I always ask myself why this was the best way and it's a bit
difficult to research why. I also needed to familiarize myself with various
patterns and Java's idiosyncrasies (1 public class per file, for example).

My 2cents.

------
xacaxulu
Between this and Pivotal moving Cloud Foundry CLI tools to Go, I keep seeing
more reasons to add Go to my CV. Ruby will always be fun, but I'm guessing
there will be some serious $$ & interest in Go projects in the coming years
(even more than currently).

~~~
morenoh149
I've been trying to learn java spring boot to add to my CV. How do you feel
about java?

~~~
NateDad
There's a million companies using Java, I doubt it'll stop being a good
addition to your CV for a long time.

~~~
fixxer
That is exactly how I feel about Java: lots of companies use it, so I should
know it.

Go is a pleasure to work in. Java, less.

~~~
codygman
> Go is a pleasure to work in. Java, less.

Sometimes I wish I'd stopped at Go instead of then exploring Racket, Ocaml,
Clojure, Haskell, and Scala so I could continue to hold your opinion.

The end result still matters the most to me, but being so aware of how
languages are getting in my way and making me do busywork is exhausting.

~~~
fixxer
I only compared Go with Java.

~~~
codygman
I know, I was just adding on that I am sometimes jealous of you being able to
say "Go is a pleasure to work in" when it isn't so much of a pleasure for me
anymore.

------
julien_c
“The MongoDB Go driver is probably the best MongoDB driver in existence, and
complex interaction with MongoDB is core to Parse.”

It's also the only one that's not maintained by MongoDB Inc. Coincidental? :)

PS: And yes, `mgo` by Gustavo Niemeyer is pretty incredible.

~~~
romanovcode
I think the author should see the MongoDB C# drivers.

~~~
codygman
Perhaps unsurprisingly, I'm very fond of the Haskell mongoDB package. Here's
an example:

    
    
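        {-# LANGUAGE OverloadedStrings, ExtendedDefaultRules #-}
        -- The string and numeric literals in the documents below need these extensions.
        
        import Database.MongoDB
        import Control.Monad.Trans (liftIO)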
        
        main = do
           pipe <- connect (host "127.0.0.1")
           e <- access pipe master "baseball" run
           close pipe
           print e
        
        run = do
           clearTeams
           insertTeams
           allTeams >>= printDocs "All Teams"
           nationalLeagueTeams >>= printDocs "National League Teams"
           newYorkTeams >>= printDocs "New York Teams"
        
        clearTeams = delete (select [] "team")
        
        insertTeams = insertMany "team" [
           ["name" =: "Yankees", "home" =: ["city" =: "New York", "state" =: "NY"], "league" =: "American"],
           ["name" =: "Mets", "home" =: ["city" =: "New York", "state" =: "NY"], "league" =: "National"],
           ["name" =: "Phillies", "home" =: ["city" =: "Philadelphia", "state" =: "PA"], "league" =: "National"],
           ["name" =: "Red Sox", "home" =: ["city" =: "Boston", "state" =: "MA"], "league" =: "American"] ]
    
        allTeams = rest =<< find (select [] "team") {sort = ["home.city" =: 1]}
        
        nationalLeagueTeams = rest =<< find (select ["league" =: "National"] "team")
        
        newYorkTeams = rest =<< find (select ["home.state" =: "NY"] "team") {project = ["name" =: 1, "league" =: 1]}
    
        printDocs title docs = liftIO $ putStrLn title >> mapM_ (print . exclude ["_id"]) docs

------
HugoDias
I would love to know about Parse's stack. Please consider sharing it with us
:)
([http://stackshare.io/trending/tools](http://stackshare.io/trending/tools))

------
jbbarth
The article makes some very good points, and I'm surprised you don't talk more
about the deploy advantages for instance. But I was a bit annoyed by little
things that I think are inexact:

- the "one-process-per-request" meme throughout the post applies only to some
ruby app servers (there are event loop and threaded models too, think thin,
puma, passenger in some modes) and I guess, reading between the lines, that
it's mostly a problem of thread-safety and async support, because of the gems
Parse used to have, right? I'm sure that limits options at some point anyway,
but the statement is misleading and not really explained. I'd love to hear
more details.

- I don't understand how the comments in the little Go file snippet apply
in any way to "ruby"; it may be rails caching mechanisms, or a specific gem,
but I have a hard time mapping those very specific details to something
intrinsic to ruby. It seems more like grumpy ruby bashing, like you'd have
done php bashing 5 years ago.

As with all rewrite stories, I think there's a part of envy/excitement over the
new cool tech you want to use (and that's fair! pleasure gives you huge
productivity boosts), and also a part of success related to the fact you
_know_ the kind of things you failed in the first version, so you won't make
the same mistakes the 2nd time.

I'd love to hear finer details on those points! Great article overall anyway

~~~
spimmy
You are totally right, most of the stuff that really hurt us was Rails
middleware magic, not Ruby itself. I should have been more precise -- grumpy
rails bashing, not grumpy ruby bashing! FWIW we still use Ruby on Rails for
our website and it's great for that.

I'm hoping to get some followup posts from the backend eng team on specific
interesting problems we ran into during the rewrite.

& yes deploys with go are the freaking bomb :)

------
headius
It's a decent article but the justification for rewriting is totally
discredited by not even having tried JRuby. Many apps drop right in and get
true concurrency, better GC, and faster performance for free. It sounds like
JRuby wasn't even given a chance, and I know they never contacted the JRuby
team to talk about it.

------
rdw
This bit jumped out at me: "200 API servers ... to serve 3000 requests per
second". That's only 15 RPS per server. Is that normal for Rails?

~~~
spimmy
we had to way overprovision to handle even momentary spikes in availability
from any backing store. we aimed to run at around 20% unicorn worker
utilization under normal conditions.

~~~
twelvenmonkeys
But doesn't this sound less about over provisioning and more about optimizing
the code you already had?

~~~
spimmy
optimizing won't help you here, unfortunately. the process-per-request model
is fundamentally flawed past a certain scaling point.
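
A toy model of the difference, as a minimal sketch: a fixed pool of workers that each handle one request at a time, versus one goroutine per request, both talking to a backing store that is briefly slow. All names (`slowBackend`, `serveWithWorkers`, `serveWithGoroutines`) and the numbers are made up for illustration; this is not a benchmark of unicorn or of Parse's API.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// slowBackend stands in for a backing store that is briefly slow.
func slowBackend(latency time.Duration) {
	time.Sleep(latency)
}

// serveWithWorkers handles requests with a fixed pool of workers, each of
// which handles one request at a time (a rough model of a pre-forked
// process-per-request server). It returns the total elapsed time.
func serveWithWorkers(requests, workers int, latency time.Duration) time.Duration {
	start := time.Now()
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				slowBackend(latency)
			}
		}()
	}
	for i := 0; i < requests; i++ {
		jobs <- i // blocks while every worker is busy
	}
	close(jobs)
	wg.Wait()
	return time.Since(start)
}

// serveWithGoroutines gives every request its own goroutine, so a slow
// backend call only delays the request that made it.
func serveWithGoroutines(requests int, latency time.Duration) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < requests; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			slowBackend(latency)
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	const latency = 20 * time.Millisecond
	fmt.Println("2 workers: ", serveWithWorkers(16, 2, latency))
	fmt.Println("goroutines:", serveWithGoroutines(16, latency))
}
```

With 16 requests and 2 workers, each worker has to sit through at least 8 slow calls back to back, while the goroutine version waits for roughly one. That queueing is why a worker pool has to be heavily overprovisioned to ride out a slow backing store.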

------
AKifer
Starting a business with rails made sense and still does. And later moving to
another technology that scales better makes sense too. After all, you will
probably be a millionaire by the time you need to scale your product, so why
worry?

~~~
caseysoftware
It's less about having the money and more of having other resources like time,
people, and skills. Or having enough understanding on _when_ you need to
execute the growth plan. And suffering the consequences if you time it wrong
or execute poorly.

But of all problems to have, there are many worse than exploding growth.

------
mordocai
Rails does not require a process-per-request model. It is great that you had
success moving to Go, but you could have moved to a different model with rails
and likely solved your problem.

~~~
gfodor
It is seriously amazing to me that "running MRI rails with threads" was not on
the list. At the very least, before ruling these things out completely based
upon research alone they should have prototyped some of the easier solutions
(and potentially deployed them to one or two machines) to prove their
hypotheses. Just saying "JVM tuning is hard, let's rewrite the API" is the
type of thing that falls into all the typical traps of second system syndrome.

Of course, on the other side of things, everything feels rosy -- but
counterfactually all the effort they spent on this could have potentially gone
elsewhere if they had resolved their scalability issues with Rails in a simpler
manner. (Or even better, contributed those solutions back to the community.)
This was a move that fortunately worked out, but it sounds pretty high risk to
me and is the type of thing that can kill companies if they bet wrongly.

------
barosl
> Stuff like doubly encoded URLs

Could you elaborate on this? It sounds a bit scary. Does this mean that Rails
tries to decode a URL several times until it can't be decoded? If so, isn't
this problematic if some (arguably crazy) person tries to send "%2F"
literally, not "/"? I'm half sure I'm misinterpreting, so here to ask.
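
To illustrate the worry, here is a minimal sketch of decoding once versus decoding until the string stops changing. `decodeUntilStable` is a hypothetical model of the "decode several times" behaviour being asked about; it is not taken from Rails or from the article, only `url.PathUnescape` is a real standard-library call.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeOnce percent-decodes s exactly one time.
func decodeOnce(s string) (string, error) {
	return url.PathUnescape(s)
}

// decodeUntilStable keeps decoding until the string stops changing.
// Hypothetical model of repeated decoding, for illustration only.
func decodeUntilStable(s string) (string, error) {
	for {
		next, err := url.PathUnescape(s)
		if err != nil {
			return "", err
		}
		if next == s {
			return s, nil
		}
		s = next
	}
}

func main() {
	// "%252F" is "%2F" encoded a second time ("%25" is a literal "%").
	const doublyEncoded = "a%252Fb"

	once, _ := decodeOnce(doublyEncoded)
	stable, _ := decodeUntilStable(doublyEncoded)

	fmt.Println(once)   // a%2Fb
	fmt.Println(stable) // a/b
}
```

Under the repeated-decoding model, a client that really meant the literal text "%2F" would end up with "/" instead, which is exactly the ambiguity in question.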

------
vruiz
I wonder whether the choice would have been different had Microsoft open
sourced C# earlier. Seems like it.

~~~
badpenny
C# isn't cool.

------
crimsonalucard
Didn't even look at Haskell. Does any expert know how Haskell would fare in
this situation?

~~~
codygman
Disclaimer: Not an expert, just someone who has some "Real World" (dayjob)
Haskell experience.

It would fare much like Go: using, say, warp[0] and async[1], likely with
similar performance numbers but with less code, more static typing guarantees,
and simpler[2] code. Like Go though, you'd deploy with a static binary. This is
just a wild guess though, I would need to know specifics of the Go application
they've created.

0:
[http://hackage.haskell.org/package/warp](http://hackage.haskell.org/package/warp)
1:
[http://hackage.haskell.org/package/async](http://hackage.haskell.org/package/async)
2: I find redundant code like Go requires[3] to be more complex. Haskell code
can be complex, but simple, straightforward Haskell that isn't trying to be
complex is very simple and concise.
3: Well, you can use interfaces and lose type safety. Or you can use
reflection and make things dog slow.

------
culo
What tools are you using to do API management on top? Have you seen the
open-source KONG? [http://github.com/mashape/kong](http://github.com/mashape/kong)

------
dirkgadsden
"How We Moved Our API From Ruby to Go and Saved Our Sanity"... Right, so using
Go will alleviate severe mental illness? Why isn't Golang all the buzz in
professional psychologist and psychiatrist circles?

~~~
spimmy
it's more like, getting paged in the middle of the night over and over and
over is a known mental health problem :)

~~~
orenmazor
so true. just reading your comment was enough to make me feel frustrated and
anxious.

