

From Ruby to Go: a rewrite for the future - itsderek23
http://blog.scoutapp.com/articles/2014/09/25/from-ruby-to-go-a-rewrite-for-the-future

======
vidarh
I find it a bit funny to see concurrency dragged out in a case that's pretty
much a perfect example of where Ruby handles concurrency quite decently, both
in terms of performance and of compact, simple code.

E.g. his URL example:

    
    
        require 'net/http'
        require 'uri'
    
        urls = ["http://www.cnn.com", "http://espn.go.com/", "http://grantland.com", "http://www.newyorker.com/"]
    
        urls.collect do |url|
          Thread.new do
            response = Net::HTTP.get_response(URI.parse(url))
            puts "#{url}: #{response.code} #{response.message}"
          end
        end.each(&:join)
    

I'm not so convinced about the performance argument, though. I have a pure-
Ruby statsd implementation collecting massive amounts of metrics from dozens of
servers and calculating averages, feeding time series to Redis, pulling them
out and rolling them up back into Redis and to a Couchdb server for longer
term storage - it was a "hack" I threw together to replace our (admittedly
simplistic) usage of Graphite because Graphite does things to your disks that
no hard drive should have to suffer through, and it's not breaking a sweat.
I'm sure you can squeeze out a bit more with Go, but for a monitoring setup,
if you're not spending most of your time waiting on IO in the kernel, I'm not
quite sure what you're doing.

I can see their point on packaging and delivery, though, having had to deal
with packaging up three sets of patched gems for three different Ruby versions
just in our office environment + servers internally for an internal tool. One
of the reasons I'm experimenting with my ahead-of-time/"mostly static"
compiler for Ruby is that I'm very much for static binaries.

But we also already have JRuby, which makes packaging a lot easier too.
Granted, I can understand people not wanting to deal with the JVM.

~~~
th0br0
Could you elaborate on the negative hdd aspects of Graphite? I'm evaluating
using it and would love to know more!

~~~
vidarh
(the short-ish version: make sure you test it with a realistic number of
metrics and updates coming in from a realistic number of servers; if it can
keep up on hardware you're willing to throw at it, then by all means go for
Graphite - it has tons of flexibility that makes it a convenient way to get
started. Just expect it to need a fairly hefty amount of disk IO compared to
the amount of data you feed it in; the interfaces to Graphite are simple
enough and narrow enough you can replace it later if need be, once you know
what parts of it you're using)

Graphite essentially assumes that disk is cheap. Even fast disks. And either
you're not collecting very big data sets, or you roll up to more rarely
updated series very rapidly, or you have a hefty high performance RAID to put
it on, or you distribute it over multiple servers.

And for a lot of people it's probably true that it'd be cheaper to just buy a
suitable set of SSDs and do RAID10, and/or spread it out over multiple
servers, than to spend time putting together something else.

But if you start seeing yourself needing to scale up substantially, it may be
worth thinking through exactly what your needs are and whether you can get
away with less by giving up on some of what Graphite offers.

And one outcome of the expectation of fast disk is simply that Whisper - the
underlying storage code - makes some assumptions that lead to lots of
unnecessary system calls: lots of seeks that turn out to be no-ops, and small
reads and writes. All of them are common problems that people tend to be
unaware of, but they were painfully obvious after some strace usage. Just
fixing that would improve Graphite performance substantially even without
ditching functionality, but I really did not feel like digging into unfamiliar
Python code to get halfway there when I could just as quickly replace the
whole thing.
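The small-writes problem is easy to demonstrate. The sketch below is purely
illustrative (it is not Whisper's code): it writes the same data points once as
many individual unbuffered writes and once as a single batch, using `syswrite`
so that each call maps to one write(2) syscall.

    
    
        require 'tempfile'
        
        # Hypothetical data points; the metric name is made up for illustration.
        points = (1..100).map { |i| "metric.cpu #{i}\n" }
        
        unbatched = Tempfile.new('unbatched')
        points.each { |p| unbatched.syswrite(p) }  # one write(2) per point
        
        batched = Tempfile.new('batched')
        batched.syswrite(points.join)              # a single write(2) for everything
        
        # Both files end up byte-identical; only the syscall count differs.
        puts File.read(unbatched.path) == File.read(batched.path)
    

Run either version under strace and the difference in syscall counts is
immediately visible, which is roughly what made the problem painfully obvious
to me.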

That assumption of being prepared to use a fast RAID just for some metrics for
our dashboards and monitoring didn't sit well with me in general.

And given that we basically just used it to aggregate a set of time series
over relatively recent history and don't need most of Graphite's features, and
that we don't particularly care if the occasional server crash takes out some
of the most recent history with it on one server when we can cheaply run a
second collecting most of the same data (and would've ended up with more than
that with Graphite), it seemed worth doing things differently. I
basically replaced Graphite (the subset that we use of it) with this, in the
course of a few days:

* A tiny statsd implementation in Ruby, that flushes data to a Redis with disk snapshots turned off. It does this every 10 seconds.

* An aggregation script of a couple of hundred lines that uses key-based filtering to aggregate all 10-second data for any five-minute interval into a new time series, and again aggregates all five-minute interval data into hourly time series, and so on, and that moves all data older than a few hours to a disk-backed CouchDB (could put it into "anything" really)

* A tiny Sinatra app that provides a somewhat Graphite-compatible JSON output (but with no attempt to support the Graphite query syntax or server-side image rendering)

* A slightly modified Graphene version to build our dashboard on.
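The roll-up step in the aggregation script can be sketched in a few lines. This
is a minimal sketch, not the actual script: it uses an in-memory array of
`[timestamp, value]` pairs where the real thing reads from and writes to Redis,
and the `roll_up` helper is a name I made up for illustration.

    
    
        # Group samples into fixed-width buckets and average each bucket.
        # Averaging averages is only exact here because every bucket holds
        # the same number of points.
        def roll_up(samples, interval)
          samples.group_by { |ts, _| ts - (ts % interval) }
                 .map { |bucket, pts|
                   values = pts.map { |_, v| v }
                   [bucket, values.sum.to_f / values.size]
                 }.to_h
        end
        
        # Ten minutes of 10-second samples for one metric
        samples = (0...60).map { |i| [i * 10, i.to_f] }
        
        five_min = roll_up(samples, 300)          # 10s series -> 5min series
        hourly   = roll_up(five_min.to_a, 3600)   # 5min series -> hourly series
        
        p five_min  # averages for the 0s and 300s buckets (14.5 and 44.5)
    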

It by no means does everything Graphite does, but what it does, it does
without brutally hammering the disk. It consumes a tiny fraction of the
resources Graphite used, too.

------
jewel
For completeness sake, here is an example of how to make it parallel in ruby:

    
    
        require 'net/http'
        require 'uri'
        
        urls = ["http://www.cnn.com","http://espn.go.com/","http://grantland.com","http://www.newyorker.com/"]
        
        threads = urls.map do |url|
          Thread.new do
            response = Net::HTTP.get_response(URI.parse(url))
            puts "#{url}: #{response.code} #{response.message}"
          end
        end
        
        threads.each(&:join)

~~~
clubhi
Is that really parallel? I thought Ruby had a GIL.

~~~
ad_hominem
MRI has a GIL, but the GIL doesn't block on I/O like HTTP requests (but you'll
still be pegged to one core unless you do process forking yourself). JRuby and
Rubinius use native threads so you can use all cores with just Thread.new.
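A quick way to convince yourself of this: under MRI, two threads that block
(here on `sleep`, which releases the GIL while waiting, just as blocking
socket I/O does) finish in roughly the time of one, not the sum of both.

    
    
        start = Time.now
        
        # Two threads each blocking for 0.5s; with the GIL released during
        # the wait, total wall time is ~0.5s rather than ~1.0s.
        threads = 2.times.map { Thread.new { sleep 0.5 } }
        threads.each(&:join)
        
        elapsed = Time.now - start
        puts elapsed.round(1)  # ~0.5, not 1.0
    

Replace the `sleep` with `Net::HTTP.get_response` calls and the same overlap
applies, which is why the URL-fetching examples above work fine on MRI.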

~~~
Arnor
> JRuby and Rubinius use native threads so you can use all cores with just
> Thread.new.

Almost. JRuby uses JVM threads, so the actual threading model depends on the
underlying JVM.

~~~
rubiquity
What versions of the JVM don't map JVM threads directly to OS threads? I
thought green threading was removed a long time ago. I would love more info,
thank you.

------
xiaq
I think Go's approach to readability is comparable to "simple English". Both
have a minimal set of words (tokens) and syntax constructs. You lose some
witty constructs, and you often find yourself writing in a stupid way.

But, since the code is always plain and stupid, you go through very few mental
twists when you are parsing the language. Since code is usually read much more
than written, this advantage is huge.

~~~
increment_i
I agree. I find it far more valuable to have a language with minimal
constructs that can be easily read by everyone on the team rather than a
language that tries to include every functional bell and whistle. Think I'll
start giving Go a serious look.

------
iagooar
I'm a huge fan of Go, but moving from Ruby to Go can't be a decision based on
performance gains only.

There are a lot of advantages the Ruby language and ecosystem provide. What
about programmer happiness and productivity? What about the experienced Ruby
devs you surely have? Are they wanting to switch to Go? What about the lack of
experience using Go? Are you sure you're going to avoid all the pitfalls of
the new and shiny language? Isn't better hardware cheaper than your devs'
time?

I'm not saying that your decision is wrong - you've probably put a lot of
thought into it - but most of the time, developing software means using tools
you know well, that are battle-hardened and have been proven to deliver fast.

~~~
curun1r
When the performance you're considering is your own machines, I wholeheartedly
agree with you. But when the performance you're considering is that of your
customers, the story changes.

The guys who wrote the story are building a monitoring product and their agent
code will be installed on all of their customers' machines. Every CPU cycle
and byte of RAM should be sacred to them because it takes those resources from
their customers' applications.

------
slowernet
Go is faster than Ruby, but trying to show that with a benchmark that spends
most of its time blocking on HTTP responses is kind of daffy.

------
th0br0
I shall never understand this "Ruby is slow and difficult to parallelize so
let's use Go!" mindset. People knew about the negative aspects of Ruby in
advance (i.e. GIL + performance) but still chose to use it even though back
then (whenever that was) there were other alternatives - no doubt without the
big userbase that Ruby had thanks to Rails. Now there comes Go, which makes
parallelism easier but introduces challenges of its own and has some downsides
(e.g. polymorphism) which Ruby solved better - but whatever: "a good
developer can always work around such challenges". So in the end, most people
are just following the hype and (IMHO) exchanging one bad apple for another...

~~~
pkroll
Go wasn't an option then. Obviously the languages that WERE options weren't
preferred. The article mentions the author's lack of Java love specifically.
There are plenty of other situations where Ruby (or something else) was
eventually a bottleneck, and people moved to Java, or wrote their own PHP
compiler, or moved to .NET version of ColdFusion... Now that Go's an option,
some of the folks looking for speed are going to choose it.

------
ChikkaChiChi
The Scout team developed a tool based on languages they knew, they proved that
there was value and demand in the marketplace, and then they decided to
implement a performant architecture that can be optimized to handle a larger
volume of traffic.

Hopefully this post can help keep some HN startups from making the mistake of
premature optimization. You're not Google until you are Google.

------
WestCoastJustin
Probably a really stupid question, but does Go have a Rails like framework for
web apps?

~~~
ansible
There are a bunch of them available. This list is not complete. [1] [2]

I'm currently looking at Negroni [3], but haven't actually used it yet. It's
written by the same guy who did Martini, one of the popular frameworks.

[1] [https://code.google.com/p/go-wiki/wiki/Projects#Web_Applicat...](https://code.google.com/p/go-wiki/wiki/Projects#Web_Applications)

[2] [http://codecondo.com/4-minimal-web-frameworks-go-golang/](http://codecondo.com/4-minimal-web-frameworks-go-golang/)

[3]
[https://github.com/codegangsta/negroni](https://github.com/codegangsta/negroni)

~~~
notduncansmith
I'd hesitate to call Negroni a framework. In fact, from Negroni's readme:

> Negroni is _not_ a framework. It is a library that is designed to work
> directly with net/http.

------
zem
i've not been paying a lot of attention to golang because it never seemed to
solve a need i had, but the recent discussions around the blogosphere about
how well it does cross-compilation have got me excited. every time i've tried
compiling ocaml code on windows it's been a horrible experience.

~~~
ChikkaChiChi
I've cross-compiled several programs I've written in Go, and it has been
painless.

What is even more exciting is that the binary includes everything your program
needs to run, so runtime installations are no longer a barrier to entry.

I've been able to spool up a VM, grab my bins, and do what I need to do in
extremely short order. It's great!

------
pselbert
"One thing that was getting our nerd juices flowing: Go."

I have a hard time believing that statement. Ignoring whatever "nerd juices"
may be, surely something that was entirely designed for dead simple imperative
syntax isn't getting them flowing.

You want something different, I get that. What you really want is something
statically compiled and cross-platform.

~~~
pkroll
You have a hard time believing that statement, or you'd rather not? 'cause
other than calling the author a liar, you're saying it's not possible for
someone to be excited about the language. That's awfully presumptuous.

------
no_future
If you're so serious about your web stack and know it is going to need all
these fancy and performance features, why build on Ruby in the first place?
Why not use something that is performant with a history of successful apps
behind it like Java?

It seems that everyone in web is just blindly following whatever flavor of the
month tech stack tech bloggers who probably spend more time writing about
software than they do writing software are hyping. CRUD apps (i.e. 90% of
web-based companies) are probably one of the most solved problems in tech. Why is
it that every week there is some post by some inane startup nobody has ever
heard of about "why we rewrote our stack from Ruby/Python to X shiny new
language that people use buzzwords I probably don't even know the meaning of
to describe"? This sort of lines up with that post from yesterday I think
about companies just wanting to push out the product as fast as possible
rather than caring about the actual quality of it. Why wait until you need to
scale and hire 10 100k/year engineers to rewrite your app into a capable
language and migrate it with 0 downtime instead of just doing shit right the
first time? Seems like it would be a lot less headache, even if VCs were
throwing piles of cash at you. Snapchat, for whatever other things you can
fault them technically for like being hacked or whatever, did this right. They
built on Java and deployed to App Engine and their application scaled right
up.

I can understand extreme cases like Twitter (which went from Ruby/Rails to Scala),
where it was essentially a trivial CRUD application that eventually got so
enormous that language performance features became an issue.

Also I don't understand why web people are so obsessed with "concurrency". How
does concurrency help webapps that are just getting a request, doing some
stuff with databases(which have their own well implemented concurrency
mechanisms) and sending a response back so some tween can look at pics of
their friends and send gossipy messages? Unless you are doing something that
requires long running processes like video streaming or an online MMO type
game, what is the point of it? Oddly, the application in question in the
parent post's link is some kind of server monitoring service, so I really don't know
why they chose Ruby.

~~~
ChikkaChiChi
Probably because they knew Ruby, and it doesn't always make sense to
optimize prematurely unless you know there is demand for your product.

~~~
no_future
>because they knew Ruby

If the only language I know is Visual Basic should I code my web stack in
Visual Basic and deploy it to Microsoft Azure?

A surefire way to tell if a developer is lazy (and not in the good way) and
takes little pride in their work is if they show an interest in a platform or
language solely because it's one they are familiar with, instead of picking
the tools that are the best for the job at hand. I mean this reasonably of
course, if your app is a reasonably good fit for Ruby and you're familiar with
Ruby more power to you. I just don't understand companies developing odd
domain specific/specialized software in languages and platforms not meant for
it and then complaining that it runs too slow.

Something I'm also unsure of is why almost every company these days seems to
be choosing Rails to build on. It doesn't seem to be that they are just
picking the language which has the largest pool of developers available;
from what I've read, Ruby as a language was completely obscure until
Rails came along, and even now isn't used much outside of Rails and the
occasional script. Python is really popular and used in many domains ranging
from web to scientific computing to interfaces for stuff like Arduinos and has
a batteries-included de-facto web framework, Django, so I would imagine there
were more Python programmers available, but you see far fewer companies using
a Python stack. Java also has lots of battle tested web systems and there is a
sea of cheap Java developers(might as well hire a cheap dev if your initial
product is going to be a rushed piece of shit), though it is verbose, has a
longer dev cycle, and is considered to be for the enterprise type businesses
run by Wall Street fat cats everyone is trying to "disrupt".

Is it because Rails is just trendy, or some other reason?

~~~
pkroll
Rails is excellent for building CRUD apps, which is what it was designed for,
and as you say that covers a lot of ground. Plenty of code is available for
all sorts of
common features on the web. I learned Ruby specifically to run Rails, and
mostly, Ruby is a lot of fun to write. That it's staggeringly, hideously,
Godot-level slow is eventually problematic but by then you've got enough
traffic to pay for multiple servers. :)

