
Open Sourcing Our Go Libraries (2014) - joss82
https://tech.dropbox.com/2014/07/open-sourcing-our-go-libraries/
======
mjackson
Using Go (or Scala or Java or whatever) doesn't magically scale a large
system.

Scaling problems in large systems are rarely solved at the micro level, i.e.
you don't "scale" by simply gaining the ability to run more operations in a
single thread. This is always the problem with "language X is more scalable
than Y" debates. In my experience, scale has little to do with the language and
everything to do with how you use it.

This post brings to mind another written by Alex Payne a few years back about
scaling in the large vs. scaling in the small:

[https://al3x.net/2010/07/27/node.html](https://al3x.net/2010/07/27/node.html)

The danger with making hand-wavy claims with very few technical details is
that it perpetuates the notion that there are certain magic bullets out there
that will make your system "scale" if you use them. In the past few
years we've seen quite a few large companies in SV and SF make the switch to
some shiny new language (Scala, anyone?) only to find several million dollars
and many months later that they're still having conversations about how to
make stuff scale. Only now they're talking about making it happen in a
language that is fairly new to their organization and with which they have
only a few real experts.

~~~
scott_s
Your comment reads to me as if you think the author implied that they think Go
"magically" provides scalability. I inferred nothing of the sort. Rather, I
assumed they wanted to use a language which provides concurrency as a
primitive, to make it easier for them to write concurrent code.

This switch (Python to Go) for this motivation (better concurrency support)
seems reasonable to me, particularly since Python does not have a good
concurrency story. (I love Python. But I would not choose it if I wanted high
concurrency.)

~~~
waps
Also you should consider that the more you scale, the more "scaling in the
small" saves you money/machines/operators/...

Why? Less resource-hungry code requires fewer resources and doesn't cause
problems quite so quickly (tends to cause harder problems though). 8 years ago
I switched the directory on a site that required >30 servers to run to tmpfs,
and did some serious sql optimization in the php code. The month after that I
turned down 20 of them (actual servers + caching and load balancing servers
that weren't necessary anymore).

------
joss82
I find the comments most interesting. An example:

For us, one of the biggest latency wins comes from the fact that go can truly
execute sql statements in parallel (whereas python's GIL serialized these
parallelizable operations). In general, single-threaded go is at least 5x
faster than pure python (without c-module).
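
A minimal sketch of the pattern the quoted comment describes, assuming one goroutine per query. `fakeQuery` and `runAll` are hypothetical names, and the sleep merely stands in for a blocking SQL call; this is not Dropbox's code. The waits overlap, so total latency is roughly that of the slowest query rather than the sum.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeQuery stands in for a blocking SQL call. The name and the sleep are
// illustrative assumptions, not part of any real database driver.
func fakeQuery(id int, latency time.Duration) string {
	time.Sleep(latency)
	return fmt.Sprintf("result-%d", id)
}

// runAll issues n queries concurrently, one goroutine each, and returns the
// results in query order.
func runAll(n int, latency time.Duration) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = fakeQuery(i, latency) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait() // block until every query has returned
	return results
}

func main() {
	start := time.Now()
	results := runAll(5, 50*time.Millisecond)
	fmt.Println(results)
	fmt.Println("elapsed:", time.Since(start).Round(10*time.Millisecond))
}
```

Whether any CPU-bound part of the work also runs in parallel depends on how many OS threads the Go runtime is allowed to use.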

~~~
zzzcpan
They are not executed in parallel though, only asynchronously.

~~~
noselasd
How/why are they not executed in parallel?

~~~
themartorana
They _MAY_ be executed in parallel. Concurrency != parallelism[0] BY DEFAULT.

That said, if you're running Go against _n_ number of CPUs, then yes, the
concurrency may in fact happen in parallel.

[0] [http://blog.golang.org/concurrency-is-not-parallelism](http://blog.golang.org/concurrency-is-not-parallelism)
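
As a rough illustration of the "_n_ number of CPUs" point: the Go runtime caps how many OS threads may execute Go code simultaneously, and `runtime.GOMAXPROCS` reads or changes that cap. Go releases before 1.5 defaulted it to 1; later releases default to the number of logical CPUs. `parallelismLimit` and `setLimit` below are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelismLimit reports how many OS threads may execute Go code at the
// same time. Passing 0 to GOMAXPROCS queries the value without changing it.
func parallelismLimit() int {
	return runtime.GOMAXPROCS(0)
}

// setLimit changes the cap and returns the previous setting.
func setLimit(n int) int {
	return runtime.GOMAXPROCS(n)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:  ", parallelismLimit())

	// With a limit of 1, goroutines are still concurrent (they interleave),
	// but no two of them run Go code at the same instant.
	prev := setLimit(1)
	fmt.Println("now limited to:", parallelismLimit())
	setLimit(prev) // restore the earlier setting
}
```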

------
Jabbles
"switch from dropbox formatting to std formatting"

[https://github.com/dropbox/godropbox/commit/5ed34e410e1c9fe8...](https://github.com/dropbox/godropbox/commit/5ed34e410e1c9fe8ea274eaf809d0ee8ad2bdc03)

~~~
mikecb
Did they just run fmt on it?

~~~
jayrox
yes

~~~
mikecb
I wonder what Dropbox's reasoning is for not doing that in the first place.

~~~
rcarmo
Probably because Go indentation and spacing can be a little annoying at first
when you're used to Python (or simply because they followed some kind of
generic coding standard).

It took me a while to "give up" and find _fmt_-massaged code natural.
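
For anyone curious what "running fmt" does, here is a small sketch using the standard library's `go/format` package, which applies gofmt's canonical style. `gofmtSource` is a hypothetical wrapper and the messy input is made up for illustration.

```go
package main

import (
	"fmt"
	"go/format"
)

// gofmtSource formats Go source text in canonical gofmt style using the
// standard library's go/format package.
func gofmtSource(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err // the input was not syntactically valid Go
	}
	return string(out), nil
}

func main() {
	// Deliberately non-standard spacing and indentation.
	messy := "package main\nfunc   main( ) {\n  x:=1\n  _ = x\n}\n"
	clean, err := gofmtSource(messy)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	fmt.Print(clean)
}
```

Formatting is idempotent: feeding the output back in returns it unchanged, which is why a one-off "switch to std formatting" commit settles the matter.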

------
SEJeff
Serious question, not having looked at their caching lib just yet, are they
going to be able to beat groupcache, written by golang upstream?

~~~
mediocregopher
Groupcache addresses a very narrow use-case of caching: immutable data. Keys
in groupcache cannot be changed or deleted, which is what allows for some of
the cool things it does like distributing keys to multiple nodes automatically
and prevention of stampeding. It's useful for things like caching lots of
small static files (which I believe is what google uses it for), but it's not
useful as a db cache where things are constantly changing.

Just glancing at Dropbox's caching package it looks like a much more general
cache, with deletes and sets and all of that. So the two aren't really
comparable.
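
To make the contrast concrete, here is a rough sketch of the two styles. Both types are hypothetical and reproduce neither groupcache's nor Dropbox's API: `MutableCache` allows sets and deletes, while `FillOnceCache` loads each key at most once, which is what makes stampede prevention straightforward.

```go
package main

import (
	"fmt"
	"sync"
)

// MutableCache is a hypothetical general-purpose cache: entries can be
// overwritten or deleted at any time.
type MutableCache struct {
	mu sync.RWMutex
	m  map[string]string
}

func NewMutableCache() *MutableCache {
	return &MutableCache{m: make(map[string]string)}
}

func (c *MutableCache) Get(k string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

func (c *MutableCache) Set(k, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *MutableCache) Delete(k string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.m, k)
}

// entry holds one fill-once value; once guards the single load.
type entry struct {
	once sync.Once
	val  string
}

// FillOnceCache is a hypothetical immutable-style cache: each key is loaded
// at most once and never changes afterwards.
type FillOnceCache struct {
	mu    sync.Mutex
	m     map[string]*entry
	load  func(key string) string
	Loads int // how many times load actually ran
}

func NewFillOnceCache(load func(string) string) *FillOnceCache {
	return &FillOnceCache{m: make(map[string]*entry), load: load}
}

func (c *FillOnceCache) Get(k string) string {
	c.mu.Lock()
	e, ok := c.m[k]
	if !ok {
		e = &entry{}
		c.m[k] = e
	}
	c.mu.Unlock()
	// Concurrent callers for the same key block here until the first load
	// finishes, so a burst of requests triggers only one load.
	e.once.Do(func() {
		e.val = c.load(k)
		c.mu.Lock()
		c.Loads++
		c.mu.Unlock()
	})
	return e.val
}

// hammer calls Get for the same key from n goroutines at once.
func hammer(c *FillOnceCache, key string, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Get(key)
		}()
	}
	wg.Wait()
}

func main() {
	mc := NewMutableCache()
	mc.Set("user:1", "alice")
	mc.Set("user:1", "bob") // overwriting is allowed
	mc.Delete("user:1")     // so is deletion
	_, ok := mc.Get("user:1")
	fmt.Println("still cached after delete:", ok)

	fc := NewFillOnceCache(func(k string) string { return "contents of " + k })
	hammer(fc, "logo.png", 100)
	fmt.Println("loads for 100 concurrent gets:", fc.Loads)
}
```

The fill-once version has no invalidation path at all, which is exactly why it suits static files and not a database cache.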

~~~
SEJeff
Thanks for the comparison, that makes a ton of sense. As someone else also
pointed out, since a lot of Dropbox's infra is python, it does make sense for
them to have a drop in memcached replacement. That means groupcache is
effectively out.

------
mikebo
Interesting to note they released these back in July of last year.

------
carbocation
I'm a little surprised they went with memcache instead of groupcache (unless
they are also using memcache on their Python processes). Would love to know
more about that choice.

~~~
jdcarter
Probably because of other services--not necessarily written in Go--which
already use memcache. Don't fix things that aren't broken, right?

------
thomasfromcdnjs
Awesome work but I imagine at some point it really is going to make sense to
split those projects into their own repos.

------
wnevets
when/where is Go faster than Python?

~~~
voidlogic
1\. Go, being a statically compiled language/having a closer to the metal
memory model, generally has higher single threaded performance.

2\. Go allows true concurrency in many situations Python cannot (due to GIL
etc); Go also supports lightweight threads that are multiplexed over actual OS
threads.

3\. Go makes it easier to control and reason about heap allocations.

4\. Go is even easier than Python to integrate with C/ASM code.
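
The "lightweight threads" in point 2 are goroutines. As a small sketch (with `sumSquares` as a hypothetical helper), starting thousands of them is routine because the runtime schedules them onto a small pool of OS threads:

```go
package main

import "fmt"

// sumSquares starts one goroutine per integer in 1..n. Each goroutine sends
// its square on a buffered channel, and the caller adds the values up.
func sumSquares(n int) int64 {
	results := make(chan int64, n)
	for i := 1; i <= n; i++ {
		go func(i int64) {
			results <- i * i
		}(int64(i))
	}
	var total int64
	for i := 0; i < n; i++ {
		total += <-results // receive exactly n values, in whatever order they arrive
	}
	return total
}

func main() {
	// Ten thousand goroutines; the same count of OS threads would be far
	// more expensive to create and schedule.
	fmt.Println(sumSquares(10000))
}
```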

~~~
dragonwriter
> 2\. Go allows true concurrency in many situations Python cannot (due to GIL
> etc);

It would be more accurate to say that Go allows _parallelism_ in situations
where Python (in the standard implementation, at least) does not. Calling
parallelism (what the GIL prevents) "true concurrency" isn't particularly
helpful.

~~~
voidlogic
True, I was imprecise. Thanks.

------
didip
I hope Dropbox replaces the Python client agent with a Go one. That will hopefully cut
down the memory consumption.

------
noelwelsh
So I guess that Python JIT compiler
([https://tech.dropbox.com/2014/04/introducing-pyston-an-upcom...](https://tech.dropbox.com/2014/04/introducing-pyston-an-upcoming-jit-based-python-implementation/))
isn't working out so well then?

~~~
rcarmo
Snark aside, I don't see what one has to do with the other. It's perfectly
sensible for them to keep pushing pyston to have an alternate solution.

~~~
noelwelsh
I wasn't snarking, merely asking a question. It's valid to ask whether the
Python compiler is working out since the stated goal is "to produce a high-
performance Python implementation that can push Python into domains dominated
by traditional systems languages like C++."

Since Dropbox has rewritten a chunk of systems in Go it suggests that Pyston
isn't working out.

~~~
rcarmo
If you check out the announcement timings, it was far too soon for Pyston to
have an effect. You can't churn out PyPy-grade JITs like they were burgers...

------
cdnsteve
What version of Python does Dropbox use? 2 or 3?

------
gnufied
This is one of the cases where I wouldn't mind reverting the original title to
what OP linked and let readers draw their own conclusions.

------
thrownaway2424
I'm detecting a bit of a trend here.

* Start with python
* Find out it's a little horrible.
* Hire Guido
* Python still a little horrible.
* Switch to Go

~~~
calebm
Dropbox is not dropping Python: "Dropbox will continue to develop majority of
its features in Python. We have only migrated performance critical components
to Go."

~~~
alexbardas
It's more or less a matter of time until everything is migrated to Go.
Since it has better performance and works well for them, there's no reason not
to do it.

~~~
mitchty
Not necessarily. First easy reason for not migrating: the cost to do so
outweighs the benefits. Why port working code to a new language when all
you're doing is maintenance?

I don't know about you, but it's hard to sell upper managers on complete
rewrites of things when the end result is: no real change but it
might/should/could run faster. Unless performance is a concern that needs to be
addressed, taking on the risk of changing technology stacks doesn't seem like a
great idea.

~~~
thrownaway2424
You don't have to rewrite anything, the old language can still die. If you're
writing all your new code in X and your codebase growth is accelerating, the
existing code in Y will look less and less relevant over time.

~~~
mitchty
Perhaps, but it doesn't sound like all of their new code is being written in
Go.

