Disqus: It's Still About Realtime, But Go Demolishes Python (highscalability.com)
94 points by geerlingguy on May 7, 2014 | 48 comments

While I'm pretty sure that Go is much faster than python in nearly all tasks, I'm still skeptical that a) they built a full replacement of the system in only a week and b) that they didn't improve the algorithm in the go code. I work at a python/django shop and I recently updated a process that could only handle 5000 transactions in 5 minutes to be able to handle over 100,000 in just under 3 minutes. This was all in python. Would Go have been a lot faster? Sure, but how much is the new code path and how much is the advantage of golang?

Also want to point out that I fully realize that Go is a lot faster than python at performing the same task, but this kind of analysis is dangerous because it leads inexperienced engineers to believe that changing languages is a good performance optimization, when it should usually be the last step/idea considered. It's expensive and dangerous to switch languages midstream, and while it's sometimes the exact cure, it needs to be weighed very heavily to make sure it's the right situation. This article, or at least the headline, misses that piece of the puzzle pretty fully IMO.

Actually, I think "Go is faster" has two primary built-in points:

1. It's faster in terms of real performance - given its proximity to C, optimization at the compiler level for multiple platforms, that it's highly concurrent and run compiled (usually), this is no shocker.

But, 2. It's faster to transition to from a dynamic scripting language than it would be to transition to, say, C or C++ or Java, etc.

Not that it's syntactically similar (obviously), but that language, testing and deployment design won't present any egregious bottlenecks and that there's so much built into the core library that complements an http-based service such as this.

In other words, the prospect of going from Python (or Rails or PHP or whatever) to "faster language X" has historically imposed an inherent cost in terms of development that I think a language like Go mitigates to some degree by design.

> and run compiled (usually)

I think you are talking about 'go run', which one might easily mistake for an interpreter because of the speed of the compiler, but it's actually compiling your code and running it. Were you talking about something else?

I wasn't talking about go run, no - there have been a few examples of running go interpreted I've seen posted here. They may well just be experiments but I left the "usually" as my caveat.

Edit: I swear there was an interpreter demo'd, but I may well have been thinking of something like agora, which interprets Go-like code within Go.

I was actually fishing for cool stuff I hadn't heard of, like agora. I don't get to dive into very much Go news! Thanks

Not sure I agree with you there. IMO, porting between languages with similar abstraction levels is not that hard. I ported some Python code, whose logic took weeks of fine-tuning, to C# in less than a day. Go still has a garbage collector and its type system is similar in philosophy to Python's duck typing. Unless you are porting extra dynamic code (eval(), monkey-patching) or code deeply intertwined with an unusual library, it should be a fast process. On the other hand, I was leaving Python to get real OS threads, not to increase performance.

This is nitpicky, but Python threads are kernel threads. It's the GIL that restricts CPython to a single core.

OTOH goroutines are green threads multiplexed onto kernel threads (aka N:M thread model).
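
For anyone following the GIL vs. goroutines point, here is a minimal Go sketch of CPU-bound work split across goroutines. The workload (`sumSquares`) and all function names are made up for illustration. One caveat: since Go 1.5 the runtime defaults GOMAXPROCS to the number of available CPUs, whereas in the Go releases current when this thread was written you had to raise it yourself to use more than one core.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares is a hypothetical stand-in for CPU-bound work.
func sumSquares(lo, hi int) int {
	total := 0
	for i := lo; i < hi; i++ {
		total += i * i
	}
	return total
}

// parallelSumSquares splits [0, n) into one chunk per goroutine.
// The Go scheduler multiplexes these goroutines onto kernel threads,
// so the chunks can run on several cores at once.
func parallelSumSquares(n, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers) // one slot per goroutine, no sharing
	chunk := (n + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if hi > n {
			hi = n
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			partial[w] = sumSquares(lo, hi)
		}(w, lo, hi)
	}
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println(parallelSumSquares(1000, runtime.NumCPU()))
}
```

The equivalent CPython code using `threading` would still create kernel threads, but the GIL would let only one of them execute Python bytecode at a time.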

This obviously also depends heavily on the complexity of the application and the available resources.

Even so, a production-ready transition in a few weeks' time doesn't leave a ton of time for benchmarking and testing, but as someone who operates on a fly-by-the-seat ethos more often than I should, I'm in no position to be tossing stones around.

There have been articles talking about rewriting being less of a terrible idea if you're writing a SaaS. Not to mention the "throw one away; you will, anyway" idea. I'd be interested, though, in hearing more about what they rewrote into Go. It's most probable it's not all of their codebase.

Just our realtime system, and we run some asynchronous tasks with Go now, specifically to replace a few things that were done by Celery previously.

See this shitty thing I wrote to bridge the gap: https://github.com/mattrobenolt/go-celery :)

Yes, it sounds like it's a websocket server. A lot of that stuff is relatively low complexity if you glue everything together with a message queue since they can be made language agnostic very easily.

> I'm still skeptical that a) they built a full replacement of the system in only a week

Keep in mind that Highscalability bases its articles on other 3rd party research. They don't talk to the people they are writing about, even when those people (like me) offer to help them get their facts right.

In all likelihood it's just completely wrong.

While I do certainly make mistakes I take great pains to be as accurate as I can. And I do have a lot of original interviews on the site, though some problems have kept me from doing them lately. I give the sources directly in the article and they are directly from the people involved in the project. And if there are any inaccuracies in any article I would be happy to make corrections. I don't recall when you offered to correct mistakes, but if you did so and I missed it then I apologize.

I assure you, I built this in a week. :)

See, a first person source! :)

Nonetheless, my statement otherwise still stands.

It seems as if they replaced one component of the system in a week, not the whole thing.

It sounds like this is a very contained server process that does one thing: it pulls things from RabbitMQ and pushes them to nginx, which does the push stream. So this could be written in a week easily.

To be fair, there were some optimizations that were made as well. Everyone does that, right? :)

But overall, specifically our realtime service is a hybrid of CPU intensive tasks + lots of network IO. gevent was handling the network io without an issue, but at higher contention, the CPU was choking everything. Switching over to Go removed that contention for us, which was the primary issue that we were seeing.

Right, and that's awesome and A LOT more useful than the article, because that seems an ideal use case for switching to Go over Python.

This specific use case is multiplexing messages to lots of connections, which is almost the ideal use case of Go anyways. It wasn't a matter of re-implementing complex business logic. 1 week is reasonable given the specific problem.
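
To make the "multiplexing messages to lots of connections" shape concrete, here is a rough Go sketch of a fan-out hub. It is not Disqus's code: the `Hub` type and its methods are hypothetical, buffered channels stand in for client connections, and dropping messages for slow subscribers is just one possible policy.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans each published message out to every subscriber.
// All names here are hypothetical, for illustration only.
type Hub struct {
	mu   sync.RWMutex
	subs map[chan string]struct{}
}

func NewHub() *Hub {
	return &Hub{subs: make(map[chan string]struct{})}
}

// Subscribe registers a buffered channel standing in for one connection.
func (h *Hub) Subscribe(buf int) chan string {
	ch := make(chan string, buf)
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

// Unsubscribe removes the subscriber and closes its channel.
func (h *Hub) Unsubscribe(ch chan string) {
	h.mu.Lock()
	if _, ok := h.subs[ch]; ok {
		delete(h.subs, ch)
		close(ch)
	}
	h.mu.Unlock()
}

// Publish offers msg to every subscriber without blocking and returns
// how many subscribers had a full buffer and missed the message.
func (h *Hub) Publish(msg string) (dropped int) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for ch := range h.subs {
		select {
		case ch <- msg:
		default:
			dropped++ // slow consumer: skip rather than stall everyone
		}
	}
	return dropped
}

func main() {
	hub := NewHub()
	a, b := hub.Subscribe(4), hub.Subscribe(4)
	hub.Publish("new comment")
	fmt.Println(<-a, <-b)
}
```

In a real service the publisher side would be fed by the message queue consumer and each subscriber channel would be drained by a goroutine writing to a socket; both are left out here.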

> b) that they didn't improve the algorithm in the go code

The port to Go will do it (I think), because it has built-in facilities for concurrent programming across the whole library.

Go is looking more attractive, but the language is not just the language.

When Go has available a significant fraction (subjectively) of the "batteries included" that Python has, then I'll start investing time in Go. Is it there yet (subjectively)?

It's definitely getting there. I'd actually say that Go's network "batteries" are just about as comprehensive and possibly easier to use than Python's (ie, socket, urllib, urllib2, httplib, smtplib, pop3lib, etc.).

It's also pretty cool that Go has a full crypto library written in Go.

I was seriously looking at Go for a desktop app using the promising new go-qml, but there seems to be no elegant way to use Qt's system tray feature with go-qml. I'd say Go still has a way to go before it's got as many possibilities as Python for desktop GUI development.

But for command line utilities and network apps, I'll definitely reach for Go over Python now (although I'm increasingly playing around with Rust and hoping I can eventually just use Rust for just about everything).

Edit: Removed imaplib since Go has no equivalent in its stdlib.

I think Go is pretty competitive with Python's standard library. It doesn't have an FTP client (we use and are happy with github.com/jlaffaye/ftp) or an equivalent to imaplib (although github.com/mxk/go-imap/imap looks interesting), but for the rest of our workload Go has everything we need.

Aside: for SFTP we use (and like) https://github.com/minusnine/gosftp/

It depends on what batteries you want.

The Go standard library is pretty amazing as far as standard libraries go (I'd say it's more useful than python's, and definitely contains less cruft). Community provided libraries are expanding rapidly, but there is a strong focus on web services all around.

The standard library is actually quite good: http://golang.org/pkg/

Thanks to all replies. It looks like for work I could use it for personal automation/productivity, but it would be harder to convince others in the department to use anything I might write that depends on newish/non-established third party libs. I work for BigCo. I'm using xlrd, for example, in a few python scripts, and even that's going to raise an eyebrow or two.

As for home, I'm playing around with imap and some related things at the moment, and pointers to 3rd party libs might be fun for me to try. We'll see.

Their asinine policy against package managers (yes, let's go BACKWARD. Everything was better in the good ol' days!) is keeping me away from it. I don't have time to deal with more self-aggrandizing douche bags.

For a start, Go's SSL/TLS doesn't disable certificate checking by default ;-). Does that mean it's more "batteries included" than Python for TLS and every TLS-using protocol?
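
A small sketch of the certificate-checking default, for the curious. It starts a local `httptest` TLS server (which uses a self-signed test certificate), then tries it with a client built from a zero-value `http.Transport` and with the client that the test server hands out, which trusts that certificate. The `probe` function name is made up; the claim being illustrated is only that verification is on unless you opt out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probe (hypothetical helper) returns the error seen by a client with
// default TLS settings and by a client that trusts the test certificate.
func probe() (defaultErr, trustedErr error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprint(w, "ok")
		}))
	defer srv.Close()

	// Zero-value transport: certificate verification is enabled, and the
	// test server's self-signed certificate is not in the system roots.
	plain := &http.Client{Transport: &http.Transport{}}
	if resp, err := plain.Get(srv.URL); err != nil {
		defaultErr = err
	} else {
		resp.Body.Close()
	}

	// srv.Client() is configured to trust the test server's certificate.
	if resp, err := srv.Client().Get(srv.URL); err != nil {
		trustedErr = err
	} else {
		resp.Body.Close()
	}
	return defaultErr, trustedErr
}

func main() {
	defaultErr, trustedErr := probe()
	fmt.Println("default client error:", defaultErr)
	fmt.Println("trusting client error:", trustedErr)
}
```

Skipping verification in Go takes an explicit `InsecureSkipVerify: true` in the `tls.Config`, so it is hard to do by accident.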

> In only a week a replacement system was built

This part is telling to me. It means their codebase isn't so complex, and optimizing their application for their hardware would take as much time as (or less than) adding hardware. For people with large, complex codebases, scaling horizontally increases capacity much faster.

In the past Go has had trouble with garbage collection lag due to a global mark and sweep implementation[0]. Was this not an issue with Disqus's new implementation?

[0] https://groups.google.com/forum/m/#!msg/golang-nuts/S9goEGuo...

For the record Python uses a generational garbage collector.

You can run multiple instances of the same Go program to reduce the effect of GC lag

> yields it's own benefits as well.

yields its own benefits as well.

I was going to ask how pypy performed on their original python code but it looks like gevent doesn't support it yet (or didn't at the time).

> Node was not selected because it does not handle CPU intensive tasks well

I'm curious to hear what sort of CPU intensive tasks Disqus does.

why were they using python for performance critical code in the first place? Go seems closer to java's niche to me.

Disqus was originally built on Django: http://blog.disqus.com/post/62187806135/scaling-django-to-8-...

It seems that this is what the founding team was most comfortable with, so it makes sense that they proceeded to solve problems using tools they already knew well. At some point, they exhausted how far they could take their existing tools and started investing in new tools.

What Go web framework did they move to?

We're not using a framework. This is a tiny component and the rest of Disqus is still Django.

It doesn't say, but probably none. My guess would be they just used "net/http" + Gorilla (maybe). Gorilla: http://www.gorillatoolkit.org/

Given it sounds like this application is not "web facing" (i.e. not an API nor rendering HTML), the use of any "web framework" or Gorilla doesn't make much sense.

I write back-end stuff like this all the time and it often has admin interfaces.

I imagine at their size they're using raw Go without any framework.

They had a fairly popular post on the subject a while back:


In short, since they were not really pushing volume with Python, just feeding data to Varnish, throughput wasn't such a big deal.

Yes, why is Go constantly compared to Python?

For me, it felt comfortable writing Go, considering I've been mostly writing Python for the past 8 years. So the transition was a lot more natural, compared to using Scala or Erlang or anything else.

Again, this is my subjective opinion, and this is why we chose to use Go for our new stuff instead of something different.
