Ask HN: What highly scalable thing have you built with Go?
102 points by jonathan-kosgei on July 2, 2018 | 75 comments
I've just read this article by Randall Degges on how ipify.org scaled to 30 billion API calls a month on a few Heroku dynos after the app was re-written in Go.

Have you re-written any of your applications in Go and experienced significantly higher performance?

We wrote our bidder (in-app advertising) in Go. It is globally distributed (close to the exchanges) and handles 1.5-2M requests/s (OpenRTB, ~50-90k/s per instance) with a p99 of 10-20ms (without network latency). Really happy with Go, especially the GC improvements made by the Go team in the last few releases. For a previous similar project we used Ruby, which was quite a bit slower.

Similar problem space (we don’t bid but capture rtb data for analysis).

Similar throughput, our bottleneck at this point is moving data around.

We’ve abandoned channels for most of this. The next major improvement would be to rebuild the http stack & that’s just not worth it.

Someone shared this with me https://blog.golang.org/share-memory-by-communicating - Share memory by communicating.
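The core idiom from that post, as a tiny sketch (the work being summed is just filler):

    package main

    import "fmt"

    func main() {
        // Instead of guarding shared state with a mutex, give each
        // piece of data a single owning goroutine and pass results
        // over a channel.
        results := make(chan int)

        go func() {
            sum := 0
            for i := 1; i <= 100; i++ {
                sum += i
            }
            results <- sum // communicate the result; don't share the variable
        }()

        fmt.Println(<-results) // 5050
    }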

And also pointed me to nats.io, a messaging system that handles 10M messages per second on a $50 server.
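For reference, the Go client makes pub/sub about this short (a sketch using the nats.go client; the subject name is made up):

    package main

    import (
        "fmt"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect(nats.DefaultURL) // nats://127.0.0.1:4222
        if err != nil {
            panic(err)
        }
        defer nc.Close()

        // Fire-and-forget pub/sub; NATS keeps per-message overhead tiny.
        nc.Subscribe("events", func(m *nats.Msg) {
            fmt.Printf("got: %s\n", string(m.Data))
        })
        nc.Publish("events", []byte("hello"))
        nc.Flush() // ensure the publish reached the server before exiting
    }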

See the comment at: https://www.indiehackers.com/forum/how-we-handle-25m-api-cal...

Should fasthttp be used in production? It seems to get a lot of flak for not fully implementing HTTP.


Like every engineering decision it is a risk/reward spectrum. But unless you have profiled to know that net/http is your bottleneck, no, you should almost certainly not use fasthttp.
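For what it's worth, wiring up profiling to check that is cheap; a minimal sketch using the stdlib pprof handler:

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on the default mux
    )

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("ok"))
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

Then point go tool pprof at http://localhost:8080/debug/pprof/profile and see whether net/http actually shows up in the hot path.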

We use it in production, though fair warning: we have a very limited set of endpoints and consumers, which allows us to test it aggressively.

What I really meant is we'd need to take things down to the bare network stack. Lots of our memory use/bottlenecks are fairly deep into the std lib.

Same here - we don’t use channels in the hot code path and did replace the Go HTTP parser.

How large is your payload? Asking out of curiosity.

Between 2 and 40KB; the average is likely closer to 2-4KB (gzipped JSON or Protobuf).

At Uber we built Jaeger (https://github.com/jaegertracing/jaeger), which is doing something like 200K writes per minute into our Cassandra cluster.

I was curious how Jaeger's performance compares when running on an Elasticsearch 5.x/6.x cluster versus the Cassandra cluster you described.

Go with Cassandra... Elasticsearch is a pain in the arse. I have yet to meet someone who has confidence in it as a primary data store, except with the lightest of workloads.

Personally I don't know. We don't use the Elasticsearch backend internally.

A BitTorrent client (https://github.com/anacrolix/torrent) and several projects using it. The original idea started in Python, which just didn't cope. Things are probably better now in Python with green threading being a standard concept, but you can't easily get the throughput you need without minimal per-connection overhead, and that overhead is just too high in Python.

Which is strange, given the first version of BitTorrent was written in Python.

Also, the default torrent client in Ubuntu is Deluge, which is in Python too. I download 20 torrents at 20M/s without any issue with it.

But I get that using threads, downloading from 1000 clients is annoying to code, and indeed you are right: today with asyncio, the story is different.

And I definitely understand the appeal of Go for network concurrency for a lot of projects.

I just don't get how Python didn't fit the bill for this particular one. It's just a client, after all.

Maybe I'm missing something.

First guess: Packaging and distributing python binaries is more difficult than it is with Go.

(I like Python a lot; I've written tens of thousands of lines of it. But its code packaging systems leave much to be desired).

Oh yeah, no argument there. Even now, with the fantastic Nuitka that compiles Python seamlessly to a standalone executable, you still don't have the cross-compilation story Go has. And you have to be careful with libc.

But the thread is about highly scalable things, isn't it?

Packaging is inherently harder in Python, because of the emphasis on modularity. But conda does a good job. I've been very happy with Miniconda.

We are talking about end user packaging.

Lib packaging is a solved problem. For installing, people just use pipenv. For creating, it's just a two-line setup.py file and an ini file to fill in now (http://setuptools.readthedocs.io/en/latest/setuptools.html#c...)

Yeah, in its early days the torrent network was small and peer counts were limited. The original, and many later, Python implementations used event loops, which sidestep concurrency implementation overheads (like threads) and very often do a lot of heavy lifting in C. I'm not sure Deluge is Python for the torrent part; the standard is to use libtorrent these days.

Python can handle a torrent client with the appropriate tools, but you just have to be extra careful about algorithms etc.

Deluge gets really bad as torrent counts grow; after 1000 it's basically unusable. Most people who do long-term seeding use multiple instances of Transmission or rTorrent. There are some manager tools to help with that, which usually also do the balancing.

I used to split them up at around 3000 seeding torrents per client, which worked fine.

Ah, ok. That's a very specific use case, though. Regular users won't do that. But I can understand that non-asyncio Python is not good for that specific usage.

Isn't Deluge's actual torrent handling all implemented by libtorrent, which is C++?

Python is C. The UI is GTK, which is C. Python is always mostly sugar on top of C.

What is scalable about a client?

You're associating the word scalable with the typical web-scale definition. Don't do this. The OP didn't specify what kind of scaling, and a torrent client has a lot of scalability (vertical?) in the way it manages connections and DHT.

It can scale to all 4 of your cores and burn your battery... kidding aside, most BitTorrent clients handle hundreds or thousands of connections, plus sparse files, as piece counts and the number of running torrents increase, so scalability is a concern here.

I think that's not the definition of scalability; it just means it is resource-efficient. Scalability usually means that a service can handle load beyond a single node's capacity, at least for me.

I once wrote a thing that went all over Gmail figuring out where all the pending mail was supposed to go and presenting it as an interactive dashboard. It was easy to do in Go because baking in the HTML templates, static assets (like d3), and backend logic is pretty simple with Go's standard libraries and build system.

I wrote another thing in Go that determined the backend latency of an anti-abuse system within Google. That prober made about ten million requests per second. Again I chose Go (over C++) not for its performance but for the ease of giving that thing a fancy interactive status page.

Curious how Go made the interactive dash easier. Maybe something like a direct connection between channels and websockets?

Go has the html/template package which is a type-safe template language for generating html, with loops and whatnot. It’s awesome.

Go also has built-in support for blurting out any structure as JSON so it’s dead simple to write jsonp handlers.
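For anyone who hasn't seen it, the whole pattern is roughly this (the struct and page content are made up):

    package main

    import (
        "encoding/json"
        "html/template"
        "log"
        "net/http"
    )

    // Exported fields are visible to both html/template and encoding/json.
    type Status struct {
        Service string
        Pending int
    }

    var page = template.Must(template.New("status").Parse(
        `<h1>{{.Service}}</h1><p>{{.Pending}} pending</p>`))

    func main() {
        cur := Status{Service: "mail-dashboard", Pending: 42}

        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            page.Execute(w, cur) // contextual auto-escaping for free
        })
        http.HandleFunc("/status.json", func(w http.ResponseWriter, r *http.Request) {
            json.NewEncoder(w).Encode(cur) // any struct straight to JSON
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }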



Gotcha - I thought by interactive you were perhaps referring to realtime capabilities.

At SendGrid, much of our stack is Go but started off as Perl and Python. Our incoming clusters are geographically distributed to reduce latency, but a handful of nodes do just fine processing 40k rps. We could dramatically reduce cluster size, but choose not to for reasons around availability. These incoming requests generally create four to eight logging events that are processed and emitted for stats, tracking, and/or published to customer webhooks. Additionally, our MTA is now in Go, and each incoming request usually has some multiplier for the number of recipients.

We typically expect around a 20x improvement in throughput when we rewrite a service. Granted this depends on the nature of the service.

As much as reduced server costs and greater performance are awesome, one of my favorite parts is the increased maintainability of the services. Perl's AnyEvent and Python's Twisted (aptly named, btw) were much harder to reason about. Go's concurrency and simplicity make it a win for us.

By "concurrency and simplicity" are you primarily referring to Go's CSP model / first-class support for channels and goroutines?

Correct, for the concurrency model. For simplicity, while the concurrency solution fits there, I intended "the simplicity of the language." There is usually one clear way to do something and jumping into a new code base is usually pretty easy.

Nice - thanks :)

A significant proportion of Segment's event-handling pipeline and processing code is written in Go. This includes our "Centrifuge" system for ensuring reliable event delivery to HTTP destinations, which we recently blogged about: https://segment.com/blog/introducing-centrifuge/

With the exception of C (or perhaps Node.js for single-threaded programs), I can't imagine we would be running as efficiently on our AWS compute resources if we'd written our code in a different language.

I've written many distributed systems in Go for scalability reasons and more recently have been working on micro, an open source toolkit to help others do this https://github.com/micro/micro. The core of which starts with go-micro, an RPC framework for building cloud-native applications https://github.com/micro/go-micro.

Building systems that scale is not an easy task. Go lends itself very well to this, but at the same time more is required than just the language. The community's belief in libraries rather than frameworks actually hinders this progress for others. Hopefully other tools like my own will emerge that sway people towards the framework approach.

Hi, at UserEngage we rewrote a few modules from Python 3 to Go.

On the Python side we use Django, with Postgres & Citus for the database, RabbitMQ, and Redis.

Our main cluster handles more than 25 million API requests daily.

For us Go is >70 times faster than Django.

Awesome! I've always thought Django to be terribly bloated. Go seems like the best option for any kind of API.

> I've always thought Django to be terribly bloated.

Well, it depends. It does handle clickjacking, XSS, i18n, l10n, identification, authentication, database handling, CSRF and much more out of the box.

These are things that you eventually code yourself, or that you should disable if you don't need them.

What usually happens is that people delegate this complexity more and more to the front end, which becomes, indeed, bloated. Or they install more libs, and write glue code for them because they are not integrated.

All in all, Django is slow compared to Go, but it's not bloated if you take it in the context of the necessary work it saves you. It's opinionated. It has sane defaults. It's ready to use.

Unless you plan to have a static website, or an unsafe one.

For 90% of applications Django has everything they will need from an application framework and is fairly easy to manage.

If you're in the 10%, yeah, it can be a pain in the butt. However, it's not impossible or even particularly difficult to migrate to something else.

TiDB is a distributed HTAP database compatible with the MySQL protocol (https://github.com/pingcap/tidb).

Go-Jek and Grab are both very strong in Southeast Asia in ride sharing. Both their backends are (re)written in Go. Each had a presentation on it during GopherCon 2018 in Singapore. They were also big sponsors of that event: https://2018.gophercon.sg/

[Edit: one of them had some impressive stats on reducing servers while increasing demand]

We have built a small geo-redirection server using Go and Redis that handles around 50M requests per day. We optimized our stack to cut TTFB, and Go made that easier to achieve.
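Not the poster, but the shape of such a service is roughly this (a sketch using the go-redis client; the key scheme and fallback URL are made up):

    package main

    import (
        "context"
        "log"
        "net"
        "net/http"

        "github.com/go-redis/redis/v8"
    )

    var rdb = redis.NewClient(&redis.Options{Addr: "localhost:6379"})

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            ip, _, _ := net.SplitHostPort(r.RemoteAddr)
            // Hypothetical key scheme: geo:<client-ip> -> target URL.
            target, err := rdb.Get(context.Background(), "geo:"+ip).Result()
            if err == redis.Nil {
                target = "https://example.com/default"
            } else if err != nil {
                http.Error(w, "lookup failed", http.StatusBadGateway)
                return
            }
            // One Redis round trip, then a 302 keeps TTFB low.
            http.Redirect(w, r, target, http.StatusFound)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }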

Very impressive, do you have a write up on this?

High performance != highly scalable. I was really hoping for some discussion of highly scalable systems, not necessarily of optimizing database queries.

High scalability is somewhat subjective, and often not that interesting. It's relatively easy to design an inefficient solution and "scale" it by throwing more machines at it. It's hard to reduce the footprint of something that's big and/or slow.

The hardest problem I've bumped into is getting the model right for making something distributed at all.

Yes! I'd be curious what Go does differently from other languages that helps with that second part.

Same thing I'm looking for :)

Average request time went from 8 seconds down to around 80ms. The main issue was that the original component in the application was making heavy use of the ORM; honestly, had I taken a step back and hand-crafted the queries and used plain old PHP objects, I likely wouldn't have needed to rewrite in Go. Doing so, however, was a really great exercise, and it leaves perhaps the most complex part of the application nicely separated from the CRUD side of things, so I still think it was a pretty good move.

Message queue specifically built to meet our needs, writing to an optimized XFS volume.

Saved us a ton over the pre-built queues we'd been using.

> Message queue specifically built to meet our needs

Mind sharing the needs not met by "standard" message queues (RabbitMQ, Apollo, NSQ, etc.)?

> optimized XFS volume

Is this just some specific mount options, or is it a more complex setup?

Thanks. Asking out of curiosity.

Noticed I missed the XFS question. Honestly I'm not entirely sure; DevOps spent a fair deal of time fiddling with it to get the best results. We switched to XFS from ext4 because we were running out of inodes under heavy traffic.

In our case: very large messages without the need for a separate store, autoscaled queue managers, instant recovery, and zero chance of double-processing messages.

I guess in the end the database will be the bottleneck in many cases.

Totally agree, but I'm looking for more case studies like the one I shared.

YTBmp3 (https://www.ytbmp3.com) is built completely with Go. The autoscaling of cloud instances based on load, via a custom scaling solution, is also written in Go. It handles very long-running requests doing streaming transcoding and compression. Go's net/http is exposed directly to the internet with great results, which adds to the infrastructure simplification, as does graceful reloading (important for long-running requests). Go allows so much infrastructure simplification that there is not even a single container involved. :)
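For the curious, graceful draining with plain net/http has been in the stdlib since Go 1.8; a minimal sketch:

    package main

    import (
        "context"
        "log"
        "net/http"
        "os"
        "os/signal"
        "time"
    )

    func main() {
        srv := &http.Server{Addr: ":8080"}

        go func() {
            if err := srv.ListenAndServe(); err != http.ErrServerClosed {
                log.Fatal(err)
            }
        }()

        // On SIGINT, stop accepting new connections and let in-flight
        // (long-running) requests finish before the process exits.
        stop := make(chan os.Signal, 1)
        signal.Notify(stop, os.Interrupt)
        <-stop

        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
        defer cancel()
        srv.Shutdown(ctx)
    }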

> Have you re-written any of your applications in Go and experienced significantly higher performance?

Probably not what you're looking for, but I improved the runtime of my shell prompt by one order of magnitude by porting from Python to Go. https://blog.bethselamin.de/posts/latency-matters.html and discussed at https://news.ycombinator.com/item?id=15059795

I think I remember coming across this. Thanks for sharing!

CockroachDB, though it was conceived in Go from the start rather than rewritten.

CockroachDB is written in Go? That's awesome, I didn't know that. I may look more into it now.

The DNS server (https://github.com/abh/geodns) for the NTP Pool (https://www.ntppool.org/en/) does close to a hundred billion queries a month across a bunch of tiny virtual machines around the world. The steady state load is about 30k qps, but with frequent brief spikes to many times that.

We wrote the https://trackcourier.io frontend in Go. It's been really stable so far and is ridiculously fast.

Could you describe what you mean by this? I briefly skimmed the html for the page at that link and that page is clearly using angular.

The server is probably a frontend to something else.

I migrated an image upload service from Python that was using tens of servers to Go, using only 3 servers while handling 500% more traffic than the old version.

I've never seen a Python website handle uploads in Python itself. You always delegate that to a front-facing server like nginx, then you do stuff in a task queue that calls ImageMagick and co.

Where was Python the bottleneck in your old architecture?

I rewrote a network TCP scanner in Go (about a year ago) and it performed much better. It scans a /12 for 50 common server ports in roughly 40 minutes. https://github.com/w8rbt/netscan
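The concurrent core of a scanner like that is tiny in Go; a toy sketch (the real project above handles whole CIDR ranges, many more ports, etc.):

    package main

    import (
        "fmt"
        "net"
        "sync"
        "time"
    )

    func main() {
        ports := []int{22, 80, 443, 3306, 8080}
        var wg sync.WaitGroup

        for _, p := range ports {
            wg.Add(1)
            go func(port int) { // one goroutine per probe is cheap in Go
                defer wg.Done()
                addr := fmt.Sprintf("192.0.2.1:%d", port) // RFC 5737 test address
                conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
                if err != nil {
                    return // closed or filtered
                }
                conn.Close()
                fmt.Println("open:", addr)
            }(p)
        }
        wg.Wait()
    }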

What language was the original one written in?

We wrote a simple FCM/APNs gateway in Go. Not sure what its throughput is these days, but it must be hundreds of millions or billions of requests per day. (Life360)

When you wrote:

> how ipify.org scaled to 30 billion API calls

I was thinking "an hour" and thinking "damn, that's impressive" (it would be 8.3 million per second), and obviously per minute or per second would be even more impressive (and someone would be using it really heavily). Instead, the end of that sentence is "per month", which works out to 30 billion / (30.42 days × 86,400 s/day) ≈ 11,415 per second.

You don't need to "scale" for that :) you just need 1 good server.

Go is like a web-safe C. :)

By the way, here is the article being talked about:


Thanks for linking to that. Let me include it in the question.
