Large-scale, semi-automated Go GC tuning (uber.com)
122 points by mulkave on Jan 12, 2022 | 104 comments



This looks like what people have been doing with Java for years, except worse, because Go lacks tuning options. A common approach with Java is to bootstrap your service with a max heap size based on the environment.

https://eng.uber.com/jvm-tuning-garbage-collection/

Here's another blog from Uber on JVM tuning.

Some notes:

1. They have options

2. They have a powerful gc log, no need to roll your own finalizer thing like this

3. Choosing the max heap size, which seems like what they actually wanted here, is trivial

You could say "but that's more complex", but to me it's that the JVM has far more mature features, tons of tooling and options that they could explore and adopt, and that the obvious wins are trivial to achieve through basic parameter tuning.

Further, this GOGC parameter seems to be a very weird knob. The knobs you'd run into with the JVM are often a lot more straightforward, e.g. a static value for heap size vs. some number that's based on a working-set percentage.

I wonder if over time Go will end up with a tunable GC.


This to me looks like the classic "language 1 needs you to write more code, language 2 needs you to know and use more features", which is an endless debate.


I don't really see it that way. The JVM has a lot of tooling already for GC tuning and a lot of simple, powerful knobs to turn. It seems not only simpler and more straightforward to tune the JVM but the ceiling for what you can do is much higher.

With Go there's one parameter, and in my opinion it's a very strange one. It also seems strange to have to (imo) hack GC metrics in using finalizers, whereas with the JVM it's simply provided to you.

Full disclosure though, I think Go is a bad language, so I'm biased.


It's weirdly black and white to think a language is "bad", particularly one in heavy use.


I think C is a bad language. Necessary at the time? Sure. Is everything about it bad? Of course not. But by modern standards it’s bad, and the sooner we can migrate code away from it and onto languages with fewer footguns the better.

It’s weirdly black and white to assume that just because someone thinks a language is “bad”, that opinion doesn’t have nuance.

Full disclosure, I also think go is a bad language ¯\_(ツ)_/¯


I don’t understand. You think ideally there would be 0 code written in go? You don’t see anything interesting or appealing about it? Also, calling a language bad is black and white; observing that is not black and white.


I just tried to make the point that thinking a language is “bad” doesn’t mean that there isn’t nuance behind that opinion, and so you immediately jump to the conclusion that I believe such a language has zero redeemable qualities and should be stricken from the Earth?

Hell, I enjoy writing C but I still think it’s a bad language.


If you want me to think you have a nuanced opinion, express a nuanced opinion. The opinion you expressed is unsubstantiated, and instead of substantiating it, you’re attacking me. Poor showing mate.


The grandparent was disclaiming his bias regarding his own personal dislike of golang. A full-blown point-by-point critique would have been unnecessary and inappropriate in that context.

Nothing I’ve replied here has been an “attack” on you. I simply tried—gently at first—to suggest that “x is bad” should not be equated with “x is irredeemable” which is not exactly a charitable or reasonable interpretation in the context in which that statement was originally written.

Further, I directly expressed a more nuanced opinion as an example in my reply, and you still chose to discard that nuance and interpret the opinion as black and white.

I think you’ll find that on a scale of positivity from -1.0 to +1.0, most people perceive “bad” as somewhere along the lines of “< 0.0” and not “= -1.0”.


Why do you think go is bad?


Why does it matter so much to you that I do?

I have zero interest in rehashing an argument that’s been made here hundreds if not thousands of times already, and by others far more eloquent and convincing than myself no less. Feel free to read my post history. Or simply search for virtually any golang-related post on this site. Whatever arguments you find, I probably agree with at least 80% of them.


Ok, in summary, go is bad, there’s a nuanced opinion behind it, and you won’t tell me what it is. Also, I’m bad for assuming that this is a black and white point.

Have a nice day.


I am genuinely, honestly confused as to how you cannot see how all these points are congruent with one another.

I believe go is a bad language. I have nuanced, lengthy, and detailed opinions behind that belief which stem from 24 years of software engineering, 4 years of professional experience specifically with golang, and professional experience writing, deploying, and maintaining production software using C, C++, Rust, Java, Ruby, Perl, and JavaScript. And I have zero interest in rehashing the past twelve years' worth of arguments against golang with someone who's repeatedly signaled a frustrating level of obstinance.

Whatever wild conclusions you choose to jump to from there are your own doing, not mine.


What I am getting from you are a lot of defenses and very little substance. You can claim to have nuanced opinions and no interest in discussing them, but then why are you even talking to me? What is the point of declaring that go is bad, if you are unwilling to discuss it? Is that supposed to be persuasive? I'm here to have a discussion on a discussion board. Your position is inconsistent with your actions.

Also: I don’t believe you. You have provided no evidence that you actually have a nuanced opinion; you’ve simply insisted upon its possibility. And I don’t think there’s any reason I should believe you.

It feels like trying to get Trump's tax returns. “They’re great returns,” he insists, but he will generate all sorts of arguments to try and stop you from actually seeing them.


I don't see anything interesting or appealing about Go at all. I've read and written Go, I've watched Pike's talks on it, and I follow its development.

I could talk a lot about why I think it's a bad language, it would be hard to summarize it since I'd want to cite Pike's talks on "simplicity", articles on Go's GC implementation, discuss error handling, what I think makes a language "good", etc.


false dilemma: if you are too mediocre to study your tech you can still write more code; it's not an exclusive property of language 1


I think the GP's point is that, starting from 0, both require effort AND are reusable.

The "debate" is usually between people who have committed to either path (creating tools or learning tools) and think that just because the marginal cost of my approach, for me is 0, it must be 0 for everyone else. Which is patently false.


I feel we can safely assume that tuning the JVM GC might cause behavior that’s pathologically slow but never actually incorrect, which is inherently safer than rewriting a bunch of production code (probably more than once) hoping for a similar result.


seems like they were able to tune it just fine, given that they wrote a library.


I didn't say otherwise, I'm just comparing the two approaches. We have a very interesting, rare situation where one company has published a blog post on GC tuning across two languages, it seems appropriate to compare them.


I've written my own Go (subset + extensions) -> C++ transpiler and am using it on a game project: https://www.youtube.com/watch?v=8He97Sl9iy0 -- No GC; it does have slices and access to an entity/component API, and with that I think you're basically set and don't need GC for games.

Example transpiler input / output: https://github.com/nikki93/gx/blob/master/example/main.gx.go... becomes https://gist.github.com/nikki93/97ff376abb6718427387bb9cca2f... Can call to C/C++ (including templates) w/o overhead.

That said, for logic that is heavy on async and escaping closures like how a lot of Go server code tends to be, a GC is maybe a reasonable tradeoff?


> I think you're basically set and don't need GC for games.

This is kind of a semantics question, I think. The vast majority of games need some sort of lifetime management, and it's just a question of who does the lifetime management and what mechanism you use for it - refcounting, a mark/sweep GC, freeing everything at certain points of time, arenas, etc. If you're using an entity/component system to manage lifetime you have a GC - you wrote it.

In my experience shipping games in C# compared to shipping them in C/C++ - you can do everything without the GC touching your stuff if you're really dedicated, but it's often not worth the trouble considering that any modern GC can handle scattered temporary per-frame allocations for you no problem with very minimal pause times, as long as you're thoughtful about it and the set of objects it needs to walk isn't too big. For example, if your data structures mix native data with pointers to object instances, a GC will have to sweep all of that data - splitting texture references out of a big table of draw calls means that the draw calls are now pure data and they don't need to be swept by a GC.
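That draw-call split can be sketched in Go as well (hypothetical types, just to illustrate the idea): a slice whose element type contains no pointers is skipped wholesale during marking, while one holding references must be scanned element by element.

```go
package main

import "fmt"

type Texture struct{ id uint32 }

// Pointer-bearing element: the GC must scan every element of a
// []DrawCallPtr during marking, since each may hold a live reference.
type DrawCallPtr struct {
	Tex  *Texture
	X, Y float32
}

// Pointer-free element: a []DrawCall is "pure data", so the marker can
// skip the entire backing array in one step, no matter how large it is.
type DrawCall struct {
	TexID uint32 // index into a separate texture table
	X, Y  float32
}

func main() {
	textures := []Texture{{id: 7}}
	calls := []DrawCall{{TexID: 0, X: 10, Y: 20}}
	// Resolve the texture through the side table instead of a pointer.
	fmt.Println(textures[calls[0].TexID].id) // 7
}
```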

Personally I prefer always having access to a GC because it means code that doesn't need careful lifetime management can be simpler to write and doesn't have issues like double-frees hiding inside it - things like automated tests, configuration UIs, debug consoles, and things you run once at startup or when loading a level. You can often go back and optimize some of this stuff later, too - for example LINQ is a notoriously messy feature in .NET's standard library that allocates tons of short-lived garbage, but the compiler makes it possible to replace all those LINQ data structures with non-allocating ones without having to rewrite your queries - but doing that moves costs elsewhere.

If you're getting specifically harassed by pause times you're likely going to be paying costs with other systems, like if you use refcounting any time you touch that refcount you're burning cpu cycles and pushing other stuff out of cache (and the refcounting gets much more expensive if you have to use atomics for thread safety).


I agree completely with this. There's been a widely-held misconception, brought on by certain programming styles and libraries, that GC implies trashy, fat programs that are wasteful. But in fact, as you point out, programs that are tight about memory allocation, but nevertheless GC'd for a few things here and there, can still run great with little overhead. Allocation discipline is more important than most other considerations, IMHO.


I also tried running vanilla Go in Wasm (one of the targets I want to support) directly and was hitting GC pauses and frame drops every second. Garbage generated due to temporaries has also been an issue in an engine I worked on that had Lua scripting. This isn't just a theoretical thing; it's clearly a problem. When you say "programs that are tight but" and "allocation discipline", it's like: why add the cognitive overhead when you can not have the problem at all? I want to / want people that use this engine to be able to just write naive / straightforward code and get a high perf ceiling off the bat. GC is not actually helping in the game case, since game logic is ultimately explicit about entity lifetimes.

So there isn't a reason to use it and then work around it. If anything I think it's because the ergonomic language work after C++ (C#, scripting languages) tended to include GC so using them meant having it. This is an exploration in having a language that doesn't do that and preserves ergonomics (and it's working / promising).

All that said, the GC thing isn't the main or only reason that motivated this approach, it's just one of the points among everything else. Portability (the resulting C++ compiles and runs in Wasm, native desktop, mobile is supported) and control (being able to decide language and resulting execution semantics, having direct integration with tools) are the main things, at a high-level.


Yup, pretty much. I do think that "GC" usually these days refers to a mark and sweep GC (certainly in the context of a discussion about Go?) or at most a refcount kind of thing applied to all reference-like language entities, not "any lifetime management system," the way I see folks use that term.

But yeah I've found that the entity component data structure is a good lifetime management system very well suited to the game scenario, so there's no need for a different / more complex thing. And this is an exploration in how that + a simple / ergonomic language around it (along with growable arrays (slices)) pan out when making games in practice. There's no manual free calls or lifetime management anywhere in the code.
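For readers unfamiliar with the pattern, here's a toy sketch (my names, not the engine's) of generational-index handles, the usual way entity/component systems catch stale references to destroyed slots:

```go
package main

import "fmt"

// Entity is a handle: a slot index plus a generation counter, so a stale
// handle to a destroyed (and possibly reused) slot is detected instead of
// silently aliasing new data -- the use-after-free of this scheme.
type Entity struct{ index, gen uint32 }

type Pool[T any] struct {
	data []T
	gens []uint32
	live []bool
	free []uint32 // recycled slot indices
}

func (p *Pool[T]) Create(v T) Entity {
	var i uint32
	if n := len(p.free); n > 0 {
		i = p.free[n-1]
		p.free = p.free[:n-1]
		p.data[i], p.live[i] = v, true
	} else {
		i = uint32(len(p.data))
		p.data = append(p.data, v)
		p.gens = append(p.gens, 0)
		p.live = append(p.live, true)
	}
	return Entity{index: i, gen: p.gens[i]}
}

func (p *Pool[T]) Destroy(e Entity) {
	if int(e.index) < len(p.data) && p.live[e.index] && p.gens[e.index] == e.gen {
		p.live[e.index] = false
		p.gens[e.index]++ // invalidate all outstanding handles to this slot
		p.free = append(p.free, e.index)
	}
}

// Get returns nil for destroyed or stale handles.
func (p *Pool[T]) Get(e Entity) *T {
	if int(e.index) < len(p.data) && p.live[e.index] && p.gens[e.index] == e.gen {
		return &p.data[e.index]
	}
	return nil
}

func main() {
	var pool Pool[string]
	e := pool.Create("player")
	fmt.Println(*pool.Get(e)) // player
	pool.Destroy(e)
	fmt.Println(pool.Get(e) == nil) // true: stale handle detected
}
```

Lifetimes are explicit (Create/Destroy), storage is contiguous, and there is nothing for a tracing GC to do, which is the point being made above.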

Re: "using GC but then needing to / being thoughtful to make sure it's going ok" -- that's the thing. It seems better to not have to need to think about it, by having a system that's better suited to the thing you are working on. You also don't need to "often go back and optimize some of this stuff later too" because it just already has good performance with the straightforward code, and you don't need to add complexity. "splitting references out / pure data" -- that is indeed what the language nudges you to do by only having pure data. Essentially: yes, you can achieve the desired thing with intentionality and extra cognition in a different system (the same was true with other kinds of cognitive overhead in C++) and this is an exploration in developing a language + tools that focus on and bias toward the desired thing by default. Like I'm imagining folks getting started with gamedev using this + the integrated tooling and internalizing the practices you're talking about (that's a stretch vision, the current scope is to just build and test it in the context of one specific game project).

I'll be releasing this engine + an example game with it soon, but here's what the code for this main game project (the one in the video) looks like (all the components, the top-level game loop, and then some example game logic): https://gist.github.com/nikki93/0425d9ead9eb7810075434d006f3... It's just data structures, and then functions that do gameplay stuff on them. No lifetime management.


> Yup, pretty much. I do think that "GC" usually these days refers to a mark and sweep GC

Mark-and-sweep is antiquated for GCs. There are some edge cases where it is still useful, but most heavily used systems should use more modern GC algorithms, e.g. generational GCs.


Not trying to belittle you or anything, but I have a hard time understanding why someone would use Go without goroutines and channels.


I'll read your statement as a question and answer that.

I was using C++ for data-oriented gameplay coding, and I was interested in exploring making a language frontend that compiles to it to clean it up, as a side project (lots of dark corners to run into with C++, and I collected some experience on what those were since I'm managing a C++ game engine codebase at work). I needed a core that was basically a cleaned-up C, which is what the C-ish core of Go is (the part other than goroutines, channels and GC), and Go has a good parser and typechecker library you can use. Goroutines and channels are cool for distributed server code or whatever, but not actually that useful for game programming. The main thing is having structs, procedures, some nice ergonomics over those (slices, type inference, non-escaping lambdas, occasional generics) and then metaprogramming so you can reflect over the data structures and have serialization and inspector UI. These are the elements actually relevant to game programming.


> I needed a core that was basically a cleaned up C

Why didn't you consider D in 'Better C' mode? (https://dlang.org/spec/betterc.html) Not only does it already exist but with very high likelihood is more polished (by virtue of the man-years already invested into D) than a single person's ad-hoc compiler of a subset of Go to C++ could probably be. Unless of course you absolutely needed to use some pre-existing Go code...


I actually tried D as part of this exploration. I forget but it was confusing which of dmd or the llvm one to use, and ultimately the language ergonomics were not that much of an improvement over C++. Like "auto" is still super weird. I'm going for an actual improvement here, not another botched but slightly improved thing from the past. I think D's tooling worked the least stably out of the box of everything I tried. It wasn't promising. That said, D was one of the things I explored the semantics of for ideas, and it's good that it tries to do the metaprogramming stuff. UFCS is also ok, but needing to decide between foo(o) and o.foo() at each callsite is actually mental overhead vs. the choice being decided in Go (I wrote the whole engine and game in Nim before, so I have experience with this).

Just because something has a bunch of years in it doesn't mean it's a good idea for a specific context. The transpiler I have here is just 1500 lines of code and captures all the semantics it currently supports. It uses Go's parser and typechecker from the stdlib and feels on the whole more polished than D as a result (generics are definition-checked, Go's package / module system just work, all the existing Go editor support and godoc etc. just work, ...). It's much easier and straightforward to metaprogram by just editing this simple piece of logic than squeezing it into language features (I've also done the same engine in Nim, explored in Zig, and written it once over in C++). I can, for example, make it so if you mark a function a certain way, it's also compiled to GLSL and useable as a shader (with structs shared). Or make it so types marked a certain way have all their pointers reference counted. There's way more control in this scenario, and the point is to have control to take matters into one's own hands and actually improve things.


Go without goroutines is still the same simple, straightforward language that people love to work with. The vast majority of Go code being written doesn't use goroutines.


If you want to compare languages on features, lack of complexity is also a feature.


But lack of proper abstractibility, expressivity is not a “feature”.


Lack of other people's abstractions is a feature too :)


If it is a bad abstraction, sure. But a good abstraction is much, much easier to understand than God knows how many lines of code with God knows what program flow.


I don't think goroutines / channels is a clear win or even that relevant for data-oriented gameplay code. I go into this more in-depth in this comment: https://www.reddit.com/r/golang/comments/r2795t/i_wrote_a_si... (it addresses a bunch of points)

This is what the game code looks like: https://gist.github.com/nikki93/0425d9ead9eb7810075434d006f3... -- I don't think CSP helps much to improve on that while keeping serializability of state.


After all, abstraction is a layer of indirection, hiding meaning.

What is a "proper abstraction"?


Feel free to add together the set of a set of an empty set and a set of a set of a set of an empty set, but I prefer calling it 2+3.

Or feel free to manipulate the voltage in some wire, and make sure that it is reliably understood at the other side as the same bit pattern you sent, but I prefer issuing an HTTP request. These are all abstractions; hell, there is no field building as much on abstractions as IT does. We have to be on like 8-9 levels of abstraction to even do anything non-trivial.


What a joke. Go's GC was always pushed as a 'no tuning required' solution, as opposed to e.g. JVM collectors. As it turns out in real practice, there's no one-size-fits-all solution, and it does require tuning in non-toy applications. Go just keeps repeating the same mistakes that were solved elsewhere 40-50 years ago.


oh my a company that runs on over 70k CPUs has atypical problems. surprise surprise. oh and look, they were able to resolve it with a few lines of code. the horror.


I think this is an overly harsh take. In many applications Go's knob-less GC works well enough and avoids any fiddling at the cost of non-optimal memory utilisation in some cases. It's not obvious to me that adding loads of knobs to help tuning for special cases is an overall win as it undoubtedly adds complexity for many users who are not bothered about this fine tuning.


Given that two of the original team members also ignored the state of art of systems programming and just got lucky with way UNIX licensing was managed in the early years, Go's approach to language design isn't a surprise.


That "lucky" is conflating two things: UNIX winning and UNIX surviving. The second part is not chance based, imho. So open question is weather any of the contemporary "state of the art" approaches would not have run into design issues that become manifest at internet scale utility. Naturally the original gophers must think UNIX overall made the right choice ignoring SoTA.


When the option is between free beer or paying tons of money for Xerox, DEC and IBM alternatives, it is hardly based on merit.


That's granted. That's selection step 1. Then UNIX cum Linux approach eats the world for a couple of decades, pulling its weight. If it was inherently flawed, if SoTA approaches were addressing inherent issues, then it would have failed selection steps 2..n. And we would have long ago bitten the bullet to replace the "lucky" legacy. So, yes, the lucky contender has issues, has warts, can certainly benefit from x, y and z, but as is, it continues to be viable at scale and that is not "luck".

Second, we simply don't know, considering all aspects including human resources and engineering economics, whether Xerox, DEC, or IBM's alternatives would have fared better than UNIX. I'm open to learning why/how if this is a shut case in your opinion.


I bet if UNIX was sold with the same market prices as Xerox, DEC, or IBM's alternatives, it would have been a footnote on the history of OSes, like many others since the mid-1950's.


You elicit my first ever OMG. Yes, granted! :) But that's just the mechanics of computing history. That was the lucky break. But it wasn't 'garbage' getting lucky. It wasn't some lame specimen that survived a battle that killed off titans of a species. It was and remains a contender. And, to the point, its authors determined (certainly influenced by not being as SoTA-aware as academic CS lights) that a compelling 80% solution with a [simple] versatile conceptual model can work, and it has. The same story goes for Go the language. Of course it is like PLT retro with bolt-ons, but like it or not, it is a viable approach to building software and systems.


Weird to not see any references to https://github.com/golang/go/issues/48409 / https://github.com/golang/proposal/blob/master/design/48409-...

There's been a lot of back-and-forth on how to improve the tuning situation in go. See https://github.com/golang/go/issues/42430. It seems like Michael Knyszek will be doing something about it for go 1.19.


On first read this article made little sense to me.

Ok, go has the GOGC env var. Ok, you can tune it based on stuff. Do they tune it live over the process's lifetime, or is it pre-computed at start? Is gogctuner a library?

> As we mentioned above, manual GOGC is not deterministic

What?

More importantly, why is the engineering effort spent on that tool, as opposed to just trying to reduce allocations? I've spent countless hours trying to reduce allocations on the hot path. This is a good strategy: Go GC cost becomes negligible if it doesn't have anything to do!

But then it hit me. The missing context is probably other services/tenants interacting with system resources.

Am I wrong in reading it as: in times of low load, they want to burn less CPU at the cost of more memory; at peak time, they do business as usual. Reducing GC frequency at low load is generally meaningless. In most systems, operators care about performance/CPU/latency at peak.

Unless....

Unless you have other tenants. They probably run batch jobs at low times, and if that is the case, then indeed, burning CPU for low utilization GO jobs is a waste of CPU.


The irony is that Java removed its old Go-like collector (CMS = Concurrent Mark & Sweep) a couple of years ago, and its next-gen (G1) and next-next-gen (ZGC) collectors have improved so much in recent JDK versions (15 and up) that they require little if any tuning at all anymore.


To run adequately, Java most likely needs a better collector than Go in the first place because Go has sparser object graphs. A triangle with three XYZ points would be four contiguous objects in Java (if implemented as a class with three fields of another class) but only one in Go. On the other hand, Go needs to handle interior pointers. So the requirements for the collectors may not necessarily be the same.


That's true, but Go's GC, while possibly not needing to be quite as sophisticated as Java's, is still not sophisticated enough, even for Go's needs, to match Java's collectors' performance or convenience. Whether or not Go could actually come up with a collector to get Java-like behaviour for Go programs while being significantly simpler is still an open question.


Would primitive objects help here? In theory the points could be primitive and not heap allocated? That way the GC only has to track the Triangle?


Yes, but Java does better than Go even without this help, because Go's GC is too simple for its own needs.


With the upcoming Valhalla project, a Point field will be flattenable.


But...that's exactly what Go does?


the metrics have said otherwise. the golang team has implemented more sophisticated GC algorithms similar to the JVM's, and the performance differences were negligible and had significant downsides.

uber was able to solve their issue, so the GC in golang seems perfectly fine.


They did not experiment with anything remotely resembling Java's new GCs, but rather aspects of very old ones (they just tried a relatively simple generational GC). Go's poor performance relative to Java's on memory intensive applications is largely due to its primitive GC, which does not work perfectly fine for any challenging workload.


fundamentally they did, they implemented a generational GC.

The optimizations the JVM does in the newer ones are well understood optimizations but don't address the root issues with implementing generational GC in golang.

the issue was with the write barriers and the requirements around moving data, which can't be avoided in generational GC implementations because you have to move data and update pointers.

most of the benefits of generational GC don't exist in golang because escape analysis allocates data on the stack.
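that escape-analysis point can be sketched directly (my example; run `go build -gcflags=-m` on it to see the compiler's escape decisions):

```go
package main

import "fmt"

type point struct{ x, y int }

// sum's point never escapes: the compiler keeps it on the stack (or in
// registers), so it creates no garbage and the GC never sees it. These
// short-lived values are exactly what a young generation would collect
// in Java -- in Go they often never reach the heap at all.
func sum() int {
	p := point{1, 2}
	return p.x + p.y
}

// leak returns the address, so p "escapes to heap" and becomes GC work.
func leak() *point {
	p := point{3, 4}
	return &p
}

func main() {
	fmt.Println(sum(), leak().x) // 3 3
}
```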


ZGC isn't currently generational and still outperforms Go's GC, and the benefits of Java's advanced collectors don't (just) come from their being generational. I agree Go has some difficulty with compaction among other things, but that only means that it will take some time to fix. Important Go workloads perform poorly because of its GC. I don't know what Go has to do to fix its GC, but it's not perfectly fine.


regions have generations. it's the same concept, just partitioned to make the working sets smaller.

there isn't anything fundamentally wrong with golang's GC. attempting to apply the solutions for the JVM to golang is fundamentally flawed and ignorant.

the languages have fundamentally different approaches to memory allocations. and the reasons behind the JVM implementations simply do not exist in golang.

the new pacer is in the works to attempt addressing many of the edge cases. https://github.com/golang/proposal/blob/master/design/44167-...

you seem to lack fundamental understanding of what the actual issues in the golang runtime are with relation to its GC.


> attempting to apply the solutions for the JVM to golang is fundamentally flawed and ignorant.

I'm not saying that Go should use GCs like Java's. I'm just saying that Go's GC does not work well enough for Go's needs, and that Java's GCs now deliver a better experience. Maybe Go needs something entirely different, but it does need something better than what it has now.


I don’t think that ZGC is next-gen over G1; they just chose different tradeoffs. G1 is very good at throughput, and is very good at upholding its target pause times even under load. But it doesn’t promise latencies as low as ZGC does; latency is fundamentally one end of the spectrum, where the other end is throughput.

But otherwise agree with you.


I really hope that they will release this as a library. We're having the exact same challenges running Go in production.

The biggest challenge with Go in production is that, as the article points out, Go doesn't have a maximum memory setting that can be used to tune the GC. If you run Go with a memory limit on Kubernetes, it's common for it to simply run out of memory rather than using the limit for backpressure.

This is kind of surprising, since Go is so entrenched in the Kubernetes world (and both Docker and Kubernetes are written in Go). It's possible that Google itself doesn't develop that much stuff in Go, and that the Go team doesn't have a huge incentive to innovate in this area.

There have been a few attempts at improving heap management, including an aborted attempt at respecting ulimit [1], a promising implementation of a SetMaxHeap() function [2], and a proposal for dealing with backpressure [3], but these projects have mostly failed to get proper traction. It's a complex problem that needs a cohesive solution.

Fortunately, there is now a proposal [4], which has been accepted, to add a soft limit to Go, which has a more thought-through design [5], though I'm not sure if it's being actively worked on yet.

I'm also not sure if that proposal, when implemented, will make the Uber approach redundant, or if these are in fact complementary. If Uber could open-source their library, it might be a good solution until Go itself has better GC management.

[1] https://github.com/golang/go/issues/5049

[2] https://github.com/golang/go/issues/16843

[3] https://github.com/golang/go/issues/29696

[4] https://github.com/golang/go/issues/48409

[5] https://github.com/golang/proposal/blob/master/design/48409-...
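Until something like that lands, the core idea of such a tuner can be sketched in a few lines (a hypothetical illustration of the approach, not Uber's actual library): read the live heap and the memory limit, then pick GOGC so the GC trigger point lands just under the limit.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// tuneGOGC picks GOGC so that live_heap * (1 + GOGC/100) lands near the
// memory limit, clamped to sane bounds. All names here are hypothetical;
// a real tuner would also handle non-heap memory, hysteresis, etc.
func tuneGOGC(memLimit uint64) int {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	live := ms.HeapAlloc
	if live == 0 {
		return 100
	}
	gogc := int(memLimit*100/live) - 100
	if gogc < 25 { // keep the heap target at least 1.25x the live heap
		gogc = 25
	}
	if gogc > 500 { // don't let a tiny live heap balloon the target
		gogc = 500
	}
	return gogc
}

func main() {
	// A real tuner would rerun this on a timer (or per GC cycle) with the
	// container's cgroup memory limit; 2 GiB here is just for illustration.
	const limit = 2 << 30
	fmt.Println("setting GOGC to", tuneGOGC(limit))
	debug.SetGCPercent(tuneGOGC(limit))
}
```

The effect: a small live heap gets a high GOGC (fewer collections, cheaper CPU), and as the live heap approaches the limit GOGC shrinks, trading CPU for staying under the cap.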


I have 0 experience in Go, and for me it looks like "Figure 11" in their article contains a full example of the needed code. Maybe I'm wrong.


No, their code adjusts the GOGC value. The code fragment, as I understand, just shows how to essentially get a notification every time the GC runs.


The fact that you can make such savings by tweaking the GC suggests the applications themselves could be improved not to produce so much garbage in the first place.

A win is a win, and it's still a very nice saving for barely touching the application code.


That may not always be productive, because Go’s GC has a pacer which ensures there’s at least one collection every… two minutes, I think? And since Go’s GC is non-generational, every collection has to traverse the entire heap.

Discord had an issue with that for an LRU cache service, where their memory usage was basically constant (very little garbage generated), but because the heap was quite large the pacer would trigger a huge CPU spike every two minutes as it traversed the entire thing, looking for something to release (which would not exist).


Wouldn’t it be solved by a generational GC?


Depends on the implementation, of the LRU and of the generational system.

But mostly it depends on whether the pacer would perform a minor or a full collection in that scheme.


AIUI, Discord's issue would largely not arise in current Go code. In fact if I remember correctly Go had a near-fix to the issue in flight in beta by the time they finished their stuff, but having had a rewrite in hand by then there was no reason to go back. (Were it half in hand, one might have to discuss sunk cost fallacy, but when it was entirely in hand and half-deployed, going back just costs more.)

However, even though Discord's article is technically out of date in terms of their exact numbers, the principle still holds, just at larger scales. If one keeps scaling up, eventually one will encounter fairly fundamental and difficult problems that take odd solutions, and no fully automated memory solution will solve them.

I would observe, though, that these complaints are arising at a very significant scale. It is a common error in programmers to assess their needs as if they are going to be writing code running on a hundred servers maxed out on the resources at near 100%-CPU when in reality their code is going to comfortably run on one instance with 5% of one CPU in a day.

I say without hesitation that if someone is looking to run dozens of maxed-out servers, Go is a bad choice and it is a mistake to even start writing that code in Go. (There's many even worse choices; if Uber was trying to write the same service in Python or something... yeowch.) But if someone rejects Go because it can't hit that use case, but the use case couldn't possibly hit that scale unless every person on the planet become a customer five times over, that's making the exact same mistake. Go is a good solution for many very common use cases, but it's not that hard to do some Feynman estimations at the start of a project and notice that it's getting kind of close to the comfortable limits for Go.

(Even growth isn't really an excuse. Resources are so abundant that you should take a log-based view, or an exponential-based view if you prefer. I like to have an order-of-magnitude buffer minimum in my design for the largest possible scale I could face, and most of the time that's pretty practical nowadays. If I have a case where Go would work, but I'd only really have roughly a factor of 2x growth before it would become a problem, I wouldn't use it. It's too easy to consume that by either usage growth, or future changes in what the system needs to do, or error in the Feynman estimation. But resources are, as I said, so abundant that by the time I'm maxing out a 32-core or 64-core system with however much RAM that comes with nowadays, I'm running a lot of stuff.)

I would be curious if they've got a "rewrite in Rust" effort going. Wouldn't be surprised to see it cut the CPUs yet again by half or thirds. Depends on how big & complicated the service in question is.

I guess you could even use that as a metric... if someone come up to you and said "I've got a magic button that if I push it will cut your code's CPU usage in half. How much will you pay me to push it?" and if the answer is a non-committal shrug, Go's a fine choice. I have about a dozen Go services and I'd pay you about a buck to push that button, because they're already way more efficient than I need. Uber would clearly pay quite a bit.


It’s rarely worthwhile to spend a human being’s time just avoiding garbage, and Rust is a better fit for those cases. If we can tolerate garbage, the JVM has more productive languages and better GC throughput (Go wins only on latency, with more shorter pauses).


The JVM is better for low-latency as well with ZGC and Shenandoah.


The scale of Uber is a bit incredible. Saving 70,000 CPU cores is a lot for a taxi app.


Calling Uber just "a taxi app" is IMHO like calling Facebook just a website. Sure, that's what the end user gets, but there is plenty of engineering involved such as maps service (I believe they have their own, but I might be wrong), geo searching for locations, batch processing for data analytics, ML experiments etc.

I believe Uber is also well known for building everything in house as opposed to using common cloud services. So they need to run their stuff.


Makes me wonder what on earth all those cpus are doing. Many people take trips by taxi, but not THAT many.


Right? Like, our product isn’t Uber scale, but our Go microservices each get about 10,000 active users per core before the autoscaler adds a new instance. I’m curious what their numbers are.


"Uber drivers completed 4.98 billion trips in 2020, a 27 percent decrease from the 6.9 billion trips in 2019"

Now, granted, 2021 is probably down on that further but it's still a lot.


7 billion per year is 222 trips per second on average.


With each trip taking an average of 15 minutes, that’s about 200k concurrent live sessions. Each session has at least two participants, so 400k live users at any given second. I imagine their spikes are significantly higher than their slow periods, too.


Well sure. And I've worked on software run by ISPs that have had more than 4M concurrent active sessions. Handled by something like 20 machines. Yes, they probably have way more overhead per session, but it is still a really high amount of servers.


So a bit less than 6 live connections per CPU saved with the garbage collector configuration.


Imagine if they had used a non GC language, could've saved a lot more.


Or they would have never reached the market.


Uber migrated to Go from Python and Node.

By the time they migrated they already were mature in the market.


Ok, then we can rephrase OP's comment: "or they would still be using their old software, because reimplementing everything in a non-GC language took too long"?


There's always a tradeoff. Perhaps it would decrease development speed, or maybe introduce memory leaks?


Other than some low-level systems programming, there is hardly any workload that would have trouble due to GC itself.


Maybe I'm missing something, but I don't understand why they chose to use a 'chan time.Time' instead of a 'chan struct{}'.

Is there any advantage to using a 'time.Time'? Is this just a simplification to make the example less obscure? Or is the change so insignificant (%-wise) that it makes no sense to optimize for it?


Where do you see this? The article doesn't mention channels at all.


In the example they gave. Figure 11.


Ah — I didn't realize the code fragment was an image, so Cmd-F didn't find it. My guess is that this doesn't show all their logic. They talk about tracking the interval between each GC, so they might use the timestamp for that.


I wonder if they could open source this library?


Still feels like Go itself, or maybe a third-party addition (not sure if that's even possible? I assume not), could just ship a GC with a few more knobs, and it would be a net benefit to the community. Every time I read about Go's GC, it feels like the solutions are fighting the framework a little/a lot.

There are tons of great GC research/experiments/learnings out there that can have real benefits; maybe the JVM with its zillions of knobs is a step too far, but finding a middle ground seems like it would help people.


Comparing JVM tuning with Go GC tuning is night and day. I've never not needed to tune the JVM when deploying software. Conversely, I've never needed to tune Go GC.

That's not to say tuning isn't valuable or necessary, but that the vast, vast majority of Go programs will never need tuning. I cannot in good faith say the same about JVM programs, even with the most recent and modern GC profiles.


If you had to tune the JVM, you haven't tried it in at least the last 8 years.

Other than possibly max heap size, G1 should only ever be tuned by the target pause time value, which chooses between latency and throughput.


That checks out, since hardly anyone has been starting new projects in Java in the last 8 years. From my perspective, C# is just Java but better. I understand why Java has worked so hard not to break stuff, but it has left a language that is fundamentally broken in about 3 ways (type system/generics, boxed vs. unboxed types, nullable by default).


How is nullable by default fixed in C#? The type system is perfectly fine, as well as generics. Do note that the majority of languages do type erasure; it is not a flaw, but a feature.

Boxed/unboxed I agree could take a bit of love but that is happening with Valhalla.


The type system isn't fine. Try making an ArrayList<int> or an Integer[]. Also, due to type erasure, it is impossible to check the type of a generic at runtime, which can be really annoying.


> Try making an ArrayList<int> or an Integer[]

Yes and that is due to boxing/unboxing. This is also being worked on.

And while you indeed can’t check the generic type of a generic object, I really rarely see any reason for that. Like, if you have written code in any language without reflection, you can’t do that for any object and it is not a hindrance in itself.


It's also easy to over-tune your JVM because there's so many options. For modern JVM, I've found that only setting a large enough heap size is good enough for the majority of the software.


Stuck in Java 7?


This article seems to have made it work. Have you seen cases where it doesn’t?


You know who needs to read this article... Freaking influxdb



