However, it does not address the base issue, which is that Go uses a conservative garbage collector, and more values look like pointers in a 32-bit world.
The only real fix would be to improve the garbage collector's
understanding of which values are pointers and which are something else
(e.g., floating point numbers that happen to look like pointers). And
that is not an easy fix.
How does this GC work? Is it literally just marching through the heap looking for pointer-sized values in the range that has been mapped to the process?
It sounds crazy (and it is!) - but it often works reasonably well in practice. SBCL (one of the most performant Common Lisp implementations) has an 'imprecise gc' that works the same way - and I've seen reasonably heavily stressed processes with uptimes in the weeks/months.
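To make the scan in question concrete, here's a toy sketch in Go. It is not the Go runtime's code: `Span`, `LooksLikePointer` and `ConservativeMark` are made-up names, and the addresses are just numbers chosen for illustration. Any word whose value lands inside an allocated range keeps that allocation alive, whether or not it was ever meant as a pointer.

```go
package main

import "fmt"

// Span describes one allocated object by its address range [Start, End).
// The addresses are invented for this example, not real memory.
type Span struct {
	Start, End uint32
}

// LooksLikePointer reports whether word, read as an address, falls inside
// any allocated span. A conservative collector cannot tell an integer or a
// float from a pointer, so any hit counts as a reference.
func LooksLikePointer(word uint32, spans []Span) (int, bool) {
	for i, s := range spans {
		if word >= s.Start && word < s.End {
			return i, true
		}
	}
	return -1, false
}

// ConservativeMark scans every word and returns which spans are retained.
func ConservativeMark(words []uint32, spans []Span) []bool {
	live := make([]bool, len(spans))
	for _, w := range words {
		if i, ok := LooksLikePointer(w, spans); ok {
			live[i] = true
		}
	}
	return live
}

func main() {
	spans := []Span{
		{Start: 0x10000000, End: 0x10001000}, // 4 KiB object
		{Start: 0x40000000, End: 0x48000000}, // 128 MiB object
	}
	// None of these words is meant as a pointer: a small counter, the bit
	// pattern of float32(2.5), and an unrelated integer.
	words := []uint32{42, 0x40200000, 0x7fffffff}
	fmt.Println(ConservativeMark(words, spans)) // [false true]
}
```

The float 2.5 has the bit pattern 0x40200000, which falls inside the second span, so the 128 MiB object stays alive even though nothing points to it.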
Remember, even a few years ago (before fastthread, etc) a reasonably loaded Ruby on Rails app couldn't stay up for more than ~10 minutes w/o memory leaks forcing a restart (DHH said 37Signals was doing ~400 restarts/day per process, IIRC) because the runtime was such a piece of crap. Yet, many people still used it to solve real problems and make real money.
At least Go's memory leaks are much slower than Ruby's ;-)
The SBCL garbage collector doesn't work the way the grandparent comment describes. It's precise for the heap, and only conservative for the registers and stacks.
I can't say for sure how the Go GC works, but I would assume it likewise isn't fully conservative. E.g. the "avoid struct types with both integer and pointer fields" advice would be pointless for a fully conservative GC, but does make sense if structs containing no pointers are allocated in a separate memory region that's not scanned for potential pointers.
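Here's a toy model of that guess, in Go. To be clear, this is only a sketch of the assumption in this comment, not how the Go runtime is actually laid out; `Block`, `NoScan` and `Mark` are hypothetical names. Blocks flagged as pointer-free are never scanned, so an integer inside them can't pin anything.

```go
package main

import "fmt"

// Block is one heap object in a simulated 32-bit heap.
type Block struct {
	Start, End uint32
	NoScan     bool     // true if the object's type has no pointer fields
	Words      []uint32 // raw contents of the object
}

// Mark treats every root word, and every word inside a scannable block, as a
// possible pointer. Blocks with NoScan set are marked but never looked into.
func Mark(roots []uint32, heap []Block) []bool {
	live := make([]bool, len(heap))
	work := append([]uint32(nil), roots...)
	for len(work) > 0 {
		w := work[len(work)-1]
		work = work[:len(work)-1]
		for i := range heap {
			b := &heap[i]
			if w < b.Start || w >= b.End || live[i] {
				continue
			}
			live[i] = true
			if !b.NoScan {
				work = append(work, b.Words...)
			}
		}
	}
	return live
}

func main() {
	heap := []Block{
		// Pointer-free data; 0x3000 is an integer that looks like an address.
		{Start: 0x1000, End: 0x1100, NoScan: true, Words: []uint32{0x3000}},
		// Holds a real pointer to the first block.
		{Start: 0x2000, End: 0x2100, Words: []uint32{0x1000}},
		// Garbage: nothing really points here.
		{Start: 0x3000, End: 0x3100, NoScan: true},
	}
	roots := []uint32{0x2000}

	fmt.Println(Mark(roots, heap)) // [true true false]

	// If the first block also had a pointer field, it would be scanned,
	// and its integer would keep the garbage block alive.
	heap[0].NoScan = false
	fmt.Println(Mark(roots, heap)) // [true true true]
}
```

Under this model, mixing integers and pointers in one struct forces the whole object into the scanned region, which is what the advice tries to avoid.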
Both data on the stack and registers may potentially contain the sole live pointer to a particular heap object, so unless you only run GC from some sort of main event loop, it's generally necessary to treat the stack and registers as potential roots as well.
Probably the Go developers/implementers should change their compiler to insert a "magic" constant before each 32-bit pointer in GC-managed memory, so that all such pointers can be found quickly during a full linear scan of the heap, regardless of the target address.
This "magic" constant should be an invalid value in the 32-bit single-precision floating-point format, far away from the usual mmap()'ed address ranges, and far away from small integers.
Maybe this "magic" value should be randomized to prevent DoS attacks on the Go runtime library.
When Go gets a precise collector, simple implementations of this will work. Doing this kind of thing in Go requires importing the "unsafe" package, and any memory allocations done by code importing "unsafe" could be marked as possibly a pointer.
However, it would probably be possible to write code involving two packages, one of which does not import "unsafe", to lead to dangling pointers and eventual crashes. That is why you should be careful about code that imports "unsafe".
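As a generic illustration of why this sort of trick needs "unsafe" (not necessarily the exact technique being discussed), here's a minimal Go sketch. `HideAddress` and the XOR key are made up for the example. Once an address lives in a `uintptr`, it is just an integer, and a precise collector has no reason to treat it as a reference.

```go
package main

import (
	"fmt"
	"unsafe"
)

// HideAddress returns the address of p as a plain integer. The conversion
// only compiles because this file imports "unsafe".
func HideAddress(p *int) uintptr {
	return uintptr(unsafe.Pointer(p))
}

func main() {
	p := new(int)
	addr := HideAddress(p)

	// Arithmetic on the integer, for example XOR with a key. The masked
	// value no longer even looks like the address of p.
	const key = 0xdeadbeef
	masked := addr ^ key

	// Undoing the XOR gives the same number back, but it is still only a
	// number: this sketch deliberately never converts it back to a pointer.
	fmt.Println(masked^key == addr) // true

	_ = p // p itself is the only real reference to the allocation
}
```

If `p` were dropped and only `masked` kept, a precise collector would be free to reclaim the allocation, which is the dangling-pointer risk mentioned above.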
Almost — C# has a cool feature whereby you can do pointer arithmetic if you pin the objects in question first, so the GC won't collect or move them. (The language statically enforces this by forbidding you from taking the address of a value until you pin it.)
That's cool; I think another way of doing it is to allow programs to request a sandbox where the GC doesn't go so they can do all of the pointer-arithmeticking they want as long as all their pointers fall within the sandbox by the time they're written through or dereferenced.
Now, DHH tells me that he’s got 400 restarts a mother fucking day. That’s 1 restart about ever 4 minutes bitches. These restarts went away after I exposed bugs in the GC and Threads which Mentalguy fixed with fastthread (like a Ninja, Mentalguy is awesome).
If anyone had known Rails was that unstable they would have laughed in his face. Think about it further, this means that the creator of Rails in his flagship products could not keep them running for longer than 4 minutes on average.
Repeat that to yourself. “He couldn’t keep his own servers running for longer than 4 minutes on average.”
I've never been much of a Ruby/Rails guy - but my understanding is that everyone had to restart Rails a few times an hour because the memory leaks were so bad in those days.
I certainly wasn't anywhere in the loop, but at least on my site, FastCGI didn't work, and Mongrel did.
Zed's rant/flame goes into so much personal detail that he doesn't really articulate the point. Rails was receiving this enormous amount of hype, but there wasn't even a working application server yet.
No one cares how the sausage is made. They care that they have sausage. I write code to make money, not for uptime competitions. If something that makes money needs to be restarted every 4 minutes and customers are willing to pay for it why should I give a shit?
FCGI problems were very customer-visible. Prior to Zed coming along, the issue wasn't being discussed much anywhere, making it unclear whether the problem was your code, Ruby, the framework, or the app server. Now it is your problem to fix!
But Ruby is more or less a "rapid development" environment.
You pay for quick development turn-around with machines.
Go is not in that kind of space as far as I know since, as I understand it, Go is sold as a low-ish level language competing with C++ and C for systems development. For something that would replace those languages, unexplained, systematic leaks would be bad (I mean, my C++ programs might leak, but it's reasonably easy to discover why. That matters).
From experience, development in Go can definitely be classified as rapid. The type system stays out of your way, and compile times are insignificant. Try compiling on play.golang.org, or tour.golang.org to see what I mean.
About this issue: yeah, it's unfortunate, but it's a property of the class of garbage collector that Go uses atm. My servers are all 64-bit, so it doesn't really affect me. But I do feel bad for those who are trying to run Go on ARM, or on 32-bit servers.
That would seem to also present a DOS vector (even on 64 bit) if a user can get the program to store data (of any type, e.g. char or floating point) that happens to be binary-equivalent to pointers to large allocations.
This is realistically not an issue. First, on a 64 bit machine, the range of actually mapped addresses is small relative to all the possible values that can fit into 64 bits. Second, from an attacker's perspective, the values corresponding to mapped memory are extremely difficult to predict, and the values binary-equivalent to large allocations are impossible to predict, even with access to the source.
If you think you can still do it, all of *.golang.org and golang.org are running Go on Appengine, with the source code being freely available. This is your opportunity to get a back door into Google's servers.
If you make a huge allocation (many pages), isn't the Go runtime very likely to call malloc()? For large allocations, malloc() is going to get you a bunch of fresh pages and you will generally get the address of the start of a page. The offset of the pointer within the page is then likely to be deterministic, so you probably only need one unit of pointer-equivalent data per page. If you have enabled huge pages (e.g. 2MB, not uncommon), then you have already soaked up 21 bits of the 48 bits of address space that are actually used by x86-64 implementations, leaving only 27 bits for a collision. The stack grows down from 2^46 and typical heap values on x86-64 are still well within 32 bits. Finally, a collision need not be frequent to be a serious DOS concern.
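The bit counting in this argument, spelled out as a few lines of Go. `CandidateStarts` is a made-up helper; the 48-bit address width and 2 MB (2^21) alignment are the figures quoted above.

```go
package main

import "fmt"

// CandidateStarts returns how many distinct start addresses exist in an
// address space of addressBits bits when every allocation is aligned to
// 2^alignBits bytes.
func CandidateStarts(addressBits, alignBits uint) uint64 {
	return 1 << (addressBits - alignBits)
}

func main() {
	// 48 usable address bits, 2 MB huge pages: 48 - 21 = 27 bits remain.
	fmt.Println(CandidateStarts(48, 21)) // 134217728, i.e. 2^27
}
```

That is about 134 million possible page starts to guess from, before taking into account any knowledge of where the heap tends to be placed.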
The Go runtime does not call malloc(3) for heap, it reserves address space at known high locations (over 2^32) with mmap(2) using the MAP_FIXED flag, and it does so in 16GB increments (or is it just one 16GB allocation? can't remember).
I won't comment on the DOS concern until I've investigated further.
You need more than that. You need a server that regularly allocates large amounts of memory and then leaves them unreferenced so that the garbage collector can collect them. Then you also need the program to store data that you control, and to also keep references to that data--after all, if that data is collected, then the faux-pointers no longer pin the other allocations. Overall this does not sound like a common allocation pattern for servers.