The only real fix would be to improve the garbage collector's
understanding of which values are pointers and which are something else
(e.g., floating point numbers that happen to look like pointers). And
that is not an easy fix.
How does this GC work? Is it literally just marching through the heap looking for pointer-sized values in the range that has been mapped to the process?
It sounds crazy (and it is!) - but it often works reasonably well in practice. SBCL (one of the most performant Common Lisp implementations) has an 'imprecise gc' that works the same way - and I've seen reasonably heavily stressed processes with uptimes in the weeks/months.
Remember, even a few years ago (before fastthread, etc) a reasonably loaded Ruby on Rails app couldn't stay up for more than ~10 minutes w/o memory leaks forcing a restart (DHH said 37Signals was doing ~400 restarts/day per process, IIRC) because the runtime was such a piece of crap. Yet, many people still used it to solve real problems and make real money.
At least Go's memory leaks are much slower than Ruby's ;-)
I can't say for sure how the Go GC works, but I would assume it likewise isn't fully conservative. E.g. the "avoid struct types with both integer and pointer fields" advice would be pointless for a fully conservative GC, but does make sense if structs containing no pointers are allocated in a separate memory region that's not scanned for potential pointers.
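A rough sketch of that guess, assuming a collector that records a per-allocation "no pointers" flag and skips the contents of flagged objects. The `noscan` field and the `retained` helper are hypothetical names for this illustration, not the Go runtime's API.

```go
package main

import "fmt"

// object is a toy allocation identified by a fake address.
type object struct {
	addr   uintptr
	words  []uintptr
	noscan bool // assumed flag: the type has no pointer fields
	marked bool
}

// heap maps fake addresses to allocations.
type heap map[uintptr]*object

// mark follows every word that matches an allocation, but never looks
// inside objects flagged as pointer-free.
func (h heap) mark(words []uintptr) {
	for _, w := range words {
		o, ok := h[w]
		if !ok || o.marked {
			continue
		}
		o.marked = true
		if !o.noscan {
			h.mark(o.words)
		}
	}
}

// retained reports whether target survives when the only thing "pointing"
// at it is an integer stored in buffer that happens to equal its address.
func retained(bufferNoscan bool) bool {
	target := &object{addr: 0x2000}
	buffer := &object{addr: 0x1000, words: []uintptr{7, 0x2000}, noscan: bufferNoscan}
	h := heap{target.addr: target, buffer.addr: buffer}
	h.mark([]uintptr{buffer.addr}) // buffer is the only root
	return target.marked
}

func main() {
	fmt.Println("scanned buffer keeps target alive:", retained(false))
	fmt.Println("noscan buffer keeps target alive: ", retained(true))
}
```

Under these assumptions, a struct that mixes integer and pointer fields has to be scanned as a whole, so its integers become candidate pointers; a pointer-free struct never does.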
You pay for quick development turn-around with machines.
Go is not in that kind of space as far as I know since, as I understand it, Go is sold as a low-ish level language competing with C++ and C for system development. For something that would replace those languages, unexplained, systematic leaks would be bad (I mean, my C++ programs might leak but it's reasonably easy to discover why. That matters).
About this issue: yeah, it's unfortunate, but it's a property of the class of garbage collector that Go uses atm. My servers are all 64-bit, so it doesn't really affect me. But I do feel bad for those who are trying to run Go on ARM, or on 32-bit servers.
This "magic" constant should be an invalid value in the 32-bit single-precision floating-point format, far away from the usual mmap()'ed address ranges, and far away from small integers.
Maybe this "magic" value should be randomized to prevent DoS attacks on the Go runtime library.
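#include <stdint.h>
#include <stdlib.h>
int main(void) {
    char *x = malloc(1);        // allocates block 'a' in memory
    intptr_t i = (intptr_t) x;  // hide the pointer in an integer
    x = 0;                      // drop the only real pointer to 'a'
    i = i - 1;                  // i no longer points into 'a'
    // a (hypothetical) gc runs here and frees 'a'
    *( (char*)(i + 1) ) = 123;  // failure: writes to freed memory
    return 0;
}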
But note that isn't really fair to conservative GC, as no GC, precise or not, can cope with hidden pointers like that. So your example isn't really a strike against Go.
However, it would probably be possible to write code involving two packages, one of which does not import "unsafe", to lead to dangling pointers and eventual crashes. That is why you should be careful about code that imports "unsafe".
The problem with an imprecise GC is that an integer that looks like a pointer could prevent an object from being freed.
- Shaw's opening volley: http://web.archive.org/web/20080103072111/http://www.zedshaw... [lots of stupid personal flames - but his technical points are consistent with what I remember from the time]
- DHH's response: http://david.heinemeierhansson.com/posts/31-myth-2-rails-is-...
- I can't find a copy of Shaw's actual response, but here's the relevant HN thread: http://news.ycombinator.com/item?id=364659
The salient point (quoting Zed):
Now, DHH tells me that he’s got 400 restarts a mother fucking day. That’s 1 restart about ever 4 minutes bitches. These restarts went away after I exposed bugs in the GC and Threads which Mentalguy fixed with fastthread (like a Ninja, Mentalguy is awesome).
If anyone had known Rails was that unstable they would have laughed in his face. Think about it further, this means that the creator of Rails in his flagship products could not keep them running for longer than 4 minutes on average.
Repeat that to yourself. “He couldn’t keep his own servers running for longer than 4 minutes on average.”
I've never been much of a Ruby/Rails guy - but my understanding is that everyone had to restart Rails a few times an hour because the memory leaks were so bad in those days.
Zed's rant/flame goes into so much personal detail that he doesn't really articulate the point. Rails was receiving this enormous amount of hype, but there wasn't even a working application server yet.
If you think you can still do it, all of *.golang.org and golang.org are running Go on Appengine, with the source code being freely available. This is your opportunity to get a back door into Google's servers.
I won't comment on the DoS concern until I've investigated further.