I should probably clarify that little bit of hyperbole. We do track known issues (of course), but bugs in production are extremely rare -- about one per year.
The MPS appears to have been designed in a single-core era, and it would be great if you could describe any architectural changes made to address today's prevailing multi-core platforms.
Yes indeed I'm very aware of this. The overall architecture and abstractions can support efficient multi-core operation, but the implementation is behind. Development is mostly paid for by single-threaded clients at the moment! But I have plans. Well spotted, by the way.
Most of the problems are around the bottleneck of stopping threads on other cores in order to get privileged access to memory while preserving their consistent view of the heap. What the MPS needs is its own protection map. Most OSs don't help much with that, although Mac OS X, for a while, allowed you to map the same physical RAM at two addresses with different protections. Sadly, no longer.
This is a really interesting topic and I could fill your screens with a wall of text. It's true that the MPS currently does not collect concurrently; however, the only thing that makes it non-concurrent is a critical point in the Shield abstraction, where the MPS seeks to gain privileged access to memory (usually in order to scan it for GC). The critical point is where ShieldExpose in shield.c has to call ShieldSuspend to preserve the shield invariants. This is the only point in the MPS that prevents concurrency, and the rest of the MPS is designed to support it.
The restriction could be removed if either:
* the MPS could use a different set of protections to the mutator program
* the mutator program uses a software barrier
The first one is tricky, and the second one just hasn't come up in any implementation we've been asked to make yet. Given a VM, it could happen, and the MPS would be concurrent.
So, I believe there's nothing fundamentally non-concurrent about the MPS design. It's kind of waiting to happen.
[I've put this in a comment in the code. Thanks for making me write it down :)]
Is there any documentation that says what this does/how it works? The documentation on the web site seems to be for people already familiar with the code.
e.g. how does it compare with the Boehm collector?
Looks like it's a precise GC, not a conservative GC like Boehm. That means that your language has to provide a list of roots manually, while Boehm discovers them automatically. The downside is that Boehm can misidentify pointers (and it also has to jump through extra hoops to achieve incremental and generational collection, while in MPS it should be more straightforward).
It's both a precise and a conservative GC, depending on what you declare to it and which pool classes you use. In commercial deployment and in Open Dylan it's a mostly-copying GC (i.e. a mixture of both) -- see http://www.memorymanagement.org/glossary/m.html#mostly-copyi...
MPS allows for precise rather than conservative GC. It supports doing copying collectors (see poolamc in the design docs for example).
It has some pretty nice telemetry and event logging as well.
Overall, I really like MPS and it is pretty powerful. It is a nicely structured library. It is more work to use than Boehm, but it does more than Boehm.
Yeah we're not really a drop-in malloc replacement like Boehm, though we could have some glue which deploys the MPS like that. Take a look at the Scheme example to see how the MPS integrates at present http://www.ravenbrook.com/project/mps/version/1.110/example/...
OpenDylan (http://opendylan.org) has been using MPS since MPS was first created. (Both were created at the now defunct Harlequin.) We've been working on reviving OpenDylan over the last year as well and this MPS update is a big boon for us.
If the licensing terms aren't suitable for you (for example, you're developing a closed-source commercial product or a compiler run-time system) you can easily license the MPS under different terms from Ravenbrook. Please write to us <mps-questions@ravenbrook.com> for more information.
(And you can see from the exception that we got for OpenDylan that such things are possible.)
I was excited to use MPS in the runtime of an open-source compiler, but if users of the compiler had to pay in order to use it for closed-source apps then that would be a big turn-off.
I can't give you figures right now, but the reason we have commercial clients for the MPS is that we have extremely low pause times. Side-by-side comparisons are expensive to arrange. I'm working on that :)
One of the largest problems with the Sun JVM GC implementations is that they have global pause times to rewrite pointers. This is (in part) because there is no read barrier, and thus all threads must be paused to be 'fixed' after object moves.
Additionally, with CMS there is global heap fragmentation, and recompaction can take literally minutes on a large (8 GB+) heap or a slow machine.
Setting aside exact figures, can you comment on the above challenges and issues?
Sure. The MPS approach that's deployed commercially is to use an unfashionable hardware read barrier to amortize the cost of the pointer rewrite, allowing the heap to be compacted incrementally. There's nothing that takes minutes or even seconds -- the cost is spread through the entire execution. In some sense, all this can take "minutes" of real time, but not at the cost of stopping the program doing stuff. The MPS design has always been to spread the cost evenly, not push it into a cataclysmic event in the future.
That said, we have an abstract framework that could be made to work in other ways, depending on requirements. It wouldn't be much work to hook in a write-barrier-only pool class, etc. etc. and have it co-operate. However, most of the development effort so far has gone on the read barrier approach.
As to how this affects overall run-time, well, that's where we'd have to arrange a side-by-side comparison, and make sure it included one of those compactions :P
I am not really sure of the current state of things, but to "get real" I feel any GC needs to be able to handle:
- multicore/multithreaded
- heap sizes of 20-200 GB, and ideally larger
Think about it this way... We have the same goal of reducing manually managed memory to the smallest possible niche. Think of GC vs. malloc as compilers vs. assembly language. Assembly has its place, but not because compilers can't generate efficient object code in the vast majority of circumstances. Let's do the same for GC!
I think we'd need a more careful definition of "handle" there!
And we're definitely not proselytizing garbage collection here. The MPS is a framework for both manual and automatic memory management (and co-operation between the two). One of our main high performance commercial applications is all about the manual management, and for very good reasons.
But nobody should be rejecting GC out of hand, that's for sure.
The 2002 ISMM paper on MPS says that there is a global pause. As usual, it may be fine for small heaps and allocation rates, but eventually it will become a problem.
There is currently a "global pause" to scan thread registers and stacks at "flip", but that's all. Is that what you meant?
Since we're just moving our commercial clients onto 64-bit, we don't have experience except up to the 4GB limit (or 3GB in old Windows) which they approach regularly. We're still waiting for it to become a problem. I may have a different opinion in a year.
One of my short term goals is just to get some measurements together to publish. Watch this space, and sorry that they aren't here yet.
> The MPS has been in development since 1994 and deployed in successful commercial products since 1997. Bugs are almost unknown.
Then halfway down the page is a list of known bugs.