

Compiling OCaml directly to a new cloud operating system - nl
http://anil.recoil.org/papers/2010-hotcloud-lamp.pdf

======
mathgladiator
Woah, this is like my dream on steroids.

During my weekends when I work on node.ocaml (
<http://github.com/mathgladiator/node.ocaml> ), I occasionally think "gosh, if
I could just get rid of that pesky OS, then I would have the most perfect
server ever!".

~~~
avsm
That's a great project you have there too, the more the merrier! :) I'll drop
you a note in a few days describing my libevent replacement in Mirage; it's
basically using LWT to convert async events into synchronous looking code.
Would be interesting to compare notes...
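
As a rough illustration of what "converting async events into synchronous looking code" means, here is a minimal sketch. It is not Lwt and not OCaml: it uses Python's standard `asyncio` purely as an analogy, and the `handle` and `main` names are made up for the example.

```python
import asyncio

async def handle(name, delay, log):
    # Reads top to bottom like blocking code, but each await
    # hands control back to the event loop instead of blocking.
    log.append(f"{name}: waiting")
    await asyncio.sleep(delay)  # stands in for "wait for an I/O event"
    log.append(f"{name}: done")
    return name

async def main():
    log = []
    # Both handlers run interleaved on a single thread.
    results = await asyncio.gather(
        handle("a", 0.05, log),
        handle("b", 0.01, log),
    )
    return results, log
```

Running `asyncio.run(main())` interleaves the two handlers on one thread, with no callbacks visible in the handler bodies.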

------
wmf
There may be some good ideas in this paper, but I find them obscured by
problematic details.

 _Under normal kernels, the standard OCaml garbage collector cannot guarantee
that its address space is contiguous in virtual memory and maintains a page
table to track the allocated heap regions._

They couldn't find ~1 GB of contiguous address space (out of 128 TB available)
under Linux, so they threw out Linux completely?

 _Each Mirage instance runs as on a single CPU core, and depends on the
hypervisor to divide up a physical host into several single-core VMs._

GIL getting you down? Just define away parallelism and let the programmer
handle it!

 _x86-64 does not have segmentation, and Xen protects its own memory using
page-level checks and runs both the guest kernel and userspace in ring 3. This
makes system calls and page table manipulation relatively slow, a problem
which Mirage avoids by not context-switching in ring 3._

That processor has hardware virtualization acceleration for a reason. Working
around performance quirks in obsolete hypervisors doesn't sound like a good
use of time IMO.

~~~
gaius
It's explained in practically the next sentence:

 _In tight allocation loops, the page table lookup can take around 15% of CPU
time, an overhead which disappears in Mirage_

Functional languages like OCaml allocate and deallocate memory much more often
than traditional languages like FORTRAN. So while you are perhaps correct in
some situations, this approach would seem more optimal for functional
programming.

~~~
wmf
My point is that I don't think they needed that page table and its overhead in
the first place; they could just allocate the heap contiguously.
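
To make the trade-off concrete, here is a hedged sketch of the two ways a runtime can answer "is this address inside my heap?". It is not the OCaml runtime's code; the class names and the 4 KiB page size are assumptions chosen for illustration.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages

class ChunkedHeap:
    """Heap built from scattered chunks: membership needs a page table."""

    def __init__(self):
        self.pages = set()  # page numbers that belong to the heap

    def add_chunk(self, start, size):
        first = start >> PAGE_SHIFT
        last = (start + size - 1) >> PAGE_SHIFT
        self.pages.update(range(first, last + 1))

    def contains(self, addr):
        # One hash lookup per check, paid on every heap test.
        return (addr >> PAGE_SHIFT) in self.pages

class ContiguousHeap:
    """Heap in one contiguous range: membership is two comparisons."""

    def __init__(self, base, size):
        self.base = base
        self.limit = base + size

    def contains(self, addr):
        return self.base <= addr < self.limit
```

With a contiguous range the table and its lookups go away, which is the overhead being argued about here.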

~~~
avsm
(author here)

Yes, the principles described here could easily be applied to a full Linux
kernel. In fact, one of the hacks on my TODO list is to statically link a
Mirage application against a Linux kernel (without a userspace) to run them on
the bare metal.

The point of defining a single address space and the smallest possible C
runtime is to start from the other end: rather than stripping away 14 million
lines of C code, I preferred writing a few thousand lines and jumping straight
into my runtime. It is an awful lot easier to experiment with something like
<http://github.com/avsm/mirage/tree/master/runtime/xen/kernel> than
the full Linux kernel by quite a long way. Note that the current tree isn't
finished yet; I'm pulling out dietlibc entirely at the moment, so the final
kernel binaries float around the ~200KB mark for a typical webserver. Then,
also consider hypervisor-only features like live migration or PV
suspend/resume that can be further optimised heavily and more easily in a
minios instead of Linux. Or that Mirage is single-vCPU and event-driven only
(no interrupts), and it starts to look quite different from Linux.

The Mirage IO library is also very portable; an application which uses only
TCP/UDP (instead of the lower level Ethernet) will compile on Linux/*BSD using
select/epoll/kqueue sockets to be "just" a high performance webserver. The
cool thing is that by using these APIs, we can also build applications that
compile to small specialised operating systems, while developing them as usual
on UNIX.

For a final entertaining hack, they are also portable enough to run directly
in the browser as Javascript applications thanks to Jake Donham's ocamljs
project. We're still integrating the Websocket code in to make this properly
finished, but it's a fine side experiment into portability :-)

~~~
wmf
For the specific problem of a contiguous heap, I don't even think kernel
modifications are necessary; the VM should be able to look at /proc/self/maps
and find the appropriate address space.
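
A minimal sketch of that idea, assuming the usual `start-end perms ...` line layout of /proc/self/maps on Linux. The `parse_maps` and `find_gap` helpers and the default bounds are hypothetical, and the sketch only locates a gap; it does not reserve it.

```python
def parse_maps(text):
    """Return sorted (start, end) address pairs from /proc/<pid>/maps text."""
    regions = []
    for line in text.splitlines():
        if not line.strip():
            continue
        start_s, end_s = line.split()[0].split("-")
        regions.append((int(start_s, 16), int(end_s, 16)))
    return sorted(regions)

def find_gap(regions, size, lo=0x10000, hi=1 << 47):
    """Return the lowest address in [lo, hi) with `size` unmapped bytes, or None.

    The default `hi` of 1 << 47 is the 128 TB user address space
    mentioned earlier in the thread; `lo` is an arbitrary assumption.
    """
    cursor = lo
    for start, end in regions:
        if start >= hi:
            break
        if start - cursor >= size:
            return cursor
        cursor = max(cursor, end)
    return cursor if hi - cursor >= size else None
```

On Linux this could be fed with `open("/proc/self/maps").read()`. The result is only a candidate: another mapping can land in the gap before the runtime claims it, so a real runtime would still have to check the address it actually gets back.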

In general I realize that y'all are looking for rationalizations for an
exokernel-style design, but IMO they need to be fundamental issues and not
bugs.

~~~
avsm
I think you missed the bit of my reply where I explained how this could all
work on Linux too. And you seem to have also missed the other bit where I
explained why a microkernel is nicer (hint: concurrency, no need for multiple
processes or user space context switches, etc).

------
benatkin
Why not Erlang? Seems better suited to this use case.

In Erlang, compiling Erlang from within Erlang is a normal way of doing
things, since things are broken down into tiny Erlang processes.

<http://www.erlang.org/quick_start.html>

~~~
avsm
Part of the Mirage goal is to experiment with different parallelisation
frameworks on top of a very simple serial core: Erlang's OTP is one way to do
this, but there are many others. Personally, I love OTP but dislike the Erlang
syntax and lack of static types, hence my choice of OCaml (and also, there are
very reliable OCaml-to-Xen bindings available around as part of the Citrix
XenServer project).

As another poster on this thread put it (being sarcastic I think, but actually
spot on):

 _GIL getting you down? Just define away parallelism and let the programmer
handle it!_

We are indeed building multiple parallelisation strategies on top of Mirage,
which work across cores and hosts seamlessly. But first things first, and
getting the efficient serial version out is top of the list right now...

------
TheAmazingIdiot
Wow. Why does it feel like a throwback to the Transputer and Occam?

Though, I have always wondered why the idea of a free distributed operating
system was never implemented. With the old HeliOS, all you needed to do was add
a machine to the serial links and processes would migrate to it as needed. And
when a machine was turned off, those processes just migrated elsewhere.

