

The next operating system: building an OS for hundreds, thousands of cores - bitsai
http://web.mit.edu/newsoffice/2011/multicore-series-2-0224.html

======
russell
Color me skeptical. The project will fail, of course, because it is too
ambitious. There are too many required new developments for it all to come
together: new chip, new OS, new forms of scheduling, a lot more bookkeeping,
not to mention new programming paradigms and compiler technology. I would not
be surprised if bookkeeping and bandwidth ate up 90% of the processing
cycles. (Obviously, I'm pulling this out of my nether end.)

It feels to me that this is the next iteration of refinements to technology
that is 40-50 years old. Caches are what, from 1970? Interconnect issues date from
the same time. I think machine cycles are in abundance, the scarce resource is
interconnections for data flow. So one of the first things you want to do is
organize your data flow so that processing is local. Think simulations like
vision processing, weather prediction, rendering, where a processor can work
locally and pass on a reduced amount of information to its neighbors. The
interesting problems arise when the results have to be delivered non-locally.
If you store them in main memory for the recipient to pick them up, you run
into bandwidth problems.

So what I see as needed is gazillions of low level worker bees with modest
bandwidth requirements that have semi-permanent connections to the consumers
of their output. Think the human brain, Google search, image rendering.

Apologies for the rambling, lack of citations, etc., etc., but I am interested
in HNers' views on these issues.

~~~
m0th87
I think you're seeing this project too much from the perspective of whether it
will succeed/fail in the marketplace. That is not the point of research. This
should be viewed as an exploratory search for new paradigms in multicore
computing. Some components might work out to something usable, most won't.

~~~
russell
I agree with you completely. It is worth doing for what we learn, but what I
was trying to say is that the real breakthroughs are going to come from other
directions. The people who will get rich are probably the generation after
that.

------
kmod
I think there's some confusion about this project, since the article doesn't
go into too much detail about what it actually is, apart from some high-level
technical highlights. My understanding (which is potentially outdated) is that
Angstrom is a collaborative effort among different groups who are trying to
combine their separate research into a larger project.

For the criticism that the project will "fail", I think there's some merit.
Will this produce the next computation system that the world uses? Probably
not. Will this push the boundaries of engineering and improve on the state of
the art? Almost certainly. And due to its scale, and the fact that Angstrom
was created by combining multiple separate projects, there's also room for
individual projects or ideas to succeed even if others don't.

My background is that I worked on the FOS project, which is the part that is
most related to the title of this post. If you were looking for more technical
meat, I encourage you to check out the project page:
<http://groups.csail.mit.edu/carbon/?page_id=39>

------
ph0rque
Since starting to learn Erlang, I've been wondering if there's been any effort
to build an operating system on top of the Erlang VM to exploit its
concurrency... seems like it might be a good fit for this project.

~~~
kmod
The problem is that parallelism, and performance at this level in general, is
not the kind of problem that you can solve by adding additional layers of
abstraction. I can only speak to FOS, and not the other projects, but the idea
behind it is to optimize the stack all the way down. My guess is that if you
tried to run the Erlang VM on a 1000-core CPU, it would run into Linux
scalability issues, regardless of how parallel Erlang is. Instead, we need to
think about the base primitives of how we interact with the hardware, and
potentially redesign them in a way that lets us fully utilize the hardware.

~~~
gtani
<http://groups.google.com/group/erlang-programming/msg/75cbeab74644dc0a>

<http://www.tilera.com/about_tilera/press-releases/linux-applications-are-scaling-better-ever>

(June 2009: Erlang (BEAM) on a 64-core Tilera)

------
Mongoose
Too much overview, not enough meat. Are there any whitepapers available for
the Angstrom Project? The publication page of their website just says "coming
soon."

<http://projects.csail.mit.edu/angstrom/Publications.html>

------
inoop
As long as they don't write the thing in C ...

~~~
marshray
What would you write it in?

~~~
inoop
The problem with C is that it's not memory safe. Therefore you need hardware
hacks (an MMU) to separate processes. If you use a memory-safe language you
lose the ability to do pointer arithmetic (and the speed hacks that come with
it), but you don't need to remap the page table every time you switch between
processes. This makes threads and processes pretty much the same thing. Code
injection is impossible in a memory-safe language, which helps with security
(I'm talking about buffer overflows here, not SQL injection, XSS, etc.).

Using a higher-level language can also help with many other things, for
example integrated garbage collection, built-in serialization and process
migration, intelligent message passing (pass objects around rather than
bytes), etc.

A good example of an implementation of these ideas is Singularity
(<http://research.microsoft.com/en-us/projects/singularity/>), which is
written in a superset of C#.

~~~
cynicalkane
_Therefore you need hardware hacks (MMU) to separate processes_

I cannot apprehend the confusion of ideas that would lead you to believe this
is caused by C. It's assembly, not C, that runs on your machine, and assembly
is not memory safe. Microsoft's Singularity runs managed code, so there's a
software layer providing memory protection.

An operating system is not defined by its preferred programming language.

~~~
inoop
"I cannot apprehend the confusion of ideas that would lead you to believe this
is caused by C. It's assembly, not C, that runs on your machine, and assembly
is not memory safe"

There is no confusion of ideas, but I like the Babbage reference :)

In Singularity the code is compiled first to CIL, then to x86, x64, or ARM by
an AOT (ahead-of-time) compiler. Now here's the thing: the OS loader does not
load assembly code, it loads CIL code, which it then compiles further down to
assembly. Since CIL is verifiably memory safe (like Java bytecode), and
assuming the AOT compiler is not buggy, the OS is memory safe. Hence there is
no need for an MMU/MPU to do memory protection.

So even though you're right that assembly is not memory safe, the final
compiler stage (here CIL can be seen as an intermediate language) is
implemented in the OS (the loader), and the OS will reject any non-memory-safe
code. No code that writes outside array boundaries, violates type safety, or
attempts raw pointer arithmetic will be allowed to execute.

So no, Singularity does not provide a software layer for memory protection;
it's purely an (intermediate) language feature.

~~~
caf
The software layer for memory protection here is the AOT compiler. As you
alluded to yourself, if Mark Dowd finds one of the bugs in your AOT compiler,
then you are toast.

The nice thing is that these protections can be applied ahead-of-time rather
than at runtime, as the MMU does.

It has been shown (see Native Client) that even a slightly restricted subset
of x86 assembler can be made verifiably safe. In principle, you could do the
same thing with C - in fact, you likely wouldn't even have to change the
language definition, since the egregious abuses in C mostly result in
"undefined behaviour", which allows the implementation to detect and trap them
rather than hose the machine. (It is a curious fact that C itself isn't
inherently "memory unsafe" - merely that almost every implementation of C ever
written is.)

~~~
inoop
People have attempted to make C memory safe (e.g. BitC, Cyclone), so you could
write an OS in one of those, I guess. My original point, however, was that C
itself is not, and that supporting C means supporting non-memory-safe drivers,
services, and programs.

If you can come up with a program that verifies whether a C program is memory
safe or not (without constraining the language spec), well, hats off to you
sir, I think I know a couple of security experts who might want to have a word
with you :)

------
runT1ME
Azul's systems seem to be doing fine with hundreds of cores, and I'm pretty
sure it's just a modified Linux.

------
stcredzero
_an OS for hundreds, thousands of cores_

That's exactly what Microsoft Azure claims to be in their marketing
literature.

~~~
kmod
This is an unfortunate case of people adopting technical terms for PR
purposes. When we (FOS) say "OS", we mean that when you write a program for
FOS, you think about it as if you are writing it for a single computer. It
doesn't matter if that computer has multiple processors in it, or even if
those processors are in different boxes and are only connected by Ethernet.

~~~
stcredzero
By reporting what Microsoft says in their marketing literature, I'm neither
claiming that what they say is true, nor am I saying the work referred to in
the article is a copy of their efforts. Evidently, you jumped to some
conclusion like this. I am merely commenting on the behavior of their
marketers. As a long-time observer of the tech industry, I always find the
behavior of marketers interesting, though not always in pleasant ways.

(I think it's wise to pay attention to how such forces change language.)

