> With Redox OS being a microkernel, it is possible that even the driver level could be recompiled and respawned without downtime, making it incredibly fast to develop for.
Honestly, I am always excited about systems like Redox that take an integrated, consistent approach to building an OS and its user experience. The microkernel approach on modern hardware sounds interesting here, given that they skipped 32-bit entirely. Most developers here keep referencing the 1992 Tanenbaum-Torvalds debate as a justification for adopting monolithic kernels, which made sense at the time.
Fast forward to 2019: with CPU-level vulnerabilities now in the spotlight, the need for a microkernel OS written in Rust has never been greater. The arguments against microkernels were commonly about IPC and context-switch performance costs, but I find that the security model helps counter these CPU-level vulnerabilities, while the performance concerns are indirectly addressed by hardware advancements we get for free, or by optimising the OS to run on multi-core systems.
Systems like Fuchsia have done both and I hope Redox follows and does this too.
I’m not sure I understand what you’re talking about. All the ugly CPU vulnerabilities make context switching very, very slow. A microkernel needs to context switch more than a monolithic kernel, and the overhead keeps increasing.
> All the ugly CPU vulnerabilities make context switching very, very slow. A microkernel needs to context switch more than a monolithic kernel, and the overhead keeps increasing.
In the case of the Zircon microkernel in Fuchsia, unlike Linux, almost all system calls are asynchronous/non-blocking, with only a small number that block [0]. That is interesting because the OS is also fundamentally optimised for multi-core systems. Combined, these two properties make it well suited for real-time applications, something that Linux, and even some traditional microkernels, can only achieve with fundamental tweaking and changes.
Thus, I doubt that on a system like Fuchsia/Zircon the context switch would be notably 'very, very slow', especially if Fuchsia were ever to run on modern hardware optimised for microkernels such as Zircon.
Reading around, Zircon syscalls are non-blocking at the scheduler level. Each syscall still requires a full context switch into kernel mode, though.
They can probably do some smart stuff with having each syscall just add stuff to a work queue so userspace can resume as quickly as possible, but it's still fundamentally more context switches than e.g. Linux would have, which if side-channel attacks keep coming out might be a problem for them.
My impression, from systems in the days before Linux/Intel had real SMP, is that if you are willing to go completely off the rails on supporting 1-2 core machines and on making every CPU available to every task, then blocking side-channel attacks no longer requires all this work.
If mitigation is starting to cost ~20%, that's roughly one CPU, soon to be 2-3, that could instead be told to never process directly for userland and never tell userland exactly when the output-queue contents of sensitive tasks become available. All that work is essentially free compared to running with Intel's fixes and full CPU flexibility.
> Fuchsia/Zircon, the context switch would be notably 'very, very slow
The part that the Intel fixes make slow is the mode switch, since you need to flush caches and discard data that may leak. Making it async doubles (at minimum) the number of mode switches needed to do the same work.
Monolithic kernels like Linux have so many layers on top to emulate microkernel features (user-space drivers, containers, hypervisors) that in the end they lose whatever advantage they had as bare-bones kernels.
Additionally, there are quite a few production-quality microkernels that people keep forgetting about, because most only look at desktop OSes.
I’m really curious to see what the container orchestration landscape could look like on top of microkernels. It feels like this should be an area where they could particularly shine, or at least reach feature parity.
Couldn't a microkernel forgo the expensive mitigations when switching between kernel processes? It wouldn't get the level of isolation it ideally should, but it would still gain a lot over a monolithic kernel with fully shared memory.
Hard to make this a consistent coherent argument though. The microkernel will be secure because it has Spectre mitigations. The microkernel will be fast because it doesn't use Spectre mitigations. ?
The "microkernel" will be more secure than a monolithic kernel because a breach in one subsystem will confine an attacker to what is possible with Spectre exploits, as opposed to being able to read and write any memory directly.
But parent claims that with CPU vulns, the need for microkernel has never been greater. You're talking about disabling those mitigations. The need for general address space separation hasn't been changed by Spectre.
> CPU-level vulnerabilities and the performance-concerns are in-directly solved by hardware advancements/implementations given for free or optimising the OS to run on multi-core systems.
I don't get it. If these vulnerabilities involved leaking data across process boundaries, and the fixes make processes slow, how do more processes help?
Indeed, mitigations slow down boundary crossings between protection domains (kernel/user, user/user), which slows down context switches, the very thing that is performance-critical for microkernels.
I have to wonder if Rust's safety guarantees and approach to concurrency will allow some performance optimization across those boundaries. They seem like related problems.
There are some very smart research kernels based around only allowing managed, safe code to be run, so that language security does all the heavy lifting instead of hardware security. The problem then, though, is that every language unsafety bug is now a kernel bug that lets you read other processes' private keys or write to kernel memory.
Rust is supposed to be safer. There are still plenty of type system unsoundness issues or LLVM unsafety issues that leak through every so often.
You could replace virtual memory as an isolation mechanism with capabilities? That would allow unsafe languages like C to be used, as the hardware would enforce that capabilities are unforgeable. E.g. CHERI.
> the guarantees provided by any safe language are insufficient to prevent information leakage
i’m curious why?
i understand if a “safe language” provided unsafe capabilities (like marking a section with unsafe keyword to access registers or raw memory) then of course that’s a problem... but if you don’t have global variables and memory access, shouldn’t it be possible to cover it with just software?
maybe i’m missing something?
(and apologies if the answer is obvious ^^)
> but if you don’t have global variables and memory access, shouldn’t it be possible to cover it with just software?
Can you index out of bounds and catch the exception? You're attackable via Spectre. Can you run multiple threads and time operations? You're vulnerable to MDS.
The attacks don't rely on memory unsafety -- they rely on the state of the CPU being changed by doing fairly normal things, while an observer probes the state by doing other fairly normal things.
> Can you index out of bounds and catch the exception? You're attackable via Spectre.
ah, i see... ok, assuming you could never access out of bounds (all arrays must be typed by length), would that still apply? or do you mean any kind of exception handling that can be deterministically invoked? what if a hypothetical safe language had no exceptions?
> Can you run multiple threads and time operations? You're vulnerable to MDS.
would that more or less depend on whether hyper-threading is enabled? assuming you can choose which cpu you are using (one that is not vulnerable, unlike intel), would that negate needing hardware memory protection when used with a “safe” language?
> The attacks don't rely on memory unsafety -- they rely on the state of the CPU being changed by doing fairly normal things, while an observer probes the state by doing other fairly normal things.
yes, thanks. i was just wondering what the limits to that are in terms of “safe” languages vis-a-vis hardware memory protection/os etc...
Regarding performance, there is a really interesting discussion started by Martin Děcký: we spent decades optimizing hardware for monolithic kernels - maybe we could optimize it for microkernels to gain performance.
Not sure if anything happened there after the FOSDEM talk.
Nothing will happen until people start using microkernels. And people won't start using microkernels until companies make some basic optimizations for them.
There may be a way out of the loop by exploiting hardware parallelism, or if microkernels bring improvements to memory locality leading to orders-of-magnitude gains. But even exploiting memory locality, which could bring very high and obvious gains, is locked in the same kind of loop.
QNX and L4 have had competitive performance since forever. And they have been heavily used in embedded systems, especially QNX. Basically, in my mind the only thing that stood between QNX and total world domination was that it wasn't open source. Many flavors of L4 are open source, but they're just a piece of what you need for a complete OS.
Performance is a myth too: the performance of shared-memory monoliths became uncompetitive long ago, when they had to adapt to multi-core systems, where locality, batching and asynchrony play a very significant role. High-performance research in networking and distributed systems has basically converged on asynchronously communicating isolated processes.
It is quite the reverse: multicore systems mean that shared-memory monoliths need to be fast, and a lot of transistors are spent to make sure that they are. There is a reason that multicore CPUs do not look like networked machines.
Cache coherency is optimistic, message passing is pessimistic.
> There is a reason that multicore cpus do not look like networked machines.
Wait, what? I'm looking at an image of an unlidded Zen 2 CPU right now, and it's a bunch of discrete compute devices surrounding a switch. Inside these devices are multilevel caches, just like real-world network systems. The central "chiplet" multiplexes access to external buses over multiple serial PCI lanes; any network engineer would recognize this pattern as trunking. All of this looks exactly like a network.
No amount of transistors can make shared-memory monoliths fast on multicore systems, because the problem is an architecture unfriendly to such systems, not the hardware. Monoliths can be fast if they drop shared memory and use the actor model instead of atomics, mutexes and shared-memory programming, but then they would no longer be monoliths. Basically, what makes monoliths slow on multicore systems is also the thing that makes them not microkernels. I mean, if you split everything into actors, why even run all of them in the kernel? Group them into isolated OS processes.
Synchronisation between two cores is fastest with hardware support, so shared memory with cache coherence protocol.
Cache coherence protocol are all designed to keep everybody in sync. So if you scale to thousands of cores, you better sync selectively and application specific. Thus it makes sense to use message passing (actor model).
It is unclear where it makes sense to switch. Somewhere between 2 and 1000 cores.
Hybrids are also an option. Supercomputers often use OpenMP (shared memory) and MPI (message passing) in a two level way.
> There is a reason that multicore cpus do not look like networked machines.
They don't?
In practice, cache coherency is a correctness mechanism; actually using shared memory for concurrent reads and writes is problematic performance-wise. That's how we get scatter/gather-style systems, designed specifically to avoid concurrent R/W.
It is an interesting discussion; I'm excited to see a full OS in Rust, something I was unsuccessful in achieving with Java [1].
On the performance question, I have a new 64 bit system coming with 64 threads[2], and I'm really curious to experiment with non-relocating microkernel architectures (architectures where kernel features have a fixed address that doesn't change while running). This is something I dreamed about at Sun using a Dragon (64 thread SPARC machine) and Spring (the research OS) but it was too expensive to allocate one of those machines to our small group for OS research.
[1] Yes, there are OSes written (mostly) in Java, but at Sun we tried a completely Java OS and the safety features of the language prevented it. The 'controlled relaxation' of safety features in Rust here (as opposed to 'jump into native code, all bets are off!') are awesome.
[2] In theory I'm in the queue for the first arrival of an AMD TR-3990 for my sTRX motherboard. We'll see how that holds up. :-)
The working theory was the JVM was just a machine specification, which could, if you wanted, be implemented as an immutable processor (either a chip or a set of chips).
So if you can write a multi-user OS that only used the JVM (so it was 'Java all the way down') then you could build a Java chip for embedded systems, use a JVM for running it on a commodity processor, etc.
I had a lot of fun building the PiDP11/70 kit last year which is a PDP 11/70 built on top of simh on a Raspberry Pi. You can run BSD 2.1 on it which is decidedly Old School, but it (and RSX-11M+) are classical multi-user OSes. And that system has all the pieces of what we were trying to do with Java which is a virtual machine, compilers that converted code to run on that virtual machine, and an Operating system that can be compiled to run on it.
Congrats jeremy, looks fantastic! That video showing the minuscule boot time gives me hope that it is possible to write software that takes advantage of the crazy-fast computers we have these days.
LLVM is (at least currently) being dynamically linked to allow shipping multiple copies of LLVM (one for emscripten, one for everything else) without having to ship multiple entire compilers. Now that emscripten uses LLVM's built in wasm backend it's not needed any more, but currently support is still in the compiler. This is a PR to remove it: https://github.com/rust-lang/rust/pull/65703
I've been wondering about the same, actually. Bitcode likely needs Swift's LLVM or another Apple-provided LLVM fork, and you'd probably want to link it dynamically. I guess they are not intending to offer compilation to bitcode any time soon.
One of the standard tests of compilers is generally that they bootstrap: compile the compiler with the bootstrap compiler, compile it with itself, compile it with itself again, and finally compare the binaries. Any differences at that point are considered to be a bug.