Hacker News new | comments | show | ask | jobs | submit login
Robigalia: seL4 and Rust (robigalia.org)
129 points by xj9 478 days ago | hide | past | web | 63 comments | favorite



Hi, project lead/main/pretty much only developer ( ;) ) here.

I didn't plan on posting this for another week :( There's a bunch of stuff that still needs doing, especially with docs and tutorials. The writing on the website is only a first draft. There are also no easy tasks for new contributor onboarding at this point in time, which I think is really important before anouncing a new project.

This largely started as a learning project by me and a few of my friends at uni last semester. A lot of people were interested in it, so we (I) decided to take it somewhere.

Here's the announcement email draft:

https://lists.robigalia.org/pipermail/robigalia-dev/2016-Jan...


Very cool. Are you planning to write device drivers in Rust initially or something else like incorporating a rump kernel?


Drivers will be pure Rust. Someone else can port a BSD to run on seL4 if they want.


Was there any discussion over implementing a hypercall interface for the userland and writing drivers in Rust that are rumpkernel-ready? Instead of writing yet another OS-specific network stack, be able to reuse the Rust drivers for BSD and all other rumpkernels.


None of us have experience with rump kernels, but I'd be very interested in seeing what is needed to make that work. Right now the drivers we're writing are parameterized over hardware-access with traits. If a rump-ready driver could work in that existing framework, that'd be ideal.



A short sentence about WHY would be helpful for people not knowing about seL4 or rust.


This is awesome very good combination. seL4 is formally verified and then slap a memory safe and performant user-level code on top.

More info about L4 micro-kernel:

https://sel4.systems/Info/FAQ/

Now user-level here means drivers as well. So that would be interesting how it would work with Rust. Will it all default to unsafe code and just end up with pointer twiddling and memory overwrites and all the same errors. I hope not.

Currently it seems that the verification mode had DMA turned off and for SMP they run a kernel instance on each core and then send messages to each other via shared memory. That probably means not load balancing or not scheduling?


Re: low-level Rust, I've been really enjoying following this series:

http://os.phil-opp.com/

It shows a variety of ways to safely wrap unsafe code and then provide strongly-typed safe interfaces to the hardware, no pointer twiddling required. A particularly good example is the section "Some Clever Solution" in the "Page Tables" post.


Ooh nice guide. Thanks for the link!

I was looking at the printing to screen section:

http://os.phil-opp.com/printing-to-screen.html

It is so helpful to have the explanation of what is going and why the code looks the way it is, what are some alternatives and why those wouldn't work.


Sorry, I think I downvoted you by accident. HN's mobile interface makes scrolling risky... :)


We've all been there :).


Multicore support + verification is in the works, see https://ssrg.nicta.com.au/publications/nictaabstracts/Peters...


Create a secure POSIX-compatible userland on top of seL4.

This goal fills me with nerd lust. Getting ourselves a secure foundation to build on doesn't solve all our problems (not by a long shot), but such a drastic reduction in attack surface would be an incredible achievement.


>> Create a secure POSIX-compatible userland on top of seL4. > > This goal fills me with nerd lust.

I remember a L4 talk in which the speaker showed how more than 95% of the code of new OSes where devoted to being POSIX compatible and yet they 1) lacked a lot of features 2) missed many edge cases 3) had to modify their beautiful kernel design to accommodate POSIX use cases.

I think the only way forward is to do like Plan9 did. Just forget about POSIX and implement your own set of primitives (but first study what has been done in the last 40 years!). POSIX compatibility can come later with 80/20 library that maps POSIX calls and behaviour to the new primitives.

And maybe POSIX is not needed at all. If your OS is interesting enough people will add #ifdef'd code for your platform, just like many UNIX-y applications do now for Android, OS X or Windows.


FWIW, OS X is UNIX 03 certified: http://www.opengroup.org/openbrand/register/brand3612.htm

Android, for these purposes, is, of course, Linux with a different libc, so most of the POSIXy stuff is there, even if bionic is pretty awful.


Then add NetBSD drivers as rump kernels to get it running on real hardware. True, drivers in Rust would be even better, but IIUC, the whole point of using a microkernel is that the drivers would be running in user space with restricted privileges.


It helps in fault containment which helps availability primarily. Without IOMMU, they still can wreak havoc. So, drivers shown to be safe is still beneficial. I've so far seen that done with Ada/SPARK, w/ static analysis, Java w/ driver partly in JVM, and most recently ATS language. Rust would make a nice experiment.


Not just that, but I worry about IOMMU bugs. A safe driver would help mitigate those.


Good point. Few think about that. Hardware will periodically fail, especially on deep-submicron processes.


I was just recently wondering if an LSTM neural net could be trained as a decent decompiler from LLVM IR to Rust source. Feed that with Clang output of an established OS and coreutils(i was thinking minix3 but maybe seL4) and get a working if not necessarily idiomatic Rust OS. Refactor at leisure.


Like other commenter said, it would be unreadable. You'd be better off using a transformation system like Semantic Designs DMS or Stratego language with heuristics. Series of rules, analyses, and transforms applying until new code exists.

Real problem with your scheme is that the typing and structure are important to effectively using Rust. Those that you get in LLVM will be nothing like what you would want in Rust and you'll need additional information. In other words, source to source with annotations is a better option.


Why would you ever do that? You'd end up a bunch of unreadable rust code wrapped in a gigantic unsafe {} block.


The LSTM would be trained against an existing high quality Rust code base and it's IR; so I wouldn't expect a bunch of unsafe literal translations.

Whether the C side IR has enough debug metadata or could be annotated automatically would be the make or break.


Perhaps unrealistic, but a brilliant concept.


It's the same foolish concept people have been reinventing since early secure UNIX's hosted on security kernels. Long story short, it was impossible without breaking applications and lots of staff back when UNIX had "dozens" of system calls. It's not going to work this time.

The solution high assurance security adopted was to virtually or physically isolate a UNIX/Linux stack that interacted with security-critical software that ran directly on microkernel and through mediated interfaces. This resulted in commercial work like INTEGRITY-178B and LynxSecure with Nizza, Perseus, and OKL4/OKLinux in OSS space. Look up "Mikro-SINA VPN" (yes with a k) for great case study. Middleware like Camkes was developed for integration. Similarly, kernels like INTEGRITY and PikeOS developed safe/secure runtimes that ran directly on kernel for Ada and Java.

What this project and all others like it need to do is this:

A. Drop all efforts at rebuilding UNIX in language X.

B. Get a user-mode version of Linux running on it with user-space drivers on microkernel for one, easily-acquired board.

C. Redirect safe language efforts on creating a secure runtime on the microkernel.

D. Create middleware tools along lines of Protocol Buffers, ZeroMQ, or LANGSEC stuff. These should allow seemless, efficient integration between untrusted (esp GUI) stuff in Linux VM and trusted stuff in safe lang uage runtime.

E. Begin on a POSIX-like runtime whose components can be stripped on a case-by-case basis. Focus on syscalls used by critical libraries and services first.

F. The current goal of full POSIX. 99% dont make it to E. If they get here, then great. If to E, then we get at least some apps. If to C and D, then we at least have trustworthy runtime.

That's my two cents worth and builds on what's historically happened in these efforts.


> A. Drop all efforts at rebuilding UNIX in language X.

... OK, but then you say "full POSIX" later. So how does that differ?

> B. Get a user-mode version of Linux running on it with user-space drivers on microkernel for one, easily-acquired board.

How is this relevant to what happens later? The whole goal is to not rely on the Linux kernel.

> C. Redirect safe language efforts on creating a secure runtime on the microkernel.

There aren't "safe language efforts". There is a safe language that already exists, and the effort has already been expended to make that exist. The project is creating a secure runtime on the microkernel using that safe language.

> D. Create middleware tools along lines of Protocol Buffers, ZeroMQ, or LANGSEC stuff

You mean like serde? This is already done for the Rust ecosystem.

> E. Begin on a POSIX-like runtime whose components can be stripped on a case-by-case basis. Focus on syscalls used by critical libraries and services first.

How is this not what this project is doing?

> F. The current goal of full POSIX. 99% dont make it to E. If they get here, then great. If to E, then we get at least some apps. If to C and D, then we at least have trustworthy runtime.

... in other words, exactly what the project is doing?


Re A

You did notice how I went from an extremely hard task (full POSIX) to a series of them that gradually deliver more and more useful functionality over time if resources and labor permit? First goal is a runtime. Next is key POSIX features (80/20 rule). Next is full POSIX. If you cant do a runtime, you cant do POSIX.

Re B

Linux kernel is a user-mode API that gets you lefacy apps inmediately. You gradually replace the syscalls with your own and eventually get off it. Meanwhile, critical stuff runs outside Linux VM on a Rust runtime directly on microkernel or other trusted components. WAY better than just Linux kernel.

Re C

No, there are dozens of projects trying to make OS'S or runtimes in safe languages. There's closer to 100 that started as learning projects then petered out when hard reality set in. Dead link after dead link. Hence, I said it in plural as my recommendation and gradual progression applies to most. Including a Rust project.

Re D

Im not a Rust user so that's new to me. Good to know they have it. Thanks for tip.

Re E

You addressed that point in isolation. The context is a series of steps with gradual increase of difficulty and intermediate deliverables. Starting at Step 5 is clearly not my recommendation.

Re F

And Step 6 follows Step 5 if author had skill and resources to achieve it. Starting at Step 6, like 5, dramatically increases odds it will be another entry in archive.org rathet than protecting real systems.

More to point, I can name many systems that tool my route to success even on a budget and some resisting NSA pentesting w/ source. Maybe it's ny broken memory but I can't name even one project that implemented POSIX in a safe language clean slate. Hence, my recommendation.


I didn't understand the parent comment either. For what it's worth: this is the first project I've seen that really made me want to start playing with Rust.


It's pretty simple: almost all efforts to duplicate POSIX in a safe language failed. The commercial practice is porting a POSIX or Linux API in C to a virtualized partition that runs on microkernel. Safety-critical apps run either directly on the microkernel or on a minimal, language-specific runtime running directly on kernel. Keeps TCB tiny. Overall app/system is split with untrusted components in POSIX/Linux system & security-critical stuff in the minimal partitions. Unlike safe POSIX, this is a proven model done by many small teams.

In this case, it would be a RTOS or user-mode Linux, plus a custom runtime or target for Rust apps, and IPC/middleware to connect the two. Way easier to build. POSIX comes later if effort comes along.


And for the Rust half, that's exactly what's in our roadmap and introduction :) Robigo (the POSIX environment) is a long-term stretch goal so that I can run my own OS on a RISC-V laptop of my own devising and still run traditional UNIX programs like GNU. I think we're in violent agreement.


Im on mobile and didnt realize you were project head when I first replied. Id have been more helpful. My bad.

Im glad your doing it gradually with intermediates. That will let it benefit spinoff projects or even products hopefully. Plus keep you from getting overloaded.

I also like that RISC-V is in your long-term vision. Keeping that in mind early will help avoid architectural mistakes early on. Id say use the ISA closest to it in meantime so little change is necessary.

Far as overall scheme, I'll email you some reference work if your interested in prior work on safe runtimes, split apps, high-security dev techniques, and so on. Might be able to integrate that into idiomatic Rust. That's your call though.


I'd be very interested in that! I don't have a lot of experience in this area, it's just something I'm interested in and started hacking away at. Email is in my profile.


You've got it exactly right. AFAIK Linux can already be run in userspace on seL4, and there is also a VMM.

This is a different project.


Nobody said it was same project and it would be orthogonal to others. My claim is redoing POSIX will likely fail while first doing a Rust runtime for seL4 kernel is much easier and straightforward to use. Essentially, like Nizza security architecture or MILS kernels but with Rust for apps that run directly on microkernel.


Yeah, Graphene on seL4 might be a more practical path for running legacy apps. http://graphene.cs.stonybrook.edu/


That's interesting. Ill have to look at it on desktop. Seems backwards so far though: says it's implemented on top of Linux. My model has a stripped Linux running in user or guest mode on top of a microkernel. See OKLinux, L4Linux, Nizza Security Architecture, or LynxSecure's Linux for examples. Then you just need middleware which becomes a source file or library for your apps.


This is exciting. Sounds like exactly what Gernot Heiser was talking about: a sufficiently safe language for userland on seL4. Rust was the first that came to mind for me.


Gernot is a smart guy. I learned a lot from him on this stuff. Moreover, the ideas Im pushing here were described and implemented by his people at OK Labs then deployed in millions of phones. Also used in desktops and servers by US, UK, and German firms.

Rust is a good choice given its adoption. Only if they focus on a runtime instead of POSIX, though.


I know the website isn't terribly clear right now (it was posted a week earlier than I planned), but the first goal listed is exactly this.


In order to support more than the unikernel or embedded use case I hope they'll take in the beloved rump kernels as it's unlikely to maintain an own copy of network and FS drivers, not speaking from graphics.


As I mentioned in another comment, someone else can port BSD if they want. Our goal is pure-Rust. We're already working on NIC support and a TCP/IP stack.


Great! Once there's enough of this working, build a bootable router image on it for some existing small router. That will get it some use. Maybe even a secure router.


I don't fully understand why. The idea of microkernels is to mitigate instability/insecurity from a lack of memory safety. This is exactly what rust actually provides.


seL4 provides many more guarantees than that:

" In short, the implementation is proved to be bug-free (see below). This also implies a number of other properties, such as freedom from buffer overflows, null pointer exceptions, use-after-free, etc.

There is a further proof that the binary code which executes on the hardware is a correct translation of the C code. This means that the compiler does not have to be trusted, and extends the functional correctness property to the binary.

Furthermore, there are proofs that seL4's specification, if used properly, will enforce integrity and confidentiality, core security properties. Combined with the proofs mentioned above, these properties are guaranteed to be enforced not only by a model of the kernel (the specification) but the actual binary that executes on the hardware. Therefore, seL4 is the world's first (and still only) OS that is proved secure in a very strong sense.

Finally, seL4 is the first (and still only) protected-mode OS kernel with a sound and complete timeliness analysis. Among others this means that it has provable upper bounds on interrupt latencies (as well as latencies of any other kernel operations). It is therefore the only kernel with memory protection that can give you hard real-time guarantees."


I think I missed the correctness proof for the rust compiler and runtime.


pcwalton said a correctness proof of the type system is in-progress [1].

[1] https://news.ycombinator.com/item?id=7601944


It is. But that will still only prove whether the type system can correctly prevent memory safety problems. As many security people correctly point out, memory safety doesn't guarantee security of the programs you write in Rust. seL4 has a stronger security proof that can guarantee not only that seL4 doesn't have (for example) use-after-frees but that it also actually works right (for a formally-specified notion of "works right").

The reason you'd want to combine seL4 with Rust is to give confidence that (a) the core part of the system (microkernel) works right and that (b) the less core parts of the system (drivers, libc) are least free from nasty memory safety problems. To a first approximation, it's a compromise between a project like the OS in the "Writing an OS in Rust" series, which guarantees memory safety but doesn't guarantee that any part of the system actually does the right thing, and a hypothetical fully formally-verified OS, which is (with current technology) something that would be way too expensive and time-consuming to be practical.


Great explanation!


Exactly. It's same reason high assurance requires source-to-object code traceability even for SPARK Ada. Of course, it's nice the tooling helps with that. Rust needs same thing plus either proven or zero runtime target. Unless it has it already.


I have plans slated for doing proofs about Rust programs in 2017, if not sooner (I'm interning at Data61 on the seL4 verification team starting in two weeks, and if I don't get a chance to work on Rust verification then, I'll start after I'm done)


Oh hell, you going to be working with the top talent. You might learn some high assurance skills after all. That makes your project 2x as interesting. Good luck on the new job. :)


Rust has long been capable of running without a runtime (that's how all OS development in Rust is expected to operate) and formal verification of the language spec is ongoing (though full verification is still probably years out, and who knows if the compiler will ever be formally verified).


Hell yeah! Appreciate the updates. My idea there was a runtime or overhead came from an early OS attempt in Rust where they couldnt get past real mode. Had to use non-Rust code for that and it was using too much space IIRC. They've gotten past those obstacles now?


You need about 100 lines of assembly to get to a Rust kmain in long mode.

Rust has effectively no runtime these days. Same amount as C or C++, basically.


That's great to hear. Thanks for chiming in with the good info as always.


Any time. If you're interested in what exactly that code is, check out Phil's blog, linked upthread.


I'm guessing you mean this:

http://os.phil-opp.com/entering-longmode.html

Interesting stuff and seems like a good description of the code. Bookmarking it in case I do more in this area.


A security-focused microkernel does three things (simplified): isolation in time and space of various partitions; mediated communication between them; facilitate management of hardware resources. Note that nothing on this list prevents damage within a partition, just between them. So, one still needs to secure the partition's code and data against attack if it must be trusted.


Is there any concern about the level of effort required to formally verify a kernel being a roadblock to continuous improvement (e.g. 64 bit support).


No. See any of the papers, eg http://ssrg.nicta.com.au/publications/nictaabstracts/Elphins.... The maintenance effort is quite reasonable. They've managed to evolve, and continue to evolve, the kernel even with the ever more stricter properties proven about it.


Yes, that can be a serious problem for high assurance software. The best explanation comes from Steve Lipner at Microsoft here:

https://blogs.microsoft.com/cybertrust/2007/08/23/the-ethics...

The system he refers to is here:

http://www.cse.psu.edu/~trj1/cse543-f06/papers/vax_vmm.pdf

The seL4 team certainly built tools to make the job easier. However, their work takes even more effort than the development of some early designs. Lipner and others indicated it took several quarters to turn a major change on a product. Schell's company usually quoted 2 years per new design based on their highly-assured GEMSOS kernel. The STOP OS, INTEGRITY-178B, and VxWorks MILS similarly moved slowly in new features due to assurance requirements. Now, most of these were evaluated against all kinds of security and functional criteria whereas seL4 just did development assurance: other stuff for seL4 is ongoing research at NICTA while finished for alternatives. So, in some ways it's less work and others it's a lot of work.

So, rule of thumb is that you will have these issues: less features, slower development, slower time-to-market (critical weakness), slower maintenance, less HW support, possibly worse interface/management, and performance hits potentially. Also, worst part is that anything you apply high assurance to that hasn't been dealt with in CompSci requires fresh R&D then the actual work. That with time-to-market and feature gap were huge, deal-breakers in the past.

So, all that might be fine depending on the nature of the problem. It might not even all show up in the project. Yet, you should assume it because it showed up in many projects and products before. Still does. It's just that high assurance systems are precise, labor-intensive, custom developments that also require uncommon to rare expertise. No wonder seL4 is still behind even ancient QNX in what it can pull off and is actually behind the separation kernels in functionality. Assurance target just makes it that much harder. Maybe labor available, too.


Great project, best of luck with it.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: