Orange_slice: Research Kernel and Hypervisor in Rust (github.com)
126 points by adamnemecek 10 days ago | 32 comments





The coolest part of this, to me:

> This is going to be developed live? Yup. Check out My Youtube Channel or my Twitter. I announce my streams typically a few hours ahead of time, and schedule the streams on Youtube. Further, for streams I think are more impactful, I try to schedule them a few days out.

> I'm going to try to do much of the development live, and I'll try to help answer any questions about why certain things are being done. If this project fails, but I teach some people about OS development and get others excited about security research, then it was a success in my eyes.

> I have already scheduled a stream for an intro on Wednesday: Intro Video

https://github.com/gamozolabs/orange_slice#this-is-going-to-...


I wonder if they'd be interested in setting up a Patreon; some other regular dev streams I follow have them. This seems like impactful, important work.

I greatly appreciate the interest in support! I don't have a need for funding for projects like these, as I get plenty of enjoyment out of doing them on my own. For now I'd suggest helping the many other projects/blogs/charities out there! If my following grows I'd love to add a Patreon and directly feed any proceeds there back into the community via projects, free trainings, blogs, etc.

I'm happy enough with where I am in life, and I'd really like to just give back and teach others. I was fortunate enough to have a mentor at a very young age (and many other great teachers and leaders throughout life), and I recognize the huge advantage it gave me. If I can pass knowledge on to the next generation (and current generations of course), that's the best outcome I can hope for!


Would you mind sharing who you watch? I have found a few good ones but feel like there are loads I am missing!

Best video I have found to date is https://youtu.be/1rZ-JorHJEY but there are only 2


Harvard cs50 does live infosec CTF: http://www.twitch.tv/cs50tv/v/401612368?sr=a&t=2s

Lots of game developers everywhere.

Someone was building a compiler from scratch in a series and I can't remember the username, will add if I find it.


There is one really big technical issue here which I don't see answered on the site: given "it will be multiprocessing from day one", how are they going to handle data races deterministically and efficiently? This is the issue that has made all previous attempts to implement multicore record-and-replay quite inefficient.

My apologies, I wasn't very clear in my readme.

The intent is for the kernel itself to be multiprocessing (support for multiple cores and multiple VMs). But for now the plan is to run only a single core per VM, as getting VT-x or SVM to be deterministic on a single core will be hard enough by itself. If we accomplish this, then the next goal would be to maybe look at multiple cores.

It is important that I can run multiple single core VMs on a single machine as I plan to use this framework for fuzzing, and retrofitting multiprocessing to a kernel is always a source of many, many bugs.

Hope that clears it up a bit!


I saved a few papers on deterministic multithreading both to knock out concurrency errors and support formal verification. I'm not sure how helpful they are in your use case. I'm just dropping them in case they give you ideas:

https://people.csail.mit.edu/mareko/asplos073-olszewski.pdf

https://people.cs.umass.edu/~emery/pubs/dthreads-sosp11.pdf


What is multicore record-and-replay?

They say 5x. Is 5x unrealistic?

Nice. I'm sure you could do a lot of cool things once you have this working well by playing around with the "butterfly effect".

For example introduce a single X us jitter in the RTC interrupt and see how long it takes or which part of the machine state diverges from the baseline run.


"The end goal is a deterministic hypervisor, capable of booting Windows and Linux, with less than a 5x performance slowdown to achieve instruction-and-cycle level determinism for cycle counts and interrupt boundaries."

You've set some very lofty goals here, that is, you're pushing the bleeding-edge envelope; I hope you succeed wildly, but if you do not, then the next best thing to success is an "Adventurer's Journal". Sort of like you're an explorer, exploring a new land that no one has ever visited. By writing down journal entries of your exploration, you leave both a guide and a map -- for whoever next decides to explore this uncharted territory with your same goals. The history of computing is filled with people who gave us key stepping stones to advance technology in some form or another... If you aren't successful, then becoming one of these people is the next best thing...

Anyway, wishing you luck in your adventures, and I hope you succeed wildly!


Why Rust? (In general..)

Are C memory-related bugs still such a real problem?

IIUC, Rust can prevent these kinds of memory vulnerabilities. But it can't really prevent the millions of other types of vulnerabilities, such as misconfiguration, wrong logic in the code, races, etc.

Am I understanding this correctly?


> Are C memory-related bugs still such a real problem?

I believe so, yes.

https://www.zdnet.com/article/microsoft-70-percent-of-all-se...

> IIUC, Rust can prevent these kinds of memory vulnerabilities.

Yes: if your Rust code has memory errors, there's a bug in an "unsafe" block (~1% of the code in a project of mine, which I think is typical) or in the compiler.

> But it can't really prevent the millions of other types of vulnerabilities, such as misconfiguration, wrong logic in the code, races, etc.

It can prevent data races. It can't prevent all possible race conditions.

https://doc.rust-lang.org/nomicon/races.html
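
To make the distinction concrete, here is a minimal sketch. The function names `count_with_mutex`, `withdraw_racy`, and `withdraw_checked` are made up for illustration. The first function only compiles because the shared counter is wrapped in `Arc<Mutex<_>>`. The second compiles without complaint, yet it still contains a check-then-act race in its logic, which is the kind of race the compiler does not see.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment a shared counter from `threads` threads, `per_thread` times each.
/// Handing a plain `&mut u64` to several threads would be rejected at compile
/// time; wrapping the counter in a Mutex is what makes this accepted.
fn count_with_mutex(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// A race condition that is NOT a data race: the check and the update use two
/// separate lock acquisitions, so another thread may change the balance in
/// between. Every access is synchronized, so this compiles, but under
/// contention two callers can both pass the check.
fn withdraw_racy(balance: &Mutex<u64>, amount: u64) -> bool {
    let enough = *balance.lock().unwrap() >= amount; // lock is released here
    if enough {
        let mut guard = balance.lock().unwrap();
        let remaining = guard.saturating_sub(amount);
        *guard = remaining;
    }
    enough
}

/// Doing the check and the update under one lock acquisition removes the race.
fn withdraw_checked(balance: &Mutex<u64>, amount: u64) -> bool {
    let mut guard = balance.lock().unwrap();
    if *guard >= amount {
        *guard -= amount;
        true
    } else {
        false
    }
}
```

Single-threaded, `withdraw_racy` and `withdraw_checked` behave the same; the difference only shows up when several threads withdraw at once.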


> if your Rust code has memory errors, there's a bug in an "unsafe" block

Minor clarification around this part. It's possible for a bug in safe code to break an invariant that some correct unsafe code is relying on, if that safe code has private access to something that the rest of the world doesn't. For example, by changing only safe code inside of Vec, you could introduce a bug that set the wrong capacity. Since all the unsafe code in Vec assumes the capacity is correct, that would break everything and cause tons of UB. Vec is sound, though, because the capacity is a private field, and safe code in the caller can't change it. But it does mean that when we're auditing unsafe blocks, we might also need to audit the safe code in the same module, depending on what invariants the unsafe blocks are assuming.
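
A small sketch of the same idea, using a made-up `SmallStack` type rather than the real `Vec` internals. The only `unsafe` block is in `top`, but its soundness depends on the safe methods `push` and `truncate` keeping `len` in range.

```rust
mod stack {
    const CAP: usize = 8;

    /// A fixed-capacity byte stack (hypothetical type for illustration).
    /// Invariant relied on by the unsafe code below: `len <= CAP`.
    pub struct SmallStack {
        buf: [u8; CAP],
        len: usize, // private, so code outside this module cannot break the invariant
    }

    impl SmallStack {
        pub fn new() -> Self {
            SmallStack { buf: [0; CAP], len: 0 }
        }

        /// Safe code: refuses to grow past CAP.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len >= CAP {
                return false;
            }
            self.buf[self.len] = byte;
            self.len += 1;
            true
        }

        /// Safe code that upholds the invariant. Dropping the `if` and
        /// assigning `self.len = new_len` unconditionally would be a bug in
        /// purely safe code, yet it would let `top` read out of bounds.
        pub fn truncate(&mut self, new_len: usize) {
            if new_len < self.len {
                self.len = new_len;
            }
        }

        pub fn top(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: sound only because every method in this module keeps
            // `1 <= len <= CAP` at this point.
            Some(unsafe { *self.buf.get_unchecked(self.len - 1) })
        }
    }
}
```

Auditing `top` alone is not enough here: the review has to cover every function in `mod stack` that can write `len`.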


I guess that's debatable. Your point is that code that should be unsafe might not be, but then that would mean the unsafe code is wrong in a sense: it assumes an invariant that isn't enforced.

Some people describe it as unsafe code "infecting" its containing module. This section of the nomicon goes into more detail: https://doc.rust-lang.org/nomicon/working-with-unsafe.html

> But it can't really prevent the millions of other types of vulnerabilities, such as misconfiguration, wrong logic in the code, races, etc.

That's correct. The great achievement that Rust brings to the table is that it gives you the safety of a traditionally dynamic language like Java or JS with native performance.

Let's look at it from a security angle.

For some domains, the security that Rust provides is enough to create really safe solutions inside those domains. Think of PDF viewers, image viewers, browsers, etc. Here, it takes in untrusted input, parses it, and displays it to the user, and the codepaths where privileged stuff happens, e.g. reading arbitrary files, are only very sparse.

In some domains, Rust alone won't be enough. Think of TLS implementations or kernels. Here, almost any logic bug is also security relevant. Ideally you'd have computer proofs for their security. The language is largely irrelevant, all you need is support for your language in the proof tool.

Now, let's look at it from a development angle:

C requires you to manually malloc, free, and check whether the malloc was successful. Rust does that for you, and although there are no "fallible" allocators in the standard library, the language itself doesn't prevent you from writing your own containers etc. that do it, if you really need fallible allocators.
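
For what it's worth, newer Rust releases do expose fallible reservation on standard containers through `Vec::try_reserve` and `Vec::try_reserve_exact`. Below is a minimal sketch of using it; the helper name `try_zeroed_buffer` is made up for this example.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: build a zero-filled buffer and report an allocation
/// failure as an error value instead of aborting the process.
fn try_zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Returns Err if the request overflows or the allocator reports failure.
    buf.try_reserve_exact(len)?;
    // No reallocation happens here: the capacity is already at least `len`.
    buf.resize(len, 0);
    Ok(buf)
}
```

The caller gets to decide what an out-of-memory condition means, which is the same choice the `malloc` return-value check gives you in C.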

The bugs that Rust prevents are precisely those kinds of bugs that are most annoying to debug. I'm very glad that I don't have to deal with stuff like this [1] or this [2] any more. For the second bug, it wasn't even a crash, it just had weird behaviour. Classical nasal demons.

[1]: https://github.com/minetest/minetest/commit/57a461930ba13b0b...

[2]: https://github.com/minetest/minetest/commit/1f76808e4fa5a198...


> The great achievement that Rust brings to the table is that it gives you the safety of a traditionally dynamic language like Java or JS with native performance.

Plenty of other modern and old languages are equally fast and memory safe. Calling it a "great achievement" is unfair to other languages.


> Plenty of other modern and old languages are equally fast and memory safe.

Name three.

I don't know any language that is both memory safe and as fast as Rust. ATS might fit the bill, though from what I've read, it has rather exotic semantics and an even steeper learning curve than Rust.


Ada, D, Swift, .NET Native, OCaml, Haskell, Oberon-2, Modula-3, Active Oberon, Component Pascal, Sing#, System C#, Eiffel, Java (AOT compiled), Common Lisp, Swift.

> Ada, D, Swift, .NET Native, OCaml, Haskell, Oberon-2, Modula-3, Active Oberon, Component Pascal, Sing#, System C#, Eiffel, Java (AOT compiled), Common Lisp, Swift.

None of those are as fast as Rust, at least not in all cases [1]. Sure, they're within the same order of magnitude performance-wise, but software engineering is probably the only technical discipline which dismisses a factor of 2 as irrelevant – compare the automotive, aerospace, or energy generation industries, which fight for single-digit percent efficiency gains.

[1] I'm talking about idiomatic code, not turning-Haskell-into-an-even-uglier-version-of-C or similar.


Not as fast as Rust?

It seems to me you have some learning and profiling to do.

Hint, many of them do enjoy multiple implementations from several vendors and even had OSes richer than Redox available.

Also some of them do enjoy LLVM backends.


> It seems to me you have some learning and profiling to do.

Please, enlighten me. If you could link some benchmark results, it would be very much appreciated.

> many of them do enjoy multiple implementations from several vendors

Having multiple implementations doesn't make a language faster.

> and even had OSes richer than Redox available

Being able to write an OS in a language doesn't allow any conclusions about its performance.

> Also some of them do enjoy LLVM backends.

An LLVM backend doesn't necessarily make a language as fast as C or Rust. You could write an LLVM backend for Python and it would still be slow as molasses.


> Please, enlighten me. If you could link some benchmark results, it would be very much appreciated.

Well you just did a Rust Force HN post, bashing languages regarding performance, some of them like Ada, being in production since the early 80's, controlling human critical systems in real time.

Besides GNAT, there are PTC ObjectAda, GreenHills Ada, SCORE Ada, PowerAda, Tartan Ada all being used in automotive, aerospace, or energy generation industries, where zero lines of Rust code exist today in production.

But hey, Rust is faster. /s

> Having multiple implementations doesn't make a language faster.

Sure it does, it is a big performance difference to use JRuby or MRI Ruby.

> Being able to write an OS in a language doesn't allow any conclusions about its performance.

Slow OSes don't get users.

> An LLVM backend doesn't necessarily make a language as fast as C or Rust. You could write an LLVM backend for Python and it would still be slow as molasses.

You are quite right, after all D beats Rust when using LLVM in compilation speed.

It is all a matter of man power, willingness to pursue certain goals and skills.


> Well you just did a Rust Force HN post

You made very strong claims without evidence, I made a counter-claim, but it's me who is part of the "Rust Force"? Fine, have some evidence:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/... [1]

Rust is faster than Ada in some benchmarks and never slower than Ada in others. You scolded me for not doing enough profiling. So, where is your hard data?

> some of them like Ada, being in production since the early 80's, controlling human critical systems in real time.

Performance is not the only factor, and typically not the most important factor, when it comes to safety critical systems.

> Sure it does, it is a big performance difference to use JRuby or MRI Ruby.

And both are slower than Rust.

> Slow OSes don't get users.

The speed of an OS is not its only selling factor, especially for niche uses.

> You are quite right, after all D beats Rust when using LLVM in compilation speed.

You're moving the goalposts. The criteria were "as fast as Rust" and "as safe as Rust", not compilation speed. Yes, Rust is slow to compile, that's not the point.

[1] The benchmark game is not the best evidence, but it's better than nothing.


> Rust is faster than Ada in some benchmarks

Let's take the number of programs listed on the benchmarks game for a language implementation as some-kind-of indication of community effort; and note that twice-as-many Rust programs are listed as Ada programs.

> and never slower than Ada in others.

fannkuch-redux

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


Rustc is faster than GNAT Ada in some micro-benchmarks, most of them totally worthless for production code running in airplanes, medical devices, and train security systems.

If I am supposed to now start coding micro-benchmarks across all AOT compiled system languages against Rust, then I don't have any issue losing Internet points.

So now the point isn't anymore that a language cannot change execution speed by changing implementation, rather how it compares with Rust.


Which language do you have in mind?

> But it can't really prevent the millions of other types of vulnerabilities, such as misconfiguration, wrong logic in the code, races, etc.

It can prevent multiple types of data-races.

It has nullable types as an option (i.e. not the default), and not checking for null is a type error flagged by the compiler.

It has checked array-access, preventing out of bound reads.

It has immutability by default, preventing accidentally modifying data meant to stay constant.

Etc etc.

All of this together helps create higher quality, more secure code.

If you can have all those errors go away for “free”, why choose a language like C where all of these errors are possible?
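
A short sketch of the three points above (optional values, checked access, immutability by default). The functions `find_user`, `greeting`, and `total` are invented for the example.

```rust
/// Look up a user name by index. `Option<&str>` makes "no such user" part of
/// the type, so the caller cannot forget the missing case.
fn find_user<'a>(users: &[&'a str], index: usize) -> Option<&'a str> {
    // `get` is the checked accessor: an out-of-range index yields None.
    // Plain `users[index]` is also checked, but panics instead.
    users.get(index).copied()
}

fn greeting(users: &[&str], index: usize) -> String {
    // Using the Option as if it were a &str is a type error;
    // both cases have to be handled explicitly.
    match find_user(users, index) {
        Some(name) => format!("hello, {}", name),
        None => String::from("no such user"),
    }
}

/// Sum at most the first three values.
fn total(values: &[u32]) -> u32 {
    let limit = 3; // immutable by default: `limit += 1` would not compile
    let mut sum = 0; // mutation has to be opted into with `mut`
    for value in values.iter().take(limit) {
        sum += value;
    }
    sum
}
```

None of these checks need extra annotations from the programmer; they are the default behaviour of the language and standard library.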


Memory related bugs are quite common (I see many daily as a security researcher).

I have previous hypervisors and kernels I've written (in assembly, C, and Rust) [https://github.com/gamozolabs/falkervisor_beta and https://github.com/gamozolabs/falkervisor_grilled_cheese].

Memory corruption was my most common issue in these kernels, and when working with fuzzing and security research, my confidence in my own tooling got in the way of root causing bugs. I would sometimes find that bugs I "detected" in the code I was fuzzing were in fact due to corruption in my own code. This pushed me a bit towards finding a safer language to write my kernel in, not for security, but for code quality.

I was a pretty hardcore C fan and I never saw myself getting into a higher level language like Rust. However the cleanliness of the output code got me immediately hooked a few years ago. I do a lot of work on low-level development and optimizations, and having a compiler with predictable properties of emitted code is really important to me (such that I can have a decent idea in my head what the emitted code will be). Having allocators be scope based rather than garbage collected really helps with this, and helps with the usability of the language for kernel development.

Also you mentioned races as something Rust does not prevent, but it does prevent traditional "exploitable" race conditions, by enforcing that all types passed via message passing must be "Send" (safe to move to another thread) and all types shared via globals must be "Sync" (safe to share between threads). Using atomics or wrapping things in Mutexes is one way to make things sharable. However, Rust does not prevent deadlocks, which are fairly common as just "bugs", especially in kernel development.
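
As a rough illustration of the globals case, here is a sketch with a made-up `HITS` counter and `record_hits` function. A `static` has to be `Sync`, so an atomic is accepted where a `Cell<usize>` would be rejected by the compiler.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// A global visible to every thread. `AtomicUsize` is Sync, so this compiles;
// `static HITS: std::cell::Cell<usize>` would be a compile error.
static HITS: AtomicUsize = AtomicUsize::new(0);

/// Spawn `threads` workers that each record `per_thread` hits, and return how
/// many hits were added (assuming nothing else touches HITS concurrently).
fn record_hits(threads: usize, per_thread: usize) -> usize {
    let before = HITS.load(Ordering::SeqCst);
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..per_thread {
                    HITS.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    HITS.load(Ordering::SeqCst) - before
}
```

Nothing comparable exists for lock ordering: two threads taking two Mutexes in opposite order compile fine and can still deadlock.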

That being said, Rust has many things that I do not like, such as the clumsiness around working with generic arrays of >32 elements, and working with raw "plain old data". There's definitely a lot of research in modern Rust going into WebAssembly and other features that I have zero interest in personally, while some of the systems aspects can be a bit lacking. But, I am not personally contributing time to the Rust project, so I cannot complain too much. Still, it is full featured enough to write bootloaders and kernels in, and I have used it for all of my projects for the past few years.

TL;DR: Rust is fast, prevents many of the most common bugs (and many of the hardest to reproduce/fix bugs such as corruption/UaFs/etc), and has predictable codegen which is useful for optimization and systems development.


> and working with raw "plain old data"

> while some of the systems aspects can be a bit lacking

Could you elaborate a bit on those two points? I think it'd be very valuable feedback for the Rust devs.

(Also, thanks for all the effort going into teaching this kind of stuff <3)



