Hacker News
Writing a RISC-V OS in Rust: System Calls (stephenmarz.com)
201 points by azhenley 28 days ago | 34 comments

For inspiration, have a look at the Plan 9 syscall table:


This is how you make a simple yet flexible OS. Use an RPC protocol to do all the talking to objects within the system, which are exposed as files. It's very REST-like and allows any programming language to interact with those objects by reading and writing them as files. So you have a very uniform interface that eliminates a lot of binaries, since simple scripts can now handle the work of reading and writing those files. Then realize that you can easily export those files to other machines on a LAN or the internet. So the TCP stack of an internet-facing machine can be exported and mounted on multiple machines, which then all talk to the internet via that one machine's TCP stack. And that can even be done across the net to bypass "great firewalls". Bye bye NAT.
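To make the "objects exposed as files" idea concrete, here is a toy Python sketch. It is not Plan 9 code and does not speak 9P; `Namespace`, `mount`, `Led`, and the `/dev/led/...` paths are all made-up names. It only shows how a device's whole interface can collapse into plain reads and writes on named files.

```python
# Toy model of "objects exposed as files". Not 9P and not Plan 9 code:
# Namespace, mount, Led, and the /dev/led paths are hypothetical names.


class Namespace:
    """Maps file paths to (read, write) handlers supplied by a service."""

    def __init__(self):
        self._files = {}

    def mount(self, path, read=None, write=None):
        self._files[path] = (read, write)

    def read(self, path):
        reader, _ = self._files[path]
        if reader is None:
            raise PermissionError(f"{path} is not readable")
        return reader()

    def write(self, path, data):
        _, writer = self._files[path]
        if writer is None:
            raise PermissionError(f"{path} is not writable")
        writer(data)


class Led:
    """A device whose whole interface is two text files: ctl and status."""

    def __init__(self):
        self.on = False

    def ctl(self, data):
        command = data.strip()
        if command not in ("on", "off"):
            raise ValueError(f"unknown command: {command!r}")
        self.on = command == "on"

    def status(self):
        return "on\n" if self.on else "off\n"


ns = Namespace()
led = Led()
ns.mount("/dev/led/ctl", write=led.ctl)
ns.mount("/dev/led/status", read=led.status)

# Any client that can write and read text can now drive the device.
ns.write("/dev/led/ctl", "on\n")
print(ns.read("/dev/led/status"), end="")
```

In a real system the handlers would sit behind a file server, so the same two operations would work from a shell script or from another machine that mounts the tree.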

> Use an rpc protocol to do all the talking to objects within the system which are exposed as files.

At some point, you're just adding pointless layers of indirection by doing this. NT tried this approach, and it is anything but "simple".

You can do many things quite naturally by interacting with bytestream-like or block-like objects that are exposed via some sort of hierarchical namespace ("files"), but this doesn't really cover everything. Some things may be better done, even in a script, by interacting with a "binary" that in turn just wraps a binary library exposing some kind of ABI.

I don't disagree, per se, but designing for the exception is also a good way to end up with a mess. You have to carefully measure the impact (performance, maintenance, developer impact, etc.) of every special case.

I'm a n00b when it comes to this sort of thing. How would one upload a texture to a GPU, for example, using this? Open the "GPU driver file", seek to some specific location and write the texture bytes?

One way is to open the driver file and use an ioctl call to allocate a shared buffer and write your texture there.

But have you gained anything then? The application doing the ioctl still has to be acutely aware of how exactly to perform the ioctl, no? That is, I couldn't just `cat` a texture into my GPU.
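A small Python sketch of why an ioctl is harder to script than a plain write: the caller has to know both the request code and the binary layout of the argument. This uses `FIONREAD` on a pipe as a stand-in, since it works on a typical Linux or other Unix system without special hardware; a GPU driver would define its own request codes and structs, which are not shown here. `bytes_waiting` is an illustrative name.

```python
# ioctl needs a request code plus a binary argument layout agreed with the
# driver. FIONREAD on a pipe stands in for a real device here (assumes a
# Unix-like OS); bytes_waiting is an illustrative helper name.
import fcntl
import os
import struct
import termios


def bytes_waiting(fd):
    """Ask the kernel how many bytes are readable on fd via FIONREAD."""
    arg = struct.pack("i", 0)  # room for the C int the kernel fills in
    result = fcntl.ioctl(fd, termios.FIONREAD, arg)
    return struct.unpack("i", result)[0]


read_end, write_end = os.pipe()
os.write(write_end, b"hello")
print(bytes_waiting(read_end))
```

Compare that with a file-style interface, where `cat` or a one-line write would be enough and no struct packing is involved.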

Like WebDAV.

As far as I can remember, Inferno did syscalls differently, given that Limbo on the Dis VM was the main programming language.

They added a couple of pieces to help the VMs, but the syscall layer on Inferno is pretty Plan 9 inspired. A few of the man pages call out 9P explicitly. All of the Sys_* entries here are the system calls on Inferno: https://bitbucket.org/inferno-os/inferno-os/src/master/os/po...

Thanks for the hint.

>"Each mode has a different ecall cause. If we make an ecall in machine mode (the most privileged mode), we get cause #11. Otherwise, if we make an ecall in supervisor mode (typically the kernel mode), we get cause #9."

Could someone say: if the kernel runs in supervisor mode, what is machine mode in RISC-V intended for?
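For reference, the cause numbers in the quote can be sketched as a small decoder. This is a Python model, not code from the article: it assumes RV64, where the top bit of `mcause` flags an interrupt and the remaining bits hold the cause code. Code 8 (ecall from user mode) is not in the quote; it comes from the privileged spec. `decode_mcause` and `ecall_source` are illustrative names.

```python
# Sketch of decoding a RISC-V mcause value (RV64 assumed).
# decode_mcause, ecall_source, and ECALL_SOURCE are illustrative names.

XLEN = 64
INTERRUPT_BIT = 1 << (XLEN - 1)

# Synchronous exception codes for environment calls.
ECALL_SOURCE = {
    8: "user",
    9: "supervisor",
    11: "machine",
}


def decode_mcause(mcause):
    """Return (is_interrupt, code) for a raw mcause register value."""
    is_interrupt = bool(mcause & INTERRUPT_BIT)
    code = mcause & (INTERRUPT_BIT - 1)
    return is_interrupt, code


def ecall_source(mcause):
    """Name the mode that issued the ecall, or None if it was not an ecall."""
    is_interrupt, code = decode_mcause(mcause)
    if is_interrupt:
        return None  # same code number, but an interrupt rather than an ecall
    return ECALL_SOURCE.get(code)


for cause in (8, 9, 11):
    print(cause, ecall_source(cause))
```

The interrupt check matters because the code numbers are reused: code 9 with the interrupt bit set is a supervisor external interrupt, not a system call.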

Stuff like "this board/chip needs real-time response in order to not melt itself, and it can't trust the OS to be hard real-time", which is what System Management Mode was originally intended for.

It's also been handy for trying out supervisor mode concepts and for adding software-based support on chips that don't support all the features, like hypervisor extensions.

Right, afaiui you can basically trap on unimplemented instructions and emulate RISC-V extensions without hardware support (very cool imho)
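A toy Python model of that trap-and-emulate idea, using `mul` on a core without the M extension as the example. It assumes the hardware puts the faulting instruction bits in `mtval`, which the spec allows but does not require (a real handler may have to load the instruction from `mepc` instead). `Hart`, `encode_mul`, and `handle_illegal` are made-up names, and a real M-mode handler would be written in assembly, C, or Rust.

```python
# Toy trap-and-emulate: a "core" without the M extension traps on MUL and a
# handler emulates it in software. Hart, encode_mul, and handle_illegal are
# illustrative names, not a real firmware API.

ILLEGAL_INSTRUCTION = 2  # mcause code for an illegal instruction
MASK64 = (1 << 64) - 1


def encode_mul(rd, rs1, rs2):
    """Encode `mul rd, rs1, rs2` (R-type: opcode 0x33, funct3 0, funct7 1)."""
    return (1 << 25) | (rs2 << 20) | (rs1 << 15) | (0 << 12) | (rd << 7) | 0x33


class Hart:
    """Just enough machine state for the handler below."""

    def __init__(self):
        self.regs = [0] * 32
        self.mepc = 0    # address of the trapping instruction
        self.mtval = 0   # assumed to hold the trapping instruction's bits
        self.mcause = 0


def handle_illegal(hart):
    """Emulate MUL if that is what trapped; return True when handled."""
    if hart.mcause != ILLEGAL_INSTRUCTION:
        return False
    insn = hart.mtval
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    funct7 = (insn >> 25) & 0x7F
    if (opcode, funct3, funct7) != (0x33, 0, 1):
        return False  # some other illegal instruction; not ours to emulate
    rd = (insn >> 7) & 0x1F
    rs1 = (insn >> 15) & 0x1F
    rs2 = (insn >> 20) & 0x1F
    if rd != 0:  # x0 is hard-wired to zero
        hart.regs[rd] = (hart.regs[rs1] * hart.regs[rs2]) & MASK64
    hart.mepc += 4  # resume after the emulated instruction
    return True
```

From the point of view of the code that executed `mul`, the instruction simply took longer than usual; the lower-privilege software never sees the trap.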

this short video explained it well (to me anyways): https://www.youtube.com/watch?v=4JIvnWEs_pA

(hypervisor trap explanation at 11m20s mark)

Machine mode is the equivalent of EL3 on Arm. That's where processor firmware tends to run. There is no paging and very little in the way of security boundaries or seatbelts.

It was supposed to be where vendors can patch up differences and provide platform bindings. The (standard) Linux kernel expects to be started in S mode and doesn’t include M mode code. OpenSBI is an example of template firmware, and firmware like that is what belongs in M mode, not OSes.

I just wonder what chances any new general-purpose OS kernel may have against Linux.

I suspect that whatever replaces Linux will probably involve a paradigm shift at the lowest level. Perhaps capability-based IPC, perhaps seamlessly integrating local and networked nodes, etc. It’s going to take major major improvements to move the world off of the Linux kernel.

I imagine it's going to be an exokernel that can run pretty arbitrary userlands as if they were containers, built on capability-based security.

If anyone agrees with me and wants to help out making that happen on an open source code base, hit me up.

How different is this from the NT kernel? Or Fuchsia?

I don't have time to work on this, but I highly recommend looking at using the Erlang virtual machine as your exokernel environment.

Why BEAM? I know it’s good, just curious why you'd pick it for this specific use case.

I'm seeing a few people heading in this direction (and wrote very provisionally about possible wider applications here: https://mastodon.social/@mala/103548286708316253 ). I think it's certainly an idea that's in the air.

Have you looked at https://github.com/nanovms/nanos yet? What you are describing is fairly close and this is under active development.

Yes, that's definitely where things should be going. Definitely interested, though I don't have too much time to dedicate to an open source project.

Is Qubes OS what you are talking about? I heard it can run Windows or Linux like containers.

Qubes = Xen hypervisor

What kind of help do you need?

You could be right, but it's interesting to note that Linux took off because it was actually trying not to be too ambitious. I can't find it now, but I think Linus's original email announcing it said something along the lines of it just being a toy project for experimentation, and not a "proper" OS like the one GNU is building. As you're probably aware, the kernel of the GNU OS, Hurd, involved an ambitious microkernel design, and now 30 years into its development it's still not usable.

That has more to do with the people involved than with it being a microkernel.

You just described a hybrid of OS/400 and Plan 9.

Only with the deep pockets of FAANG-like companies.

Android or ChromeOS would never have gotten off the ground without Google continuing to push them forward no matter what. In both OSes the kernel is hardly exposed to userspace, so Google could move to something else and it would be hardly noticeable to app developers (OEMs and rooted phones are another matter).

And then there is Fuchsia.

Outside the general-purpose domain, the IoT world is filling up with BSD-licensed OSes, RTOSes, Zephyr, NuttX, Tizen RT.

Ironically, Zephyr is managed by the Linux Foundation, although it doesn't have anything to do with Linux code.

That would be the wrong battle to fight first. Pick one special-purpose market it can win and take that. Rinse, repeat.

Dog knows IoT needs something a lot better than Linux to run on; fat flabby old-school hacker-culture footguns are absolutely not an option when smart tech is being built into everything and everyone. That tech must be small, hard, absolutely focused, and take ZERO shit from NO-ONE; starting with itself.

Remember, the only code which can be 100% guaranteed impossible to compromise is code that isn’t there. Today’s Linux kernel is around 20 MILLION LOC, and counting. No prizes for figuring what any competing product’s USP must be.


“General-purposeness” designed into the product at the start enables it to adapt to each new market quicker and easier, but it shouldn’t be a public selling point in itself. Because the moment you start pitching it as being “everything to everyone”, everyone wants to throw their own crap into it as well, and you’re back to building “another Linux”, in a market that already has a resident Linux and no room or need for another. Think smart. Don’t covet all the things that Linux does well and try to replicate those. Instead, identify the things Linux doesn’t do well (and CAN’T ever do well because of why and how it’s built) and knock those balls clean out of the park.

Think about how Steve Jobs finally beat Microsoft at its own game. Not by building a better PC, but by fundamentally redefining what “Personal Computing” means; and then being first to market with the perfect product for that.

I agree: Trying to create a new general-purpose OS is futile. You're handicapping yourself by attempting a feat which likely can't achieve anything beyond your own satisfaction, which will be limited by the aforementioned self-handicapping. It's like sublimating your musical drive in a knock-off, unoriginal rock band in the vague hope you'll someday become the next Nickelback. Go for the gusto and put the Research in Research OS. Make something to explore an idea mainstream OSes can't pursue because it would break too much compatibility. Be Captain Beefheart, or at least Tool.

Fuchsia has the advantage that Google can force it onto a bunch of phones, Chromebooks, and personal assistants and get instant "market share".

I think it's likely the #1 threat to Linux's dominance on the backend, ironically by hijacking the front end first. Based on market share, it could create an expectation that AWS, Azure, etc. provide backend support. It doesn't hurt that Google owns K8s as well. See, for example, EKS. Google can manipulate Amazon in select spaces.
