Hubris – A small operating system for deeply-embedded computer systems (oxide.computer)



I really like this quote from the manual:

<<There are a class of "ideal attractors" in engineering, concepts like "everything is an object," "homoiconicity," "purely functional," "pure capability system," etc. Engineers fall into orbit around these ideas quite easily. Systems that follow these principles often get useful properties out of the deal.

However, going too far in any of these directions is also a great way to find a deep reservoir of unsolved problems, which is part of why these are popular directions in academia.

In the interest of shipping, we are consciously steering around unsolved problems, even when it means we lose some attractive features.>>


I wonder if the attributes of Hubris and similar systems -- real-time, lack of dynamism -- will become "ideal attractors" for developers not working in problem domains where these things are absolutely required, especially as the backlash against the complexity at higher layers of the software stack continues to grow. In other words, I wonder if a sizable number of developers will convince themselves that they need an embedded RTOS like Hubris for a project where Linux running on an off-the-shelf SBC (or even PC) would work well enough.


Right now we are going in the opposite direction. Web developers on HN refuse to learn proper embedded programming, and instead stack abstraction on top of abstraction with MicroPython and use Raspberry Pis for every job under the sun.

It is a shame that Arduino/AVR never bothered implementing support for the full C++ library. If the full power of C++ were available to the end user, then perhaps alternatives like MicroPython would be less attractive.


On the contrary, the experience isn't the same: with MicroPython you drop a .py file over USB and you are done.

There is a REPL experience, and Python comes with batteries, even if MicroPython's are tinier.

Python is the new BASIC on such hardware.

Also, the Arduino folks don't seem to have that high an opinion of C++: https://www.youtube.com/watch?v=KQYl6th8AKE


Your metaphor that "Python is the new BASIC" stems from desktop computers' use of BASIC to teach beginning programming skills on largely 8-bit machines.

Once you started with BASIC you presumably moved on to learning assembly language; as many of these machines gained C compilers, you might have tried to obtain one.

The entire point of MicroPython is to be a friendly introduction, prototyping platform, or learning platform. In no way does MicroPython take full advantage of the hardware, nor could it ever directly talk to hardware.

One should not treat all programming languages the same, as they have different purposes, and Python is not fit for the purposes C or C++ is fit for, e.g. manual memory allocation. The number one lesson a beginning embedded programmer should take away from Arduino is that controlling hardware is about writing specific bit patterns to memory locations. Sorry, this is not something Python can do or was designed for.

Deeply embedded means “embedded Linux won’t suffice”

Your car's braking system had better not be a MicroPython program.

There are actual safety-proving regimes in which code is formally proven, and the Python interpreter itself will not come close to passing, as its complexity is too large. (Formal verification is the search term you seek.)


Yeah, when somebody says something like 'deeply embedded', the platform that comes to my mind is the Dreamcast VMU, which has a CPU that (AFAICT) doesn't even have a C compiler yet. ("C compiler....the idea was abandoned."--https://dmitry.gr/?r=05.Projects&proj=25.%20VMU%20Hacking) I doubt something written in Rust would be adequate for such a CPU.


> Once you started with BASIC you presumably moved to learning assembly language, as many of these machines gained c compilers, you might have tried to obtain one.

I would have guessed quite a lot of people went from BASIC to Turbo Pascal. But you're talking 8-bit machines; maybe that was only available for 16-bit and up?


"Arduino folks" are defined by their limited engagement, so are not reliable arbiters of the value of tools in the toolbox.


If you bothered to watch the linked video, you'd see the Arduino folks are the ones that produce the tools in the box.

They only picked C++ because C would be even worse, and it provided an easy way to have their Arduino-like language without creating their own compiler.

They have no plans to ever provide proper C++ support.

So if that makes you jump, go watch a Dan Saks talk about C++ adoption in the embedded domain, or not.


They picked C++ because it's a low-level-capable language that has a reasonably familiar syntax, and their primary training wheels are these simplified Arduino libraries.

The actual runtime framework with its “setup” and “loop” methods is a reasonable proxy for an RTOS or the framework an experienced embedded developer would have built as a general runtime system.

There's nothing wrong with Arduino, except that its SPI SD card library won't give you good bandwidth; but that's because they wanted it to be an understandable, simple access library for SD cards, and you will need to go further if you want reasonable performance.
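For what it's worth, the setup/loop shape carries over almost unchanged to bare-metal code in other languages too; here is a minimal conceptual sketch in plain Rust (no real HAL or framework API; init_peripherals and poll_inputs are made-up placeholders):

    // Conceptual sketch of the Arduino-style structure: one-time setup,
    // then an endless polling loop. Function names are placeholders.
    fn init_peripherals() {
        // configure clocks, pins, timers, ...
    }

    fn poll_inputs() {
        // read sensors, update outputs
    }

    fn main() {
        init_peripherals(); // Arduino's setup()
        loop {
            poll_inputs(); // Arduino's loop()
        }
    }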


I think MicroPython is a boon to beginners in the space. They have the option to go deeper when the projects run into limitations.


I would tend to disagree because MicroPython is so abstracted that it resembles writing regular Python on a server more than it does anything embedded.

Just as an example, the WiFi setup resembles a server far more than an ESP32 with esp-idf. All you do is give it the connection details, and MicroPython seems to handle the details like trying to reconnect in the background. It's not far off from what systemd-networkd or similar provides. esp-idf forces you to handle that yourself, and to think about what you want to happen in that situation.

MicroPython also doesn't support threads afaict, so you don't even have to handle scheduling threads.

I like MicroPython as a way to run Raspberry Pi like stuff on the cheap, and it's a great learning tool in that sense, but you're still too far from the hardware to really be learning about embedded systems.


So like BASIC Stamp and Propeller?

Or like C64, Atari, ZX, Amiga, TI BASIC?


Sure, that's happening, but some of us are also tempted to use Rust for applications where an easier to learn, more popular language would be good enough.


This doesn’t seem like an apt comparison. Using the wrong tool for the job is expensive forever; learning a tough language pays dividends forever.


I think you are right, and I would add Turing incompleteness to that list. If your problem isn't Turing complete, then a Turing-complete language is actually probably not the best tool for the job. Incomplete languages actually give you more leverage than a Turing-complete language in that case. Completeness is a feature of a language, and like all features there are tradeoffs. The ability to express more kinds of problems comes at the cost of not being able to leverage the contours of a particular problem (e.g. monotonic data, no cycles) to improve things like speed, debuggability, and parallelization. Giving up completeness can enable cool things that seem completely out of reach today: rewinding the state of your program to debug production errors, modifying programs as they're running, automatically parallelizing computations across an arbitrary number of cores, or querying the provenance of any variable in your program.

Datalog's incompleteness, for example, allows it to resolve queries faster than a complete language like Prolog, due to the simplifying inferences it can make about the code.
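As a toy illustration of that trade-off (a sketch in Rust, not tied to Datalog or any real system): if the language you expose to users has no loops and no user-defined recursion, evaluation is guaranteed to terminate in time proportional to the size of the program, which is exactly the kind of property you give up the moment the language becomes Turing complete.

    // A deliberately non-Turing-complete expression language: no loops,
    // no user recursion, so evaluation always terminates.
    enum Expr {
        Const(i64),
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
    }

    fn eval(e: &Expr) -> i64 {
        match e {
            Expr::Const(n) => *n,
            Expr::Add(a, b) => eval(a) + eval(b),
            Expr::Mul(a, b) => eval(a) * eval(b),
        }
    }

    fn main() {
        // (2 + 3) * 4
        let e = Expr::Mul(
            Box::new(Expr::Add(Box::new(Expr::Const(2)), Box::new(Expr::Const(3)))),
            Box::new(Expr::Const(4)),
        );
        assert_eq!(eval(&e), 20);
    }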


If the predictions are true that we will see more and more specialized hardware due to the end of Moore's Law, then we will see more OS services just looking like services running on separate processors. Special-purpose hardware doesn't need a batteries-included operating system. We could argue whether a modular OS still counts as general purpose, but I'll let you guys do that.

With IPC, latency becomes the elephant in the room. An RTOS can't remove that, but it can help.


My Android phone is full of processes talking to each other over Android IPC, including the drivers themselves.

It is already more common than monolith defenders think.


I doubt it. Real time adds its own flavor, and a small OS doesn't come with everything you might want. It's useful when it's what you want, not when you don't want something else.


Containerisation already goes a way towards this: each executable is bundled with what it needs and nothing else. And the next step is one app per hardware unit, where you have maybe a minimal stripped-down OS that just launches your app, effectively a container in hardware. I think Google does this a fair amount.

The jump from this to RTOS is large, though. The abstractions are different. The limitations are different. You probably need to rewrite anything you need. And what do you gain? Mostly only predictable latency, and the ability to run on very limited (but cheap) hardware. Which you need why?


Also, massively reduced energy consumption if you choose the right hardware (e.g. ultra low power microcontrollers).


You can get a lot of the way towards that without needing an RTOS though.


> Which you need why?

If I'm selling a million Tamagotchis I'd rather use the 5 cent part over the 50 cent one and pocket the extra $.45M. A $5 part is a nonstarter.


> Containerisation already goes a way towards this, each executable is bundled with what it needs and nothing else.

Maybe in theory, but in practice most people still ship an entire OS in their containers (most of the time it's Alpine and it's not too big of a deal, but too many times it's an entire Debian!)


That impulse will occur as they try to do more and more complex things with systemd.


> In the interest of shipping, we are consciously steering around unsolved problems, even when it means we lose some attractive features

Huh, this is more like pragmatism than "Hubris".


We'd like to think so, but it's really hard to convince people that writing an entire new OS in Rust is the pragmatic choice. The name is a good reminder of the balance there, in my humble opinion.


To which I would add that naming a system "Pragmatic" may well be an act of unspeakable hubris, one that the gods would surely punish with a plague of architectural astronauts...


Haha, pulling a bit of reverse psychology there?


Memento Mori


"Hubris" is supposed to be a name, not a description.


But names are often (supposed to be) descriptive.

At least originally; then the thing so named may drift away from what it originally was (intended to be).

Or sometimes the name is self-deprecatingly ironic / sarcastic, which I'd guess was the case here. So kind of "descriptive-but-in-reverse".


We are getting an increasing number of interesting Rust operating systems for different uses.

- Hubris for deep embedded

- Redox OS for Desktop/Server (https://www.redox-os.org/)

- Tock for embedded (https://www.tockos.org/)

- Xous for trusted devices (https://xobs.io/announcing-xous-the-betrusted-operating-syst...)

I assume there are more.


We also built a Rust framework called FerrOS (https://github.com/auxoncorp/ferros) atop the formally-verified seL4 microkernel.

It looks like it has a similar set of usage idioms to Hubris, in terms of trying to set up as much as possible ahead of time to assemble what's essentially an application-specific operating system, where everything your use case needs is put together at build time as a bunch of communicating tasks running on seL4.

We recently added a concise little persistence interface that pulls in TicKV (https://docs.tockos.org/tickv/index.html) from the Tock project you referenced above, and provisions are being added for more dynamic task handling based on asks from an automotive OEM.

Also, sincerest h/t to the Tock folks.


Looks like a really cool idea to build on the formally verified C code and build everything else in Rust on top.

I definitely think all the embedded Rust projects, and even others, will end up sharing lots of code.

I would like to spend some time working Datomic-style on a more powerful immutable DB that could run on different KV interfaces. Lots of configuration should be more immutable.


If you can talk about it, how does that play out with standards like AUTOSAR and its C++ requirements?


I'm not sure I understand the question exactly. I'm familiar with AUTOSAR (both "classic" and Adaptive) as well as with ISO 26262. Happy to answer, just not quite sure I fully understand the question.


It requires C++17 and has been merged with MISRA.

So how does a car manufacturer use Rust and eventually be compliant with AUTOSAR/MISRA certification?

It would be interesting to know if there is some movement on that front.


I've seen this merger of AUTOSAR and MISRA C++ mentioned a bunch around the internet, but I haven't been able to find any details about the new standard anywhere.

Do you know where I could find more info? Have I just been looking in the wrong places?


Okay, just to clear up some potential confusion between us, and for any onlookers. You probably already know all this, but I'm just writing down the context through which I'm interpreting your question, so please bear with me for a moment...

MISRA and AUTOSAR are both consortiums in the automotive ecosystem. They're both industry standards bodies, not regulatory standards bodies (more like Khronos than ISO). The former worries mostly about defining best practices for convention and code style for primarily C and C++. The latter worries about defining a spec and interfaces for a middleware and application framework that's supposed to help automakers have a common portable (between MCUs) mechanism for developing, aggregating, and deploying ECU software. What's been merged from AUTOSAR to MISRA is basically AUTOSAR's codification of C++ subset and best practices into MISRA's version of C++ subset and best practices.

In the automotive world, the certification standard that ultimately matters is ISO 26262. It would be possible to write MISRA compliant code that still didn't qualify under ISO 26262 because of the process by which that code was developed. Likewise, it would be possible to have an AUTOSAR spec implementation that didn't qualify under ISO 26262 for similar reasons. It's also entirely possible to deploy software in cars that is neither MISRA compliant nor running on AUTOSAR. This is because ISO 26262 is 99% about process requirements and only about 1% technology prescription. Additionally, ISO 26262 is a self-certification regime almost everywhere in the world. It's a liability mitigation for when things eventually go wrong, not necessarily a direct regulatory hurdle that must be cleared. Lastly, ISO 26262 provides for the ability to offer evidence in support of deviations taken from the latest version of the standard's prescriptions and/or technology callouts.

Alright, now with all that out of the way...

With respect to seL4 and FerrOS. AUTOSAR Classic is for MCUs. seL4 and thus also FerrOS are not. They're for application processors. That said, it would be trivial to write FerrOS applications that could communicate over the protocols that are commonly defined for AUTOSAR Classic, so that you could have a FerrOS-built ECU interacting with the rest of your AUTOSAR-built ECUs. It would be likewise fairly trivial to create a basic HAL, carve out a slab of resources, and host an AUTOSAR Classic environment on one or more FerrOS tasks. It would just be kind of an odd thing to do given what kinds of things AUTOSAR Classic is used for and not entirely clear what the ultimate value would be. What I could imagine some day would be someone wanting to create an Adaptive AUTOSAR (a different AUTOSAR standard that's for higher-level applications and compute resources like ADAS and IVI) implementation atop something like seL4 via FerrOS (especially in the future when FerrOS supports bundling "applications" that aren't written in Rust).

As that relates to Rust in cars generally: that's a matter of risk tolerance and evidence investment. Rust has no ISO-qualified toolchain, though Ferrous Systems has organized a project around the premise of getting Rust's development process, or some curated subset of it, to the point of producing the documentation and evidence of being so qualified. However, until that time it's incumbent on the user of the non-qualified toolchain to provide the certification evidence for the tools and technologies used. So I expect that, for the companies who are dabbling in Rust, this is likely what they're planning on doing.


Thanks for taking the time to clarify; looking forward to improvements in the domain.


Microsoft is a bit schizophrenic in Rust's adoption.

On one side of the fence you have Azure IoT Edge, done in a mix of .NET and Rust.

https://msrc-blog.microsoft.com/2019/09/30/building-the-azur...

On the other side you have the Azure Sphere OS with a marketing story about secure IoT, yet the SDK is C only,

https://docs.microsoft.com/en-us/azure-sphere/app-developmen...

To the point they did a blog post trying to defend their use of C,

https://techcommunity.microsoft.com/t5/internet-of-things-bl...


> To the point they did a blog post trying to defend their use of C,

I do a lot of Rust work, but C still occupies a prominent place in my toolkit.

In fact, this is fairly standard among all of the professionals I know. The idea that companies need to abandon C and only do Rust as soon as possible is more of an idealistic notion in internet communities.


Sure, but they should be honest about it instead of marketing Azure Sphere OS as an unbreakable castle, when it happens to be built on quicksand.

Additionally, they could also allow all GCC languages like Ada and C++, given GCC is the SDK toolchain, instead of being C only, if the security marketing for Azure Sphere is to be taken seriously.


> Microsoft is a bit schizophrenic in Rust's adoption.

It's only "schizophrenic" to the degree that you expect an organization of thousands of individuals to behave like a single human mind.


I expect the business unit that sells "secure" IoT devices to follow the guidelines from the Microsoft Security Response Center, but that is expecting too much, I guess.


Expecting a Profit Center to follow instructions from a Cost Center is akin to expecting the US Congress to act on advice from the General Accounting Office.


Generally when MSRC files security issues, they tend to get fixed quickly and across not just the current release but previous ones too.


It is a matter of who calls the shots, and what happens when guidelines are ignored.

Given your description, apparently in the US government anyone can get away with not following them; no wonder a US company acts accordingly.


> Microsoft is a bit schizophrenic in Rust's adoption.

This is pretty famously not limited to Microsoft's views on Rust; it probably happens at any sufficiently large company, though Microsoft is the tech company that gets discussed the most. At a previous employer, one division was gung ho on Go and React, while another was all Java and Ember.


My point was more related to the views on security than Rust.

Microsoft Security advises 1. managed languages, 2. Rust, 3. C++ with Core Guidelines.

Then this group comes out selling "secure IoT" devices with a C-only SDK.


I don't know the product, but you could certainly make a "secure IoT" device with C. Formal verification is again the search term you seek. System states, with state machines being a good metaphor. Array bounds checking doesn't really matter if it's a statically allocated array of a fixed, known size. Arguably dynamic code vs. static code is a bigger issue than language choice, and you see static code here in Hubris, and they explain why, so it's not just about Rust but about the system design.


Interesting that MS would adopt Go or Java. I would have thought that everything in the GC'd language space would have to be C#. Is Java common there?


To clarify, my previous company (the one which had Go and Java) is not Microsoft


Ah, I misread the context


>Redox OS

What is interesting about this OS other than that it is made with Rust? I mean, is there some interesting architecture, or new exciting features that are not in the Windows/Unix world?

My question is whether Redox OS is more similar to the other hobby OSes we see here, or something more well thought out like Plan 9, Fuchsia, or Singularity. Is there a link where someone who is not a Rust fanboy (but maybe an OS fanboy) reviewed/described it in more detail?


Redox is Plan9-inspired except it goes a little further with some of the core concepts.

Instead of everything being a file, everything is a scheme (basically a URL).


I thought Unix was “everything is a file” and Plan9 was “everything is a filesystem”?


Maybe so. I guess the point is, the concept that different types of resources need different protocols is baked in, rather than picking one type of abstraction and applying it to every type of resource.

https://doc.redox-os.org/book/ch04-10-everything-is-a-url.ht...

There are some filesystem-like usage patterns built on top of that which are universal, but they're more limited.

https://doc.redox-os.org/book/ch04-06-schemes.html


On Unix "everything is a file", until sockets came to the party.


Rust is the language of choice for dApps on Solana as well (which I thought was an interesting choice.)


That is completely true. Rust is where things are going for cryptocurrencies that need to have very secure smart contracts and scale up to millions of users with tens of thousands of transactions per second.

Even if it is completely unrelated, it is at least production ready and, in terms of Rust projects, will be used more than these projects would ever be.


Rust in Cryptocurrency is mostly a marketing play (and I say this as someone who does a lot of Rust).

Are there even any cryptocurrencies that allowed less-safe languages like C in the first place?

In my opinion (as a Rust dev), Rust is weirdly over-complicated for what they're trying to do. Common typed scripting languages are basically more than sufficient (and safe) for these applications.

It's only a matter of time before a crypto project overplays the safety of Rust and then has a huge heist due to a logic bug, which will further contribute to jokes about Rust programmers. Most of the Rust devs I know are wary of Rust crypto projects.


We need to distinguish nodes written in Rust vs smart contracts.

Given the context, assuming smart contracts, you may have a point. Often consensus will be much slower and a limiting factor in execution. Solana may be a bit different here in its efforts to parallelise independent transactions.

High level provable languages always seemed like a good idea to me for smart contracts. As you say Rust doesn't necessarily seem like the sweetspot for this.

The Ethereum EVM assembly is wildly unsafe but regularly used for the sake of gas and other things that would be impossible otherwise (most hilariously string manipulation). Solidity is unsafe with respect to things like overflow. It doesn't have memory unsafety in the traditional sense, partly because it is allocate-only and you don't have to deal with bounds checking yourself.


If that includes C++ then basically every cryptocurrency that’s not written in Rust (Bitcoin, Ethereum, Monero…)


Rust won't protect smart contracts from logic bugs.


It depends on how much logic and/or arithmetic you can get away with encoding into the type system. We abuse the heck out of it to restrict things like register & field access, state-machine transitions, and also track resource allocation/consumption. That said, it's incredibly painful to develop anything that way, and it also doesn't ultimately prevent a different problem of the "model" you've written down in the types being wrong. So, it's not a panacea, and it's incredibly difficult, but it can winnow down the surface area of potential problems and bugs... or at least move them to compile-time.
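For onlookers, a rough sketch of what "encoding state-machine transitions in the type system" can look like in Rust (purely illustrative, not any particular project's code): each state is its own type, transitions consume the previous state, and an invalid transition is a compile error rather than a runtime check.

    // Typestate sketch: states are types, transitions consume `self`.
    struct Idle;
    struct Armed;

    impl Idle {
        fn arm(self) -> Armed { Armed }
    }

    impl Armed {
        fn fire(self) -> Idle { /* do the thing */ Idle }
        fn disarm(self) -> Idle { Idle }
    }

    fn main() {
        let s = Idle;
        let s = s.arm();
        let _s = s.fire();
        // Idle.fire(); // does not compile: `fire` only exists on Armed
    }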


I don’t think anyone said it would.


> Rust is where it is going for cryptocurrencies that need to have very secure smart-contracts

There is an implication here that Rust will help make smart contracts "secure", but AFAIK the vulnerabilities in smart contracts have been in their logic, not in their memory/type safety or whathaveyou.


I have an embedded real-time control project that is currently written in Rust, but runs with RTIC (https://rtic.rs/), a framework which is conceptually similar (no dynamic allocation of tasks or resources) but also has some differences. RTIC is more of a framework for locks and critical sections in an interrupt based program than a full fledged RTOS. Looking through the docs, here's the main differences (for my purposes) I see:

1. In Hubris, all interrupt handlers dispatch to a software task. In RTIC, you can dispatch to a software task, but you can also run the code directly in the interrupt handler. RTIC is reliant on Cortex-M's NVIC for preemption, whereas Hubris can preempt in software (assuming it is implemented). This does increase the minimum effective interrupt latency in Hubris, and if not very carefully implemented, the jitter also.

2. Hubris compiles each task separately and then pastes the binaries together, presumably with a fancy linker script. RTIC can have everything in one source file and builds everything into one LTO'd blob. I see the Hubris method as mostly a downside (unless you want to integrate binary blobs, for example), but it might have been needed for:

3. Hubris supports Cortex-M memory protection regions. This is pretty neat and something that is mostly out of scope for RTIC (being built around primitives that allow shared memory, trying to map into the very limited number of MPU regions would be difficult at best). Of course, it's Rust, so in theory you wouldn't need the MPU protections, but if you have to run any sort of untrusted code this is definitely the winner.

Hubris does support shared memory via leases, but I'm not sure how it manages to map them into the very limited 8 Cortex-M MPU regions. I'm quite interested to look at the implementation when the source code is released.

Edit: I forgot to mention the biggest difference, which is that because tasks have separate stacks in Hubris, you can do blocking waits. RTIC may support async in the future but for now you must manually construct state machines.
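To make that last point concrete, here is a framework-free sketch of the difference (all names are placeholders, not Hubris or RTIC APIs): with a per-task stack you can simply block in a loop, while without one the same wait has to be written as an explicit state machine that something else polls or advances.

    // Placeholder primitives, not a real API.
    fn byte_ready() -> bool { false } // imagine polling a status register
    fn read_byte() -> u8 { 0 }

    // With its own stack, a task can just block until data arrives.
    fn read_blocking() -> u8 {
        loop {
            if byte_ready() {
                return read_byte();
            }
            // in a real system: yield / wait-for-interrupt here
        }
    }

    // Without a dedicated stack, the wait becomes an explicit state
    // machine that an interrupt handler or executor steps forward.
    enum ReadState {
        Waiting,
        Done(u8),
    }

    fn read_poll(state: &mut ReadState) {
        if let ReadState::Waiting = state {
            if byte_ready() {
                *state = ReadState::Done(read_byte());
            }
        }
    }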


> Hubris does support shared memory via leases, but I'm not sure how it manages to map them into the very limited 8 Cortex-M MPU regions.

What I did in a similar kernel was dynamically map them from a larger table on faults, sort of like you would with a soft fill TLB. When you turn off the MPU in supervisor mode you get a sane 'map everything' mapping, leaving all 8 entries to user code.

The way LDM/STM restart after faults is amenable to this model on the M series cores.
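For anyone curious, the "soft fill" control flow is roughly the sketch below (hypothetical Rust pseudocode; the region table, slot count, and write_mpu_slot are invented for illustration and not any real kernel's API):

    // Hypothetical soft-fill of MPU regions on a memory fault, in the
    // spirit of a software-filled TLB. Only the control flow is shown.
    struct Region { base: u32, size: u32, attrs: u32 }

    const MPU_SLOTS: usize = 8;

    struct Task {
        // The task may be allowed more regions than the MPU has slots.
        allowed: &'static [Region],
        next_slot: usize, // round-robin eviction
    }

    // Invented helper: program one hardware MPU slot (register writes).
    fn write_mpu_slot(_slot: usize, _r: &Region) {}

    fn on_mem_fault(task: &mut Task, fault_addr: u32) -> bool {
        if let Some(r) = task.allowed.iter()
            .find(|r| fault_addr >= r.base && fault_addr - r.base < r.size)
        {
            // Legal access: load the region into a slot and restart the
            // faulting instruction (LDM/STM restartability makes this work).
            write_mpu_slot(task.next_slot, r);
            task.next_slot = (task.next_slot + 1) % MPU_SLOTS;
            true
        } else {
            false // genuine protection fault: kill or notify the task
        }
    }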


Neat, I didn't know that the MPU fault handler was complete enough to allow for restarts.

Now that the source is available, I took a look at what Hubris does - it is not actually anything fancy, just a static list of up to 8 MPU regions per task [1].

It seems that leases aren't actually shared memory, but rather just grant permission for a memcpy-like syscall [2]. This is slightly better than plain message passing as the recipient gets to decide what memory it wants to access, but is still a memcpy.

[1] https://github.com/oxidecomputer/hubris/blob/8833cc1dcfdbf10...

[2] https://hubris.oxide.computer/reference/#_borrow_read_4
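So the shape is closer to a kernel-mediated copy than to shared mapping. A hypothetical sketch of that interaction (names and signature invented for illustration; see the linked reference for the actual Hubris syscalls):

    // Hypothetical lease/borrow-style read. The kernel validates that the
    // client's lease covers the requested range, then copies the bytes
    // itself; the server never gets a mapping into the client's memory.
    struct TaskId(u32);

    fn borrow_read(_client: TaskId, _lease: usize, _offset: usize, _buf: &mut [u8]) -> usize {
        // In a real kernel this is a syscall that performs the memcpy.
        0
    }

    fn handle_write_request(client: TaskId) {
        let mut chunk = [0u8; 64];
        // The server decides which part of the leased buffer it wants,
        // one bounded chunk at a time, but each read is still a copy.
        let n = borrow_read(client, 0, 0, &mut chunk);
        let _data = &chunk[..n];
    }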


That's interesting, and it's pretty neat what you can do with Cortex-M MPUs. Leases seem an interesting twist on regular message passing.


I don't think they link the binaries. It's more like, put them each on executable flash in separate places and the kernel just calls them.

The intent here seems to be that each binary has no need (and no ability) to get all up in another binary's business. Nothing shared, except access to the RTOS.


It doesn't need to link any symbols, but I believe it does need to do relocations if the code isn't PIC, and to relocate the task's statically allocated RAM.


Your link does not appear to work, maybe this one [1] is intended instead? I can't resolve that name, at least.

[1] https://rtic.rs/


Maybe they want it to be possible for closed source and open source tasks to be mixed.


In general we plan on making the system as open sourced as we possibly can, so there's no specific thought currently being put into closed source tasks. While it could work, the build system doesn't actually support that at all right now.


I assumed you guys wouldn't do this, but thought this could lead to a larger adoption in this space.

Alternatively I thought maybe you needed to have some closed source component from a vendor and could only include it like that.


Yeah, I hear you on the adoption thing for sure, though to be honest, right now we are laser focused on shipping Oxide's product, and so if nobody else but us uses Hubris, that is 100% okay. As our CONTRIBUTING.md mentions, we aren't yet at the stage where we're trying to grow this as an independent thing.

It's true that vendors are likely to be an area where we have to deal with some things being closed source, though we're not simply accepting that as a given. This is one reason we're writing so much of our own software, and also, poking at the bits that must remain closed: https://oxide.computer/blog/lpc55


I definitely didn't mean it as a feature request - blobs aren't actually that common in embedded (esp32 and some motor driver libraries are the most common exceptions), so I don't think it's important for adoption. In fact, not supporting it enables future ergonomics improvements and code sharing between tasks, so I appreciate that it's not a driving factor in the design.


> esp32 and […] are the most common exceptions

Well, nearly anything having to do with wireless is typically blobs :/

Even Nordic has blobby SoftDevices, though you don't have to use them since Apache NimBLE exists (and Rubble in Rust, though that's usable for advertising only for now).


Now that it's open, how open are y'all to MRs? I want to port it to a few archs, but I'm not sure whether to hard fork or try to upstream.


We spent a ton of time trying to strike a balance in our CONTRIBUTING.md. Basically, we are happy to get PRs, but at the same time, we reserve the right to ignore them completely at the moment. We're trying to focus on shipping our product, and so are unlikely to be able to spend time shepherding PRs that aren't directly related to that. It's not you, it's us. For now. :) So yeah, we love to see the activity, and please give that a try if you'd like, but it's unlikely we'd merge new arches upstream at this point in time.


Word, makes sense. One of the major reasons why I'm interested in hubris in the first place is the strong opinion I have that systems code particularly should have a use case more important than "hey, look what I can do with this systems code". Lack of spoons on y'all's part kind of comes with that territory.


In addition to everything that steveklabnik said, it would be interesting to know which architectures you're eyeing, as some are much more modest (e.g., other Cortex-M parts/boards) than others (e.g., RISC-V). Things get gritty too with respect to I/O, where variances across different MCUs and boards can run the gamut from slight to dramatic. As steveklabnik points out, we are very much focused on our own products -- but also believe that Hubris will find many homes far beyond them, so we're trying to strike a balance...


I was eyeing RISC-V M/U-Mode with PMP. That's the closest thing to the semantics of Cortex-M from a memory protection perspective I can think of that's in common use still, plus I've got ESP-C3 and K210 dev boards laying around. I've been wanting to use them in my home automation, am cursed with the knowledge of what a nice RTOS feels like, and well, those yaks won't shave themselves.

Sounds like I should plan to do that on my own at the moment, but I'll keep it in a github style fork in case y'all's focus moves in that direction.


Check out Xous OS; they are doing this kind of thing for RISC-V and want to use the PMP.


> instead of having an operating system that knows how to dynamically create tasks at run-time (itself a hallmark of multiprogrammed, general purpose systems), Cliff had designed Hubris to fully specify the tasks for a particular application at build time, with the build system then combining the kernel with the selected tasks to yield a single (attestable!) image.

I worked briefly at John Deere, and their home-grown operating system (called "JDOS", written in C) also baked every application into the system at compile time. This was my only embedded experience, but I assumed this was somewhat common for embedded operating systems?


It's been a long time since I've worked in that world but in the micro-controller world it is common.


Yeah, it seems like they're conflating high-level OSes with low-level RTOSes, perhaps comparing their OS to a general-purpose OS while doing embedded / RTOS things. Take:

> Precedence for the Hubris approach can be found in other systems like library operating systems, but there is an essential difference: Hubris is a memory-protected system, with tasks, the kernel, and drivers all in disjoint protection domains.

It's great that the drivers run in different memory domains. Still, I'm not convinced this is that different from many RTOSes. Zephyr RTOS, for example, can do memory-protected tasks and defines each task's stack statically.


Yeah, it's common for an RTOS. All the things in automotive (OSEK, AUTOSAR) worked the same, and OSes like Green Hills INTEGRITY can also do full isolation between modules.


Can anyone explain to a non-server person what Oxide hopes to accomplish? Is it basically just a new server with its own OS that makes it more secure?


Their target market is essentially the private cloud space combined with turnkey rack space - a pretty common on-premise setup where you order not individual servers, but complete racks that are supposed to be "plug and play" in your DC (in practice YMMV; I spent a month fighting a combination of manglement and Rackable mess).

You can think of the final product in this case as a pretty big hypervisor cluster that is delivered complete. I'll admit more than once I'd kill for that kind of product, and I suspect that the price/performance ratio might actually be pretty good.

The operating system in this case is used for the internal service processor bits (compare: MINIX 3 on the Intel ME, whatever proprietary RTOS is on AMD's PSP, etc.) that help keep the whole thing running and shipshape.


Bingo. My guess is this is for control plane microcontrollers


Bryan has also complained in interviews about how many microcontrollers are already on your motherboard and how few of them Linux really controls. It's all proprietary and god knows what's actually running in there (and how many bugs and security vulnerabilities they have).

None of those are a great situation for multitenant scenarios.

This doesn't have to be control plane only. It could also be IO subsystems.


That said, Oxide doesn't put as much control in the hands of the owner as Raptor, but Raptor doesn't provide a high-integration rack like that :<


It is primarily being used for the root of trust as well as our service processor, aka "totally not a BMC."


>a pretty common on-premise setup

I have been wondering if it will become a thing in some cloud hosting services as well. I guess we need to see their pricing.


Depending on the market, I would be totally unsurprised to see some cloud providers using turnkey racks (though they might usually have nicer deals with places like Quanta), and Oxide could definitely strike some contracts there, though the question is how it would mesh with the existing setup.


Just to be clear, this OS, Hubris, is for the service processor. It's an OS for firmware, not the main OS that will run on the CPU.

However, they will likely ship with something derived from Illumos and the bhyve hypervisor. You can then provision VMs through the API (likely integrated with tools like Terraform or whatever). You will likely not interact directly with Illumos.

It's basically an attempt to make running a data center easier.


Pretty much "let's redo datacenter hardware from the ground up for current requirements, cutting off legacy things we don't need anymore"


But the BMC would be the #1 item on my list of "things I don't need any more". How do you come up with a from-scratch, legacy-free universe that still includes BMCs?


Because BMC is a term for a function (which turns out to be very useful and important), not a specific technology (I like the tongue-in-cheek "totally not a BMC" used by some people from Oxide).


You don't need low-level remote management anymore? Or what specifically are you associating with the term "BMC"? (i.e. for me, "BMC" is "turn it off and on and force it to boot from network, remotely")


Correct. The larger my installation becomes, the less I care about the state of individual machines. The ability to promptly remediate a single broken machine becomes irrelevant at scale.


But at scale you now have more and more machines that are going offline. That, to me, tends to push the organization more and more towards having something doing this sort of management. And without a BMC-like system, that means more in-person work, which again, at scale, becomes a real cost burden.

It sounds to me more like at the scale you are at you are no longer the person making sure that individual computers are still running, and so are forgetting that this job needs to be done.


So if a machine behaves oddly/goes away and its OS doesn't respond, you don't want the management plane to be able to redeploy it/run hardware checks/... automatically?


If you put it that way you make it too simple. The question is whether I want a second, smaller computer inside my larger computer that may at any time corrupt memory, monkey with network frames, turn off the power, assert PROCHOT, or do a million other bad things. It's not just a tool with benefits. It has both benefits and risks, and in my experience the risks are not worth those benefits.


But we are talking in the context of a project which specifically aims to do these things without the baggage of other platforms. And these things are BMC functions.


So, you manage computers using crashcarts like in the ugly old days of PC servers?

I thought that had considerable issues with scaling...


I guess you thought wrong. What do you think this hardware tech is doing in this video?

https://youtu.be/XZmGGAbHqa0?t=224

Men with carts scale perfectly. One guy can manage 1000 machines. 2 guys can manage 2000. Perfect scaling.


I see no crashcart there, only hardware maintenance, while crashcarts are where they belong, in the trash heap of history (yeah, Google servers have BMCs).

(For reference: a crash cart is a cart with monitor, keyboard, mouse, cabling for them, and possibly external drives to help you install base OS, used for when you have no remote management)


That's exactly what they are doing. They are removing as many things from the BMC as possible. It only contains a few things: it boots, hands over, and allows for some remote control of the low level. That's it.


No, it's a rack level design with the target market being not companies that buy single servers and fill racks with them but hyperscalers that need to fill whole datacenters with servers.

Basically there is a huge range of scale levels where having hardware on-prem makes sense financially.

Also, Bryan Cantrill has some sort of personal nitpick with modern servers basically being x86 PCs in a different form factor, and with the fact that in modern servers hardware and software do not cooperate at all (and on some occasions, hardware gets in the way of software).


> but hyperscalers that need to fill whole datacenters with servers.

I strongly doubt this is aimed at Amazon, Google, or Microsoft (hyperscalers). They all already have their own highly customized hardware and firmware. If that is their target I wish them luck. There’s no margin and a ton of competition in that space and as long as they’ve been working on this that feels like a pretty poor gamble.

What I believe this is actually targeting is small enterprise and up. A company that has dozens to thousands of servers. They’re willing to pay a premium for an easier go to market.


There's a big "turnkey rack" market, where multiple servers might be delivered as complete racks and are supposed to be already wired up and everything.

All ranges of business except very small turn up in those purchases.


> I strongly doubt this is aimed at Amazon, Google, or Microsoft (hyperscalers).

Indeed, that is not aimed at those hyperscalers.


They are pretty much the "only" hyperscalers. The only two you could add are possibly Alibaba and Tencent Cloud.


I think IBM (softlayer), hetzner, and ovh would disagree. They may not have the breadth of services but they measure their scale in datacenters, not servers.


They are certainly large, with their own datacenters, along with Oracle, which is growing fast. But they are not hyperscalers, at least not by the first and common analytical definition of it. And no one in the same industry (Linode or DO) would think of SoftLayer, Hetzner, and OVH as hyperscalers. The three names combined wouldn't equal Azure or GCP scale.

Even Tencent Cloud and Alibaba are relatively new additions to the term, once people discovered their scale, although generally speaking industry analysts still don't include these two when they use the term.


Joyent?



Which was a result of Joyent being unable to scale like the hyperscalers because there was no 3rd party that could make the hardware as well as the hyperscalers. That's what Oxide is for, to fix what Joyent was unable to do, to enable others to become hyperscalers.


Hyperscalers have already moved their custom stuff in a direction quite far from x86 PCs (how many new form-factors and interconnects and whatnot are under the Open Compute Project already?) while the typical Supermicro/Dell/HPE/whatever boxes available to regular businesses are still in that "regular PC" world. This is what they're trying to solve, yeah.


I think this OS is intended to run in an embedded context where there are significant memory constraints; read its description: no runtime allocations, etc.

I linked two speeches where he goes over this in a bit more detail, but I hope the presentation opens it up even more.


I haven't looked at Oxide in depth. Hubris seems to be about reducing the attack surface of a server by

* decreasing the active codebase by at least three orders of magnitude

* using no C-Code (Rust only?)

* most code is kernel independent and not privileged (e.g. drivers, task management, crash recovery)

Also: Administration is mostly done by rebooting components.


Hubris is for system management components like the BMC, not for the main CPU.


> The Hubris debugger, Humility…

That is some great naming



If anyone else wondered about the term BMC: https://www.servethehome.com/explaining-the-baseboard-manage...


I'd like to hear more about Oxide's development process. Was this designed on an index card, and then implemented? Or was it done with piles and piles of diagrams and documents before the first code was committed? Was it treated as a cool, out-there idea that's worth exploring, and then it gradually looked better and better?

It's hard to get software organizations to do ambitious things like this, and it's impressive that this was done on a relatively short timescale. I think the industry could learn a lot from how this was managed.


So, the Hubris repo itself will show a bunch of that history, but in particular, Cliff used the "sketch" nomenclature for the earliest ideas. I think in those first days, our thinking was that we were hitting a bunch of headwind on other approaches -- and had an increasingly concrete idea of what our own alternative might look like. I think whenever doing something new and bold, you want to give yourself some amount of time to at least get to an indicator that the new path is promising. For Hubris, this period was remarkably brief (small number of weeks), which is a tribute to the tremendous focus that Cliff had, but also how long some of the ideas had been germinating. Cliff also made the decision to do the earliest work on the STM32F407 Discovery; we knew that this wouldn't be the ultimate hardware that we would use for anything, but it is (or was!) readily attainable and just about everything about that board is known.

To summarize, it didn't take long to get to the point where Hubris was clearly the direction -- and a working artifact was really essential for that.


Cool, thanks. This matches my experience with ambitious projects that actually succeed:

- Start with a really good engineer getting frustrated with existing stuff - but frustrated in a targeted way, and over a long period of time, not just a few weeks of grumpiness.

- Let them loose for a few weeks to sketch an alternative.

- Pause, and then the hardest part - smell whether this is going in the right direction. This just takes good taste!

- Make a decisive cut - either it's not working, or Let's Do It!

I can think of four or five ambitious projects I've been on or around that have really worked well, and they all seem to have worked in this way. I don't think I realized this clearly until this comment thread - thank you.


It was probably done using RFDs (Requests for Discussion). You can read more on the process here [1].

But someone from Oxide would need to tell you exactly how many RFDs it took to design and implement Hubris.

[1] https://oxide.computer/blog/rfd-1-requests-for-discussion


There was an RFD for Hubris; it laid out the basic concepts and design goals, as well as non-goals. But after that, it's largely just iterating. When I joined there were four or five people working on Hubris regularly; we have weekly meetings where we sync up, talking about what we're working on, and discuss things.


Interesting choice of names, Hubris and Humility. Combined with the style of the page, it gives me a solemn and heavy feeling, especially compared to most projects presented, which tend to be very "positive energy and emojis". Their website is also beautiful: https://oxide.computer/. Though I wonder who the target for this is. Is this for cloud providers themselves, for people that self-host, for hosters? For everyone?


I think the names are very clever.

The OS is named Hubris. Building a new Operating System does take a lot of confidence.

The debugger is named Humility. It can be humbling to know your program is not working correctly and use a tool to discover how it is broken.

Impatience would be a great name for the task scheduler. (Because you want your task to run NOW!)

Laziness would be a great name for a hardware-based watchdog timer. (Because you keep on putting it off / resetting it until later.)

Compare: https://www.threevirtues.com/


Cantrill has talked quite a few times about this, it is for people that still build their own data centers.


Their podcast is similarly interesting, even for me, who has no real (professional) interest in or knowledge about building computers.


Speaking of interesting names, their control plane is called Omicron: https://github.com/oxidecomputer/omicron


The target market is users that want to build their own cloud infrastructure, but don't have the scale required to go directly to ODMs to have their own custom designs manufactured.


> Their website is also beautiful

But horribly broken for me (mobile firefox), with text cut off at borders and overlaid by images.


Apologies, pushing a fix now. I broke this earlier today!


Much better - thanks!


The same is true in mobile Safari (iOS), but I'll cut them some slack as long as it's also broken in Chrome on iOS (otherwise it would have to be a Chrome-specific hack, since Chrome on iOS uses the same engine as Safari).


When Cantrill and his team work on something, HN listens, and for good reason. Startups like Oxide show that there's still room for a lot of innovation on a smaller scale, even within fields like hardware.


It's really a breath of fresh air.


The GitHub links don't work; are the repositories still private?


It's going to be presented at a talk in ~9 hours, so probably: https://talks.osfc.io/osfc2021/talk/JTWYEH/


Do you know how to watch those sessions?

The site is quite confusing, stating that registration and login are only required for speakers, but then there is no information regarding the talks beyond the description.


You had to have bought a ticket to watch the talks.

That said, I believe that they're putting them all up on Vimeo after the conference, so it won't be long until you can watch them.


Thanks, looking forward to them.


Yeah, I also noticed and got a little bit upset about this. I mean, publishing a website with broken links does not seem very smart nor make very much sense to me.


Ha, sorry -- HN scooped us on our own announcement! We had the intention of turning the key on everything early this morning Pacific time, but we put the landing page live last night just to get that out of the way. Needless to say, we were a bit surprised to see ourselves as the #2 story on HN when we hadn't even opened it yet! It's all open now, with our apologies for the perceived delay -- and never shall we again underestimate HN sleuths! ;)


Did you forget to put a license in the repo? I'm guessing you meant to release it under the MPL.


That was one of the PRs that was to be merged before opening up, yes. I merged it one minute before you made your comment :) https://github.com/oxidecomputer/hubris/pull/270

(And yes it's MPL)


That's very impressive. Response time of -1 minute? Best I've ever seen.

;-)


Ha!


That said, if Hubris will be presented at a talk later today, I guess things are clearer to me now.

I am very much looking forward to the talk and the presentation of the operating system, btw.


I guess we need to keep ourselves busy with some docs, as that one works.


Yeah, sounds like a very good suggestion to me :)


Has Oxide released any information on the price range of one of their machines? I assume if they're targeting mid-size enterprises it would be outside what I would consider buying for hobby use, but it would be sweet in the future if there was a mini-Oxide suitable for home labs.


AFAIK they won't even sell individual machines, the product is a whole rack.

Since they aim to open source everything, there probably will be a way to use their management plane and stuff with a homelab eventually :)


The supervisor model reminds me a bit of how BEAM (Erlang/Elixir) works although I'm sure that's probably where the similarities end.

As much as most of this is way over my head, I'm always fascinated to read about new ground-up work like this.


> no C code in the system. This removes, by construction, a lot of the attack surface normally present in similar systems.

Not to be too pedantic here, but it's important to note that the absence of C code, while arguably a benefit overall, doesn't by itself guarantee anything with regards to safety/security...I suppose there's going to necessarily be at least some "unsafe" Rust and/or raw assembly instructions sprinkled throughout, but I can't yet see that myself (as of the time of writing this comment, the GitHub links are responding with 404). Nonetheless, it's always refreshing to see some good documentation and source code being provided for these kinds of things. Many companies in this space, even these days, sadly continue to live by some outdated values of hiding behind "security through obscurity", which is somehow championed (though using different words) as a benefit even to their own customers, so it's refreshing that others (Oxide among them) are really starting to take a different approach and making their software/firmware publicly available for inspection by anyone inclined to do so.


To be clear, that sentence refers to the sum total of the things in the previous sentence, not just the bit about C. And it's "a lot of the attack surface" and not "guarantee" for a reason. We don't believe that simply being written in Rust makes things automatically safe or secure.

There is some unsafe Rust and some inline assembly, yes. I imagine a lot less than folks may think.


I will admit perhaps I was a bit too loose with my own interpretation of the statement there. I think maybe this was influenced by my being tired of grand statements others have made in the past about the infallibility of writing code in Rust (even with liberal usage of "unsafe" without a proper understanding of what this implies).

It's all too often I see some cargo cult-style declaration of "no more C; it's all Rust" as if that has somehow solved all problems and absolved the programmer of the responsibility for ensuring their code is otherwise safe and correct (granted, Rust does make this easier to do), and IMHO this just ends up doing a disservice both to those proclaimers and ultimately to Rust itself. To be clear, this is not a statement against Rust by any means but rather a complaint against the conduct of some of its practitioners.

With that being said, I feel I really also need to state here that I absolutely do not believe the above to be the case with the announcement here...Even to the contrary, I would say, as the name "Hubris" says it all. It's great to see Rust used in practice like this, and I look forward to seeing more of the details in the code itself!


> it's all Rust" as if that has somehow solved all problems and absolved the programmer of the responsibility for ensuring their code is otherwise safe and correct (granted, Rust does make this easier to do)

Rust is all about making it easier to ensure safety and correctness, yeah! It's still a tough job, but significantly easier than C or C++.


Which is why they say "a lot of the attack surface" and not "all of the attack surface".


That's what bootable Modula I offered on the PDP-11, over 40 years ago.


This needs citation, but what does the "that" even refer to? I'm genuinely curious because there's little on it; did you use it? Did it survive? And what particular aspect of Hubris and Humility reminds you of this system?


Compile all the processes together, allocating all resources at compile time. Modula I had device register access, interrupt access, cooperative multitasking (async, the early years) and it worked moderately well on PDP-11 machines.

Yes, I did use it. We wrote an operating system in it at an aerospace company.[1] It didn't work out well. Sort of OK language, weak compiler, not enough memory given the poor code generation. It was just too early to be doing that back in 1978-1982. We got the thing running, and it was used for a classified high-security application, once.

[1] https://apps.dtic.mil/sti/pdfs/ADA111566.pdf


Thanks for the reference -- that's helpful. KSOS is not a widely known system, but I can certainly see some similarities in approach with Hubris.

That said, there are far more differences than there are similarities, and it's a gross oversimplification to say -- or even imply -- that the work in Hubris somehow replicates (or is even anticipated by) KSOS. More generally, I find this disposition -- that a new technology is uninteresting because it was "done" decades ago -- to be generally incurious, dour, discouraging, and (as in this case) broadly wrong on the facts. We as a team have as much reverence for history as any you will ever find; it is not unreasonable to ask those who have lived that history to return the favor by opening their minds to new ideas and implementations -- even if they remind them of old ones.


No, not KSOS. The Modula 1 environment. That was one of Wirth's early languages. Modula 2 is better known. Modula 1 was for embedded. This was the first language to have compile-time multitasking, something very rarely seen since. Here's a better reference.[1]

One of the most unusual features is that the Modula 1 compiler computed stack size at compile time, so that stacks, too, were allocated at compile time. Recursion required declaring a limit on the maximum recursion depth.

It's interesting as a working example of how minimal you can go running on bare metal and stay entirely in a high level language. Few languages since have been specifically dedicated to such a minimal environment.

This is what you need for programming your toaster or door lock.

[1] https://www.sciencedirect.com/science/article/pii/S147466701...


I guess you could consider Oberon-07 to be one such language, especially since Wirth made it even smaller in each revision.


As someone who's only worked with a prepared hardware kit (a dsPIC33F on an Explorer 16 that came with cables and the debugging puck), if I want to pick up the board they recommend in the blog post, do I need to make sure I get any other peripherals?

This all seems very cool, and I badly want to poke at embedded stuff again, but I have whatever the opposite of a green thumb is for hardware. Advice would be appreciated ^_^


Right there with you, dude. I just bought an H753ZI (1), hope it's enough, we'll see.

1: https://www.st.com/en/evaluation-tools/nucleo-h753zi.html


For anyone still reading this a week on, I got the recommended board in, and I needed a micro-USB cable -- preferably two, since it has two sockets (one for power and one for, I assume, programming). Luckily, I already have a few of these hanging around for charging my wireless headphones.

It is completely beyond me, however, why one of the sockets is mounted upside down relative to the other one.


How are these docs being built? I really like how these look and it looks to be asciidoc based, but I can't seem to find a build script for these.


There's an open PR with the details, I set it to deploy every push to that PR for now so we could make quick fixes. It just runs asciidoctor, nothing fancy.

https://github.com/oxidecomputer/hubris/pull/272

(specifically https://github.com/oxidecomputer/hubris/pull/272/files#diff-... )


HTML says

<meta name="generator" content="Asciidoctor 2.0.16">

so I guess https://asciidoctor.org/


Their mention of individually restarting components and "flexible inter-component messaging" really reminds me of the BEAM. Very exciting!


Reminds me of QNX. It was an amazing OS and restarting display drivers over the network was just one of its amazing abilities.


Not an accident. Cliff was influenced by QNX, Minix, L3 and L4 in his design (specifically, QNX proxies directly inspired Hubris notifications). And for me personally, QNX -- both the technology and the company -- played an outsized role in my own career, having worked there for two summers while an undergraduate.[0] (And even though it's been over 20 years since he passed, just recalling the great technology and people there makes me miss the late Dan Hildebrand[1] who had a singular influence on me, as well as so many others.)

[0] http://dtrace.org/blogs/bmc/2007/11/08/dtrace-on-qnx/

[1] https://openqnx.com/node/298


There should be a retweet on HN!

QNX was one of the most impressive OSes I've seen, especially for its time: from the full OS on a 1.44MB floppy disk to restartable drivers, real-time scheduling, etc. IPC with messages is built in, and most things ran in userland.

It ended up in the hands of BlackBerry, which is probably not the best home for it...

Edit: I googled out of curiosity, and despite being closed source, it seems to still be marketed by BlackBerry and is supposedly a market leader in embedded OSes! More than 20 years later, well deserved.

https://www.automotiveworld.com/news-releases/blackberry-qnx...


QNX is still going strong even with BlackBerry. I worked at a company recently that heavily relied on QNX for safety-critical embedded OS projects. Its Qt and QML integration made rapid prototyping a snap. Unfortunately it requires pretty (relatively) hefty processors, so I never personally got to use it.


Yes, it's still around. Sadly, nowadays it feels a bit neglected, and isn't quite keeping up. They have their markets that have few other choices and/or are very conservative about switching to something else, and will live off those for quite a while.


Most of Cisco's high-end service provider routers were running IOS-XR on top of QNX. They switched from QNX to a Linux kernel (specifically, Wind River) around 7 years back.


It is an amazing OS, https://blackberry.qnx.com/en


No surprise, since Bryan Cantrill worked at QNX for a short time in the 90s.


Their repo is a rare case which embraced git submodules. For some reason they generate a lot of friction and are not used often.


They are in fact a giant pain, but sometimes, still the best option.


Did you try https://github.com/ingydotnet/git-subrepo ? Looks like it vendors in other repositories, making submodules entirely transparent for consumers while still allowing a submodule workflow for authors.


We didn't, thanks for the tip!


I think the reference provides more info than the announcement above itself:

https://hubris.oxide.computer/reference

Looks amazing imo. Waiting for github code :D



I feel like Rust is everywhere and nowhere at the same time; how do they do it?


"They" is just a lot of software engineers who really really like it and want to be using it, but can't use it in their day job, so continue to talk and talk and talk about it in the hopes that it's used more. I'm one of them (not using it currently, want to use it).


Indeed. For example, one day Rust will be part of the default VS install, and then I can justify having native components for .NET stuff written in Rust; until then, C++ will have to do.

https://docs.microsoft.com/en-us/windows/dev-environment/rus...


MLM, effectively.

The hype machine is pushing it into some places it is far from mature enough for. The hype depends more upon bad-mouthing other technologies than its merits can justify.

This is not to discount actual merits, or to doubt the needed maturity will come, in time. And, embedded is certainly a place it is already mature enough for.

