That line reminded me that NetBSD added Lua for writing kernel modules. (https://news.ycombinator.com/item?id=6562611)
Anybody have any experiences to share from this?
When it comes to Linux, one could say that most reasons to avoid it are historical, but that does not quite capture the awkward truth -- namely that, for most of the kernel's lifetime (going back to 1991), C++ compilers simply did not have the maturity and stability across the breadth of platforms that Linux required. Linus Torvalds' stance on this matter is pretty well known: http://harmful.cat-v.org/software/c++/linus .
Today, when x86-64 and ARM are the only two families you need to care about for the next ten years or so (maybe RISC-V, but I rather doubt it), it probably makes sense to look at C++ for operating systems work -- though the runtime is certainly heavier than back when Linus was writing about it, too. A modern C++ compiler has a lot of baggage; C++ was huge back in 1998, and now it's bloody massive. IMHO, all the reasons why you would want to use C++ (templating support without resorting to strange hacks, useful pointer semantics and so on) are reasonably well served by cleaner languages with less hefty runtimes, like Rust. What these alternatives do lack is the amazing level of commercial support that C++ has.
I don't think any "run-time" feature has been added since, though. It's all either OS support (<thread>, etc., which you wouldn't use in-kernel anyway) or template stuff that has zero impact at runtime (and sometimes actually helps decrease code size).
If some guys are able to run C++ on 8KB microcontrollers, there's hardly a non-political reason it couldn't be used in-kernel.
See also IncludeOS: http://www.includeos.org/
You can certainly strip things down to a subset that can fit in 128K of flash and needs only 1 or 2K of RAM at runtime, but the question is not only one of computational resources used for the library itself. Additional code always means additional bugs, the semantics sometimes "hide" memory copying or dynamic allocation in ways that many C++ programmers do not understand (and the ones who do are more expensive to hire than the ones who do not), and so on. You can certainly avoid these things and use C++, but you can also avoid them by using C.
I agree that mistrust and politics definitely play the dominating role in this affair though. I have seen good, solid, well-performing C++ code. I prefer C, but largely due to a vicious circle effect -- C is the more common choice, so I wrote more C code, so I know C better, so unless I have a good reason to recommend or write C++ instead of C, I will recommend or write C instead. I do think (possibly for the same reason) that it is harder to write correct C++ code than it is to write correct C code, but people have sent things to the Moon and back using assembly language for very weird machines, so clearly there are valid trade-offs that can be made and which include using language far quirkier than C++.
The real answer is historical, and cultural. On the latter, Unix is a product of C (well, and BCPL) and C is a product of Unix. The two are heavily intertwined. The former is, as was mentioned, a product of the relative crappiness of early C++ compilers (and perhaps the overzealous OO gung-ho nature of its early adopters as well...)
C++ without exceptions, RTTI, etc. has a lot to offer for OS development. Working within the right constraints it can definitely make a lot of tasks easier and cleaner.
It won't happen in Linux, tho.
I call BS on this.
So many mistakes in C simply cannot be made in C++ if you follow the well-established coding patterns and don't switch back to C-style code. E.g.: you cannot forget to free a resource, because with RAII you never have to free it yourself. You cannot forget to check a status code and return early, because you don't have to; the exception will propagate until someone catches it. You cannot forget to initialize a vector, because it initializes itself. I could go on and on.
That said, there is a huge caveat to what I am saying above: I am comparing experienced programmers in each language with each other -- those who are basically experts and know what they're doing. I'm not debating whether it's easier to shoot yourself in the foot with C++ if you use it with insufficient experience (and part, though not all, of the reason is that you will probably write C-style code most of the time, and write neither proper C nor proper C++). I'm saying that an experienced programmer is much more likely to write correct code in C++ than C.
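To make the RAII point above concrete, here is a minimal sketch -- the `Resource` type and the counter are invented for illustration; the pattern is the same for any acquire/release pair:

```cpp
#include <vector>

// Illustrative RAII wrapper around some acquire/release pair
// (open/close, lock/unlock, alloc/free -- the pattern is the same).
static int live_resources = 0;

struct Resource {
    Resource()  { ++live_resources; }            // acquire in the constructor
    ~Resource() { --live_resources; }            // release runs on *every* exit path
    Resource(const Resource&) = delete;          // no accidental second owner
    Resource& operator=(const Resource&) = delete;
};

bool use_resource(bool fail_early) {
    Resource r;                  // nothing to remember to free later
    std::vector<int> v(8);       // initializes itself: eight zeroes, never garbage
    if (fail_early)
        return false;            // early return -- ~Resource() still runs
    v[0] = 42;
    return true;
}
```

Whether `use_resource` bails out early or runs to the end, `live_resources` is back to zero afterwards -- the compiler, not the programmer, pairs every acquire with a release.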
What additional code?
> You can certainly avoid these things and use C++, but you can also avoid them by using C.
Right, but what you can't get with C is destructors and move/ownership semantics.
> I do think (possibly for the same reason) that it is harder to write correct C++ code than it is to write correct C code
The ability to write typesafe data structures with move/ownership semantics and specified interfaces while being a superset of C would lead some to say that this is not true.
All the code that you need in order to support smart pointers, templates, move/copy semantics, exceptions and so on. To paraphrase someone whose opinions you should take far more seriously than mine, you can't just throw Stroustrup's "The C++ Programming Language" on top of an x86 chip and hope that the hardware learns about unique_ptr by osmosis :-). There used to be such a thing as the sad story about get_temporary_buffer ( https://plus.google.com/+KristianK%C3%B6hntopp/posts/bTQByU1... -- not affiliated in any way, just one of the first Google results).
The same goes for all the code that is needed to take C++ code and output machine language. In my whole career, I have run into bugs in a C compiler only three or four times, and one of them was in an early version of a GCC port. The last C++ codebase I was working on had at least a dozen workarounds, for at least half a dozen bugs in the compiler.
Huh? unique_ptr has a customizable Deleter though; you should be able to provide a custom one so it doesn't call delete.
And doesn't a kernel implement a kmalloc() or something anyway? You would just write your own operator new and have it do what your kernel needs, and the rest of the standard library would just work with it.
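As a sketch of both points -- names like `my_kmalloc` are stand-ins, not real kernel APIs -- a custom deleter routes destruction through the kernel's allocator, so `unique_ptr` never touches the global `delete`:

```cpp
#include <cstdlib>
#include <memory>

// Stand-ins for a kernel allocator; the counter lets us observe balance.
static int outstanding_allocs = 0;
inline void* my_kmalloc(std::size_t n) { ++outstanding_allocs; return std::malloc(n); }
inline void  my_kfree(void* p)         { --outstanding_allocs; std::free(p); }

// Custom Deleter: unique_ptr calls this instead of 'delete'.
struct KFree {
    void operator()(void* p) const { my_kfree(p); }
};

inline void demo() {
    std::unique_ptr<int, KFree> p(static_cast<int*>(my_kmalloc(sizeof(int))));
    *p = 7;
}   // KFree::operator() runs here -- no global delete involved
```

After `demo()` returns, `outstanding_allocs` is back to zero; an `operator new` overload forwarding to the same allocator would let the rest of the code use plain `new` as well.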
smart pointers are their own classes, and exceptions would certainly be disabled in kernel-mode, sure, but for the rest? which additional code? there's no magic behind templates and move semantics, and no run-time impact. It's purely a compile-time feature.
Unique_ptr is just a template and does not need the standard library. Also move/ownership semantics don't need unique_ptr.
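For instance, a minimal owning pointer needs no headers at all -- this sketch (freestanding style, no exceptions, no RTTI) carries the same ownership and move semantics:

```cpp
// Minimal owning pointer -- no <memory>, no exceptions, no RTTI.
// In a kernel, new/delete would be replaced by the kernel's own allocator.
template <typename T>
class owned {
    T* p_;
public:
    explicit owned(T* p) : p_(p) {}
    ~owned() { delete p_; }                        // exactly one release, always
    owned(const owned&) = delete;                  // copying would mean two owners
    owned& operator=(const owned&) = delete;
    owned(owned&& o) noexcept : p_(o.p_) { o.p_ = nullptr; }  // ownership moves
    T* get() const { return p_; }
    T& operator*() const { return *p_; }
};
```

Moving from an `owned<T>` leaves the source null, so the moved-from destructor is a harmless `delete nullptr` -- double frees are ruled out by construction.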
> The last C++ codebase I was working on had at least a dozen workarounds, for at least half a dozen bugs in the compiler.
There are at least four compilers that have been extensively production-tested -- ICC, GCC, MSVC and Clang. Which one had these bugs, and was it kept up to date?
That was targeted at x86 with 2MB of RAM usage, though. So, I quickly looked for a microcontroller RTOS, thinking that would be a good test. I found ScmRTOS, which claims to be written in C++ with size being "from 1KB of code and 512 bytes of RAM" on up. So, it seems feasible to use C++ for small stuff. I'll also note that it has some adoption in the high-assurance industry, with standards such as MISRA C++ (2008) already available. They might be using powerful CPUs, though, so I looked up the MCU stuff.
The throwaways are talking safety features. I used to think that was a benefit of C++ over C. Following Worse is Better, that's no longer true: the ecosystem effects of C produced so many verification and validation tools that it's the C language that's safer than C++, if one uses those tools. There are piles of them for C with hardly any in FOSS for C++ if we're talking about static/dynamic analysis, certified compilation, etc. I put $100 down that a CompCert-like compiler for C++ won't happen in 10 years. At best, you'll get something like KCC in K Framework.
The reason this happened is C++'s unnecessary complexity. The language design is garbage from the perspective of being easy to analyze or transform by machines. That's why the compilers took so long to get ready. LISP, Modula-3, and D showed it could've been much better in terms of ease-of-machine-analysis vs the features it has, with some careful thought. Right now, though, the tooling advantage of C means most risky constructs can be knocked out automatically, the code can be rigorously analyzed from about every angle one could think of (or not think of), it has the best optimizing compilers if one cares little about their issues, and it otherwise supports several methods of producing verified object/machine code from source. There are also CompSci variants with built-in safety (e.g. SAFECode, SoftBound+CETS, Cyclone) and security (esp. Cambridge CHERI). duneroadrunner's SaferCPlusPlus is about the only thing like that I know of that's actively maintained and pushed for C++. The result of pros applying tools on a budget to solve low-level problems in C or C++ will always give a win on defect rate to the former, just because they had more verification methods to use.
And don't forget that, as with the Ivory language, we can always develop in a high-level, safer language such as Haskell, with even more tooling benefits, and extract to a safety-critical subset of C. The extracted code can then be hit with C's tooling if we want. We can do that in a language with a REPL to get productivity benefits. So, we can have productivity, C as the target language, and tons of automated verification. We can't have that with C++, or not as much even if we had money for commercial tools.
So, these days, that's my argument against C++ for kernels, browsers, and so on. You are just setting yourself up to have more bugs that are harder to find, since you lose the verification-ecosystem benefits of the C++ alternatives. This will just continue, since most research in verification tools is done for managed languages such as Java or C#, with what's left mostly going to the C language.
> The throwaways are talking safety features. I used to think that was a benefit of C++ over C. Following Worse is Better, that's no longer true: the ecosystem effects of C produced so many verification and validation tools that it's C language that's safer than C++ if one uses those tools. There's piles of them for C with hardly any in FOSS for C++ if we're talking about static/dynamic analysis, certified compilation, etc. I put $100 down that a CompCert-like compiler for C++ won't happen in 10 years. At best, you'll get something like KCC in K Framework.
Lots of people think additional safety features result in safer code. They likely do most of the time, but when you need to swear to investors, to the public and to the FDA that your machine will not kill anyone, what you want to have is results from 5 verification tools with excellent track records, not "my language does not allow for the kind of programming errors that C allows". Neither does Ada, and yet they crashed a rocket with it, with a bug related to type conversion in a language renowned for its typing system (not Ada's fault, of course, and especially not its typing system's fault -- just making a point about safety features vs. safety guarantees).
A more complex language, with more features, is inherently harder to verify. The tools lag behind more and are more expensive. And, upfront, it seems to me that it is much harder to reason about the code. C++ does not have just safety features, it has a lot of features, of all kind.
I'd disagree. Modern C++ has much better safety features than C ever has.
gstreamer, gtk, etc, are really easy to work with and browse the source.
This is why I love golang as well.
Side note, it is funny how much gstreamer and glib try to add c++ "ish" features to c.
Interestingly, I just found Linus' stance on Golang :) https://www.realworldtech.com/forum/?curpostid=104302&thread...
beware, performance and code size do not always go hand in hand!
I used to work on one. It was developed in the 90s, around the same time C++ became an ISO standard. I think most of you have used it without knowing. :)
I think OOP is natural for writing kernels and most mainstream kernels implement some kind of object system because of this. I also wrote my toy kernel in modern C++, despite that I'm "fluent" in both C and C++. So yeah, C++ kernels are here, just not in Linux land.
Wait, what? What happened to MIPS, z/Architecture, Power Architecture, QDSP6, TMS320?
Back when Linus was ranting, one could make a convincing case for supporting all of these architectures in an OS meant for general-purpose workloads, from server to desktop and from thin client (eh? feel old yet :-)?) to VCR. Now you can sell an OS that supports nothing but x86-64 and a few ARMs, and you get most of the server market and virtually all of the desktop and mobile market.
Obviously the architectures that you need for a particular project are the ones that you want to "care for" -- but in broad terms, it is perfectly possible today for someone to have worked as a programmer for ten years without encountering a single machine that is not either x86- or ARM-based. Someone who had 10 years of programming experience in 2003 probably ran into a SPARC system whether they wanted to or not.
S390X (IBM mainframes) and OpenPOWER variants are fully supported by Ubuntu, and those are not insignificant markets either, in banking, finance, or numeric computing.
One reason why Linux runs on all Top 500 supercomputers is that a number of relatively recent systems close to the top run Sunway (an Alpha derivative), Power and SPARC64. The fastest supercomputer today runs on Sunway.
There was an announcement of a 64-bit MIPS Android device for 2016 (Warrior i6400), but I didn't hear anything more about it. Did you?
Also there was an Ainol tablet with the MIPS architecture (nov07) which flopped (those are the only instances I can recall where MIPS on Android really came into play).
After all, most of the “extra stuff” that C++ adds on to C has no inherent hardware dependence and exists only on the frontend. By the time the code makes its way to the backend, templates have been monomorphized (i.e. copied and pasted for each set of parameters); methods have been lowered to functions with added ‘this’ parameters; hidden calls to destructors and conversions have been made explicit; and so on. For any given C++ program, you could write a C program that compiles to near-identical IR, so if the backend can reliably compile C code, it should be able to handle C++.
True, that hypothetical C program might not look much like a typical human-written program. In order to achieve zero-cost abstractions, C++ code tends to rely a lot more on inlining and other backend optimizations than pure C, even though those optimizations apply to both languages. So if the backend isn’t reliable, C++ may generate more ‘weird’ IR that could end up getting miscompiled, and the level of indirection may make it harder to figure out what’s going on. But compiler backends these days are a lot more reliable than they used to be. And buggy backends can certainly cause problems for C code as well.
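The lowering described above can be sketched side by side -- the C-shaped halves below are hand-written approximations of what the front end emits, not actual compiler output:

```cpp
// What you write in C++...
struct Counter {
    int n;
    void bump(int by) { n += by; }     // method with an implicit 'this'
};

template <typename T>
T twice(T x) { return x + x; }         // one copy stamped out per T actually used

// ...and roughly what reaches the backend: a free function with an explicit
// 'this' parameter, and a monomorphized function per instantiation.
struct counter_c { int n; };
inline void counter_bump(counter_c* self, int by) { self->n += by; }
inline int  twice_int(int x) { return x + x; }
```

`a.bump(3)` and `counter_bump(&b, 3)` behave identically and compile to near-identical IR, which is the sense in which a backend that handles C should handle C++.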
I despise his choice of words and general attitude towards programmers "less capable than him", but I do think there is some truth in what he says there: imagine we all wrote asm only -- we'd spend much more time at the drawing board and find much more elegant solutions to problems that we today solve by mindlessly writing hundreds of lines of code within a "framework".
Side note: I wrote C++ in 1998; it was more "bloody massive" back then than it is today, in my humble experience.
> anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny
> the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C.
You can certainly write C++ code without overabstracting it. Maybe Torvalds only ran into C++ programmers who liked overabstracting their code -- as it is usually the case with opinions of some people about other people, that part of the rant is best discarded :-).
But as my memory serves me, back when "portability" meant x86, SPARC, Alpha, ARM, MIPS, PPC, PA-RISC and m68k, the only way to do "good, efficient, and system-level and portable C++" was to limit yourself to the (relatively small) subset of C++ that was well-implemented by widely-available compilers (particularly GCC) across all these architectures -- not as a matter of discipline and code elegance, but because writing non-trivial C++ code back then usually resulted in very unpleasant trips down the standard library's source code and poring over objdump outputs trying to figure out what the hell broke this time.
Language-wise, Darwin/XNU/IOKit uses a subset of C++, notably one that excludes exceptions and templates IIRC. stdlib-wise, I suspect it's even more restricted/different.
> any compiler or language that likes to hide things like memory
Is Rust well-understood, both by the users and the toolchain developers? How many independent compilers are there for Rust? What architectures do they support?
I'm not implying that Rust isn't suitable for Linux, but it has open issues that prevent usage right now. Rust and its core ecosystem have a lot of churn, and still feel like a rather early work-in-progress. One example not mentioned is the Rust language server and its integration with consumers.
I cheer for Rust, and hope for solid progress.
To be fair, these benchmarks are aiming for speed, but the C baseline typically produces faster code with less memory, so it may very well be a case of TANSTAAFL.
Just a couple of examples.
$ find sys -name '*.lua'
No library runtime features, so no RTTI or exceptions, but we use templates and dynamic dispatch.
I can see the argument from Linus' POV about avoiding over abstraction, and knowing exactly what your code is doing on a low-level, but at the same time it leads to a lot of reinventing the wheel and makes me think there is a better balance to be found.
I'm addressing this as someone who has done C++ work on very small embedded systems (e.g. microcontrollers with no external memory) including RTOS code and low-latency control loops, and there were definitely some nice features in C++ that can save you some time and lines of code without adding a bunch of needless overhead. Of course, there was a learning curve when bringing new people on board the project about what was/wasn't allowed for performance reasons. In general though, I see no reason why there couldn't be some subset of C++ used for kernel development other than Linus' / the kernel dev community's general distaste for it.
I did some kernel development (mainly some drivers) and I am also doing a lot of embedded development (credit card terminals, personal projects for ARM Cortex-M) I must say that I am happy with this choice.
The kernel is complex enough and you want to focus on understanding what really matters without being distracted by arcane constructs that would inevitably come with C++.
My experience with C++ guys is that there will inevitably be a population of smart developers which will try to be "smart" with the language which typically ends up with everybody else spending more time on understanding what is happening than the time that was actually saved by the construct.
C does not pose that problem. C is simple and there is relatively little occasion to be very smart with it, which is a good thing, IMHO.
This still leaves some holes that the runtime provides, but all of which are easy to provide -- namely things like 'new' and 'delete'.
And here I was about to ask if anyone has a similar article based on Rust :)
I work at a company that provides backups for both Linux and Windows. The entire concept was built around block-level backups. You could just open up the block device and copy the data directly, but it would quickly become out of sync by the time you finished copying it. We did not want to require LVM to be able to utilize snapshots to solve the sync problem. On top of that, we had a strong requirement of being able to delete data from the backup.
This resulted in me learning how to build a kernel module and then slowly over about 6 months creating a kernel driver that allowed us to take point in time snapshots of any mounted block device with any fs sitting on top.
Other requirements also dictated that we keep track of every block that changed on the target block device after the initial backup (or full block scan, after reboot).
I wish I could release the source, but my employers would not like that :( So at least for me, learning how to write kernel modules and digging into some of the lower-level stuff has kept me gainfully employed over the years. It is still in use on about 250k to 300k servers today (it fluctuates).
The hardest part was not writing the module, but getting others interested in it enough so I don't have to be the sole maintainer. I like working on all parts of the product and don't want to just be the "kernel guy".
One other time I wrote a very poorly done network block device driver in about 8 hours. You can find it here: https://github.com/mbrumlow/nbs -- Note I am not proud of this code; it was something I did really quickly and wanted on hand to show to a prospective employer. I did not get the job, and I am fairly sure they did not even look at the driver, so I don't think the crappy code there affected me.
EDIT: Thanks for both replies! :)
So, there was a thing that added this feature to the Windows kernel in the product to make this work. Aside from the Linux stuff, which was totally separate. But if you don't need the writing capability, Shadow Copies are good enough, sure..
(Source: I used to work with the guy who made the above post)
Windows Volume Shadow Copy has the advantage of being integrated with the FS a bit more closely. So on Windows, VSS can avoid some overhead by skipping the 'copy' part and just allocating a new block and updating the block list for the file.
For the Linux systems we had the requirement to work with all file systems (including FAT). So we could not simply modify the file system to do some fancy accounting when data in the snapshot was about to be nuked. So that resulted in me writing a module that sits between the FS and the real block driver. From there I can delay a write request long enough to ensure I can submit a read request for the same block (and wait for it to be fulfilled) before allowing the write to pass through.
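The delay-then-copy idea can be sketched in miniature -- this is a userspace toy in which block contents are ints and the "device" is a vector, not the actual module's code:

```cpp
#include <map>
#include <vector>

// Copy-before-write snapshot sketch: before a write reaches the "device",
// read and stash the old contents of that block (first overwrite only).
struct SnapshotDevice {
    std::vector<int> disk;                 // the "real" block device
    std::map<int, int> snapshot;           // first pre-write copy of each block
    bool snapshotting = false;

    explicit SnapshotDevice(int blocks) : disk(blocks, 0) {}

    void write_block(int blk, int value) {
        if (snapshotting && !snapshot.count(blk))
            snapshot[blk] = disk[blk];     // delay the write: read old data first
        disk[blk] = value;                 // then let the write pass through
    }

    // Read as of snapshot time: stashed copy if the block changed, else live data.
    int read_snapshot(int blk) const {
        auto it = snapshot.find(blk);
        return it != snapshot.end() ? it->second : disk[blk];
    }
};
```

Only the first overwrite of a block pays the copy cost; later writes to the same block pass straight through, which keeps the overhead bounded by the number of distinct blocks touched during the backup.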
> (Why couldn't you use the existing feature?)
We did on Windows, VSS is used with a bit of fancy stuff added on top.
For Linux there is no VSS equivalent (other than the one I wrote, and maybe something somebody working on a similar product may have written). And even if one did come about (or exists and I am just not aware of it), it for sure was not available when I started this project.
EDIT: ok so it looks like the management of the snapshot space is a bit different. still, you could probably have wrapped LVM management enough to make it palatable, in less than the time it took to write a custom module
Also at the time LVM snapshots were super slow. I don't have the numbers but even with the overhead my driver created I was able to have less impact on system performance.
I was able to do some fancy stuff to optimize some of the more popular file systems by making inline look-ups to the allocation map (bitmap on ext3). This allowed me to not COW blocks that were not allocated before the snapshot. This was a huge saving because most of the time on ext3 your writes will be to newly allocated blocks.
Wrapping LVM would probably not work, and would still require a custom module; the user-space tools don't do much. LVM really is a block-management system that needs to manage the entire block device, so existing file systems not sitting on top of LVM would get nuked if you attempted to let LVM start managing those blocks, and you would still have the issue that reads and writes were coming in on a different block device. Asking people to change mount points was not an option.
There were also some other requirements, like block change tracking, which LVM has no concept of. This is for incremental snapshots. Without this sort of tracking you would have to checksum every block after every snapshot if you wished to copy only the changes. This module was also responsible for reporting back to a user-space daemon that kept a map of which blocks changed. So when backup time arrived, we could use this list (and a few other lists) to create a master list of blocks that we needed to send back. This significantly cuts down on incremental backup time. Some companies call this "deduplication", but I feel that is disingenuous -- to me, deduplication is on the storage side and would span across all backups.
So yes, requiring a module is much easier than telling a customer they can't trial or use this product until they take their production system offline and reformat it with LVM. Many people hated LVM at the time; it was considered slow and caused performance problems -- this was like 8 years ago. LVM has vastly changed and does not get these kinds of complaints any more. But I can tell you people would still scream bloody murder if we told them they had to redo their production images and redeploy a fleet of 200+ servers just to switch to LVM so they could get a decent backup solution.
Also shout out to aseipp! Miss working with you. Have yet to find a bug in the code you wrote :p
Other interesting stories would include:
- Instrumenting certain filesystem operations (all modules share the same memory space; it's possible for one module to take a sneak peek at another's internal structures). This was back before dtrace & friends were a (useful) thing.
- Real-time processing of instrumentation samples. Doing it in the kernel allowed us to avoid costly back-and-forth between user and kernel memory -- but we only did relatively simple processing, such as scaling, channel muxing/demuxing and the like. If you find yourself thinking about doing kernelside programming because carrying samples to and from userspace is too expensive, you should probably review your design first.
Actually, that's not true at all! Just look at Mesa3D and any number of GL proprietary blobs from graphics vendors. Userspace drivers are actually quite common, they just aren't what folks think of immediately when they think about 'drivers'.
This whole communication process had to happen within a small time frame, something like 50 ms. A kernel module handled the communication, decoded the packets, and automatically relayed peripheral messages to other peripherals. The kernel module made it easier to achieve stable and real-time operation within the time constraint.
When writing the same function in userspace, there was absolutely no guarantee that messages would be sent or received in time.
I was working on an embedded system and needed fast I2C access (mostly I just wanted very small latency, because it's an instrument). I2C from Linux userspace (using ioctls) adds a lot of overhead. I started looking into kernel modules, but after a day of research I found out that you can access hardware registers from userspace using mmap() on /dev/mem, which is even faster than a kernel module.
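A hedged sketch of the pattern: for a testable demo this maps an anonymous page; the real thing would `open("/dev/mem", O_RDWR | O_SYNC)` and pass the page-aligned physical base of the register block as the mmap offset (it needs root, and the region must not be claimed by a driver):

```cpp
#include <cstdint>
#include <sys/mman.h>

// Map one page and treat it as a register window. With /dev/mem you would use
// MAP_SHARED plus the fd and physical offset of the real registers instead.
inline volatile std::uint32_t* map_registers() {
    void* p = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr
                           : static_cast<volatile std::uint32_t*>(p);
}

// After mapping, a register access is a plain volatile load/store -- no ioctl
// round-trip per transaction, which is where the latency win comes from.
```

`volatile` keeps the compiler from caching or reordering the accesses; on real hardware you may also need memory barriers, depending on the bus.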
but that would only work if some kernel module or the kernel itself hasn't mapped that IO space, right?
Two lessons learned:
1. dealing with tangible things for once is incredibly satisfying.
2. next time, download the fscking docs instead of relying on an intermittent 3G connection.
It'd be great if my small module could be published as is, but I'd need to strip all addresses/filenames, and also add more proper locking (but yes, I want to publish it anyway).
I was in Australia for that work too...
The one that I'm most proud of delegates decisions on whether executables can be run to .. userspace. Which is simultaneously evil and brilliant.
i.e. User "foo" tries to run "/tmp/exploit" and the kernel executes "/sbin/can-exec foo /tmp/exploit". If the return code of that is zero then the execution is permitted otherwise it is denied.
This gives you ample opportunity to log all executables, perform SHA1-hash checks of contents, or deny executables to staff-members outside business hours. There's a lot of scope for site-specific things, creativity is the limit!
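Since the contract is just an exit code, the helper can be anything. A toy sketch of a /sbin/can-exec-style decision function -- the policy rules here are invented examples, not the actual module's policy:

```cpp
#include <string>

// 0 = permit, non-zero = deny -- mirroring the exit-code contract above.
inline int can_exec(const std::string& user, const std::string& path, int hour) {
    if (path.rfind("/tmp/", 0) == 0)
        return 1;                              // nothing runs from /tmp
    if (user != "root" && (hour < 9 || hour >= 17))
        return 2;                              // staff: business hours only
    return 0;                                  // permitted (could also log or hash here)
}
```

A real helper would do its checks (logging, SHA1 allowlists, time windows) and simply exit with the chosen code for the kernel to inspect.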
How did the user space redirection handle potentially high spikes of execve's?
And btw, thanks for all the years of debian-administration.org.
Even though it has some annoying gotchas (such as the fact ARM cores can sleep/frequency scale on demand with no forewarning, meaning cycles aren't always the most precise units of measurement), and is very simple -- this thing ended up being mildly popular. Even though I wrote it years ago, someone from CERN recently emailed me to say they happily used it for work, and someone from Samsung ported it to ARMv8 for me...
(I should dust off my boards one of these days and clean it up again, perhaps! People still email me about it.)
In case that doesn't make you want to run away screaming, I put a post in the "Who is hiring?" article:
i wrote a kernel module that allowed us to keep logs in RAM even across a soft reboot (i.e., if the device crashes or does a software update). it basically reserved a chunk of 1M at the top of RAM (using the same physical address each time). there was a checksum system that allowed it to tell during boot whether the data is still valid or if the bits had decayed.
i've also done a couple i2c drivers for temperature sensors or LED controllers. kernel work is fun.
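The checksum trick above might look roughly like this -- a userspace sketch in which a small static struct stands in for the reserved region at the top of RAM, and the hash is an arbitrary simple one:

```cpp
#include <cstdint>
#include <cstring>

// Log buffer meant to live at a fixed physical address across soft reboots,
// with a checksum so boot code can tell live data from decayed bits.
constexpr std::size_t LOG_SIZE = 1024;

struct PersistentLog {
    std::uint32_t checksum;
    char data[LOG_SIZE];
};

inline std::uint32_t compute_checksum(const char* buf, std::size_t n) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum = sum * 31 + static_cast<unsigned char>(buf[i]);  // simple rolling hash
    return sum;
}

inline void seal(PersistentLog& log) {            // called before a soft reboot
    log.checksum = compute_checksum(log.data, LOG_SIZE);
}

inline bool valid_after_boot(const PersistentLog& log) {   // checked during boot
    return log.checksum == compute_checksum(log.data, LOG_SIZE);
}
```

On boot, a matching checksum means the region survived intact and the log can be salvaged; a mismatch means the bits decayed (or this is a cold boot) and the region is reinitialized.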
These times were so happy - implementing a device driver seemed like owning the world for a student!
If anybody wants I'd be happy to put the source code in github - mainly for historical reasons because I'm not sure that the code will be still working today.
Here is a link to the free book (warning it's a pdf):
And here is a link to a github project with all exercises updated for the most recent kernel:
It was one of the references I used while implementing the module, and it was a really good and comprehensive book -- totally agree with the recommendation :)
Notice - I tried compiling it but was not able to (and I don't have time to research modernizing this code).
The author seems to be implying that rings are implemented at the processor level for x86-64 processors. If I’m interpreting the wording correctly that’s interesting! Coming from the ARM world I’d always thought that rings were an OS construct.
Edit: See http://www.heyrick.co.uk/armwiki/Processor_modes
At least virtual memory impacts performance, which is one thing user space code uses
len--;
Also for anyone writing kernel code, this is indispensable: http://elixir.free-electrons.com/linux/latest/source
Nowadays, lots of code could be using devm_kmalloc, devm_ioremap, etc which will release the resources automatically when the driver detaches from the device.