Thor – A minimalistic operating system in assembly and C++ (github.com/wichtounet)
189 points by ingve on Aug 23, 2016 | 146 comments



Last week I read the very good 'How to build an operating system from scratch' [1], and I'm glad I did. It meant that I could burrow into the files on this github project and understand exactly what was going on.

One day, when I finally have some free time, I am totally going to do this myself.

---

[1] https://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures...


That looks awesome! I did a little side project on rolling my own Unix kernel, highly recommend the guide! https://web.archive.org/web/20160301082842/http://www.jamesm...


Check out the BrokenThorn tutorials on the subject. They are far more complete.


I will!

I'm guessing you mean these..?

http://www.brokenthorn.com/Resources/OSDevIndex.html


They are indeed pretty good to begin with. However, the best resource is clearly osdev.org; both the forums and the wiki are full of information!


The OSDev wiki is the best thing to hit the internet since sliced bread (with cats' faces pushed through it?) but it's far from a tutorial. It's a wiki.

While it is one of the best repositories of knowledge on the subject of low level development, it's not a step by step guide.

In the same way I don't recommend that people read Wikipedia to learn about human history I'd also not recommend reading osdev to learn about low level development.

At its heart, it is organized as a reference, not a textbook.

I wish that someone would go through and accumulate all the knowledge, put it into a series based textbook with chapters, throw in some example code and VMs for many architectures, and sell it to me because I'd pay. I'd also push for those at my university to construct a class around the book as it would be one of the most fascinating and knowledge packed classes I'd think you could take. (Hint hint, nudge nudge writers and professors!)

But yes to date NOTHING beats osdev.


I don't know, but there are the Tanenbaum books. "Operating Systems Design and Implementation" is old, but the appendix is the full C source code to Minix 3.0 if you buy the 3rd Edition. I had the 1st Edition with around 12K lines of C in the appendix. Great book, and I had Minix 1.5 running on my old Amiga around 1990. Minix was more portable than Linux, which came out around 4 years after Minix. I had my money on Minix when Linux came along, since Minix was a microkernel, and I thought it sounded like the better direction to take. Well, Linux won out, and funnily enough microkernels came back in style anyway with MachTen, MkLinux, and QNX. QNX was very successful in the realtime OS world. I used it on two jobs. I have not read the newer Tanenbaum book, "Modern Operating Systems", but if it is like the first one, I'd imagine it would be very educational too.


> I had my money on Minix when Linux came along, since Minix was a microkernel, and I thought it sounded like the better direction to take. Well, Linux won

Well, I am too young to have witnessed these things, but I read from Tanenbaum that his main intention was to keep Minix so small that his students were able to understand it in one semester. And that seems like a big limitation to me ...


Small != a limitation in many cases.

I think small adds to understanding. In addition, the whole point is to keep the kernel small in microkernels, and most, if not all, userland stuff is implemented outside of the kernel, so small is good here.

You can fork and add to it if you want to expand the kernel.

I happen to have a bias for minimalism and small: Shen, Forth, PicoLisp, wasplisp (MOSREF pentesting environment), J programming language, etc...

I think there is too much bloat in web dev, backend, frontend, and PLs in general. I really appreciate it when I see somebody solve a problem elegantly and in a straightforward manner, with little code or extra tooling.


What would be a good step-by-step tutorial in your opinion?


If you'd like the links that really helped me, check out this:

http://www.osdever.net/bkerndev/Docs/title.htm

http://www.brokenthorn.com/Resources/OSDevIndex.html

https://github.com/stephenfewer/NoNameOS

https://anastas.io/osdev/memory/2016/08/08/page-frame-alloca...

https://github.com/kjiwa/x86-boot-sector-c

These are all useful. If you're more of a bookworm, check out Minix, as mentioned in other parts of this tree. Very, very easy to read, interesting bits of history about OS development, and source code included! Nothing's useful without source code.

If you'd like to look up some general software books check out Programming the IBM PC, or something to that effect. It's commonly called The Pink Shirt Book. Has interesting stuff on IBM interrupts, coding standards, and bits on filesystems!


Let me plug in my own tutorial :) https://github.com/cfenollosa/os-tutorial


Here's a better direct link to osdev.org info: http://wiki.osdev.org/Main_Page


Yes, sorry, I had just woken up and was on mobile. Couldn't gather my wits to find the link. Now that I've got a cup of joe in me I'm much more chipper and ready for this!

Yes that's the link. They walk you through everything! Top to bottom. It's a great starter that will teach you what OS development actually involves. It's like a hand holding experience through a forest of strange practices, you'll love it when you get a chance.


I read this same exact article last week too, what are the odds! Definitely have the same mindset to write my own someday. This project will be a ton of help.


Weow, excellaent reply budday.


+1 for being able to run on Bochs.

There's nothing minimalistic about Qemu. It's an enormous, resource-intensive program to build. And it's not even easy to run in text mode (no x11). "Qemu-lite" is sorely needed.


It's actually quite easy to run in text mode, just add -nographic. It automatically hooks the serial port up to your terminal, too.


What OS are you using?

Is it working for you in VGA text mode (no x11)?

Is this now working on other UNIX-like OS besides Linux?

I have had problems with -nographic in the past and I know others who have as well, but maybe it works reliably now?


You mean can I run it without having X11 running? Yes, of course, with -nographic. How do you think most major VPS providers run qemu? They certainly don't run their hosts with X11 going.

And yes, it runs on not-Linux, but obviously the main selling point of qemu is the access it provides to KVM, which is an interface provided by the Linux kernel. You can do software emulation of a number of architectures on a number of operating systems with qemu.


Are you saying you yourself do this? You're using BSD, running in VGA textmode, from which you run qemu with -nographic to load an image of another OS? You yourself are doing this?

Most of "major VPS providers" I know use Linux, not BSD. (Yes, there are some exceptions but the reliance on Linux is almost universal.)

I would love it if you were right and I am wrong, but I seriously doubt you have tested this recently. I will happily give Qemu another go if you have tested this recently and it worked.


Do your research, don't ask me to do it for you.

https://wiki.freebsd.org/qemu

qemu runs on Linux, OSX, Windows, and various smaller Unices.


So you are admitting you have not tested this and you are just going by what you read.

When was the last time you used any "smaller Unices"?

Anyway, even if the problems with -nographic on x86 have been fixed on other non-FreeBSD BSDs (and I have my doubts), my comment that compiling qemu has become a substantial undertaking still stands. I suspect you've probably never even tried compiling it yourself but that will not stop you from commenting.

I compile Bochs statically on low powered computers with no problems. Cannot say the same for Qemu. Even if you're not using x11, they expect you to have x11 libraries at compile time. As with most programs that grow so large, there are more than a few dependencies one can do without.


It's not minimalistic in its capabilities or implementation, but it's minimalistic in its interface. It's a tool that emulates various hardware, and provides a CLI that's no more complex than necessary to operate its conceptually narrow functionality.


I was referring to how long it takes to build from source and how much RAM and CPU it uses while building. The interface is fine. I love Qemu. Bellard writes great software. My early experiences with it were very good. But it has grown to become a whale of a program. I had to switch to Bochs.


The hobbyist OSes pop up a lot on Hacker News. Some are more full featured than others. What would you say is the minimum feature set to be called a real OS?


Remember that since our general purpose computers are approximately Turing Machines, they can simulate anything, including better general purpose computers.

An OS is just a simulation of a "nicer" computer than the one that is natively exposed by the "real" one (i.e. the hardware).

Normally, "nicer" is pretty subjective, but in the case of computing it almost always means only two things:

* We simulate many more computers than we actually have (i.e. timesharing and multitasking)

* We simulate computers that are much easier to program.

In other words, an OS is just a simulation of having access to a bunch of nicer-to-program computers.

But, this ends up implying a multitude of features:

* Simulating many computers requires having a scheduler and process isolation. If we decide that we want individual users to have privileged access to their simulated computers, then we end up adding security features related to authorization and authentication. If we decide that the simulated computers should be able to communicate with each other, then we end up with networking since there is little difference between two simulated computers running on the same box, and two simulated computers that are running on different boxes.

* Simulating easier-to-program computers requires having memory management, virtual memory, I/O abstraction, graphics, portability, etc.

Since we end up with most, if not all, of the features that most people think are requisite for an OS, I think that my definition is minimal.
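To make the "simulate many computers" point concrete, here is a minimal sketch of the scheduling idea only. It is a hosted toy, not kernel code: the `RoundRobin` class and its `spawn`/`run` names are made up for illustration, tasks are cooperative step functions rather than preempted processes, and it leans on `std::function` and `std::vector`, which a real kernel would not have available early on.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical cooperative round-robin scheduler. Each task is a step
// function that returns true while it still has work to do.
class RoundRobin {
public:
    void spawn(std::function<bool()> step) { tasks_.push_back(std::move(step)); }

    // Give each live task one time slice in turn until all have finished.
    void run() {
        while (!tasks_.empty()) {
            for (std::size_t i = 0; i < tasks_.size();) {
                if (tasks_[i]()) {
                    ++i;  // task yielded but is not done: keep it queued
                } else {
                    // task finished: drop it, the next task slides into slot i
                    tasks_.erase(tasks_.begin() + static_cast<std::ptrdiff_t>(i));
                }
            }
        }
    }

private:
    std::vector<std::function<bool()>> tasks_;
};
```

Each task sees "its own" computer for one slice at a time. Preemption, isolation, and saved register state are exactly the parts this sketch leaves out, and they are where the real work of an OS begins.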


At least for my operating systems class, the requirement was to be able to boot and run a text editor, usually Vi. Didn't stop some people from building one that could run Doom though.


Ah. I was thinking something more like process isolation and device drivers.


It's probably highly subjective to describe what an OS needs in order to be an OS. For me, it would be to be able to boot on hardware, to be able to execute applications and to give these applications an abstraction layer over the hardware.


If process isolation were a requirement then it would exclude many things people call an operating system.

For example, CP/M is an operating system with no support for multiple processes.


Should have called it ant-man.


And the latest commit message is "Prepare ICMP message type decoding"

It seems we have not learned from the mistakes of prior operating systems and instead of a "Windows NT ping of death" we'll have a "Thor ping of death", which is a curious image, Thor brought down by a puny ping.


That would be impossible, Thor cannot be brought down by any other than Jörmungandr. More seriously, I'll try to avoid reproducing the ping of death issue :)


Can an operating system be developed from scratch using a safe language such as Rust?

If yes, why people still even think about doing it with unsafe languages?


You can write a kernel in pretty much every language. There are some experiments with Rust: http://www.randomhacks.net/bare-metal-rust/

There are several reasons to choose one language rather than another. I personally don't like Rust at all, so I don't see any reason to use it, nor do I see any advantage in using it. But that's only my opinion. A lot of osdev currently is done by hobbyists (not counting the handful of really successful kernels), so the main argument is simply personal and there is nothing wrong with it. Moreover, keep in mind that Rust is pretty new while C/C++ are decades old; this means more support, more portability, more people that know how to use it, ... This is not negligible at all.


Forget the experiments. Try this one:

https://www.redox-os.org/


> If yes, why people still even think about doing it with unsafe languages?

What you need for OS or embedded development is a language that doesn't get in your way. C doesn't get in your way. C is the first language that both stayed out of your way and also provided reasonable productivity features (over BCPL, B, and assembly).

It was only recently that people have figured out how to come close to a "safe" language that doesn't get in your way.

Here is a test case: grab an ARM Cortex-Mx processor. Boot, configure the I/O matrix, and set up, oh say, the counter/timer peripherals for PWM motor control and quadrature decode. If you can do that without a trip to an "unsafe" language, that is a worthy accomplishment for your safe language.

Another test case might be doing page swapping and page-table updates on a virtual memory subsystem on a multi-core CPU without a trip to an unsafe language.

I guess a more direct answer to your question is that when you get down to the hardware, you have to deal with weirdo bit fields, volatile hardware registers, strange read/write ordering requirements, interlocks, and even some totally non-interlockable volatile stuff. C can do it (although I admit it uses language features not everyone sees every day).
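For anyone who hasn't seen those features, here is a rough sketch of what "volatile hardware registers" and bit fields look like in practice, written as C++ that is close to plain C. The `TimerRegs` layout, the bit positions, and the address in the trailing comment are all invented for illustration; a real part's datasheet defines the actual ones.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of a timer peripheral; register names, order, and
// bit positions are assumptions made up for this sketch.
struct TimerRegs {
    volatile std::uint32_t control;  // bit 0: enable, bits 1-2: mode
    volatile std::uint32_t reload;   // PWM period in timer ticks
    volatile std::uint32_t compare;  // PWM duty in timer ticks
};

constexpr std::uint32_t kEnableBit = 1u << 0;
constexpr std::uint32_t kModeShift = 1;
constexpr std::uint32_t kModeMask  = 0x3u << kModeShift;
constexpr std::uint32_t kModePwm   = 0x2u;

// Configure PWM with read-modify-write sequences on the control register.
// volatile makes the compiler emit every access, in program order.
inline void configure_pwm(TimerRegs& t, std::uint32_t period, std::uint32_t duty) {
    t.control = t.control & ~kEnableBit;  // disable before reconfiguring
    t.reload  = period;
    t.compare = duty;
    std::uint32_t c = t.control;
    c = (c & ~kModeMask) | (kModePwm << kModeShift);
    t.control = c | kEnableBit;           // select PWM mode, then enable
}

// On real hardware the block would sit at a fixed address, e.g.:
//   auto& timer = *reinterpret_cast<TimerRegs*>(0x40000000);  // assumed address
```

Note that `volatile` only constrains the compiler. Ordering as seen by the bus or by other cores still needs whatever barriers the architecture requires, which is part of the "strange read/write ordering requirements" mentioned above.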


Funny you should mention the Cortex-M :) https://github.com/helena-project/tock/ (written in Rust, just linked to it down thread as well)


I will check that out.


$ grep -r unsafe | wc -l

292


Sure. Hardware is inherently not safe; the key is to build up safe abstractions around it by using the type system.


> C is the first language that both stayed out of your way and also provided reasonable productivity features

If we ignore the history of computers outside AT&T....


Example? Standard language that never required a trip to assembly language for system level programming?

On IBMs, PL/I didn't cut it in that regard.

On a UNIVAC you could do a lot of system programming in FORTRAN, but only because there were a zillion proprietary extensions that accessed system programming primitives.

BASIC on a Commodore 64 had PEEK() and POKE() but that doesn't count either, very non-standard.

FORTH? Nope... still asm at the bottom.

Concurrent Pascal? Nope... not safe, despite the type checking, because you could twist variant records however you liked. And I think there was asm at the bottom anyway, IIRC.

COBOL, well, CDC COBOL had a bunch of extensions, too, but you still ended up in asm for system programming.

Show me the source code for a device driver in a high-level language that pre-dates C/B/BCPL.


> On a UNIVAC you could do a lot of system programming in FORTRAN, but only because there were a zillion proprietary extensions that accessed system programming primitives.

Just like pure ANSI C without compiler extensions.

> Standard language that never required a trip to assembly language for system level programming?

Try to implement 100% of libc in ANSI C without writing one single line of Assembly, syscalls into the underlying OS or compiler specific language extensions.

> Show me the source code for a device driver in a high-level language that pre-dates C/B/BCPL.

ESPOL on Burroughs B5000, 1961.


You shouldn't complain about extensions to the language. C needs extensions too. For example, many architectures have 3 (or more) memory spaces: program/code, data, IO. If you want to do anything on IO, you'd have to rely on non-standard (proprietary) extensions to C, or, like FORTH, get to asm at the bottom, and call those functions from your C program.


> FORTH? Nope... still asm at the bottom.

I call shenanigans!


Either my sarcasm detector is broken today, or you've never read the source code of a FORTH system. Or perhaps you are commenting that FORTH doesn't have high-level programming constructs -- but of course FORTH allows meta-programming, so you roll your own.


I'm saying that it's shenanigans that he's justifying disqualifying FORTH as a systems programming language because "it's ASM at the bottom." So I'm on your side of that debate.


> Example? Standard language that never required a trip to assembly language for system level programming?

That would exclude C as well.


> I guess a more direct answer to your question is that when you get down to the hardware, you have to deal with weirdo bit fields, volatile hardware registers, strange read/write ordering requirements, interlocks, and even some totally non-interlockable volatile stuff. C can do it (although I admit it uses language features not everyone sees every day).

I agree with you that building an entire OS to interface with hardware without using _any_ unsafe code is, by some definition of the word, impossible.

However that doesn't mean you must limit yourself to unsafe languages for the other 90% of the operating system code.


Sure it can and it has been done multiple times, although the industry hasn't cared much besides high integrity software, where software errors cost human lives.

For an example, check the "Project Oberon" book.

http://people.inf.ethz.ch/wirth/ProjectOberon/index.html

The revised 2013 version, with its own FPGA as replacement for the Ceres Workstation hardware of the original version.


"Why doesn't everyone agree with me that statically enforced safety is the only necessary consideration when building hobby projects?"

And yes, an operating system can be developed from scratch with Rust, maybe with a sprinkle of assembly, but every attempt I've seen makes widespread use of `unsafe', so the value mostly comes from Rust's modern type system with its traits and enums.


> every attempt I've seen makes widespread use of `unsafe', so the value mostly comes from Rust's modern type system with its traits and enums.

http://os.phil-opp.com/ follows the Rust way of making safe abstractions as much as possible.


Dave Evans at the University of Virginia taught a class on that in Spring 2014: http://rust-class.org/

Rust has changed a lot since then, but you might take a look there.


> unsafe languages

There is no such thing.


> Can an operating system be developed from scratch using a safe language such as Rust?

We don't know.

> If yes, why people still even think about doing it with unsafe languages?

Because it works.


> > Can an operating system be developed from scratch using a safe language such as Rust?

> We don't know.

Sure we do. Redox[0] boots into a desktop environment with a filesystem, etc etc. What we don't really know is how performant/portable/etc you could make it if you put the same amount of manpower into it as went into <other operating system of choice>.



We do know that some prototype can boot. We don't know of such an operating system being in use in any environment for any purpose.

Damn, we don't know could the browser engine written in Rust work or not, despite the fact writing a browser engine was the goal of Rust from the day one.

By the way, network card driver can be written in LuaJIT (and perform well) too[1]. Shall we do it?

[1] http://lukego.github.io/blog/2013/01/03/snabb-switchs-luajit...


> Damn, we don't know could the browser engine written in Rust work or not, despite the fact writing a browser engine was the goal of Rust from the day one.

We kind of do at this point. It renders reasonably well, and faster on some workloads than existing browsers. The majority of work still to do involves chasing down rendering bugs and building a shell around it. That's a lot of work, but it isn't the sort of work that breaks new ground, like integrating JS and the DOM and a GC into Rust was; it's regular boring work. (There are some exciting things going on - see WebRender - but they're mostly theoretically optional, and the browser would work perfectly well and be performant with much of the same infrastructure it has today.)

> By the way, network card driver can be written in LuaJIT (and perform well) too[1]. Shall we do it?

Lua's already used for some embedded development. Using LuaJIT as a network driver in some cases seems reasonable to me - especially if you were running it as a userspace process under a microkernel.


Kind of, indeed. Last time I saw a comparison of html5ever with a plain C HTML parser, we suddenly realized it is painfully slow, 'uses the proof-of-concept DOM' and whatnot[1]. So we don't know. We've yet to see anything written in Rust used in production environment, just anything.

> Using LuaJIT as a network driver in some cases seems reasonable to me

Why don't we see LuaJIT zealots in every thread on every board in every discussion of something written in C or C++, I wonder?

[1] https://www.reddit.com/r/programming/comments/4snfz7/the_fir...


> we suddenly realized it is painfully slow, 'uses the proof-of-concept DOM' and whatnot

Err, no, you've misread that. html5ever parses into an arbitrary data structure provided by the program using html5ever. It also ships with its own data structure to parse into for testing purposes (the "proof-of-concept DOM"), which Servo does not use. It's therefore not a benchmark of html5ever as used in Servo.

More reasonable would be benchmarking Firefox's HTML parsing as it stands in Firefox against Servo's HTML parsing, from text to a fully built up DOM as usable by the rest of the browser in both cases.

On the other hand, in some rendering benchmarks (once the DOM has loaded and the page is actually rendering), Servo is much, much faster than other browsers[0].

[0] https://www.phoronix.com/scan.php?page=news_item&px=Google-S...


> We've yet to see anything written in Rust used in production environment, just anything.

This is not true: https://www.rust-lang.org/en-US/friends.html


An OS in C++?

But Linus said...

http://harmful.cat-v.org/software/c++/linus


As others have stated, that is just Linus's opinion, which he is entitled to. It's even understandable given his position managing all of the Linux kernel and git development work, and the large number of people who contribute (or try to anyways...). If you don't want people to blow their damn leg off, don't give them a shotgun (http://programmers.stackexchange.com/questions/92126/what-di...).

That said, this absolutely doesn't mean that you CAN'T do systems level programming in C++. I've actually done real RTOS and motor control work on a small embedded platform that went in a robot using C++! This is in an actual shipping product too, not just a hobby project...

As Linus said, it's definitely easier to come up with something inefficient in C++, and you do have to limit yourself to a sane set of features. That said, I think some of the things C++ brings to the table can make development a lot easier without sacrificing performance as long as you have a disciplined team and sane coding guidelines. But what makes sense for a personal project or a small team may not make sense for a larger project, and while I think Linus is justified in his opinion, you definitely shouldn't take it to mean that you CAN'T do systems programming in C++.


Totally agreed with your comment until that:

> as long as you have a disciplined team and sane coding guidelines

Is it not possible at all anymore to just hire people who actually understand what they are doing? Or is cargo cult here to stay?

I am sorry - please don't take that as a personal attack, but explaining to people that C++ can actually be used in OS development by mindlessly sticking to some rules doesn't seem like the way to me... but rather a counter-argument to what you said yourself.


It's more that certain features can be way more expensive than they look.

e.g. say you're working on an embedded system, and you want a string, so you do:

    std::string s = "fnord";

At first it seems to work fine --- but your firmware image's RAM requirements have just gone up by 32kB, and a week later there's a crisis when adding another feature causes the system to stop linking because the RAM address space is full.

What happened is that std::string uses the heap to store the string data, so adding the line above caused the linker to pull in all the heap code and allocate a 32kB block of RAM to put the heap in. Because previously, the product wasn't using a heap: it was using static memory allocation throughout.

That example's contrived, but only a little. I've done the must-avoid-dynamic-memory-allocation dance many times in real life. (I've also discovered that printf() requires a raise() implementation on some platforms.)
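As a rough illustration of the static-allocation alternative, here is a sketch of a fixed-capacity string whose storage lives inside the object, so nothing in it pulls in a heap. `FixedString` is a made-up name for this example, not a standard or library type, and truncating on overflow is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity string: the buffer is a member, so an instance
// can be static or live on the stack and never allocates dynamically.
template <std::size_t Capacity>
class FixedString {
public:
    FixedString() { buf_[0] = '\0'; }
    explicit FixedString(const char* s) : FixedString() { append(s); }

    // Appends as much of s as fits; returns false if the text was truncated.
    bool append(const char* s) {
        const std::size_t n = std::strlen(s);
        const std::size_t room = Capacity - len_;
        const std::size_t take = n < room ? n : room;
        std::memcpy(buf_ + len_, s, take);
        len_ += take;
        buf_[len_] = '\0';
        return take == n;
    }

    const char* c_str() const { return buf_; }
    std::size_t size() const { return len_; }

private:
    char buf_[Capacity + 1];  // +1 for the terminating NUL
    std::size_t len_ = 0;
};
```

With this, `FixedString<16> s("fnord");` costs 17 bytes plus a length field wherever `s` is placed, and that cost is visible in the link map up front.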

A more realistic one is that embedded platforms typically have exceptions turned off (usually along with RTTI), which means no throwing exceptions from constructors, which means two-phase construction throughout your program and you have to be really careful about which bits of the STL you use...
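For readers who haven't met two-phase construction, a minimal sketch follows. The `Uart` class, the `Status` codes, and the baud-rate limit are hypothetical; the point is only the shape: a constructor that cannot fail, plus an `init()` that reports failure through its return value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status codes for a build with exceptions disabled.
enum class Status : std::uint8_t { Ok, BadBaudRate, NotInitialized };

class Uart {
public:
    // Phase 1: the constructor has no way to report failure, so it only
    // sets safe defaults and touches no hardware.
    Uart() = default;

    // Phase 2: fallible setup reports errors through the return value.
    Status init(std::uint32_t baud) {
        if (baud == 0 || baud > 1000000u) {  // limit is an assumed example value
            return Status::BadBaudRate;
        }
        baud_ = baud;
        ready_ = true;
        return Status::Ok;
    }

    // Every operation has to check that phase 2 actually succeeded.
    Status write(char c) {
        if (!ready_) return Status::NotInitialized;
        last_ = c;  // stand-in for a write to a transmit register
        return Status::Ok;
    }

    std::uint32_t baud() const { return baud_; }
    char last_written() const { return last_; }

private:
    std::uint32_t baud_ = 0;
    bool ready_ = false;
    char last_ = '\0';
};
```

The cost is visible in `write()`: the "is this object usable yet?" check that an exception-throwing constructor would have made unnecessary now has to appear in every member function.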


I do of course understand what you mean.

> At first it seems to work fine

It so happens that I have been working in OS- and low level areas for many years, and independently am also a pretty early adopter of C++. I have a reflex of routinely giving a quick glance to the link map, and am frequently dumping assembly for areas of code I have doubts in.

All this to say that reading 'it seems to work fine' in the context of OS development kind of provokes a skin reaction in me.


In your string example (and in RTOS/C++ programming in general) couldn't you just change the default allocator to not use heap memory? Then continue using std::whatever? Of course you'd have to keep a very close eye on your memory pool, but wouldn't this be one way to solve the problem?
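One way this could look, assuming a C++17 toolchain whose standard library ships `<memory_resource>` (not a given on embedded targets), is a `std::pmr::string` drawing from a fixed buffer. The `build_in_pool` helper is made up for this sketch. Note also that the overflow path below is `std::bad_alloc`, so with exceptions disabled the failure behaviour depends on the toolchain.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>

// Builds a string whose character storage comes from a caller-supplied pool.
// Using null_memory_resource() as the upstream means running out of pool
// fails loudly (std::bad_alloc) instead of silently falling back to the heap.
inline std::size_t build_in_pool(std::byte* pool, std::size_t pool_size) {
    std::pmr::monotonic_buffer_resource arena(
        pool, pool_size, std::pmr::null_memory_resource());
    std::pmr::string s(&arena);
    s.reserve(64);  // one allocation, served from the pool
    s += "a string long enough to defeat any small-string optimization";
    return s.size();
}
```

A monotonic arena never reuses freed memory until it is destroyed, so a string that keeps growing will eat the pool quickly; that is the "very close eye" part.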


Quite a lot of embedded architects would say that allocation should never happen at all. That rules out the STL and much modern C++ style.


No No No! Don't apologize, it's a great question! And to be fair I'm absolutely against "cargo cult" programming, and my point was actually that when you are doing C++ in this type of environment, you can't always stick to a hard set of rules or the conventional wisdom. When I say guidelines, I mean just that, and not a rigorous law you MUST adhere to.

As an example: When I was working on a project for a very constrained embedded device, we needed to get some extra man-power on our team for a few sprints to help out with some functionality. One of the pieces of our system was a "debug console" I had written that allowed some interactivity with the system over a serial port. The new guy was a very sharp engineer, but he typically worked on higher level stuff than we were doing. He wanted to add some functionality to the debug console, and dutifully started writing it using the C++ string handling libraries. Consequently he blew our stack budget, and we ended up very quickly rewriting part of it together.

Now, the point is he wasn't doing anything using some crazy STL functionality or Boost, and he was doing the "right" thing by handling strings using the Standard Library functionality. What we had to do for our system was actually bend the conventional knowledge ("Don't write a string handling library yourself"), because we knew exactly what we needed, and exactly what resources we had.

So perhaps I could have phrased my point better. When I say "have a disciplined team and sane coding guidelines", I don't mean a team that codes by the book, I mean a team that knows what it is doing, and knows when the rules are meant to be bent. In our case sane coding guidelines meant we did things that were against the conventional wisdom, but they were sane, because they were justified in our case based on our engineering analysis. We were certainly open to breaking / changing these guidelines but it had to be justified. (And in fact our 'guidelines' were less a set of rules about how you needed to do every last detail, and more of a set of project specific "design patterns" and a large set of lessons learned in a shared wiki page which described issues we had run in to, and justified certain design decisions that were made)

(Edit: Other examples included disabling RTTI, and completely disabling and disallowing the usage of C++ exceptions to write our own error handling. Against the common advice to use what the language gives you, but made sense for our application)

Again, no need to apologize! I could have made my point clearer, and I hope I did, but please feel free to follow up with me! I'm always looking for ways to improve :-)


Ha. You posted this while I was writing my answer, and I see you (very nearly) used the exact same pair of examples that I did. This is possibly a hint as to where the pain points are...

(Your post is better than mine, though.)


Haha, I was just about to reply to your comment with nearly the same thing! I think one day I may need to do a book on C++ for embedded folks. Chapter 1 will be "Please don't use std::string!".


Could hold true for the HLL guys as well. I seem to remember at least a couple of performance analyses of apps in a High-Level Language where string concatenation was killing performance.

Easy to do. Tough to always remember the impact of what's going on under the hood.


HLLs (at least those which have a VM) often try interesting techniques to combat this, since string handling is so often a performance problem (it's a performance problem because they often make it relatively painless to manipulate strings, at least to the point where it's not obvious you may be doing something really inefficient). If you follow a language while it's being developed, you can usually see a few of the techniques they've used come and go: starting with simple string handling, then global shared copy-on-write strings, then ropes, possibly a fourth weird representation, and likely back to one of the prior, simpler models.

At least, that's what I hazily recall from the long, jumbled Perl 6 history, but that includes a few changes to the language and multiple VMs and multiple string handling regimes per VM, sometimes.


or iostreams. BTW I sure would lap the book up.


Thank you, I get your point. Re-reading it, and other comments here, I am reinforced in a suspicion I have had for some time: people don't do operating systems in C++ simply because they cannot hire big enough teams of C++ developers skillful enough to code in an OS environment.


What kind of C++ are you writing? That's the real question. My friends and I who work on bare metal talk about "embedded C++". Yes, it is C++, and you can take advantage of some C++ features, but some things are not in the mix. For instance, if you have 8K of physical SRAM and no virtual memory manager, new and delete aren't going to do a lot for you.

C++ has its place in OS development, but you need to know exactly what code is being generated, and you need to understand with excruciating precision how memory is being allocated.


That's Linus's opinion, yes. Not everyone has to share it.


It's a really annoying widespread opinion. I've been looking for people who really do write OSes in modern C++.


IncludeOS? [0]

From the CppCon 2016 presentation description:

"Early in the design process we made a hard choice; no C interfaces and no blocking POSIX calls. We’ve done everything from scratch with modern C++ 11/14 - Including device drivers and the complete network stack all the way through ethernet, IP and ARP, up to and including UDP, TCP and recently also an http / REST API framework. To achieve maximum efficiency we decided to do everything event based and async, so there's plenty of opportunities to use lambdas and delegates." [1]

[0] http://www.includeos.org/

[1] https://cppcon2016.sched.org/event/d30a43dae4a490dce81a3dfc6...


Is that really a "hard" choice? Aren't blocking POSIX calls usually just a case where the kernel blocks for you on what is essentially an asynchronous operation anyway?

Maybe it's meant to be hard in the "we decided we weren't going to be POSIX compliant, which has implications" sense and not the hard-to-implement sense?


Nice :) I didn't know this project. I'll definitely read about this project.


I'm Alfred from IncludeOS. We're open source, so check us out on GitHub if you want to try IncludeOS or would like to participate: https://github.com/hioa-cs/IncludeOS

We also have a chat if you have any questions: https://gitter.im/hioa-cs/IncludeOS


Hello.

I'm the author of the mentioned project. It's indeed a very annoying opinion. Although there are indeed some difficulties in writing an operating system and quite some runtime support to implement, I would say that it's worth it only to have a more powerful language. I'd rather use a language that I really like rather than be forced to use one I don't really care about.

I cannot say that C++ is the best language to develop an operating system, but I would definitely say that it's possible if you really know the language. And you don't have to use the complete language (I didn't enable exceptions nor RTTI in my OS).


I don't write OSes in modern C++ at the moment (most of my day job is embedded Linux nowadays, go figure...) but I've seen a lot of sane, OS-level C++ code. Even C++ code that I have bad memories about (uh, Symbian) is partly justifiable.

The only real beef I have with C++ is its complexity. As I did less and less high-level programming, I forgot about many of C++'s pitfalls and nowadays I'm always tiptoeing when I have to write C++ code. I think a lot of it is unwarranted, or at least not needed when you're writing system-level code (but my impression may be distorted by the fact that I'm used to embedded systems and high-reliability applications; my sight may be narrow here).

Edit -- oh, by the way: it's worth pointing out, in the context of a thread that mentions Torvalds' opinion on the matter, that the whole thing was written a while ago.

The Internet endlessly recycles some of these arguments, e.g. STL still gets a lot of criticism that hasn't been true in a while. Yes, STL was terrible, terrible fifteen years ago, but that's, like, 50 years in computer years.


The STL might be the best programming success story. It used to be awful, but now I rank it among the best standard libraries of any language.


that's because Stepanov named and shamed standard library and compilers implementors whose standard library wasn't compliant/fast enough.


BeOS, Symbian, L4, Genode, Mac OS X drivers, and big parts of Windows are all examples of C++ use in OS development.


BeOS is the reason I learned C++. I don't recall exactly, but either I didn't understand how or there was no other option but C++ if one wanted to hit the BeOS API. Pity Be Inc. got shafted so bad by MS monopoly abuse.

They had zero vendors willing to put their OS on a box and resell it because MS threatened the vendors with pulling their right to sell Windows if they did. This wasn't particular to Be, but to any other OS.


Yeah, still have my Be CDs stored somewhere.

However regarding Microsoft, I actually think that vendors were as guilty as Microsoft.

They could have chosen not to take Microsoft's discount and try to sell alternative OSes, even if it meant having to face a few challenges.

The one that takes is as guilty as the one that gives.


I would have to really dig for the emails the company sent out (I should have them archived someplace), but I thought the issue wasn't discounts, but a revoking of the license to sell Windows if the OEMs didn't comply.

I did a quick search and came up with Quora answer given here: https://www.quora.com/Why-was-the-BeOS-dropped

but this doesn't jibe with my memory. I will have to see if I can find an old email or maybe dig further on the net. I certainly don’t want to be rewriting history.


Even if it was revoking the license, vendors had an option, and they chose the easy way out.

I remember quite a few small shops trying to put up a fight selling other kinds of computers. They might have lost in the end, but they tried to walk a different path.


I disagree: there is a lot of competition between PC builders, so this 'discount' isn't really optional, and Microsoft should have been heavily punished for offering it.


Funny this thing to blame just the side that gives but not the one that accepts.


I think only L4/Fiasco.OC and L4ka::Pistachio are C++; other L4-style kernels (seL4, OKL4, among others), including L4 proper, are written in C.


And there was Chorus back in the 90's.


Genode (essentially a microkernel abstraction layer, and some useful userland, a full Qt implementation and a reasonable UNIX-y layer) is entirely C++, with some C++ microkernels available for it (Fiasco.OC and Nova being the two I'm playing with).


I'm the maintainer of a proprietary C++14 RTOS for work. What do you want to know?


I'm not OP, but I'm interested in the fact that you specified C++14. Are there any features specific to C++14 that you use? I have about two university classes worth of experience with low-level software, but I can really only see the deprecation attribute and binary literals being useful? All of the other language additions (added support for type deduction, templates, lambdas, and the mixing of all three to various degrees) seem like they would take up too many resources to be useful.


We switched from C++11 to C++14 mainly for constexpr (it existed in 11, but had a lot of restrictions that limited its usefulness).
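As a rough illustration of the difference (my own sketch, not code from the RTOS being discussed): C++14 lets a constexpr function contain local variables and loops, whereas in C++11 the body essentially had to be a single return statement. `bit_mask` is a hypothetical helper name.

```cpp
#include <cstdint>

// C++14 relaxed constexpr: a local variable and a for loop are allowed.
// Builds a register mask with bits lo..hi (inclusive) set.
constexpr std::uint32_t bit_mask(unsigned lo, unsigned hi) {
    std::uint32_t mask = 0;
    for (unsigned bit = lo; bit <= hi; ++bit) {
        mask |= (std::uint32_t{1} << bit);
    }
    return mask;
}

// Evaluated entirely by the compiler; no code runs on the target for these.
static_assert(bit_mask(4, 7) == 0xF0u, "bits 4..7");
static_assert(bit_mask(0, 0) == 0x01u, "single bit");
```

In C++11 the same function would have to be written recursively or as one big expression, which is the kind of restriction that makes people give up and reach for a macro or a hand-computed constant.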

The advanced template techniques (and to a degree, I'm throwing type deduction in there too), when treated skeptically do lead to more efficient code. It's easy to go off the deep end though, which is why I said "when treated skeptically". As for lambdas, we have an entirely asynchronous OS, so they're really nice for callback glue.
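For anyone wondering what "callback glue" can look like without dragging in heap-allocating wrappers, here is a minimal sketch under my own assumptions: a fixed-size queue of (function pointer, context) pairs, fed by a captureless lambda, which converts implicitly to a plain function pointer. `EventQueue`, `Uart` and `on_rx_interrupt` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// One pending callback: a plain function pointer plus an opaque context.
struct Event {
    void (*handler)(void*);
    void* context;
};

// Hypothetical fixed-capacity FIFO of events; no dynamic allocation.
template <std::size_t N>
class EventQueue {
public:
    // Queues a callback; returns false when the queue is full.
    bool post(void (*handler)(void*), void* context) {
        if (count_ == N) return false;
        events_[(head_ + count_) % N] = Event{handler, context};
        ++count_;
        return true;
    }

    // Runs every pending event in FIFO order; returns how many ran.
    std::size_t run_pending() {
        std::size_t ran = 0;
        while (count_ > 0) {
            const Event e = events_[head_];
            head_ = (head_ + 1) % N;
            --count_;
            e.handler(e.context);
            ++ran;
        }
        return ran;
    }

private:
    Event events_[N] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Hypothetical device state touched by the deferred callback.
struct Uart {
    int bytes_received = 0;
};

// The lambda has no captures, so it decays to void (*)(void*).
inline void on_rx_interrupt(EventQueue<8>& queue, Uart& uart) {
    const bool queued = queue.post(
        [](void* ctx) { static_cast<Uart*>(ctx)->bytes_received += 1; },
        &uart);
    assert(queued);
    (void)queued;  // silences the unused warning when NDEBUG is defined
}
```

The handler body sits right next to the place that schedules it, which is most of the appeal; the cost is the same as a hand-written static function and a `void*`.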

EDIT: Binary literals actually come up way less than you'd think, even for deeply embedded (I think the smallest thing we ship on currently is 16KB of RAM). Everyone here knows hex like the back of their hand.


Yes, but his opinion counts more than some John Doe's as Linus supports his case with solid arguments.


'It's made more horrible by the fact that a lot of substandard programmers use it, to the point where it's much much easier to generate total and utter crap with it. Quite frankly, even if the choice of C were to do nothing but keep the C++ programmers out, that in itself would be a huge reason to use C'

Solid argument that.

His two other arguments:

- infinite amounts of pain when they don't work (and anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny)

Considering that the linux kernel is strongly tied to the specific C dialect supported by GCC and is not portable at all, this argument is bullshit.

- inefficient abstracted programming models where two years down the road you notice that some abstraction wasn't very efficient, but now all your code depends on all the nice object models around it, and you cannot fix it without rewriting your app.

You can abstract yourself in a corner in C as well. In C++ you are more likely to pay less for abstractions.

Sorry, but I take it personally when me and my colleagues are called substandard programmers.


>>Sorry, but I take it personally when me and my colleagues are called substandard programmers.

I don't know what kind of programming you do, but he is talking mainly about kernel programming. He is mainly criticizing the people who want to bring C++ in the kernel world. Anyone is welcome to prove him wrong.

But I think, C++ is more suited for applications programming, where efficiency is not a prime concern and abstraction-costs are justified.

Even in such cases, (e.g. git, which is an application program) an excellent programmer like Linus can be very well productive with C and does not need the (sloppy/non-sloppy) abstractions provided by C++.

>>You can abstract yourself in a corner in C as well. In C++ you are more likely to pay less for abstractions.

Agreed. But the point you seem to be missing is that C doesn't force any abstractions on you. The STL/Boost abstractions are much more inefficient.

Linus talks about these inefficient abstractions and that's a solid argument because C++ abstractions come with various sorts of hidden costs. (e.g. even name mangling can be a significant cost factor in Kernel.)

>>Considering that the linux kernel is strongly tied to the specific C dialect supported by GCC and is not portable at all, this argument is bullshit.

The issue of portability to different compilers is not an important concern for kernel programmers. If they find a particular tool (e.g. gcc with some C dialect) perfect for their purpose, why should they bother with other tools?

Remember, for Linux kernel programmers gcc is just a tool to produce their product, the kernel.

edit: added point about portability


>I don't know what kind of programming you do, but he is talking mainly about kernel programming.

This specific rant was about using C++ in git. BTW I work in what could be called soft realtime systems.

> But I think, C++ is more suited for applications programming, where efficiency is not a prime concern and abstraction-costs are justified.

If efficiency is not a prime concern, using C++ is hardly justified.

> [...] the point you seem to be missing is that C doesn't force any abstractions on you.

Nor does C++.

edit:

> Even in such cases, (e.g. git, which is an application program) an excellent programmer like Linus can be very well productive with C and does not need the (sloppy/non-sloppy) abstractions provided by C++.

and that's perfectly fine, if one feels more productive in a certain language, more power to him, but please let's not spread FUD.


>>If efficiency is not a prime concern, using C++ is hardly justified.

Well, yes and no. Yes, I agree with you as C++ gives you more control than most other high level languages out there, e.g. more control over how you manage your memory. I can hardly imagine someone writing kernel in managed languages, like, Java.

No, I don't agree with you because sometimes, when very low-level aspects of machine also become prime efficiency concern, using C++ is hardly justified. See my point about C++ not forcing any abstractions on you, given at the end.

It reminds me what someone once half-jokingly said about C and assembly: "C gives you all the power of assembly language with the same ease of use".

In the kernel world, it becomes a joke, as the level of abstraction provided by C is very high as compared to the one provided by assembly language (e.g. mainly due to struct, union and cleaner subroutine syntax) and the cost of this abstraction is extremely low.

The benefits of using C are tremendous: e.g. code portability and readability.

>>Nor does C++.

Yes, but when you don't use any non-C abstraction provided by C++, it reduces almost entirely to C (barring templates).

Templates are extremely good mechanism to provide abstraction (especially as compared to inheritance) but their cost (e.g. cost in terms of code bloat and in terms of the cognitive load if one actually wants to dig deeper and see/tweak the generated code to investigate/address some performance issues) seems prohibitive at least in the performance sensitive kernel programming.

The kernel hackers have found a neat-but-not-so-neat way around it: by using C macros. Macros are in fact C's templates. I am not saying macros lead to cleaner code and so on but when you compare them to C++ templates, their cost-benefit equation in the kernel programming world seems justifiable.
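A small sketch of the trade-off, using a textbook macro rather than anything taken from the kernel sources: the naive macro pastes its argument text twice, so an argument with side effects runs twice, while the function template evaluates each argument exactly once. `MAX_MACRO`, `max_template` and `next_id` are names invented for this example.

```cpp
#include <cassert>

// Naive generic "max" as a macro: the winning argument is evaluated twice.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// The same idea as a function template: each argument is evaluated once.
template <typename T>
constexpr T max_template(T a, T b) {
    return a > b ? a : b;
}

// Helper with a visible side effect, to expose double evaluation.
inline int next_id(int& counter) { return ++counter; }

inline void macro_vs_template_demo() {
    int macro_counter = 0;
    const int from_macro = MAX_MACRO(next_id(macro_counter), 0);
    assert(from_macro == 2 && macro_counter == 2);  // next_id ran twice

    int template_counter = 0;
    const int from_template = max_template(next_id(template_counter), 0);
    assert(from_template == 1 && template_counter == 1);  // ran once
    (void)from_macro;
    (void)from_template;
}
```

The template also gets type checking and shows up properly in a debugger, at the price of one instantiation per type used.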

>>and that's perfectly fine, if one feels more productive in a certain language, more power to him, but please let's not spread FUD.

I agree with you whole heartedly about one's choice of language. I personally would have chosen C++ and even Python over C to implement an application like git or its parts.

Not to play advocate for Linus here (he doesn't need a half-witted advocate like me), but he seemed to be spreading FUD about C++ because, supposing C++ were allowed, he seems to feel that many people will start using its abstractions without being aware of their costs. It's very easy to get tempted to use available abstractions and if the abstractions start leaking (as he pointed out) then fixing the code that relied on those abstractions becomes a difficult issue.


> The STL/Boost abstractions are much more inefficient.

Well, the STL/Boost libraries are designed for certain goals. If you have different goals, you should use something different. But in any case, which specific abstractions are you talking about? Which scenario are you optimizing for and what common case, worst case perf numbers are you looking to hit?

>. (e.g. even name mangling can be a significant cost factor in Kernel.)

What in the world are you talking about?


> Anyone is welcome to prove him wrong.

Apple and Microsoft already did.


Please provide some links.



> You can abstract yourself in a corner in C as well.

It's much, much harder to do than in C++. It's also much more obvious when it happens, and in collaboration others will spot it quickly.

His tone is harsh but I find it completely right. I prefer to use C at work exactly because the same kind of software would be an absolute nightmare to write in C++ (it is very low-level soft, extremely similar to kernel code).

C++ is good if you can enforce very strict guidelines and if every single programmer that contributes to the code is very good at C++. Those are pretty big ifs, especially if you work with a partially open source codebase.


As long as you avoid "virtual", I can't even think of any C++ features which would lead you towards worse performance characteristics than C.

Templates may bloat your binary size and increase compile times, but they're plenty fast.
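A minimal sketch of the two dispatch styles being compared, with made-up names (`Device`, `VirtualSensor`, `StaticSensor`): the virtual version goes through a vtable unless the compiler manages to devirtualize the call, while the template version is resolved at compile time and produces one instantiation per device type, which is where the binary-size cost comes from.

```cpp
#include <cassert>

// Dynamic dispatch: callers only know the abstract interface.
struct Device {
    virtual ~Device() = default;
    virtual int read() = 0;
};

struct VirtualSensor final : Device {
    int read() override { return 42; }
};

// Normally an indirect call through the vtable.
inline int poll_dynamic(Device& device) { return device.read(); }

// Static dispatch: no base class, no vtable.
struct StaticSensor {
    int read() { return 42; }
};

// The call target is known at compile time and can be inlined;
// each distinct D gets its own copy of this function.
template <typename D>
int poll_static(D& device) {
    return device.read();
}

inline void dispatch_demo() {
    VirtualSensor v;
    StaticSensor s;
    assert(poll_dynamic(v) == poll_static(s));
}
```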


There's also making unnecessary copies of things like strings and vectors passed to functions and exception overhead (in binary size at least)


Rvalue references and std::move have largely obsoleted the argument of copy overhead. It makes particularly a dramatic improvement in container efficiency. C++11/14 really is a different language than C++98.
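A short sketch of what that looks like in practice, with hypothetical names (`Log`, `move_reuses_buffer`): a by-value "sink" parameter lets the caller decide between copying and moving, and move-constructing a vector hands over the existing heap buffer instead of copying the elements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink type: add() takes its argument by value, so an
// lvalue argument is copied once and an rvalue is moved all the way in.
class Log {
public:
    void add(std::string line) { lines_.push_back(std::move(line)); }
    std::size_t size() const { return lines_.size(); }
    const std::string& last() const { return lines_.back(); }

private:
    std::vector<std::string> lines_;
};

// Returns true when move construction reused the source's buffer
// rather than allocating a new one and copying 1000 ints.
inline bool move_reuses_buffer() {
    std::vector<int> source(1000, 7);
    const int* original = source.data();
    std::vector<int> target = std::move(source);
    assert(source.empty());  // a moved-from vector is left empty here
    return target.data() == original && target.size() == 1000;
}
```

With C++98 the only ways to avoid the copy were passing by reference everywhere or swapping by hand.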


> Considering that the linux kernel is strongly tied to the specific C dialect supported by GCC and is not portable at all, this argument is bullshit.

Linux has been ported to over two dozen architectures so it is extremely portable. Yes, it is tied to some GCC behaviors, but many operating systems are tied to compilers. Some extensions make the code identical across different archs, like __builtin_return_address.

Windows is tied to particular C features, which is why MSVC is resistant to C99. Plan9 was a special dialect and compiler of C.


The context of course was portability to different compilers, not architectures.


Oh gotcha... I interpreted Linus's comment as being about archs. Boost is supported on far fewer archs than the Linux kernel is:

http://www.boost.org/doc/libs/master/libs/context/doc/html/c...


Note that that's the supported architectures, i.e. those that are routinely tested. 99% of boost is architecture/machine independent and will work anywhere that has a standard compliant compiler.


Sure. Obviously all of this depends if the context is git or the kernel. But Linus (in context of the kernel) cares more about the 1% than the average person, for things like atomics, mutexes, etc.


I do believe that when programming at "machine level" your language needs to be as simple and straightforward as possible. It needs to obey you like a sword. Yes, you need some mastery, but you know that your every action is effective. No lost effort, no mind tricks, no unnecessary complexity.

I find that every abstraction level makes me more unhappy and somehow lost..

So yes, I do believe Linus knows something and for sure he's not trying to fool himself. When you really try to build something to actually be used, all those abstractions will bite you..


The problem with C++ is more about collaboration than with the language. It's harder to misuse C than C++. When you have a large group of people contributing, proper usage matters. Everything Linus rants about is improper usage.


On the other hand, smart pointers (which I think didn't exist when Linus wrote this?) and RAII make C++ harder to misuse than C.
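A minimal sketch of why, using invented names (`Buffer`, `process`): the resource is released by a destructor, so every return path cleans up without anyone having to remember a matching free or a `goto out` label.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource that tracks how many instances are alive.
struct Buffer {
    explicit Buffer(int& live_count) : live(live_count) { ++live; }
    ~Buffer() { --live; }
    int& live;
};

// Both return paths release the Buffer: unique_ptr's destructor runs
// whenever buf goes out of scope, including on the early return.
inline bool process(int& live_count, bool fail_early) {
    std::unique_ptr<Buffer> buf(new Buffer(live_count));
    assert(live_count > 0);
    if (fail_early) {
        return false;  // no explicit cleanup needed
    }
    return true;
}
```

In the equivalent C, the early return is exactly where the leak tends to creep in once a function grows a third or fourth exit.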


I think it's more about the feature surface. C++ is a gigantic language while C is fairly limited. I actually like C++ when I'm the only one writing it, but my experience has always been that for complex use cases (large) c++ code bases tend to always evolve to a mess of complexity, even with expert programmers.

But yeah, smart pointers alone help a lot with writing tidy c++...


There was a time that C++ was considered a gigantic language. Just as there was a time that Common Lisp was considered so large that it collapsed under its own weight.

Many languages, including "simple" Java, are about as large as C++ right now ( https://channel9.msdn.com/Events/GoingNative/GoingNative-201... , around the one-hour and fifteen minute mark). To be honest, one thing that makes working in C++ relatively hard is the fact that C++'s standard library is significantly smaller than the competition. You have to do more yourself.


There is a difference between language complexity and library complexity. The Java language is very simple, but the Java standard libraries are a colossal, complex system.

C++ the language is extremely complex and riddled with pitfalls and minefields. Just try to pin down the formal definition of such a widely-used term as "rvalue". OTOH I find the C++ standard library to be reasonably straightforward and well-designed. The STL in particular is brilliant.


Brainfuck has a limited surface. That doesn't mean it isn't hard or error-prone.


But for a totally different reason. C++ is hard (to me, anyway) because it's gigantic and people tend to make it into a mess. BF is hard because it's intentionally obtuse.


Aren't large parts of Windows written in C++ ?


I'm pretty sure it is! ( I'm also pretty sure that for a subset of the HN crowd, you may have just proven Linus's point :-P )


I think even HN readers must accept that the Windows kernel is rock solid. Maybe even better than Linux - Windows even gracefully handles graphics drivers crashing and can restart them virtually seamlessly. Linux just panics.


Graphics (speaking of modern 3D graphics) on Linux sucks, because NVidia and Co do not really care about it. It has gotten better (e.g. the deep learning crowd is mostly on Linux using cuda), but is still a far-cry from the stability of other sub-systems.

So I would say it is mostly gaming. There are not many games on Linux (again, getting better, but it takes time), so gfx vendors do not allocate big resources to support it, so the developer-experience is suboptimal, the classic chicken-egg problem.


The kernel is written entirely in C. See various WRK releases you can find online.


Until Windows 8, when they introduced C++ support in the kernel and deemed C89 good enough, with the way forward being C++.

Yes, the latest VC++ does support the C99 library, because it is required by the C++ standard, and the new MSVCRT.dll is actually written in C++ with extern "C" entry points.



Does this actually use classes and other features of C++? Scanning a few files on Github I see namespaces being used but not much else.

Very impressive either way - I was just curious how C++ was leveraged.


I'm using a few classes and some class hierarchy as well. I've reimplemented std::vector and std::string and a few other features from the STL and use them both in kernel space and user space (the STL is not standalone, need the glibc and I didn't want to port everything). I'm using quite some templates in the library part. I'm using auto from C++11 and a few constexpr functions. I have disabled exceptions and RTTI. I'm using RAII principle as much as possible (but this can be improved a lot still). I'm using references when I can remove pointers.

On the other hand, a lot of code is clearly very close to C. When you're doing some low-level things, parsing memory structures, paging, ... there are not a lot of features from C++ that can help. Moreover, there is a lot of code that could profit from some refactorings :P


I agree 100% with Linus, especially the boost comment. It was heavily used in a project I worked on and "boost" soon became a curse word.


I don't agree with Linus, but yes, boost is horrible.


I don't agree with you, boost is great.


Rather than continue with unsupported opinion, I'd rather make the specific point that Boost served well as incubator and proving ground for such constructs as shared_ptr and unique_ptr, which have been subsumed into the C++ standard as undeniably huge improvements. As a result, the unfortunate abortion auto_ptr has finally been able to be consigned to a well-deserved resting place in Hell.

Other parts of Boost have been considerably less impressive, and virtually nobody uses them. I struggled at length trying to get Boost::Parameter to work, with zero success. Boost::Format at least works, but ends up cumbersome, and does not approach the usability of {} formatting in Python.


You should not really judge a tool by the bad use people make of it. Hammers are perfectly fine with nails, but they are horrible at cutting bread.


Linus now even uses Qt for its own hobby application.


s/its/his/ ?


Yeah, too late to edit.


Well, when St. Linus Torvalds said those words, (1) g++ was not as good as many other C++ compilers, (2) the world was still full of very bad examples of C++: if you were lucky, back in those days you could have found some projects using C++03 but the majority were stuck on C++98 -- or worse -- and, (3) he invoked the holy principle that "at my place, I make the rules". I do not think that in 2016 there are good reasons not to use C++ for an operating system. Even without using the STL -- which would require custom allocators at that level -- encapsulation of data inside objects, inheritance, templates and namespaces alone are reason enough for me to prefer C++ over C nowadays. In 2016, not using C++ for operating system implementation is more a cargo cult than anything else.



