Sure he doesn't know what he doesn't know, but he has decided to fix that. Which, if you know your history, is not a whole lot different than Linus back when he was calling out Minix for being crap.
The challenge here is that the barrier to speaking on the interwebs is quite low so you can make a fool of yourself if you're not careful.
Jean Labrosse, who wrote uC/OS (which everyone called mucos), in his original thesis statement made many of these exact same arguments. And like your author he made some choices that he felt were reasonable, only to learn through experience that perhaps they weren't as well thought out as he had hoped.
I am a huge fan of people just saying "How hard can it be?" and jumping in. Enjoy the ride; they can surprise you if you underestimate them.
So assuming this person notes that they are getting a ton of traffic from HN, and comes here to investigate, my three suggested books are:
Operating System Concepts, Operating System Implementation, and The Design of UNIX. Preferably in that order. Any decent college library should have all three in the stacks.
Further, there are some good ideas here -- in particular, jumping directly to long mode allows one to avoid much of the replaying of the history of the PC that one is historically required to do to boot an x86. Most civilians (and I dare say, most of the negative commenters here) have NFI how ugly this process is and how much of a drag it is on future development. With the decline of the PC, it's reasonable to believe that a future is coming in which x86 becomes primarily a server-side microprocessor -- and begins to shed much of the baggage from the misadventures of the 1980s and 1990s.
All that said: there is a certain arrogance of youth here, and one hopes that when reality has sandblasted it away, the resulting humility will find its way to a formal computer science education and ultimately into professional software engineering; our discipline needs more people who have dared to write an OS from scratch, not fewer.
If you have a basic aptitude for coding, not knowing how hard a task is isn't a liability; it's a powerful asset. If you haven't actually done real work on an OS, you wish you thought it was as easy as this guy does.
Worth noting that the person that asked him this was Alan Kay.
There are two kinds of fear: fear of the unknown - where you may learn to wade in anyway - and the more awkward fear of the known. The latter, if you believe the project's basis to be questionable, can destroy motivation. It made me quit my last job.
I didn't know a lot about math (I still don't) but I always wanted to write a 3D render engine. I also was afraid of the complexity. But then I figured I didn't know anything about its complexity because I had never tried.
So I started with Processing and wrote a photo-realistic light-tracer. Very slow but very very fun. Then I wrote a multi-core version which was a little faster. Then a path-tracer which was faster. Then an exporter for Blender. Then I ported the project to Java.
Is it as good as current render engines? No! But I don't care because I learned a lot and had a lot of fun.
Then I wanted a CNC router. But I could not afford one. So I just built one and it worked!
Moral of the story: just do it. You will fail sometimes but who cares? You will always learn a lot.
iRobot 10 hours ago | link [dead]
The BIOS on the original IBM PC totally made it easy to code a homebrew bare metal OS which could access all the peripherals. These BIOS calls exist today in i7 based motherboards, allowing OSes coded in the '80s to still (mostly) function on a modern motherboard. Its biggest fault was being 16 bit, which meant all newer OSes needed to write their own once they switched to 32/64 bit mode.
The BIOS layer made the PC easier. I find it frustrating that almost every new SoC I get, even from the same manufacturer, requires me to re-code all my IO routines.
If a common BIOS existed across ARM/x86 which emulated the simplicity of the original BIOS concept and not the API hell you usually get now, there would be a lot more adventurers in homebrew OSes.
Note to iRobot: it looks like your comment 165 days ago about "Melissa[sic] Gates" got your account killed.
Here's the thing about all this: those are questions with answers that are straightforward to find, and he is probably going to find them, because he seems totally fearless.
I could never in a million years write a blog post that so forthrightly laid out the stuff I don't know and sort of plaintively said "I'm going to try to figure this stuff out, and in the meantime, I'm putting system configuration under /system/configuration and not /etc". I think I envy him intensely.
Note to people babbling about what a herculean task building an OS is: I started coding as a teenager in the early '90s and multiple friends of mine wrote plausible protected mode operating systems. A basic operating system is not that hard. Going head-to-head with OS X as a one-person project is insanely ambitious, but who cares? By the time that becomes totally apparent to him, he may well be an unstoppable systems programming juggernaut.
Everything is part of the system, so the system configuration should be either /configuration or /config. But then, you already have /etc... why bother?
Anyway I doubt he's aiming to be POSIX compliant so... he's free to do what he pleases. The journey is going to make him a much better programmer.
As one of those fools that wrote their own OS (a QnX clone, which I'm now wondering whether I should port to the Raspberry Pi in my non-existent spare time) I can completely sympathize.
The good news is that the way clueless newbies learn is by doing, and there is nothing more helpful here than healthy self over-estimation. It stops you from being discouraged when you probably should be and great things can come of that. Worst case he will learn, and probably a lot more than from building yet-another-to-do-list-in-insert-fashionable-language-here.
And if you haven't written your own OS just yet, trust me it is easier than it seems and harder than it seems at the same time. It's easier to get started and to get something working (especially with VMs nowadays, in my time we had to reboot the hardware 50 times per day (cue 'hah!, you had hardware' comment including chisels and stone tablets)) and harder because it is just simply hard to get it perfect.
Oh, and Tanenbaum was right.
Oh, and agreed that Tanenbaum was right. ;)
I think Linus' argument was basically that microkernels require distributed algorithms, and distributed algorithms are more complex.
But maybe in a multicore world that argument is weakened. I like this paper: "Your computer is a distributed system already, why isn't your OS?"
> Did the experience from writing your own OS solidify that belief?
Absolutely. Micro-kernels have many advantages, at the cost of a slight overhead due to message passing (and a large chunk of that overhead can be overcome by using the paging mechanism in a clever way). They're easier to secure and much easier to stabilize, they gracefully support such luxuries as on-the-fly upgrades without powering down, and they allow you to develop drivers in userland, greatly simplifying debugging and testing. They also make hard real-time (and by extension soft real-time) much easier than you could ever do it using a macro kernel.
I've built some pretty large systems using QnX in the '80s and '90s that I would have a real problem re-implementing even today, on today's hardware, without the benefits brought by a micro kernel with network transparent message passing. If you haven't used a setup like that then it is probably hard to see the advantages; it goes way beyond some theoretical debate.
In practice two systems side-by-side, one running QnX, one running Linux will have the QnX system come out way ahead in terms of responsiveness for interactive tasks and things like latency and real world throughput.
We'll never know what the world would have looked like if Linus hadn't been as pig-headed during that whole debate. Likely we wouldn't be stuck with a re-write of a 30 year old kernel.
The bit where Linus got it right and Tanenbaum got it wrong was that GPL'ing an OS was a much better move than doing a deal with Prentice Hall (who published the Minix source). And Minix wasn't the most elegant micro kernel either, which may have skewed Linus' perception of what it was that Tanenbaum was getting at.
My guess is that if he had used QnX instead of looking at Minix, he would have readily agreed with Tanenbaum, but we'll never know about that and Linux is here to stay for a long time.
If you haven't used QnX give it a shot and see how it works for you, you might be pleasantly surprised.
I prefer open source so I've been taking a look at Minix 3. It seems really cool. And it's only 6 or so years old -- at the time of the argument Minix wasn't meant to be a production system, but now it is.
I feel like it must be easier to trace application performance with Minix since you have natural points to insert hooks. With monolithic kernels it's hard to understand what is really going on.
I see a lot of potential advantages of a microkernel in distributed systems. For example, Amazon EC2 has well known I/O sharing issues. With a microkernel, you could fairly easily reimplement the file server with your own custom disk scheduling logic based on identities (not Unix users) and priorities.
In Linux I know there is some work with containers, but I don't think it is as customizable as you would like.
It's not even remotely newsworthy, though.
Programmers seem to come in 3 archetypes: the systems guy who dreams of building a new OS, the language guy who dreams of building a new programming language, and the networking guy who dreams of building a new protocol. Some folks are parts of all three. I once thought that if you wrote an RPG where the characters were coders, these three areas would be where you would add skill points.
a massively minimal os is not that hard.
I for one love being in way over my head. Keeps things interesting!
When Musk decided to build his own rocket, people thought he was a nut. Now he's an icon of innovation. A qualitative leap in any area requires a bold outlook, which is bound to be unpopular with the establishment. As PG once noted, if you experience a lot of opposition, it may be a sign that you're on the right track :)
I think the guy could use our support. HN is not very different from him in spirit.
Do you still recommend the Seventh Edition over the newer release?
What I’ve found out so far:
Master Boot Record (MBR);
Bootloader – the program that takes it over from MBR and loads your Kernel;
How to write your own MBR and write it to disk on Windows.
I’ve written a small utility in Visual C++ that allows reading from and writing to the disk directly (download here, source included for Visual Studio 2010 Express);
How to write bare bones C kernel entry point.
How to write “naked” functions on Windows in Visual Studio
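To make the MBR item above a bit more concrete, here is a minimal Python sketch of how a 512-byte boot sector image can be laid out (Python is used only for illustration; the blog author works in Visual C++). The names `build_boot_sector` and `HALT_LOOP` are hypothetical. The sketch assumes a legacy BIOS boot, where the firmware loads the first sector at 7c00h and expects the bytes 0x55, 0xAA at offsets 510 and 511. It ignores the partition table that a real MBR carries at offset 446, so treat it as a plain boot sector rather than a complete MBR.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # must sit at offsets 510 and 511 of the sector

# Hypothetical minimal payload (16-bit x86 machine code):
#   FA      cli          ; disable interrupts
#   F4      hlt          ; halt the CPU
#   EB FD   jmp short -3 ; jump back to the hlt if the CPU ever wakes up
HALT_LOOP = bytes([0xFA, 0xF4, 0xEB, 0xFD])


def build_boot_sector(code: bytes) -> bytes:
    """Pad boot code with zeros to 510 bytes and append the boot signature."""
    max_code = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > max_code:
        raise ValueError(f"boot code is {len(code)} bytes, limit is {max_code}")
    return code.ljust(max_code, b"\x00") + BOOT_SIGNATURE


if __name__ == "__main__":
    sector = build_boot_sector(HALT_LOOP)
    print(len(sector), sector[-2:].hex())
```

Writing the resulting bytes to a disk image file and booting that in a VM is a lot safer than writing them to a physical disk.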
Missing link - I still don’t know how to properly step from MBR to Bootloader to Kernel, that is – write your own MBR code that would load the bootloader, pass execution to the bootloader, which would in turn load and pass execution to a bare bones C kernel:
What exactly are the Global Descriptor Table (GDT) and Interrupt Descriptor Table (IDT), and how do they look in C and Assembly?
How and when, and again how, if when is later (for example in Long Mode or Protected Mode) to set up all this GDT and IDT stuff. They say you have to set it up before the kernel. Then they say you can set it up with dummy values and set it up later in kernel. Then they say, that to set it up, you have to be in Real Mode, so your kernel (which might be way over 1Mb of real mode space), needs to switch between modes. And then if your kernel is over 1Mb, you can’t access memory locations after 1Mb, and so on… It’s confusing, but I’m going to find it out and post it here later on.
How to handle Interrupts in C?
Will they perform as callbacks that await some return values or do I have to use inline assembly to process them correctly;
Is it possible to write MBR in C?
I do understand that you still have to set ORG to 7c00h, and use some specific assembly instructions, but if they could be wrapped in C using inline assembly and C entry point can be glued with few lines of assembly code, why not?
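On the GDT question in the list above: the table itself is just an array of 8-byte descriptors sitting in memory, plus a small pointer structure that the `lgdt` instruction reads. The following is a minimal Python sketch of that byte layout, assuming the classic 32-bit protected-mode segment descriptor format. The function names `encode_gdt_entry` and `build_gdtr` and the table address 0x1000 are hypothetical, chosen only for illustration; the access bytes 0x9A and 0x92 with flags 0xC are the values commonly used for flat ring-0 code and data segments.

```python
import struct


def encode_gdt_entry(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte segment descriptor (little-endian, as the CPU reads it)."""
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    if not 0 <= access <= 0xFF:
        raise ValueError("access must fit in 8 bits")
    if not 0 <= flags <= 0xF:
        raise ValueError("flags must fit in 4 bits")
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                        # limit bits 0..15
        base & 0xFFFF,                         # base bits 0..15
        (base >> 16) & 0xFF,                   # base bits 16..23
        access,                                # present, ring, type bits
        (flags << 4) | ((limit >> 16) & 0xF),  # flags + limit bits 16..19
        (base >> 24) & 0xFF,                   # base bits 24..31
    )


def build_gdtr(gdt: bytes, address: int) -> bytes:
    """Pack the 6-byte operand for lgdt: 16-bit (size - 1), 32-bit address."""
    return struct.pack("<HI", len(gdt) - 1, address)


# A flat-memory table: mandatory null descriptor, then code and data segments.
GDT = (
    encode_gdt_entry(0, 0, 0, 0)
    + encode_gdt_entry(0, 0xFFFFF, 0x9A, 0xC)  # code: 4 KiB granularity, 32-bit
    + encode_gdt_entry(0, 0xFFFFF, 0x92, 0xC)  # data: 4 KiB granularity, 32-bit
)

if __name__ == "__main__":
    print(GDT.hex())
    print(build_gdtr(GDT, 0x1000).hex())  # 0x1000 is a made-up load address
```

In C or assembly the same table is usually written as a packed struct or a run of `dw`/`db` directives; the bytes end up identical either way.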
It sounds like you don't have the experience required to make an OS. I certainly don't either (I'm no C-head) so I am in no position to snark, but you're going to fail in this endeavour.
That doesn't mean it's pointless, though. I think it'll be a tremendous learning experience in getting to grips with the core of how computers actually work. So, good luck. Just don't go thinking you're going to make the next Linux out of this.
EDIT: It's also important to note that the author didn't submit this to HN. He didn't say "take this, HN amateurs!", he just posted something on his blog that someone else picked up.
There is a big difference between saying "You don't have what it takes" and saying, "I think the obstacles you face will be very large, good luck!"
You really need to be extremely familiar with computer internals, have a very good grasp of the instruction set architecture of the platform you're targeting, have expert C knowledge, and at least be very comfortable programming with assembly.
And even then you're going to fail unless you're working with a large team - there are just too many parts of a modern operating system for one person to ever tackle. Even if you only want to support a single set of hardware (one graphics card, one network card, etc), you'd spend years just writing drivers for everything and by then the hardware would be long obsolete.
I'm not trying to say that he shouldn't do it - it is great for learning how computers work on a low level - but he really does need lots of experience, and should go into it with the correct expectations.
Operating systems become large because of hardware support. They are also large now because what is considered 'operating system' has changed - from including a desktop environment to including a web browser.
The reason why OS X and Windows have large teams and year-long development cycles is because they are a complete stack of applications, not just a kernel, fs etc.
To get an OS written from scratch to boot on specific hardware with some basic functionality should be an 8-12 week job for a competent C developer.
The best thing I did in my teens was to grab x86 docs and attempt to write an OS. I thoroughly recommend it to everybody as a good project to learn system and development (along with writing a compiler and writing a simple database).
With all of the (largely web/app-centric) development work that I have done, there are numerous concepts in OS development that baffle me. I don't claim to know what it takes to build an OS, but I know that it is beyond me.
Also, I wasn't questioning his skill as a programmer, I was questioning the undertaking that he is discussing. Any solo developer would struggle to create an OS (especially when they have a full time job to contend with), let alone someone that is going to have to learn a lot along the way. So my suggestion was to go ahead, but treat it as a learning experience rather than a deliverable product.
But what is the purpose of mentioning this?
If you are recommending that he go ahead with the endeavour, then what does it matter if he ends up with a deliverable product or not?
Also, these kinds of projects can have a timeframe of years, regardless of programming skill. If he finishes it, it will be a result of him sticking to it in the long run, not a result of how much C knowledge he had on the first day of the project.
It _will_ be a tremendous learning experience for him to try his hand at writing an OS. But there's always the dark side, the discouragement, difficulty, and flames from the internet along the lines of "Lol, you'll fail." (And I don't mean to belittle the parent post – I'm referring to all the other responses this guy is going to get.)
So take a look at Pure64: it's a project from the University of Waterloo – unofficially – but for someone just getting their feet wet, it's a great way to get down to writing Bare Metal Code but you can skip a whole lot of pain!
I don't think many people that decide to build an OS, regardless of C/asm experience, have the goal of making the next Linux. Most (myself included) get into OS development because it's interesting and a great learning experience.
he didn't say anywhere that he expects to build the next Linux, it is all about learning. in that, he has already succeeded, as you can see from his progress all that he has learned and all the little tips he has picked up from trying
i'm just shocked that on a site about hackers that somebody would call out the efforts of a hacker to learn something new as 'going to fail'.
It all grew from there.
That's how you learn. If I had the skills, what would be the point of the exercise?
Second, a lot of grunt-code can be found in open source projects, so most of the tedious/time-consuming programming can be eliminated if he chooses to follow this option.
Drop preemptive multitasking, cache-optimization, modes, virtual memory, and networking and there's not too much left.
And finally, it doesn't take that long to understand the GDT. Mine became corrupted once so I took a day to learn how it worked -- fixed that crap in a hex editor.
So no, he's probably not going to invent the next highly polished OS that handles every edge case and has been rigorously tested against bugs, but then again I don't think it's unreasonable to see a simple little functional OS.
I agree with you, though. As with any hobby project, picking the bits that are interesting or fun makes it easy to keep going.
http://en.wikipedia.org/wiki/Global_Descriptor_Table (maybe you're thinking of the GPT?)
A recipe for coming up with good ideas is looking at something that has been around for a while and asking "What would it be like if it had been invented now?".
And if all else fails, this will be a wonderful learning experience.
I know this is a cliche, but before writing comments like this ask yourself if a) you are unnecessarily discouraging somebody for no reason and b) you would have said the same thing about successful projects when they were first started.
Finally, the one piece of advice I'd give on this project is that you shouldn't focus too much on the overly ambitious goals and don't worry if you can't accommodate for all of them just yet. The likelihood of achieving the next milestone is somewhat proportional to the number of milestones you have achieved. Get to v0.01 and take it from there.
a.) No, I don't think so. Considering the number of half-started kernels on OSdev.org, the creators of which incredibly often were convinced that they were going to create the newest awesomest kernel everyone is going to use, trying to get the OP to take a good look at his skillset seems like a proper thing to do.
b.) I would have, if the creators seemed completely ignorant about the basics of what they were trying to do.
This is not to say that I would tell the OP to give up on OS development - I personally love it. But I think a serious reality check is in order, and because we get people similar to the OP on the OSdev forums an awful lot, sometimes this frustrates me.
> and asking "What would it be like if it had been invented now?".
creat(2) would have the second 'e'.
I would have mentioned it, but I was speaking as if Unix were invented today. Plan 9 is an almost textbook second-system effect; it seems like you have to have invented a Unix first before even starting on something like Plan 9.
Doomed = not commercializable? That doesn't mean it's without value to the person behind it if they learn and derive satisfaction from the experience.
With the obvious caveats that you write as much of your code to be cross-platform as possible, and be careful with endianness.
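As a small illustration of the endianness caveat, here is a Python sketch (the helper name `read_u32_le` is hypothetical) showing how the same 32-bit value is laid out in little-endian and big-endian byte order, and why it pays to state the byte order explicitly when reading on-disk or on-wire structures instead of relying on the host CPU.

```python
import struct

VALUE = 0x12345678

# The same integer, serialized with an explicit byte order.
LITTLE = struct.pack("<I", VALUE)  # least significant byte first (x86 order)
BIG = struct.pack(">I", VALUE)     # most significant byte first (network order)


def read_u32_le(buf: bytes, offset: int = 0) -> int:
    """Read an unsigned 32-bit little-endian integer, whatever the host order."""
    return struct.unpack_from("<I", buf, offset)[0]


if __name__ == "__main__":
    print(LITTLE.hex(), BIG.hex())
    # Reading little-endian bytes as big-endian silently yields a wrong value.
    print(hex(struct.unpack(">I", LITTLE)[0]))
```

The equivalent habit in C is to assemble values from individual bytes with shifts, rather than casting a buffer pointer to an integer pointer.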
What does Microsoft use?
Nothing is more flexible than command line tools, as they let you do scripting (programming your own environment).
It does not matter if you want to use "already made by someone else windows, text, menus and buttons", but it is essential when you make (or manage what other people make) everything.
However, my impression is most of them edit code in MSVC, build with the DDK tools, and debug with windbg/ntsd/kd like most other driver developers.
A really, really good book for this that I've read is "Developing Your Own 32-Bit Operating System" by Richard Burgess. It starts you from the beginning, and walks you through all the steps in writing a really basic OS.
It's old and out of print, but it's definitely the best one I've seen.
Edit: I just found the website, they are offering the book free here:
Not to disappoint you, but you should try doing some more low level programming or dabbling with some existing OS code to get an idea of what this kind of program looks like. Maybe have a look at Minix as a reference for a simple OS?
Have you thought about targeting ARM? Its architecture may be way less tricky than most Intel CPUs.
Well, good luck with that. Worst case scenario, you'll end up reading lots of interesting resources.
Can anyone comment on whether this is really true?
Also, there are many other details that make targeting ARM much easier, for instance the bootloading process on ARM is more straightforward with no BIOS or EFI involved.
Too ambitious. Doing that requires millions of dollars and tens of thousands of man hours to make. How do I know? I do electronics and low level programming work and I am really good at it. Just understanding the bugs that manufacturers put in hardware and then solve in software (because it is way cheaper) takes a ton of work.
As I suppose he is not super rich, he will have to convince people to join his project, a la Linus.
Good luck with that!! I really wish a clean, non-backwards-compatible OS were real (I would add native OpenCL, OpenVG and OpenGL to the list), but my arachnid sense tells me a person that does not use Unix will have a hard time getting traction with geeks.
A quick read of the "About" page is probably in order:
What to say?
"Someone holding a cat by the tail learns something he can learn in no other way" --Mark Twain.
Here's the tip of the tail:
In case you're lost: http://news.ycombinator.com/item?id=4815463
You could have just posted the book recommendations with a handwave. Dismissing other people's ideas because "they don't know better" (or you know better) doesn't add anything and is harmful to discussion.
There's no comparison whatsoever here. My high-speed cargo rail proposal/idea is actually DESIGNED for criticism out of the self realization that I am no expert in the field. I gathered as much data as I could. Did a bunch of math. Studied some of the issues involved and devoted a non-trivial amount of time to understanding the underlying issues.
Had you engaged me privately you would have also realized that I am very aware of the near-impossibility of the project as I proposed it due to a myriad of issues, not the least of which are political and environmental. Of course there's the simple practical fact that it is probably nearly impossible to trench new territory to build a new railroad system in the US today.
The more important point of raising the issue was to highlight just how badly ocean-based container shipping methods are polluting our planet and creating a situation that has escalated into the proverbial elephant in the room.
So, yes, I've done a bit more work than the OP has done in truly understanding --in his case-- what operating systems are about, how to write them, why things are done in certain ways, the history behind some of the approaches, what works, what definitely does not work, and more.
And, yes, I have written several real-time operating systems for embedded systems, some of them mission critical. And, no, in retrospect it would have been a far better idea to license existing technology but as a programmer sometimes you don't have the option to make those decisions if your employer is dead set on a given approach.
No, I have never written a workstation-class OS. I know better than that. Today, it would be lunacy to even suggest it, particularly for a solo programmer, even with a ton of experience.
Anyhow, you succeeded at getting a rise out of me. Congratulations. I hope you are happy. It still doesn't change the fact that attacking the messenger does not invalidate anything I have said or prove whatever your fucking point might be.
I thought I'd offer a more optimistic counterpoint. I think that the 10,000 hours figure is way, way overestimating the amount of time needed in order to create something usable enough to make you satisfied, and which could teach you enough to understand any real operating system at the level of source code.
Although, yeah, if you've got commercial aspirations like OP, then I think you're in for it.
I think you nailed it right there. If the OP had said something akin to "I want to write a small OS with a command line interface to learn about the topic" it would have been an entirely different question. I would encourage anyone to do that. It would be an excellent learning experience and the foundation for a lot more interesting work.
With regards to your comment about writing a "fully preemptible, multiprocessor unixy kernel and userspace on x86" in school. Sure. Of course. But, keep in mind that you are actually being TAUGHT how to do this and guided throughout the process. You also mentioned that "a bootloader, a build system, a syscall spec, and some moderate protection from the details of some hardware beyond the CPU" are provided. The OP is talking about writing everything!
For example, he seems to talk about writing PCIe, SATA and USB interfaces. That alone could take a newbie a whole year to figure out. Particularly if coming from being a web developer.
Insane? Yes. Impossible? Of course not. Probable? Nope.
About the course: the extent to which we were guided was minimal by design. It wasn't a matter of "here's a skeleton, here are steps A, B, and C to get it working", but rather "here's an API, implement it; here are some general design ideas". I think that this is comparable to what you'd get if you sat down with a book on your own, and so I hope that it might give people a decent idea of the level of difficulty of such a project.
I emphatically agree with your point about the difficulty of a newbie + PCI situation, although a year still seems steep.
Back in the old days before the pc took off and people started expecting abstractions for everything conceivable, this was a normal part of a project. At university, I was tasked with building an rtos platform for the m68k. Took about a month from zero knowledge to working multitasking os with DMA, memory management and protection and a virtual machine which ran plc-style ladder logic.
The only problem is if you start with x86, you're going to have to fight the layers of fluff that have built up since the 8086 (ldt/gdt/long mode/segments/pci bus/shitty instruction set etc).
I'd go for ARM.
Did you read the article? There's nothing there that hints at a simple operating system at all.
So, yeah: Bullshit. The operating system he is talking about is no hobby project.
Also, define "simple operating system". What is it? What can it do? What can't it do?
And I never read any one of those books you're listing there other than K&R and the 486 reference manuals. Sure enough I had a fair grasp of the x86 processor architecture before starting this and I'd done a lot of low level 8 bit work. But on the whole I spent more time waiting for it to reboot than I did writing code or reading books and I still managed to do all this in about two years.
This is doable. It's hard, but it is doable, and it is a lot easier now than when I did it. For one you have VMs now which make it a thousand times easier to debug a kernel. No more need to use a dos-extender to bootstrap your fledgling kernel code and so on.
This guy is way out of his depth, that's for sure. But what is also for sure is that he's going to learn quickly and seems on the right road for that (acknowledging what he doesn't know yet).
Don't tell other people what they can't do. Just wait and see, they just might surprise you. You'd have been talking to Linus like that just the same. And you would have been right about him being out of his depth, and you would have been wrong about him not being able to achieve his goal in the longer term.
Maybe this guy will get discouraged, maybe he won't. But no need to kill his enthusiasm with a negative attitude. If you're so smart, why not give him a hand, point him in the right direction on the concrete questions he's asking rather than to literally throw the book (or in this case a whole library) at him and tell him he's clueless.
He probably already knows that anyway, but at least he's willing to learn.
He is not that far off from your suggestions. Why do you think 8bit and realtime is better than AMD64? The architecture is a lot more complicated. On the other hand it is probably much better documented as well.
The idea here is to really get down to the basics and understand them with a series of incremental projects.
Of course, there are no universally true rules about this stuff. This happens to be my opinion based on quite a few years of developing and shipping products that entail both electronics and software.
As an example, I am teaching my own son how to program with C and Java almost simultaneously. Why? Well, he is not learning on his own, his Dad happens to know this stuff pretty well and we are spending a lot of time on all of it. So, navigating two languages at the same time is working out OK. I've also had him sit with me while I work in Objective-C and ask questions as I go along.
In about three months we are going to build a real physical alarm clock using a small microprocessor and LED displays. I am going to bootstrap Forth on that processor. The job will also require writing a simple screen text editor in order to make the clock its own development system.
So, by the middle of next year he will have been exposed to raw C, Java and a threaded interpreted language like Forth. I want to expose him to Lisp as well but don't yet know when it will make sense to do that. Maybe in a year or so. With three programming paradigms on the table it will be far more important to explore algorithms and data structures/data representation and understand what they look like with each technology.
Wow, what an idiotic thing to say. C is one of the worst languages for learning the actual fundamentals of programming, which are algorithms and data structures.
Let's take it down even further: Every language you care to suggest ultimately ends up in machine language. You can implement ANY algorithm or data structure management you care to mention in assembler. So, assembler isn't any less capable in that regard than any language anyone might care to propose.
Now, of course there's the practical matter of the very real fact that doing object oriented programming --as an example-- in assembler would be extremely painful, so yeah, this would not be the first choice.
Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages. That I have never said anywhere. In fact, in a recent post I believe I suggested a progression involving assembler, Forth, C, Lisp, C++, Java (or other OO options). Even less popular languages like APL have huge lessons to teach.
As the level of abstraction increases one can focus on more complex problems, algorithms and data structures. That is true.
One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them. They have almost zero clue as to what happens behind the scenes. That's why I tend to like the idea of starting out with something like C. It is very raw and it can get as complex as you care to make it.
One can use C to write everything from device drivers, operating systems, mission critical embedded systems, database managers, boot loaders, image processors, file managers, genetic solvers, complex state machines and more. There's virtually nothing that cannot be done with C.
Is it ideal? No such language exists. However, I'll go out on a limb and say that if I have two programmers in front of me and one only learned, say, Objective-C and nothing more while the other started out with C and then moved to Objective-C, the second programmer will be far better and write better code than the first.
All of that said, there is no magic bullet here. Start with whatever you want. No two paths are the same. Just different opinions.
You're arguing against a strawman here.
> Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages.
And I never used that strawman.
> As the level of abstraction increases one can focus on more complex problems, algorithms and data structures.
And this is my point. You can focus on what you're learning without having to waste time on anything else.
Why don't you advocate a return to punch cards?
> One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them.
And most programmers don't know enough physics to understand how a transistor works, either. You can learn stuff like that as and when you need it. The actual core needs to come first.
> There's virtually nothing that cannot be done with C.
Ditto machine language, as you just said. So why didn't you say everyone needs to start with machine language?
You can learn just as much with assembler. It would be a huge pain in the ass. And, just in case there's any doubt, I am not proposing that anyone use assembler to learn complex algorithms, patterns or data structures.
Your original comment "what an idiotic thing to say" is just false. You can learn ALL fundamentals of programming with C. And, yes, you can learn ALL fundamentals of data structures with C.
Classes and OO are not "fundamentals". That's the next level. And there's a whole movement proposing that there are huge issues with OO to boot.
I have a question. You were quick to call me an idiot for suggesting that newbies need to start with C. OK. I have a thick skin. Thanks.
Now, let's move on. I noticed that you did not offer a solution. What would you suggest someone should start with? Why? How is it better than starting with C?
Now, keep in mind that we are talking about STARTING here. We are not talking about --and I have never suggested that-- C is the ONLY language someone should learn. Quite the contrary.
I NEVER DID THAT. I merely said an idea was idiotic.
I will not proceed until you acknowledge that. It's a question of honesty.
1. of, pertaining to, or characteristic of an idiot.
2. senselessly foolish or stupid: an idiotic remark.
What would you suggest someone should start with?
How is it better than starting with C?
What will they learn that they cannot learn with C?
How would learning C as their first language hinder them?
Why is C an idiotic first choice?
Newton held idiotic ideas. Was Newton an idiot? No. Did I just call Newton an idiot? No.
> What would you suggest someone should start with?
It depends on the person and why they want to program.
Because I don't think C is the best choice for all tasks. In fact, I think C is a poor choice for most of the reasons people start programming.
> How is it better than starting with C?
Because C forces the programmer to prioritize machine efficiency above everything else. Algorithms get contorted to account for the fact the programmer must explicitly allocate and release all resources. Data structures get hammered down into whatever form will fit C's simplistic (and not very machine efficient) memory model.
In short, everything is simplified and contorted to fit the C worldview. The programmer is forced to act as their own compiler, turning whatever program they want to write into something the C compiler will accept.
> What will they learn that they cannot learn with C?
A clearer understanding of things like recursive data structures, which are complicated with excess allocation, deallocation, and error-checking noise code in C.
Compare a parser written in Haskell to one written in C: The string-handling code is reduced to a minimum, whereas in C it must be performed with obscene verbosity.
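To make the recursive data structure point concrete, here is a minimal sketch of a binary search tree in C. The names (struct node, tree_insert, tree_size, tree_free) are made up for illustration, and this is only one way to write it. Note how much of the code is allocation, failure checking and manual release rather than the tree itself.

```c
#include <stdlib.h>

/* A binary search tree of ints: the node type refers to itself. */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

/* Insert value into the tree rooted at *root.
   Returns 0 on success, -1 if malloc fails. Duplicates are ignored. */
int tree_insert(struct node **root, int value)
{
    /* Walk down until we find the empty slot where the value belongs. */
    while (*root != NULL) {
        if (value == (*root)->value)
            return 0;
        root = (value < (*root)->value) ? &(*root)->left : &(*root)->right;
    }

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;              /* the caller has to handle this too */
    n->value = value;
    n->left = NULL;
    n->right = NULL;
    *root = n;
    return 0;
}

/* Count the nodes; the recursion mirrors the shape of the data. */
size_t tree_size(const struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_size(root->left) + tree_size(root->right);
}

/* Every node must be released by hand, children first. */
void tree_free(struct node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```

Only tree_size is about the structure as such; the other two functions exist mostly to manage memory.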
> How would learning C as their first language hinder them?
> Why is C an idiotic first choice?
It is purely wasteful to have new programmers worry about arbitrary complexities in addition to essential complexities. It is wasteful to have new programmers writing the verbose nonsense C imposes on them every time they want to do anything with a block of text. That time should be spent learning more about the theory behind programming, the stuff that won't change in a few years because it is built on sound logic, not accidents of the current generation of hardware design.
Well. We couldn't disagree more.
I love APL because it absolutely removes you from nearly everything low-level and allows you to focus on the problem at hand with an incredible ability to express ideas. I did about ten years of serious work with APL. I would not suggest that a new programmer start with APL. You really need to know the low level stuff. Particularly if we are talking about writing an operating system and drivers.
Nobody is suggesting that a programmer must never stray outside of C. That would be, to echo your sentiment, idiotic. A good foundation in C makes all else non-magical, which is important.
In particular, I am thinking about pointers and memory management, but there are other things.
This is also why I think we have so much bloated code these days. Everything has to be an object with a pile of methods and properties, whether you need them or not. Meanwhile nobody seems to be able to figure out that you might be able to solve the problem with a simple lookup table and clean, fast C code. There was a blog post somewhere about exactly that example recently but I can't remember where I saw it.
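As a hedged illustration of the lookup-table idea (this is not the example from that blog post, just a stand-in I picked), here is a sketch that counts the set bits in a 32-bit word using a 16-entry table. The names BITS_IN_NIBBLE and popcount32 are hypothetical.

```c
#include <stdint.h>

/* Number of set bits in each 4-bit value 0..15, precomputed. */
static const uint8_t BITS_IN_NIBBLE[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Count set bits in a 32-bit word with at most eight table lookups. */
unsigned popcount32(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        count += BITS_IN_NIBBLE[x & 0xFu];  /* look up the low nibble */
        x >>= 4;                            /* move on to the next one */
    }
    return count;
}
```

No objects, no methods, no allocation: a constant array and a loop.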
I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C. It's been a couple of years but I think that the performance was hundreds of times faster than anything the optimized Objective-C code could achieve. The heavy bloated NS data types just don't cut it when it comes to raw performance.
Someone who has only been exposed to OO languages simply has no clue as to what is happening when they are filling out the objects they are creating with all of those methods and properties or instantiating a pile of them.
'Bloat' is a snarl term. It's meaningless. It literally means nothing, except to express negative emotion.
> I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C.
Did you try any other algorithms? Any other data structures? Simply picking a new language is laziness.
When dealing with an array is 400 times slower in a "modern OO language" than in raw C, well, the code is fucking bloated.
When you can use a simple data structure and some code to solve a problem and, instead, write an object with a pile of properties and methods because, well, that's all you know, that's bloated code.
Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make. That's the way it goes.
With regards to my GA example. No, I had to implement a GA. That's what was required to even attempt to solve the problem at hand. Later on we used it to train a NN, which made the ultimate solution faster. But, the GA was required. There was no way around it and Objective-C was such an absolute pig at it that it made it unusable.
> Simply picking a new language is laziness
See, there's the difference. I started programming at a very low level and have experienced programming languages and approaches above that, from C, to C++, Forth, Lisp, APL, Python, Java, etc.
I have even done extensive hardware design with reconfigurable hardware like PLD, PLA's and FPGA's using Verilog/VHDL. I have designed my own DDR memory controllers as well as raw-mode driver controllers and written all of the driver software for the required embedded system. My last design was a combination embedded DSP and FPGA that processed high resolution image data in real time at a rate of approximately SIX BILLION bytes per second.
So, yes, I am an idiot and make really fucking dumb suggestions.
Because of that I would like to think that, if the choice exists --and very often it does not-- I do my best to pick the best tool for the job.
More often than not, when it's pedal-to-the-metal time C is the most sensible choice. It used to be that you had to get down to assembler to really optimize things, but these days you can get away with a lot if C is used smartly.
Social science numbers do not impress me. Besides, what is a "modern OO language"? Haskell? How can you give any numbers without even specifying that detail?
> Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make.
Your idea that "OO = fat and slow" is blown away by actual benchmarks.
(And, yes, unless and until you define what "OO" is to you, I'll pick Haskell as a perfectly reasonable OO language. Given that I've seen C called OO by people with better writing skills than you, this is hardly a strange choice in this context.)
> So, yes, I am an idiot
Again, I did not call you an idiot. The only one calling you an idiot here is you.
> More often than not, when it's pedal-to-the-metal time C is the most sensible choice.
I agree fully with this. However, I disagree that "pedal-to-the-metal time" is all of the time, or even most of the time. Especially when you're trying to teach programming.
Do you teach new drivers in an F1 racecar? Why or why not?
No. Not really. C doesn't show you any of the essential parts of cache, opcode reordering, how multicore interacts with your code, or much of anything else that actually makes hardware fast.
C makes you act as if your computer was a VAX.
Below let X represent roughly the sentiment "C is a good learning language, since it teaches you what happens at a low level"
darleth: C sucks as an intro language
robomartin: No it doesn't because X
darleth: X was true 30 years ago but isn't anymore
robomartin: well C is still better because there is no language that does X
A better refutation is that I cannot predict the order of complexity for an algorithm written in Haskell that I could trivially do in C. Haskell presents immutable semantics, but underneath it all, the compiler will do fancy tricks to reuse storage in a way that is not trivially predictable for a beginner.
Similarly with Java, you end up having to explain pointers and memory and all that nastiness the first time the GC freezes for 1-2 seconds when they are testing the scaling of an algorithm they implemented in it.
Yes there is a "learn that when you need it" for a lot of stuff, but for someone actually learning fundamentals like data-structures and algorithms, we are talking about a professional or at least a serious student of CS. Someone in that boat will need to be exposed to these low-level concepts early and often because it is a major stumbling block for a lot of people.
If you just want to write a webapp, use PHP. If you want to learn these fundamentals you will also need to be exposed to the mess underneath, and it needs to happen sooner than most people think.
I appreciate your sentiment. However, I think you made the mistake of assuming that there is an argument here. :)
I find that most software engineers who, if I may use the phrase, "know their shit", understand the value of coming up from low level code very well. I have long given up on the idea of making everyone understand this. Some get it, some don't. Some are receptive to reason, others are not.
I am working on what I think is an interesting project. Next summer I hope to launch a local effort to start a tech summer camp for teenagers. Of course, we will, among other things, teach programming.
They are going to start with C in the context of robotics. I have been teaching my kid using the excellent RobotC from CMU. This package hides some of the robotics sausage-making but it is still low-level enough to be very useful. After that we might move them to real C with a small embedded project on something like a Microchip PIC or an 8051 derivative.
In fact, I am actually thinking really hard about the idea of teaching them microcode. The raw concept would be to actually design a very simple 4 bit microprocessor with an equally simple ALU and sequencer. The kids could then set the bit patterns in the instruction sequencer to create a set of simple machine language instructions. This is very do-able if you keep it super-simple. It is also really satisfying to see something like that actually execute code and work. From that to understanding low-level constructs in C is a very easy step.
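A rough software sketch of that idea, under loud assumptions: the machine below is far simpler than a real classroom design, every instruction takes a single micro-step, and the opcode names, control bits and run_4bit function are all invented for illustration. Each opcode selects a control word, and the bits of that word decide what the ALU does and whether the accumulator latches the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Control lines driven by the microcode (hypothetical layout). */
enum {
    CTL_ALU_PASS = 1 << 0,  /* ALU output = operand            */
    CTL_ALU_ADD  = 1 << 1,  /* ALU output = acc + operand      */
    CTL_ALU_AND  = 1 << 2,  /* ALU output = acc AND operand    */
    CTL_LOAD_ACC = 1 << 3,  /* latch ALU output into acc       */
    CTL_HALT     = 1 << 4   /* stop the sequencer              */
};

/* The machine-language instructions the microcode defines. */
enum { OP_LDI, OP_ADD, OP_AND, OP_HLT, OP_COUNT };

/* The "bit patterns in the sequencer": one control word per opcode. */
static const uint8_t MICROCODE[OP_COUNT] = {
    [OP_LDI] = CTL_ALU_PASS | CTL_LOAD_ACC,
    [OP_ADD] = CTL_ALU_ADD  | CTL_LOAD_ACC,
    [OP_AND] = CTL_ALU_AND  | CTL_LOAD_ACC,
    [OP_HLT] = CTL_HALT
};

/* Encode an instruction: high nibble is the opcode, low nibble the operand. */
#define INSN(op, operand) ((uint8_t)(((op) << 4) | ((operand) & 0xF)))

/* Run a program and return the final value of the 4-bit accumulator. */
uint8_t run_4bit(const uint8_t *program, size_t length)
{
    uint8_t acc = 0;
    for (size_t pc = 0; pc < length; pc++) {
        uint8_t opcode  = program[pc] >> 4;
        uint8_t operand = program[pc] & 0xF;
        if (opcode >= OP_COUNT)
            break;                          /* undefined opcode: stop */
        uint8_t ctl = MICROCODE[opcode];
        if (ctl & CTL_HALT)
            break;
        uint8_t alu = 0;
        if (ctl & CTL_ALU_PASS) alu = operand;
        if (ctl & CTL_ALU_ADD)  alu = (uint8_t)(acc + operand);
        if (ctl & CTL_ALU_AND)  alu = acc & operand;
        if (ctl & CTL_LOAD_ACC) acc = alu & 0xF;  /* keep only 4 bits */
    }
    return acc;
}
```

Changing an entry in MICROCODE changes what an instruction does, which is the part the kids would get to play with.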
After C we would move to Java using the excellent GreenFoot framework.
So, the idea at this point would be Microcode -> RobotC -> full C -> Java.
Anyone interested in this please contact me privately.
Except this is also true for C at this point. Maybe the order won't change, but maybe it will at that, if the compiler finds a way to parallelize the right loops.
C compilers have to translate C code, which implicitly assumes a computer with a very simplistic memory model (no registers, no cache), into performant machine code. This means C compilers have to deal with the register scheduling and the cache all by themselves, leading to code beginners have a hard time predicting, let alone understanding.
Add to that little tricks like using MMX registers for string handling and complex loop manipulation and you have straightforward C being transformed into, at best, with a good compiler, machine code that you need to be fairly well-versed in a specific platform to understand.
This is why I get so annoyed when people say C is closer to the machine. No. The last machine C was especially close to was the VAX. C has gotten a lot further away from the machine in the last few decades.
The implication here is that you should teach C as an end in itself, not as an entry point into machine language. If you want to teach machine language, do it in its own course that has a strong focus on the underlying hardware. And don't claim C is 'just like' assembly.
2) Despite #1 C is still closer to the machine than Haskell, and I'm not sure how you could maintain otherwise
3) Nearly all of the C optimizations will, at best, make a speedup by a constant factor. Things that add (or remove) an O(n) factor in Haskell can and do happen.
You're not reading my other posts, then. I explicitly said programmers can learn those things as and when they need to.
In fact, your suggestion is exactly on point: The OP should pick up Tanenbaum's book and take a year to implement everything in the book. Why a year? He is a web developer and, I presume, working. It will take time just to learn what he does not know before he can even do the work. So, let's say a year.
I would suspect that after doing that his view of what he proposed might just be radically different.
For example, wait until he figures out that he has to write drivers for chip-to-chip interfaces such as SPI and I2C. Or that implementing a full USB stack will also require supporting the very legacy interfaces he wants to avoid. Or that writing low-level code to configure and boot devices such as DRAM memory controllers and graphics chips might just be a little tougher than he thought.
There's a reason why Linux has had tens of thousands of contributors and millions of lines of code:
...and that's just the kernel.
Writing some kind of a minimalist hobby OS on top of the huge body of work that is represented by the drivers and code that serve to wake up the machine is very different from having to start from scratch.
My original comment has nothing whatsoever to do with anything other than the originally linked blog post which describes almost literally starting from scratch, ignoring decades of wisdom and re-writing everything. That is simply not reasonable for someone whose experience is limited to doing web coding and dabbling with C for hobby projects. In that context, just writing the PCI driver code is an almost insurmountable task.
If I were advising this fellow I'd suggest that he study and try to implement the simplest of OS's on a small embedded development board. This cuts through all the crud. Then, if he survives that, I might suggest that he moves on to Tanenbaum's book and take the time to implement all of that. Again, in the context of a working web professional, that's easily a year or more of work.
After that --with far more knowledge at hand-- I might suggest that he start to now ask the right questions and create a list of modifications for the product that came out of the book.
Far, very, very far from the above is the idea of starting with a completely blank slate and rolling a new OS that takes advantage of nearly nothing from prior generations of OS's. And to do that all by himself.
What's the harm here? He's going to dive in and learn something, and he's probably going to get further along than you expect, because this stuff just isn't as complicated as people like to think it is.
You have to keep in mind that the OP is talking about such things as writing his own PCIe and USB stacks as well as everything else. He is leaving all history and prior work on the floor and re-inventing the wheel.
That's very far from writing a small RTOS for an 8-bit processor. In fact, my suggestion is that he should do just what you did in order to understand the subject a lot better. There's a lot of good Computer Science that can be learned with a small 8-bit processor.
Did I know all of that stuff? Of course not. Did it compare in complexity to what this article is proposing? Nope. The OS described in the article is far, far more complex than what I just described.
Is he incapable of doing it? Nope. I did not say that. I think I said that anyone who has written a non-trivial RTOS would laugh at the idea of what he described. Why? Because it is a monumental job for one person, particularly if they've done almost zero real development at the embedded level and they also have to work for a living.
I got started designing my own microprocessor boards, bootstrapping them and writing assembly, Forth and C programs when I was about 14 years old. By the time I got to college I knew low-level programming pretty well. As the challenge to start diving into writing real RTOS's presented itself I could devote every waking hour to the task. Someone starting as a web developer --who presumably still needs to keep working-- and wanting to develop such an extensive OS is just, well, let's just say it's hard.
The guy built three planes. One didn't take off, one crashed, one brought him across Africa in short hops with landings at rebel-occupied airports where he didn't always manage to announce his arrival.
Short thread at http://www.homebuiltairplanes.com/forums/hangar-flying/12196....
(Apologies for deviating from the subject, but I figure hackers might find this interesting)
If you really have a good grasp of the concepts sometimes the hard part is the drudgery of possibly having to write all the device drivers for the various devices and peripherals that the RTOS has to service.
In the case of the OP, he seems to be talking about rewriting everything from the most basic device drivers on up to bootloaders and even the GUI. That's a ton of work and it requires knowledge across a bunch of areas he is not yet well-versed in.
Also, when it comes to the idea of writing an RTOS, there's a huge difference between an RTOS for, say, a non-mission-critical device of some sort and something that could kill somebody (plane, powerful machinery, etc.). That is hard not because the concepts require superior intellect but rather because you really have to understand the code and potential issues behind how you wrote it very well and test like a maniac.
I have written RTOS's for embedded control of devices that could take someone's arm off in a fraction of a second. Hard? Extremely, when you consider what the stakes are and particularly so if it is your own business, your own money and your own reputation on the line. There's a lot more to programming than bits and bytes.
And thank you again for inspiration (even the cynicism is inspirational ;)
Sometimes the HN crowd surprises me. We pride ourselves in being hackers, most often idealistic (bitcoins and patent law change anyone?) but when a singular person shows idealistic ambition, we immediately engage in poppy cutting.
But. His post is just that - his expression of enthusiasm. There's not much of anything else here yet. We have nothing to discuss but his enthusiasm.
Having seen these kinds of ideas consistently end up as "well... it got hard" a week later, it triggers my grumpy "sure whatever let me know how it goes in a month" reflex.
This is the exact definition of hacking, if you ask me.
Generally this is a bad idea because without any external motivation, you lose interest and stop working. With external motivation it is worse, because you can burn out and become a catatonic shell of a person, staring absently into space for the rest of your life.
Just some FYIs:
> On the side note - It’s 21st century, but our PCs are still booting up as old-fart Intel 8086.
You should read about EFI (http://www.intel.com/content/www/us/en/architecture-and-tech...)
You should also read all of the lecture materials from good universities OS classes. In those classes, you basically do this. Some classes are more guided than others. Some places to start:
- CMU: http://www.cs.cmu.edu/~410/
- UMD: https://www.cs.umd.edu/~shankar/412-F12/
UMD uses a toy operating system called GeekOS that the students extend. You might find browsing its source code useful (http://code.google.com/p/geekos/)
The actively developed ones we chose between when teaching the OS class at UW this fall were:
JOS (mit) https://github.com/guanqun/mit-jos
Pintos (stanford) http://www.stanford.edu/class/cs140/projects/
OS161 (harvard) http://www.eecs.harvard.edu/~syrah/os161/
Good luck to the author; either way, it will be a good learning experience for him.
"Those who don't understand Unix are condemned to reinvent it, poorly."
The linked article does NOT talk about a one-semester school project or a quick-and-simple learning OS.
No, the article talks about a web developer with no real experience writing low-level code not only wanting to bootstrap every single device driver but also ignoring years of accumulated knowledge and code libraries to write an OS that boots directly into graphical mode, does not take advantage of POSIX and more.
There's nothing wrong with the "How hard can it be?" approach to learning. I've done this many times. And almost every single time I came away with "I sure learned a lot, but what the fuck was I thinking?". The last time I pulled one of those was about fifteen years ago and the "three month project" took nearly two years.
What he is talking about is more complex than writing the Linux kernel from scratch because he wants to re-invent everything. Here are some stats on the Linux kernel:
Even if his project was 10% of this it would still be a grotesque miscalculation for a single developer, otherwise employed and without the experience to back up some of what he is proposing.
If, on the other hand, the post had suggested something like this it would have been far more reasonable an idea:
"Hey, I just spent a year implementing everything in the Tanenbaum book. Now I would like to start from that base and enhance the OS to make it do this...".
Let's compare notes in a year and see how far he got.
I'm sure the idea of building a modern OS that is straightforward and written in a simple, popular language like C (and possibly Python later for higher-level stuff) will appeal to a wide range of people who will all want to help. I'd love to see this project happen, and if the day comes where Gusts is calling for help, I'll be right there in line to help him make this.
Oh. OK, then.
If I had crossed the desire threshold to start that project (#1 project in my mind since I left college) I'd leave the C ecosystem altogether, design a typed, functional, binary-friendly, modular subset of C (and probably be forever alone). Something in the groove of http://en.wikipedia.org/wiki/BitC, even though its talented author concluded it wasn't a successful path.
I would prefer that he fork Linux and change the things he doesn't like, rather than start from scratch. However, there is great value in starting from scratch. I wish I had a life :) to join him and figure things out together, it would be a blast. How many times in your life do you get a chance to work on an actual modern OS?
I believe it is totally possible for him to accomplish what he started, if knowledgeable people would join him and work on the project together. Today, with amazing tools, it is a good time to create a new OS that would have modern tooling.
I wrote recently on my blog about the need for a developer distribution of Linux. Strangely this is still missing. http://softwaredevelopmentinchicago.com/2012/10/17/ubuntu-al...
It is great that we are discussing this. That is how things start.
...and when you reach the GUI part, do the same for C++: use the latest version and language features. I've heard that VS2012's latest update got closer to it, but google around before settling on it
...or to keep it simpler: better use GCC compilers (since the Linux kernel is built with it, you should find enough compiler specific docs and related tools too)
1. Targeting a modern architecture is good, but if I were being this ambitious, then even though x86_64 is highly performant (just through raw funding dollars), I would not want such a backwards-compatibility-burdened architecture; I would still rather start at square 1 on some RISC 64-bit virtual, 48-bit physical word system. Go even further, and design such a hardware ecosystem with heterogeneous computing built into the foundations - have arbitrary numbers of ALUs and FPUs and have different pipeline structures allowing for various degrees of SIMD parallelism across some tightly integrated weak cores and more heavily pipelined and bulkier serial cores, and have an intelligent enough instruction set to allow for scheduling (or even better, the hardware itself) to recognize parallel tasks and execute them with varying degrees of parallelism. Take AMD Fusion or Tegra to the next level and instead of having a discrete GPU and CPU on one die mash them together and share all the resources.
2. I'd kick C out. If I'm going with a new architecture, I need to write the compiler from scratch anyway. I might consider LLVM for such a system, just because the intermediary assembly layer is intentionally lossless and allows for backwards language compatibility with everything under the sun right now. But ditch C, take modern language concepts from C++, Python etc, and cut down on the glyphic syntax and try rethinking the distribution of special characters (I think pointer<int> c makes more sense than int (star)c, for example - go even further, and provide 3 levels of verbosity for each concept, like pointer<int32> c, ptr<int32> c, and &:i32 c). I would definitely want to fix standard type sizes at the least, having things like i32 integers instead of the int type being 16 or 32 bit, etc, with some more modern niceties like the D style real float that uses the architecture restricted maximum FPU register size.
3. Screw UEFI, mainly because it is a design by consortium concept - it is inherently cumbersome because it was a committee project between industry giants rather than a revolution in booting. I do like cutting down on legacy interfaces, I'd go even further and try to minimize my system to (in theory) one serial transport and one digital, maybe 4, with unidirectional and bidirectional versions of both, and maybe support for some classic analog busses (like audio, which doesn't make much sense to transport in digital format, although I haven't looked into it much). Everything plug and play, everything attempting to provide power over a channel so you don't need additional power connectivity if you can avoid it. For the BIOS, I would replace it with some metric of scan busses for profiles -> incite some kind of device-wise self test -> provide device information in memory to the payload binary, to allow memory mapping and all the other goodness. Maybe even have the bios itself act as a sub-kernel and provide the mapping itself. Maybe even fork the kernel, and treat it like some kind of paravirtualized device environment where the bios never overrides itself with the payload but instead stays active as a device interface. Saves a lot of code redundancy between the two then. It would of course have an integrated bootloader and the ability to parse storage device trees for some bootable material. Maybe have file system standards where each partition has a table of pointers to loadable binaries somewhere, or maybe stick them in some partition table entry (obviously not a FS expert here).
4. Screw URIs, go straight for a kernelwise VFS that can reference everything. I'd love to see /net/<IP address>/ referencing the top level of some remote server's public resources. You could have a universal network protocol where each connection is treated as a virtual mount, and individual files (and everything is a file, of course) can dictate if they use streamed or packet based data off some network transaction about the base protocol. So instead of having http://google.com, you could use /net/google.com/ which when opened does DNS resolution in the VFS to 22.214.171.124 (well, ipv6, obviously - we are talking about a new OS here, so 2001:4860:8006::62 - and as a side note, I would never try to get rid of IP as the underlying transport protocol - as insane I might be about redesigning hardware and rethinking stuff people much smarter than myself came up with, I know you will never usurp IP as the network transport everyone uses to connect the world ever). And then when you open google.com/search, you open a remote file that interprets the "arguments" of ?q=baconatorextreme on the extension into the returned page file that you access.
I agree with getting rid of Unix directories, they are outdated, crappy, and all their names make no sense. However, /bin is meant to be system required binaries to boot, where sbin is root utility binaries, /usr/bin is general purpose executables that might not be local to the machine and might be a remote mount, and /usr/local/bin is the local machine's installed binaries. Of course these policies are never abided by, and they still have /etc, /usr/games, and a bunch of other folders to make life a hassle.
That's enough rants for a HN comment thread though, I'll stop and spare y'all :P
A friend of mine wrote a little OS (similar to vxWorks) for 64-bit MIPS in the 90s; I helped him debug a couple of problems. This was actual production code to go in a product (our employer was too cheap to license vxWorks for the secondary processor in the product).
It's incredibly easy to get something up and running on an architecture like this. (Admittedly he cheated by virtue of not needing to interface with hw to get loaded -- everything gets much easier when your EEPROM is wired to the boot vector.) Get your kernel data structures set up, handle interrupts, and BOOM! you've got multitasking. (I think it took him about a week.)
But I have no access to such an industry like hardware production. Software is cheap, hardware is not :(
As for URIs - this is where they fit in well. Remember that // stands for the domain root, so instead of /net/google.com it's OK to use //google.com. The scheme, on the other hand, is a hint for which port to use, so http://google.com/ is the same as //google.com:80, and it can also be used for URI driver negotiation. Simply go with fopen("ftp://domain.com/file.txt") and the kernel uses the FTP driver; go with fopen("imap://me:email@example.com/inbox/") and you'll receive your inbox folder on a mail server, etc. Omitting the scheme falls back to file://, and omitting the domain falls back to localhost, so /documents/hello.txt is really file://localhost/documents/hello.txt, and localhost is managed with a VFS where you can create even funnier stuff, like linking other types of URLs into your file system. For example, file://localhost/some_site.html -> http://www.google.com/ or file://localhost/documents/work/ -> ftp://me:firstname.lastname@example.org/home/
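The fallback rules above are simple enough to sketch in a few lines of Python. This is only an illustration under my own assumptions: the function name `resolve` and the two lookup tables are invented here, and inferring the scheme from a bare port (so that //google.com:80 means HTTP) is one reading of "the scheme is a hint for the port", not something the comment spells out.

```python
from urllib.parse import urlsplit

# Well-known default ports for the schemes mentioned in the comment.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "imap": 143}
# Reverse table, used when only a port is given (assumption, see lead-in).
SCHEME_FOR_PORT = {port: scheme for scheme, port in DEFAULT_PORTS.items()}


def resolve(uri):
    """Return (scheme, host, port, path) with the fallbacks applied.

    No scheme -> 'file' (or the scheme implied by an explicit port);
    no host   -> 'localhost'.
    """
    parts = urlsplit(uri)
    host = parts.hostname or "localhost"
    port = parts.port
    scheme = parts.scheme or SCHEME_FOR_PORT.get(port, "file")
    if port is None:
        port = DEFAULT_PORTS.get(scheme)  # None for file://
    return scheme, host, port, parts.path or "/"


if __name__ == "__main__":
    for uri in ("/documents/hello.txt", "http://google.com/",
                "//google.com:80", "ftp://domain.com/file.txt"):
        print(uri, "->", resolve(uri))
```

A kernel would then use the returned scheme to pick the driver (FTP, IMAP, local VFS, and so on); that dispatch is left out here.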
Basically, that is my idea of what I'd love to do + yes, complete hardware overhaul, but that's not for me (at least not for now) as I'm living in banana republic and we don't have 6 layer PCB printing/soldering facilities here :D
If you strip out loadable module support and such, is it possible to boot without the usual POSIX support structure? Without filesystems?
Linux has very minimal requirements of userspace - pretty much all you need is a process that will act as your init process and knows to wait now and again to clean up any zombie processes. The rest is pretty much up to how you want to organize things.
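As a rough illustration of that one duty, here is a minimal Python sketch of a reaping loop. The names `reap_zombies` and `init_loop` are made up for this example; a real init would normally be a small static binary and would block on SIGCHLD rather than poll on a timer.

```python
import os
import time


def reap_zombies():
    """Collect every child that has already exited, without blocking.

    Returns the list of reaped PIDs. On Linux, orphaned processes are
    re-parented to PID 1, so init ends up waiting on them too.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break            # no children at all
        if pid == 0:
            break            # children exist, but none have exited yet
        reaped.append(pid)
    return reaped


def init_loop(poll_seconds=1.0):
    """Simplest possible 'init': reap now and again, forever."""
    while True:
        reap_zombies()
        time.sleep(poll_seconds)
```

Everything else people associate with init (mounting, starting services, shutdown) is policy layered on top of this loop.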
As for filesystems, there needs to be something, I don't think it can work at all without a root file system, but a ram disk will do fine. The FHS directory structure isn't needed at all of course.
I've seen some ridiculously stripped down embedded Linux systems. Most tend to have something like Busybox though, with a fairly conventional look, but some prefer to just use that while developing and rip it out in the deployed version.
As a developer, I have a similar feeling about software, including OSes, so I started a fresh vector editor project (Torapp guilloche online designer, http://www.torapp.info). I know a vector editor is much simpler than an OS, but it is still pretty complicated. While designing the editor, I learned a lot and changed the design multiple times. I am sure that guy will learn too, and even if he cannot complete an OS, he may leave a well-designed code base for other people to start with.
In no particular order:
54- ...well, I'll stop here.
Of course, the equivalent knowledge can be obtained by trial-and-error, which would take longer and might result in costly errors and imperfect design. The greater danger here is that a sole developer, without the feedback and interaction of even a small group of capable and experienced programmers could simply burn a lot of time repeating the mistakes made by those who have already trenched that territory.
If the goal is to write a small RTOS on a small but nicely-featured microcontroller, then the C books and the uC/OS book might be a good shove in the right direction. Things start getting complicated if you need to write such things as a full USB stack, PCIe subsystem, graphics drivers, etc.
I've always wondered if there could be created some way to skip this step in [research] OS prototyping, by creating a shared library (exokernel?) of just drivers, while leaving the "design decisions" of the OS (system calls, memory management, scheduling, filesystems, &c.--you know, the things people get into OS development to play with) to the developer.
People already sort of do this by targeting an emulator like VirtualBox to begin with--by doing so, you only (initially) need one driver for each feature you want to add, and the emulator takes care of portability. But this approach can't be scaled up to a hypervisor (Xen) or KVM, because those expect their guest operating systems to also have relevant drivers for (at least some of) the hardware.
I'm wondering at this point if you could, say, fork Linux to strip it down to "just the drivers" to start such a project (possibly even continuing to merge in driver-related commits from upstream) or if this would be a meaningless proposition--how reliant are various drivers of an OS on OS kernel-level daemons that themselves rely on the particular implementation of OS process management, OS IPC, etc.? Could you code for the Linux driver-base without your project growing strongly isomorphic structures such as init, acpid, etc.?
Because, if you could--if the driver-base could just rely on a clean, standardized, exported C API from the rest of the kernel, then perhaps (and this is the starry-eyed dream of mine) we could move "hardware support development" to a separate project from "kernel development", and projects like HURD and Plan9 could "get off the ground" in terms of driver support.
In my experience one of the most painful aspects of bringing up an OS on a new platform is exactly this issue of drivers as well as file systems. A little google-ing quickly reveals that these are some of the areas where one might have to spend big bucks in the embedded world in order to license such modules as FFS (Flash File System) with wear leveling and other features as well as USB and networking stacks. Rolling your own as a solo developer or even a small team could very well fit into the definition of insanity. I have done a good chunk of a special purpose high-performance FFS. It was an all-absorbing project for months and, realistically, in the end, it did not match all of the capabilities of what could be had commercially.
This is where it is easy to justify moving into a more advanced platform in order to be able to leverage Embedded Linux. Here you get to benefit and leverage the work of tens of thousands of developers devoted to scratching very specific itches.
The downside, of course, is that if what you need isn't implemented in the board support package for the processor you happen to be working with, well, you are screwed. The idea that you can just write it yourself because it's Linux is only applicable if you or your team are well-versed in Linux dev at a low enough level. If that is not the case you are back to square one. If you have to go that route you have to hire an additional developer who knows this stuff inside out. That could mean $100K per year. So now you are, once again, back at square one: hiring a dev might actually be more expensive than licensing a commercial OS with support, drivers, etc.
I was faced with exactly that conundrum a few years ago. We ended-up going with Windows CE (as ugly as that may sound). There are many reasons for that but the most compelling one may have been that we could identify an OEM board with the right I/O, features, form factor, price and full support for all of the drivers and subsystems we needed. In other words, we could focus on developing the actual product rather than having to dig deeper and deeper into low-level issues.
It'd be great if low level drivers could be universal and platform independent to the degree that they could be used as you suggest. Obviously VM-based platforms like Java can offer something like that so long as someone has done the low-level work for you. All that means is that you don't have to deal with the drivers.
To go a little further, part of the problem is that no standard interface exists to talk to chips. In other words, configuring and running a DVI transmitter, a serial port and a Bluetooth I/F are vastly different even when you might be doing some of the same things. Setting up data rates, clocks, etc. can be day and night from chip to chip.
I haven't really given it much thought. My knee-jerk reaction is that it would be very hard to create a unified, discoverable, platform-independent mechanism to program chips. The closest one could possibly approach this idea would be if chip makers were expected to provide drivers written to a common interface. Well, not likely or practical.
Not an easy problem.
Another thought: if not just a package of drivers, then how stripped down (for the purpose of raw efficiency) could you make an operating system intended only to run an emulator (IA32, JVM, BEAM, whatever) for "your" operating system? Presumably you could strip away scheduling, memory security, etc. since the application VM could be handling those if it wanted to. Is there already a major project to do this for Linux?
Actually it has a lot to say, but in this case it just appealed to the fifth amendment.
Seriously though, good luck.
Here's a few notes about his plans:
> Target modern architecture
> Avoid legacy, drop it as fast as you can. You can even skip the Protected mode and jump directly to Long mode
So if you want to "skip" protected mode, you'll have to write a pile of assembly code to get there. x86_64 is a lot more work than 32bit x86.
> Jump to C as soon as possible
You only need a few pieces of assembly code to get an operating system running: the boot code and the interrupt handler code. The boot code and the interrupt handler are just small trampolines that go to C code as soon as possible.
In addition to the boot and interrupt handler code, you occasionally need to use some privileged mode CPU instructions (disable interrupts or change page table, etc). Use inline assembler for that.
Anyone in this thread who suggested using something other than C seemed to be fairly clueless about it. Of the choices you have available, C is the simplest way to go. Everything else is either more work or more difficult.
> Forget old interfaces like PCI, IDE, PS/2, Serial/Parallel ports.
You're most likely going to have to deal with the PCI bus at some point too. Although many devices don't use the physical PCI slots on motherboards, they still hook up to the PCI bus logically. Look at the output of "lspci" on Linux: all of those devices are accessed through PCI. This includes USB, PCIe, SATA, IDE, network interfaces, etc.
Again, using the modern buses is a lot more work than using the old ones and it partially builds upon the old things.
> Why does every tutorial still use such an ancient device as Floppy?
> Avoid the use of GRUB or any other multiboot bootloader – make my own and allow only my own OS on the system
Most hobby operating systems are just half-assed stage 1 bootloaders. Just get over the fact that you'll have to use code written by others and get booted.
Popular emulators (bochs, qemu) can boot multiboot kernels directly so you'll save a lot of time there too.
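For anyone wondering what "multiboot kernel" means in practice: the loader scans the start of the image for a small header. Below is a Python sketch that builds the 12-byte Multiboot (version 1) header, just to show the layout and the checksum rule; in a real kernel these three words would be emitted from the assembly or linker side, and the function name `multiboot_header` is only illustrative.

```python
import struct

MULTIBOOT_MAGIC = 0x1BADB002   # magic value defined by the Multiboot 1 spec
FLAG_PAGE_ALIGN = 1 << 0       # align loaded modules on page boundaries
FLAG_MEMORY_INFO = 1 << 1      # ask the loader for memory information


def multiboot_header(flags=FLAG_PAGE_ALIGN | FLAG_MEMORY_INFO):
    """Return the three little-endian 32-bit words of the header.

    The checksum is chosen so that magic + flags + checksum == 0
    modulo 2**32, which is what the boot loader verifies.
    """
    checksum = -(MULTIBOOT_MAGIC + flags) & 0xFFFFFFFF
    return struct.pack("<III", MULTIBOOT_MAGIC, flags, checksum)


if __name__ == "__main__":
    print(multiboot_header().hex())
```

The header has to sit 32-bit aligned within the first 8192 bytes of the image. Note also that a Multiboot loader hands control over in 32-bit protected mode, so a long-mode kernel still needs its own switch code after that point.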
You need to get booted in an emulator and running under a debugger as quickly as possible. Operating system development is so much easier to do with a debugger at hand. Failures generally cause a boot loop or hang the device so there won't be a lot of diagnostics to help with issues.
So my advice is: set up Qemu + GDB + multiboot, and get your kernel booted in a debugger as early as you can.
I won't go into commenting on his wacky ideas about VFS structure or APIs. It's nice to make great plans up front, but by the time you're booted to your own kernel, a lot of the naïve ideas you started with will be "corrected".
Happy hacking and do not listen to the naysayers.
PS. here's my hobby OS: http://github.com/rikusalminen/danjeros
Also, prepare for about 6 months of hard yet rewarding work, given that you put in about 50 hours a week ;)
And then get rid of sata/pci/sas etc internal connectors and just use the same interconnect hub as the external devices. Again, handshakes to determine device connectivity.
Wouldn't that be so easy? One connection to rule them all! I'm not trying to say it would be easy to get to. We are mired in a world where we look at a concept like this and say "how dumb, you aren't utilizing a bidirectional link with video feeds, or you aren't using the power connectivity to a network router, or stream-based vs packet-based transport layers have speed / bandwidth advantages over one another." But wouldn't it be great to plug a new GPU into a real universal serial bus?
Say you're writing your own OS: ok, sure...