A quick read of the "About" page is probably in order:
What to say?
"Someone holding a cat by the tail learns something he can learn in no other way" --Mark Twain.
Here's the tip of the tail:
In case you're lost: http://news.ycombinator.com/item?id=4815463
You could have just posted the book recommendations with a handwave. Dismissing other people's ideas because "they don't know better" (or you know better) doesn't add anything and is harmful to discussion.
There's no comparison whatsoever here. My high-speed cargo rail proposal/idea is actually DESIGNED for criticism out of the self realization that I am no expert in the field. I gathered as much data as I could. Did a bunch of math. Studied some of the issues involved and devoted a non-trivial amount of time to understanding the underlying issues.
Had you engaged me privately you would have also realized that I am very aware of the near-impossibility of the project as I proposed it due to a myriad of issues, not the least of which are political and environmental. Of course there's the simple practical fact that it is probably nearly impossible to trench new territory to build a new railroad system in the US today.
The more important point of raising the issue was to highlight the issue of just how badly ocean-based container shipping methods are polluting our planet and creating a situation that is has escalated into the proverbial elephant in the room.
So, yes, I've done a bit more work than the OP has done in truly understanding --in his case-- what operating systems are about, how to write the, why things are done in certain ways, the history behind some of the approaches, what works, what definitely does not work, and more.
And, yes, I have written several real-time operating systems for embedded systems, some of them mission critical. And, no, in retrospect it would have been a far better idea to license existing technology but as a programmer sometimes you don't have the option to make those decisions if you employer is dead set on a given approach.
No, I have never written a workstation-class OS. I know better than that. Today, it would be lunacy to even suggest it, particularly for a solo programmer, even with a ton of experience.
Anyhow, you succeeded at getting a rise out of me. Congratulations. I hope you are happy. It still doesn't change the fact that attacking the messenger does not invalidate anything I have said or prove whatever your fucking point might be.
I thought I'd offer a more optimistic counterpoint. I think that the 10,000 hours figure is way, way overestimating the amount of time needed in order to create something usable enough to make you satisfied, and which could teach you enough to understand any real operating system at the level of source code.
Although, yeah, if you've got commercial aspirations like OP, then I think you're in for it.
I think you nailed it right there. If the OP had said something akin to "I want to write a small OS with a command line interface to learn about the topic" it would have been an entirely different question. I would encourage anyone to do that. It would be an excellent learning experience and the foundation for a lot more interesting work.
With regards to your comment about writing a "fully preemptible, multiprocessor unixy kernel and userspace on x86" in school. Sure. Of course. But, keep in mind that you are actually being TAUGHT how to do this and guided throughout the process. You also mentioned that "a bootloader, a build system, a syscall spec, and some moderate protection from the details of some hardware beyond the CPU" are provided. The OP is talking about writing everything!
For example, he seems to talk about writing PCIe, SATA and USB interfaces. That alone could take a newbie a whole year to figure out. Particularly if coming from being a web developer.
Insane? Yes. Impossible? Of course not. Probable? Nope.
About the course: the extent to which we were guided was minimal by design. It wasn't a matter of "here's a skeleton, here are steps A, B, and C to get it working", but rather "here's an API, implement it; here are some general design ideas". I think that this is comparable to what you'd get if you sat down with a book on your own, and so I hope that it might give people a decent idea of the level of difficulty of such a project.
I emphatically agree with your point about the difficulty of a newbie + PCI situation, although a year still seems steep.
Back in the old days before the pc took off and people started expecting abstractions for everything conceivable, this was a normal part of a project. At university, I was tasked with building an rtos platform for the m68k. Took about a month from zero knowledge to working multitasking os with DMA, memory management and protection and a virtual machine which ran plc-style ladder logic.
The only problem is if you start with x86, you're going to have to fight the layers of fluff that have built up since the 8086 (ldt/gdt/long mode/segments/pci bus/shitty instruction set etc).
I'd go for ARM.
Did you read the article? There's nothing there than hints a simple operating system at all.
So, yeah: Bullshit. The operating system he is talking about is no hobby project.
Also, define "simple operating system". What is it? What can it do? What can't it do?
And I never read any one of those books you're listing there other than K&R and the 486 reference manuals. Sure enough I had a fair grasp of the x86 processor architecture before starting this and I'd done a lot of low level 8 bit work. But on the whole I spent more time waiting for it to reboot than I did writing code or reading books and I still managed to do all this in about two years.
This is doable. It's hard, but it is doable, and it is a lot easier now than when I did it. For one you have VMs now which make it a thousand times easier to debug a kernel. No more need to use a dos-extender to bootstrap your fledgling kernel code and so on.
This guy is way out of his depth, that's for sure. But what is also for sure is that he's going to learn quickly and seems on the right road for that (acknowledging what he doesn't know yet).
Don't tell other people what they can't do. Just wait and see, they just might surprise you. You'd have been talking to Linus like that just the same. And you would have been right about him being out of his depth, and you would have been wrong about him not being able to achieve his goal in the longer term.
Maybe this guy will get discouraged, maybe he won't. But no need to kill his enthusiasm with a negative attitude. If you're so smart, why not give him a hand, point him in the right direction on the concrete questions he's asking rather than to literally throw the book (or in this case a whole library) at him and tell him he's clueless.
He probably already knows that anyway, but at least he's willing to learn.
He is not that far off from your suggestions. Why do you think 8bit and realtime is better than AMD64? The architecture is a lot more complicated. On the other it is probably much better documented as well.
The idea here is to really get down to the basics and understand them with a series of incremental projects.
Of course, there are no universally true rules about this stuff. This happens to be my opinion based on over quite of few years of developing and shipping products that entail both electronics and software.
As an example, I am teaching my own son how to program with C and Java almost simultaneously. Why? Well, he is not learning on his own, his Dad happens to knows this stuff pretty well and we are spending a lot of time on all of it. So, navigating two languages at the same time is working out OK. I've also had him sit with me while I work in Objective-C and ask questions as I go along.
In about three months we are going to build a real physical alarm clock using a small microprocessor and LED displays. I am going to to bootstrap Forth on that processor. The job will also require writing a simple screen text editor in order to make the clock its own development system.
So, by the middle of next year he will have been exposed to raw C, Java and a threaded interpreted language like Forth. I want to expose him to Lisp as well but don't yet know when it will make sense to do that. Maybe in a year or so. With three programming paradigms on the table it will be far more important to explore algorithms and data structures/data representation and understand how they look like with each technology.
Wow, what an idiotic thing to say. C is one of the worst languages for learning the actual fundamentals of programming, which are algorithms and data structures.
Let's take it down even further: Every language you care to suggest ultimately ends-up in machine language. You can implement ANY algorithm or data structure management you care to mention in assembler. So, assembler isn't any less capable in that regard than any language anyone might care to propose.
Now, of course there's the practical matter of the very real fact that doing object oriented programming --as an example-- in assembler would be extremely painful, so yeah, this would not be the first choice.
Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages. That I have never said anywhere. In fact, in a recent post I believe I suggested a progression involving assembler, Forth, C, Lisp, C++, Java (or other OO options). Even less popular languages like APL have huge lessons to teach.
As the level of abstraction increases one can focus on more complex problems, algorithms and data structures. That is true.
One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them. They have almost zero clue as to what happens behind the scenes. That's why I tend to like the idea of starting out with something like C. It is very raw and it can get as complex as you care to make it.
One can use C to write everything from device drivers, operating systems, mission critical embedded systems, database managers, boot loaders, image processors, file managers, genetic solvers, complex state machines and more. There's virtually nothing that cannot be done with C.
Is it ideal? No such language exists. However, I'll go out on a limb and say that if I have two programmers in front of me and one only learned, say, Objective-C and nothing more while the other started out with C and then moved to Objective-C, the second programmer will be far better and write better code than the first.
All of that said, there is no magic bullet here. Start with whatever you want. No two paths are the same. Just different opinions.
You're arguing against a strawman here.
> Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages.
And I never used that strawman.
> As the level of abstraction increases one can focus on more complex problems, algorithms and data structures.
And this is my point. You can focus on what you're learning without having to waste time on anything else.
Why don't you advocate a return to punch cards?
> One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them.
And most programmers don't know enough physics to understand how a transistor works, either. You can learn stuff like that as and when you need it. The actual core needs to come first.
> There's virtually nothing that cannot be done with C.
Ditto machine language, as you just said. So why didn't you say everyone needs to start with machine language?
You can learn just as much with assembler. It would be a huge pain in the ass. And, just in case there's any doubt, I am not proposing that anyone use assembler to learn complex algorithms, patterns or data structures.
Your original comment "what an idiotic thing to say" is just false. You can learn ALL fundamentals of programming with C. And, yes, you can learn ALL fundamentals of data structures with C.
Classes and OO are not "fundamentals". That's the next level. And there's a whole movement proposing that there are huge issues with OO to boot.
I have a question. You were quick to call me an idiot for suggesting that newbies need to start with C. OK. I have a thick skin. Thanks.
Now, let's move on. I noticed that you did not offer a solution. What would you suggest someone should start with? Why? How is it better than starting with C?
Now, keep in mind that we are talking about STARTING here. We are not talking about --and I have never suggested that-- C is the ONLY language someone should learn. Quite the contrary.
I NEVER DID THAT. I merely said an idea was idiotic.
I will not proceed until you acknowledge that. It's a question of honesty.
1. of, pertaining to, or characteristic of an idiot.
2. senselessly foolish or stupid: an idiotic remark.
What would you suggest someone should start with?
How is it better than starting with C?
What will they learn that they cannot learn with C?
How would learning C as their first language hinder them?
Why is C an idiotic first choice?
Newton held idiotic ideas. Was Newton an idiot? No. Did I just call Newton an idiot? No.
> What would you suggest someone should start with?
It depends on the person and why they want to program.
Because I don't think C is the best choice for all tasks. In fact, I think C is a poor choice for most of the reasons people start programming.
> How is it better than starting with C?
Because C forces the programmer to prioritize machine efficiency above everything else. Algorithms get contorted to account for the fact the programmer must explicitly allocate and release all resources. Data structures get hammered down into whatever form will fit C's simplistic (and not very machine efficient) memory model.
In short, everything is simplified and contorted to fit the C worldview. The programmer is forced to act as their own compiler, turning whatever program they want to write into something the C compiler will accept.
> What will they learn that they cannot learn with C?
A clearer understanding of things like recursive data structures, which are complicated with excess allocation, deallocation, and error-checking noise code in C.
Compare a parser written in Haskell to one written in C: The string-handling code is reduced to a minimum, whereas in C it must be performed with obscene verbosity.
> How would learning C as their first language hinder them?
> Why is C an idiotic first choice?
It is purely wasteful to have new programmers worry about arbitrary complexities in addition to essential complexities. It is wasteful to have new programmers writing the verbose nonsense C imposes on them every time they want to do anything with a block of text. That time should be spent learning more about the theory behind programming, the stuff that won't change in a few years because it is built on sound logic, not accidents of the current generation of hardware design.
Well. We couldn't disagree more.
I love APL because it absolutely removes you from nearly everything low-level and allows you to focus on the problem at hand with an incredible ability to express ideas. I did about ten years of serious work with APL. I would not suggest that a new programmer start with APL. You really need to know the low level stuff. Particularly if we are talking about writing an operating system and drivers.
Nobody is suggesting that a programmer must never stray outside of C. That would be, to echo your sentiment, idiotic. A good foundation in C makes all else non-magical, which is important.
In particular, I am thinking about pointers and memory management, but there are other things.
This is also why I think we have so much bloated code these days. Everything has to be an object with a pile of methods and properties, whether you need them or not. Meanwhile nobody seems to be able to figure out that you might be able to solve the problem with a simple lookup table and clean, fast C code. There was a blog post somewhere about exactly that example recently but I can't remember where I saw it.
I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C. It's been a couple of years but I think that the performance was hundreds of times faster than anything the optimized Objective-C code could achieve. The heavy bloated NS data types just don't cut it when it comes to raw performance.
Someone who has only been exposed to OO languages simply has no clue as to what is happening when they are filling out the objects they are creating with all of those methods and properties or instantiating a pile of them.
'Bloat' is a snarl term. It's meaningless. It literally means nothing, except to express negative emotion.
> I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C.
Did you try any other algorithms? Any other data structures? Simply picking a new language is laziness.
When dealing win an array is 400 times slower in a "modern OO language" then in raw C, well, the code id fucking bloated.
When you can use a simple data structure and some code to solve a problem and, instead, write an object with a pile of properties and methods because, well, that's all you know, that's bloated code.
Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make. That's the way it goes.
With regards to my GA example. No, I had to implement a GA. That's what was required to even attempt to solve the problem at hand. Later on we used it to train a NN, which made the ultimate solution faster. But, the GA was required. There was no way around it and Objective-C was such a an absolute pig at it that it made it unusable.
> Simply picking a new language is laziness
See, there's the difference. I started programming at a very low level and have experienced programming languages and approaches above that, from C, to C++, Forth, Lisp, APL, Python, Java, etc.
I have even done extensive hardware design with reconfigurable hardware like PLD, PLA's and FPGA's using Verilog/VHDL. I have designed my own DDR memory controllers as well as raw-mode driver controllers and written all of the driver software for the required embedded system. My last design was a combination embedded DSP and FPGA that processed high resolution image data in real time at a rate of approximately SIX BILLION bytes per second.
So, yes, I am an idiot and make really fucking dumb suggestions.
Because of that I would like to think that, if the choice exists --and very often it does not-- I do my best to pick the best tool for the job.
More often than not, when it's pedal-to-the-metal time C is the most sensible choice. It used to be that you had to get down to assembler to really optimize things, but these days you can get a way with a lot if C is used smartly.
Social science numbers do not impress me. Besides, what is a "modern OO language"? Haskell? How can you give any numbers without even specifying that detail?
> Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make.
Your idea that "OO = fat and slow" is blown away by actual benchmarks.
(And, yes, unless and until you define what "OO" is to you, I'll pick Haskell as a perfectly reasonable OO language. Given than I've seen C called OO by people with better writing skills than you, this is hardly a strange choice in this context.)
> So, yes, I am an idiot
Again, I did not call you an idiot. The only one calling you an idiot here is you.
> More often than not, when it's pedal-to-the-metal time C is the most sensible choice.
I agree fully with this. However, I disagree that "pedal-to-the-metal time" is all of the time, or even most of the time. Especially when you're trying to teach programming.
Do you teach new drivers in an F1 racecar? Why or why not?
No. Not really. C doesn't show you any of the essential parts of cache, opcode reordering, how multicore interacts with your code, or much of anything else that actually makes hardware fast.
C makes you act as if your computer was a VAX.
Below let X represent roughly the sentiment "C is a good learning language, since it teaches you what happens at a low level"
darleth: C sucks as an intro language
robomartin: No it doesn't because X
darleth: X was true 30 years ago but isn't anymore
robomartin: well C is still better because there is no language that does X
A better refutation is that I cannot predict the order of complexity for an algorithm written in Haskell that I could trivially do in C. Haskell presents immutable semantics, but underneath it all, the compiler will do fancy tricks to reuse storage in a way that is not trivially predictable for a beginner.
Similarly with Java, you end up having to explain pointers and memory and all that nastyness the first time the GC freezes for 1-2 seconds when they are testing the scaling of an algorithm they implemented in it.
Yes there is a "learn that when you need it" for a lot of stuff, but for someone actually learning fundamentals like data-structures and algorithms, we are talking about a professional or at least a serious student of CS. Someone in that boat will need to be exposed to these low-level concepts early and often because it is a major stumbling block for a lot of people.
If you just want to write a webapp, use PHP. If you want to learn these fundamentals you will also need to be exposed to the mess underneath, and it needs to happen sooner than most people think.
I appreciate your sentiment. However, I think you made the mistake of assuming that there is an argument here. :)
I find that most software engineers who, if I may use the phrase, "know their shit", understand the value of coming-up from low level code very well. I have long given-up on the idea of making everyone understand this. Some get it, some don't. Some are receptive to reason, others are not.
I am working on what I think is an interesting project. Next summer I hope to launch a local effort to start a tech summer camp for teenagers. Of course, we will, among other things, teach programming.
They are going to start with C in the context of robotics. I have been teaching my kid using the excellent RobotC from CMU. This package hides some of the robotics sausage-making but it is still low-level enough to be very useful. After that we might move them to real C with a small embedded project on something like a Microchip PIC or an 8051 derivative.
In fact, I am actually thinking really hard about the idea of teaching them microcode. The raw concept would be to actually design a very simple 4 bit microprocessor with an equally simple ALU and sequencer. The kids could then set the bit patterns in the instruction sequencer to create a set of simple machine language instructions. This is very do-able if you keep it super-simple. It is also really satisfying to see something like that actually execute code and work. From that to understanding low-level constructs in C is a very easy step.
After C we would move to Java using the excellent GreenFoot framework.
So, the idea at this point would be Microcode -> RobotC -> full C -> Java.
Anyone interested in this please contact me privately.
Except this is also true for C at this point. Maybe the order won't change, but maybe it will at that, if the compiler finds a way to parallelize the right loops.
C compilers have to translate C code, which implicitly assumes a computer with a very simplistic memory model (no registers, no cache), into performant machine code. This means C compilers have to deal with the register scheduling and the cache all by themselves, leading to code beginners have a hard time predicting, let alone understanding.
Add to that little tricks like using MMX registers for string handing and complex loop manipulation and you have straightforward C being transformed into, at best, with a good compiler, machine code that you need to be fairly well-versed in a specific platform to understand.
This is why I get so annoyed when people say C is closer to the machine. No. The last machine C was especially close to was the VAX. C has gotten a lot further away from the machine in the last few decades.
The implication here is that you should teach C as an end in itself, not as an entry point into machine language. If you want to teach machine language, do it in its own course that has a strong focus on the underlying hardware. And don't claim C is 'just like' assembly.
2) Despite #1 C is still closer to the machine than Haskell, and I'm not sure how you could maintain otherwise
3) Nearly all of the C optimizations will, at best, make a speedup by a constant factor. Things that add (or remove) an O(n) factor in Haskell can and do happen.
You're not reading my other posts, then. I explicitly said programmers can learn those things as and when they need to.
In fact, your suggestion is exactly on point: The OP should pick-up Tannenbaum's book and take a year to implement everything in the book. Why a year? He is a web developer and, I presume, working. It will take more than time to learn what he does not know in order to even do the work. So, let's say a year.
I would suspect that after doing that his view of what he proposed might just be radically different.
For example, wait until he figures out that he has to write drivers for chip-to-chip interfaces such as SPI and I2C. Or that implementing a full USB stack will also require supporting the very legacy interfaces he wants to avoid. Or that writing low-level code to configure and boot devices such as DRAM memory controllers and graphics chips might just be a little tougher than he thought.
There's a reason why Linux has had tens of thousands of contributors and millions of lines of code:
...and that's just the kernel.
Writing some kind of a minimalist hobby OS on top of the huge body of work that is represented by the drivers and code that serve to wake up the machine is very different from having to start from scratch.
My original comment has nothing whatsoever to do with anything other than the originally linked blog post which describes almost literally starting from scratch, ignoring decades of wisdom and re-writing everything. That is simply not reasonable for someone who's experience is limited to doing web coding and dabbling with C for hobby projects. In that context, just writing the PCI driver code is an almost insurmountable task.
If I were advising this fellow I'd suggest that he study and try to implement the simplest of OS's on a small embedded development board. This cuts through all the crud. Then, if he survives that, I might suggest that he moves on to Tanenbaum's book and take the time to implement all of that. Again, in the context of a working web professional, that's easily a year or more of work.
After that --with far more knowledge at hand-- I might suggest that he start to now ask the right questions and create a list of modifications for the product that came out of the book.
Far, very, very far from the above is the idea of starting with a completely blank slate and rolling a new OS that takes advantage of nearly nothing from prior generations of OS's. And to do that all by himself.
What's the harm here? He's going to dive in and learn something, and he's probably going to get further along than you expect, because this stuff just isn't as complicated as people like to think it is.
You have to keep in mind that the OP is talking about such things as writing his own PCIe and USB stacks as well as everything else. He is leaving all history and prior work on the floor and re-inventing the wheel.
That's very far from writing a small RTOS for an 8-bit processor. In fact, my suggestion is that he should do just what you did in order to understand the subject a lot better. There's a lot of good Computer Science that can be learned with a small 8-bit processor.
Did I know all of that stuff? Of course not. Did it compare in complexity to what this article is proposing. Nope. The OS described in the article is far, far more complex than what I just described.
Is is incapable of doing it? Nope. I did not say that. I think I said that anyone who has written a non-trivial RTOS would laugh at the idea of what he described. Why? Because it is a monumental job for one person, particularly if they've almost done zero real development at the embedded level and they also have to work for a living.
I got started designing my own microprocessor boards, bootstrapping them and writing assembly, Forth and C programs before when I was about 14 years old. By the time I got to college I knew low-level programming pretty well. As the challenge to start diving into writing real RTOS's presented itself I could devote every waking hour to the task. Someone starting as a web developer --who presumably still needs to keep working-- and wanting to develop such an extensive OS is just, well, let's just say it's hard.
The guy built three planes. One didn't take of, one crashed, one brought him across Africa in short hops with landings at rebel-occupied airports where he didn't always manage to announce his arrival.
Short thread at http://www.homebuiltairplanes.com/forums/hangar-flying/12196....
(Apologies for deviating from the subject, but I figure hackers might find this interesting)
If you really have a good grasp of the concepts sometimes the hard part is the drudgery of possibly having to write all the device drivers for the various devices and peripherals that the RTOS has to service.
In the case of the OP, he seems to be talking about rewriting everything from the most basic device drivers on up to bootloaders and even the GUI. That's a ton of work and it requires knowledge across a bunch of areas he is not yet well-versed in.
Also, when it comes to the idea of writing an RTOS, there's a huge difference between an RTOS for, say, a non-mission-critical device of some sort and something that could kill somebody (plane, powerful machinery, etc.). That is hard not because the concepts require superior intellect but rather because you really have to understand the code and potential issues behind how you wrote it very well and test like a maniac.
I have written RTOS's for embedded control of devices that could take someone's arm off in a fraction of a second. Hard? Extremely, when you consider what the stakes are and particularly so if it is your own business, your own money and your own reputation on the line. There's a lot more to programming that bits and bytes.