I got this running on qemu by cannibalizing a tiny bit of code from xv6 (http://pdos.csail.mit.edu/6.828/2012/xv6.html) to replace the GRUB dependency. After cloning and building mkernel according to its instructions, I cloned and built xv6 as well:
$ git clone git://pdos.csail.mit.edu/xv6/xv6.git
$ cd xv6
$ make
(Based on xv6 at hash ff2783442ea2801a4bf6c76f198f36a6e985e7dd and mkernel at hash 42fd4c83fe47933b3e0d1b54f761a323f8350904. Ping me if you have questions; email in profile.)
There's a slight problem in this tutorial in that it assumes ESP (the stack pointer) will be defined by the boot loader to point to an appropriate location for the stack. However, the Multiboot standard states that ESP is undefined [1], and that the OS should set up its own stack as soon as it needs one (here the CALL instruction uses the stack, and the compiled C code may well too).
An easy way to solve this is to reserve some bytes in the .bss section of the executable for the stack by adding a new section in the assembly file:
[section .bss align=16]
resb 8192
stack_end:
Then, before you make use of the stack (between `cli` and `call kmain` would be appropriate in this case), you need to set the stack pointer yourself, e.g. with `mov esp, stack_end`.
This seems like a good start, but it's worth noting that it isn't guaranteed to work correctly.
The problem is that control is passed to the C code with no stack space set up. It works out of pure luck because the compiler has decided to keep all variables in registers; changing your C compiler flags might make this fail in interesting ways.
Another thing that is missing is clearing the .BSS section before passing control to the C code. It's not used at the moment, though.
I am pointing these things out because in my own ring-0 programming projects I spent a lot of time debugging failures related to stack space and an uninitialized BSS section.
> Another thing that is missing is clearing the .BSS section before passing control to the C code. It's not used at the moment, though.
The Multiboot standard says that the boot loader will clear the .bss section for you - in section 3.1.3: "bss_end_addr Contains the physical address of the end of the bss segment. The boot loader initializes this area to zero"
I don't know what GRUB does if you rely on the fact that it can parse ELF files and don't specify fields like bss_end_addr, though. I'm fairly sure it clears it in that case too, but I'm using Multiboot 2 for my OS, so the behaviour could be different.
> The Multiboot standard says that the boot loader will clear the .bss section for you - in section 3.1.3: "bss_end_addr Contains the physical address of the end of the bss segment. The boot loader initializes this area to zero"
Ok, good to know.
It certainly doesn't do that unless you tell it to (using the address tag), and neither this example nor my hobby kernel uses that.
So the BSS must be cleared or the bootloader told to do so.
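For what it's worth, clearing it yourself is only a few lines. A rough sketch in C, assuming your linker script exposes __bss_start and __bss_end symbols (placeholder names here - use whatever your script actually defines):

    /* Zero the .bss section by hand before relying on any zero-initialized
       globals. Must run before anything that depends on .bss contents. */
    extern char __bss_start[];   /* hypothetical linker-script symbols */
    extern char __bss_end[];

    static void clear_bss(void)
    {
        for (char *p = __bss_start; p < __bss_end; p++)
            *p = 0;
    }

(Call it first thing from your entry code; the function itself must not depend on anything in .bss, which it doesn't here.)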
I've just checked the GRUB source code and I think it will clear the .bss section even if it's loading an ELF file.
grub-core/loader/multiboot_elfxx.c has a function named grub_multiboot_load_elf32/64 which actually loads the segments of the ELF file. A segment has two fields defining its size: filesz (the number of bytes to copy from the file) and memsz (its actual size once loaded). If memsz is greater than filesz, GRUB zeroes the trailing bytes.
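The logic is roughly this (a paraphrased sketch of the idea in C, not the actual GRUB source):

    /* For each ELF program header: copy filesz bytes from the file,
       then zero the remaining (memsz - filesz) bytes of the segment. */
    #include <stdint.h>
    #include <string.h>

    struct segment {
        uint8_t  *dest;      /* physical load address of the segment */
        uint8_t  *filedata;  /* segment bytes as stored in the ELF file */
        uint32_t  filesz;    /* bytes present in the file */
        uint32_t  memsz;     /* size of the segment in memory */
    };

    static void load_segment(const struct segment *s)
    {
        memcpy(s->dest, s->filedata, s->filesz);
        if (s->memsz > s->filesz)
            memset(s->dest + s->filesz, 0, s->memsz - s->filesz);
    }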
The .bss section is placed by the linker at the end of a segment and increases memsz by its size (but not filesz, to avoid having to store lots of pointless zeroes in the ELF file). For example, in the segment of my kernel's ELF file that contains the .bss section at the end, memsz is 0x17678 bytes while filesz is only 0x4be0 bytes; the difference between them is the size of the .bss section.
grub_multiboot_load_elf32/64 is called in the case when the address tag is not present, so the .bss section will be cleared by GRUB in this case as well.
If anybody is doing this, let me share some words of advice based on experience.
Please use a virtual machine instead of doing this on your primary machine.
You eliminate the risk of messing up your machine. Also, if you set up the VM properly, you get a debugger.
It's still important to test on physical hardware though (perhaps on an old spare machine you don't care about, if you want to be cautious), because virtual machines do not perfectly emulate real hardware. For example, QEMU initializes memory to all zeroes, whereas on a real system it's typically all ones, which led to some interesting bugs in my OS on real hardware where I had forgotten to zero out some memory.
I also really enjoyed James Molloy's OS kernel development tutorial at http://www.jamesmolloy.co.uk/tutorial_html/, which takes you from "Hello World" to some real toy OS kernel implementation.
It's worth pointing out there are a few bugs in James Molloy's tutorial [1], and some of the things he does in it aren't exactly best practice - for example, a few I remember are:
- Disabling interrupts and paging (which also has the side effect of flushing the TLB) in order to copy memory around by physical address. This could be done without disabling them by mapping all of physical memory into virtual memory instead (feasible in 64-bit mode; in 32-bit mode there isn't enough room once your PC has an amount of RAM comparable to the virtual address space, in which case you could map smaller parts of it as needed).
- Moving the stack around to work around the fact that GRUB doesn't set ESP to some well-defined value (instead of defining the stack yourself, which would be much more robust), and then attempting to rewrite the base pointers to fix it. His code can't tell the difference between an actual pointer and an integer that just happens to fall within the range of the old stack addresses, and will happily rewrite both. Also, since ESP isn't defined by the Multiboot standard, you could be using any location at all as the stack (some memory address that doesn't exist, your kernel's own code, a memory-mapped area belonging to a piece of hardware, etc.), all of which will mean things go wrong. It's better to just set ESP yourself before you enter C - see another of my comments on this submission [3].
There's actually a newer and much better version of JamesM's tutorials on GitHub, but I believe they aren't quite finished [2].
Very cool. I personally (as a developer without a CS background) find these sorts of posts wonderfully interesting, even if this kernel, as pointed out in this thread, lacks a lot of what a normal kernel does. I'd love to see one of these for a compiler!
I highly recommend http://www.hokstad.com/compiler/ ; it talks about writing a compiler in a way that makes sense to me - writing it the way you write any other program, rather than throwing you straight in with lexers and parsers and never justifying why we need to do things this way.
Glad you like it... I'm hoping to push out the next part later this week, and I've got a few more drafts queued up that "just" need some proofreading.
Working on getting it to the point where it can fully compile itself now, and I hope to get there over the next couple of months.
It fits very well into a full recursive descent parser and is much more efficient and flexible than hard-coding productions for handling the operators (and you can dynamically add new ones easily).
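Assuming "it" here refers to operator-precedence (precedence-climbing) parsing, here's a rough sketch in C of how it folds into a recursive-descent parser - a single operator table replaces one hard-coded grammar production per precedence level:

    /* Sketch of precedence climbing inside a recursive-descent expression
       parser. Operators and their precedences live in a table, so adding a
       new operator is a table entry rather than a new production. Parses
       and evaluates integer expressions with + - * / and parentheses. */
    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>

    static const char *src;                  /* input cursor */

    struct op { char sym; int prec; };
    static const struct op ops[] = { {'+', 1}, {'-', 1}, {'*', 2}, {'/', 2} };

    static int precedence(char c)
    {
        for (unsigned i = 0; i < sizeof ops / sizeof ops[0]; i++)
            if (ops[i].sym == c) return ops[i].prec;
        return -1;                           /* not a binary operator */
    }

    static long parse_expr(int min_prec);

    static long parse_primary(void)
    {
        while (isspace((unsigned char)*src)) src++;
        if (*src == '(') {                   /* parenthesised subexpression */
            src++;
            long v = parse_expr(0);
            if (*src == ')') src++;
            return v;
        }
        return strtol(src, (char **)&src, 10);
    }

    static long parse_expr(int min_prec)
    {
        long lhs = parse_primary();
        for (;;) {
            while (isspace((unsigned char)*src)) src++;
            int prec = precedence(*src);
            if (prec < min_prec) return lhs;
            char op = *src++;
            long rhs = parse_expr(prec + 1); /* +1 makes operators left-associative */
            switch (op) {
            case '+': lhs += rhs; break;
            case '-': lhs -= rhs; break;
            case '*': lhs *= rhs; break;
            case '/': lhs /= rhs; break;
            }
        }
    }

    int main(void)
    {
        src = "1 + 2 * (3 + 4)";
        printf("%ld\n", parse_expr(0));      /* prints 15 */
        return 0;
    }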
I've found myself spending a lot more time here as well. Was there any code/policy change with the HN site? Or is this new moderators jumping in and helping out a lot? Either way, I agree that it is excellent.
About a week or two ago pg added moderation. I cannot find the thread right now, but basically comments are not published until someone with high enough karma (I think 1000 points) approves the comment.
Neat read so far! Not done yet, but I think I've found a small error in kernel.c: the attribute byte of the characters in "my first kernel" should be set to 0x02, not 0x07.
edit: I misread. 0x07 is intentional, 0x02 was mentioned as an alternative. Good post!
uaygsfdbzf: your account is marked dead it seems, so most of us can't see what you write.
Here's what he posted:
Not an error. 0x02 means green character on black background. 0x07 means light grey character on black bg. It's explained with one, and coded with another.
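For anyone following along: each cell of the VGA text buffer at 0xB8000 is two bytes, the character followed by the attribute (low nibble = foreground colour, high nibble = background). A minimal sketch of the idea, assuming the usual text-mode setup (the names here are made up, not taken from the post's kernel.c):

    #include <stdint.h>

    #define VGA_TEXT_BUFFER     ((volatile uint16_t *)0xB8000)
    #define LIGHT_GREY_ON_BLACK 0x07   /* what the post uses */
    #define GREEN_ON_BLACK      0x02   /* the alternative it mentions */

    /* Write one character cell: low byte = character, high byte = attribute. */
    static void put_cell(unsigned int index, char c, uint8_t attr)
    {
        VGA_TEXT_BUFFER[index] = (uint16_t)(uint8_t)c | ((uint16_t)attr << 8);
    }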
This was fantastic. My only question is, how does one gain knowledge of the required x86 hardware specifics he mentions? I don't know where to begin looking to uncover these sorts of things:
- The x86 CPU begins execution at the physical address [0xFFFFFFF0]
- The bootloader loads the kernel at the physical address [0x100000]
- The BIOS copies the contents of the first sector to physical address [0x7c00]
Is there an x86 instruction manual or is this sort of thing passed down through generations of engineers?
> Is there an x86 instruction manual or is this sort of thing passed down through generations of engineers?
Yes, there are x86 instruction manuals, colloquially known as the "Intel manuals".
That's several volumes of documentation, running to a few thousand pages in total. It is quite a good example of well-written and informative technical documentation.
There is also the JOS kernel, part of an MIT course. There were instructions to build a special version of Bochs (a PC emulator) and run the various kernels you develop in it.
... and the gcc flag would need to be -m64. But I don't think it's that easy. If GRUB is in 32-bit mode when it hands you control, I would think you would need to switch to long mode before you could execute 64-bit code. So your entry point would still have to be a 32-bit program, which would set up everything necessary for long mode, make the switch, and then load and execute the 64-bit part of your program.
> Can anyone give me an idea how much different this would be for 64bit? Do I just change the nasm directive to `bits 64`?
A lot different. You can see the boot code of my x86_64 hobby kernel project [1].
The reason is that GRUB and the Multiboot protocol actually do most of the machine initialization, but they can only set up 32-bit mode. 64-bit mode is missing partly because standardization of the Multiboot protocol is lagging behind, and partly because there's no single correct way to do it: in 64-bit mode paging is mandatory, so you can't just hand over control with memory identity mapped and no page tables (unlike 32-bit mode).
If you read the OSDev wiki, you can find examples of doing machine initialization "from scratch", i.e. straight after the PC BIOS (or UEFI). This involves arcane details of the x86 machine like setting up something called the A20 line (a hack that gates access to memory above 1 megabyte, originally implemented using a spare pin on the keyboard controller!), etc. Dealing with this shit is not time well spent. (UEFI is a lot easier in some ways, harder in others.)
This means that the boot code of your kernel must set up long mode, create an initial page table for virtual memory, etc. Here's my limited 64 bit boot code that sets up one 2 megabyte page [2].
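To give a rough idea of what that involves, here's a sketch in C of the page-table structures for a single identity-mapped 2 MiB page - just an illustration of the layout (real early boot code builds these in assembly before paging is even enabled), not the boot code linked above:

    #include <stdint.h>

    #define PTE_PRESENT (1ull << 0)
    #define PTE_WRITE   (1ull << 1)
    #define PTE_HUGE    (1ull << 7)   /* PS bit: this PD entry maps a 2 MiB page */

    /* Page tables must be 4 KiB aligned. */
    static uint64_t pml4[512] __attribute__((aligned(4096)));
    static uint64_t pdpt[512] __attribute__((aligned(4096)));
    static uint64_t pd[512]   __attribute__((aligned(4096)));

    /* Identity-map the first 2 MiB of physical memory (assumes the tables'
       link-time addresses equal their physical addresses). CR3 is then
       loaded with the address of pml4, EFER.LME and CR0.PG are set in
       assembly, and a far jump enters 64-bit code. */
    static void build_initial_page_tables(void)
    {
        pml4[0] = (uint64_t)(uintptr_t)pdpt | PTE_PRESENT | PTE_WRITE;
        pdpt[0] = (uint64_t)(uintptr_t)pd   | PTE_PRESENT | PTE_WRITE;
        pd[0]   = 0 /* physical address 0 */ | PTE_PRESENT | PTE_WRITE | PTE_HUGE;
    }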
Going to 64-bit mode is not that much more code, but it will make kernel development more painful. In particular, switching CPU modes messes up the GDB debugger, which must be patched to work at all. And there's a bit more work involved in all the little things that come with it.
So for educational purposes it would make more sense to stick to 32 bit mode than deal with the nitty gritty details of 64 bit long mode.
This makes me really sad. 32-bit x86 assembly is such a mess compared to amd64 -- I guess it's a good excuse to work with qemu and ARM, if nothing else ;-)
It would be difficult to write an article like this for ARM, because ARM has no equivalent of what the PC "standard" is to x86. Every ARM board and SoC has different boot protocols and peripheral devices.
There are some bootloaders that are commonly used in ARM, such as U-Boot. Perhaps that could be used to get started.
Too late to edit my post, but I eventually found this: http://wiki.osdev.org/ARM_Integrator-CP_Bare_Bones. I haven't tried it out yet as I'm still getting the cross-compiler toolchain together (doesn't seem to be part of OpenSUSE 12, at least not an obvious part), and it's not as well explained, but still the best I've found.
I kinda wish some of the FS utilities were better. Despite FUSE, mounting a block device (like a disk image) as a non-root user is tough, so writing to an image with actual partitions/FS on it is difficult to script, especially if you don't want to sudo in a build script. losetup on a disk image doesn't (to my knowledge) detect partitions… for reasons unknown to me.
Boot sectors are similar. You can't just install GRUB to a disk image. (You have to losetup it, at the very least, which implies root. Why can't I just install to a file?)
You want a script/build system that allows you, at the very least, to:
1. Code.
2. Run build system/script.
3. Fire up qemu or similar.
You can't be rebooting. Ideally, it'd be great to do this in userspace.
That said, if you're starting out, just do [boot sector] + [kernel] = tada image until you need to do otherwise. (Really, do whatever works and is easy.)
That said, I've found a few somewhat helpful tools.
fuseloop[1] takes a file and an offset/size, and exposes a single file. If you have a partitioned disk image, you can feed it the partition offset, and it gives you back something you can format as an FS (e.g. something you can run ext2fs on).
Then there's fuse.ext2[2], which mounts an ext2 FS via FUSE, so it's usable as non-root again. Note that I'm linking to my fork of it, since the original didn't build for me (which I fixed and sent a pull request for, but never heard back); I didn't write it.
Finally — and sorry to peddle my own stuff again — I wrote a Python library for dealing with the MBR.[3] I use it to figure out offsets and sizes in a disk image.
I've had a bit of fun writing a boot loader, and I've managed to get it to load up its stage 2 and switch to 32-bit pmode. Had a fun error where a division instruction was throwing things into a triple fault; see [4] if you want to see how a division instruction can fail without dividing by zero (which was the first thing I checked). The disk layout is currently:
[boot sector] [stage 2] [kernel] [ partitions, FS, real data, etc. ]
stage2's size is hard-coded into the first sector of stage 2 (into itself), and the kernel's location and size will be similarly hard-coded into it as well. (When I get there. Disks in pmode are different, as you can't just have the BIOS do all the work for you, sadly!) And by hard-coded, I mean a build script calculates them and just re-writes a few bytes.
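In case it helps anyone, that byte-patching step can be a tiny helper program. A sketch in C - the offset and the 16-bit sector-count field here are made up for illustration and obviously depend on your own boot sector layout:

    /* Patch a 16-bit little-endian sector count into a boot image at a
       fixed offset, as a build script step. Offsets/field widths are
       illustrative only. */
    #include <stdio.h>
    #include <stdint.h>

    int patch_size_field(const char *image, long offset, uint16_t sectors)
    {
        FILE *f = fopen(image, "r+b");
        if (!f) return -1;
        uint8_t le[2] = { (uint8_t)(sectors & 0xff), (uint8_t)(sectors >> 8) };
        if (fseek(f, offset, SEEK_SET) != 0 || fwrite(le, 1, 2, f) != 2) {
            fclose(f);
            return -1;
        }
        return fclose(f);
    }

(Run it from the build script after computing the stage 2 size in sectors.)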
I just finished an Operating Systems final today; the entire class was conceptual, focused on fundamentals, while I craved getting my hands dirty and actually trying to make a simple OS.
I'll definitely be playing around with this. Thanks!
Why do kernel tutorials always say they need nasm? GCC already comes with an assembler, so simply use it instead of installing other software.
And if you prefer Intel syntax, simply add ".intel_syntax noprefix" (or just ".intel_syntax") at the top.
C is really easy for this sort of thing. I don't know the specifics for using C++, Go, Rust, etc. (Disclaimer: I'm a fan of C, and I know some C++ and Rust but little Go.) The kernel also doesn't get run-time help, so a lot of features of C++, Rust, and Go would be gone right off the bat (e.g. you can't spawn threads in Rust or throw exceptions in C++ without first setting up your environment, which can be a non-trivial task). Using a subset of any of those languages should work, provided the compiler can produce straight ELF binaries that you could boot (I know g++ and rustc can do this; Go is the only one I wonder about).
If you get to that point, you have the fact that the C ABI is fairly standard, extremely easy to use, and well-known, which makes it easy to intermingle assembly and C. If you did it in C++, you'd definitely want to extern "C" any symbol the assembly is going to call so it isn't mangled; Rust should be the same, and I have no idea about Go. (You'd really also want to extern "C" any user-space system call entry points, because the name mangling might get in the way.) Finally, you have to be able to use pointers and write to arbitrary memory locations, which is easy in C and C++, possible in Rust, but I don't know about Go.
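To make the extern "C" point concrete, the usual idiom is a guarded declaration like this (kmain matching the `call kmain` from the assembly stub; the header itself is hypothetical):

    /* kmain.h (hypothetical): declares the kernel entry point with an
       unmangled name so `call kmain` in the boot stub links against it,
       whether the kernel is compiled as C or C++. */
    #ifdef __cplusplus
    extern "C" {
    #endif

    void kmain(void);

    #ifdef __cplusplus
    }
    #endif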
IMO, the code displayed in this example is hardly a kernel (it is one, technically, but it doesn't really do anything). For a more complex kernel you might benefit from using a language other than C, but for something this simple, C keeps the complexity of getting a working example going to a minimum.
Go doesn't really work because it needs a runtime (as far as I know). Rust, however, is quite usable. There are a couple of kernels, including one by me: https://github.com/lexs/rust-os
C is (pretty) easy to use as a structured, portable assembler (e.g. you get loops and easy-to-use variables, which you don't have in assembly) -- and it's easy to interface with assembly.
Pascal might be another good alternative -- it also doesn't require a run-time. I don't think Go will ever (officially) support this kind of thing; Rust most likely will.
For an example of something that's not C, have a look at Marte OS, implemented in Ada:
I'm going to give OP the benefit of the doubt on this one. A couple of things worth noting: it successfully boots, does not cause a fault of any kind, and is in a position to interact directly with the bare metal. The tools that we use to interact with a *nix system are often just that, bits of code in user space. This is a kernel-space program. That it does not do any memory management or device access yet doesn't mean it's not a kernel.
It might actually be interesting to take these techniques and apply them to something else. Maybe a demo, a la the demoscene, that runs on the bare metal. Maybe implement a game that doesn't have the overhead of an OS.
This is less a finished product and more an inspiration. So often I think that people miss that about things. Sometimes things aren't done, sometimes they're just beginning.
> Maybe implement a game that doesn't have the overhead of an OS.
I'm in CMU's operating systems class right now and that was one of our projects (our second). The first was writing a stack-tracing debug library, the third was a user-space thread library built on top of a particular kernel spec, and then the fourth was to build a kernel basically from scratch, using our thread library as a test program.
That fourth project, or "p3" as it's known around here, is known to be a killer. It's tough to write a whole kernel in 6 weeks (+ 1 week of break), especially when it's expected to have a full virtual memory system, kernel tasks/threads for concurrent programming, program loading, interrupt handling, etc, even with a partner.
This isn't even a microkernel. This would at best be an example of firmware on an Intel/AMD x86/x64 machine, booted from GRUB, that prints to the console.
This code, however, doesn't do a single thing that other software expects even a microkernel to do (provide basic scheduling, memory management if an MMU is available [which it is in this case], and IPC/FS).
The key part of any kernel can be expressed from this Exokernel definition:
Exokernels are tiny, since functionality is limited to ensuring protection and multiplexing of resources, which are vastly simpler than conventional microkernels' implementation of message passing and monolithic kernels' implementation of abstractions.
They have to, in some way, shape, or form, deal with conflicting requests for resources from their client applications (whether or not you have a concept of privileged or protected execution, like on basic microcontrollers). I'd be curious to see how exokernels manage time, unless each application implements its own scheduler and gives up control of execution.
Reminds me of the old "booter" games on the PC that used their own kernel instead of MS-DOS [1].
It would be fun to see how far one could go with modern hardware. Writing your own driver for a modern graphics card sounds absolutely terrifying (and fun).
Come on, no. It is not a kernel. It's a program that runs "on bare metal".
Let me define what a piece of code needs to do to be a "kernel": it needs to manage some resources to allow other programs to run using those resources. E.g. memory, cpu time, I/O peripherals etc.
A kernel is a program that runs "on bare metal". This program is equivalent to hello world for kernels - it's the simplest possible one. It is called "Kernel 101", after all.
A kernel may be a program that runs on bare metal, but not every program that runs on bare metal is a kernel.
"With the aid of the firmware and device drivers, the kernel provides the most basic level of control over all of the computer's hardware devices. It manages memory access for programs in the RAM, it determines which programs get access to which hardware resources, it sets up or resets the CPU's operating states for optimal operation at all times, and it organizes the data for long-term non-volatile storage with file systems on such media as disks, tapes, flash memory, etc."[1]
A program that prints "hello world" doesn't come close to meeting that description.
> If someone here has gone to college, then they'll know what 101 means - course numbering.
At most colleges whose course numbering systems I've seen, the kind of introductory course that the idiom "101" refers to would actually be "1"; at many of them, "101" would be an upper-division class. It's an idiom that may have been connected to some colleges' numbering systems at some time, but it mostly exists independently now, and going to college doesn't actually make its intended meaning any more obvious.
I went to 4 colleges, applied to a number of others (not all US-based), taught at a few, and am giving advice to friends' kids now going to college. With minor variations (3-digit vs 4-digit course numbers), I've never, ever seen a college course with a single-digit course number. Or a 2-digit one. But then, I've never met an English speaker so culturally tone-deaf that they would question the idiom "Foo 101" as being anything other than "intro to Foo".
The courses I've taken that were very basic and introductory were numbered 100. The next one was perhaps 101. Introductory courses in more specialized subjects often had numbers signifying which research group they belonged to, with a low number at the end. Practices vary, even at my own university.
I'm guessing that 101 specifically is an Americanism.
From my experience, almost no schools in America still use 101 as the introductory course number. At this point, it's basically an idiomatic expression of American English.
It may or may not be "a kernel", but personally, I learned a lot from this one simple tutorial. As someone who normally plays with website code, even something this simple can be very helpful in understanding other areas of programming and how computers work.
And if you look at the comments on the post itself, you will see this "Hey, i am actually planning to write another post with addition of a keyboard driver among others. :)"
This post is just a small stepping stone on the way to a kernel (assuming the poster continues).
Let me rephrase. This is _bootstrap code_ for something that might someday be a kernel. It could also be bootstrap code for something else, a game perhaps. But right now it's nothing else.
We already have a whole bunch of operating systems, many of them free.
One of the frequent problems with a lot of free/open source software folks is that they lack direction. This type of thing where we do stuff just to do stuff probably won't fly in one of the leading tech companies.
Why don't you figure out a real problem people have and look for ways to solve that, instead of just doing random "interesting" stuff that wastes people's valuable time?
Great! Except that to be called a kernel it's missing just a process manager, a memory manager, a filesystem, process separation, and hardware abstraction. Yeah, I'm that guy; downvote me as you wish, the article is still wrong.
It's a way to load a ring-0 application with GRUB. Pretty cool, but not a kernel.
Is Super Mario Bros. a kernel? Every NES, Game Boy, Sega Genesis, etc... game ran on the "bare metal" without anything resembling an OS (or even a BIOS, really).