$ git clone git://pdos.csail.mit.edu/xv6/xv6.git
$ cd xv6
$ path-to-qemu/x86_64-softmmu/qemu-system-x86_64 -serial mon:stdio -hdb fs.img xv6.img -m 512
$ dd if=/dev/zero of=mkernel.img count=10000
$ dd if=bootblock of=mkernel.img conv=notrunc
$ dd if=../mkernel/kernel of=mkernel.img seek=1 conv=notrunc
$ path-to-qemu/x86_64-softmmu/qemu-system-x86_64 -serial mon:stdio -hdb fs.img mkernel.img -m 512
This thing doesn't depend on GRUB, per se. It requires a Multiboot-compliant bootloader, and the QEMU and Bochs emulators have one built in.
All you need to do is:
qemu-system-i386 -kernel kernel
An easy way to solve this is to reserve some bytes in the .bss section of the executable for the stack by adding a new section in the assembly file:
[section .bss align=16]
mov esp, stack_end
The problem is that control is passed to C code with no stack space set up. It works out of pure luck because the compiler has decided to keep all variables in registers, and changing your C compiler flags might make this fail in interesting ways.
Another thing that is missing is clearing the .bss section before passing control to the C code. It's not used at the moment, though.
I am pointing these things out because in my own ring-0 programming projects I spent a lot of time debugging failures related to stack space and an uninitialized .bss section.
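For what it's worth, clearing .bss yourself is only a few lines. A minimal sketch in C, assuming the linker script exports the section bounds (the names __bss_start and __bss_end are hypothetical); here the bounds are plain pointers so it can be tried outside a kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Zero every byte in [start, end). In a real kernel, start and end would
 * come from linker-script symbols (hypothetical names: __bss_start and
 * __bss_end); here they are plain pointers so the sketch runs anywhere. */
void zero_bss(uint8_t *start, uint8_t *end)
{
    assert(start <= end);
    /* volatile keeps the compiler from turning the loop into a memset
     * call, which may not exist in a freestanding kernel. */
    for (volatile uint8_t *p = start; p < end; ++p)
        *p = 0;
}
```

This has to run on a valid stack and before any C code that reads a zero-initialized global.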
The Multiboot standard says that the boot loader will clear the .bss section for you - in section 3.1.3: "bss_end_addr Contains the physical address of the end of the bss segment. The boot loader initializes this area to zero"
I don't know what GRUB does if you rely on the fact that it can parse ELF files and don't specify fields like bss_end_addr, though. I'm fairly sure it clears it in this case too, but I'm using Multiboot 2 for my OS so the behaviour could be different.
Ok, good to know.
It certainly doesn't do that unless you tell it to (using the address tag), and neither this example nor my hobby kernel uses that.
So the .bss must be cleared, or the bootloader must be told to do so.
grub-core/loader/multiboot_elfxx.c has a function named grub_multiboot_load_elf32/64 which actually loads the segments of the ELF file. A segment has two fields defining its size: filesz (the number of bytes to copy from the file) and memsz (its actual size once loaded). If memsz is greater than filesz, it zeroes the trailing bytes:
if (phdr(i)->p_filesz < phdr(i)->p_memsz)
  grub_memset ((grub_uint8_t *) source + phdr(i)->p_filesz, 0,
               phdr(i)->p_memsz - phdr(i)->p_filesz);
LOAD off 0x0000000000020000 vaddr 0xffffffff8011f000 paddr 0x000000000011f000 align 2**12
filesz 0x0000000000004be0 memsz 0x0000000000017678 flags rw-
grub_multiboot_load_elf32/64 is called in the case when the address tag is not present, so the .bss section will be cleared by GRUB in this case as well.
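To make the filesz/memsz rule concrete, here is a minimal standalone sketch in C of the same idea. It is not GRUB's code; load_segment and its parameters are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy filesz bytes from the file image and zero the rest up to memsz,
 * which is the rule the GRUB snippet above applies to a loadable segment. */
void load_segment(unsigned char *dest, const unsigned char *file_data,
                  size_t filesz, size_t memsz)
{
    assert(filesz <= memsz);
    memcpy(dest, file_data, filesz);
    if (filesz < memsz)
        memset(dest + filesz, 0, memsz - filesz);
}
```

For the segment in the objdump output above, filesz is 0x4be0 and memsz is 0x17678, so the zeroed tail is 0x12a98 bytes. That tail is where .bss lives.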
In particular, setting up interrupt handlers, paging, and getting a PIC set up is pretty neat.
Please use a virtual machine instead of doing this on your primary machine.
You eliminate the risk of messing up your machine. Also, if you set up the VM properly, you get a debugger.
I also really enjoyed James Molloy's OS kernel development tutorial at http://www.jamesmolloy.co.uk/tutorial_html/, which takes you from "Hello World" to some real toy OS kernel implementation.
- Disabling interrupts and paging (which also has the side effect of flushing the TLB) to copy memory around by physical address. This could be done without disabling them by mapping all of physical memory into virtual memory instead (possible in 64-bit mode, but in 32-bit there isn't enough room when your PC has a similar amount of RAM to virtual memory space, in which case you could map smaller parts of it as needed).
- Moving the stack around to get around the fact that GRUB doesn't set ESP to some well-defined value (instead of defining the stack yourself, which would be much more robust) and then attempting to rewrite the base pointers to fix it. For example, his code can't tell the difference between a pointer and an integer that just happens to have a value in the range of the pointers, and will happily rewrite both. Also, as ESP isn't defined by the Multiboot standard, you could be using any location at all as the stack (such as some memory address that doesn't exist, or your kernel's code itself, or some memory-mapped area for a piece of hardware, etc.), all of which will mean things go wrong. It's better to just set ESP yourself before you enter C; see another of my comments on this submission.
There's actually a newer and much better version of JamesM's tutorials on GitHub, but I believe they aren't quite finished.
I have seen James's tutorial a while back, and I agree. He does a great job.
Working on getting it to the point where it can fully compile itself now, and hope to get there over the next couple of months.
FYI, I recommend using the precedence climbing algorithm if you do need to do expression parsing: http://eli.thegreenplace.net/2012/08/02/parsing-expressions-...
It fits very well into a full recursive descent parser and is much more efficient and flexible than hardcoding in productions for handling the operators (and you can dynamically add new ones easily).
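In case it helps, here is a minimal sketch of precedence climbing in C. It evaluates integer expressions directly instead of building a tree, and the operator table, function names, and the choice of '^' as a right-associative power operator are all mine, not from the linked article:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Operator table: adding an operator means adding a row here, not a new
 * grammar production. Names and layout are illustrative. */
struct op_info { char sym; int prec; int right_assoc; };
static const struct op_info ops[] = {
    {'+', 1, 0}, {'-', 1, 0}, {'*', 2, 0}, {'/', 2, 0}, {'^', 3, 1},
};

static const struct op_info *find_op(char c)
{
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; ++i)
        if (ops[i].sym == c)
            return &ops[i];
    return NULL;
}

static void skip_spaces(const char **s)
{
    while (isspace((unsigned char)**s))
        ++*s;
}

static long parse_expr(const char **s, int min_prec);

/* An atom is a non-negative integer literal or a parenthesized expression. */
static long parse_atom(const char **s)
{
    long v = 0;
    skip_spaces(s);
    if (**s == '(') {
        ++*s;
        v = parse_expr(s, 1);
        skip_spaces(s);
        assert(**s == ')');
        ++*s;
        return v;
    }
    assert(isdigit((unsigned char)**s));
    while (isdigit((unsigned char)**s))
        v = v * 10 + (*(*s)++ - '0');
    return v;
}

static long apply(char op, long a, long b)
{
    long r = 1;
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': assert(b != 0); return a / b;
    default:  /* '^' with a non-negative exponent */
        assert(b >= 0);
        while (b-- > 0)
            r *= a;
        return r;
    }
}

/* Consume operators whose precedence is at least min_prec. */
static long parse_expr(const char **s, int min_prec)
{
    long lhs = parse_atom(s);
    for (;;) {
        const struct op_info *op;
        skip_spaces(s);
        op = find_op(**s);
        if (!op || op->prec < min_prec)
            return lhs;
        ++*s;
        /* For a left-associative operator the right side must bind tighter;
         * for a right-associative one it may bind equally. */
        lhs = apply(op->sym, lhs,
                    parse_expr(s, op->right_assoc ? op->prec : op->prec + 1));
    }
}

long eval(const char *text)
{
    return parse_expr(&text, 1);
}
```

The whole grammar for binary operators is the one loop in parse_expr; everything else in a recursive descent parser (statements, calls, and so on) can call it wherever an expression is expected.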
edit: I misread. 0x07 is intentional, 0x02 was mentioned as an alternative. Good post!
Here's what he posted:
Not an error. 0x02 means green character on black background. 0x07 means light grey character on black bg. It's explained with one, and coded with another.
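For anyone wondering where those numbers come from, a small sketch in C of how a VGA text-mode attribute byte is put together (the enum and function names are just for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Standard VGA text-mode colour numbers (only the ones used here). */
enum vga_color { VGA_BLACK = 0x0, VGA_GREEN = 0x2, VGA_LIGHT_GREY = 0x7 };

/* Attribute byte: background in the high nibble, foreground in the low. */
uint8_t vga_attr(enum vga_color fg, enum vga_color bg)
{
    assert((unsigned)fg <= 0xF && (unsigned)bg <= 0xF);
    return (uint8_t)((bg << 4) | (fg & 0x0F));
}

/* A text-mode cell is 16 bits: character in the low byte, attribute in
 * the high byte. */
uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)((uint8_t)ch | ((uint16_t)attr << 8));
}
```

So vga_attr(VGA_GREEN, VGA_BLACK) gives 0x02 and vga_attr(VGA_LIGHT_GREY, VGA_BLACK) gives 0x07, matching the two values discussed.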
- The x86 CPU begins execution at the physical address [0xFFFFFFF0]
- The bootloader loads the kernel at the physical address [0x100000]
- The BIOS copies the contents of the first sector to physical address [0x7c00]
Yes, there are x86 instruction manuals, colloquially known as the "Intel manuals".
That's roughly ten volumes of documentation, several thousand pages in total. It is quite a good example of well written and informative technical documentation.
Other resources include http://www.brokenthorn.com/Resources/OSDevIndex.html
There is also the JOS kernel, part of an MIT course. There were instructions to build a special version of Bochs (a PC emulator) and run the various kernels you develop in it.
* Real mode has 1MB of memory and is 16 bit.
* Flat protected mode which is 32 bit and takes interrupts
* Segmented Protected mode which is 64 bit (w/ 32 bit emulation) and is used by operating systems to protect memory
In this tutorial, your boot mode depends on what GRUB boots you in.
A lot different. You can see the boot code of my x86_64 hobby kernel project.
The reason is that GRUB/Multiboot protocol is actually doing most of the machine initialization, but it can only set up 32 bit mode. 64 bit mode is missing partly because standardizing the Multiboot protocol is dragging behind, partly because there's no one correct way to do this as you can't have identity mapped memory in 64 bit mode (unlike 32 bit mode).
If you read the OSDev wiki, you can find examples of doing machine initialization "from scratch", i.e. after the PC BIOS (or UEFI). This involves arcane details about the x86 machine like setting up something called the A20 line (a hack that gates access to memory above 1 megabyte, utilizing a spare pin from the keyboard controller!), etc. Dealing with this shit is not time well spent. (UEFI is a lot easier in some ways, harder in others.)
This means that the boot code of your kernel must set up long mode, create an initial page table for virtual memory, etc. Here's my limited 64 bit boot code that sets up one 2 megabyte page.
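To give an idea of what "one 2 megabyte page" means in terms of table entries, here is a rough sketch in C. It is not the boot code referred to above; the function name and the idea of passing the tables' physical addresses as parameters are assumptions made so it can run as an ordinary program:

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_HUGE     (1ULL << 7)   /* in a page-directory entry: 2 MiB page */
#define ENTRIES      512

/* Fill three 4 KiB-aligned tables so that virtual 0..2 MiB maps to
 * physical 0..2 MiB. pdpt_phys and pd_phys are the physical addresses at
 * which the second and third tables will live (parameters of this sketch). */
void map_first_2mib(uint64_t pml4[ENTRIES], uint64_t pdpt[ENTRIES],
                    uint64_t pd[ENTRIES], uint64_t pdpt_phys, uint64_t pd_phys)
{
    assert((pdpt_phys & 0xFFF) == 0 && (pd_phys & 0xFFF) == 0);
    for (int i = 0; i < ENTRIES; ++i)
        pml4[i] = pdpt[i] = pd[i] = 0;
    pml4[0] = pdpt_phys | PTE_PRESENT | PTE_WRITABLE;
    pdpt[0] = pd_phys   | PTE_PRESENT | PTE_WRITABLE;
    pd[0]   = 0x0       | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE;
}
```

The tables alone don't switch modes. The boot code still has to enable PAE in CR4, load the top-level table's address into CR3, set the long mode enable bit in the EFER MSR, turn on paging in CR0, and far-jump to a 64-bit code segment.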
Going to 64 bit mode is not that much more code, but it will make kernel development more painful. In particular, switching CPU modes messes up the GDB debugger which must be patched to work at all. And there's a bit more work involved in all the little things that come with it.
So for educational purposes it would make more sense to stick to 32 bit mode than deal with the nitty gritty details of 64 bit long mode.
There are some bootloaders that are commonly used in ARM, such as U-Boot. Perhaps that could be used to get started.
Bootsectors are similar. You can't just install grub to a disk image. (You have to losetup it, at the very least, which implies root. Why can't I just install to a file?)
You want a script/build system that allows you, at the very least, to:
2. Run build system/script.
3. Fire up qemu or similar.
You can't be rebooting. Ideally, it'd be great to do this in userspace.
That said, if you're starting out, just do [boot sector] + [kernel] = tada image until you need to do otherwise. (Really, do whatever works and is easy.)
That said, I've found a few somewhat helpful tools.
fuseloop takes a file and offset/size, and exposes a single file. If you have a partitioned disk image, then you can feed it the partition offset, and it gives you back something you can format as an FS (e.g., you can put an ext2 filesystem on it).
Then there's fuse.ext2, which mounts an ext2 FS on FUSE, so non-root usable again. Note that I'm linking to my fork of it, since the original didn't build for me, but I didn't write it. (Which I fixed, and sent a pull request, but never heard back.)
Finally — and sorry to peddle my own stuff again — I wrote a Python library for dealing with the MBR. I use it to figure out offsets and sizes in a disk image.
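The offset/size lookup is simple enough to sketch in C as well. This is not the Python library mentioned above, just an illustration of the classic MBR layout, assuming 512-byte sectors; the struct and function names are made up:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE      512u
#define MBR_TABLE_OFFSET 446u   /* four 16-byte entries start here */
#define MBR_ENTRY_SIZE   16u

struct partition { uint8_t type; uint64_t byte_offset; uint64_t byte_size; };

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode primary partition `index` (0..3) from a 512-byte MBR sector.
 * Returns 0 on success, -1 if the signature is missing or the slot is empty. */
int mbr_partition(const uint8_t mbr[SECTOR_SIZE], int index,
                  struct partition *out)
{
    const uint8_t *e;
    assert(index >= 0 && index < 4);
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;
    e = mbr + MBR_TABLE_OFFSET + (unsigned)index * MBR_ENTRY_SIZE;
    if (e[4] == 0x00)               /* type 0 marks an unused entry */
        return -1;
    out->type = e[4];
    out->byte_offset = (uint64_t)read_le32(e + 8) * SECTOR_SIZE;   /* LBA start */
    out->byte_size   = (uint64_t)read_le32(e + 12) * SECTOR_SIZE;  /* sector count */
    return 0;
}
```

byte_offset and byte_size are the two numbers a tool like fuseloop needs to expose a single partition from a disk image.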
I've had a bit of fun writing a boot loader, and I've managed to get it to load up its stage 2 and switch to 32-bit pmode. Had a fun error where a division instruction was throwing things into a triple fault, even though it wasn't dividing by zero (which was the first thing I checked). The disk layout is currently:
[boot sector] [stage 2] [kernel] [ partitions, FS, real data, etc. ]
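On the division fault mentioned above: the details of that particular bug aren't given here, but one well-known way for x86 DIV to fault without a zero divisor is quotient overflow, for example when DX holds leftover garbage before a 16-bit divide. Whether that was the cause in this case is an assumption. A small C model of the rule (the function name is hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the 16-bit form of x86 DIV: DX:AX (32 bits) divided by a 16-bit
 * operand, with the quotient going to AX. The CPU raises a divide error
 * (#DE) when the divisor is zero or when the quotient does not fit in
 * 16 bits. */
bool div16_would_fault(uint32_t dx_ax, uint16_t divisor)
{
    if (divisor == 0)
        return true;
    return dx_ax / divisor > 0xFFFFu;
}
```

With no exception handlers installed, as in an early boot loader, an unhandled #DE escalates to a triple fault and the machine resets.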
You can use the kpartx command for this. This site has a good overview: http://nfolamp.wordpress.com/2010/08/16/mounting-raw-image-f...
It's pretty basic, currently just boots up and prints the memory map.
It's written in C++11 with the aim to be as clear as possible: https://github.com/thasenpusch/simplix
Have a look :-).
I'll definitely be playing around with this. Thanks!
I'm just curious if another language can be used (C++, Go, Rust).
If you get to that point, then you have the fact that the C ABI is fairly standard, extremely easy to use, and well-known, so it makes it easy to intermingle assembly and C. If you did it in C++, you'd definitely want to extern "C" any symbol asm is going to call so it isn't mangled. Rust should be the same; no idea on Go. (You'd really also want to extern "C" any user-space system calls, because the name mangling might get in the way.) For the last point, you have to be able to use pointers and write to arbitrary memory locations, which is easy for C and C++, possible in Rust, but I don't know about Go.
IMO, the code displayed in this example is really hardly a kernel (it is, but it doesn't really do anything). For a more complex kernel, you may get a benefit from using a language other than C, but for something this simple C lowers the complexity to get a working example with minimal issues going.
Pascal might be another good alternative, as it also doesn't require a run-time. I don't think Go will ever (officially) support this kind of thing; Rust most likely will.
For an example of something that's not C, have a look at MaRTE OS, implemented in Ada.
It might actually be interesting to take these techniques and apply them to something else. Maybe a demo, à la the demoscene, that runs on the bare metal. Maybe implement a game that doesn't have the overhead of an OS.
This is less a finished product and more an inspiration. So often I think that people miss that about things. Sometimes things aren't done, sometimes they're just beginning.
I'm in CMU's operating systems class right now and that was one of our projects (our second). The first was writing a stack-tracing debug library, the third was a user-space thread library built on top of a particular kernel spec, and then the fourth was to build a kernel basically from scratch, using our thread library as a test program.
Here's a link to the spec for the game we built on the bare metal: https://www.cs.cmu.edu/~410/p1/proj1.html
That fourth project, or "p3" as it's known around here, is known to be a killer. It's tough to write a whole kernel in 6 weeks (+ 1 week of break), especially when it's expected to have a full virtual memory system, kernel tasks/threads for concurrent programming, program loading, interrupt handling, etc, even with a partner.
This isn't even a microkernel. This would at best be an example of firmware on an Intel/AMD x86/x64 booted from GRUB that prints to the console.
This code, however, doesn't do a single thing that other software expects even a microkernel to do (provide for basic scheduling, memory management if an MMU is available [which it is in this case], and IPC/FS).
The key part of any kernel can be expressed from this Exokernel definition:
Exokernels are tiny, since functionality is limited to ensuring protection and multiplexing of resources, which are vastly simpler than conventional microkernels' implementation of message passing and monolithic kernels' implementation of abstractions.
It also applies here: There's generally a userspace scheduler which programs can register themselves with. Or they can implement their own.
It would be fun to see how far one could go with modern hardware. Writing your own driver for a modern graphics card sounds absolutely terrifying (and fun).
CGA graphics(!), and pushed the monophonic speaker on a basic XT further than anything I'd seen at the time.
Unplayable on "turbo" mode :-)
Let me define what a piece of code needs to do to be a "kernel": it needs to manage some resources to allow other programs to run using those resources. E.g. memory, cpu time, I/O peripherals etc.
You know a kernel when you see one.
"With the aid of the firmware and device drivers, the kernel provides the most basic level of control over all of the computer's hardware devices. It manages memory access for programs in the RAM, it determines which programs get access to which hardware resources, it sets up or resets the CPU's operating states for optimal operation at all times, and it organizes the data for long-term non-volatile storage with file systems on such media as disks, tapes, flash memory, etc."
A program that prints "hello world" doesn't come close to meeting that description.
At most colleges whose course numbering system I've seen, the kind of introductory course that the idiom "101" refers to would actually be "1"; at many of them, "101" would be an upper division class. It's an idiom that may have connected to some colleges' numbering system at some time, but it mostly exists independently now, and actually going to college doesn't make its intended meaning any more obvious.
I'm guessing that 101 specifically is an Americanism.
This post is just a small stepping stone on the way to a kernel (assuming the poster continues).
(technically it was two lines of text printing alternately to prove the scheduler worked, but close enough)
We already have a whole bunch of operating systems, many of them free.
One of the frequent problems with a lot of free/open source software folks is that they lack direction. This type of thing where we do stuff just to do stuff probably won't fly in one of the leading tech companies.
Why don't you figure out a real problem people have and look for ways to solve that, instead of just doing random "interesting" stuff that wastes people's valuable time?
It's a way to load a ring-0 application from GRUB. Pretty cool, but not a kernel.
Did MS-DOS have a kernel?