Hacker News new | past | comments | ask | show | jobs | submit login

creating recursive page tables is bad idea.


It's not a general solution- you often need to access the page tables of other address spaces, and those aren't picked up by a recursive mapping.

And while a single recursive mapping is a relatively small fraction of the address space, you probably don't want to do it for all address spaces simultaneously.

So you need something more general anyway, and you may as well just use it all the time.

It is possible to access other address spaces, it is just a bit more complicated:

- Set the recursive entry of the active level 4 table to the (recursive) level 4 table of the other address space. - Now the recursive addresses point to the other address space, so you can access its tables freely. - The difficult part is to switch back to the active address space again because can no longer access the original level 4 entry. This can be solved with a temporary mapping to the active level 4 table or by using different level 4 entries for the recursive mapping of active and inactive address spaces.

However, I'm no longer sure whether recursive page tables are a good choice for the blog since they're complicated, difficult to learn and not portable. See my reply to DSMan.

this does seem really complicated. my guess is that it doesn't save on TLB space since the hierarchy is kind of baked into that cache. it saves a little bit of memory?

It really doesn't save anything, and makes some operations like finding the physical address for a virtual address more complicated. His description of "identity mapping" is also a bit too focused on page-tables and kinda misses the bigger picture. x86_64 has a 48-bit memory space, so you can literally get away with identity-mapping all of your physical memory into a higher-half location with no issues, because the virtual-address space is much larger then the amount of physical memory a 64-bit computer is going to have. Ergo, even if you went with his described approach and just made a single identity mapping for each page table, it would still use so little virtual memory space you could just allocate from elsewhere for your memory-mapped files and such with no worry of them touching eachother. And by doing a single large allocation of all of physical memory, you can use very large page sizes to setup the identity map, significantly reducing the number of pages required.

There is also a presumption that you can only choose one mapping technique, which isn't the case with x86_64 since it has such a large virtual address space and you can map the same page in more then one location. So, as long as you keep track of what's happening with your pages, you can have an allocator that just allocates contiguous pages in the identity mapping, and also have an allocator that maps separate physical pages into a contiguous virtual mapping, and both allocators can coexist perfectly fine as long as you ensure pages are only allocated to one purpose at a time (Which isn't very hard to do. You can even have your virtual allocator just call your physical allocator).

Thanks for the feedback!

I chose recursive page tables because the underlying mapping is very simple as it only requires to map a single level 4 page table entry. Thus, the bootloader does not need to force any address space layout on the kernel. If the kernel wants to use a different mapping scheme, it can use the recursive page table to perform that mapping and then unmap a single entry to undo the recursive mapping.

That being said, I like your proposal to map the complete physical address space to some higher-half location. It would allow to introduce a simple `phys_to_virt` function, which completely avoids the bitwise operations required for recursive page tables. It would also make this blog post much easier, which is always a good thing.

The question is how we can support that approach in the `bootloader` and `x86_64` crates without breaking users who wish to continue using recursive paging. I opened an issue in the blog repository with some ideas: https://github.com/phil-opp/blog_os/issues/545

Admittedly, when I made that comment I had not had a chance to fully read your article (And it's a good article by the way, I really like this series). I didn't realize that the mappings you're creating are just for your bootloader and the OS is going to throw them away. On that note, there's lots of things I want to mention/point out, hopefully it forms something of a coherent thought.

First, I would recommend taking a look at how bootloaders like GRUB handles this, since they do it pretty dead simple and make it easy for kernels to know exactly what situation they're going to be in right when they start. For GRUB, when loading a `x86_64` ELF kernel, it loads the kernel's segments at whatever physical address they are linked at (ELF records separate physical and virtual addresses for segments), and then identity maps everything and jumps to the kernel's entry point. Note it doesn't do anything else, like setup a stack, it assumes the kernel will do that if it wants one.

At that point, there's a problem, because almost all kernels are going to be higher-half and thus linked at a different virtual address then their physical address (But GRUB ignored the virtual addresses). Thus, the kernel's entry point needs to have some type of stub which can run at the identity mapped location, create a new page table with both the identity mapping and the higher-half mapping, and then jump to the actual kernel at the virtual location.

Now with that, I want to point out that the kernel does not need to use the page table that GRUB set up, and in fact GRUB doesn't even provide a good way to do that. But they don't have too - the kernel can simply load a new one into `CR3` to completely replace GRUBs page table. And once they do that, they can jump to the higher-half and then remove the identity mappings. And they can do this without having to do any allocation because they can simply make the initial page tables static variables, and then they'll already be in memory ready to be used once GRUB loads the kernel into the correct physical addresses.

You can see that in my kernel here[0], though it's a fair amount different being that it's 32-bit. The same idea applies though - my code is linked at a higher-half address, but GRUB jumps to the entry point while running at the lower physical address. This means that if I were to jump straight to my C code, it would be completely broken, because it was compiled assuming it was running at `0xC0100000`, not `0x00100000`. The assembly stub fixes stuff up just enough so that we can run at `0xC0100000`, and then later in the kernel we setup a more involved mapping.

This gets to another important point that I couldn't really figure out - how is your kernel linked? I looked at `cargo-xbuild` but I couldn't figure out it. I don't quite know what linking looks like for Rust, but I would think at some point you drop down to the LLVM linker and should have some linker script like this[1]. The `AT` specifies the physical address for those segments to be placed at, where the `. = KMEM_LINK` specifies the location the code will run from - that's how you achieve a higher-half kernel. I'm guessing that, if you don't have any linker scripts, then it just gets linked at whatever addresses the linker happens to pick for normal executables, which will cause you problems down the road since your kernel will be located at a weird location.

And now that I'm looking at the code further, I'm guessing you may have already run into this problem, since I noticed in `bootloader`[2] that you ignore the ELF file's physical addresses[3] and instead just load the kernel at a constant physical address. I would highly recommend against this, the ELF kernel should have correct physical addresses and if you ignore them then the kernel will not be able to correctly create a new page table itself, since it will setup incorrect mappings based on the assumption of where it thinks the kernel is located in physical memory vs. where it was actually loaded. The identity mapping will also not working, since the stub to load the new page tables will attempt to use incorrect physical addresses.

This does make the bootloader a fair amount more challenging though, since you have to move the code that loads the kernel into memory into an piece of memory unused by the kernel. But I would say for a first pass, simply moving the bootloader to a somewhat higher physical address and then assume the kernel will be at a low physical address (and panic if it's not) would be fine. Also, you still need to place the `BootInfo` somewhere, but you can just pick any random physical memory location. As long as the kernel knows the physical address of where it is, it can avoid writing over it (Or more likely, the kernel will just copy the entire thing into a separate structure inside the kernel to keep things simple). Either way, the kernel can handle that pretty easily.

> The question is how we can support that approach in the `bootloader` and `x86_64` crates without breaking users who wish to continue using recursive paging.

I don't really have a good answer for you as for avoiding breakage (I think it's somewhat impossible here), but I would argue that any kernels that want to setup their own recursive setup should simply do it themselves. They can write the entry point to their kernel to do whatever they want, including setting up a recursive mapping.

If you have any questions please feel free to ask them! I could also try to condense this somewhat and put it on the `github` issue if you'd like. But to respond directly to the ideas in the `github` issue:

> The recursive_level_4_table feature enables the recursive mapping of a level 4 entry that we currently have. Provides a RECURSIVE_LEVEL_4_TABLE_ADDR constant to the kernel. > The map_physical_memory feature maps the complete physical address space to some virtual address. Provides a PHYSICAL_MEMORY_OFFSET constant to the kernel.

If you respect the physical addresses in the ELF file and setup an identity mapping with them, then neither of these are necessary - the kernel doesn't need the `PHYSICAL_MEMORY_OFFSET` because it already knows it's at the locations specified in the ELF file, and the kernel doesn't need the location of the current page table at all because it can just replace it with a completely new one that also has the correct identity mappings for the current physical memory location.

[0] https://github.com/mkilgore/protura/blob/master/arch/x86/boo...

[1] https://github.com/mkilgore/protura/blob/master/arch/x86/boo...

[2] https://github.com/rust-osdev/bootloader/blob/master/src/mai...

[3] https://docs.rs/xmas-elf/0.6.2/xmas_elf/program/enum.Program...

Thanks for the detailed reply!

I don't like the traditional GRUB approach because it leaves so much work for the kernel (assembly entry point, stack setup, creation of new page tables, etc). Why not treat the kernel like any normal application and load it like a normal ELF loader? This has the following advantages:

- Each loadable program segment is mapped at its specified virtual address. To create a higher-half kernel, just set the desired virtual address in your linker script. If you don't need a higher-half kernel yet, simply use the default linker script. This is what we do at the moment since we don't have any userspace yet. Like in normal applications, the specified physical addresses do not matter.

- There's no assembly code required at the entry point. You can start directly in Rust because there's already a stack and everything lives at its expected address.

- Less startup overhead, since the page tables are only set up once in the bootloader. With the GRUB approach, the kernel needs to recreate the page tables since GRUB only created rudimentary mappings.

Like you said, the disadvantage is that the kernel doesn't know its physical addresses. But it can still find them out by traversing the page tables.

Overall, I think it's worth it because it makes the startup process much easier from the kernel's perspective. I believe that this is especially important for a teaching project because readers shouldn't have to understand assembly, linker scripts, virtual/physical addresses, higher half kernels, and page tables before they can even boot their Rust kernel. For people that don't like the bootloader's decisions we can add more configuration options or they can use a custom version of the bootloader (e.g. the nebulet project does this [1]).

[1]: https://github.com/nebulet/nebulet/blob/c0f3c993a87e72a95f85...

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact