That link is in the text "See the spec here." in the word "here".
I usually use adblock but I used the disable-all-extensions extension to debug something and forgot to turn them back on.
That documentation is usually.... lacking
> You can't fit a filesystem parser into 1 sector and have anything other sensible in there
You don't use a single sector. You use a 2 stage bootloader that uses a reserved FAT sector. In FAT12 (at least) you can specify how many reserved sectors there are when formatting. A reserved sector comes directly after the first sector. You can do a full FAT12 implementation in assembly in two sectors.
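To make the reserved-sector point concrete, here is a minimal Python sketch that reads the reserved sector count out of a FAT boot sector. The field offsets follow the standard FAT BIOS Parameter Block layout; the names `reserved_layout` and `make_demo_sector` are made up for illustration, and the demo sector is synthetic rather than a real formatted disk. Note that the count includes the boot sector itself, so a value of 2 leaves exactly one extra sector for a second stage.

```python
import struct

SECTOR_SIZE = 512
BPB_OFFSET = 11  # the BPB fields used here start right after the 3-byte jump and 8-byte OEM name


def reserved_layout(boot_sector: bytes) -> dict:
    """Read the BPB fields that decide where the first FAT begins."""
    if len(boot_sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte boot sector")
    # Little-endian: bytes per sector (u16), sectors per cluster (u8),
    # reserved sector count (u16), number of FATs (u8).
    bytes_per_sector, sectors_per_cluster, reserved, num_fats = struct.unpack_from(
        "<HBHB", boot_sector, BPB_OFFSET
    )
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved,
        "num_fats": num_fats,
        # The reserved count includes the boot sector, so this is what is
        # left over for a second stage.
        "second_stage_sectors": reserved - 1,
        # The first FAT starts immediately after the reserved region.
        "first_fat_lba": reserved,
    }


def make_demo_sector(reserved: int) -> bytes:
    """Build a synthetic boot sector (hypothetical values) for the demo."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<HBHB", sector, BPB_OFFSET, 512, 1, reserved, 2)
    sector[510:512] = b"\x55\xaa"  # boot signature
    return bytes(sector)


if __name__ == "__main__":
    print(reserved_layout(make_demo_sector(2)))
```

With two reserved sectors, stage 1 lives in LBA 0, stage 2 in LBA 1, and the first FAT starts at LBA 2.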
> using bios generally (on x86 atleast). and you can understand how to do it from this tutorial.
You're going to need to do a lot more reading. Even just looking up the interrupts you need is a task. I've found nothing that does a good job explaining the 1, 2, and 3s of how to get all the way to booting from a formatted disk.
There is unlikely to be a single thorough body of documentation on this sort of thing. IMHO, the best resource would be the numerous open source bootloaders that are available.
There isn't much call for FAT12 these days...
That said: if you're curious and want to learn, I have no objections to digging into stuff, even "obsolete" stuff like BIOS boot :)
Either way, it was definitely fun.
and for an example of how much stuff you can actually put inside some 440 bytes, the base reference is mbldr:
You may find it interesting.
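For a sense of where the 440 bytes come from, here is a rough Python sketch that splits a 512-byte MBR into its parts. It assumes the common modern layout (440 bytes of code, a 4-byte disk signature, 2 more bytes, four 16-byte partition entries, then the 0x55 0xAA signature); older layouts give the code area 446 bytes. The function name `split_mbr` is just for illustration.

```python
import struct

MBR_SIZE = 512
CODE_AREA_SIZE = 440        # bootstrap code area in the modern layout
DISK_SIGNATURE_OFFSET = 440
PARTITION_TABLE_OFFSET = 446
PARTITION_ENTRY_SIZE = 16
PARTITION_ENTRY_COUNT = 4
BOOT_SIGNATURE_OFFSET = 510


def split_mbr(sector: bytes) -> dict:
    """Split an MBR sector into code area, disk signature and partition entries."""
    if len(sector) != MBR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[BOOT_SIGNATURE_OFFSET:] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")
    (disk_signature,) = struct.unpack_from("<I", sector, DISK_SIGNATURE_OFFSET)
    partitions = [
        sector[
            PARTITION_TABLE_OFFSET + i * PARTITION_ENTRY_SIZE:
            PARTITION_TABLE_OFFSET + (i + 1) * PARTITION_ENTRY_SIZE
        ]
        for i in range(PARTITION_ENTRY_COUNT)
    ]
    return {
        "code": sector[:CODE_AREA_SIZE],
        "disk_signature": disk_signature,
        "partitions": partitions,
    }


if __name__ == "__main__":
    demo = bytearray(MBR_SIZE)
    demo[BOOT_SIGNATURE_OFFSET:] = b"\x55\xaa"
    parts = split_mbr(bytes(demo))
    print(len(parts["code"]), "bytes available for boot code")
```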
Basically you won't bother writing a bootloader with UEFI, since it already provides that feature. Instead you'll get right at working on your OS.
which is certainly not an easy tutorial to follow. Seems like more robustness tends towards a greater barrier to entry.
It's a lot easier and nicer to use either UEFI or a Multiboot-protocol bootloader (GRUB, or qemu/bochs built-in bootloader). That also means you will be starting in long mode or protected mode.
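As an illustration of how little the Multiboot route asks of you, here is a hedged Python sketch of the 12-byte Multiboot (version 1) header that a loader such as GRUB looks for near the start of the kernel image. In a real kernel this normally lives in an assembly file and the linker script places it early in the image; the Python version only shows the arithmetic. The helper names are made up; the magic value and the rule that the three fields sum to zero modulo 2^32 come from the Multiboot specification.

```python
import struct

MULTIBOOT_MAGIC = 0x1BADB002
FLAG_PAGE_ALIGN = 1 << 0   # ask the loader to align modules on page boundaries
FLAG_MEMORY_INFO = 1 << 1  # ask the loader for memory information


def multiboot_header(flags: int) -> bytes:
    """Pack magic, flags and checksum as three little-endian 32-bit words."""
    checksum = (-(MULTIBOOT_MAGIC + flags)) & 0xFFFFFFFF
    return struct.pack("<III", MULTIBOOT_MAGIC, flags, checksum)


def is_valid_header(header: bytes) -> bool:
    """Check the magic and that the three fields sum to zero modulo 2**32."""
    magic, flags, checksum = struct.unpack_from("<III", header, 0)
    return magic == MULTIBOOT_MAGIC and (magic + flags + checksum) & 0xFFFFFFFF == 0


if __name__ == "__main__":
    header = multiboot_header(FLAG_PAGE_ALIGN | FLAG_MEMORY_INFO)
    print(header.hex(), is_valid_header(header))
```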
In order to work with bare metal efficiently, you need to get a debuggable and bootable ELF image up and running as soon as possible. Once you can attach a remote GDB debugger to Qemu, you get things done much faster than using the Qemu/Bochs monitor/debugger.
Anything more complex than a hello world example is easier with UEFI or multiboot than BIOS boot sectors. You will need a linker script and a build script but things will be much easier after the first steps.
I made this pull request for someone else's bare metal project to add debuggable ELF images. It's a small but practical example:
It might be tooling. A 32-bit assembler (like gas) will turn `mov %esp,%ebp` into `89 e5` while `mov %sp,%bp` becomes `66 89 e5` -- the former being correct when actually in 16-bit.
> And there is 0xFFFF limit on segment descriptors in real mode anyway.
He is using nasm with "bits 16" directive, so 66 prefix will be emitted for "mov ebp,esp". gas with 32-bit target is totally unrelated to this discussion.
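The prefix behaviour is easy to sketch. The toy Python decoder below handles only the byte sequence under discussion (`89 e5` with an optional `66` prefix) and shows how the same bytes read differently depending on the mode the CPU is in. The function name is invented for illustration and this is nowhere near a real disassembler.

```python
def decode_mov_bp_sp(code: bytes, mode_bits: int) -> str:
    """Decode `89 e5`, optionally prefixed by 0x66, for a 16- or 32-bit code segment."""
    if mode_bits not in (16, 32):
        raise ValueError("mode_bits must be 16 or 32")
    has_prefix = code[:1] == b"\x66"
    body = code[1:] if has_prefix else code
    if body != b"\x89\xe5":
        raise ValueError("this sketch only understands 89 e5")
    # 0x66 is the operand-size override: it flips the operand size away
    # from the default of the current mode.
    operand_bits = mode_bits
    if has_prefix:
        operand_bits = 32 if mode_bits == 16 else 16
    return "mov ebp, esp" if operand_bits == 32 else "mov bp, sp"


if __name__ == "__main__":
    for raw in (b"\x89\xe5", b"\x66\x89\xe5"):
        for mode in (16, 32):
            print(raw.hex(), f"in {mode}-bit mode:", decode_mov_bp_sp(raw, mode))
```

So `66 89 e5` is the right encoding for `mov ebp, esp` in 16-bit code, and the same three bytes would mean `mov bp, sp` in 32-bit code.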
So what? MBR runs in 16-bit real mode.
So he is!
I had to download nasm to check, but that sounds useful.
locations of his binary file (which is identical to the binary I can produce with the nasm):
:0000001f 6683c402 add $0x2,%esp
:00000025 6655 push %ebp
:00000027 6689e5 mov %esp,%ebp
At the time, bootloaders could add additional functionality by staying resident and hooking various interrupts to provide services or work around bugs in the BIOS. For instance, my bootloader would translate sector mappings that would allow WinPE to boot from USB drives that would not otherwise work due to (idiotic) limitations in some BIOSes.
This is a fun example but doesn't really go very far. Handling memory and reading additional code from disk are other topics that are required for a full bootloader, but this was a nice trip down memory lane.
The OP states:
"Since our code resides at 0x7C00, the data segment may begin at 0x7C0"
0x7C00 in decimal is 31744 and 0x7C0 in decimal is 1984, and a segment is 16 bits. I understand that the CPU is hardwired to do a JMP to 0x7C00. But how did they arrive at choosing 0x7C0 for the data segment?
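The arithmetic behind the question can be checked with a short sketch. In real mode a segment register holds the address divided by 16: the CPU shifts the segment left by 4 bits and adds the offset. The function name is made up, and the sketch ignores address wraparound at 1 MiB.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return (segment << 4) + offset


if __name__ == "__main__":
    # Segment 0x07C0 with offset 0 and segment 0 with offset 0x7C00
    # both name the byte where the boot sector is loaded.
    print(hex(physical_address(0x07C0, 0x0000)))
    print(hex(physical_address(0x0000, 0x7C00)))
```

That is why 0x7C0 is a natural data segment value: with it loaded, offset 0 points at the first byte of the boot sector.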
So typically the first thing you do on entry to the boot sector is to do a far jump to change code segments. But you don't have to; if you're immediately switching to 32-bit mode it may not be worth it.
Here's my boot sector --- everyone's written one. Mine's intended to load and run raw tiny mode executables on floppy.
If those are both the same address how can the code segment and the data segment occupy the same address?
It's also possible to configure the system so that your code lives in one segment, and your data and stack live in another. That way, you get 64kB for code and another 64kB for data.
We have to put the stack in the same segment as the data because C requires the stack to be addressable; if you're not working in C, you can put the stack in a third segment and get a bit more space.
Of course, if you're willing to use pointers which contain the segment as well as the offset, so 32 bits wide, you can use multiple segments, but that raises all sorts of complexity that's not really worth thinking about these days.
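A small sketch of what such a segment-plus-offset ("far") pointer looks like, and a taste of the complexity: two different far pointers can name the same byte, so comparing them is not a plain integer comparison. The helper names are invented, and the packing (segment in the high 16 bits) is the conventional one rather than anything specific to a particular compiler.

```python
def make_far_pointer(segment: int, offset: int) -> int:
    """Pack a 16-bit segment and a 16-bit offset into one 32-bit value."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return (segment << 16) | offset


def split_far_pointer(pointer: int) -> tuple:
    """Return (segment, offset) from a packed far pointer."""
    return (pointer >> 16) & 0xFFFF, pointer & 0xFFFF


def linear_address(pointer: int) -> int:
    """Real-mode translation of a far pointer: segment * 16 + offset."""
    segment, offset = split_far_pointer(pointer)
    return (segment << 4) + offset


def same_location(a: int, b: int) -> bool:
    """Far pointers alias whenever their linear addresses match."""
    return linear_address(a) == linear_address(b)


if __name__ == "__main__":
    p = make_far_pointer(0x07C0, 0x0000)
    q = make_far_pointer(0x0000, 0x7C00)
    print(hex(p), hex(q), p == q, same_location(p, q))
```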
"We have to put the stack in the same segment as the data because C requires the stack to be addressable"
Are symbols only "reachable" if they are within the base and limit of the same memory segment, then? If that were the case, I would think that the code and the stack must be in the same segment. Why the data segment?
In Ubuntu, for example, whilst booting, Ubuntu starts loading more of the malware from the metacache, which suggests the hard drive cache may have been filled by the initial boot loader for the malware, again helping to hide the malware from detection. When booting different OS's, it works with XP, Ubuntu, Parted Magic, Kali, Tails and others. The OS's seem to actively hide the malware if you use a hex editor to scan the drive or infected files, so over time, some of that open source code has become compromised, and let's not forget the Dirty Cow exploit has been around since 2007, potentially making it possible to hack many different packages that make up the core Linux OS. It also seems to use SNMP to hack into managed switches, so whether this is getting into the Stuxnet/DuQu/DuQu2 territory remains to be seen, but I would suggest it is. This then narrows it down to one country, because in maths it's possible to calculate unknowns.
Remember, software always does what it's told, unlike humans, and herein lies its weakness!
>rewrite the make and model of the hard drives
But I certainly see what you mean; I was probably too quick to use that stock phrase. My intent here wasn't to argue, because I know that's impossible; it was more to identify the pathology to others. Would my comment have been fine without the last line? (I'm assuming that you were the one to flag it.)
*: and I mean that in the psychiatric sense, not as an insult
There's probably a way of making this point that's fine, but it would require pre-emptively distinguishing your comment from the obvious misinterpretations. Which is tedious, but such is the way communication on a large forum needs to work, or else people just react and attack.
However, having just read this https://www.facebook.com/dragosr/posts/10151655183445588 I can see he has explained much of what I was witnessing on some systems as well.
Can't rule out a modern day version of one of these https://en.wikipedia.org/wiki/Phoebus_cartel considering the Windows MSR partition I have copies of, which effectively stripes the Windows partition as well.
Some people are desperate to keep this quiet though, so perhaps suggesting what you have is your way to introduce doubt into an argument which is after all a valid debating technique.
" The OS's seem to actively hide the malware if you use a hex editor to scan the drive or infected files, so over time, some of that open source code has become compromised"
This part, in particular, I find hard to believe. You can inspect an open source system and compile everything yourself. Did you try to reproduce this behavior on a minimal system (e.g. BeagleBone) or in QEMU?