

X86 instruction encoding and the hacks we do in the kernel [pdf] - majke
http://events.linuxfoundation.org/sites/events/files/slides/bpetkov-x86-hacks.pdf

======
userbinator
The reference to the octal format (2-3-3) of the instructions is to this
document:

[http://reocities.com/SiliconValley/heights/7052/opcode.txt](http://reocities.com/SiliconValley/heights/7052/opcode.txt)

If you're at all interested in the x86 architecture, it's highly recommended
reading.
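
To give a flavour of why octal helps, here is a minimal sketch (mine, not taken from the linked document) that splits a byte into 2-3-3 bit fields. The helper name `split_233` and the `REGS32` table are made up for illustration. For one-byte opcodes such as `push`/`pop`/`mov reg, imm32`, the low octal digit is the register number, which is hard to see in hex.

```python
def split_233(byte: int) -> tuple[int, int, int]:
    """Split one byte into its 2-bit, 3-bit and 3-bit fields (high to low)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


# Register numbers 0..7 in the classic 32-bit encoding.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# 0x53 = push ebx, 0x5B = pop ebx, 0xBB = mov ebx, imm32
for opcode in (0x53, 0x5B, 0xBB):
    hi, mid, lo = split_233(opcode)
    print(f"hex {opcode:02X} = octal {opcode:03o} -> "
          f"fields {hi} {mid} {lo}, low field is {REGS32[lo]}")
```

The same split applies to the ModRM byte, where the three fields are mod, reg and r/m.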

(This used to be one of the first search results for "x86 octal instructions".
Now it's nowhere in the first 10 _pages_ of results and searching for its link
directly doesn't even cause it to appear. What happened, Google...?)

~~~
ianlevesque
Not enough adsense on the page. But more seriously the fact that it is a
usenet post copied to reocities with probably zero inbound links hurts its
ranking.

~~~
cbd1984
> Not enough adsense on the page. But more seriously the fact that it is a
> usenet post copied to reocities with probably zero inbound links hurts its
> ranking.

Also not helping: Reocities is a website hosting the Geocities dump Jason
Scott and his merry band of guerrilla archivists made back before Yahoo killed
Geocities. The point of Reocities (and Oocities, and possibly others) is to
allow people to fix their old Geocities links by changing one letter in the
domain name.

So it's not only obscure, it's an ancient document which has been rehosted, so
none of the original links to it still work.

~~~
userbinator
The thing is, Googling the link (the Reocities version, not the original
Geocities one) reveals that others have linked to it a few times before -
that's how I originally found it - and I've also linked to it from here on HN;
yet Google seems to refuse to acknowledge the existence of that page itself,
as searching for specific quoted phrases in it shows. In other words this
isn't "link rot"; it's more like "Google rot".

I think it's even more unfortunate that such gems of information are being
lost not because the sites hosting them are gone, but because search engines
are rendering them inaccessible despite the sites still existing. In fact with
things like the Internet Archive, coming across a dead link is not so bad; not
being able to know (from a simple search) that a page with such information
actually exists, but just wasn't present in the results, is far worse.

This isn't the first time I've seen Google "disappear" pages still around and
containing useful information, but it makes for a good example.

Edit: I didn't know Jason Scott was behind Reocities - I think he deserves
another donation.

------
moyix
If you're interested in reading more about the various kinds of runtime code
patching used in Linux, there was a nice paper on it at last year's Malware
Memory Forensics Workshop:

Slides:

[https://www.acsac.org/2014/workshops/mmf/ThomasKittel-Code%2...](https://www.acsac.org/2014/workshops/mmf/ThomasKittel-Code%20Validation%20for%20Modern%20OS%20Kernels.pdf)

Paper:

[https://www.acsac.org/2014/workshops/mmf/Thomas%20Kittel-%20...](https://www.acsac.org/2014/workshops/mmf/Thomas%20Kittel-%20Code%20Validation%20for%20Modern%20OS%20Kernels-acsacmmf_kittel.pdf)

------
legulere
I still wonder why AMD didn't opt for a saner encoding for 64-bit mode while
still mostly keeping assembly compatibility (32-bit binary code doesn't run in
64-bit mode anyway, except in some edge cases).

~~~
ithkuil
Perhaps it has something to do with the fact that it was easier to port
existing compilers, and perhaps even to design the first microarchitecture to
support it and give decent performance, without having to wait too long and
lose competitive advantage.

I guess that 32-bit mode could use most of the same microarchitecture and thus
guarantee that during the transition period people would still buy those new
chips.

Itanium was a good example of such a strategy that failed. However there might
have been other reasons as well.

~~~
Dylan16807
I wouldn't call Itanium 'such a strategy', because it wasn't even similar to
x86. 64-bit mode could have rearranged the instruction coding while keeping
everything from ASM up roughly the same. Fewer single-byte opcodes and bigger
ranges for prefixes, to improve density. Keeping all parts of each register
specifier together, to improve sanity. Etc.
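
For anyone wondering what "parts of register specifiers" refers to: in 64-bit mode a register number is 4 bits, with the low 3 bits in the ModRM byte and the top bit in the REX prefix. Below is a rough sketch of reassembling them, covering only the register-direct form (mod == 3) and ignoring SIB. The function name `decode_rex_modrm` is just something I made up for the example.

```python
def decode_rex_modrm(rex: int, modrm: int) -> dict:
    """Recombine register numbers split across a REX prefix and a ModRM byte.

    Sketch only: handles the register-direct case and ignores SIB/REX.X.
    """
    if rex & 0xF0 != 0x40:
        raise ValueError("not a REX prefix (expected 0100WRXB)")
    w = (rex >> 3) & 1      # REX.W: 64-bit operand size
    r = (rex >> 2) & 1      # REX.R: top bit of ModRM.reg
    b = rex & 1             # REX.B: top bit of ModRM.rm
    mod = (modrm >> 6) & 0b11
    reg = (r << 3) | ((modrm >> 3) & 0b111)
    rm = (b << 3) | (modrm & 0b111)
    return {"w": w, "mod": mod, "reg": reg, "rm": rm}


# 4D 89 C8 is "mov r8, r9": opcode 89 stores reg into r/m.
fields = decode_rex_modrm(0x4D, 0xC8)
print(fields)
```

So one register operand ends up spread over two bytes that have an opcode byte sitting between them.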

------
earlz
Heh, this reminds me of when I did some runtime patching in my own hobby OS
kernel. I used it specifically for interrupt routines so that I didn't have to
repeat this monstrous header and footer for every interrupt handler.

Also, another thing begging for runtime modification is the `int` instruction
(used to raise a software interrupt). There is literally no way to choose the
interrupt number at runtime, because it is an immediate byte in the
instruction. Your only options are either to do runtime modification to
construct the 2-byte opcode on the fly (with the second byte being which
interrupt to call), or to make a big table like `int 1; ret; int 2; ret; int
3; ret` and call into that manually. It is quite infuriating.
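
A small sketch of both workarounds, at the byte level only (it builds the machine code but doesn't execute it). The names `encode_int`, `build_int_table` and `stub_offset` are invented for this example, and the table here starts at vector 0 so the offset math stays simple.

```python
INT_OPCODE = 0xCD   # "int imm8"
RET_OPCODE = 0xC3   # near "ret"
STUB_SIZE = 3       # int imm8 (2 bytes) + ret (1 byte)


def encode_int(vector: int) -> bytes:
    """Build the 2-byte 'int n' instruction for a vector chosen at runtime."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in one byte")
    return bytes([INT_OPCODE, vector])


def build_int_table() -> bytes:
    """Build 'int 0; ret; int 1; ret; ...' for all 256 vectors."""
    return b"".join(encode_int(v) + bytes([RET_OPCODE]) for v in range(256))


def stub_offset(vector: int) -> int:
    """Offset of the 'int n; ret' stub for a vector inside the table."""
    return vector * STUB_SIZE


table = build_int_table()
start = stub_offset(0x21)
print(encode_int(0x80).hex(), table[start:start + STUB_SIZE].hex())
```

With the table approach you compute `table_base + stub_offset(n)` and call it, so nothing has to be patched at runtime, at the cost of 768 bytes.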

------
blt
The instruction format sure is interesting, to put it politely. I tried to
write a toy JIT once for compiling math expressions in a scripting language...
Didn't get very far.

