

A short introduction to the x86 instruction set encoding - penberg
http://penberg.blogspot.com/2010/04/short-introduction-to-x86-instruction.html

======
tptacek
Great article. I've "known" x86 asm since I was a teenager, but waded into the
actual encoding just a couple years back, fearing the worst based on
reputation. It's not that hard at all. Understanding how register and memory
addressing works takes you a long way towards understanding the whole encoding
model.
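
To make the register and memory addressing point concrete, here is a minimal Python sketch of splitting a ModRM byte into its three bit fields. The function name `split_modrm` and the `REG32` table are illustrative names, not taken from the article or from any library.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11   # top 2 bits: addressing mode
    reg = (byte >> 3) & 0b111  # middle 3 bits: register or opcode extension
    rm = byte & 0b111          # low 3 bits: register or memory operand
    return mod, reg, rm

# 32-bit register names indexed by their 3-bit encoding.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# 0xD8 is the ModRM byte of `89 D8`, i.e. `mov eax, ebx`.
mod, reg, rm = split_modrm(0xD8)
print(mod, REG32[reg], REG32[rm])  # 3 ebx eax
```

With `mod == 3` both operands are registers; the other three `mod` values select memory operands with no, 8-bit, or 32-bit displacement.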

Here's a Ruby library I wrote 3 years ago, which we used to bootstrap a
cross-platform debugger for reversing work:

[http://github.com/tduehr/ragweed/blob/master/lib/ragweed/ras...](http://github.com/tduehr/ragweed/blob/master/lib/ragweed/rasm/isa.rb)

I think Zed Shaw pointed this out last year, and I second it: pulling assembly
into a scripting language is (a) not very hard and (b) incredibly powerful.
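
As a rough illustration of how little code this takes, here is a toy Python sketch that assembles two instructions into raw bytes. It is not the Ragweed API; the helper names `mov_imm32` and `ret` are made up for this example, and only the 32-bit `mov r32, imm32` form is handled.

```python
import struct

# 3-bit encodings of the 32-bit general-purpose registers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def mov_imm32(reg, value):
    """Encode `mov r32, imm32`: opcode B8+rd, then a little-endian imm32."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", value & 0xFFFFFFFF)

def ret():
    """Encode a near `ret` (opcode C3)."""
    return b"\xC3"

# mov eax, 42 ; ret
code = mov_imm32("eax", 42) + ret()
print(code.hex())  # b82a000000c3
```

Because the result is an ordinary `bytes` value, it can be built up, inspected, and patched with the usual scripting-language tools before being written into a target process.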

------
hga
"_What's interesting about this format is that it's identical all the way
from 8086 to x86-64 with the exception of SIB byte and REX prefix.... Intel,
AMD, and other vendors have added more instructions but the above encoding has
survived over the years. Pretty neat, huh?_"

Indeed.
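
For reference, the two later additions named in the quote are each a single byte with fixed bit fields. The Python sketch below decodes them; the function names are illustrative, and the REX check only applies in 64-bit mode, since the same byte values are `inc`/`dec` instructions in 32-bit code.

```python
def decode_rex(byte):
    """Return the W, R, X, B bits of a REX prefix (0100WRXB), or None."""
    if (byte & 0xF0) != 0x40:
        return None  # not in the 0x40-0x4F range, so not a REX prefix
    return {
        "W": (byte >> 3) & 1,  # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends the ModRM rm or SIB base field
    }

def split_sib(byte):
    """Split a SIB byte into (scale, index, base); scale is a shift count."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

# 0x48 is the prefix in `48 89 D8`, i.e. `mov rax, rbx`.
print(decode_rex(0x48))  # {'W': 1, 'R': 0, 'X': 0, 'B': 0}
```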

------
zandorg
It took me a few days of reading the x86 (586) instruction set guide to write
a Lisp function called 'get-opcode-byte-count' which, given a few bytes (e.g.,
20 bytes), decodes the ModRM fields to give the number of bytes in the next
instruction. It's part of the pre-processing of my decompiler: finding all
the code and following branches, subroutines, etc., is crucial when
decompiling an x86 binary (though it doesn't do this as well as IDA, the
disassembler, which uses tons of tricks to find code).
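
The ModRM part of such a length calculation can be sketched in Python as follows. This is not the Lisp function described above, only a guess at the core idea: it assumes 32-bit addressing (no `0x67` prefix) and counts just the ModRM byte, the optional SIB byte, and the displacement. Prefixes, the opcode, and immediates would have to be added by the caller. The name `modrm_operand_length` is hypothetical.

```python
def modrm_operand_length(code, offset=0):
    """Bytes used by ModRM + optional SIB + displacement (32-bit addressing).

    code[offset] must be the ModRM byte.
    """
    modrm = code[offset]
    mod, rm = modrm >> 6, modrm & 7
    length = 1                       # the ModRM byte itself
    if mod == 3:
        return length                # register operand, nothing follows
    if rm == 4:
        length += 1                  # a SIB byte follows
        base = code[offset + 1] & 7
        if mod == 0 and base == 5:
            length += 4              # SIB with no base register: disp32
    elif mod == 0 and rm == 5:
        length += 4                  # bare disp32, no register
    if mod == 1:
        length += 1                  # disp8
    elif mod == 2:
        length += 4                  # disp32
    return length

# `8B 45 08` is `mov eax, [ebp+8]`: ModRM plus a one-byte displacement.
print(modrm_operand_length(bytes([0x45, 0x08])))  # 2
```

A full instruction-length decoder also needs a per-opcode table saying whether a ModRM byte is present and how large the immediate is, which is where most of the work goes.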

------
DCoder
A few years ago I had just a passing familiarity with assembler. Nowadays I
write C++ that integrates with a closed source application, deals with its
data structures, classes, and function calls, and adds functionality, as a
hobby. Too bad I can't find a job doing something like that...

------
s800
I really think the same could be said for any architecture. Imagine a 65864
:-).

