
Let’s Learn x86-64 Assembly: Part 0 – Setup and First Steps - nice_byte
https://gpfault.net/posts/asm-tut-0.txt.html
======
simias
It's very common for people to learn assembly using x86(-64) but this ISA is
so messy, complicated and layered that it always seems like the bad choice to
me. It's like teaching an intro programming course using C++, I understand why
it's practical but it seems like it will be more trouble than it's worth.

I'd probably recommend getting some ARM or ARM64 board and starting with this,
most of the concepts will carry over to other assemblies anyway. Writing a
simple CPU emulator for some simple architecture and then coding on it can
also be a great teaching experience, if a little more involved.

I find this tutorial a bit oddly structured too, but that may be because I'm
from the un*x world and I don't have the Windows mindset. It basically goes from
"int3 ret" to "The PE Format and DLL Imports". That seems like quite the jump,
and not necessarily super relevant to learning ASM IMO.

~~~
roel_v
One of the main reasons for learning assembly today is to do reverse
engineering, for which the plumbing is just as important as the concepts.

~~~
simias
I mainly use assembly for optimization myself, in general to double-check that
the compiler did what I expected it to do. I also sometimes use it when I do
bare-metal programming or other very low level code, but that's pretty niche
these days.

But even more generally I think you won't have any trouble picking up x86
idiosyncrasies if you've familiarized yourself with ARM/MIPS/Z80 assembly.
Although maybe MIPS would be a bad choice because it doesn't use flags which
are an important concept for many assembly languages.

IMO the key concepts for an assembler tutorial would be, off the top of my
head:

- The stack,
- Banking registers, the frame pointer,
- The various types of jumps/calls/branches and their differences,
- Conditionals,
- Calling conventions,
- Banking/context switching/IRQs (at least for low level programming, not so
  important if you're only dealing with userland I suppose).

That stuff exists on basically any architecture. Then you have things like
immediate encoding and addressing modes which are also very important and
architecture-specific.

I suppose SIMD could be interesting as well, but these days it seems to be
mostly done with intrinsics instead of raw assembly, at least in my
experience.
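
A few of the architecture-independent items in the list above (the stack, call/return, and conditionals driven by a flag) can be tried out in a toy emulator of the kind suggested earlier in the thread. This is only a minimal sketch: the instruction set, the register names and the `run` function are invented for illustration and do not correspond to any real architecture.

```python
# Toy CPU with a hypothetical instruction set, invented for illustration only.
def run(program, max_steps=10_000):
    regs = {"r0": 0, "r1": 0, "r2": 0}
    stack = []          # grows on push/call, shrinks on pop/ret
    zero_flag = False   # written by "cmp", read by "jnz"
    pc = 0              # index of the next instruction to execute

    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "mov":                  # mov reg, immediate
            regs[args[0]] = args[1]
        elif op == "add":                # add dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "dec":                # dec reg
            regs[args[0]] -= 1
        elif op == "cmp":                # cmp reg, immediate: only updates the flag
            zero_flag = (regs[args[0]] - args[1]) == 0
        elif op == "jnz":                # jump if the last cmp was "not equal"
            if not zero_flag:
                pc = args[0]
        elif op == "push":
            stack.append(regs[args[0]])
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "call":               # save the return address, then jump
            stack.append(pc)
            pc = args[0]
        elif op == "ret":                # resume at the saved return address
            pc = stack.pop()
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("step limit reached")


PROGRAM = [
    ("mov", "r1", 5),     # 0: argument n = 5
    ("call", 3),          # 1: call the subroutine at index 3
    ("halt",),            # 2
    # Subroutine: r0 = n + (n-1) + ... + 1, preserving the caller's r1.
    ("push", "r1"),       # 3: save r1 on the stack
    ("mov", "r0", 0),     # 4
    ("add", "r0", "r1"),  # 5: loop body
    ("dec", "r1"),        # 6
    ("cmp", "r1", 0),     # 7: sets zero_flag when r1 reaches 0
    ("jnz", 5),           # 8: loop while r1 != 0
    ("pop", "r1"),        # 9: restore r1
    ("ret",),             # 10
]

result = run(PROGRAM)
print(result["r0"], result["r1"])
```

The return address and the saved register share one stack, which is the same reason real calling conventions have to spell out who saves what.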

~~~
roel_v
Yes, when you study assembly as a general programming language, the same way
you learn another language. But again, for people who are interested in
reverse engineering, the focus is different, and you'll find that if you start
looking for assembly tutorials online (and also the books on modern assembly,
e.g. I have "Modern x64 assembly language programming" and "Windows 64 bit
assembly language programming quick start" here next to me and those fit that
description), a lot of those are for aspiring reverse engineers. The
(numerical) optimization angle is a lot harder to find, agner.org probably
being the exception.

E.g., knowing what _main is and how to call an OS primitive is a lot more
important when reverse engineering than knowing about context switching and
simd. I was just trying to say, that's where the focus on a specific OS comes
from, and a focus on things that people who do manual numerical optimization
would consider irrelevant or at best tangential to what they consider
'important' in assembly.

------
mlang23
I recently took up the challenge of porting JonesForth to x86-64. It was one
of the most rewarding personal challenges in a very long time. For one, I was
able to write x86-64 without needing to whip up a complete application. And
learning Forth during the process was even more fun.

[https://github.com/mlang/jonesforth](https://github.com/mlang/jonesforth)

And yes, I know there is also jonesforth64. However, doing the porting myself
was totally worth it.

~~~
MaxBarraclough
And in the true Forth spirit, the point was to implement Forth, not to
implement something _in_ Forth.

~~~
mlang23
Almost right. At the end of the exercise, I at least wrote TIME&DATE, a word
to get the current system time. And I implemented DO..LOOP which appeared to
be quite tricky. But yes, I agree. Getting the system to run was much more
interesting than actually using it :-)

------
beagle3
> Additionally, the higher 8 bits of rax, rbx, rcx and rdx can be referred to
> as ah, bh, ch and dh.

That’s bits 8-15 (lowest being 0 and highest being 63), not the highest 8 bits.

Didn’t have time to read the whole thing, but it looks nice.

~~~
matzab
It does say 'higher' not 'highest'.

~~~
unnouinceput
It's still not clear to a first-time reader. Here is a better visualization,
as much as HN lets me do it, for RAX (same applies for RBX, RCX and RDX):

|63|62|61|60|...|34|33|32|31|30|29|...|18|17|16|15|14|...|10|09|08|07|06|05|04|03|02|01|00|

|.............................................|.............................|<-- AH (8 bits) --->|<------- AL (8 bits) -------->|

|.............................................|.............................|<------------- AX (16 bits - lower part) --------->|

|.............................................|<---------------------------- EAX (32 bits - lower part) ----------------->|

|<------------------------------------------------ RAX (full register - 64 bits) --------------------------------->|

(Ten thousand edits; hope you guys see the same as I see, perfectly aligned.)

~~~
Stratoscope
Nice illustration, thanks. Regarding the formatting, for a chart like this you
will find it easier if you switch to a monospaced font. You can do that with
two spaces at the beginning of each line, and you won't need the blank lines
either. Here is a start:

    
    
      |63|62|...|33|32|31|30|...|17|16|15|14|...|09|08|07|06|...|01|00|
      |...............|...............|<-AH (8 bits)->|<-AL (8 bits)->|
      |...............|...............|<-AX (16 bits - lower part) -->|
      |...............|<--------- EAX (32 bits - lower part) -------->|
      |<--------------- RAX (full register - 64 bits) --------------->|
    

(This still won't look good on a mobile device because of the width, best
viewed in a desktop browser.)
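
The bit ranges in the chart above can also be checked numerically. The following is a small sketch in Python rather than assembly; `sub_registers` is a hypothetical helper name, and it only models reading the sub-registers, not writing to them.

```python
MASK64 = (1 << 64) - 1


def sub_registers(rax):
    """Return the named views of a 64-bit RAX value (reads only)."""
    rax &= MASK64
    return {
        "rax": rax,                  # bits 0-63
        "eax": rax & 0xFFFF_FFFF,    # bits 0-31
        "ax": rax & 0xFFFF,          # bits 0-15
        "ah": (rax >> 8) & 0xFF,     # bits 8-15
        "al": rax & 0xFF,            # bits 0-7
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name}: {value:#x}")
```

With this sample value, `ah` comes out as `0x77` and `al` as `0x88`, so AH is the upper half of AX and not the top byte of RAX.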

------
tempodox
Completely off topic, but I just have to say it: This is one of the few light-
text-on-dark-background web pages that actually does it well. Contrast is
managed so the text is still readable without escaping to Reader View or
wearing sunglasses. I wish those sites that use squeaky-white on pitch-black
would follow this example.

------
ed_blackburn
Around 2000 we did this in the first week of university alongside C. The
number of people who changed courses to more analysis- or business-focused
degrees was high. Of course it was deliberate. We barely touched assembly
again unless you opted into specific classes.

~~~
melvinroest
When I was in my master's I had never programmed in C and barely in assembly,
as I was in a Java/Python/JS school. Nevertheless, I found it doable to learn
them both at the same time.

How?

GDB, combined with the TUI window and typing:

* layout asm

* layout regs

* focus cmd

And then ni and si were my favorite "next instruction" and "step into
instruction" commands.

And for C the same thing but then simply with layout src and sometimes layout
asm and layout regs as well.
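
The same layout commands can be passed on the command line so the windows are already set up when GDB starts. Below is a small sketch that only builds the command line; `gdb_tui_command` is a hypothetical helper and `./a.out` is a placeholder program name, while `-tui` and `-ex` are regular GDB options.

```python
import shlex


def gdb_tui_command(program, layouts=("asm", "regs")):
    """Build an argv list that starts GDB in TUI mode with the given layouts."""
    argv = ["gdb", "-tui"]
    for name in layouts:
        argv += ["-ex", f"layout {name}"]   # same commands as typed interactively
    argv += ["-ex", "focus cmd", program]   # keep keyboard focus on the command window
    return argv


argv = gdb_tui_command("./a.out")           # "./a.out" is a placeholder
print(shlex.join(argv))
```

The printed line can be pasted into a shell, or the list handed to `subprocess.run(argv)`. For the C view, passing `layouts=("src",)` gives the source window instead.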

------
vorticalbox
When we were learning this in my degree we used Little Man Computer [1] to help
visualise what's going on.

1. [https://littlemancomputer.co.uk/](https://littlemancomputer.co.uk/)

~~~
atkbrah
We used titokone [1]. The newest version of it even has some graphics support
in it (although it's super slow).

1. [https://www.cs.helsinki.fi/group/titokone/v1.100/kayttoohje/...](https://www.cs.helsinki.fi/group/titokone/v1.100/kayttoohje/manual.html)

------
non-entity
Maybe I'm just dumb, but anytime I've tried to learn anything about modern x86
platforms I just end up completely lost.

I find it interesting the author uses FASM, and I've seen it used a bit more
than I did in the past. Several years ago I toyed with it and found it neat
because of the editor and all the samples it shipped with. It did seem a bit
different from things like nasm or gas, as the FASM code I saw used all
sorts of interesting macros that provided quasi-high-level constructs like if
statements.

~~~
Wohlf
It's not just you, modern x86 is very complicated. There are likely few if any
people who deeply understand all of it.

------
dtornabene
I feel like there should be a text that teaches assembly, rather than assuming
it, and teaches it specifically with an eye toward shellcode and/or reversing.
Every "learn assembly" text I've ever seen either teaches it in relation to C
or architecture or simply by itself. Seems like a gap that could be filled by
No Starch or someone willing to self-publish.

~~~
im3w1l
Write position-independent code. This is much easier now that you can use
RIP-relative addressing. To include data, just append it to your code or even
put it inline, with jmps to avoid executing it. To accomplish tasks, use
syscalls. If you want a library, load it dynamically with dlopen and look up
its functions with dlsym.

Also take a look at
[https://en.wikipedia.org/wiki/Return-oriented_programming](https://en.wikipedia.org/wiki/Return-oriented_programming)
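
To see why RIP-relative addressing makes appended data easy to reach, here is a rough sketch in Python that models a single instruction, `lea rax, [rip + disp32]`, followed directly by its data. The helper names and the two load addresses are made up for illustration; the byte encoding (`48 8D 05` plus a little-endian 32-bit displacement) is the standard one for this instruction.

```python
import struct

# `lea rax, [rip + disp32]`: REX.W prefix, opcode 8D, ModRM 05
# (mod=00, rm=101 selects RIP-relative addressing in 64-bit mode), then disp32.
LEA_RAX_RIP = bytes([0x48, 0x8D, 0x05])
INSTRUCTION_LENGTH = 7


def assemble_lea_rax_rip(disp):
    """Encode the instruction with a signed 32-bit displacement."""
    return LEA_RAX_RIP + struct.pack("<i", disp)


def lea_target(code, load_address):
    """Address the instruction yields when `code` is loaded at `load_address`."""
    if code[:3] != LEA_RAX_RIP:
        raise ValueError("not a lea rax, [rip + disp32] instruction")
    (disp,) = struct.unpack_from("<i", code, 3)
    # RIP already points at the next instruction when the address is computed.
    return load_address + INSTRUCTION_LENGTH + disp


# Hypothetical blob: the instruction, then the data it refers to (disp = 0).
blob = assemble_lea_rax_rip(0) + b"hello\0"

for base in (0x400000, 0x7F0000001000):     # two made-up load addresses
    target = lea_target(blob, base)
    offset = target - base                  # position of the data inside the blob
    print(hex(base), hex(target), blob[offset:offset + 5])
```

The bytes of the blob never change, yet the computed address follows it to wherever it is loaded. That is what lets the code run without relocation.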

------
sushshshsh
A mid-90s object-oriented textbook somewhere is screaming at me, "pigeonhole
computing is dead and unappealing unless you are a pigeon or a mailman! Don't
treat computers like a list of robots and mailboxes, don't focus on the hows
of computing, focus on the whys!"

~~~
0xdeadbeefbabe
Contradiction by game
[https://github.com/nanochess/fbird](https://github.com/nanochess/fbird). Take
that 90s textbook.

------
lastgeniusua
Huge thanks! Was recently implementing an assembly language of my own, and
this read seems to capture all of the important things that were otherwise
hidden in ten different outdated books, specifications and websites.

------
zeepzeep
I can recommend the game Human Resource Machine

~~~
jnwatson
Yes, that was like assembly for dummies. Still, I don't think it really
exploited its concept well enough. It was not remotely challenging to someone
with a software background.

