An Introduction to x86_64 Assembly Language (hackeradam17.com)
193 points by hackeradam17 on Mar 18, 2014 | 54 comments

This seems like a great start, but it's lacking something that every assembly programming resource that I've ever come across lacks too:

How can I experiment?

How do I go from instructions in a text file, to compiling, to getting input in some form? What programs do I use (on Linux)? What commands do I run? What are fun projects that are worthwhile doing with assembly?

Beyond that, what are good applications of assembly in general? Where should I use it in my day-to-day development projects? Where can I read about best practices? Where can I find good examples of x86_64 assembly programming?

It's really easy to merely describe the instructions, the registers, the mapping from compiled procedural languages - but I feel like that information is superficial so long as you don't tell us how to apply it.

You might like the assembly tutorial I am writing: https://plus.google.com/111794994501300143213/posts/9gxSUZMJ...

It's focused on actually doing rather than how stuff works. In order to succinctly be able to present this, I had to choose a target platform, which unfortunately seems not to be the one that you prefer.

My tutorial targets OS X, and there is great similarity between coding assembly on OS X and Linux. The tutorial does not spend much time pointing out these similarities, and you unfortunately don't get sensible error messages if you make mistakes. Nevertheless, you might want to try :)

It would be cool if someone made a VM for development only, on which you could run x86_64 assembly, and get sensible error messages.

Qemu comes pretty close. You can dump register and memory values, crashes output useful dumps, etc.

"the overly abstract world of C" I chuckled at that. Looks like you have a fun tutorial though; I'd like to go through it more.

On Linux, you can use 'as', the gnu assembler, or you can also use nasm, which most people prefer over 'as'. I've used both, and I got used to 'as' syntax after a while.

A long time ago I wrote a threaded/fiber system for DOS. It was mostly C, but the task switching and interrupt stuff was all assembly. It's basically an implementation of setjmp/longjmp hooked up with timer interrupts.
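As a rough illustration of just the setjmp/longjmp half of that idea (no timer interrupts, no assembly), here is a minimal C sketch. The function name `count_resumes` is made up for this example, and a real task switcher would also need a separate stack per task, which this does not attempt.

```c
#include <setjmp.h>

/* Hypothetical helper: saves a context with setjmp, then jumps back to it
 * with longjmp until it has been resumed `limit` times. Returns the count. */
int count_resumes(int limit) {
    jmp_buf ctx;
    volatile int resumes = 0;   /* volatile: modified between setjmp and longjmp */

    if (setjmp(ctx) != 0) {
        resumes++;              /* we got here through longjmp, not the first call */
    }
    if (resumes < limit) {
        longjmp(ctx, 1);        /* restore the saved context and resume at setjmp */
    }
    return resumes;
}
```

Calling `count_resumes(3)` passes through the saved context three times before returning. A scheduler would keep one `jmp_buf` per task and pick which one to jump to.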

You might use it in your day to day if you were a driver developer for something like PCI chipset drivers, network drivers, things like that. Even then, C is usually preferred. I worked for a company that wrote drivers for scsi jukebox systems and they had a complete software suite written in 100% assembly. It was very clean, modular, well commented, and even object oriented. Yes, you can have objects in assembly, you just have to make your own this pointers. Since the language itself is so limited, most good assembly programs have a lot of macro wizardry involved.
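The "make your own this pointers" point can be sketched in C, which maps fairly directly onto what such assembly does: every "method" takes the object's address as an explicit first argument, and a table of function pointers stands in for the class. All names here (`counter`, `counter_init`, and so on) are invented for illustration and are not taken from the suite described above.

```c
/* Hypothetical "class": a struct whose first field points at a method table. */
struct counter;

struct counter_vtable {
    void (*increment)(struct counter *self);
    int  (*value)(const struct counter *self);
};

struct counter {
    const struct counter_vtable *vt;  /* which "class" this object belongs to */
    int count;                        /* per-object state */
};

/* Each "method" receives the object explicitly, like a hand-made `this`. */
static void counter_increment(struct counter *self) { self->count++; }
static int counter_value(const struct counter *self) { return self->count; }

static const struct counter_vtable counter_vt = {
    counter_increment,
    counter_value,
};

/* "Constructor": fills in the method table and the initial state. */
void counter_init(struct counter *self) {
    self->vt = &counter_vt;
    self->count = 0;
}
```

A call then looks like `c.vt->increment(&c);`. In assembly the same thing is a load of the table pointer, a load of the function address, and an indirect call with the object's address in an agreed register.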

you can use the intel syntax with gas too (see gcc's -masm=intel option, or the .intel_syntax noprefix directive in the source).

This is a great resource, and the following is not a criticism, but an observation.

What I've found with most assembly language textbooks and online resources is that while the basics get covered well enough, the fundamental knowledge base is often skimmed over, as are the many assumptions and conventions that are made in how the CPU is supposed to work.

I first learned 8086 assembler from Peter Norton's book (http://www.amazon.com/Peter-Nortons-Assembly-Language-Book/d...), and more than any other resource, that taught me about how things actually work. It goes beyond just knowing the registers and the mnemonics and explains in a fantastically clear way all the implicit work that's going on behind the scenes, even in assembly code. From the stack to data segments and direct video control, I learned more about how computers work from this book than from my entire college Computer Science program. And I didn't even have an assembler to actually try any of the code at the time I read it.

I'm not sure if the book would still hold up from a practical perspective, as it does focus very much on MS-DOS-specific interrupts. So what I wonder is... is there a similar resource available today for learning what's going on behind the scenes in the modern 64-bit CPU on modern OSes?

> anyone know of a mirror?


That book was great. I made the mistake (perhaps) of using it to try to learn assembly language right after learning BASIC (Advanced BASIC that comes with MS-DOS/PC-DOS) and before learning Pascal or C. It had a program you were supposed to type in (a hex editor), and I was able to get it to work, but I was not able to get programs of my own design to work except very tiny programs or to modify existing executables using DEBUG. (Then again, shortly after, I could not get C programs of my own design to work very often either. I forget how hard C is to a beginner sometimes.) Then again, eventually I was able to throw together a decent hex editor, in C this time, in an afternoon, and I am certain it was from internalizing what I had learned from that assembly book, and when I started including assembly language graphics routines in my C programs, I'm sure that helped as well. When everything moved to 32-bit protected-mode code I was kind of lost again though. That book is 16-bit real mode.

Looks like the book is freely available - https://openlibrary.org/books/OL2197699M/Peter_Norton%27s_as...

Just a warning: the file I linked to is "broken" on account of needing Adobe Digital Editions and only one person (in the world?) can read it at a time.

I recommend Computer Systems: A Programmer's Perspective. It sounds like what you're looking for, although it uses a mix of C and IA32.

I came in here to say just this! A fantastic book in so many ways. I read a lot of it on my own and learned more than in many of my classes on similar subjects. It taught me a huge amount about C, systems, architecture, memory and more. Very cool stuff. It starts with x86 but goes over at least the basic differences between it and x64.

Sure, there are some applications in which we may need to squeeze every ounce of efficiency out of our programs, but in this day and age you’ll be hard pressed to come any where close to the code optimizations made by modern compilers.

Does anyone who regularly writes assembly really believe that it's difficult to write better code than a modern compiler? This isn't my experience at all. Instead, I'd say that with just a handful of tricks and a dozen hours practice you can speed up just about anything a compiler generates. The issue is not that it's hard to beat a modern compiler, but that it's rarely worthwhile.

It's almost as if there is some compiler-writers-protection-racket out there that threatens anyone who doesn't bow to the powers of the modern compiler. The wonder isn't that they generate perfect code (by and large, they don't), but that they can optimize as well as they do without introducing bugs left and right. The relative rarity of compiler bugs is the impressive part, not the speed of the code generated.

I think the same thing too; I've seen a lot of horrible compiler output, far more than good or even "novice Asm programmer" level. Maybe the fact that compilers default to optimisations disabled (and generate horribly redundant code), and a lot of binaries that are released seem to get compiled this way, has something to do with it.

> It's almost as if there is some compiler-writers-protection-racket out there that threatens anyone who doesn't bow to the powers of the modern compiler.

A lot of it seems to come from the "religious belief in abstraction" that is prevalent among academics; the belief that somehow, a "higher level" solution is always better.

> The wonder isn't that they generate perfect code (by and large, they don't), but that they can optimize as well as they do without introducing bugs left and right.

It's not so surprising when you consider that compilers are just following patterns/transformations when they generate or optimise code, and these patterns are designed to be very general so they work for all cases, even when they're not the most efficient way to do it in some of these cases. When a human is writing the Asm there is (or should be - otherwise you'd be better off just using a compiler) a higher level of thought, a different process, one that may be more prone to errors but one that also shows some form of creativity - thinking about the specific application and, understanding that, applying a transformation/pattern that is only applicable in that case to improve the generated code.

Size optimisation is also another area where even a novice Asm programmer can very easily beat a compiler due to this general/specific pattern-ness divide. From what I've seen, compilers start with often overly-general code-generating patterns to produce initially unoptimised output, and then attempt to remove unnecessary instructions; while a human would not have this generality - because the human knows more about the specific problem - and the "first-cut" of code he/she writes is already more specific than the compiler's.

(Not that I'm against compilers in general - I would very much like them to generate better code, but for that to happen there are likely going to have to be large changes in how they're written and how the whole process of compilation is thought of. Trying to make a compiler "think" like a human is a good first step...)

After learning 6502 assembly when I was a teenager, I was floored by the beauty and straightforward cleanliness of the 68000 used by Macs and then the similar assembly of the VAX.

x86_64 is just plain ugly. While I admire the backward compatibility, it's sad that this nasty-looking architecture won out. I'm really glad that the compiler handles it.

I was not there at the time of 6502, but imo ARM assembly is simpler (relative to x86_64).

I would go as far as to say I really love debugging things on a RISC instruction set.

At the time of 6502 you were truly in command. I didn't do as much Z80 but I guess it was similar in this particular respect.

Things got really out of hand from AT on (80286). IMO that's not Personal Computing anymore, but rather Home Computer. You are mostly a consumer, even if you develop in ASM. You delegate control of your hardware completely.

I'd recommend anyone to learn Commodore 64 coding inside out. Or maybe Spectrum 48K. These are not the simplest machines of their time but they allow you to do a lot and were popular enough that almost everything that truly matters in computing exists for these architectures. And you can fathomably grok them inside out to such a level it will blow your mind.

If you start from 8080/Z80 and work your way up, you will have a very different perspective. 16-bit (and especially 32-bit) x86 looks quite regular if you see it in the right way; although IMHO the 64-bit extension wasn't designed as well as the move to 32-bit.

I did Z80 back in the day and I think x86 is a mess, ever since 286 at the very least. It's absolutely terrible. Not just because of the ASM itself but because of the whole OS/computing paradigm it grew with.

Well, I always thought that the motorola 6809 was much more attractive. As a compiler writer during those days, the lack of regularity of the 8080 and z80 was a pain.

ASM is not made to be read by humans but by computers.

Actually, Assembly is sometimes called Mnemonic Assembly. Mnemonic roughly means "a device such as a pattern of letters, ideas, or associations that assists in remembering something." Assembly was developed to help humans write machine code more easily. It is machine code that is not for humans. Assembly certainly is. (Although, as a historical note, humans have often written raw machine code in a number of ways, including toggling it in via switches)

Modern assembly less and less so, but it definitely was originally intended for humans, and for quite some time a lot of fairly substantial programs were handwritten in assembly.

When I was in my teens I even occasionally resorted to compiling C code and disassembling it to work with it, because I preferred M68k assembly to the C source, and the C compilers of the time were horribly inefficient - I could often delete pointless lines and reassign registers etc. almost as fast as I could read the code...

I am trying to develop a JIT for array processing [1]. Here are some resources I found useful so far

* Dr Paul Carter's tutorial [2]

* Intel processor reference [3]

* C ABI standard [4]

[1] https://github.com/wedesoft/aiscm

[2] http://www.drpaulcarter.com/pcasm/

[3] http://www.intel.com/content/www/us/en/processors/architectu...

[4] http://www.sco.com/developers/devspecs/abi386-4.pdf

Cheers for that.

Yes, especially Halide looks interesting.

i always thought the art of assembly is the de facto bible [1]

personally i had all these books and never really cared too much about them. in the end it was having a c compiler and a good disassembler - ida at that time - that made me learn it.

here's the free copy of ida 5 [2]. i think it doesn't do 64 bit, but it's good enough, and works perfectly fine in wine on both osx and linux. you can always use biew/hiew/olly etc, or more recently hopper, but i think for learning ida is still hands down the best choice

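do a string copy, see what it does. make a function, notice how it always starts with

    push ebp        ; save the caller's frame pointer
    mov ebp, esp    ; set up this function's own frame

and usually ends with

    pop ebp         ; restore the caller's frame pointer
    ret

make a switch statement, see what it compiles to (ida annotates it beautifully).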
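For anyone without IDA, a small C file like the one below is enough to try this experiment. The function names are made up for this sketch, and the commands assume GCC and binutils: `gcc -O0 -S -masm=intel example.c` writes Intel-syntax assembly to `example.s`, and `objdump -d -M intel example.o` disassembles an object file built with `gcc -c`. On a 64-bit build the prologue should show up with `rbp`/`rsp` instead of `ebp`/`esp`.

```c
#include <stddef.h>

/* Hypothetical string copy: compare this loop with the compiler's output. */
char *copy_string(char *dst, const char *src) {
    size_t i = 0;
    while ((dst[i] = src[i]) != '\0') {   /* copy up to and including the NUL */
        i++;
    }
    return dst;
}

/* Hypothetical switch (days per quarter in a non-leap year). Depending on
 * the compiler and flags this may become a compare chain or a table. */
int days_in_quarter(int quarter) {
    switch (quarter) {
    case 1: return 90;
    case 2: return 91;
    case 3: return 92;
    case 4: return 92;
    default: return -1;
    }
}
```

Rebuilding with `-O2` and diffing the two `.s` files is a quick way to see how much of the unoptimised output is redundant.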

i commend the author for what he's doing. a little improvement would be to reduce some of the text, and add some more examples, and some guidance on how to experiment.

another sidenote is that the code is basically the same as simple c code. c++ code even doing simple things, depending on what you link, and compile in, can be terribly confusing for a newbie to look at.

[1] http://portal.aauj.edu/portal_resources/downloads/programmin...

[2] https://www.hex-rays.com/products/ida/support/download_freew...

The first AoA book was good. The rest, with the HLA crap in it, is not.

This could be of interest to someone: http://www.pentesteracademy.com/course?id=7

infoseckid: that looks like a very hard to find and thus great find. Thanks!

Learning Assembly language has been on my list of "want to do" for a long time now but as I have never had a need for it professionally I kept putting it off. I actually have time for myself now and outside of learning to cook properly and cycle a lot I would really like to finally tick this off my bucket list. The problem is I am not too sure where to actually start. I want something that starts from the bottom and teaches me about the CPU and goes on from there. I know what a register is, for example, but I don't really know what a register is, if you understand?

So is there a book or web resource that is really a beginner's guide to assembly? I understand x86 is not the nicest place to start but I have an x86 CPU in my laptop and not much else so it is a better place to start than anywhere else really. Anyone got any advice?

NB: My background is very high level languages such as JavaScript and Python and not much lower level. A bit of C++ back in the 90s but that knowledge is long gone.

I highly recommend Computer Systems - A Programmer's Perspective. I learned a ton from that book. I think chapter 2 or 3 is where x86 is covered in some detail. It's really well written, and the way they present assembly, C, and certain parts of POSIX draws everything together really well such that you come out with a good mix of practical and theoretical knowledge.

"Programming from the Ground Up" (linked from a couple of other posts) has exactly what you want. There's no need to learn C first - it becomes obvious how memory and pointers work when you're actually doing it. It's a lot easier than people make out, just give it a go.

Many thanks! Checking it out now.

Might be a good idea to brush up your C first. Make sure you really understand memory and pointers.

Yeah I think I will pick up Modern C by King and/or Head First C. Both seem pretty good. I have a copy of K&R somewhere already.

Are there any books available that someone could recommend for me to go down this route, specifically for hand crafting assembly routines inside my C++ programs? (Note I'm currently only compiling for 32 bit.) It's something of interest to me, but I've never gotten around to it. I'm comfortable with ARM assembly, and have a solid enough understanding to proceed, but the references I've come across so far go as far as "here's how to embed a simple ASM snippet inside your code, now you know how to do it" and then skip to "copy and paste for specific routine". I'd like some exercises for in between stages, for stuff that might actually be worthwhile doing.

If you are using GCC and Linux, you might find "Programming from the Ground Up" to be a useful resource: ftp://gnu.mirrors.pair.com/savannah/pgubook/ProgrammingGroundUp-1-0-booksize.pdf

You may find Agner Fog's Optimization series helpful too: http://www.agner.org/optimize/

Check out every volume, as each has very useful information tucked away. For example, "Calling Conventions" is great for understanding how C++ and assembly interfaces should work.
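As one possible in-between exercise of the kind asked about above, here is a hedged sketch of wrapping a single instruction in GCC-style extended asm (also accepted by Clang). The function name `asm_add` is invented for this example, and the asm path only applies to x86_64 builds; on any other target, including the 32-bit build mentioned in the question, it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical exercise: add two 64-bit integers with one `add` instruction. */
uint64_t asm_add(uint64_t a, uint64_t b) {
#if defined(__x86_64__) && defined(__GNUC__)
    /* AT&T operand order: source first, destination second.
     * "+r"(a): a is read and written, kept in a register.
     * "r"(b):  b is an input in a register.
     * "cc":    the instruction changes the flags. */
    __asm__ ("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;   /* fallback so the sketch still builds elsewhere */
#endif
}
```

From there, natural next steps are rewriting the same function as a standalone `.s` file and linking it in, which is where the calling-convention material becomes necessary.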

Read this book, it's basically only hacking C code with assembly. http://www.amazon.com/Hacking-The-Art-Exploitation-Edition/d... It also covers topics such as TCP sockets and the like. If you ever want to become a real hacker it's your best starting point.

And the world was never the same again.

Another Relevant book: ftp://gnu.mirrors.pair.com/savannah/pgubook/ProgrammingGroundUp-1-0-booksize.pdf

Nice write up Adam. Just thought I would mention that your usage of former/latter is backwards.

Doh! Just when you think you've found all the editing mistakes :)

Big props to Creel (What's a Creel?). I encourage you to work through his video tutorials if at all interested in x64.


From seeing some of the suggestions here I think I will extend this with two more articles:

1) Some more detail about how the low level of the computer actually works

2) A more "beginner friendly" guide to getting started in programming assembly.

Reading the examples is a little clunky because I have to scroll up to the dictionary every time I see a word I don't remember from skimming the list. Which is often.

It's nice to see the "correct" order of the registers being used. (The reason for this ordering is in the realm of trivia questions.)

Do you mean "correct" as in that's the absolute right order (as in math) or as in that's the original intel syntax?

My first programming language was mc68k assembler where the "correct" order of registers is the opposite.

It's the order they occur in the encoding: A, C, D, B.
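To make the encoding order concrete, here is a small C sketch of the mapping from the 3-bit register number used in x86 instruction encodings (without a REX prefix) to the 64-bit register names. The helper name `reg_name` is made up for illustration.

```c
#include <stddef.h>

/* Register numbers 0..7 as they appear in the instruction encoding. */
static const char *const reg_names[8] = {
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"
};

/* Hypothetical lookup helper; returns NULL for numbers outside 0..7. */
const char *reg_name(unsigned n) {
    return n < 8 ? reg_names[n] : NULL;
}
```

So numbers 0 through 3 give A, C, D, B, which is the order mentioned above.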

68k has numerical, not named registers, and they're in the same order as the encoding.

This is great. I'm all for something that gets people into assembly. It seems to be a lost art these days.

Lots more examples would be very helpful.
