I am a webdev by day (Python/Django) but recently started learning x86 assembly at home (after which I can pick up where I left off in C at college).
Some of my colleagues think I'm crazy, but I find I'm learning a lot about how a computer really works (and I'm also beginning to understand why the great old 'real programmers' spent hours upon hours on the damn machine :-))
If you want easy training wheels, try 6502 or 6809. If you want more relevant skills, go to a recent ARM instruction set like v6 or v7.
I've heard this said before. Having had basic exposure to both MIPS and x86 assembly in school, I'm not really clear on WHY people say this. Assembly seems equally unforgiving and obnoxious either way.
Assembly is unforgiving, but some processors are gentler on the programmer than others. If I was trying to teach someone the basics of register indexing and indirect indexing, I'd rather do it on a 6809 than an x86.
(Contrast this with using a C compiler on Windows, where "Hello world" compiled with default settings is usually beyond 10KB already.)
Then you get to the segment registers. As if things weren't bad enough. There are four: CS to point to code, DS and ES for data, and SS for the stack. You see, the 8086 can address 1MB of memory (20 bits), but it's a 16-bit instruction set. So the segment registers contain not the upper 4 bits of the address, but the physical address divided by 16 (or, shifted right four bits). A physical address is formed by SEGMENTx16+OFFSET, giving you 20 bits of address. Most instructions and addressing modes use DS, except if you use BP, which defaults to SS, and the store-post-increment/decrement instructions, which must use ES:DI (DI is incremented; ES isn't). You can override the segment register for most instructions, but not all (the string instructions, which give you the post-increment/decrement addressing modes, are the exception).
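To make the SEGMENTx16+OFFSET arithmetic concrete, here is a minimal Python sketch. The function name `physical_address` is made up for illustration, and it assumes plain 8086 real mode, where the address bus is 20 bits wide and the sum wraps at 1MB.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment * 16 + offset, truncated to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & 0xFFFFF


# Many different segment:offset pairs name the same byte.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
# The sum can overflow 20 bits; on an 8086 it wraps to low memory.
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0
```

The aliasing shown in the first two calls is part of what makes segmented pointers awkward to compare.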
And because of these segment registers, you can have 16 or 32 bit function pointers, 16 or 32 bit data pointers, and 16 or 32 bit stack pointers.
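A rough sketch of where the two pointer sizes come from, assuming the usual real-mode convention: a near pointer is just a 16-bit offset (the segment is implied by a segment register), while a far pointer stores the offset word followed by the segment word. The helper names here are hypothetical.

```python
import struct


def near_pointer(offset: int) -> bytes:
    """16-bit pointer: offset only; the segment comes from a segment register."""
    return struct.pack("<H", offset)


def far_pointer(segment: int, offset: int) -> bytes:
    """32-bit pointer: little-endian offset word, then segment word."""
    return struct.pack("<HH", offset, segment)


print(len(near_pointer(0x0010)))          # 2
print(len(far_pointer(0x1234, 0x0010)))   # 4
print(far_pointer(0x1234, 0x0010).hex())  # 10003412
```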
So what you have is an instruction set that is almost, but not entirely, consistently inconsistent. The 80186 gives you a few more instructions. The 80286 adds protected mode (with four levels of protection) and changes how the segment registers work in protected mode: they're now indexes into a table of physical addresses for each segment, each of which can only be 64K in size. The 80386 extends all the registers to 32 bits (except for the segment registers, which are still only 16 bits long), adds new registers only available in the highest protection ring, and adds paged memory on top of the segmented memory, plus it lets all the registers be more general purpose, all while keeping the old instruction set.
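For the protected-mode change, a simplified sketch of the lookup might look like this. It assumes the standard selector layout (bits 3-15 are the table index, bit 2 picks the table, bits 0-1 are the requested privilege level) and models the descriptor table as a plain Python list. The `Descriptor` type and the table contents are hypothetical, and privilege checks are left out entirely.

```python
from typing import List, NamedTuple, Tuple


class Descriptor(NamedTuple):
    base: int   # physical start of the segment
    limit: int  # highest valid offset (at most 0xFFFF on the 80286)


def decode_selector(selector: int) -> Tuple[int, int, int]:
    """Split a 16-bit selector into (index, table indicator, privilege level)."""
    return selector >> 3, (selector >> 2) & 1, selector & 3


def translate(table: List[Descriptor], selector: int, offset: int) -> int:
    """Look up the segment base via the selector and add the offset."""
    index, _table_indicator, _rpl = decode_selector(selector)
    desc = table[index]
    if offset > desc.limit:
        # Real hardware would raise a protection fault here.
        raise MemoryError("offset beyond segment limit")
    return desc.base + offset


# Hypothetical table: entry 0 unused, entry 1 is a 64K segment at 1MB.
table = [Descriptor(0, 0), Descriptor(0x100000, 0xFFFF)]
print(decode_selector(0x000B))                # (1, 0, 3)
print(hex(translate(table, 0x0008, 0x0020)))  # 0x100020
```

The point is that the segment register no longer holds part of an address; it only names a table entry, and the base can sit anywhere the descriptor says.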
I'm really hard pressed to come up with a more convoluted architecture than the x86.
At the risk of coming off a little more contrary than I intend: seriously, who cares? joezydeco was talking about teaching an assembly language with "training wheels." 8086 assembly without any doodads works for that. It's useless for normal people but no more useless than 6502, and the student will have development tools that are actually friendly to play with.
The nice thing about learning x86 assembly is that it embodies the entire learning curve. If you want to inflict all the horrors of assembly language on yourself - a pointless pursuit for anyone not writing a compiler while simultaneously eating paste, but hey, whatever - x86 lets you start small and work your way up.
I'm just not clear on who needs elegant assembly language programming (outside of compiler writers, etc.). Doing it at all represents failure. The point of an assembly language class is to be very quick and make the student appreciate what C does for them, IMO.
Fair enough, you can certainly go your entire career without resorting to assembly language (heck, the last time I got paid for programming in assembly was the early to mid 90s---I'm still amused by the "C is too low level for programming" arguments these days, when back in the early 90s, people were bitching about C being too high level and inefficient).
I just don't know if it's wrong to jump in the deep end with x86. Obviously the argument can be made either way, and x86 is ugly, but I'm pretty prejudiced toward learning something that will potentially be useful right off the bat. That would make me lean towards x86 or ARM for teaching or for self study, I think, and I have no idea what the ARM tools look like right now.
One of my more masochistic long-term goals is to make a compiler (I've made an assembler of sorts) so I might be revisiting this question at some point. Joy.
You don't have to worry about segmentation unless you're working with more than 64K of code or data, which is plenty when you're writing in Asm. And when you do, it's not hard to learn the few extra rules that come with that.
Protected mode and the 32-bit extensions are two separate (but related) topics; you don't need to learn the former to use the latter, and you can already do a lot without knowing either. Unfortunately the 64-bit extensions have gone in a very different direction, and you will need protected mode to use them. But for learning the basics of Asm, I don't think you need 64 bits.
The 8080/8085/Z80 are a little simpler, and learning them could be useful if you're studying x86, since they're its ancestors.
It's no harder (and in many ways easier) than learning the irregularities of English or any other language. One side-effect of learning x86 is that all the RISCs then become really boring and straightforward. x86 has character. :)
Now that's interesting. What is the deal there?
Sadly the best technology is not the one that usually wins mass adoption.
A nice tidy orthogonal instruction set is a joy to work with. The x86 ISA unfortunately isn't one of those, by virtue of its history in the marketplace.