

Say Hello to X64 Assembly Part 3 - valarauca1
http://0xax.blogspot.com/2014/09/say-hello-to-x64-assembly-part-3.html

======
userbinator
I think something isn't quite right with int_to_str/print; in int_to_str, each
converted char is pushed onto the stack as _8_ bytes, and while the total
length is calculated correctly in print, the result is that 7 null bytes get
written out with each char as well. What you see will depend on how your
terminal interprets them, but they will definitely be there if the output is
redirected into a file.

There's also an extra "add rdx, 0x0" in int_to_str, a puzzling multiplication
by _1_ in print, and a confusion between the standard input (0) and output (1)
fds.
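
To make the 8-bytes-per-char issue concrete, here is a rough Python simulation of what is described above. It is not the article's code: `push_digits` and `written_bytes` are hypothetical names, and the stack is modelled as a plain byte string, assuming each digit is pushed as a little-endian qword and print writes count * 8 bytes starting at rsp.

```python
import struct


def push_digits(n):
    """Mimic the described int_to_str: push each ASCII digit as a full qword."""
    stack = b""  # stack[0] models the byte at rsp (the lowest address)
    count = 0
    while True:
        n, digit = divmod(n, 10)
        # A push stores the new qword below the previous one, so prepend it.
        stack = struct.pack("<Q", digit + ord("0")) + stack
        count += 1
        if n == 0:
            break
    return stack, count


def written_bytes(n):
    """Mimic the described print: write count * 8 bytes starting at rsp."""
    stack, count = push_digits(n)
    return stack[: count * 8]


out = written_bytes(123)
# Each digit is followed by 7 null bytes, 24 bytes in total for "123".
print(out)
```

Under those assumptions the digits come out in the right order, but only every eighth byte is a printable character, which matches what shows up when the output is redirected into a file.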

------
pjmlp
Thankfully the article uses Intel syntax.

I had to port a code generation module from Intel syntax to AT&T. What a pain!

Gas is so limited compared with what PC macro assemblers are capable of.

~~~
MegaDeKay
I wrote a blog post a while back showing how Intel syntax can be used in gas,
along with a number of examples.

[http://madscientistlabs.blogspot.ca/2013/07/gas-problems.htm...](http://madscientistlabs.blogspot.ca/2013/07/gas-problems.html)

~~~
pjmlp
If you check a sibling thread you'll see I also had some issues with Gas macro
capabilities.

Has Gas's understanding of Intel syntax improved in the meantime? It had a few
bugs when I did this.

~~~
MegaDeKay
I ported the five examples from an IBM developerWorks article on x86 assembler
[0] and they all worked fine, but I didn't fool around much beyond that.

[0] [http://www.ibm.com/developerworks/library/l-gas-nasm/](http://www.ibm.com/developerworks/library/l-gas-nasm/)

------
Havoc
I'm a little confused - why is ASM still an issue these days? Sure I can
understand some in-line ASM for hardcore speed-critical code but beyond
that...why bother? Even interpreted langs seem fast enough these days, so
compiled should def be fast enough and resorting to ASM should imo be
unnecessary.

NB the above is a personal view & I'm not a programmer by profession...so if I
missed something - no offence intended.

~~~
userbinator
As someone who reverse-engineers and has seen a _lot_ of compiler-generated
code as a result, I'm even more convinced than before that the whole
"compilers are better at generating code" mantra is a myth. The only thing
they're good at in practice is generating lots of code quickly; there are
instances when the output of a compiler manages to impress me (Intel's is
particularly good at that), but they're still quite isolated instances and the
rest of the code continues to have this "compiler-generated" look to it, i.e.
much could be improved.

It is certainly not hard to beat a compiler on speed or size (often both), and
I believe that the only ones who can't are the ones who learnt Asm the stupid
way, by imitating how compilers generate code instead of learning how the
machine really works. E.g. it's commonly taught that x86 has 6 general-purpose
registers (reserving eBP/eSP), but in reality eBP-based stack frames and the
use of the stack are nothing more than compiler-generated artificial
constructs. Even eSP can be used for something else if you _really_ need one
more register[1]!

Compilers follow the rules of their source language and impose strict, often
unnecessary conventions on their output. Asm follows the rules of the machine,
which are far richer and more expressive than the abstracted simplicity of any
HLL. That being said, compilers have improved significantly over the years -
the days when they would push/pop every register on entry/exit to a function
regardless of whether it was used (or its caller needed the value preserved),
or when the start of every function could be identified by a distinctive 55 89
E5 (push bp; mov bp, sp) in the binary are fortunately mostly history.

[1]
[http://www.virtualdub.org/blog/pivot/entry.php?id=85](http://www.virtualdub.org/blog/pivot/entry.php?id=85)
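
As a small illustration of the 55 89 E5 point above, here is a naive Python sketch that scans a blob of machine code for that byte sequence. `find_prologues` and the `sample` bytes are made up for this example. A raw byte search like this can give false positives, because the same three bytes may occur inside data or in the middle of another instruction, so treat it as a heuristic rather than how a real disassembler locates functions.

```python
# 55 = push ebp, 89 E5 = mov ebp, esp (the classic frame-pointer prologue)
PROLOGUE = bytes([0x55, 0x89, 0xE5])


def find_prologues(code):
    """Return every offset in `code` where the prologue bytes occur."""
    offsets = []
    start = 0
    while True:
        idx = code.find(PROLOGUE, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        start = idx + 1


# Hypothetical sample: nop, then two tiny functions back to back.
sample = (
    b"\x90"                      # nop
    b"\x55\x89\xe5"              # push ebp; mov ebp, esp
    b"\x31\xc0\x5d\xc3"          # xor eax, eax; pop ebp; ret
    b"\x55\x89\xe5"              # push ebp; mov ebp, esp
    b"\xc9\xc3"                  # leave; ret
)

print(find_prologues(sample))  # prints [1, 8]
```

Code built without frame pointers would give an empty list here, which is the "mostly history" part.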

~~~
pjmlp
As far as modern processors are concerned, I doubt that more than a few humans
are able to keep in their heads what each model and microcode firmware release
is doing to their micro-ops.

~~~
dllthomas
But sometimes you are faced with squeezing all the performance you can out of
a specific processor.

~~~
pjmlp
And then comes a firmware update to the microcode...

~~~
dllthomas
And then you apply it in testing, note the regression, and either don't apply
it in production or apply it as you push out an update.

~~~
pjmlp
Do you control all the processors your customers use?

~~~
dllthomas
If the software will be running on the machines of "customers" and you do not
"control all the processors" they use, then you're not in the "sometimes" I
was discussing above.

~~~
pjmlp
That was what I was trying to say, though badly, I guess.

~~~
dllthomas
Yeah, I certainly didn't want to give the impression it was a _common_
situation, just that it totally does happen.

------
schappim
I think I just found the perfect use for the Intel Edison (a tiny Atom (x86) +
BLE + Wi-Fi module by Intel):
[https://www.sparkfun.com/products/13024](https://www.sparkfun.com/products/13024)

~~~
Narishma
Edison is 32-bit x86.

------
anjanb
Is there something equivalent to nasm on OS X?

~~~
DigitalJack
Yes, as someone already commented, you can use Homebrew. I think Xcode might
include nasm too, but I'm not sure if that is still true.

There are some hello-world examples for 32-bit and 64-bit with nasm on OS X here:

[https://gist.github.com/desertmonad/36da2e83569bc8b120e0](https://gist.github.com/desertmonad/36da2e83569bc8b120e0)

