

Pentium Appendix H fiasco - yuhong
http://www.agner.org/optimize/blog/read.php?i=82

======
Locke1689
If you go into the ISA for hardware-assisted virtualization, they just drop
all semblance of compatibility. The memory models just work completely
differently. When you're writing a VMM you'll need to write two separate
submodules -- one for Intel and one for AMD.

I wrote the Intel side and part of the AMD side for an HPC VMM. I also wrote
the Intel emulation code for an unmerged QEMU patch.
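
To give a flavor of the split: even _detecting_ the extension differs between
the two vendors. Here's a minimal sketch, assuming GCC/Clang's <cpuid.h>
helper -- Intel advertises VT-x (VMX) in CPUID leaf 1, AMD advertises AMD-V
(SVM) in leaf 0x80000001:

    /* Sketch: the two vendors advertise their virtualization extensions
     * through different CPUID leaves. GCC/Clang only (<cpuid.h>). */
    #include <stdio.h>
    #include <cpuid.h>

    int main(void) {
        unsigned eax, ebx, ecx, edx;

        /* Intel VT-x: CPUID.1:ECX bit 5 (VMX) */
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 5)))
            puts("Intel VT-x (VMX): use the VMCS-based submodule");
        /* AMD-V: CPUID.8000_0001h:ECX bit 2 (SVM) */
        else if (__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx) &&
                 (ecx & (1u << 2)))
            puts("AMD-V (SVM): use the VMCB-based submodule");
        else
            puts("no hardware virtualization extension found");
        return 0;
    }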

Is this lack of standardization a big deal? I don't know. Kind of. Software
developers generally don't have to deal with this kind of stuff and, to be
honest, wouldn't know what to do if they had to. The major problem seems to be
that compilers are a lot shittier than they should be. I'm not really
surprised, though: I'd love to do compiler development, but that's just not a
practical career choice. No one gets paid to do compiler
development, so compilers suck. Is anyone surprised?

~~~
1amzave
I'm curious what you mean by your comments about the state of compilers.

In what ways do current compilers suck? I assume you're not speaking in terms
of utilizing virtualization instructions, since I'm having a hard time
imagining a situation in which a compiler would be generating those.

As for people not getting paid to work on compilers -- I'm not sure what you
mean here, either. Intel employs people to work on ICC, I think Red Hat
employs some GCC devs, Microsoft pays people to work on VC++ (a friend of mine
is currently doing an internship on that team, actually), Apple employs LLVM
folks, Nvidia does too for their CUDA toolchain, AMD has people working on
Open64, I see job postings from Sony and various other companies in
comp.compilers...is that "no one"?

~~~
Locke1689
It's just practically no one. Yeah, there are a few people here and there, but
they just don't mean much. I think the biggest single compiler dev team right
now is the .NET team, and they're hiring, I guess. It's not really a thriving
market. You mentioned VC++ -- last summer I worked on MS SQL Server, and the
query optimization engine I built required that I be able to construct
optimized tail calls in C/C++. Of course, this turned into a major weeks-long
discussion with the VC++ team about this peephole optimization. Yes, the x86
architecture sucks, and no, you can't do the perfect thing, but doing the
simple thing just to avoid blowing the stack would definitely have been
reasonable -- about 5 years ago. Hell, I wasn't even asking for trampolining.
Ugh, you get my point. Compiler development is either fragmented or stagnant
or both in most projects. Right now I'm placing my hopes on LLVM, but we'll
see.
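
For the curious, here's a hypothetical example (not the actual SQL Server
code) of the "simple thing" I mean -- a direct self tail call that a compiler
can turn into a jump back to the top of the function, so deep recursion runs
in constant stack space:

    /* Hypothetical example. sum_to ends in a direct self tail call:
     * nothing happens after the recursive call, so a compiler can emit
     * it as a jump instead of pushing a new frame. */
    #include <stdio.h>

    static long sum_to(long n, long acc) {
        if (n == 0)
            return acc;
        return sum_to(n - 1, acc + n);  /* tail position */
    }

    int main(void) {
        /* With the optimization this runs in constant stack space;
         * without it, ten million frames can blow the stack. */
        printf("%ld\n", sum_to(10000000L, 0));
        return 0;
    }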

 _In what ways do current compilers suck? I assume you're not speaking in
terms of utilizing virtualization instructions, since I'm having a hard time
imagining a situation in which a compiler would be generating those._

Well, the entire debate is about generating architecture-independent code, so
that's a good example. It should be possible to generate optimized code for
AMD and Intel using the same compiler. In practice, it doesn't really happen.

GCC is a compiler implemented in C. _In C_. Anyone who took their undergrad
Compiler Construction course should know what I mean here. I mean seriously,
tossing around ASTs in C? I'd bet implementing an optimization in GCC takes 10
times the concentration and time it would in ML.
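
To make that concrete, here's what a toy expression AST and evaluator look
like hand-rolled in C; the comment shows the ML equivalent in a handful of
lines:

    /* The ML version fits in a few lines:
     *   type expr = Num of int | Add of expr * expr | Mul of expr * expr
     *   let rec eval = function
     *     | Num n -> n
     *     | Add (a, b) -> eval a + eval b
     *     | Mul (a, b) -> eval a * eval b
     * In C you hand-roll the tagged union and get no exhaustiveness
     * checking from the compiler. */
    #include <stdlib.h>

    enum expr_kind { EXPR_NUM, EXPR_ADD, EXPR_MUL };

    struct expr {
        enum expr_kind kind;
        union {
            int num;
            struct { struct expr *lhs, *rhs; } bin;
        } u;
    };

    static int eval(const struct expr *e) {
        switch (e->kind) {
        case EXPR_NUM: return e->u.num;
        case EXPR_ADD: return eval(e->u.bin.lhs) + eval(e->u.bin.rhs);
        case EXPR_MUL: return eval(e->u.bin.lhs) * eval(e->u.bin.rhs);
        }
        abort();  /* unreachable, but C won't prove it for you */
    }

    int main(void) {
        struct expr two   = { .kind = EXPR_NUM, .u.num = 2 };
        struct expr three = { .kind = EXPR_NUM, .u.num = 3 };
        struct expr sum   = { .kind = EXPR_ADD, .u.bin = { &two, &three } };
        return eval(&sum) == 5 ? 0 : 1;  /* 2 + 3 */
    }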

~~~
1amzave
> _Well, the entire debate is about generating architecture-independent code,
> so that's a good example. It should be possible to generate optimized code
> for AMD and Intel using the same compiler. In practice, it doesn't really
> happen._

Wait, that's a _good_ example? I still don't see how a compiler would (or
should) care about VMM-acceleration ISA extensions. If I were writing a VMM
and wanted to use those, how could I possibly express that without dropping
into assembly?

~~~
Locke1689
No, not VM code -- architecture-independent arbitrary code. For any given C
code, your compiler should be able to generate optimized code for both the
Intel and AMD architectures. This doesn't actually happen.

Sorry about the wording; I wrote that comment last night at 4 am.

------
tedunangst
If the intention was to link to that post, it's way the hell down the page.

<http://www.agner.org/optimize/blog/read.php?i=82#82>

~~~
yuhong
Oops, sorry.

------
vilda
There are some inaccuracies in the text.

The truth is, AMD could not copy Intel's 64-bit architecture. Intel created a
separate company to circumvent the licensing agreement that grants AMD access
to the x86 instruction set.

Next, amd64 is not compatible with ia32, BUT it can run both ia32 and amd64
code almost seamlessly AND with no major performance impact. Note that ia64
can also run ia32 code, but with a notable performance hit (justifiable only
by commercial interests).

------
chubs
It's unfortunate that the x86 camp is splintering, with developers caught in
the crossfire, unable to find a compiler that works well on all x86
processors, while the world drifts towards ARM...

~~~
maximilianburke
I wouldn't say that. ARM is as splintered as the x86 world, if not more so,
with processors implementing some combination of
softfp/VFP/NEON/Thumb/Thumb2/ThumbEE/Jazelle, not to mention multiple ABIs.

Yes, the x86 camp may have different feature sets, but they will be handled
like all previous extensions to the architecture: the features will either be
expected to exist (for single deployment targets), be detected via CPUID so
that code using them can be selected appropriately at runtime, or be ignored.
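
A sketch of that pattern, using SSE4.1 as a stand-in for "some extension" and
GCC/Clang's <cpuid.h> (a real optimized path would of course use the
extension's instructions):

    /* Sketch: probe CPUID once at startup, then route calls through a
     * function pointer to the optimized or the generic path.
     * SSE4.1 is CPUID.1:ECX bit 19. GCC/Clang only (<cpuid.h>). */
    #include <stdio.h>
    #include <cpuid.h>

    static int sum_generic(const int *v, int n) {
        int s = 0;
        for (int i = 0; i < n; i++) s += v[i];
        return s;
    }

    /* In real code this would use the extension's instructions; here it
     * only differs by name so the example stays portable. */
    static int sum_sse41(const int *v, int n) {
        return sum_generic(v, n);
    }

    static int (*sum)(const int *, int);  /* chosen once at startup */

    static void pick_impl(void) {
        unsigned a, b, c, d;
        if (__get_cpuid(1, &a, &b, &c, &d) && (c & (1u << 19)))
            sum = sum_sse41;   /* extension present: fast path */
        else
            sum = sum_generic; /* extension absent: ignore it */
    }

    int main(void) {
        int v[] = { 1, 2, 3, 4 };
        pick_impl();
        printf("%d\n", sum(v, 4));
        return 0;
    }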

------
derleth
It will be interesting to see if this complexity affects closed-source
software more than open-source software, or if AS/400-style
bytecode-compilation schemes catch on again.

(The AS/400 (now iSeries) world compiles COBOL and RPG source to a very stable
bytecode that has been the same for the lifetime of the system, AIUI. When the
program is first run, the bytecode is automatically compiled to machine code,
and the machine code is stored alongside the bytecode; all subsequent runs
either just use the machine code, or regenerate the machine code if the
bytecode is newer and then run that. When moving software to a new system, the
machine code is left behind and only the bytecode (and, possibly, sources)
gets moved. This works well enough that IBM has been able to migrate AS/400
people from CISC to RISC hardware without any more pain than using OS/400
stuff usually entails.)
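
A rough sketch of that policy -- the compile and run steps are hypothetical
placeholders, and the core logic is just a timestamp comparison:

    /* Sketch: native code lives next to the bytecode and is regenerated
     * whenever the bytecode is newer (first run, or after a move to a
     * new machine). compile_bytecode()/run_native() are hypothetical
     * placeholders for the real code generator and loader. */
    #include <stdio.h>
    #include <sys/stat.h>

    static void compile_bytecode(const char *bc, const char *native) {
        printf("translating %s -> %s for this CPU\n", bc, native);
    }

    static int run_native(const char *native) {
        printf("running %s\n", native);
        return 0;
    }

    int run_program(const char *bc, const char *native) {
        struct stat sb_bc, sb_nat;

        if (stat(bc, &sb_bc) != 0)
            return -1;  /* no bytecode: nothing to run */

        /* First run, or the bytecode is newer than the cached native
         * code: compile for *this* hardware before running. */
        if (stat(native, &sb_nat) != 0 || sb_bc.st_mtime > sb_nat.st_mtime)
            compile_bytecode(bc, native);

        return run_native(native);
    }

    int main(void) {
        return run_program("payroll.bc", "payroll.native");
    }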

~~~
axman6
I may be wrong, but I think that this is theoretically possible using LLVM. If
you just translate the source into LLVM IR without any optimisation, then all
of that can be left until runtime if necessary, which is pretty neat.
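
A rough sketch of how that could look with the stock LLVM tools (clang, opt,
lli): ship unoptimised IR and leave all the optimisation for the target
machine.

    /* Sketch of the scheme, assuming the standard LLVM tools:
     *
     *   clang -O0 -emit-llvm -S prog.c -o prog.ll   # ship unoptimized IR
     *   opt -O3 prog.ll -o prog.bc                  # optimize on the target
     *   lli prog.bc                                 # JIT and run it there
     *
     * prog.c -- deliberately naive so all the work is left to opt: */
    int square(int x) { return x * x; }

    int main(void) {
        int total = 0;
        for (int i = 0; i < 100; i++)
            total += square(i);  /* opt can inline and simplify this */
        return total == 328350 ? 0 : 1;
    }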

Though I'm not sure whether languages impose restrictions on the IR that might
hinder this; I know that GHC had to implement its own calling convention for
LLVM to make it run efficiently. There's also the fact that you need the whole
program translated into the IR; using GHC as an example again, this is a
problem because the runtime system is written in C.

I shall have to do more reading about the AS/400, I think; it sounds
interesting, especially the easy CISC -> RISC transition.

