The PowerPC Compiler Writer's Guide (1996) [pdf] (cr.yp.to)
95 points by tjalfi 21 days ago | 31 comments



(submitter)

Henry Warren, the author of Hacker’s Delight[0], is one of the authors of this book.

Chapters 3, 5, and Appendix D have some neat examples of bit twiddling code in Power assembly language.

The content overlaps with that of Hacker’s Delight and Bit Twiddling Hacks[1].

[0] https://en.wikipedia.org/wiki/Hacker%27s_Delight

[1] https://graphics.stanford.edu/~seander/bithacks.html


Great document for those with an interest in PowerPC assembly and still generally applicable to modern Power ISA. Some of these code sections appear in the TenFourFox JIT to this day.


I just fixed an inline assembly bug in LLVM for PPC Linux kernel support: https://reviews.llvm.org/D81767

I tried to see if this doc had anything about calling conventions for 64-bit parameters on 32-bit ILP32, but alas... (at least a cursory skim of `A.1 Procedure Interfaces` came up empty).

Another fun fact I learned about PPC assembly for that bug is that without `-mregnames`, it can be hard to distinguish between registers and constants in compiler generated PPC assembly. Official docs from IBM (https://developer.ibm.com/technologies/linux/articles/l-ppc/) say "Get used to it. :)" LOL


What was your question on the parameters? Is this SysV or PowerOpen ABI?


I've heard of OpenPower, but PowerOpen? I guess SysV since we're referring to Linux (though I'm not sure that the kernel strictly adheres to SysV).


For 32-bit PowerPC, PowerOpen is the ABI used in PPC Mac OS and OS X (largely, with a couple minor differences) and AIX. The System V ABI is pretty much everything else, but it's really a misnomer since the *BSDs on 32-bit PowerPC use SysV too.

PowerOpen is still a thing for 64-bit, see https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-...

If memory serves, on 32-bit you would pass a 64-bit value in adjacent 32-bit registers in big-endian fashion, i.e. the high word in the lower-numbered register.
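To make the "adjacent registers, big-endian fashion" idea concrete, here is a minimal C sketch of how a 64-bit value splits into two 32-bit halves. The `reg_pair` struct and the function names are hypothetical, made up for illustration; which actual registers form a pair, and any alignment rules for the pair, are up to the ABI documents and are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for two adjacent 32-bit argument registers.
 * "first" is the lower-numbered register, "second" the next one. */
struct reg_pair {
    uint32_t first;
    uint32_t second;
};

/* Big-endian fashion: the most significant word goes in the
 * lower-numbered register, the least significant word in the next. */
static inline struct reg_pair split_u64_big_endian(uint64_t value)
{
    struct reg_pair p;
    p.first  = (uint32_t)(value >> 32);          /* high word */
    p.second = (uint32_t)(value & 0xFFFFFFFFu);  /* low word  */
    return p;
}

/* What the callee conceptually does to get the 64-bit value back. */
static inline uint64_t join_u64_big_endian(struct reg_pair p)
{
    return ((uint64_t)p.first << 32) | p.second;
}

/* Sanity check: splitting and rejoining must be lossless. */
static inline void check_round_trip(uint64_t value)
{
    assert(join_u64_big_endian(split_u64_big_endian(value)) == value);
}
```

With this sketch, `split_u64_big_endian(0x0123456789ABCDEF)` puts `0x01234567` in `first` and `0x89ABCDEF` in `second`.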


Amazing, that package is still being maintained!


I just wish Classilla got some love...

(purely selfish - I have no nostalgia for early OS X but classic MacOS was where all the fun was and that's what my PowerBook G3 boots in to)


Yes, I regret it as well, but even getting Classilla to TLS 1.1 would be a big deal, and 1.2 (let alone 1.3) would require a lot of rewriting. $DAYJOB is crazy at the moment, and keeping TenFourFox up with Mozilla's monthly release cadence is taking time away from me doing more Power ISA work on mainline Firefox for the Talos II I'm typing this on (let alone other fun projects I'd like to get to). Classilla just loses out because of the amount of work needed to fix it and the very small amount of cycles I have available.


I totally understand, the fact that it works as well as it does is a miracle already! Thanks for your hard work!


Thanks for all your hard work!


I just got a G4 Mac mini and I'm working on getting OS 9 running on it (not officially supported). It's been fun to tinker with so far.

I need to set up stunnel or an Nginx proxy to deal with the lack of SSL support. That's up next I think.


The HTTPS explosion was great and needed, but it ended most of the fun for classic machines.


FYI: https://www.eejournal.com/article/ibm-gives-away-powerpc-goe...

You could, in theory, run code generated by your own compiler on a PPC chip you designed and built with your own hands and brain.

Random thought: would it be feasible to implement PPC on an FPGA, as has been done with RISC V?

Anyone know offhand what’s the most powerful PPC system you can readily buy? I’m guessing it’s one of the PPC Mac machines.


I don't know if there's something higher up that you'd qualify as "can readily buy", but a [Raptor Blackbird](https://www.raptorcs.com/content/BK1B01/intro.html) would handily outclass the legacy PPC Macs, and a [Talos 4U server](https://secure.raptorcs.com/content/TL2SV2/purchase.html) (with two CPUs up to 22 cores each) would pretty much crush them.

Admittedly the latter is a $15k machine, which may be pushing what you had in mind for "can readily buy", but it's for sale...


This 16-core (2x8) Talos II is around $8ish-K, which is still a lot, but I feel I got my money's worth out of it. 64 threads is good times (each core is SMT-4). I run a "stripper" spec for the Blackbird, which was an experiment to see how low it could go; it came to around $2100, but I would advise going a little higher spec than I did (single-4, no GPU).

The T2 is a great machine, though. There's just no comparison to the Quad G5 sitting next to it. It's quieter, it uses less power and it doesn't feel like I'm lacking for CPU. I got a lot of wear out of the G5 and I'll never get rid of it, but if you don't need 32-bit and/or to run OS X on the metal, the best Power workstation is a Raptor.


PPC on an FPGA happened in the early 2000s

https://en.wikipedia.org/wiki/PowerPC#32-bit_PowerPC

... Xilinx, FPGA maker, embedded PowerPC in the Virtex-II Pro, Virtex-4, and Virtex-5 FPGAs

https://en.wikipedia.org/wiki/Virtex_(FPGA)#Virtex-II

... Xilinx introduced Virtex-II family in January 2001 on 150 nm process technology,[14] and Virtex-II Pro family in March 2002 on 130 nm process technology.


> would it be feasible to implement PPC on an FPGA

The actual silicon-proper, not sure, but one could definitely implement the ISA.

If I'm not mistaken IBM haven't given away any actual synthesizable logic to use.


Do you specifically mean PowerPC and not the POWER ISA?

If not, there's at least https://github.com/antonblanchard/microwatt

And the most powerful POWER system you can buy would be something based on the POWER9 processor.


> would it be feasible to implement PPC on an FPGA

Why not? It is well specified so should be quite doable.


Depends on your definition of readily. IBM sells brand new POWER machines, but they are… not cheap.


Even if you show up with "take my money" IBM won't. They're out of the end user sales game; they want contracts.

Your best bet is a VAR (that's how I got my POWER6 when IBM wouldn't sell me a POWER7), or a company like Raptor, who actually sells retail.


Aaaah, the Power ISA... If you want to "rotate left immediate speculatively if condition matches additionally push sign bit on the stack and zero extend the next register unless it has an odd value", this is the most likely architecture to do that for you ;-) At least it's the one where I most often need to check the manual to see what slxwimzus (or something like that) exactly does when looking at disassembly. Of course, most compilers don't do the freaky stuff, so I'm probably quite lucky :)


https://developer.ibm.com/technologies/linux/articles/l-ppc/:

“The rotate instructions (like the rlicr seen here) are notoriously complicated, and having jokingly been called Turing-complete”


Copying something quoted on HN ~4yrs ago:

"IBM has a well-known disdain for vowels, and basically refuses to use them for mnemonics (they were called on this, and did "eieio" as an instruction just to try to make up for it)."

- Linus Torvalds, 2009


I could swear it also has a `mkitso` mnemonic.


Don't think so.

There is "stfsux" though.

I like rlwinm, which takes five arguments. It can do so many things that there are something like 12 alternative mnemonics for simplicity.

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/as...
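For anyone wondering how one instruction covers that many cases: `rlwinm RA,RS,SH,MB,ME` rotates RS left by SH bits and then ANDs the result with a mask of ones running from bit MB through bit ME, in IBM bit numbering where bit 0 is the most significant. Below is a rough C emulation as a sketch. The helper names (`rotl32`, `ppc_mask32`) are my own, and the function returns the value that would land in RA rather than taking it as a fifth argument.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by sh bits (sh in 0..31). */
static inline uint32_t rotl32(uint32_t x, unsigned sh)
{
    assert(sh < 32);
    return sh == 0 ? x : (uint32_t)((x << sh) | (x >> (32 - sh)));
}

/* Mask of ones from bit mb through bit me, IBM numbering (bit 0 = MSB).
 * If mb > me the mask wraps around: bits mb..31 plus bits 0..me. */
static inline uint32_t ppc_mask32(unsigned mb, unsigned me)
{
    assert(mb < 32 && me < 32);
    uint32_t from_mb = UINT32_C(0xFFFFFFFF) >> mb;                    /* bits mb..31 */
    uint32_t upto_me = (uint32_t)(UINT32_C(0xFFFFFFFF) << (31 - me)); /* bits 0..me  */
    return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Value written to RA by rlwinm RA,RS,SH,MB,ME. */
static inline uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask32(mb, me);
}
```

The simplified mnemonics then fall out as particular argument choices. For example, `slwi ra,rs,8` is `rlwinm ra,rs,8,0,23`, so `rlwinm(0x12345678, 8, 0, 23)` gives `0x34567800`; `srwi ra,rs,8` is `rlwinm ra,rs,24,8,31`, so `rlwinm(0x12345678, 24, 8, 31)` gives `0x00123456`.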


""""Reduced"""" Instruction Set Computer?


From the Wikipedia page for RISC:

"""

Instruction set philosophy

A common misunderstanding of the phrase "reduced instruction set computer" is the mistaken idea that instructions are simply eliminated, resulting in a smaller set of instructions.[2] In fact, over the years, RISC instruction sets have grown in size, and today many of them have a larger set of instructions than many CISC CPUs.

[Snip]

The term "reduced" in that phrase was intended to describe the fact that the amount of work any single instruction accomplishes is reduced—at most a single data memory cycle—compared to the "complex instructions" of CISC CPUs that may require dozens of data memory cycles in order to execute a single instruction.[23] In particular, RISC processors typically have separate instructions for I/O and data processing.[citation needed]

The term load/store architecture is sometimes preferred.

"""


IOW, it's a Set of Reduced Instructions rather than a Reduced Set of Instructions? (aka (RI)SC rather than R(IS)C?)


This is a great book, and it's good to see it preserved and available in PDF form. A lot of it is applicable to other architectures, and it gives a good view into how compiler engineers think.



