IBM Open-Sources Power Chip Instruction Set (nextplatform.com)
587 points by Katydid on Aug 20, 2019 | 277 comments

I have so many questions:

- Where can I get the ISA specification?[1]

- Where can I get a compiler?

- Is there a link to the "softcore model"?

With RISC-V you can start very simple and small (micro-controller) and work your way up in understanding and implementation to a very large core (application class). POWER is a monster of an architecture, designed more for "big iron". I guess that might limit the "hobbyist" factor RISC-V has.

1. This I think, all 1200 pages of it: https://openpowerfoundation.org/?resource_lib=power-isa-vers...

I own a Talos II (https://www.raptorcs.com/TALOSII/) computer. It runs an official port of Debian (https://wiki.debian.org/PPC64), which includes a compiler.

Fedora [0], Red Hat [1], Ubuntu [2], and SUSE [3] all have their own ppc64le ports as well so there are lots of choices out there if anyone is interested.

Even Gentoo has one [4][5]!

[0]: https://alt.fedoraproject.org/alt/

[1]: https://access.redhat.com/documentation/en-us/red_hat_enterp...

[2]: https://ubuntu.com/download/server/power

[3]: https://www.suse.com/products/power/

[4]: https://wiki.gentoo.org/wiki/Handbook:PPC64

[5]: https://www.gentoo.org/downloads/

Fedora 30 on this Talos II. Works well.

Void Linux as well: https://www.talospace.com/2019/01/void-linux-goes-power9.htm... Although I don't think it's official at this point.

No, though my impression is it's progressing pretty well, so I think it will get there.

I have drooled over the Talos II for quite some time...

Do you have a particular use case that makes POWER make sense over x86, or do you share my paranoia and love of non-mainstream ISAs?

Use of GPUs. Not on the Talos II, it seems (?), but with POWER, GPUs are first-class citizens on the system, with the same NVLink 2.0 access to main memory as the CPU: 150 GB/sec in each direction (simultaneously!)

Actual GPU use on Talos seems to be problematic, judging from their wiki page. The CUDA use case is supported, but that bandwidth seems too high. Or are you quoting some future number? The current bandwidth on a P9 system with NVLink is closer to 30 GB/s. And I don't think Talos supports NVLink.

Are all accesses to the memory from the GPU still checked for permissions at the hardware level by an IOMMU?

Yeah, checked for permissions in hardware, but not by an IOMMU. Requests from the GPU are forwarded to the "standard" SMMU. See http://www.ieee-hpec.org/2018/2018program/index_htm_files/13...

I don't especially, because my Talos II is "just" my desktop. I want a computer I can trust and that I know what it's doing from the ground up. It was already the best choice for that and today's announcement made the choice even better.

> I know what it's doing from the ground up

Do you now? There is not even a hidden embedded micro-core running a "secure operating system"?

You can audit the firmware and build it yourself. I did it. Raptor even encourages it: https://wiki.raptorcs.com/wiki/Compiling_Firmware

The biggest problem remaining is whatever blobs are in devices. That's being rapidly worked on.

It's worse than that:


You can't trust any modern computer to not be subverted. So, you have to change how you use them. True secrets should be kept out of computers or rooms with technology. Go old school.

> Go old school

OK. How?

Interesting. How do you find it as a desktop? I'd read in reviews that it's incredibly loud, so more suited for datacenter than office or home use, but maybe it's not as bad as I'd gathered?

This is a very early unit (#12) and the initial firmware was indeed deafeningly loud. However, the current firmware is whisper quiet, certainly much quieter than the Quad G5 next to it (and the G5 is throttled down), and I also have super-quiet power supplies installed. I find it perfectly liveable.

I've had a system with two quad-core CPUs running at 100% load under my desk for many days, whisper-quiet.

For me, the biggest problems are the long Hostboot boot time and the lack of suspend-to-RAM.

I share and respect your paranoia. I have a love of inspecting code and not having backdoors in my processor.

For sure. It's nice that open source firmware replacements have been making progress, especially since the Intel ME fiasco, but it never ceases to amaze me that right now you can go out and get a modern, ultra high performance workstation with every single chip running auditable firmware. Hopefully we will start seeing more affordable POWER systems now that it is a fully open architecture

> with every single chip running auditable firmware

But disks? Isn’t their firmware closed?

What you can do with a trusted CPU domain is use FDE. FDE is standard practice for anyone even remotely concerned about security in the first place.

So the firmware that matters -- the firmware that can subvert the system due to privilege level, etc. -- is open. No other vendor aside from some lower end ARM toy SoCs can say that.

Maybe OpenSSD would be functional enough to use:



> But disks? Isn’t their firmware closed?

Encrypt your data in-memory with a file system feature (or something like LUKS/dm-crypt) before it's sent down the SATA cable to the disk.
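A minimal sketch of that approach with LUKS on a loop-backed file (the commands are from the real cryptsetup CLI, but the passphrase, image path, and sizes here are made-up illustrations, and the script skips itself when run without root or without cryptsetup installed):

```shell
#!/bin/sh
# Sketch: LUKS/dm-crypt on a loop-backed file, so only ciphertext ever
# reaches the backing "disk". Passphrase and sizes are made up.
IMG=/tmp/demo-disk.img
if [ "$(id -u)" = "0" ] && command -v cryptsetup >/dev/null 2>&1; then
    dd if=/dev/zero of="$IMG" bs=1M count=64 status=none   # fake 64 MiB disk
    printf '%s' "correct horse battery staple" | \
        cryptsetup luksFormat --batch-mode "$IMG" --key-file -
    printf '%s' "correct horse battery staple" | \
        cryptsetup open "$IMG" demo --key-file -
    mkfs.ext4 -q /dev/mapper/demo      # the plaintext fs lives on the mapping
    cryptsetup close demo              # after this, only ciphertext is on disk
    rm -f "$IMG"
    result="done: only ciphertext was written to the image"
else
    result="skipped: needs root and cryptsetup"
fi
echo "$result"
```

The disk (and its firmware) only ever sees the encrypted block device, which is the point of the comment above.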

The NSA has gone after disk firmware:

* https://www.theregister.co.uk/2015/02/17/kaspersky_labs_equa...

Ugh I guess there always has to be an exception. Maybe you could run everything in a ramdisk? It supports up to 2TB of ram after all

Isn't the price tag pretty amazing too?

Well yeah it is hardly cheap but for the people who can afford it, I totally get how having a computer you can trust is worth the price tag

Most people won't pay that kind of money just to tinker with it.

> Most people won't pay that kind of money just to tinker with it.

The interesting question rather is: how many of these simply cannot afford it and how many think that this is not worth it?

I looked at the Talos website. They're asking 2-4k$ for a 4 core (4 way SMT, so let's say 8 core when comparing to x86_64 just to be nice) dev desktop, with 8 to 16 GB ram. The same spend on a Dell Xeon workstation nets quite a bit more hardware.

While not exactly cheap in absolute terms, there is the Blackbird from Raptor. It's a single-CPU board, cheaper than Talos.

I wouldn't drool over it. See the latest benchmarks comparing it to epyc and Intel. Power9 does pretty poorly throughout almost every test:


Both of those processors insist on you ceding full system control to the vendor in perpetuity, with a literal "skeleton key" that lets the vendor in and keeps you out (the centrally signed, unremovable ME/PSP). If this doesn't concern you, then why are you looking at a local machine at all when a cloud system may very well be less expensive to lease than to purchase and keep current, not to mention run, local hardware? Unless you're loading the local machine 24/7, you're leaving a resource sitting idle for parts of the day without any real increase in security or control, meaning the cloud vendor can give you a cheaper experience overall by keeping hardware utilization over time high.

And no, me_cleaner does NOT (and cannot) fully remove a modern ME. The PSP "disable" toggle in the UEFI configuration does NOT stop the PSP from running during startup.

> why are you looking at a local machine at all when a cloud system may very well be less expensive to lease than to purchase and keep current, not to mention run, local hardware?

Because a cloud machine is rented and not owned. And because of the ping latency: there is a reason why there is for example still hardly any cloud gaming.

And what exactly do you call a machine that you are, by design, cryptographically locked out of, but a third party has access to?

Put another way, would you call a car that I kept duplicate keys and retained title for, but said you could use and maintain at your sole expense for a single upfront payment, rented or owned?

Latency is being solved; Google etc. are working on that problem. I'm playing devil's advocate here, but fundamentally, if you don't care about actually controlling or being able to modify something, and renting is cheaper, why own?

> And what exactly do you call a machine that you are, by design, cryptographically locked out of, but a third party has access to?

Not a perfect solution, but such a problem can be mitigated by a firewall that blocks such incoming/outgoing packets.

> I'm playing devil's advocate here, but fundamentally, if you don't care about actually controlling or being able to modify something, and renting is cheaper, why own?

Since I love to tinker with my computers, the answer is obvious to me.

I think this is apples and oranges. Sure, if those kinds of things are that important to you, then POWER9 is your only option. But if performance is important, POWER9 is a long way from being the best. Most companies likely don't care about the things you're suggesting.

My understanding is that it is not really useful to compare Power9 to other cpus in these types of benchmarks, that Power9 is all about computation with massive datasets, not how fast it can zip a file.

> My understanding is that it is not really useful to compare Power9 to other cpus in these types of benchmarks, that Power9 is all about computation with massive datasets, not how fast it can zip a file.

Your understanding is wrong. For instance, running Java workloads on servers is a major Power9 use case.

The thing to remember, though, is that the Talos is only two four-core CPUs, for eight total. These benchmarks are comparing it to the Epyc 7742, which is a 64 core chip.

Naturally the Epyc will kill it on most highly threaded benchmarks. The individual cores on Power9 are quite fast, though.

> The individual cores on Power9 are quite fast, though.

Are there any benchmarks for single-thread performance there that I could see?

If I may ask, why did you get it? I like the cool non-x86 factor, but it's quite expensive...

EDIT: Forgot to mention the open argument which is quite amazing as well (I've followed what Talos does).

I tend to buy server level hardware for my own usage. It tends to last a LOT longer. With that in mind, it was roughly comparable to what I would have paid for a comparable Intel Xeon, and I like the fact that I know all of the code that runs on it (the only code that I can't actually change is the OTP memory that it first executes when it boots up, and even then you can inspect it!).

Thanks for supporting the development of open hardware and software. Very few of us can afford to do so.

They actually came out with a Blackbird, and I have been considering getting one of them to replace my server (it runs FreeNAS, but I have gotten my Debian system to run an encrypted ZFS drive).

The 18-core has decent performance. If it weren't for AMD EPYC Rome chips coming out a month ago, I would have considered a Talos II.

18-cores with 4x SMT == 72 threads per Power9. That's a lot of threads, no matter how you look at it.

That's a whole lot of SMT. Can anyone comment on how it behaves compared to hyperthreading? I'm assuming at 4x each core must have a ton more execution units to go around

Power9 is basically "Bulldozer done right". Each SMT4 Power9 core is incredibly fat, with 4x load/store units, 4x ALUs, and 2x vector units. Bulldozer probably would have called each SMT4 core a collection of 4 cores.

But only 1x divider, 1x crypto unit per SMT4 core.

The chief downside to Power9 is that it only supports 128-bit vectors, and these 128-bit vectors are executed by ganging-together the ALU units. (so 4x 64-bit ALUs == 2x 128-bit vectors processed per clock tick). Compared to AMD Zen (4x 128-bit pipelines), AMD Zen 2 (4x 256-bit pipelines), and Intel Skylake-X (3x 512-bit pipelines), Power9's SIMD capabilities are tiny.
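Back-of-the-envelope from the pipeline counts above, counting 64-bit lanes retired per clock per core (these are the figures quoted in this comment, not official specs):

```python
# 64-bit SIMD lanes per clock per core, from the pipeline counts above.
def lanes_per_clock(pipelines: int, width_bits: int) -> int:
    return pipelines * width_bits // 64

chips = {
    "POWER9 SMT4 core": lanes_per_clock(2, 128),  # 4x 64-bit ALUs ganged into 2x 128-bit
    "AMD Zen":          lanes_per_clock(4, 128),
    "AMD Zen 2":        lanes_per_clock(4, 256),
    "Intel Skylake-X":  lanes_per_clock(3, 512),
}
for name, lanes in chips.items():
    print(f"{name}: {lanes} x 64-bit lanes/clock")
# prints 4, 8, 16, and 24 lanes respectively
```

Which is the "tiny" gap in a single number: 4 lanes per clock versus Skylake-X's 24.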

Another oddity: most instructions take two clock ticks to execute, even simple ones like XOR or add. This increased latency is likely the reason why it performs so poorly on Python / PHP code.

But when code is written for Power9, it works quite well. Stockfish chess seems to work extremely well on Power9, likely because Stockfish scales to many "cores" well (fully taking advantage of SMT4), and only has 64-bit operations.

One more wildcard: Power9 has 10MB (!!!) L3 cache for every 2-cores. That's 90MB L3 cache on the 18-core. I presume that real-life database applications would benefit greatly from this oversized L3 cache.

EDIT: It should be noted that the L3 caches serve as victim-caches of other L3 caches. So Power9 core-pair 01 can have its 10MB L3 cache serve as a "L3.1 cache" of core-pair 23. AMD Zen / Zen2 L3 cache CANNOT use this functionality. So AMD Zen2 64-core may have 128MB of L3 cache, but each core only "really" can go up to 16MB of L3 cache (because the other 112MB of L3 cache is only for other cores/module)

EDIT: Also note, Power9 came out a few years ago at 14nm, while Zen2 came out on the 7nm node a month ago. I think a new 7nm Power9 update is planned, but I don't know what its timeframe is.

In effect, you could have 1-program using the entire 90MB L3 cache for itself on Power9. While AMD Zen2 requires (at minimum) 8-programs, each program using only 16MB L3. This design decision is clear in the intended use of the chips: Zen2 is clearly targeted at the cloud-market, while Power9 is big-iron / databases.


Unfortunately, most of the benchmarks these days show that AMD EPYC / Rome is just the better overall processor. Still, 18-core Power9 is relatively cheap: a complete 18-core / 72-thread system for $4000ish: https://secure.raptorcs.com/content/TLSDS3/purchase.html

Cheap for Power9 anyway. AMD EPYC is also relatively cheap. You can get a 16-core / 32-thread / 32MB L3 cache AMD Ryzen 9 3950x for only $700 these days (and maybe a complete system build for only $2500).

I don't think "bulldozer done right" is the correct way to describe POWER9.

I see it more as a single big massively wide OoO core with 23 execution units (putting skylake's 10 execution units to shame). The slices are more there for design reasons, to simplify the design process by making it more symmetrical.

Bulldozer is clearly two integer cores sharing some execution units between them, a thread can only exist on one of the two integer units.

In contrast, a thread on POWER9 can simultaneously use all 4 slices, all 23 execution units. The dispatcher can dynamically mix and match which slice it's sending a thread's instruction stream to based on slice utilization.

That single difference puts it in a completely different class of CPU architecture to Bulldozer.

> In contrast, a thread on POWER9 can simultaneously use all 4 slices

My reading of the documentation is different.

> The most significant partitioning related to threads occurs when more than two threads are active, placing the core in SMT4 mode. In SMT4 mode, the decode/dispatch pipeline, shown in the blue shaded area in Figure 25-1 on page 321, is split into two pipelines, each pipeline is three iops wide and each pipeline serves two threads. The split decode/dispatch pipes each feed one of the two superslices, shown in the green shaded box in Figure 25-1, providing two execution slices for each pair of threads. The branch slice and LS-slices are shared between all threads.

Page 322 of 496: https://ibm.ent.box.com/s/8uj02ysel62meji4voujw29wwkhsz6a4


The left superslice serves two threads, while the right superslice serves the other two. All four threads are "behind" the single decoder.

It seems very "Bulldozer-esque" to me, especially in SMT4 mode.


You are correct in that there is an SMT1 mode where one-thread could potentially utilize the entire processor. But with 2-latency on even Add / XOR instructions (see Appendix A), I don't foresee SMT1 code to be very useful on Power9. The processor is clearly designed to run most effectively on SMT2 or SMT4 modes.

I'm not even sure how easy or hard it is to switch between SMT1, SMT2, and SMT4 modes. I don't know whether Linux can switch modes while running; it may need a reboot, for instance. Maybe AIX can switch between the modes on the fly?
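For what it's worth, my (unverified) understanding is that the powerpc-utils package on Linux ships a ppc64_cpu tool for exactly this. A sketch, which just degrades to a message on non-POWER hosts:

```shell
#!/bin/sh
# Sketch: runtime SMT mode switching on Linux/ppc64 via powerpc-utils.
if command -v ppc64_cpu >/dev/null 2>&1; then
    ppc64_cpu --smt              # show the current mode, e.g. "SMT=4"
    # ppc64_cpu --smt=2          # drop to SMT2 (needs root)
    # ppc64_cpu --smt=on         # back to the core's maximum
    status="ppc64 host"
else
    status="not a ppc64 host; ppc64_cpu unavailable"
fi
echo "$status"
```

If that tool works the way I remember, no reboot should be needed, but I'd treat this as a hedged sketch until someone with a POWER9 box confirms.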

I guess if your code has enough instruction-level parallelism (ILP) in its instruction stream, it could benefit from SMT1 mode. But I'd imagine that most 64-bit CPU code wouldn't have much ILP.

It's worth noting that in SMT2 mode, it's still 2 threads dynamically scheduled across all 4 slices.

It's only in SMT4 mode that it starts statically partitioning the threads onto superslices. Even then, it's two threads sharing two slices.

I assume the static partitioning is an optimisation: performance increases due to the split L1d caches (and I'm guessing there is a delay cycle when one slice depends on data from another; I haven't read the documentation that closely).

It's the fact that threads can be dynamically scheduled across all four slices which makes it "not Bulldozer" in my mind, and I don't think the presence of a mode that does statically partition superslices should make it "like Bulldozer", even if that is the most common mode. It's just an optimisation.

> I'm not even sure how easy or hard it is to switch into SMT1 to SMT2 or SMT4 modes.

Ideally, the CPU core would dynamically drop down to SMT1 or SMT2 mode whenever the extra threads are executing idle instructions.

> It's the fact that threads can be dynamically scheduled across all four slices which makes it "not Bulldozer" in my mind, and I don't think the presence of a mode that does statically partition superslices should make it "like Bulldozer", even if that is the most common mode. It's just an optimisation.

Well, it's certainly a Bulldozer-like mode of operation :-)

Power9 is obviously a very different chip than Bulldozer. So I guess it all comes down to opinion, whether or not the chip is similar enough to warrant a comparison.

> EDIT: Also note, Power9 came out a few years ago at 14nm, while Zen2 came out on the 7nm node a month ago. I think a new 7nm Power9 update is planned, but I don't know what its timeframe is.

I believe 7nm POWER10 will be the next move, they had announced Samsung as the partner for their next chips back in December if I remember right.

POWER9 deceptively "came out a few years ago". But in reality, it didn't: the only ones available for a year or so were demo units at IBM. The rest were promoted as part of the Summit supercomputer. Just like AMD's MI50/60 has been "available" since November 2018. But try to search for/buy one. Good luck...

I have a 2009 Mac Pro with dual 3.2 GHz hexcore Xeons (so 24 threads) and 2 older GPUs and 48 GB RAM that cost less than $700 for the whole thing. I'm beginning to think I lucked out on it more than I already thought I did.

Each Nehalem hexcore Xeon is (EDIT) ~120 watts, so your computer will be drawing well over 300W under load, maybe over 500W. (I mean, the Mac Pro 2009 has a 1200W PSU. I presume it's expecting to use around half of that power.)

The Power9 18-core / 72-thread is going to come in at under 150W total.

The main advancement the past decade has been in power-efficiency. Cloud-scale providers keep their computers running at max load as well, so 500W does add up over months / years into a sizable amount of money.
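To put a rough number on "adds up" (the $0.12/kWh electricity price is a made-up assumption; plug in your own rate):

```python
# Rough yearly electricity cost of a machine at constant load.
def yearly_cost(watts: float, usd_per_kwh: float = 0.12) -> float:
    kwh_per_year = watts / 1000 * 24 * 365   # 8760 hours per year
    return kwh_per_year * usd_per_kwh

print(f"500 W box: ${yearly_cost(500):.0f}/year")   # ~$526
print(f"150 W box: ${yearly_cost(150):.0f}/year")   # ~$158
```

So a 350W difference at constant load is on the order of a few hundred dollars a year, before cooling.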

Especially when you consider that 500W computer needs 500W of Air-conditioning, so the "True cost" of a 500W computer is roughly ~1200W or so (500W from the computer, 700W to power an air-conditioner to move 500W of heat)


A 12-core / 24-thread AMD Ryzen 3900x is just $500, with a total system cost under $1500. The big advantage of a Ryzen 3900x would be a max clock rate of 4.7 GHz, while your Nehalem 2009 computer is... what? 2.5 GHz? Probably? And computers of that age didn't have deep sleep capabilities, wasting even more power than usual. Modern computers idle at 20W, even servers and desktops. Tons of power-saving features these days, which add up.

I think a typical $1500 computer these days would be more than twice as fast with 1/4th the power usage. I don't think anybody serious about this hobby should be using anything as old as Nehalem these days.

IMO, the price/performance sweet spot for "old computers" seems to be Haswell (~2014-era servers), if people want to buy old equipment. But 2009 is definitely too old; there are lots of used servers that are a little more expensive but a LOT more power-efficient / faster in practice.

> Especially when you consider that 500W computer needs 500W of Air-conditioning, so the "True cost" of a 500W computer is roughly ~1200W or so (500W from the computer, 700W to power an air-conditioner to move 500W of heat)

I thought air conditioners/heat pumps were supposed to be substantially better than 1w of heat moved outside per watt of electricity?

Hmm... a typical home air conditioner is 15 to 20 SEER, which apparently means 15 BTU/hr of cooling per watt of input.

15 BTU/hr ≈ 4.4 watts, so call it 4-5 watts of cooling per watt of input.

So it appears you are correct: to move 500 W of heat, you only need a bit over 100 W of air conditioner power.
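The conversion, for anyone checking the arithmetic (1 BTU/hr ≈ 0.293 W):

```python
BTU_PER_HR_IN_WATTS = 0.293

def seer_to_cop(seer: float) -> float:
    """Watts of heat moved per watt of electrical input."""
    return seer * BTU_PER_HR_IN_WATTS

print(f"SEER 15 -> COP {seer_to_cop(15):.1f}")           # ~4.4
print(f"SEER 20 -> COP {seer_to_cop(20):.1f}")           # ~5.9
# Input power needed to move 500 W of heat at SEER 15:
print(f"{500 / seer_to_cop(15):.0f} W of input power")   # ~114 W
```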

The Mac Pro is a 4,1 flashed to a 5,1 and uses Westmere 3.3 GHz CPUs, but your point of power consumption is taken. As I can't possibly afford a $1500 PC, I'm still happy with what I've got. A multiseat desktop/server I could afford that is pretty happy with whatever I've thrown at it is a lot better than a bare CPU sitting idly on my desk.

If $600 or $700 is your budget, my main point was to look for Haswell (2014-era) systems.

For example, the Dell PowerEdge R630 (2014-era) server is in and around $600 to $1000 on Ebay, and will be more power-efficient and faster than any 2009-era system.

I think 2014-era servers are where the price/performance point is for the home-server enthusiast, especially if we're talking about sub $1000 price points.


2x 8-core dual-socket Intel Xeon E5-2640 v3 (Haswell) with 64GB of RAM. It's an auction, so it will probably go up another $100 or $200 from there, but I would expect it to sell well south of $1000.

2014-era equipment is the current price/performance king for home hobbyists. Obviously, a modern desktop with all the bells and whistles is a bit more expensive at $1500, but for $600 to $700, you can get a pretty good 2014-era system.


My rule of thumb is to buy something 5-years out of date. That's roughly the time when businesses get rid of old equipment and upgrade. So 5-years old equipment tends to win in price/performance.

I did, oddly enough, look at used PowerEdge servers, but I wanted a multiseat desktop too, so the step-son and I could play games together at the same time. Less than $700 bought the Mac Pro, 2 video cards, 48 GB of RAM (3 sticks) and, not included in my original equipment tally, a 4 TB SSD and 24" AOC monitor. The bare Mac Pro was $250. As I got it early last year, the 5 year rule of thumb almost applied as a 2009 and a 2012 Mac Pro are nearly identical, the former being able to just be flashed to the latter. In another couple of years, if I have any cash to spare, I'll likely get a used PowerEdge, though. The cost of those things, for what you get, is exceedingly good.

Ah right, multiseat desktop.

Well, I guess the Mac Pro is fine for that, as long as you're fine with the Mac OSX operating system. The Mac Pro line hasn't really had many updates, so maybe the 5-year heuristic doesn't really apply.

Linux all the way! OSX doesn't actually do multiseat. So, have a Linux Mac Pro that I can ssh into, or if that's blocked, get a shell or even my desktop in a web browser among other things. All in all, rather happy with it, though I really would like one of those Raptor Power9 boards for the hell of it.

> The Power9 18-core / 72-thread is going to come in at under 150W total.

The TDP on the 18 core (and 22 core as well) is 190W as listed on Raptor’s website.

That's an IBM TDP (i.e. maximum ever power), not an Intel TDP (i.e. maximum power at some arbitrary power state declared as 'base clock speed').

It's going to be highly dependent on the workload. For some it's counterproductive because the working set of fewer threads will fit into a given cache level when more threads won't, and then it slows things down -- but then you can turn it off or run fewer threads per core.

Where it's a big win is for pointer chasing workloads or big databases, where the working set isn't going to fit in cache anyway and then it's effectively like having really fast context switches. You have four threads and three of them are waiting on main memory while you keep the core busy with the fourth, then that thread has a cache miss but by then one of the other threads has the data it was waiting on.
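A toy illustration of the access pattern "pointer chasing" refers to: each load's address comes from the previous load, so hardware prefetching can't hide the memory latency. This is a pure-Python sketch of the pattern, not a benchmark:

```python
import random

N = 1 << 16
random.seed(42)

# Build one big random cycle: next_idx[i] tells you where to go next,
# so every load depends on the result of the previous one.
perm = list(range(N))
random.shuffle(perm)
next_idx = [0] * N
for a, b in zip(perm, perm[1:] + perm[:1]):
    next_idx[a] = b

# Chase the chain: on real hardware this is a cache miss on nearly
# every step, leaving the execution units idle. That idle time is
# exactly what the other SMT threads get to use.
i, steps = perm[0], 0
while True:
    i = next_idx[i]
    steps += 1
    if i == perm[0]:
        break
print(f"chased {steps} pointers")  # visits all 65536 entries
```

Contrast with a sequential array sum, where the addresses are predictable and the prefetcher keeps the core fed.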

That pointer chasing benefit has been my experience on Intel, especially on anaemic low power designs. I'm more curious how/why Intel stops at 2 whereas Sparc/Power can manage much higher numbers. Maybe it's not architectural, but more just about product fit or something

It's probably a combination of target market and trade offs.

To make SMT-4 perform well you want to have larger caches so that cache contention between the threads doesn't become the bottleneck, but that eats a lot of transistors. It's essentially a brute force trade off between performance and manufacturing cost and IBM is more willing to say "damn the cost" than Intel.

There's also the matter of who needs a machine like that. There is a lot of ugly pointer-chasing code in the world, but to take advantage of SMT-4 it has to be well-threaded ugly pointer-chasing code. You basically need a customer that needs their application to scale and is willing to do the bare minimum necessary to make that possible, but not spend a lot of resources actually optimizing the code once they get it to the point that throwing more hardware at it is a viable alternative. That's the enterprise market in a nutshell right there, and that's where IBM lives.

That sounds fascinating. Do you have any examples / study material that describes these programming techniques?

My hyperthreading enlightenment came from discovering a parallelized XML parsing task (using libxml2) running on Atom N2800 (2 cores) absolutely trouncing a similar run on a much beefier Xeon with HT disabled. It came very close to a 2x speedup.

This is what the parent comment means when referring to pointer chasing -- XML documents are a big random access graph in memory, CPU cache and prefetch is close to useless in that environment, so when walking the DOM as part of some parsing task, much of the time is spent waiting on memory, with the execution units lying idle.

OTOH many 'genuinely computational' jobs like say, an ffmpeg encode have very noticeable slowdowns with hyperthreading enabled. In those kinds of jobs where the code is already highly optimized to keep the CPU pipeline busy, there will be contention for the single set of execution units shared by both threads, and so the illusion is destroyed.

As to why it results in a measurable slowdown, someone else would need to answer that, but it is at least conceivable that software overheads to manage the increased task partitioning might account for some of it

> This is what the parent comment means when referring to pointer chasing -- XML documents are a big random access graph in memory, CPU cache and prefetch is close to useless in that environment, so when walking the DOM as part of some parsing task, much of the time is spent waiting on memory, with the execution units lying idle.

Bear in mind that this is only true if you parse with the DOM model. If you care about efficiency and it's at all possible, then the SAX model is much faster: you won't be bound by pointer chasing, as there's very little in memory at once. IME the next big gain comes from eliminating string comparisons with hash values. By that point XML parsing is entirely limited by how fast you can stream the documents.

You can achieve a similar (although I guess not nearly as efficient) effect with DOM, without sacrificing convenience, given a suitable library. For example, the Python lxml library grants access to the tree as it is being constructed; if you are careful not to delete a node it will later modify, it's entirely safe to e.g. parse one element at a time from a big serialized array, then delete the element from its parent container, so memory usage remains constant. By the end of the parse, you're left with a stub DOM describing an empty container.

The advantage is not losing access to lovely tooling like XPath for parsing

(If anyone had not seen this trick before, the key to avoid deleting elements out from under the parser is to keep a small history of elements to be deleted later. For an array, it's only necessary to save the node describing the previous array element)
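For reference, the same trick works with the stdlib's xml.etree.ElementTree.iterparse: grab the root from the first event, then clear handled children so memory stays flat regardless of document size. A minimal sketch with a synthetic document:

```python
import io
import xml.etree.ElementTree as ET

# A stand-in for a big serialized array of records.
doc = "<items>" + "".join(f"<item id='{i}'/>" for i in range(1000)) + "</items>"

seen = 0
# iterparse yields elements as tags open/close; the first event gives
# us the root, which we prune as we go so the tree never grows.
context = ET.iterparse(io.StringIO(doc), events=("start", "end"))
_, root = next(context)
for event, elem in context:
    if event == "end" and elem.tag == "item":
        seen += 1        # ...process the fully-parsed element here...
        root.clear()     # drop handled children; keeps memory constant
print(f"parsed {seen} items")
```

The trade-off against lxml is losing XPath over the pruned parts, as mentioned above.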

I'm not sure I would describe an IO-bound problem as "genuinely computational".

Video encoding is one of the most CPU intensive problems that your average user will encounter.


This is an extreme version of yield on memory access

Have you had any issues with it, or has it required any additional configuration? I'm extremely curious about the real world use of Power chips

can you run AIX on it?

No. AFAIK AIX only runs on PowerVM systems, which none of the OpenPOWER systems are.

Then what's the point if I can't run AIX on it?!?

"Talos™ II 2U Rack Mount Server TL2SV1 Talos™ II 2U Rack Mount Server Starting at $6,089.00"

Not at $6,089.00; they can forget that. It has to cost no more than $500 USD or this will be a repeat of the same mistake Sun Microsystems made. Will these companies ever learn?

One cannot charge enterprise prices if one wants to build an upward spiral. Intel systems dominate because they are dirt cheap and convenient to buy.

You can't find a modern Intel Xeon Gold CPU for less than 2000$. If you buy 2, and a motherboard, you are already in the 6000$ ballpark, and then you still need to buy everything else (PSU, RAM, SSDs, GPGPU, etc.).

I can build (and have) a fully decked-out intel-based 1U server for $1,800 USD, so this Talos thing can't compete: it's not cost-effective no matter how one slices and dices it.

This company is repeating the same mistake IBM, HP, SGI, and Sun made before it.

Those who do not learn from history are doomed to repeat mistakes of those who came before them.

Have you bought one of those Talos systems?

I'll believe you if you can provide a link to two Xeon Gold CPUs costing the same as or less than the $1800 you claim lets you build a full 1U server with two of them.

PowerPC and POWER are relatively mainstream. It's supported by IBM XL, GCC, Clang and most major JITs (including luajit).

Exactly. I can't speak for big iron and server usage, because the last time I used a POWER-based server at work it was still AIX-restricted (though IBM was already aiming at Linux for the future). On PPC, however, about everything user-level was available 15 years ago, including USB device support and compilers. When the very first PPC Mac Mini came out, I purchased one to be used as a living room media PC connected to a projector, running a customized Debian which would load a media player (Freevo) just after boot. Worked like a charm for years, no complaints at all, save for the atrocious loud "boonnnngggg" sound at power-up that I was never able to turn off:^)

> including luajit

Well, a fork of luajit. LuaJIT proper has been abandoned for months…

I thought somebody had taken over as the new maintainer, and then Mike Pall was back. What happened? Did he lose interest?

No new commits since January.


> all 1200 pages of it

Sounds weak, one of the versions of ARMv8 has a spec that's exactly 6666 (!) pages. I would expect IBM to be more detailed lol

Spec or manual? The ARM Spec I've seen is also 1200ish pages whereas the programmers manual is indeed thousands of pages

Ah, they indeed have a shorter spec, but it's at 2611 pages now https://static.docs.arm.com/ddi0596/d/ISA_A64_xml_v85A-2019-...

It's actually a document generated from machine-readable XML files https://alastairreid.github.io/ARM-v8a-xml-release/

IBM doesn't do wacky stuff like Pointer authentication that creeps into every corner of the spec, making everything more complicated …

> POWER is a monster of an architecture

FWIW: the original RS/6000 devices were 20-40 MHz in-order CPUs with architectures objectively simpler than a RISC-V microcontroller like the E310.

> POWER is a monster of an architecture, designed more for "big iron".

It's the same architecture as PowerPC, designed for desktops, isn't it? Have things really changed so much since then?

Yes, in fact if you run FreeBSD on POWER9 currently, it's compiled with ancient gcc 4.x.whatever (the last GPLv2 version) :D (The switch to clang and ELFv2 ABI is going to happen in the coming months)

IIRC PowerPC as a separate architecture doesn't exist anymore. All extensions were folded back into POWER.

I wouldn't be surprised if they have, given that PowerPC desktops haven't been mainstream for more than a decade now, and in the meantime IBM's servers have been marching on.

The last truly mainstream PowerPC desktop was Apple's 2005 Quad-G5 PowerMac. There were other PowerPC machines after this, the PS3 being the most notable. But they were either not desktops or not mainstream.

I'm a hardware nostalgic and have both gathering dust in my basement. So I can't wait for a PowerPC revival of any kind.

Yeah, I've got my share of PowerPC Macs, too (one Powerbook G4, one PowerMac G5, one XServe G5, one eMac G4, all running various versions of OpenBSD). They're really fun machines, and it's a shame Apple decided instead to be yet another x86 vendor.

I also can't wait for a PowerPC revival. Saving up for one of them Talos workstations as my next major hardware purchase (but it's really hard to pull the trigger when the motherboard or CPU alone costs as much as I paid for the entire Threadripper rig I built last year...).

Raptor Talos/Blackbird is a niche, expensive revival, but a revival nonetheless :)

I may have been overly generous with the "of any kind" :). That's a bit on the expensive side and the ecosystem and platform flexibility in terms of upgrade are still pretty slim/locked in. Something that's open and cheap enough to spark general interest would be much more interesting.

The Blackbirds don't look too expensive, and Talos' hardware in general is about as open as it gets (putting even the x86 market to shame, let alone the PowerPC Apple desktops).

It's already here. Get yourself a Blackbird (or a Talos, if you're really going to jump in).

Out of curiosity: is there any limit on the CPU I can stick in one of the Blackbird boards (i.e. can I stick with the lower-end CPU for initial purchase and upgrade to the 22-core monstrosity later)? If so, then that might push me over the edge into investing my next paycheck ;)

In response to your second question, I believe that gcc and llvm both support power.

The toy soft-core VHDL model that is referred to there will be available at https://github.com/antonblanchard/microwatt at some point in the next couple of days.

- Where can I get a compiler?

PGI has a free POWER compiler https://www.pgroup.com/products/community.htm

LLVM and GCC both support POWER.

And some more questions; correct me if I am wrong:

So this is an opening up of the POWER ISA. Since there are quite a few different versions or revisions of it, I assume it is the one being used in POWER9 and the future POWER10?

And it is more like the RISC-V style of open ISA rather than the MIPS one? (I believe POWER was previously opened, but with corporate-protection language all over it.)

And this does not include implementations, like POWER9?

I mean, if all of that is true, then without an implementation, or at least licensing one cheaply, it still doesn't change the market one bit.

I'd like to recommend the friendly people at Oregon State University Open Source Labs [0] who host POWER resources for open source projects. If you're looking to see what the ISA can do on P8 or P9 system, I'd definitely contact them and see if you can get a VM.

There's also a cool vector library [1] that bridges the gap between different versions of the ISA and different compiler versions.

[0]: https://osuosl.org/services/powerdev/ [1]: https://github.com/open-power-sdk/pveclib

Shameless plug, but you can also grab a POWER9 micro VPS (and large ones too) without any human intervention at integricloud.com. Those are commercial/paid though, not free.

As someone who previously worked as a student at the OSUOSL, thanks for promoting it!

Note that some of these also have nVidia GPGPUs, so you can test your open source software on both.

An open, high end CPU design is really going to change the cloud market. An ISA like this is a first step in that direction.

Facebook and Google already have their own compute projects and, like Amazon, have access to custom versions of silicon from a variety of vendors.

With a properly open CPU design we'll start to see the first tightly integrated, vertical "cloud" products that maybe still have a "commodity" API on the top (or maybe not?) but are custom all the way down from there.

With the end of Dennard Scaling, if not Moore's Law, Open ISAs and Open CPU designs will radically change both the hardware and compute markets and ecosystems over the next 5 to 15 years, similar to what we saw with Open Source in the 1990s.

Of course, it's not clear that POWER will be the one to do that, and RISC-V isn't going to be making a grab for Intel's crown any time soon, but this looks like IBM's bid to lead in that area.

When the cloud vendors start building systems like this they'll not look too much different from mainframes and IBM wants to continue to own that market.

It's a far, far cry from an open ISA to having multiple competing vendors, let alone open CPU designs.

It was much earlier, but OpenSPARC's impact was limited-- and that was full RTL.

If POWER is open, does anyone really want to make competing high-performance designs-- let alone open them? Better to take something like RISC-V and come up with the first high performance design.

This is especially true when you consider IBM's vertical integration: IBM is the only real POWER OEM and the only real POWER semiconductor vendor.

(If we really assume a reduction of innovation in processors, and a 15 year time horizon... expiration of IP becomes a significant factor, too. Why not just make generic ARM?)

"Better to take something like RISC-V and come up with the first high performance design."

The problem is that the RISC-V mnemonics and programming model is so retarded (as compared to the MC68000 or UltraSPARC) that one needs a compiler to abstract and hide that mess away. The other problem is that in the several years in which RISC-V has been hyped, nobody came up with a 19" rack server design, let alone sold one priced competitively with a 1U P. C. tin bucket server. RISC-V is all hype, and without serious hardware its impact remains questionable at best.

People have made really fast implementations of RISC-V and universally praised it as being very nice.

And the fact that an ISA this new doesn't have off-the-shelf servers has nothing to do with problems in the ISA, but rather with the fact that making mass-market products for a new ISA is incredibly difficult.

RISC-V has barely been out of the lab for a couple of years, and the growth of software and hardware has been impressive so far. Saying it is 'all hype' is serious nonsense and speaks more about your expectations than about RISC-V.

I should hope it speaks of my expectations: I can't run server workloads on it, and it's worse to program for than OpenSPARC or the M68000. I actually want a nice processor, and server hardware to use it in, to do work. The RISC-V ISA and the hardware around it provide neither, and yet here we are: it's constantly being paraded as the ne plus ultra of central processing units.

> mnemonics and programming model is so retarded

Could you provide some examples instead of a slur?

First, it's not like the objection even matters: how nice the assembly interface is doesn't really matter for adoption at all.

And it's not too bad; it's basically very close to a modernized MIPS. There are legitimate complaints, though.

Probably the most controversial is that integer divide by zero can't be made to raise an exception.
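To spell that one out: rather than trapping, the spec mandates fixed result values for division by zero. A small Python model of the documented DIV/REM behavior (function names are mine; this is a sketch of the spec's result tables, not any real implementation):

```python
def riscv_div(rs1, rs2, xlen=64):
    """Model RISC-V DIV semantics: division by zero writes all ones
    (-1) instead of raising an exception; signed overflow (the most
    negative value divided by -1) returns the dividend unchanged."""
    min_int = -(1 << (xlen - 1))
    if rs2 == 0:
        return -1                       # quotient of all ones, no trap
    if rs1 == min_int and rs2 == -1:
        return min_int                  # overflow: quotient = dividend
    # RISC-V division rounds toward zero, unlike Python's floor division
    q = abs(rs1) // abs(rs2)
    return -q if (rs1 < 0) != (rs2 < 0) else q

def riscv_rem(rs1, rs2, xlen=64):
    """Model RISC-V REM semantics: the remainder of x/0 is x itself."""
    if rs2 == 0:
        return rs1
    if rs1 == -(1 << (xlen - 1)) and rs2 == -1:
        return 0
    return rs1 - riscv_div(rs1, rs2, xlen) * rs2
```

The rationale given in the spec is that software can still test the divisor explicitly when it wants a trap, keeping the hardware simpler.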

Similarly, omitting condition codes is something that will be distasteful to many.

Also, there are so many combinations of legal instruction subsets that compatibility may suffer. Most everything is in a large set of optional extensions (and some important optional extensions aren't really finished yet).

move dst, src, src -- I could stop right here, but wait, there is more!

lui, auipc -- because two instructions are better than a simple move.b or move.w. Really, what nonsense.
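For anyone wondering why it takes two instructions: lui carries only the upper 20 bits of a constant, and addi a sign-extended 12-bit immediate, so materializing an arbitrary 32-bit value needs both, plus a carry adjustment. A Python sketch of the split (helper names are made up for illustration):

```python
def split_imm32(imm):
    """Split a 32-bit constant into the (lui, addi) immediates that
    RISC-V needs, since no single instruction carries a full 32-bit
    value. addi sign-extends its 12-bit field, so the upper part must
    be rounded up whenever bit 11 of the constant is set."""
    lo = imm & 0xFFF
    if lo >= 0x800:                 # addi will sign-extend: compensate
        lo -= 0x1000
    hi = (imm - lo) >> 12 & 0xFFFFF  # 20-bit lui payload
    return hi, lo

def rebuild(hi, lo):
    """What the two-instruction lui+addi sequence computes at run time."""
    return ((hi << 12) + lo) & 0xFFFFFFFF
```

(As far as I know, this is essentially what the assembler's li pseudo-instruction expands to for 32-bit constants.)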

sx, ux - I'm speechless at that nonsense.

bltu, bgeu -- because blt and bge just weren't enough -- who designs a processor like this?

lb, lh, lhu, lbu, sltiu instead of move.b, why? I challenge the sales pitch of making more nonsensical instructions amounting to a simpler processor design! (Boy does this make me mad.)

It's not a slur, it really is utterly retarded, especially if one is used to programming an elegant microprocessor like the UltraSPARC or the Motorola 68000; even the MOS 6502 is more elegant.

But to each his own, live and let live, right? Well, why then must this botched processor constantly be sold and paraded as the greatest thing since sliced bread, the ne plus ultra of processors, when it isn't?

Plenty of HN readers have children with severe learning disability. Using the word "retard"[1] is likely to attract downvotes.

[1] Unless you're talking about progress or watch mechanisms.

That's exactly what I'm writing about: progress. RISC-V is not an advancement. What is the opposite of advancement? In a system, it's either regression or retardation.

And expecting people outside of the Puritan U. S. to abide by the same political-correctness norms is extremely rude, inconsiderate and exclusionist -- by those same politically correct norms, no less -- which is to say, the U. S. should ban political correctness, and should have done so yesterday, for the benefit of everyone.

I don't care what words you use. I'm just telling you that when you describe people as retards you're going to get downvotes, and I'm telling you why that is.

I'm not American and I don't live in the US.

I didn't describe people as retarded, but their work. Even very smart people often do dumb things.

When you say things like this...

> mnemonics and programming model is so retarded

...you are going to get downvoted. This is because people who speak English as a first language understand you to mean "this is stupid, like a retard". They don't understand you to mean "this is delayed, like a watch mechanism would be adjusted".

You can keep arguing that you didn't mean what you said, but at least two people are telling you how your words are being interpreted.

...you are going to get downvoted.

I would be a sad excuse of a being if I feared what some people on a random forum think of me, or whether they "downvote" me in some arbitrary, imaginary system. The entire thing is a delusion.

Not singling out anyone in particular, but I'm a fully formed adult and have been for several decades, and I do not require upbringing, id est, anyone telling me how to behave or what not to write.

I will write it how I want and I shall not fear arbitrary decisions based on some arbitrary policies someone somewhere thought up. If that gets me down-voted or even banned, I will not let it bother me, as life does not revolve around arbitrary websites trying to tell one how to behave and think and I will damn myself into oblivion before I allow someone to impose such a thing on me. Lest we forget: I'm the only one who decides that, and I'm not allowing anyone to control my thinking or writing.

"Tightly vertically integrated" and "open" are somewhat at odds with each other.

I think far too many people seem to think that the instruction set is something you can just drop in to a chip and start stamping it out, without any appreciation for the amount of device-specific engineering that has to happen. The reason things like a "true open source" Raspberry Pi haven't happened is the $5m - $10m of work required. And for high end devices that would be required to be competitive in the cloud, that number goes up a lot.

I've not heard of Facebook, Google or Amazon doing significant custom silicon projects themselves, as opposed to just working with vendors for some customisation. The only FAANGM in that space is Apple.

IBM are like the pastoralists living in the ruins of Rome in ~1000 AD. They're a consulting firm with a grand name and history.

I'm not sure about this: there are many open processor designs in academia if an FB/Google wanted to pick them up; the difficulty is integration and software. They could more easily just work with ARM, since the reference designs are available if you are FB or Google.

I guess what I'm saying is: even if a relatively modern 2-issue, OoO core with SMT and 256-bit vectors came out open source, would anybody really bother to integrate it and fab it?

From what I see, FB and Google work with silicon vendors because they don't want to be silicon vendors.

Google have been experimenting with POWER in their datacenters for a while now: https://www.forbes.com/sites/patrickmoorhead/2018/03/19/head...

More historically, Google have been building their own networking gear for some time https://www.wired.com/2015/06/google-reveals-secret-gear-con...

I'm focusing on Google in particular because they have always had a strong preference for open components wherever possible, and they've traditionally taken advantage of that openness wherever they think they need to, even if that goes against common practice. (There's a story I can't find the link to where, in the very early days, they wrote their own patches to Linux to work around some bad RAM chips that they'd scavenged from somewhere.)

If Google can get an advantage then they will take it. They will also invest heavily, over years, to research these advantages and opportunities.

Their attitude to things like ARM is still fairly accurate at the scale of their datacenters: https://research.google.com/pubs/archive/36448.pdf

The patents have expired on i486. Does that mean x86 qualifies as a free/open ISA? Patents will expire on 64-bit soon.

> An open, high end CPU design is really going to change the cloud market.

I agree. It's only that POWER does not appear to be very high end to me. At best it performs acceptably for the energy it consumes. Lowering energy consumption is what drives the margins. As a cloud vendor I would stay as far away from POWER as possible.

As a cloud services consumer, what guarantee (financial, legal, indemnification) will you grant me that your systems will not leak or otherwise tamper with my data, given that you use machines that I know for a fact you have no control over and have not audited prior to the handoff from UEFI to the hypervisor/OS? For that matter how have you mitigated the persistent x86 rogue DMA problem?

POWER9 still has two advantages -- security and speed. Yes, speed -- the core is quite weak on some tasks and very strong on others. If you're buying this to primarily run an AVX intensive type workload, don't (unless you need the security aspects). Those massively wide, vector dependent workloads aren't exactly common in multitenant cloud though, unless you're using GPU offload where POWER again beats even the newest AMD chips for pure GPU offload performance.

So much for the good...the ugly is that POWER9 was fundamentally late and not at the performance levels we wanted, but that's a transient state. Every CPU vendor puts a chip like that out from time to time, and IBM is acutely aware of the problems here. I see no reason to go to an even more problematic architecture (the x86 duopoly with master vendor keys, or RISC-V with fragmentation, weak cores, and immature toolchains) when we now have a better option available.

Will this do any better than open source SPARC, which was open sourced in 1999?


I can't see why it would. This would have mattered 20 years ago, when there weren't more compelling ISAs out there. But that's not today's world: ARM is fairly ubiquitous and dirt cheap, while RISC-V is a promising and open source up-and-comer. This seems like a relative non-event (or worse: confirmation that it's effectively a dying/dead platform) unless one has a significant investment in Power.

I tend to agree. I think if IBM were to release some core designs to go with it, they could potentially spur something interesting.

I really see it two ways, the fact that Talos has real hardware that isn't priced up in the stratosphere (it's not cheap, but it's not insane) and then the ISA being opened. Those are giant steps for a company like IBM. At the same time, as big as those steps are for IBM, they seem like pretty small steps in terms of taking on the world with this stuff.

Throw in something like the full G5 design? We might be talking about something different.

They are more than willing to hand out old designs to university groups (you just have to ask nicely). In fact, someone in our group spent his PhD developing a custom embedded Power processor in 65nm, only to be told that when he was hired by IBM (whether he would have gotten the job otherwise is a different question, of course).

Why does RISC-V being "promising" and an "up-and-comer" make it more compelling than an ISA that has been around for a long time?

It's more compelling because a number of research groups and companies are investing in designing and releasing hardware based on it. It has mindshare in the space that Power does not and is not likely to have.

Just opening the ISA doesn't mean that new players can start spitting out processors based on it tomorrow or even next year. And why would they want to? Power was never in remotely the same position that x86 is/was re: binary compatibility so being able to say 'Power compatible' doesn't carry much weight. An ISA which has been a minority player but around for a long time is more likely a liability than an asset.

For RISC-V to be a more compelling option than Power, it would need to be an option first, but if I need to buy a CPU today, I can't buy a RISC-V one.

I can, however, buy a wide range of PowerPC CPUs, for a wide range of applications. From embedded applications, like routers, to laptops, desktops, workstations, high-end servers, up to super-computer class CPUs.

It already has a small and growing ecosystem. I agree there are no RISC-V barn burners but there are already more people with RISC-V design experience than there are for POWER.

I think all the major non-IBM POWER folks are at Apple these days and you know which architecture they are working on!

Having observed both sides of the industry first hand, I don't agree with that statement at all. Without going into debates on relative merit, RISC-V does its development in public while IBM until very recently has done it all behind closed tightly sealed doors. This might be giving a slightly unfair comparison on size of teams from a public perspective.

The IBM folks really, REALLY understand how to design a secure core and chip, plus the decades learning how to make a fast and relatively efficient core. RISC-V is simply in a far more nascent state, trying to push it to POWER9 performance (let alone AMD performance) is like saying a toddler just learning to walk will win a 10k marathon tomorrow. Eventually that may happen, but not in one day, more like 20 years. ;). And when you start chasing performance, who is doing the actual hard, tedious work of verification and making sure security flaws aren't being accidentally introduced into the implementation?

POWER is interesting to me because we get mature tooling on a proven ISA that can be built and run on high performance chips today. No more cross compiling, no more pure emulation required to do soft core work. That in and of itself is huge in the embedded space, and honestly I'd love to see the experimental and interesting cores currently decoding RISC-V ported to decode ppc64 -- all of a sudden, real comparisons on performance etc. for identical binaries become possible, allowing proper comparison of core design ideas under real-world loading. No more guessing and having to take on pure faith that the performance difference is down to ISA or compiler performance -- either your core is faster / more efficient on the same binary, or it's not!

Oh I don’t disagree with what you wrote except on one crucial point. Certainly agree that RISC-V is still in its baby shoes.

But IBM’s announcement is simply the ISA itself being open sourced (with some patent IP). Apart from an FPGA soft core there’s no RTL/VHDL. If you want to make silicon you’ll be starting basically from scratch.

Sure. While I would also like to see a real hard core or three released, here's my frame challenge:

What if the existing RISC-V and other academic cores are already good enough for a lot of people? The instruction decoder is a relatively small part of the CPU, swap that out and you suddenly get new SoCs that can run the existing POWER software base (that means proven toolchains, vector accelerated applications, etc.). Right now RISC-V doesn't even have vector instructions per se; adding all that support to the entire tooling seems like a lot of effort for not much gain when you can simply implement VSX in the hardware and use the existing tooling for it.

I keep hearing the open RISC-V cores are going to be very fast very soon. If that's true, how would an IBM provided core help versus an instruction decoder swap on one of those and some tuning?

Except neither of those are in the same performance ballpark as Intel, while Power ISA is.

I might be misunderstanding you, but performance isn't in the ISA, it's in the implementation. In fact, the x86 ISA is the best example for this: It's really difficult to get competitive performance out of an ISA designed in the 70s, yet billions upon billions of USD in R&D and optimization make it work.

The ISA matters, otherwise we wouldn't care about SIMD. If your ISA is missing SIMD functionality, then it doesn't matter how good your implementation is, it will be slower than an implementation of an ISA that supports SIMD when it comes to anything that can leverage SIMD.
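As a toy model of that point (plain Python that just counts issued "operations"; function names are hypothetical, and this is not a benchmark of any real hardware):

```python
def scalar_add(a, b):
    """One add instruction per element: N elements -> N operations."""
    out, ops = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        ops += 1
    return out, ops

def simd_add(a, b, width=4):
    """One vector add per group of `width` lanes: N elements ->
    ceil(N / width) operations -- the whole point of having SIMD
    instructions in the ISA at all."""
    out, ops = [], 0
    for i in range(0, len(a), width):
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
        ops += 1
    return out, ops
```

Same results, a quarter of the instruction stream for 4-wide vectors; no amount of implementation cleverness recovers that if the ISA simply has no way to express the vector operation.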

Fujitsu begs to disagree:


Fujitsu had already built SuperSPARC-based supercomputers, and Oracle recently ported their Red Hat clone to AArch64 (no legacy 32-bit or Thumb) and produced an ISO for the Raspberry Pi 3.

I won't dispute SPARC (Fujitsu has done some impressive work on it), but the post was for ARM and RISC-V.

The link I posted was Fujitsu's new ARM server, which bundles 32GB of RAM and over 30 cores on a single die. These are deployed in dual-socket blades attached to the "tofu" routing that they scavenged from their SPARC supercomputer.

Fujitsu is saying that their ARM implementation is the fastest server processor available, ahead of Intel.

I don't know why you compare processor implementations and then talk as if that generalizes to all processor implementations of a given ISA. Intel Atom chips definitively aren't in the same performance ballpark as POWER9.

I mean it should be obvious. The ISA does not dictate memory performance, micro architecture, clock frequencies, manufacturing processes, number of cores, maximum allowed power consumption, etc. All of those affect performance but are independent of the ISA.

Don't forget that MIPS also open sourced their ISA earlier this year as well.

Sort of. From what I've read, they "open sourced" their ISA in only a very weak sense of what it means to be "open source". Apparently there are a ton of restrictions on what you can do with their ISA even now.

Yes -- the target for POWER is basically competition with AMD Rome and Intel Cascade Lake and you need to have extremely deep pockets to compete there.

I like it.

Back in the day IBM ran a "System on Chip" factory based on PowerPC that gave us the Bluegene/L supercomputer, the GameCube/Wii/Wii U, the Playstation 3 and the Xbox 360. All of these combined one or more cores, coprocessors and tweaks to hold its own against x86.

RISC-V is meant to be used like that, but memory management support is not yet finalized. They are sampling prototype RISC-V chips with an MMU you can put in a dev box to develop Linux on. Other than that you are not using Linux or Windows.

If you think mainstream OSes are bloated, then RISC-V has your number. If you want very low cost, it would be exciting to cut RISC-V down to have fewer and narrower registers. The other day I saw an article about a guy who wants to build a RISC-V out of vacuum tubes and thought... 'cripes, with all of those wide registers that is a lot of tubes.

POWER is good-to-go right now for high end applications and can stay relevant against ARM and x86 by staying open.

They are misunderstanding why RISC-V and Raspberry Pi are popular: it's not so much that they're freeware but that they are cheap. Very few people in IT know how to implement processors in hardware even with an FPGA. What makes a processor popular are cheap, affordable systems people can easily acquire in an online shop at prices which compete with or are below contemporary P. C. tin bucket hardware.

If IBM wants an uptake of POWER systems and people to develop on them and for them, the only thing which might make a dent are sub-$500 USD complete workstations and rack mountable servers. Otherwise, they will repeat the same mistake which Sun made, that is, they open sourced their UltraSPARC T1 under GNU GPL but the uptake was nil, because few had the knowledge to design systems around the processor. People want cheap, ready made toys they can tinker with immediately.

Not a problem with the article, but when it lists the various past contenders, MIPS (with many times the lifetime installed base of, say, SPARC, and still in active production) doesn't get a mention. MIPS's ISA has also been recently open sourced. It's a cautionary tale.

I don't see the point of this effort for IBM. These things need communities, and POWER simply doesn't have the community; it was a proprietary architecture for so long that nobody really decided to buy POWER, but rather they wanted some device/ecosystem/price point and POWER was how IBM could deliver it.

The article mentions RISC-V, which still has a nascent ecosystem and no significant design wins (yet!!). But if you want to design a chip with it you can find designers with some experience with it, people developing some IP you might want to use, etc. Even that has more momentum.

Naive questions:

- Will Huawei be able to use this processor design (now that it is open sourced) to build its own chips, bypassing ARM restrictions & US IP?

- Are these processor designs usable in mobile devices, or only in workstations and servers (using too much power, for example)?

What's being released here is the instruction set architecture, not the microarchitecture for any particular processor design. As RISC ISAs go Power is relatively pragmatic, though not to the extent 32 bit ARM is, so it has relatively good code density compared to SPARC and MIPS. Plus it doesn't have annoying misfeatures like branch delay slots or register windows.

For mobile processors it seems about as good as 64 bit ARM but with a bit less software support in the mobile world, though a good history of software support in general.

An interesting offset for the latter is that you could develop your software on a high end system that matches the mobile architecture. Anyone who has had to fight with (slow) pure emulation of e.g. Android on ARM knows the pain this causes, multiplied across thousands of developers. That's a lot of wasted man-hours vs. the develop-on-the-same-architecture model.

About the first question, UltraSPARC has been open source for a while. You can even download the Verilog code. We haven't seen any UltraSPARC-based processors, so I don't see why they would use this.

There were quite a number of Japanese SPARCs from PEZY, Fujitsu and some others, but they were all purpose-made HPC products, not mass market.

Ah, didn't know that, just checked out the Wikipedia page. Thanks.

IBM targets the scale-up market (few big, fast machines) with POWER instead of scale-out (many small, slower machines). Consequently they are high performance but not particularly tuned for high efficiency, because performance is the more important design goal of the system.


Freescale hasn't made a new low-power Power chip in a while, but... historically speaking, there were a lot of low-wattage / efficiency-focused embedded POWER designs.

I don't know what happened politically between the companies to use ARM instead. But I would imagine that ARM's instruction set was cheaper (or maybe easier) to engineer than Power ISA. Hopefully Freescale engineers can chime in on the discussion, because I'm really just shooting from the hip here.

I would expect most issues to come down to business politics. IBM open sourcing the PowerISA is also a business politics move (I guess they hope to recapture the lost ground in the embedded space).

PowerISA means operating with IBM's ecosystem: GCC, Linux, etc. etc. Remember IBM has merged with RedHat, so there's a lot of promise for Linux support that ARM and RISC-V don't necessarily provide. I think this is a good move.

> historically speaking, there were a lot of low-wattage / efficiency-focused embedded POWER designs.

Including radiation-hardened chips appropriate for satellites (if we count PowerPC).


ARM sells, in addition to ISA licenses, complete designs. I guess that's a big advantage for those that can't afford to develop a full CPU from scratch.

> promise for Linux support that ARM and RISC-V don't necessarily provide

uh, ARMv8 and RISC-V were developed with Linux in mind from the beginning; they never had anything other than Linux/BSD/various RTOSes, unlike IBM with AIX.

That's not the kind of support I'm talking about.

Who is writing the RISC-V compiler? If the RISC-V compiler for GCC or CLang messes up, who do you call?

If the Power9 GCC / CLang compilers mess up, you call Red Hat for support. Red Hat / IBM are now the same company, so they'll offer end-to-end services.


ARM has okay support: the ARM foundation seems to be taking care of their compiler kits / Linux patches / etc. etc. pretty well. But I don't think you can buy an ARM support package from anybody... really.

I think the ARM / Linux ecosystem is still nascent. You get good support through the Rasp. Pi community, and maybe the occasional Android Phone gets a big community around it. But ARM / Linux ecosystem is quite poor outside of Rasp. Pi.

ARM, as a company, is clearly designed as an "embedded" company. It provides the documentation and compilers, but doesn't provide too many OS-level services above that.

> If the Power9 GCC / CLang compilers mess up, you call Red Hat for support

uh, where and when exactly did they offer that? Actually I don't remember anyone anywhere offering commercial support for GCC or LLVM/clang.

Well, I'm not the type of person to look for commercial support for anything ever, but I've heard of several companies that provide support for DBMSes like PostgreSQL. Not so for compilers.

I just googled "gcc commercial support" and the results are the GCC FAQ, a mailing list post about it from 2005 (!), GCC on Wikipedia, "Office 365 GCC" (lol) and so on. Looks like it's just not a thing at all.

Sorry, not GCC / Clang. You're right.

But IBM's XL Compiler: https://www-01.ibm.com/support/docview.wss?uid=swg21110831


I think I confused it with ARM: ARM has a CLang-based compiler with official ARM support IIRC. https://developer.arm.com/tools-and-software/server-and-hpc

I think the hobbyist (who won't get much support even if they're a paying customer) benefits from free tools / free support / communities.

But it seems like a number of professionals prefer having a degree of professional support in the products they use.

IIRC commercial support for GCC is included with some RHEL and SLES subscriptions (probably all except the lowest-cost "desktop"). Red Hat also has a Developer Toolset product that includes a (more recent) complete native toolchain, and SUSE has a similar SLES 12 Toolchain Module / SLES 15 Development Tools Module.



I haven't heard of commercial clang support though.

Red Hat supports arm64 and recently joined the RISC-V foundation.

IBM targets both markets. The two variations of POWER9 are literally called scale-up (uses buffered memory) and scale-out (regular DDR4 DIMMs). (Now there's a third one for huge I/O needs…)

The scale-out POWER9 scales down to 4-core.

POWER is about as far from a good fit for most ARM applications as you can possibly get.

It's all about shoving a ton of hot power hungry multithread cores as close together as you can and running them at full bore.

The Freescale/NXP 4xx/75x PowerPC cores are fairly common embedded CPUs. These days POWER and PowerPC are the same ISA.

No. POWER and PPC are decidedly not the same. The closest they ever came together was the G5's 970.

4xx and 75x were OK for embedded a decade ago, but today they're hot and power hungry. You can use them in devices where you can burn 10+ watts to maintain backwards compat with existing PPC code, but they're way the fuck too hot for a phone.

That's true for 32-bit, but 64-bit PowerPC is pretty much synonymous with Power ISA.

But is that due to ISA differences or just microarchitectural differences?

The ISA is mostly the same. I mean, look at https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/as...

There are differences in details about uncommon instructions, irrelevant assembly language changes, some instructions privileged for one arch and not the other, that kind of things.

But for the bulk of the ISA, it's the same. You probably can create a single userspace binary compatible with both? Not sure but seems doable.

The microarch is likely different, but then it is also different between several members of each category, so the word does not really designate the microarchitecture, but the ISA. And then you have other brand names using that, and they are so similar that e.g. Freescale switched from PowerPC to Power while incrementing PowerQUICC II to III. I remember Linux has an eieio macro that just emits the aforesaid instruction for PPC, and actually the opcode does something similar on Power (mbar), and IIRC the assembler is happy to emit it regardless of the ISA.

So it was kind of messy when you reached the differences, but everything was quickly workable and you got used to it. The reference manuals from Freescale are very good, and the "[...]Programmer’s Reference Manual for Freescale Power Architecture Processors" (EREF_RM) often directly points at the few differences from PowerPC.

On the same process node, PPC chips aren't any hotter than ARM.

POWER already has a place next to ARM: https://en.wikipedia.org/wiki/QorIQ

POWER and Power (formerly PowerPC) are similar but quite different. PPC has been in embedded (but generally not mobile) for quite a long time, but even then, the cores are still hot, power hungry, and poorly suited for mobile.

Because the designs predate the big push into extremely high-efficiency processors. As with the big "server class" processors, the investment required to create a truly high-efficiency processor is quite large. Small in-order cores with limited functional units, lacking much of what makes a modern processor fast (vector units, specialty instructions, etc.), can fool people into thinking that minimal clock domains/gating is sufficient to create a high-efficiency design.

> Will Huawei be able to use this processor design (now that it is open sourced) to build its own chips, bypassing ARM restrictions & US IP?

Which operating system would they use? Is one supported on the Power instruction set?

Power is totally mainstream and well established. You have your choice of a bunch of operating systems and compilers. Linux, llvm/clang, GCC, etc. all there for a really long time.

Windows also ran on the PowerPC architecture at various points.

In 1996 I worked briefly at a company that had a bunch of "PReP" ("PowerPC Reference Platform") machines lying around, given to them by IBM for porting applications to Windows NT on PowerPC. For kicks, I put Linux on them, so they'd actually be useful for something.

PowerPC is not strictly identical to the Power architecture, but it is related, and most tools and OSes can be made to work on either.

RISC-V is a better fit for mobile devices. But Huawei does produce network hardware that might benefit from a good POWER-based platform.

What aspects of RISC-V do you think make it a better fit?

I'd also like to know, but I imagine that the compressed version of the RISC-V ISA would be a significant factor as code density is important for a phone.


For the last three to four years, phones have had ca. 2-4 GB of RAM and 32-64 GB of storage,

and if we are talking about a RISC-V phone, it would be produced no earlier than 2020, so I'm not sure code density is a significant factor.

I don't think so.

Code density matters for I$ thrashing reasons, but it's probably not a significant enough factor to overtake the market supremacy of ARM.

Interesting move by IBM.

See also OpenSPARC. [1]

I'm curious if anyone will do anything interesting with this.

[1] https://www.oracle.com/technetwork/systems/opensparc/index.h...

The IP/patents was doomed to expire anyway. But hey they got a free PR stunt out of it.

Anyway, it will be interesting to see which "non-expired" patents they bundle with this. Mostly because of https://archive.fosdem.org/2019/schedule/event/patent_exhaus...

which would allow people in the US to use these patents in any context, as long as they derived their work from an open-source Power processor for which the patents were exhausted and did not violate the processor's open-source license.

3 of the top 11 supercomputers in this year's ranking are POWER9s.

Still waiting for that modern POWER9 based laptop with 7-row classic ThinkPad keyboard ...

How does this differ from publishing the instruction set, which was presumably already available (if it weren't, how would one write an application for it)?

I guess publishing something does not necessarily mean making it public domain.

Right. For instance, the x86 architecture is published. If you try to build anything that can interpret it, you'll get sued by Intel or AMD (or both, if you really copied anything good).

What is less widely appreciated is that an emulator counts as an implementation of a CPU. So e.g. research labs at universities have been told in no uncertain terms, via CPU vendor legal notices, to stop working on research that emulates the x86 or ARM instruction sets. Now that the POWER ISA is open, everything about it is fair game for research in emulation, soft cores, hardware, etc., which puts it in a space that only academic "toy" ISAs and RISC-V really occupied before.
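To illustrate why emulation counts as "implementing" an ISA: an emulator is, at its core, just a loop that decodes and executes instructions. Here is a toy sketch for a completely made-up three-instruction ISA (nothing from POWER or x86):

```python
# Minimal toy interpreter: software that "implements" an instruction
# set. The ISA here (LOADI/ADD/HALT) is invented for illustration.

def run(program):
    """Execute a list of (opcode, *operands) tuples on two registers."""
    regs = [0, 0]
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOADI":          # LOADI reg, imm  -> regs[reg] = imm
            regs[args[0]] = args[1]
        elif op == "ADD":          # ADD dst, src    -> regs[dst] += regs[src]
            regs[args[0]] += regs[args[1]]
        elif op == "HALT":
            break
        pc += 1
    return regs

result = run([
    ("LOADI", 0, 40),
    ("LOADI", 1, 2),
    ("ADD", 0, 1),
    ("HALT",),
])
print(result)
```

Scale this up with a real ISA's encodings and semantics and you have QEMU; legally, both are implementations of the instruction set they interpret.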

Is this a thing today (or was it in the past)? Were Cyrix or Transmeta in such a position? The Zilog Z80 is notoriously an extension of the Intel 8080.

I find it hard to believe and hard to enforce, since emulators of varying quality are practically everywhere (and not only for x86). QEMU?

But we have QEMU [0], which emulates quite a lot of instruction sets [1]. How come it is allowed to exist?

[0] https://www.qemu.org/

[1] https://wiki.qemu.org/Documentation/Platforms

The support is there, yes, but it's always been a grey area, largely unenforced. As soon as an emulator cuts into chip sales in any significant way, the tolerance for its use would stop.

IANAL, this is simply what legal folks are saying on this topic.

Can you run qemu fast enough to be competitive with Intel? And I mean full emulation, not something that requires running on x86.

If only this had happened as Opteron was taking a beating in the server market.

Now with EPYC Rome, I wonder just how many takers IBM will have.

Yeah, it's the question I guess: too little, too late?

Because it's hard to beat the x86 mammoth for so many reasons (off the top of my head):

- huge market share in servers/workstations

- Intel has more resources than pretty much anyone else

- AMD is now back in the game and started a core/performance/price war with intel

- x86 is "cheap"

- market shares for "cheaper" stuff will probably be taken by ARM and RISC-V

- so much time was invested in optimizing compiler, code and so on for x86 because that's what everyone has

- the Torvalds argument, which is that developers "will happily pay a bit more for x86 cloud hosting, simply because it matches what you can test on your own local setup, and the errors you get will translate better". So as long as you don't have cheap Power workstations, it'll be a moot point. I remember working on the AlphaPC and pretty much nothing was 64-bit clean back then; it was a huge mess. Now that part is solved, but not everything else...

I definitely get the appeal for the Googles of the world to challenge Intel and for niche (internal) products, and for myself because honestly I don't really need an intel compatible CPU but in the long run, I am not sure it'll go anywhere...

> the Torvalds argument

Well, the local machines are coming, it's totally feasible to have a Blackbird at home and host at IntegriCloud…

but both of those are really expensive, so instead I have a MACCHIATObin at home & AWS Graviton in the cloud. ARM is winning :P

> optimizing compiler, code and so on

Fun fact, IBM is paying large amounts of cash on BountySource for SIMD optimizations of various things for POWER: https://www.bountysource.com/teams/ibm

But ARM is winning again: many things, especially the more user-facing ones, are already optimized thanks to smartphones. For POWER, the TenFourFox author is I think still working on SpiderMonkey's baseline JIT. For ARM (AArch64), IonMonkey (full smart JIT) is already enabled, developed by Mozilla, thanks to both Android phones and the new Windows-Qualcomm laptops: https://bugzilla.mozilla.org/show_bug.cgi?id=1536220

> Well, the local machines are coming, it's totally feasible to have a Blackbird at home and host at IntegriCloud…

Yeah but there's a HUGE but: the motherboard and CPU (1S/4C/16T) and heatsink alone are $1.4k, no RAM no case no HD no nothing (I found a guy who spec'ed one for $2.1k with everything you'd need for a reasonable workstation). So unless you have a massive good reason or interest (political, because POWER, your company runs on POWER, "f*ck" x86, ...) to run your code on POWER, I don't see why you'd spend that much while you could get better for a lot less.

And the only way it'll get cheaper is to mass produce it: let's be realistic, as much as I'd want to have a POWER workstation or laptop (hey, there were SPARC and Thinkpad PowerPC laptops so why not), I won't be holding my breath while I wait...

Yeah, I already mentioned that because of that price, my "fuck x86" machine is ARMv8 :)

(okay, not only because of the price, also because I just like the A64 ISA and UEFI)

The SolidRun MACCHIATObin is not nearly as powerful — it's ultrabook-grade performance, not server-grade — but it works fine for coding & browsing, and it's also quite open — the only blob in the firmware is something tiny and irrelevant (and I'm pretty sure for some secondary processor), everything on the ARM cores post-ROM (including RAM training code) I have built from source.

That's more expensive than an x86 equivalent system, but hardly outrageously more so.

Well, the CPUs are in the same ballpark (<500 bucks for the 4-core POWER9 ~= launch price of the Ryzen 7 1800X) but the boards are outrageous — $1100 for the Blackbird is almost twice as expensive as ridiculous E X T R E M E boards like the ASUS ROG ZENITH EXTREME for Threadripper (and a normal decent board for desktop Ryzen is in the $250 area).

Yeah, it's low volume and Raptor needs to pay their employees — but $1100 for a mainboard? Come on. Maybe they should have dropped PCIe Gen 4 from the Blackbird at least.

I think you are overlooking one of the main reasons for avoiding Intel and AMD -- the ME and PSP, respectively. Intel or AMD has a master "skeleton" key that can basically unlock any of their computers post-sale, while simultaneously using that key to ensure that you cannot modify, replace, or remove the black box firmware in question.

If you trust Intel and AMD, without an SLA, to keep your data private all I'll say is that's quite naive. Even the HDMI master key leaked, do you really expect the ME and PSP signing keys not to fall into the wrong hands at any point?

Yes, the mainboards are expensive. That's the price of making them blob-free and still retaining high performance. Blackbird lowers that barrier to entry some as well.

Again, Rome has a mandatory PSP blob that cannot be removed (any UEFI toggles that say otherwise are not accurate -- the PSP must run before the x86 cores even come out of reset). If you're OK with that loss of control, my gut impression is that use of Linux etc. is just being done to avoid Microsoft licensing fees, not because of security or owner control concerns ;). At that point, why not just lease cloud space on a major provider that can offer that compute power even cheaper than a local machine which sits idle overnight?

I know you like to play up the privacy angle in your marketing… that wouldn't work on me. I mostly work on public/FOSS stuff, about the only really private data on my PC is my access credentials. I don't want them stolen, but someone targeting me with a low-level exploit for them is a ridiculous moonshot scenario, they're a million times more likely to leak from the actual service itself.

> local machine which sits idle overnight

um, I thought we're talking about workstations here. I power mine off when unused.

> use of Linux etc. is just being done to avoid Microsoft licensing fees, not because of security or owner control concerns

This is based on two rather odd assumptions:

- Microsoft as the default: No, I grew up with Unix, Unix is my default choice just because I know it and I'm used to it;

- owner control on all levels being equally important: meh, there's a lot more that you'd want to tweak in the kernel and up the stack. I wouldn't know what to change in firmware. I have changed many little things in the FreeBSD kernel (and contributed them). The only thing I ever changed in the UEFI firmware on my ARM box is some ACPI tables to fix compatibility.

> That's the price of making them blob-free and still retaining high performance

That sounds vague ;)

Also, what's "high performance" about the board anyway? PCIe Gen 4? On a typical developer workstation that's kind of a waste, Gen 3 is plenty.

While the machine is off, it is a paid-for resource that is unused. A cloud provider would lease that resource (so to speak) to someone else during that time, meaning in theory they can provide a lower cost than you will ever see, unless you can somehow get the hardware cheaper than they can.

Good providers will still allow you to run an accelerated VM inside the leased VPS, so you could still do your kernel hacking there.

I'm simply saying there's something interesting here -- you care enough about owning (I use that term loosely) a machine to spend more on a local system, but not enough to obtain one that you can freely modify as desired. Clearly there is a threshold, and I'm curious where it lies. :)

The threshold is not spending all my savings on an additional computer "for science" :)

> accelerated VM inside the leased VPS

Does that work on POWER?

> they can provide lower cost

They can but they won't. They like having huge profits. Even if they offer the base VPS for cheap (Spot instances) they rip you off on storage, bandwidth, IP addresses, etc.

Also, again, desktops. I like developing directly on a desktop workstation. I can't exactly insert my Radeon into a PCIe slot in the cloud and run a DisplayPort cable from the cloud to my monitor :)

Yeah, POWER has basically unlimited nested virt from POWER9 on. And unlike x86 you don't get the massive slowdowns past a level or two of nested virtualization.

Stadia seems to think it can push a high-resolution, monitor-like stream over a network interface. I'm playing devil's advocate here, of course, but fundamentally, if you don't have control of the hardware there's no long-term advantage to local compute, at least not with current market trends etc. Everything points to a move back to dumb terminals for consumer use at this point -- in the past it would at least have been possible to hack those terminals to run some minimal (for the time) OS, but crypto locking of the terminal hardware stops that quite cold.

> outrageous... ridiculous...

Enough people are buying, which means the price is just right (Capitalism 101).

Enough to not bankrupt Raptor just now, but not enough to make POWER actually popular, which would provide them much more customers in the future.

Yeah, I'm still working on it. But I'm just one guy. Hopefully this gains interest from others in working on it too (and I will _not_ be butthurt if someone gets a fully functional one submitted before I do -- in fact, I'll probably be relieved).


POWER9 only supports 128-bit vectors, which is a big disadvantage.

Yeah, Intel is ahead of everyone in SIMD width (until someone makes a chip with like 2048-bit ARM SVE :D).

But still, that didn't prevent POWER9 from being in one of the largest supercomputers. And super wide SIMD has its disadvantages (hello AVX Offset downclocking)

You have a point but then Intel has been crushing it...


They have cornered at least 85% of that market...

And most likely, it's the nVidia connection with NVLink which matters most in there if we talk about SIMD...

They said it better than I could (in June of 2018): https://www.top500.org/news/new-gpu-accelerated-supercompute...

In the latest TOP500 rankings announced this week, 56 percent of the additional flops were a result of NVIDIA Tesla GPUs running in new supercomputers – that according to the Nvidians, who enjoy keeping track of such things. In this case, most of those additional flops came from three top systems new to the list: Summit, Sierra, and the AI Bridging Cloud Infrastructure (ABCI).

Summit, the new TOP500 champ, pushed the previous number one system, the 93-petaflop Sunway TaihuLight, into second place with a Linpack score of 122.3 petaflops. Summit is powered by IBM servers, each one equipped with two Power9 CPUs and six V100 GPUs. According to NVIDIA, 95 percent of the Summit’s peak performance (187.7 petaflops) is derived from the system’s 27,686 GPUs. (emphasis mine, Summit being a POWER9 supercomputer with 4608 nodes with 2 POWER9 and 6 V100 in each)

I think that's more reflective of Intel cornering at least 85% of virtually every market (except, notoriously, mobile) than of some special suitability for HPC.

I agree, I was just saying all traditional supercomputer manufacturers lost that battle, including POWER but it's the last standing member from the old guard...

I think the best bet for POWER would be for Chinese manufacturers, given the trade war and general distrust of American-made tech in their systems.

There are already manufacturers who have licensed the EPYC IP from AMD, but a for-free design could be compelling.

AMD uses TSMC’s process, so PowerPC could end up competing with x86 for Western consumers as well.

Power clusters are used for ML because they have integrated NVLink and, as I remember, OpenCAPI for FPGAs. It’s not a general-purpose platform.

I'm typing this reply to you on one (in Firefox 68). What's not general purpose about it?

On a POWER9 cluster with an array of GPUs? For workloads like HTTP servers/PHP code/MySQL/etc., Intel CPUs will be much faster. We have CPU samples from all vendors; unfortunately, Intel is the best platform right now.

That didn't answer my question. That's fine if you think Intel is better (I don't, but that's orthogonal), but that doesn't make PowerPC less general purpose.

The funny thing is that IBM caused x86 to be successful in the first place, due to the IBM PC.

It's entirely possible that the x86 might still have been very successful even if the IBM PC had never happened. Back then, there were a lot of vendors involved in the 8080/Z80 ecosystem, and the 8086/8088 was a logical path to the future for those vendors – an easier migration path than e.g. the Motorola 68K, and cheaper than the 68K as well.

I was an active Z-80 user in those days and I respectfully disagree with that assessment. There was nothing logical about going from Z-80 to 8088/86 (and 8080 didn't matter nearly as much).

Seeing the number of design wins m68k racked up, it would have been the logical choice (and ISTR that IBM actually liked it better, but it and its peripherals were more expensive). Disclaimer: not a fan of any of these architectures.

EDIT: typo

There seem to be general paths for OS.

1) OS from the start. Develop in the open. Maybe lock some features behind a paywall.

2) OS when something is not hot anymore. Take your formerly private stuff you charged a lot of money for, and because so much better stuff has come out.. meh, let's OS it.

This is clearly a case of #2....

While I agree that there's a whiff of IBM trying to offload responsibility for older tech, I think some credit has to be given to projects which pre-dated the current industry attitudes towards open source. A lot of things had to happen (probably including people of a certain generation/era passing the torch) before companies felt comfortable with the idea that they could open source their tech, and still have a competitive advantage over anyone else who would use it.

In the late 80s/early 90s when POWER appeared, and RISC fever was in full swing, if someone, as a VP or C-suite level decision maker at a mega-corp like IBM, declared "let's just release all the IP of our high performance processor design to anyone who wants it!", they would have their coworkers and superiors questioning their sanity, at the very, very least.

> if someone, as a VP or C-suite level decision maker at a mega-corp like IBM, declared "let's just release all the IP of our high performance processor design to anyone who wants it!", they would have their coworkers and superiors questioning their sanity, at the very, very least.

Well that was basically the PPC consortium. I'm not sure how much apple/motorola/etc paid to be part of it, but the idea was to build a common ISA from multiple vendors.

Hardware is different than software though: producing actual physical chips is extremely expensive.

POWER is hot; IBM is probably just confident that someone else producing competing compatible chips and taking all their customers is not a real threat, and/or that it's completely outweighed by the benefits of an open ecosystem (including compatible chips in a different, low-power segment).

Producing actual software is very expensive. How expensive would Linux be if created by a group of paid developers? Postgres?

Don't downplay the work that goes into complex software.

Linux built up over time with zero upfront investment, starting with Torvalds in his bedroom. Silicon fabrication requires gigantic amounts of cash upfront.

I'm talking about the barrier to entry.

Time is a barrier to entry.

You don't think Torvalds could have gotten paid for programming in the time he spent writing Linux?

What is the sum of all the time invested into Linux * the average hourly rate?

That's not what "barrier to entry" means. Entry is not making 2019 Linux. Entry is making something.

I can write a tiny blog engine in a day on my existing computer. I can't walk into Global Foundries and ask them to make me a single wafer of my tiny microcontroller on their 14nm process.

Hobby software is on the same playing field as pro software, and can smoothly become pro software like Linux did. Hobby silicon is on the 1970s playing field - Jeri Ellsworth and Sam Zeloof making a few transistors with size measured in micrometers. There is literally no way to make your own "hello world" in modern performance silicon.

Are you trying to start an argument? Comes across that way if not.

I'm afraid they are doing it at least 10 years too late.

Naive question: What would this mean for Xbox 360 emulators?

As someone writing a 360 emulator, probably not much. The Cell docs are already available (Xenon is a modified Cell PPE), and the ways that it's different are a one off for Xenon that probably won't be documented in this dump.

They were fine anyway, in fact Xenia has been in development for a long time already.

ISA royalties are for chip vendors. QEMU implements like most ISAs out there and I don't think anyone ever had to explicitly acquire any rights for that…

I wouldn't think it would matter very much for emulators, but there has been some amount of work with old game consoles running on an FPGA to give the actual experience (and you can add quite a few old console chips on a single modern FPGA). If someone is up to re-design a POWER chip, running games would be a lot easier.

Nothing at all.

Or Mac, for that matter.

IBM is dead until it gets Z mainframes and Power servers into the big three's data centers (AWS, GCP, Azure).

Guessing Microsoft will start with putting mainframes in the West Des Moines data center since so many insurance companies are still dependent on DB2 batch crunching.

> IBM is dead until it gets Z mainframes and Power servers into the big three's data centers (AWS, GCP, Azure).

Google is using POWER9 servers in their data centers:

> https://www.forbes.com/sites/patrickmoorhead/2018/03/19/head...

> https://www.nextplatform.com/2018/03/26/google-and-its-hyper...

Of course, most of their servers are still x86; as far as I am aware, the reason why Google also uses POWER9 servers is that they don't want to be too dependent on the two manufacturers of x86 CPUs.
