The problem is that Power9 is now effectively a substrate machine for NVIDIA GPUs more than it is a competitive offering in its own right. In my previous life, building accelerated computing platforms a decade or more ago, a rule of thumb from customers was that they needed to see at least a 5x performance delta to justify serious consideration, and that you couldn't charge 5x the price for that delta.
So if Power9 is 5x the speed of a Xeon in a set of relevant test cases (ignoring GPUs), at a comparable system price, then sure, it has a good chance of being successful. If not, it is questionable at best.
As others have noted, cheap ... as in really dirt cheap ... development/test boxen for devs to build at home are critical for this. Back in my SGI/Cray days, I was advocating for, though failing to convince, management to let us sell older Indys, Indigos, etc. with IRIX and dev bundles for home/app-dev use, at a very low price. My argument was (back then) that nothing could touch IRIX for ease of use, and that it made sense to seed developers with it without worrying about making money on them, letting them create the content that people demanded, which would help us sell machines. Management was worried it would decimate our developer revenue stream. Rather shortsighted, I think, in retrospect.
Power9 looks poised to suffer the same fate, though not because of OS/compiler costs, but because the hardware will be unaffordable for pretty much everyone.
This is why you have to target ubiquity. You need those developers. If they can't afford your boxes, or your stack, then you aren't going to get them.
This sounds terrible to me. From the POV of a platform vendor, the only "developer revenue stream" I'd care about would be popularity among developers. Give away the basic tools, sell slightly more specialized tools for the price of a pizza, get as many people hooked as possible. Send your machines to universities at a discount. Sell older machines to hobbyists, and support the community. Help open-source projects run on your hardware and take advantage of its unique features (if any).
Of course, there is a niche to actually sell top-notch enterprise-grade tools for your systems, too, because corporations gladly buy support and stability. But this is usually a tiny trickle compared to the hardware sales, and its very existence is entirely dependent on your system being widely popular and ubiquitous.
As a bottom line, it's much better to get a 5% margin from a $100M market than a 25% margin from a $10M market.
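The arithmetic behind that bottom line, spelled out (the dollar figures are the hypothetical ones from the comment above):

```python
# 5% margin on a $100M market vs. 25% margin on a $10M market.
big_market = 0.05 * 100_000_000    # absolute profit in the large market
small_market = 0.25 * 10_000_000   # absolute profit in the small market

print(big_market)                  # 5000000.0
print(small_market)                # 2500000.0
print(big_market > small_market)   # True: twice the absolute profit
```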
Power9 would really have to double down to have any chance of beating x64 and ARM, especially with RISC-V also about to enter the mix with its own advantages (openness). I think their only chance would be to actually sell it at cost or at a loss, at least at the entry level.
Love my Fuel. Wish I could still use it as a daily driver (too expensive to run electrically). Management really destroyed you guys, eh?
Hint, without specifics due to NDA, go talk to Supermicro sales directly about POWER systems.
Is a new Xeon 5x the speed of the previous Xeon? How do those institutions ever upgrade, then?
I think what you're really saying is Power9 needs to be 5x the speed compared to Xeons from 5 years ago or whatever. Because otherwise new Xeons would never replace older Xeons either, following that logic.
Power9 and other Intel alternatives just need to show 1.5x-2x the value compared to Intel's latest.
That is, there's no incentive to pay the cost of switching if you don't save any money in the longer term. And that "longer term" is usually but a few years.
k*N might be 1.25x existing costs for 5x the performance today. This was an actual example given to me; the customer then indicated that they'd be quite interested in that performance at that price.
Once k*N was >> 2x, it got problematic. So I had my range to work with.
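A minimal sketch of that rule of thumb as a filter (the 5x performance floor and the ~2x price ceiling are from the comments above; the helper itself and its name are illustrative):

```python
def worth_considering(perf_ratio, price_ratio,
                      min_perf_gain=5.0, max_price_premium=2.0):
    """Customer rule of thumb from the thread: a challenger platform
    needs at least ~5x the incumbent's performance, and must not cost
    much more than ~2x as much, to justify serious evaluation."""
    return perf_ratio >= min_perf_gain and price_ratio <= max_price_premium

# The actual example quoted: 5x performance at 1.25x the cost -> interesting.
print(worth_considering(5.0, 1.25))   # True
# Same performance, but k*N well past 2x the cost -> problematic.
print(worth_considering(5.0, 3.0))    # False
```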
AFAICS, the selling point of this thing is that it has a higher-bandwidth, cache-coherent connection between the GPUs and CPUs.
Of course, some users like national labs, Google etc., might want to invest in this as a strategic investment, to keep Intel from becoming even more of a monopoly than they already are.
No it isn’t; why did all non-Intel processors fade into obscurity, and why does the Intel architecture, with all of its Zilog Z80 baggage, dominate? Because one can’t get a dirt cheap (under $500) POWER9 desktop or server; same goes for MIPS, UltraSPARC, anything else that’s modern and non-Intel... why is ARM so popular? Because one can buy it dirt cheap for tinkering with at home. That is what propelled ARM and Intel: people would install a Linux ISO on their parents’ old PC and it built up familiarity.
Until that happens with POWER (or any other non-ARM, non-Intel architecture), history has taught me that it will fail. Hardware has to be easily accessible and dirt cheap, things like software mirroring or compilers must come with the OS, and all the software has to be gratis. Otherwise history says: fail! Sometimes of epic proportions (HP, SGI, Sun, IBM).
Sun Microsystems, for example, eventually realized that the software has to be gratis and open source, but they lived in this fantasy world where they thought they could charge anything they wanted for hardware — and would indignantly argue about it when it was pointed out to them that it was way too expensive. That lost them familiarity with the hardware and familiarity with the OS, because for most people a second-hand PC-bucket with GNU/Linux was good enough, and the target audience wouldn’t have known or cared about all the advanced features anyway. We all know how that ended.
Me: Oh look this is fascinating! We have a worthy competitor to intel processors and we can do our work way faster. Let's get this!
Boss: Ok, tell me why one of this costs more than all of our other servers put together?
Me: But, it is so much better, it does blah, blah and more blah.
Boss: Do we really need it? What would we save by getting this instead of the commodity servers we normally buy? Maybe we should look into making our code work better with GPUs instead?
Don't get me wrong. I think this really is fascinating. It's just for the HFT traders and the top 1% of exotic-hardware buyers, not for an average HPC shop like ours, until the economy of scale and the premium catch up to the value.
And there lies the problem. HPC used to be a tiny market with huge budgets where expensive hardware was the norm. It's just not that big a deal any more. It's commodity hardware now with lots of cores and GPUs running OSS. They won't make it trying to charge a premium just because it's Power9.
It is an old idea called "mind share", or in this case technical mind share: a vendor should want the largest possible community of technical people interacting with their technology and attempting to use it in as many niches as possible.
I'm not sure it even has to be uniformly cheap, as long as there is a low-cost entry point to hook people. If the price ratchets up from that low entry point but it is a compelling solution people will want it in their technical lives and get it funded somehow.
I would point to RISC-V as another example of a machine architecture that became newly popular due to its initial low cost — noting that the preferred technology must still meet the minimal acceptable threshold along all the usual dimensions.
The real reason Intel/AMD64 is popular is the same reason Windows is popular -- mindshare and economies of scale. IBM and the x86 platform won the business desktop in the 1980s and that has been propelling everything on that architecture ever since.
I think the real problem is everyone likes to target the top-end Intel systems that sell in very low quantities. Sure, I'm sure HFT folks can make huge $$$ trading just a bit faster. But the bulk of Intel server sales are in the low/mid range: things like the E5-2620/E5-2630, which cost $300-$500 per CPU and go into nice, capable servers starting around $2k.
Sure, a $65k POWER system with 4 Voltas sounds great. But few will drop over $50k on a server without having a few cheaper servers in production for a while first. Beating the collective performance of ten $5k servers is also going to be tough. Amusingly, Intel fails that same test: the top chips are rarely the best price/performance, unless buying cheaper servers would trigger needing a new building.
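A rough sketch of that "ten cheap servers" comparison. The prices are from the comment above; the per-node performance ratio (the big box doing 8x the work of one commodity node) is invented purely for illustration:

```python
def perf_per_dollar(total_perf, total_cost):
    """Aggregate performance per dollar spent (arbitrary perf units)."""
    return total_perf / total_cost

# Ten $5k commodity servers, each delivering 1.0 unit of performance.
commodity = perf_per_dollar(total_perf=10 * 1.0, total_cost=10 * 5_000)

# One $65k POWER9 + Volta box, assumed (hypothetically) 8x one node.
big_box = perf_per_dollar(total_perf=8.0, total_cost=65_000)

print(commodity > big_box)  # True: the cheap cluster wins on perf/$ here
```

Under these assumed numbers the big box would need to beat roughly 13 commodity nodes' worth of throughput just to break even on price/performance.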
Not to mention if IBM starts getting traction you can bet those $5k per CPU intel chips will start getting discounts rather quickly.
Switching to a distributed system is sometimes too far off the radar and costs precious engineering time. So the story is “just buy a bigger and faster computer, we’ll figure out how to shard this later.” The company drops more cash on IBM or whatever, and the engineers focus on features that close sales deals.
And then there are the larger companies which buy some expensive IBM servers and port their software in order to squeeze discounts out of Intel.
(Lots of handwaving in the above, main takeaway is that it's the expected and rational path in a world where engineers are expensive and scarce and servers are cheap)
Inherited a complete mess of a system (response time measured in minutes) running on a Xeon E3 w/8GB of RAM and 500GB of spinny rust.
It's going to take months if not years to unscrew some of the mess so I suggested to the boss the quickest way of improving it was to upgrade the server so we ordered a Xeon E5 dual processor system with 32GB of DDR4 and 1TB DC grade SSD.
Once I've ported the system from PHP5 to 7 on that and moved to a recent MySQL version we'll get a very nice perf improvement for basically free and then I can refactor the mess and at the end we'll have a better optimised system running on much better hardware.
Hardware is cheap, people's time isn't.
Sadly it's not always that simple.
People's time, salaries, falls into the base expenditure.
Hardware acquisition is a discretionary cost, which is usually squeezed particularly in a cost-centre like IT support. Or the existing hardware is under a multi-year lease from a vendor like Dell or HP and has to be paid-for whether it's obsolete or not.
Where I've worked (big non-IT-oriented companies) it's more typical to throw a couple of people onto a project to debug and refactor an application than to buy better hardware. Or, even worse, to implement a new project on existing shared hardware.
Let's say IBM sells server hardware for $60k per machine versus Intel's $20k (I made the numbers up) and shows, through extensive benchmarking, that despite the higher initial cost, their machine's performance per dollar is better than Intel's.
Then 2 years later Moore's law happens. Intel's new 20k offering now outperforms IBM's old machine on every metric. In hindsight, IBM didn't make sense financially.
With Moore's law slowing down, maybe now it's the right time for IBM to make a move.
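The scenario two paragraphs up can be sketched numerically. The $60k/$20k prices come from that comment; the big machine's 4x head start and the two-year doubling cadence are assumptions chosen for illustration:

```python
def intel_perf(years, base=1.0, doubling_period=2.0):
    """Performance of the current cheap Intel box, refreshed over time,
    assuming per-generation performance doubles every `doubling_period`
    years (a stylized Moore's-law cadence, not a measured figure)."""
    return base * 2 ** (years / doubling_period)

# The expensive machine bought today: assumed 4x faster, but it never
# gets any faster after purchase.
ibm_perf = 4.0

print(intel_perf(0) < ibm_perf)   # True: the big box wins on day one
print(intel_perf(4) >= ibm_perf)  # True: the refresh cycle catches up
```

The point of the sketch: the slower Moore's law runs (the longer `doubling_period` gets), the longer the expensive machine's advantage survives — which is exactly why a slowdown changes the calculus in IBM's favor.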
Part of this may be due to Intel’s lead in manufacturing technology, and have nothing to do with architecture.
(Historically, RISC was much more competitive. Talking about current state of affairs.)
Actually, ARM is more CISC-y than MIPS, POWER, SPARC, etc. and recent cores break more complex instructions into uops just like x86.
If you consider uops RISC, then just about every architecture is "RISC-like on the inside". RISC is about a simple and restrictive user-facing ISA, not the internal microarchitecture.
In the long term, CISC makes a lot of sense --- it saves on fetch bandwidth and instruction cache to essentially have the CPU "decompress" complex instructions to execute internally, and the core is many times faster than memory.
IMHO "pure RISC" was an academic exercise, and the only reason early CISCs were easily beaten was that they were sequential/in-order and memory bandwidth wasn't a bottleneck at the time. With growing core speeds, memory becoming a bottleneck, and the invention of parallel uop decoding/execution, CISC could do more per clock and with fewer instructions. You can see this trend in the DMIPS/MHz numbers:
The best ARM core on that list can do 3.5 DMIPS/MHz, and the best MIPS at 2.3 DMIPS/MHz, while the best x86 core is at >10 DMIPS/MHz.
This is from a couple of years ago. Even more important (I can't find it right now) is that RISC-V compiles code to fewer bytes than x86, and fewer micro-ops as well. With the smaller/simpler instruction set it also requires less area and power, so it may be just a matter of time now.
So with RISC-V GC + macro fusion of a few common idioms, you get slightly fewer micro-ops than x86-64, armv7, or armv8.
That being said, for high-performance cores such as Skylake, POWER9, or the latest aarch64 server cores the ISA probably doesn't matter that much.
For microcontroller class HW, the x86 ISA might be a crippling disadvantage compared to RISC-V. For a high-performance core, which AFAIK is something like 20-30 million transistors, not so much. The bigger the core, the smaller the advantage of the ISA.
But yes, I don't think there is any reason why RISC-V couldn't be used to create a high-performance core competitive with the x86, POWER, or ARM of the day. It's just a hugely expensive affair.
Right now the big advantage of RISC in the high end is ease of parallel decode thanks to instructions always starting on 32 bit boundaries. Which is maybe a 5% power advantage at most. 64-bit x86 has lots of historical baggage and the people doing 64-bit ARM were clever so they're actually about equal in instruction density. It was an advantage to have fewer instructions back in the heyday of RISC but it isn't any longer and both ARM and x86 have tons of different instructions.
When you move down from high power OoO cores to in order ones then having 32 registers instead of 16 becomes more of an advantage and the benefits of simpler decode become more significant too.
As for ARM, I think their success has more to do with their business model (licensing instead of fabricating) than with their architecture.
Apple's A11 (single thread) 4217 (multi) 10164 https://browser.geekbench.com/ios-benchmarks
Intel i7 8700k (single thread) 6089 (multi) 26654 https://browser.geekbench.com/processor-benchmarks
It appears that we are near the end of physically shrinking to smaller nanometers, and we are back to co-processors (like the old Amiga days). Co-processors and some kind of assembly (a Lisp that compiles to ARM assembly — too much for one to ask for???) are going to be the way we improve speed in the future.
Furthermore, RISC-V is one of the most awful designs I have ever seen in a processor internally. The assembler mnemonics are idiotic (compare and contrast with the MC68000), the assembler is backwards (Intel syntax of dst, src instead of src, dst)... it’s just an awful, non-orthogonal, non-intuitive design. It’s an imaginary processor for imaginary hardware.
Microchip makes a couple low-end MIPS boards:
When you outgrow those, you can move up to the Creator Ci20 from Imagination:
8080, please. The 8086 was designed to be assembly-level source compatible with the 8080, so that porting software (here's to you, Gary) from CP/M would be easy.
1. The 8086 was not designed to be assembly-level source compatible, but it was nevertheless intended that assembly code for the 8080 could easily be ported to the 8086 (mostly find & replace).
2. Zilog developed a different assembly language for the Z80 than Intel used for the 8080 (for copyright reasons, I think - though nearly everybody would agree that Zilog's assembly syntax is better; compare for yourself: http://nemesis.lonestar.org/computers/tandy/software/apps/m4... ). The Z80 assembly language was a strong inspiration for the x86 (8086) assembly language (Intel syntax).
3. The Z80 introduced two index registers (IX, IY) over the 8080. A strange coincidence that Intel also did introduce two index registers for the 8086 (si, di).
4. Do the INI/INIR/IND/INDR/OUTI/OTIR/OUTD/OTDR instructions that Zilog introduced for the Z80 strongly remind you of the INS/REP INS/OUTS/REP OUTS instructions of the 8086 (just consider that the 8086 uses the direction flag)?
TLDR: Intel took more "inspiration" from the Z80 than is generally acknowledged.
They have abysmal performance however. An RPi 3 (or just about any 64 bit ARM board) will perform better and cost less.
The Z80 was a clone of the 8080, not the other way around.
And while I am sure it is a sweet machine that offers great performance and reliability, it is rather expensive. If you want to - especially if you mostly care about the GPU part anyway - you can get an Intel or AMD based machine for a fraction of the price.
IBM still sells the large POWER servers that run AIX or IBM i (in addition to Linux), and they still make their zSeries mainframes. But at least the latter are even more expensive.
All of these are really great machines; IBM has decades of experience building computers for some fairly demanding customers, and from what I know about these machines, it shows. But do not confuse the machine described in the article with IBM's top-shelf machines.
Just due to volume and cost, I’ve basically given up on the idea of owning my own POWER system in my home rack. If ibm got aggressive with cloud pricing and put together some dev friendly packages that were all tooled up and ready to go, I’d be more than willing to play with it and see how it performs for my tasks. In spirit, I’d love to see a competitive alternative to AWS and Intel, we need it, but I have a hard time paying a premium for it to then experiment and find out if it can actually outperform Main Street.
Tooling is huge here too; there are very material advantages to the incredible hardware optimization done by node.js, PyPy, LLVM, etc. Since Apple transitioned off of PowerPC, I don’t think this platform has had nearly the same amount of attention. I’d love to see IBM giving away access to their version of a “micro” instance to OSS developers.
"two Power9 chips, four Volta GPUs accelerators, and 256 GB of memory" =~$65,000
The NVIDIA DGX-1 (8 Tesla GPUs, 2 Xeons — so much more GPU, less CPU) is $125,000.
Configuring up a SuperMicro GPU server looks to be around $50,000
While I trust POWER9 to be very fast, I have had odd performance differences (both positive - meaning I overspent in hardware and had some explaining to do - and negative - meaning I was too optimistic and had some explaining to do) when moving from x86 to SPARC, Itanium and POWER (and MIPS - I'm that old).
The Talos machines are good entry-level POWER9 systems for reasonable prices, but I'd love to see parts trickling down to Xeon E5 prices.
If the barrier of entry is too high, only what already runs on POWER8 will move to POWER9. Xeons have good cost/performance and are a safer bet.
Nobody does, they are still in the pre-order stage.
This means you'll have at least KVM, so you can run nearly anything on top of that, including Windows VMs with PCI passthrough for gaming, etc.
The cost is non-trivial but I've been working up my nerve to pull the trigger and go this route.
Now I search eBay for one. I have a couple Suns ;-)
> Do you have one? Do you use it as a daily workstation?
I wish. Can't justify the price tag for a novelty item.
The ecosystem of tuned software just isn’t as big on Power.
Lots of good points in this thread, but Power is likely going to go for the invisible de-facto cloud data center hardware route.
IBM is different, though, in that they have a legacy-system effect. Their locked-in customers are where this money is coming from, which is why they keep making the stuff to sell at exorbitant rates. They really need to create different tiers of pricing, justifying it by saying the firmware or whole stack is optimized for this, that, etc. If it's gravy-train business, that optimized stack is priced at whatever fortune the marketing team says it's worth. If it's not (i.e. Raptor or RISC-V), it gets a steep discount just to encourage more adoption, with the associated software porting. They seem too dumb to do this.