
IBM z14 Microprocessor and System Control Design - rbanffy
https://fuse.wikichip.org/news/941/isscc-2018-the-ibm-z14-microprocessor-and-system-control-design/
======
reacharavindh
IBM Z systems are always fascinating to a computer engineer. I was fortunate
at my first job to be able to access a z10 brand new, and was tasked with
installing, configuring, and trying out several POCs with z/OS, DB2 on z/OS,
WebSphere Application server on z/OS, CICS transaction server, z/VM operating
system, running 1000s of Linux images on top of z/VM, making them communicate
with each other and z/OS using Hipersockets. It was pure fun.

The cool moments I distinctly remember were running the z/OS installation
binaries from tape drives, swapping DASDs (hard disks) in and out, and
perhaps the coolest was being able to hot plug CPU blades in and out of the
cabinet while the system was live.

It's a pity that the thing is so expensive, and out of reach of the common folks.

For those not in the know and curious: based on my knowledge from 2011, CPs
are the general purpose CPUs that run z/OS, the main OS companies typically
run on System Z. There were special CPUs (maybe microcoded so), called IFLs,
that were allowed to run z/VM and Linux. I don't know how things have evolved
in the last 7 years :-)

~~~
mikehollinger
So I joined IBM in 2005 working on firmware for a project called eCLipz, which
eventually produced the POWER6 family of systems. We've just shipped POWER9
which is in the #1 and #2 supercomputers in the world, Summit and Sierra. [1]

The "i/p/z" part of the project name is important, because it represents:

i - the lineage of the AS/400, which was a 'minicomputer' design from the 80's
designed for multiple users, which came up with the idea of "wizards" for
administration tasks before there was even a name for a "wizard." The "i"
means "integrated," since IBM i is meant to require very little
administration. [2] [4]

p - the lineage of the RS/6000, which begat AIX UNIX. The "p" indicated
"performance." The box I worked on shipped at 5 GHz in 2008 (IIRC).

z - the lineage of the System/360 mainframe which the above article discusses.
The "z" indicated "zero downtime."

The control structure and power and cooling infrastructure for the systems
used common elements with that project. The hardware and firmware underneath
the operating systems for i and p converged into Power systems, which
eventually produced a variety of projects in the open source community
(OpenCAPI, OpenPOWER itself, OpenBMC), and of course runs bi-endian to support
things like little-endian Ubuntu or the classic big-endian AIX or i/OS on the
same processor core.
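
To make "bi-endian" concrete, here is a minimal sketch of what byte order means for one 32-bit value. The value and variable names are arbitrary illustrations, not anything taken from POWER firmware or documentation; the sketch only shows the two memory layouts that a bi-endian core can be configured to use.

```python
import struct

value = 0x0A0B0C0D  # arbitrary 32-bit example value (an assumption for illustration)

# Big-endian (the AIX convention): most significant byte stored first.
big = struct.pack(">I", value)

# Little-endian (the little-endian Linux convention): least significant byte first.
little = struct.pack("<I", value)

print(big.hex())
print(little.hex())

# Interpreting big-endian bytes as little-endian yields a different number,
# which is why the OS and the core must agree on the byte order.
misread = struct.unpack("<I", big)[0]
print(hex(misread))
```

The same four bytes appear in reverse order in the two layouts, so data written by one convention has to be byte-swapped before the other can read it correctly.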

Interesting trivia - In IBM systems (Mainframe or Power), the cold boot of the
processor from an unpowered state is called an "Initial Program Load," or IPL.
The IPL lingo dates back to System/360 (see its Wikipedia page), and survives
to this day in really interesting places in the wild. Take the OpenBMC project,
which aims to create a 100% open source software stack for those sideband
management processors found in servers (and increasingly in other devices
too). There's a GitHub issue from a few months ago that complains about a bug
causing "IPL" problems. [3] ;-)

[1]
[https://www.theverge.com/circuitbreaker/2018/11/12/18087470/...](https://www.theverge.com/circuitbreaker/2018/11/12/18087470/ibm-summit-sierra-supercomputer-us-fastest)

[2] [https://en.wikipedia.org/wiki/IBM_i](https://en.wikipedia.org/wiki/IBM_i)

[3]
[https://github.com/openbmc/openbmc/issues/2831](https://github.com/openbmc/openbmc/issues/2831)

[4] edit - fixed "70's" to "80's" per
[https://news.ycombinator.com/item?id=18494866](https://news.ycombinator.com/item?id=18494866)
;-)

~~~
rbanffy
I have friends who still say their program "abended".

Just a minor nitpick - the green screens we see today are also often
descendants of the 3270 family as well as the 5250's that started life with
the System/34 and found their way into the AS/400 and iSeries.

I had to nitpick because the 3270 is where this beautiful (shameless plug)
font originated:
[https://github.com/rbanffy/3270font](https://github.com/rbanffy/3270font)

~~~
NikolaNovak
Are... are there people who don't? :O

I haven't ever really worked on Mainframe as such, but I _am_ in the IBM/ERP
world, and, well, a program Abends, that's just what it does when it goes
South. Not even sure what else to call it - failed? Errored out?
...Segfaulted its guts on the ground? :->

(edit: I suppose I should also fess up that we call virtual machines "LPARs"
and physical machines "frames" :P )

~~~
rbanffy
It took me a long time to accept Docker's use of the word "container".

To me, "virtual machine" has a distinctive Smalltalk-80 feel. And, of course,
it's a bit underwhelming in comparison.

------
jhallenworld
I used to work at IBM, making x86-based appliances. Occasionally there was a
push to use Power, but x86 single-thread performance was better and cost was
lower so it never happened. What would have been nice is if they allowed us to
use z's CPU, but in a 1U or 2U appliance. The marketing advantage would be
that we could claim to get Z's reliability for our appliance (even if it was
slower). I understand they wouldn't make a generic z low end z/OS server
(competing with themselves), but this would have been a nice closed-system use
case.

BTW, the mainframe group takes reliability very seriously. Basically it must
never corrupt customer (financial) data, plus the FIT (failures in time) rate
must be as good as it can possibly be. I remember when they were making the
zBX add-on (an x86 blade center for z which could run Java web apps), they
tested the x86 in a cyclotron beam to get an idea of its FIT rate. It was
quite good (good enough), but not as good as z's CPU.
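
For readers unfamiliar with the term, a FIT rate counts expected failures per billion (10^9) device-hours. The sketch below shows the arithmetic only; the FIT value, fleet size, and function names are made-up assumptions for illustration, not measurements of any IBM or x86 part.

```python
HOURS_PER_BILLION = 1_000_000_000  # FIT is defined per 10^9 device-hours


def mtbf_hours(fit: float) -> float:
    """Mean time between failures, in hours, for one device with this FIT rate."""
    return HOURS_PER_BILLION / fit


def expected_failures(fit: float, devices: int, hours: float) -> float:
    """Expected number of failures across a fleet over a period of operation."""
    return fit * devices * hours / HOURS_PER_BILLION


# Hypothetical numbers: a part rated at 100 FIT, 1000 of them, run for one year.
example_fit = 100.0
fleet_failures = expected_failures(example_fit, devices=1000, hours=8760)

print(mtbf_hours(example_fit))
print(fleet_failures)
```

Because the rate scales linearly with the number of devices and hours, even a small per-part FIT difference adds up across a large installation.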

~~~
Shivetya
Where I work (I've been here for twenty years), I have been told for all of
those twenty years that we are going to do away with the mainframe, which is
a Z series. Of course we have been told the same about the i series, which is
actually larger. What have we done in the last year alone? Gone to the z14
and a 32-way 870 i.

Seriously, people underestimate the cost to move off a platform that is
working, and it's worse when there are so many touch points that no one can
seem to get an accurate handle on them all.

IBM does indeed take reliability seriously. Both i and Z are built around not
having any downtime, but the Z does it better than i, which can still require
downtime for patches.

I just find it fascinating the work that we use these machines for and how,
instead of being replaced, more work lands on them because of reliability
issues of other platforms. That and support requirements of other platforms.
The z and i teams are three-man admin teams with a few operators to nod
heads... the other platforms seem to have armies.

------
linuxonz
Seeing z14 at the top of HN makes me really happy :). I actually currently
work on Z systems as a Linux kernel dev. The Z processor is a beast and has
some great features.

------
djhworld
Really enjoyed reading this, even if I didn't understand a lot of the jargon.

I find mainframes fascinating beasts. I don't understand how a programmer
would be able to make use of all those "CPs" but it sure looks like a lot of
power to use!

It's very exotic, maybe too exotic. I'd imagine hiring junior people and
training them up on these things is a lot of effort, and probably very hard to
hire for when in competition with the companies offering cloud experience etc.

~~~
wilsonnb3
I was hired as a junior software engineer for a mainframe company, and it is
indeed very hard. z/OS was radically different from anything I had done before
and it took a long while before I felt comfortable with it.

IBM actually holds a very good week-long training course you can send junior
developers to. It’s basically a boot camp. There were about 10 other
developers at it when I went, most of whom were fresh college grads from the
big mainframe software companies (BMC, CA, Rocket, etc.).

I’ve since moved on to the world of .NET and I don’t miss the mainframe stuff
at all.

Edit: I’ll add that a lot of what made it difficult for me was that I was
working with assembly language and our product required a lot of knowledge
about the OS. Many mainframe programmers are using Java and Eclipse, which
isn’t that different from using Java and Eclipse to target any other system.

------
Quequau
Every time I read tech news on IBM's Z series mainframes I wonder what a
minimum viable / reasonable build out of this hardware might be. As a regular
human there's no way or reason I'd ever really need an actual mainframe but
the hardware itself has fascinated me for a long time.

~~~
pjmlp
Back in 1994, the IBM OS/400 range had a small enough model that we could even
sit on it.

Think of those big tower PCs of yore, just a bit larger.

~~~
dfox
IIRC most of the 90's AS/400s are a similar size to contemporary servers. Or
twice as big when they have "Integrated something whatever Facility", which is
simply an x86 server that occupies half of the enclosure.

------
linuxonz
Seeing a Z-related article at the top of HN is pleasantly surprising :). The Z
systems are a beast and have some very good features.

------
trasz
FWIW, the mainframe teardown videos on YouTube are fun to watch:
[https://m.youtube.com/watch?v=vuXrsCqfCU4](https://m.youtube.com/watch?v=vuXrsCqfCU4)

------
3chelon
Each z14 has "approximately 20,000" C4 solder bumps, i.e. connections! That's
astounding.

------
krylon
It's a little sad these machines are so expensive. I can see it from IBM's
point of view, there probably is not much money to be made in scaling these
machines down, but still.

------
Koshkin
I wonder how this mainframe compares with a similarly priced x86-based server
rack running Linux.

~~~
fulafel
Anyone know what the biggest single-system-image x86 system on the market
today is?

~~~
CaliforniaKarl
One of my co-workers manages the SGI UV300 described at
[http://med.stanford.edu/gbsc/uv300.html](http://med.stanford.edu/gbsc/uv300.html)
(new purchases get hardware badged as the HPE MC990 X). It does fit most
definitions of a supercomputer, as it's multiple nodes with interconnects that
allow a single OS instance to run across all of them (the OS then sees &
manages the separate NUMA nodes).

One thing that has been learned is, software will sometimes do interesting
things if presented with so much memory. For example:
[https://github.com/zfsonlinux/zfs/issues/7275](https://github.com/zfsonlinux/zfs/issues/7275)

~~~
rbanffy
Speaking of SGI, I was expecting the Cray XC family to be able to do that with
the ludicrously fast interconnects they have, but I couldn't find any mention
of it.

~~~
gaius
No, an XC40 runs a kernel on each node, the programming model is MPI, not SSI
with NUMA. It runs a minimalist Linux distro called CLE. The nodes are
surprisingly small, in terms of memory too, 64 or 128G. Physically they are
beautiful to look at, so clean, just a CPU and some memory and nothing
extraneous. Cray really should do glass cases.

~~~
rbanffy
They used to make beautiful computers.

------
bhengaij
I didn't check this out, but what made me respect IBM was their Cell (PS3)
processor. A very well thought out design, and ahead of its time. Shame it was
hard to program.

~~~
rbanffy
Consider that these machines can run unmodified binaries written in the 60's
for the IBM 360 machines. Also 370, 390, ES 9000 and every previous z.

