Ignoring for a second that they claim these have the same performance without providing any actual benchmarks that I can see... the server configurations alone make no sense. In 2021 nobody would order 12x HP ProLiants with less than 256GB of memory per server and then run 1,000 VMs on them. That averages out to what, roughly 2GB of memory per VM? Unless you're horribly oversubscribing them - which nobody who has ever designed a virtual environment would ever tell you to do - your VMs are going to be memory starved pretty much 24/7. Even IBM tells you not to oversubscribe memory on their mainframes.
I also struggle to believe that oversubscribing the z CPUs 66:1 isn't going to result in performance issues. The documents on limits aren't clear but I foresee that mainframe spending a LOT of time on locks even with all the work IBM has done to help manage them.
This whole thing just really smells like a bad advertisement for IBM, which aligns with them announcing their latest gen z processor last week.
While I agree that this configuration needs more memory, mainframes are primarily used for very high throughput jobs. "Lots of memory to sit idle" isn't really a priority.
That aside, samepage sharing has been a thing in virtualization for more than a decade. If you have 800 VMs running the same RHEL8 image, a significant amount of the memory load is kept in identical pages, which lowers the burden of oversubscription. The recommended workload is unlikely to be 1,000 unicorn/pet VMs, and more likely to be a large number of VMs running the same base OS, and some of the same workloads.
Synchronous interrupt locking is also a solved problem.
If by "samepage merging" you mean KSM[0], it's hugely expensive in CPU cycles (not to mention severe impacts on membw, l3 cache, etc) and outside of low-end VPS hosting has relatively few real-world applications. I'm not sure why it would be a useful thing to mention here.
Yes, I mean KSM (and its equivalents in other hypervisors).
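For concreteness, here's a minimal sketch of how a userspace hypervisor like QEMU nominates guest memory for merging on Linux. madvise(MADV_MERGEABLE) is the real interface; the sizes and setup around it are illustrative:

```c
/* Minimal sketch: nominating "guest RAM" for kernel samepage merging.
 * Requires a kernel built with CONFIG_KSM and ksmd enabled via
 * /sys/kernel/mm/ksm/run. Sizes here are illustrative only. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    size_t guest_ram = 256 << 20;  /* pretend 256 MiB of guest memory */

    void *ram = mmap(NULL, guest_ram, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ram == MAP_FAILED) { perror("mmap"); return 1; }

    /* Tell ksmd these pages are candidates for deduplication; identical
     * pages across all MERGEABLE regions get collapsed to one read-only
     * copy, which is CoW-broken again on write. */
    if (madvise(ram, guest_ram, MADV_MERGEABLE) != 0)
        perror("madvise(MADV_MERGEABLE)");

    /* Fill with identical content so ksmd has something to merge. */
    memset(ram, 0x5a, guest_ram);

    /* ... run the guest; watch /sys/kernel/mm/ksm/pages_shared grow ... */
    return 0;
}
```

The background scanning ksmd does over those regions is exactly where the CPU and memory-bandwidth cost being debated here comes from.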
The CPU cycles are a non-issue for the configuration on this mainframe, and it's fair to assume that, since IBM developed the silicon and IBM engineers wrote the hypervisor layer, most of the detrimental effects of "we need this to work across multiple CPU vendors and multiple generations of the architecture" are covered.
Even x86 CPUs get improvements to cache coherency in memory dedupe scenarios with each generation, and intelligent NUMA topology layout helps a lot. It's also worth noting that it's a mainframe, so essentially every configuration has a large number of other processors with a couple of instructions disabled so they don't "count" as CPUs for licensing, but they still perform work. I would not be surprised in the least if there were 1+ co-processors dedicated to offloading operations exactly like "LPAR || z/VM memory dedupe".
That's a good point - I was speaking from experience with KSM on x86, but having a dedicated coprocessor on system Z for this application would make a lot of sense and fix many of the problems I've seen on x86.
This isn't something to be done at the guest VM level but at the hypervisor level. In an optimal scenario, it can be cheaper to clone a VM with CoW semantics than it would be to deduplicate memory dynamically. But I guess this would defeat kASLR between all of the clones.
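To illustrate the distinction: CoW cloning is the same mechanism fork(2) uses, where nothing is copied until one side writes. A minimal sketch of the idea (sizes and values are arbitrary):

```c
/* Minimal sketch of CoW cloning: after fork(), parent and child share
 * all pages read-only; only pages one side writes get duplicated.
 * A hypervisor-level VM clone works on the same principle. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    size_t len = 64 << 20;               /* 64 MiB "VM image" */
    char *mem = malloc(len);
    if (!mem) return 1;
    memset(mem, 0xab, len);              /* dirty it so the pages exist */

    pid_t pid = fork();                  /* clone: no page data copied here */
    if (pid == 0) {
        /* Child: writing one byte copies just that one page (CoW break). */
        mem[0] = 1;
        _exit(0);
    }
    waitpid(pid, NULL, 0);

    /* Parent still sees its original data; only written pages diverged. */
    printf("parent byte: 0x%02x\n", (unsigned char)mem[0]);
    free(mem);
    return 0;
}
```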
The entire scenario outlined is for a production deployment of WebSphere backed by Oracle. IBM specifically calls out 1:1 memory for production workloads (which Oracle very much is) in their performance documentation:
>Synchronous interrupt locking is also a solved problem.
No, it really isn't. I have not seen a production implementation of virtualization with > 4:1 CPU oversubscription that doesn't eventually or immediately have significant performance issues with database workloads.
Isn't WebSphere a web application server? That really doesn't look like the sort of extreme vertical scaling and integration that mainframes are supposed to perform well at. We would expect hyperscaled commodity hardware to perform best here.
> The entire scenario outlined is for a production deployment of Websphere backed by Oracle. IBM specifically calls out 1:1 memory for production workloads (which oracle very much is) in their performance documentation:
This discussion is about the configuration of the systems, not this specific scenario. While this may have been the scenario outlined for you, it wasn't apparent from your comment.
Production workloads on busy servers have very different requirements than consolidation, VDI, resiliency/redundancy, hardware abstraction, etc. Every workload needs a different evaluation.
All of these VMs running production Ora/OAM/OEM/Websphere workloads? No, don't oversubscribe. Some of them running similar workloads? More oversubscribe is ok. Few of them? Lots of oversubscribe.
Similar for interrupt locking. The "old" synchronous interrupt lock I was speaking about was "I have a bunch of VMs with CPU oversubscribe, and even if they're doing nothing, 30% or more of your CPU time goes to interrupt scheduling so every vCPU for a given VM can schedule simultaneously". This is solved.
"I'm running a CPU-intensive workload on a massively oversubscribed server" is not, and we really shouldn't expect it to be.
I am having a generalized discussion about virtualization oversubscribe. You are having a specific discussion about CPU-heavy DB workloads. Apples cannot be compared to oranges.
I guess you can move the goal posts now, but it should have been exceedingly clear that I was commenting on the workload in the link, because I'm replying to the link. My outline of hardware is the hardware in the link, I didn't just make up an imaginary scenario. Literally everyone else in the thread is commenting on the link, you appear to be the only one confused about what we're talking about.
>All of these VMs running production Ora/OAM/OEM/Websphere workloads? No, don't oversubscribe. Some of them running similar workloads? More oversubscribe is ok. Few of them? Lots of oversubscribe.
All of the VMs are running Oracle/Websphere/Apache, per the link. That is what the article is recommending, which is where my skepticism came and is coming from.
>I am having a generalized discussion about virtualization oversubscribe. You are having a specific discussion about CPU-heavy DB workloads. Apples cannot be compared to oranges.
Everyone in this thread is talking about the article, I guess everyone but you.
If those were goalposts, they weren't clear. The miss is that the article in the link doesn't discuss VMs or virtualization at all, so talking about oversubscription/etc seemed like an ad-hoc commentary on the hardware configuration unless I totally missed that in the article, but searching for "virtual", "VM", "vCPU", "subscribe", etc didn't find anything. There's a limited amount of effort I'm gonna put into digging into additional links on "Planet Mainframe" pushing mainframes.
That aside, comments on articles are sometimes more general. "How do mainframes compare in 2021?" is a conversation which is not intrinsically linked to "should you oversubscribe VMs 66:1 for this specific workload?", and I interpreted your comment as the former. I wasn't talking about the article at all.
> The miss is that the article in the link doesn't discuss VMs or virtualization at all, so talking about oversubscription/etc seemed like an ad-hoc commentary on the hardware configuration unless I totally missed that in the article
There is a giant table in the middle of the article that is impossible to miss if you read it. It literally says # of VMs, as well as calling out VMware and z/VM as the hypervisors in use. The oversubscription is a simple math problem, and that's assuming the best-case scenario.
The article's price tag of $2.3 MM for 12 HP Servers with 24 cores per server and 166 GB of RAM each is ridiculous.
We just paid just under $50k per server for on-prem physical servers that have 144 cores per server and 1.5 TB of RAM each.
For 12 of them it would be less than $600k and you'd get a total of 1,728 cores and 18 TB of RAM to compare against the mainframe's 30 cores and 2 TB of RAM for over 4x the cost!
4x the cost is actually not bad for the massive redundancy, hotswap-everything, boatloads of IO, insane build quality and critical legacy software support a mainframe gives you!
Apple’s Macintosh made waves with its cheeky hello world demo: “Never trust a computer you can’t lift” (…and defenestrate).
Similarly, with an IBM big iron box, if something goes wrong, I sure hope the 6-figure support contract gets it fixed - probably within a few hours if you're in Manhattan or the SF Bay, but what about elsewhere? Whereas with commodity HP/Dell/Supermicro pizzaboxes in an HA configuration I don't need to worry about needing a same-day fix for hardware issues - I could put things off for months, even - and probably do it myself and spend the support contract money on a Plaid Model S and have money left over for a Mac Pro with XDR display (with the cool stand!).
I know IBM’s srsbsns machines have their own redundancies - including being able to hot-swap literally everything - but it still feels like open alternatives are overall better in the long run.
Using list pricing comparisons of course favors the mainframe in this comparison. Since commodity hardware is sold at a fraction of list price and mainframes are not.
It's been a long time since I've run a VMWare cluster, but that's exactly what they told you to do - oversubscribe memory and let their memory dedupe, balloon driver + swapping handle the case where you run out of physical RAM.
Starting with ESXi 6, VMware disabled transparent page sharing (memory dedupe) across VMs by default. Back in 2015 or so there was a theoretical VM escape vector that used the shared memory pages, so VMware just changed the default to disabled. I usually recommend turning it back on (via the Mem.ShareForceSalting advanced setting, if memory serves) in lab environments, but leave it off in production for security reasons.
Yeah, this whole LPAR thing is supposed to make oversubscribing less of an issue, but in practical terms it is comparable to Xen PV without HVM mode (essentially Vt-i without Vt-d). It's not that it never works, it's just that you have to take severe shortcuts and snowflake kernels to get it to work in the first place (so now you're incompatible with the body of knowledge that is the internet), and then you essentially just do multiprocess CPU core sharing, memory sharing with swapping, etc... not great.
VT-d is used for device passthrough. I suspect you're thinking of VT-x. But LPAR have been around forever, and aren't that different conceptually from jails, systemd-nspawn, LXC, and any other of a large number of OS-level virtualization solutions, except that you can stub in/out different kernels.
Yes, the kernels need to be "snowflakes", but that's not necessarily an issue if this is functionality you really need.
LPAR is not OS-level virtualization; you can install Linux on an LPAR (though normally you'd do it with z/VM). It's more like a hard, hardware-level divider.
Essentially it's a microcode hypervisor. You can indeed install Linux on an LPAR, but it will be a useless Linux unless you intend to use it for classic 'big box' installations (with 'big box' referring to the cardboard boxes that come with installation floppy/CD/DVD media, that type of software 'distribution').
While an LPAR could technically be HVM, the nature of the changes required to an OS make it PV-like.
If someone wanted to make a comparison, it's like additional 'rings' in the x86 ring security model, but instead of nested rings it's parallel rings. You might as well have a single OS with containers.
> Roughly 2GB of memory per server [...] so your VMs are going to be memory starved pretty much 24/7.
Most applications don't need gigabytes of memory though? I don't know what one uses a mainframe for, but web servers, file shares, remote login software, proxies, traffic scrubbers... lots of common applications use up to a few dozen megabytes of RAM; throughput is much more important there, and memory is just for some state and buffers. Even for a database with millions of rows and a dozen indexed columns, the index (I just checked a database of my own) is in the hundreds of megabytes, far from even one gigabyte.
It seems a bit odd to assume each server always needs multiple gigabytes of RAM.
The purpose of this article isn’t to actually argue that mainframes are cheaper. It’s an asset for a CTO who’s staying on mainframes. When the CFO asks him “why aren’t we moving to Azure? I heard my buddy say he saved $40 million doing that,” the CTO sends this article.
If anything, this article shows why financial companies aren't moving away from the mainframe. It is, comparatively speaking, such a small amount of money while the risks are incalculable. Upgrading to a z15 is a much easier option.
It is not shown, but I expect that Oracle EE is probably the most significant contributor to software licensing costs. If Oracle licensing was computed for 288 cores, that is totally unrealistic. Oracle EE would only run on a subset of cores.
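Back of the envelope, assuming Oracle's list price of roughly $47,500 per EE processor license and the 0.5 core factor for x86 (both figures from memory, so treat as approximate): 288 cores × 0.5 × $47,500 ≈ $6.8M in licenses alone, before annual support.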
If I recall correctly, Oracle licensing would actually require you to pay for all 288 cores on the host even if just a subset of the VMs in that virtualization environment run Oracle.
True for all x86 virtualization I know of.
But not true for IBM Power LPARs where only the entitled CPU resources have to be licensed and can even be shared between LPAR (Power Enterprise Pool).
I don't know for sure but would therefore assume that it's the same for the IBM mainframe platform.
In 1996 I remember Oracle was licensed per user. We were planning to use it to back a Netscape Publishing System (IIRC) for a portal with both open and subscription-based areas.
Guess what happened when Oracle learned we were planning to have 3 million users...
I know Fujitsu has a version of Postgres made for LinuxONE (which is just Linux on z/Architecture). No idea what they are charging for support of it, though.
Yes, whatever happened to DB2, or whatever it's called today (it was UDB on mainframes, then DB2, though that was different from the DB2 on AIX and x86; then there was this fetish to prefix things with the word "web" or suffix them with the word "sphere" and I lost track around that time).
The z13 hardware is a bit more costly, but yearly maintenance and ops costs are lower. The margin would probably be lower (you'll still want some support from vendors), but the convenience of having your whole data center inside one box is tempting.
At least some years ago, Oracle required you to pay per underlying core - you don't need to know how many you are using at any point in time, you pay for all cores that can be used.
Interesting how they assume static loads, 1-to-1 transferability from x86 to whatever architecture the Z wants to run (s390x?), and that admins for IBM systems are as available as x86.
It seems that this would never work in a fast moving and fast scaling environment, and the added 'special' sauce makes it incompatible with the broader FOSS ecosystem (including when "it really is the same as Linux on x86" - which it never is).
Say you really do run the RHEL+Apache Webserver+Websphere+MQ+Oracle stack, are you really in a position to make sane choices anyway?
Better yet, if I have 10 people doing shared services on a public cloud supporting 150 developers developing applications that run on K8S that scales from 10% to 1000% on a daily basis. How would this IBM mainframe (or even the ancient vSphere 4.0 on blades) deliver any value over K8s and a cloud? Do I fire 100 developers to then hire 20 IBM auditors, 20 IBM mainframe maintainers and 10 legacy developers? Because cost-wise that would be the same, yet output-wise it wouldn't nearly deliver the value we would have had before.
And on top of that, the data tables are images? wtf?
I'm sure you can construct a scenario where a mainframe would 'win' (as if there is such a thing, only 'fit' matters), but commodity workloads on commodity resources has 'won' a decade ago. The only ones that are stuck are specialist cases or 'classic' multinationals.
> It seems that this would never work in a fast moving and fast scaling environment
A fast moving environment by definition would never even consider mainframe computing. And most places who would consider mainframes are already at whatever scale they will be at for the conceivable future.
I used to work for a public healthcare company in the US that used mainframes extensively and when I was there we legally couldn't open any new locations. The only way to grow the business at all was to acquire existing ones, a multi-year process. So any changes or integrations typically had 12-18 months of lead time, and this was after everything was negotiated and signed. We typically heard about it a few months before that so sometimes up to two years.
And as you can imagine with the amount of red tape and regulation around healthcare, and the requirements around being on the stock exchange, change control was a pretty arduous process.
SWE working in financial services here. Have some insight into decision making wrt tech.
> How would this IBM mainframe (or even the ancient vSphere 4.0 on blades) deliver any value over K8s and a cloud?
The value is in mitigating risk. The engineers want the new and shiny toys, and for all the right reasons (development velocity, etc.), but the managers fear change because change == risk.
> Because cost-wise that would be the same, yet output-wise it wouldn't nearly deliver the value we would have had before.
Likely, you are right, but the budget committee will exercise some financial gymnastics in order to justify the outcome they want, instead of the budget dictating decision making.
> The only ones that are stuck are specialist cases or 'classic' multinationals.
Wrt banks, they are stuck on cultural inertia and risk mitigation.
None of your points really hit at the merits of the mainframe. I think you are trying to shoehorn a rent-vs-buy discussion into this, and there are many scenarios where running your own servers is way cheaper. Maybe your business doesn't fit that scenario - and that's fine. Then maybe the datacenter you are running on would be buying it.
"... and that admins for IBM systems are as available as x86."
Yep. My organization is trying to move away from mainframe and COBOL due to limitations in available personnel. The systems perform well, but that doesn't matter if we can't find people to keep them running.
> and that admins for IBM systems are as available as x86.
If the VM runs Linux, it's just Linux running on s390x. It's no weirder than ARM and has been on the Linux server market for far longer. I think you can install Linux directly on an LPAR. The part admins will need to learn is zVM, which isn't much more complicated than KVM or vSphere (and is much easier than K8s). The hardest part is the jargon and acronyms: IBM invented a lot of things that other manufacturers named differently.
> and the added 'special' sauce makes it incompatible with the broader FOSS ecosystem (including when "it really is the same as Linux on x86" - which it never is)
A mainframe usually can run Linux on zVM and it behaves just like a 5.2GHz VM with a very fast IO subsystem. From the top, it just looks like a very large computer that can run a lot of VMs in a single system. IBM started hitting diminishing returns with the number of cores and that's why core counts aren't increasing as they do on ARM and x86. Under Linux, the cores do SMT2.
> How would this IBM mainframe (or even the ancient vSphere 4.0 on blades) deliver any value over K8s and a cloud?
zVM is very mature technology and Z hardware and software have been coevolving for longer than most of us have been alive. More interestingly, you can partition your mainframe (into LPARs) and have several completely separated and isolated systems. It’s below zVM, so it doesn’t even suspect it doesn’t have the machine for itself. It's very common to run production, staging and development on the same system this way, as if you had three separate machines.
> 10 legacy developers?
Unless you plan to deploy your web app written in COBOL running on CICS and zOS, I'd suggest using different tools. If you already have systems running on zOS, you can easily access them from Linux VMs hosted on the same machine, through an imaginary network that's really fast (because it's not really a network; IBM calls it HiperSockets).
> Because cost-wise that would be the same, yet output-wise it wouldn't nearly deliver the value we would have had before.
As the article points out, the mainframe itself is a little more costly, but the operational cost is lower. The convenience of having a single very reliable system instead of a cluster of less reliable ones can't be ignored. As Seymour Cray once said, it's better to plow a field with two strong oxen than with 1024 chickens.
There are downsides - IBM doesn't have these machines in stock the same way you can order a Dell, and ordering one is not a simple process - they'll build one for you, tailored to your needs. They'll probably deliver it with a couple extra CPUs so that when you need to upgrade it, you just pay the license and activate those resources.
One cool thing the z15 does is to activate all processors on boot to speed up the process. After the machine is running, the parts you didn't pay for shut down and the machine continues working.
The author is the CEO of DataKinetics, a major mainframe software company. I'm genuinely surprised that this wasn't listed as a disclaimer in the article--it represents a massive conflict of interest that calls the results into question, since the author would profit from companies remaining on the mainframe.
In case you're wondering why Planet Mainframe wouldn't require such a disclaimer: DataKinetics is one of their biggest sponsors, making this even more of an ethical minefield.
It's painfully obvious the web site is intended to promote mainframe technology, the author's bio is right there in plain sight, and you were able to find the web site's sponsors by simply checking the "About" link. It would certainly be unethical if they intended to deceive people about their purposes, but I'd be hard-pressed to find a way for them to be more obvious.
I find the overall tone of this thread a bit strange.
Running a mainframe comes with the challenges of finding specialists, relying on a single vendor, having to buy at least two in two different data centers for disaster recovery, etc. But purely from a performance point of view, as in throughput, and also quality of service, the case is clear for the mainframe.
Spend some time reading the z15 technical guide and marvel at what is today the state of the art in computing. We are talking about sustained, full-on 5.2 GHz, all the time. Massive IO bandwidth, ECC memory everywhere, massive caches, native crypto co-processors.
All this, while your x86, most likely, spends most of its time waiting on memory...
"A Crash Course in Modern Hardware by Cliff Click"
What cloud vendors are doing is using, sometimes custom, hardware to create a mainframe at cloud scale. It's just due to the sheer incompetence of IBM management that they were unable to leverage their mainframe technologies and offer cloud services that would be competitive on a price per compute or storage unit.
They appear to be bragging about 12GiB/s of AES-256-GCM per crypto accelerator. That is equivalent performance to about 2.6 cores of EPYC Milan, but let's be generous and say 3 cores. At retail, 3 cores of EPYC Milan cost $240-$360. How much does one of those crypto accelerators cost?
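(For the arithmetic behind that 2.6-core figure: assuming roughly 4.6 GiB/s of AES-256-GCM per Milan core with AES-NI, which is in the right ballpark, 12 GiB/s ÷ 4.6 GiB/s per core ≈ 2.6 cores.)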
There's also a comparison of z15 Java performance vs x86. The advantage for the mainframe comes from additional computing capabilities, like dedicated compression hardware and other components the x86 platform misses.
OK, the benchmark is from June 17th, 2020, but it's using a Xeon 6140, a Skylake chip from Q3'17. That gets crushed even by a $125.00 E5-2670, let alone whatever a similar budget buys you today.
So, I am guessing actual head to head benchmarks must be quite bad.
I can easily write a memory bound benchmark that makes all CPU cache irrelevant. The only thing that actually matters is real world performance which is why actual benchmarks are so critical.
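For example, a random pointer chase does exactly that. A minimal sketch (sizes are arbitrary; it wants ~512 MiB of RAM, and every load depends on the previous one, so caches and prefetchers can't help):

```c
/* Memory-bound microbenchmark: a random pointer chase over a working
 * set far larger than any cache, so nearly every access is a main
 * memory load and raw memory latency dominates, not CPU speed. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1u << 26)  /* 64M entries * 8 bytes = 512 MiB, >> any L3 */

int main(void) {
    size_t *next = malloc(N * sizeof *next);
    if (!next) return 1;

    /* Build a random cyclic permutation (Sattolo's algorithm) so the
     * chase visits every slot in cache-hostile order. */
    for (size_t i = 0; i < N; i++) next[i] = i;
    srand(42);
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;    /* j in [0, i) */
        size_t t = next[i]; next[i] = next[j]; next[j] = t;
    }

    /* Each load depends on the previous one, defeating prefetch. */
    clock_t start = clock();
    size_t p = 0;
    for (size_t k = 0; k < N; k++) p = next[p];
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    printf("%zu hops in %.2f s (~%.0f ns/access)\n",
           (size_t)N, secs, secs * 1e9 / N);
    return (int)(p & 1);  /* keep the result live */
}
```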
> Xeon® E-2386G Processor
That’s a chip from 2018. Anyway, you can talk about CPU clock speeds all day, but it doesn’t mean much between different CPU families.
Can't reply to response from Retric below so will do it here.
Wrong choice indeed: the Xeon® E-2386G Processor is not what I was looking for. Instead, let's review the latest offers from Intel:
A few models have a peak turbo frequency approaching the sustained frequency of a Z core. Also, of course, due to different architectures and internal caches, we should not be comparing purely on frequency. Although the mainframe is again winning here in both frequency and cache sizes...
Also agree that what counts are real workloads with real commercial applications. I have so far posted at least three, but here is another one:
Not sure about that, but the idea I was proposing was use the technical capabilities that allow them to build the mainframe. I am aware of the usual challenges of distributed computing, eventual consistency, etc...
But they could have created a computing offer, using the pure core performance of the Z processors and the technologies that enable things like z/OS Sysplex and do it at cloud scale. They could then maybe be competitive on a price scale:
The author ignores that large cloud providers design their own hardware. They are not using generic systems, but are designing their own. My own personal hypothesis is that the "servers" that run the major clouds will converge on something that looks like mainframes.
They won't be mainframes, because as the author does correctly point out, mainframes are unmatched at large-scale, high throughput, resilient, stateful, transaction-heavy workloads. While that is part of what the major cloud providers support, it is not everything. Because it is not everything, the systems the major cloud providers design and build will look different from mainframes, but I think they will have key similarities: virtualization and security features provided by the hardware; high reliability through redundancy and in-hardware error-checking; easy physical maintenance through hot-swappable components.
"They won't be mainframes, because as the author does correctly point out, mainframes are unmatched at large-scale, high throughput, resilient, stateful, transaction-heavy workloads."
So like DynamoDB, DocumentDB, SQS, Service Bus, and the vast majority of managed cloud services?
That's literally the entire reason to use cloud - to offload management of stateful services.
If I wanted just to run stateless docker containers, I can set that up myself on bare metal in a day.
No, not like them. Those services are not handling the number of transactions a second, reliably, across a single database, as companies like Bank of America or Visa are.
Does VISA do more transactions than all of AWS combined? Is a single mainframe more reliable than the entire datacenter running AWS service? Is Bank of America's data literally a single SQL database?
> high reliability through redundancy ... easy physical maintenance through hot-swappable components
I don't think hyperscalers will ever care about this. They have millions of servers. They can just take the whole server out of commission as soon as it fails a health check. This is a critical difference between the mainframe model and the modern cloud model.
Am I missing something here, or are they paying $190,000 per HP server? With each one being a two-socket 24-core server. I am also not sure how to read the RAM part; I'd assume it is 2 TB in total for all 12 servers, as otherwise it would be a very weird comparison if the 12 servers also get 12 times the RAM. But if they actually put 2TB of RAM into each of these servers, it would explain the insane prices to some degree.
Being single-sourced is a major issue. One of the reasons FAANG have created their own commoditized server standards is that they have the might to push vendors around and force them to compete in price.
IBM is the only company that can sell you a fully licensed (you need a license to activate CPUs) IBM mainframe.
Agreed. I'm a mainframe fan, even, and this whole article is just a load of bullshit.
Starting with making a comparison on list pricing, when commodity hardware has huge discounts, mainframes don't. Then moving on to comparisons of running Oracle on a slew of tiny VMs with ~2GB RAM each. And, no benchmarks anyway, so what does "beats" mean?
This is an old study, and a biased, poorly written review. Which is a shame.
It also should not focus on HUGE cloud providers like Amazon, Google, Microsoft. Estimating licensing and maintenance costs for running at that scale would need access to their internal documents (as far as I know, none of them have published all the numbers). Hardware costs are hard to pin down: these companies have special specs for what they want, and having custom computers produced should put the price higher than off the rack, but given how many servers they buy that probably changes.
In today's world, cloud can run high-volume transaction loads, and the most recent IBM z15 and LinuxONE can run normal cloud tasks. I believe that a proper comparison in 2021 would be a lot closer than what most x64 people would expect. As long as we ignore AWS/Azure/Google scale and think of smaller "in house" cloud-like datacenters, it should be part of the estimation.
In my opinion, DB2 running on a z15/zOS is hard to beat on uptime, reliability, and throughput. There would also be less need for sharding and clustering headaches.
A z15 can also be loaded up with Specialty Engine Support, like the IBM Integrated Facility for Linux (IFL) or the IBM zEnterprise® Application Assist Processor (zAAP). That means adding a bunch of CPUs that are each dedicated to one thing (in those two cases, running Linux and running Java), all offloading the main system.
One big problem in comparing is that the terminology and semantics are quite different. Reading specs does not make it easy to compare 1-to-1.
I had hoped I would end up at a place that used modern mainframes for "modern conventional" workloads. I haven't had that chance yet.
> Estimating licensing, maintenance costs for running at that scale ...
I am pretty sure AWS, for example, does not have major licensing payments for software. Their stack is a mix of open source and self-written software, and none of it requires regular license payments.
So all the software costs that the study talks about would be reversed -- it is hard to compete with $0. And any extra work in system administration will be split over hundreds of thousands of servers.
I would think AWS pays a lot in licensing: Oracle Database, Oracle MySQL, Microsoft SQL Server, Microsoft Windows. But that's microscopic compared to having to run proprietary software on all servers.
System administration over hundreds of thousands of servers is a lot more complicated than over one or a few z15 or LinuxONE machines.
In addition, the z15 has battle-hardened reliability (far above AWS servers). All the infrastructure to feed and care for hundreds of thousands of servers is a lot more complicated, even if all you have to do is shut down servers that are not working and never turn them on again.
It is one of the biggest advantages of the z15 platform.
I’m confused: they are claiming the same compute for the same cost, but the mainframe is one 30-core chip and the HP side is 12x 24-core chips. Hard to believe IBM has a chip 10x faster per core.
There are some relevant architectural differences that explain that. Most fundamental is that the Z CPUs can run continuously at 5.2GHz. Another difference is that these CPUs are assisted by a multitude of smart IO devices, each its own computer, that offload almost everything but application code off the CPUs. They also have humongous caches and will not wait for main memory as often as x86's.
I tried out hardware-accelerated compression on a z15 and it just speeds things up a stupid amount, something like 20x in decompression and 200x in compression [0], which is not something that is done in many other places yet. Though the PlayStation 5 does have it [1].
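The nice part, at least for Linux workloads, is that the accelerator sits behind the stock zlib API, so ordinary code gets offloaded without changes on s390x zlib builds with DFLTCC support (on some distros it's opt-in via an environment variable; check your zlib's docs, I may be misremembering the exact switch). A minimal sketch of what "ordinary code" means here - this is just the portable zlib API, nothing z-specific:

```c
/* Plain zlib compression; on an s390x zlib built with DFLTCC support
 * this same call gets offloaded to the on-chip accelerator, and runs
 * in software everywhere else. Build: cc sketch.c -lz */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <zlib.h>

int main(void) {
    const size_t n = 16 << 20;           /* 16 MiB of compressible data */
    unsigned char *src = malloc(n);
    if (!src) return 1;
    memset(src, 'z', n);

    uLongf dst_len = compressBound(n);
    unsigned char *dst = malloc(dst_len);
    if (!dst) return 1;

    if (compress2(dst, &dst_len, src, n, Z_DEFAULT_COMPRESSION) != Z_OK) {
        fprintf(stderr, "compress2 failed\n");
        return 1;
    }
    printf("%zu -> %lu bytes\n", n, (unsigned long)dst_len);
    free(src); free(dst);
    return 0;
}
```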
Comparing hardware costs (whatever you might think of how they've done it) doesn't really hit at the problems of running a mainframe and the TCO involved.
1) Single supplier. Have a falling out with IBM... you're stuffed. That's a major business risk. With x86, don't like HP? Go to Dell, etc.
2) Staffing! What's the relative availability of good S/390 admins and devs, compared to x86? IMHO one of the reasons first mainframes, then AS/400, then proprietary Unix lost out to Linux on x86 was ease of getting staff, which is fuelled by the fact you can start out with a home PC and learn the kind of skills you need.
I’m not sure that’s true. Mainframes can scale up as well, in my limited experience you get the thing fully loaded but don’t necessarily pay for all of it. If you need to scale up you just unlock another set of CPUs and pay the vendor a bit more.
IBM designs their own POWER processors, they also bought RedHat and can manufacture entire systems.
I know this is not a popular view, but IBM is the only vendor other than Apple that can build an entire machine from the processor to the operating system. It could be interesting if their systems were cheaper so that x86 had some competition.
How are Apple and IBM the only companies that can build their own machines? Amazon has their own (arm) CPUs and use Linux. They don't sell their servers but you can use them on AWS.
I knew based on the headline alone that this was going to be a fun read. After I read the article, I went to get some popcorn.
"Whether intentional or through ignorance, there is a great deal of bias against the mainframe. It’s too expensive! (It clearly is not.) It’s old and dusty! (Obviously not.) It’s hopelessly outdated!"
There might be 'some' bias against the mainframe purely from technology perspective, but I feel a large part of it is due to culture and people around mainframe. So the management uses the 'bias' as a guise to clean up the shop.
At the end of the day, most large banks, most large insurance firms, most credit processors, and most large government agencies use z/OS and z15-class machines. I work on mainframes and they are the source of data for just about all lower-level platforms and companies.
I know of a finance company where years ago, IT was outsourced to IBM. Many of the VPs and Directors lost huge pensions when converting over to IBM. Once the IBM contract was discontinued, and those VPs/Directors returned to direct employment, an interesting thing happened. Those VPs and Directors colluded to eliminate the mainframe at any cost. Hypothetically, redeveloping an application in Java cost $5M to save $1M in mainframe cost - approved. Anything to hurt IBM’s chance of taking over IT again.
Is this a cautionary tale about IBM, or outsourcing in general? The company that outsourced their IT operations to IBM probably outsourced the development work for that $5M Java application too.
I’m sure Infosys or TCS got a good portion of that chunk. I was mostly conveying the story, because decisions to get off mainframes aren’t always pure financial analysis… vindictive VPs are likely inhibiting IBM in other companies too.
The bottom line seems to be "licensing cost savings make up for hardware cost differences"; but the assumption to that is running Oracle, which is a terrible business decision for almost any company.
Overall, I want to find a reason to get excited about this market segment because I'm a processor nerd, but the arguments are profoundly uncompelling.
The worst part is that there's no entry-level for the platform - the smallest box will still cost you an arm and a leg. On the bright side, it's not much more expensive than a similarly capable (and reliable) pile of generic hardware and much easier to assemble (you just pay IBM to do it). They used to have entry-level deskside boxes, but not anymore.
You can run the latest zOS on Hercules/Hyperion, but IBM will not like it - and most certainly won't sell you a license. You won't be using a cool CPU, only a normal one pretending to be cool.
You can get a Linux VM from the LinuxONE Community Cloud. It's fun to play with - feels like a fast two-core (probably running SMT2 on a single one) with wicked fast IO but, apart from that, nothing too special. It has a lot of performance counters in the CPU too, so that's fun at least.
I'm surprised by the operations cost for MF. In my org, we see significantly higher ops cost for MF over x86 simply because MF expertise is harder to find. Local colleges aren't seeing tons of high school students lining up to take MF courses/certs. The MF experts are at retirement age.
Consider: older company moving stuff to azure, google, Amazon from private/colo data center. The Linux stuff moved. The as/400 and related mid-range mainframe left. What does a prudent company do with it?
iSeries (what they call the AS/400 these days) is not going anywhere soon. You can even get one from the IBM Cloud if you want to go cloud with yours. If you don't, you can still upgrade them. You can still run anything a first-gen CISC-based AS/400 could on the newest POWER9-based hardware (they are just pSeries with some hardware key that allows them to load iSeries OS).
They are not called mainframes though. They are midrange systems, the last remaining minicomputer breed. They are expensive, reliable, their OS is alien to most of us, but they don't get the mainframe badge.
I once worked for $CORP that did a cost/benefit analysis of their mainframe, and it came out way cheaper in terms of TCO than the alternatives. The biggest challenge was getting staff, so there was a big line item for costing out 'training and nurturing whole-career staff'.
In other news...
- Beer Fanatic Magazine says "Beer is far better than wine or liquor!".
- Democratic National Committee says "President Biden is our best President ever!"
- Puerto Rico Tourism Board says "Puerto Rico is a fabulous winter vacation destination!"
- ...and HN'ers take all those claims very seriously, and start picking them apart.
(Yes, I know there are use cases where mainframes are the best. Zzz...)
I might have misread some of the tables, but I think it was saying that with the ProLiants you get 10x the compute power of the mainframe, and it costs nearly 30% more when you do that.
Maintenance cost for the z13 divided by 1,000 VMs would be $22.48 a month. Add depreciation and it comes to $46.55 per month per VM. The machine has 2TB of RAM and this VM would have 2GB committed to it, but CPU-wise it'd have less than a core (because you don't have a thousand cores on a z13).
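(Implied split: $46.55 − $22.48 ≈ $24/month of depreciation per VM, i.e. roughly $289k/year of depreciation across the 1,000 VMs.)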
You can get something similar (on a z15) for free from the LinuxONE Community Cloud.
If you were to specify an E2-custom instance from Google with so few resources it would only cost $46/month. So either this slice of a mainframe is 4 times more useful, or it's bullshit.
Yeah, they are not really that comparable, but I don't think they have any other z/Architecture VPS available at IBM (or at anyone else, for that matter), so I thought that was the best comparison.
Right, the article could have just done this and compared IBM Cloud vs Amazon/Google/Azure for equivalently powered instances.
If what the article says is true, then for a similar workload IBM Cloud instances ought to be much cheaper than the competition.
I only have experience with GCP, but to me it looks like GCP is much cheaper than IBM Cloud, so something must be extremely off in the calculations done in the article.
How is Oxide practically different, from the application point of view, than the vBlock/VxBlock/FlashStack/etc. piles-of-servers-with-modestly-better-integration solutions?
I really thought they were going to do fully disaggregated clusters, but it looks more like vxblock-with-remote-management.