
A mainframe is not a supercomputer. Science is done on clusters of x86 chips and Nvidia GPUs with custom interconnect and Linux.

A system that runs a bank or an insurance company and wants good availability can be built in one of two ways: you either spend money on software that deals with the hardware being unreliable and save money on software (pioneered by Google) or you spend money on hardware that promises to be highly reliable and save on software.

No new player believes a mainframe (extremely expensive hardware) is cost-effective; they all use commodity hardware.

A bank that needs to run binaries from the 70s for which they don't have the source code can keep paying IBM rather than invest in reverse engineering the binary and reimplementing it in Java.

A bank that has a billion lines of COBOL can compile it to run on the JVM on commodity hardware, run it in parallel with the mainframe for a year to validate, and then switch over to the new system and stop overpaying for hardware. But that sounds risky, so they keep paying a million dollars for a system with the same performance as a fifty-thousand-dollar server.
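
The parallel-run idea is, in outline, something like this sketch (the client objects, method names, and result format here are hypothetical placeholders):

    # Hypothetical parallel-run harness: feed every transaction to both
    # systems and log any divergence before trusting the new one.
    def run_parallel_validation(transactions, mainframe_client, jvm_client):
        mismatches = []
        for txn in transactions:
            legacy = mainframe_client.process(txn)     # system of record
            candidate = jvm_client.process(txn)        # shadow JVM system
            if legacy != candidate:
                mismatches.append((txn, legacy, candidate))
        return mismatches

    # Only after a long stretch of zero mismatches on the full production
    # feed does the candidate become the system of record.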




> you either spend money on software that deals with the hardware being unreliable and save money on software (pioneered by Google) or you spend money on hardware that promises to be highly reliable and save on software.

Seems like the first "save money on software" shouldn't be there.


Sounds like it should be "spend money on software", as the software becomes much more complex when the hardware is unreliable (and distributed rather than one big box).


Yeah, I think maybe they meant to say "save money on hardware" since they already said "spend money on software" earlier in the sentence.


Should have been "... and save money on hardware (pioneered by Google) ..." but too late to edit now.


Google pioneered something done before Google existed?


Change it to "made famous by Google". Do you have an example of a famous large system built out of many cheap computers (cheaper than normal servers) that predates Google?


> Do you have an example of a famous large system built out of many cheap computers (cheaper than normal servers) that predates Google?

That approach is, in scientific computing, at least as old as Beowulf (1994, four years before Google was founded), the prototypical system from which we get the term “beowulf cluster”.


Oh, as scary as it sounds, it's entirely possible to save current-day unicorn-runway money on metered processing duty cost software compared with the Oracle installations that you find at this level. Especially when you are pre-processing using the accelerator cards that IBM doesn't include in your CPU budget. I keep telling myself that my retirement plan is to write the missing "DB2 On Z Is Not Any DB2 From Any Brochure You Read" handbook, plus the supplementary "The Difference Between Larry Ellison And God Is God Runs A z/VM Sysplex." Not being funny, but do you honestly like the fact your hardware can't return correct values for your financial accounts? I am drooling over FPGAs on Xeon MCMs for selling metered coprocessor logic.

Edit: accidentally included runaway thought process...


You should try SAP


Do you know any big bank not using mainframes for their core operations? 90% of big banks use mainframes, so there must be some big banks that don't use them. Interested to know who they are. Chinese?

It's OK to do most things with a bunch of normal servers, but when you need to handle a very large number of transactions with fast commits, and stuff like eventual consistency is not allowed, it becomes expensive to handle them no matter what.


Commonwealth Bank of Australia -- they were on mainframes, but spent a few hundred million dollars to get off.

One of the other big Australian banks (Westpac) has an "exit IBM" project as well, but it isn't complete yet.


I googled it. Apparently they moved their home loan systems etc. from a "windows-based mainframe" to UNIX. What is a "windows-based mainframe"? (HP Superdome??) They also moved onto SAP analytics.

There is no information anywhere saying that they moved their critical banking systems and transaction processing.


Unisys.

Unisys always had a smaller mainframe share than IBM, and they stopped making new chips sometime in the 2000s.

For at least ten years the fastest Unisys mainframes have been very big (proprietary) x86 machines running Windows with a Unisys mainframe emulator.


I worked for Unisys in 2002 or so and I always thought it was funny to see a mainframe able to run Pinball...

For some time I defined a "serious computer" as something that didn't have ports for keyboard and monitor.

Having said that, the pre-x86 A series were pretty cool. And ran the most user-hostile OS ever created, to the point that its very name was used for Tron's villain.


Going from IBM to SAP seems like no less lock-in?


But only on the software side. Dozens of companies will compete to sell you hardware solutions to run your SAP stuff on.


SuSE Linux on Lenovo, Dell, HP, whatever


> Do you know any big bank not using mainframes for their core operations? 90% of big banks use mainframes, so there must be some big banks that don't use them. Interested to know who they are. Chinese?

Monzo? They're getting bigger now, circa 3M customers. More than First Direct, Starling, Metro Bank.


Seems to me like the global-scale consistent database systems could really change this calculus (Google Cloud Spanner, FaunaDB). To me these products represent a significant capability that never existed before. I'm not entirely sure it would ever make sense for a bank to become dependent on hosted transactional storage, but maybe a "bring your own cloud" model for one of these systems (I think FaunaDB offers this; Cloud Spanner can't because of the custom network requirements for clock synchronization) ...


National Australia Bank moved their core banking to Oracle software running on commodity x86 servers.


Having worked there only a few years back, I can guarantee there was still a very large amount of mainframe activity - all the change management at the time was still done on an IBM mainframe.

I personally worked on a project there that involved building new virtualization platforms for not only x86/x86_64 Intel, but also AIX (Power) and Solaris (SPARC).


I do, but they still use enterprise grade servers that each cost almost as much as my house.


Is it really common for banks to run legacy applications without sources? How would they know or trust the behavior?


I can't speak for banks specifically, but for sure lots of old stodgy Fortune 500 companies have critical software running for which they have lost the source code. Similarly, critical closed source software for which the vendor no longer exists. And things like servers running that they are afraid to shut down...they have no idea if the output is used by anyone. It's a mess out there :)


Can confirm. The last two clients I worked with (both Fortune 500) used mainframes for business critical operations. The most recent one didn't actually have source for some components of their business ops software stack - they've just been working as-is for about 30 years and no one will touch them.

I was told they did an assessment a year or two ago to price out what it would take to move the business completely off mainframe and onto an x86 stack. It was in the hundreds of millions of dollars to do so because so much other software has been built to interact with and rely on the mainframe over the years that switching off it would be a multi-year effort across every department in the company. So of course that ROI calculation was pretty damn easy and the mainframe isn't going anywhere.


> so much other software has been built to interact with and rely on the mainframe over the years

So just... have the new server expose itself over the TN3270/TN5250 protocols, such that these interoperating systems see what they expect to see? (You could even build the new system as a regular REST API or something at its core, and then build the TN3270/TN5250 exposure as a separate gateway service on top, such that it'd be easy to shut it off later if everything finally moves off it one day.)
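
As a very rough sketch of that gateway shape, with the legacy side reduced to a toy line-oriented TCP protocol (real TN3270 data streams are far more involved, and the REST endpoint and record layout here are made-up placeholders):

    import json
    import socketserver
    import urllib.request

    REST_BASE = "http://new-core.internal/api"  # hypothetical new core system

    class LegacyGateway(socketserver.StreamRequestHandler):
        """Toy stand-in for a green-screen protocol: one account number
        per line in, one fixed-width record out."""
        def handle(self):
            for line in self.rfile:
                account = line.decode("ascii").strip()
                url = f"{REST_BASE}/accounts/{account}"
                with urllib.request.urlopen(url) as resp:
                    record = json.load(resp)
                # Render the reply in the fixed-width layout the old
                # downstream systems expect to parse.
                reply = f"{record['account']:<12}{record['balance']:>14}\n"
                self.wfile.write(reply.encode("ascii"))

    if __name__ == "__main__":
        # Port 3270 chosen as a wink; any port works.
        with socketserver.ThreadingTCPServer(("", 3270), LegacyGateway) as srv:
            srv.serve_forever()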


That's one of many possible integration points. LU6.2, SNA, batch file transfers, remote DB2 connections, MQ queues, etc.


Just?


If you were dependent, to the tune of hundreds of millions of dollars, on certain hardware, wouldn't it make sense to start the process anyway?

You don't necessarily need to move the $new project off the legacy hardware, just write it in such a way that you can do so easily later.

I'm eliding many details here, but the principle stands.


Choose one:

- Spend $300,000,000 over 10 years moving to a new system, and hope that by the time you've done that it's not obsolete.

- Spend $1,000,000 a year on a system that still works.

You could run the system for 300 years for what it would cost to replace the system.

In spite of the collective wisdom on HN, there aren't a lot of companies in the world with hundreds of millions of dollars sitting around doing nothing. Even ones that work on mainframes.


I would characterise it as option 3

- Spend $1.5 million maintaining the current system, but as bits get updated keep in mind the system that you'd like to have in 10-15 years' time.

You're going to have to replace the system within 300 years anyway, so you aren't saving that money; every feature you add that is reliant on the old system is literally technical debt, because eventually you'll have to rewrite it for the new system.

I'm not even saying you need to have a new system in mind, just keep in mind that you will be moving to a new system, so code appropriately.


Since a mainframe can usually be updated to the newest model (and you often run more than one for redundancy, as some people may need more than 5 nines), in 300 years the company will have a z150 or so, with millions of quantum entangled cores made of folded spacetime mesh around a pair of rotating singularities running at a couple terahertz.

And it will still be able to run all your programs that were written and tested since the late 20th century, by the kind of organic entity we used to call "human".


Lighthearted retort: And web browsers will still be slow.

Slightly more serious retort: I'm not aware of any brands from 300 years ago. I'd be surprised if any of IBM's customers survive that long, let alone enough to keep IBM as a going concern.

Btw when you start folding the space time mesh, terahertz figures just become marketing numbers, what you really want to know is how many parsecs it can do the Kessel run in.


> I'm not aware of any brands from 300 years ago

You might be, you just don't realize that they're hundreds of years old.

For example, the insurance company Lloyd's of London is fairly well-known around the globe. It's 333 years old.


I suppose brand, yes, but it's not really a company. If you want to go down that route, the Church of England, and indeed England itself, are 'brands'.


What’s the meaning of those numbers? Where did they come from? How did you lose the source to a 300 million dollar software project?

If anything, this is a sign that mainframe users have way too much profit for what should probably be commodity software.


Not as long as IBM continues to make mainframes and components, which they do. If you were unable to source replacement hardware to keep it running indefinitely, then yes, but until then it's better to keep paying a couple million a year to keep your business running as-is than to spend hundreds of millions to effectively rebuild the entire infrastructure from the ground up in parallel to the running infrastructure.

Remember that these are large, public businesses. Explaining to shareholders that profits are going to take a noticeable hit for years because of IT investment that isn't strictly necessary is effectively a non-starter.


"rebuild the entire infrastructure from the ground up in parallel"

I wasn't thinking that. I was thinking more like: when you add a new feature, fit it into an API that is portable. Or add a translation layer so the feature can be written how you would like the system to be in the future, but it works on your hardware today.


Everybody attempts this. But the problem is that the ground keeps moving under your feet. Your portable interfaces become irrelevant as software systems evolve. XML gives way to JSON, etc. Eventually it costs just as much in time and effort to interoperate with your 10-20 year old "portable" interface as it would to interface with the mainframe directly, which was already established to have been too costly to bother. So now you have two problems, not just one.


Possibly. First I'd say that XML didn't become 'wrong' when JSON came about. If that 1950s data format works for you, and you don't see advantages to moving, then fine, stay with it.

Second, I'd say there's a half-life to best practice. Over the 10-20 year time frame, I'd expect some of what you were doing is going to become outdated, yes, in the same way some of your knowledge over your career will become outdated, or the phone in your pocket will; you wouldn't use that as an argument against education, or against buying that phone.

Btw, if it's as costly to deal with the interface as with the mainframe directly, that's a win. The interface can move off the mainframe to somewhere cheaper/better.


> First I'd say that xml doesn't become 'wrong' when json came about.

It doesn't become wrong, it just evolves into another dead-end. The 3270 is an interface, too, but the reason everybody suggests replacing it or augmenting it is precisely because the spartan tooling and mindshare make it expensive. As XML recedes into history it is likewise becoming more expensive as an interface.

I chose XML vs JSON because I figured it was a transition everybody was somewhat familiar with. And younger programmers have an almost visceral dislike of XML, which I thought might help get the point across: an interface someone once thought (and probably still thinks) would help ease future interoperability becomes a reason or excuse for future programmers to avoid that integration.

> Btw, if its as costly to deal with the interface as with the mainframe directly, that's a win.

I think the problem is that you don't really know if it's as costly. The error bars on that sort of risk assessment are huge because our industry sucks at accurately predicting migration costs. And it sucks because complex software systems are intrinsically unique. Commercial solutions that claim to be able to capture and control all those dimensions of complexity tend to be sold by vendors with names like IBM and Oracle. Such vendors also pioneered interfaces like SQL, which is both a soaring achievement in terms of capturing complexity behind a beautiful interface while also falling epically short of what's needed to actually reduce long-term integration costs.


You seem to have made an assumption that the mainframe is “legacy hardware”, on a thread about the latest generation of mainframe hardware... As long as IBM keeps building mainframes the business case for anything else will remain unattractive. Unless you have some information to the contrary, it doesn’t make a lot of sense to invest (financially or technically) in switching.


It isn't about legacy, it's about being stuck on one supplier's hardware. That isn't a great place to be.


This is awesome in a terrifying way. The code itself has outlived every employee who knew what it does.


Hi, pedant here. That's the original meaning of the word: to inspire awe, and there is always a bit of terror in awe. A hurricane is awesome, as is a $deity that sends one.

But alas, language has changed and these kids won't get off my lawn.


Speaking of words that changed: sublime used to mean this. Something powerful and dangerous that inspired terror and a sort of admiration was said to be sublime. Here's Edmund Burke writing in 1757:

‘Whatever is fitted in any sort to excite the ideas of pain, and danger, that is to say, whatever is in any sort terrible, or is conversant about terrible objects, or operates in a manner analogous to terror, is a source of the sublime; that is, it is productive of the strongest emotion which the mind is capable of feeling.’


This is the kind of pedantry I come to HN for, thanks.


This is the kind of pedantry that makes me thankful for the [-] box


And this was particularly exciting in the run-up to Y2K.


As someone who works for a bank... no, not in my experience.

What is a problem is that the last time a lot of code was touched may be 5-10 years ago. That code may have started being written 20-40 years ago. It may well have been maintained by people who think "if it was hard to write, it should be hard to read" or "documentation is for the weak". It definitely will have been written when the cost of a gig of memory and storage was many orders of magnitude higher than today (indeed, last time I priced memory for a mainframe, a Z10, it ran to $10,000 a gig); hence terseness in everything from table and column names through stored data and everything else was prized. Dropping from COBOL into assembler is not uncommon for critical path performance.

Making any changes will be a week of coding and three months of working out the what and why of the code, because the last person who worked on it retired a couple of years ago.


Even having access to the source code might not be enough: I’ve seen cases of weirdly forked repos with custom additions and half-baked backports that relied on hacked system libraries and a weird mix of specific legacy packages in some symbiotic way, such that just one specific laptop of some long-gone developer was able to compile it, and a virtual machine of its Windows XP install was passed around.


With few exceptions, all non-open-source software is without sources. Not every legacy-but-critical app was in-house custom software.


A lot of vendor packages included source code licenses, actually. I'm not sure distributing binaries was even practical in all cases, since the code would have to be built with and linked to specific versions of system libraries, and keeping track of which customer is on which system would be a nightmare. I'm familiar with several loan accounting systems and financial authorization systems that are either nowadays totally maintained by the bank or co-developed with the vendor in a shared-source model.


> How would they know or trust the behavior?

It has worked for the past 30 years and hasn't been touched for the last 20.


Not just banks. Any long standing company has this problem.


I have worked with a dot-com that linked to commercial C libraries for which they never had the source. They did finally reimplement them, but not till 2016 or something.


Because it's been running a certain way for decades?


Yep, at a known fixed cost. Corps like knowns and hate unknowns, especially stalwarts like the various big banks. It's kind of like buying stocks rather than shorting stocks. Can you handle the unknown of potentially unlimited losses many times your original "investment" in the venture?


Re: binaries from the 70s, if the re-engineering costs are the only reason to stay on mainframe (rather than e.g. microcomputers not being up to the task IO-wise), then why not just run the workload in a z/OS emulator on commodity hardware? I mean, the modern z/OS machines are just running 70s-era workloads on an "older z/OS uarch" emulator anyways, so...


Technically that would mostly work, but IT leaders at banks and insurance companies would rather pay high premiums to IBM than stick their necks out endorsing anything risky. Renegotiating with IBM is the safest career move at such places. Cloud is steadily edging out mainframes at banks, but a lot of old-guard IT leads are circling their wagons around old blue: learning how to do high resiliency in the cloud is new and scary, while old blue is safe and well understood.


Using safe and well understood tech to handle my money sounds like a good thing.


It's hard to overstate the value of safe and well understood for core systems. After all, it's not like a massive migration to a new technology stack has ever blown up in anyone's face, right? (</s> if not obvious)


>then why not just run the workload in a z/OS emulator on commodity hardware?

Who are the vendors that provide a migration path from mainframe to a set of commodity hardware running an emulator, and subsequently provide continual maintenance and support?


The big cloud vendors have partners who specialize in such work. Random one from the first page of Google results: https://aws.amazon.com/blogs/apn/migrating-a-mainframe-to-aw...


Part of this is the 'partnership' arrangement.

Global 100 company gets IT from other global 100 company

vs

Global 100 company gets IT from small mainframe support shop

there's a question of shareholder liability, being able to adequately sue them for millions of dollars, the expectation that they will be around in 20 years, etc.


Total non-starter. IBM won't license the software for use on an emulator.


My guess is a mainframe would be much better at handling a high volume of transactions than a $50k server? I would also think it would better handle conditions where "eventually consistent" doesn't work? I'm sure most companies are running batch processes and data-intensive calculations that could easily be shifted to cloud workloads in practice, but it does seem like there are some use cases where a mainframe could be better?


No. If you need reliability and have money, separate commodity servers running a consistent (Paxos or Raft, not eventually consistent) in-memory database (the memory is a few terabytes) with logs to flash and to a disaster recovery site will outperform a mainframe.

A single computer with 4 sockets of Xeons will outperform a mainframe but will have more downtime.

The mainframe has best possible single thread performance and as much cache as possible, redundancy and parts can be replaced while it's online, but not that many cores.

The cost is very high - when I looked, 1 million per year is the baby version with only 1 CPU enabled and no license to run the cryptographic accelerator and limits on software, etc.

Commodity hardware you buy and use for 5 years, so the amount of good hardware you can buy for the price of owning a mainframe for 5 years is a lot.

For the money, you can buy a lot more CPU, ram, network, storage, etc and hire Kyle Kingsbury to audit your distributed database.
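
Back-of-envelope with the figures quoted in this thread (all of them rough):

    # Rough cost comparison using the numbers quoted above.
    mainframe_5yr = 1_000_000 * 5    # ~$1M/year, 5 years of ownership
    server = 50_000                  # one well-specced commodity server
    print(mainframe_5yr // server)   # 100 servers for the same money

Which is consistent with the "about 100x on hardware" figure that comes up further down the thread.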


>The cost is very high - when I looked, 1 million per year is the baby version with only 1 CPU enabled and no license to run the cryptographic accelerator and limits on software, etc.

At one point, IBM was selling base-level mainframes for $75,000 (see https://arstechnica.com/information-technology/2013/07/ibm-u...)

It is true though that a 'realistic' configuration is likely to cost north of $1 million, and that none of these numbers include the price of the software.


A standard IBM "gotcha" is to offer deep discounts and charge the 20% p.a. maint cost on the list price - e.g. "We'll give you this hardware for $1." then a year later "So sorry the maint is 3 x $250k x 0.2 = $150k p.a.".

Another is that the software and hardware designed to turn a baby system into anything useful is astonishingly expensive to people not used to dealing with this end of the market. Enjoy finding that your hypervisor is licensed at 5 figures per core, and that your cores are six figures a pop.


It’s the support contract that’s the real cost.


You have just described VoltDB. They worked with Jepsen years ago and fixed all the problems Kyle found.


Interesting. Thanks for the details. I didn’t realize how few cores mainframes had.


And it is quite a bit better than many $50k servers at handling a high volume of transactions while one of the CPUs and half of the RAM in the system are being physically replaced.


You are paying about 100x on hardware because your software is limited to a single address space.

Even if the mainframe never goes down, the entire site will go down (eventually the rack power supply, HVAC, fiber, natural disaster, backhoe, etc. will get you even if your CPUs and RAM are redundant and replaced before they fail), and then either your entire business stops or at least processing for that region stops, or your system is resilient to site failure because you built a distributed system anyway.

If you could rewrite your software to be distributed and handle a node/site going down, you could run a single site on 5 servers that together outperform the mainframe (by a lot) and can be serviced on a whole server basis (though of course, expensive x86 servers also have reliability features), or use really cheap hardware without even redundant power supplies, but have enough of them to not care.
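
The fault-tolerance arithmetic behind a figure like "5 servers", assuming a majority-quorum protocol such as Raft or Paxos:

    # A majority-quorum system with n replicas stays consistent and
    # available while any majority (floor(n/2) + 1) is reachable.
    def tolerated_failures(n):
        return (n - 1) // 2

    for n in (3, 5, 7):
        print(f"{n} replicas tolerate {tolerated_failures(n)} failures")
    # 3 replicas tolerate 1 failure; 5 tolerate 2; 7 tolerate 3.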

The modern solutions are better than the mainframe, and the only reason to keep using the mainframe is management risk aversion and unwillingness to learn new things.


A single mainframe can easily be located in multiple datacenters. The Hungarian state-owned electricity distributor has one; one half is at Budapest, the other half -- if memory serves -- is at Miskolc, a bit more than a hundred miles away.


Is that a single address space system, or merely two systems with DB2 databases and disk volumes on a SAN in replication? I think it's the latter.

100 miles will add 5ms (round trip) to your disk flush on commit. So a system like this has the sequential and random IO latencies of a RAID of SSDs but the flush (database commit) times of a 15K RPM spinning rust disk. People lived with mechanical disks, it's ok.

Sync disk replication (in one direction) over a fiber line is not an exclusive feature. Having both sides be active, instead of active and hot standby requires some smarts from the software, but modern distributed databases do that, and if you're careful you can get far with batch sync jobs.
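
For reference, the raw propagation numbers behind that latency claim (light in fibre covers roughly 200 km per millisecond; treat this as a lower bound, since real links add switching delay and a commit protocol may need more than one round trip):

    # Lower bound on synchronous-commit latency over ~100 miles of fibre.
    distance_km = 160                     # ~100 miles
    km_per_ms = 200                       # light in glass is ~2/3 of c
    one_way_ms = distance_km / km_per_ms  # 0.8 ms
    print(2 * one_way_ms)                 # 1.6 ms round trip
    # A few round trips per commit lands in the same ballpark as 5 ms.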


It might be just storage as it was brought up as an example of how subsystems of mainframes are essentially their own world and a single disk or distributed volumes over a long distance are presented the same to the rest of the system. In the Linux world DRBD does something similar (just much simpler). The point, however, is that the software knows nothing about this being distributed.


Yes, if you have a huge volume of transactions and a huge number of constraints that cannot be partitioned. But real-world constraints can almost always be partitioned.

For read-only batch computations you can always add some extra redundancy and partition the problem. So, I don't think it is likely that a mainframe would be useful here.


The newest trend in banks is to migrate to AWS. See the recent Capital One hack. They view it as a combination of commodity hardware with cheap software.


Yes and no. I work for a bulge bracket bank and we're moving a significant number of applications to the cloud, including AWS, GCP, and our internal cloud offering.

But we have literally thousands of internally developed applications. We can move thousands of apps to the cloud and still have a need to keep thousands on virtual/physical machines. My own apps are stuck on commodity physical hardware for at least the next few years.

The type of applications that have historically been run on mainframes is not really moving to AWS/cloud. Most of what's going to the cloud is what I would consider to be "supporting" applications, not core applications.

My own experience, that of others may differ.


What you're saying is we've been 737-maxing it for the last few decades?


Banks work because of processes performed by humans that were developed before digital computers were invented, not because of technology. They have logs of all transactions, are willing to reconcile them manually, and are insured/lawyered against loss when they can't reverse an erroneous transfer. Money can take two business days to move instead of milliseconds, and their customers are mostly fine with this.

Programmers are taught that database transactions exist so that when you move money from one account to another and crash in the middle, no money is ever created or destroyed. Well, cat picture websites might do that, but banks don't. They reconcile logs at end of day.
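
End-of-day reconciliation is conceptually simple; here is a toy version, assuming each side can dump its ledger as (transaction id, amount) pairs (the format is hypothetical):

    # Compare two transaction logs and surface anything present on only
    # one side, or recorded with a different amount ("breaks").
    def reconcile(ledger_a, ledger_b):
        a, b = dict(ledger_a), dict(ledger_b)
        return [(t, a.get(t), b.get(t))
                for t in sorted(a.keys() | b.keys())
                if a.get(t) != b.get(t)]

    print(reconcile([("t1", 100), ("t2", 50)],
                    [("t1", 100), ("t3", 75)]))
    # [('t2', 50, None), ('t3', None, 75)] -- each break goes to a human.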


I was commenting on this:

> you either spend money on software that deals with the hardware being unreliable...or you spend money on hardware that promises to be highly reliable and save on software.


Then I don't understand. Which strategy is like the 737-max? Banks generally don't build systems that crash and lose your money.


Mainframe hardware isn't reliable though. It may be more reliable, but no hardware is reliable. What you call reliable is implemented by an architecture that lets the system route around hardware damage. The difference is that a mainframe is a mostly closed box where you preselect what hardware components are wired together, while a cloud-computing-type system is an open system that you can dynamically rearchitect.



