This is a great missing manual for z/OS tinkering (in the past I have also given up at the login prompt as the author said). Can someone sort of "steel man" this z/OS workflow? e.g. correct noob problems and explain why creating a file takes a page of options? As it is, if this is actually the way to get started, I'm seeing no compelling reason for a developer to bother to learn or use z/OS.
And I know you could say "well Linux is just as complicated." And maybe that's true, but I can also freely download it and learn it, whereas IBM seems to be making no effort to bring z/OS to the public, so why bother chasing it?
> explain why creating a file takes a page of options?
z/OS dates to an era of bewildering diversity in storage hardware - from punch cards to mag tape to disks with variable-length sectors and built-in hardware search mechanisms. All sorts of weirdness by modern standards. Also common at the time were record-oriented files, which are accessed not byte by byte but in a more structured manner; for example, records might be 80 characters each, corresponding to a punch-card image. Record-oriented files naturally lend themselves to primitive but fast databases (evolved out of records on punch cards), and that's why mainframe OSes supported them. Similarly, file systems in those days were considered significant overhead, so there's provision for raw access by sector so apps can roll their own file systems if necessary. z/OS still handles all these odd cases. And all of it has to be specified in a command language that could be parsed in something like 24 KB, or however much RAM it was that OS/360 was originally designed to run in...
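To make that concrete, here's roughly what allocating a classic fixed-record data set looks like in JCL. This is only a sketch: the data set name, account field and space numbers are made up, and UNIT=SYSDA is a common but installation-dependent device name.

//ALLOCJOB JOB (ACCT),'ALLOCATE SAMPLE',CLASS=A,MSGCLASS=X
//* IEFBR14 is a do-nothing program; the DD statement below does the
//* real work of allocating the data set when the step starts.
//STEP1   EXEC PGM=IEFBR14
//NEWDS   DD DSN=MYUSER.SAMPLE.DATA,
//           DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,
//           SPACE=(TRK,(5,2),RLSE),
//           DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920)
//* RECFM=FB,LRECL=80 means fixed 80-byte records - punch-card images.

RECFM, LRECL and BLKSIZE are exactly the record-orientation knobs described above; on a byte-stream filesystem there is simply nothing equivalent to say.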
As the article says "file" is technically not the z/OS term - they're data sets. That might give a hint as to just how divergent z/OS is from just about everything else.
> well Linux is just as complicated.
Linux has less baggage, probably. Also the "worse is better" philosophy of UNIX kept things like quasi-database services out of the OS, for the most part. For comparative purposes check out VMS; it has some of these same features like record-oriented files in a (relatively more) modern design.
> explain why creating a file takes a page of options?
You're really creating a member of a data set. What you get is far more flexible than a file but equally more complicated. The options come in handy when you're sequencing job steps together, as they let you specify sharing, cataloging, and automatic deletion under one or more specific types of job results.
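As a hedged sketch of what that looks like (the program and data set names are invented), the three-part DISP parameter says what state the data set starts in, what to do with it if the step ends normally, and what to do if it fails:

//SEQJOB  JOB (ACCT),'TWO STEPS',CLASS=A,MSGCLASS=X
//* STEP1 creates the data set: NEW = allocate it, CATLG = catalog it
//* if the step ends normally, DELETE = scratch it if the step fails.
//* MYPROG1 and MYPROG2 are placeholders for real programs.
//STEP1   EXEC PGM=MYPROG1
//EXTRACT DD DSN=MYUSER.WORK.EXTRACT,
//           DISP=(NEW,CATLG,DELETE),
//           UNIT=SYSDA,SPACE=(TRK,(10,5)),
//           DCB=(RECFM=FB,LRECL=80)
//* STEP2 runs only if every earlier step gave return code 0, takes
//* the data set exclusively (OLD), deletes it on success, and keeps
//* it around for debugging if the step fails.
//STEP2   EXEC PGM=MYPROG2,COND=(0,NE)
//INPUT   DD DSN=MYUSER.WORK.EXTRACT,
//           DISP=(OLD,DELETE,KEEP)

That's a fair chunk of the "page of options" right there, but it does let a chain of steps clean up after itself without any separate scripting.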
> whereas IBM seems to be making no effort to bring z/OS to the public, so why bother chasing it?
It's a valuable and interesting part of computing history, and understanding its workflows gives good insight into why many organizations continue to use mainframes. They fill a unique set of use cases, and exploring that is interesting and enlightening.
The core concepts haven't changed in 70 years. You can even play with them today, since a predecessor OS, MVS, was released publicly and has been maintained as an open-source project in recent years. If you get the Hercules emulator, you can run a fully legal mainframe OS distribution right out of the box.
Speaking as someone who had to do a lot of work with these machines in college (and one job after grad school until I found something else), I think the author’s main problem is that they are insistent on using ISPF. Nobody I knew in either place would touch the thing, it was universally hated. We all did our editing locally, used FTP to transfer files to the mainframe, and created datasets and compiled our code by submitting pre-written JCL that older and wiser sages had composed for us.
I wonder what happens if you set the "Expiration Date" on a data set... Does it delete itself? Require special flags to access? Throw an error every time the data is accessed? Prevent writing? Can you imagine someone putting in an expiration date a decade in the future and it goes unnoticed until it hits? And presumably, like the other options, it can't be modified after you create the data set.
What a truly frightening option to have on a server file system.
It is rarely used nowadays in my experience, and I have never used a flat file that is over a decade old. Storage used to be expensive, so this would ensure that old data got purged. Most flat files are useless after a month or less. Even if a mistake did happen, most MVS shops have pretty good backups.
> I wonder what happens if you set the "Expiration Date" on a data set... Does it delete itself?
On a vanilla z/OS install, setting "Expiration Date" on a disk (DASD) data set does... nothing. It just acts as documentation. It's an optional field which, for disk data sets, usually doesn't get set – although a minority of sites configure it to be mandatory.
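For reference, the field usually gets filled in either on the ISPF allocation panel or via JCL. A hedged sketch with made-up names: EXPDT takes a year/day-of-year date, while RETPD gives a retention period in days that the system converts into a date.

//EXPJOB  JOB (ACCT),'EXPIRY DEMO',CLASS=A,MSGCLASS=X
//STEP1   EXEC PGM=IEFBR14
//* Explicit expiry date: day 001 of 2030, i.e. 1 January 2030.
//KEEP1   DD DSN=MYUSER.MONTHLY.REPORT,
//           DISP=(NEW,CATLG),UNIT=SYSDA,
//           SPACE=(TRK,(5,5)),EXPDT=2030/001
//* Retention period instead: expire 30 days after creation.
//KEEP2   DD DSN=MYUSER.DAILY.LOG,
//           DISP=(NEW,CATLG),UNIT=SYSDA,
//           SPACE=(TRK,(5,5)),RETPD=30

On plain disk, as noted, that date mostly just sits there as metadata unless something like DFSMShsm (next paragraph) is configured to act on it.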
Now, if you enable DFSMShsm – an extra-cost add-on which provides hierarchical storage management (silently moving rarely used files from disk to tape or cloud storage, and then moving them back on demand when an application attempts to read them) – by default it still doesn't do anything differently, but if you then set "SETSYS EXPIREDDATASETS(SCRATCH)", it will delete ("scratch") expired disk datasets automatically in the background.
Also, there is a distinction between SMS-managed and non-SMS-managed datasets. SMS (aka DFSMS, or to be more specific DFSMSdfp, originally DFP) started life as an add-on product to simplify some of the complexities of mainframe storage management, and eventually went from being an add-on to being included in the base OS. When you create a dataset, you can choose whether to put it under SMS management or not; if it is under SMS management, you have less control over it, because some decisions are based on storage policies configured by the storage administrator. In particular, SMS has a config setting OVRD_EXPDT which controls whether you are allowed to delete a dataset with an expiry date prior to its expiry date being reached. If you have OVRD_EXPDT(NO), then if an SMS-managed dataset has a future expiry date, your request to delete it will be ignored.
Now, it is much more common to set the expiry date on tape datasets. For datasets on tape, there is both a volume expiry date set in the volume header, and dataset/file expiry dates set in the dataset/file headers. (People say that the correct mainframe term is "dataset" not "file"–that's true on disk, ignoring the Unix subsystem, but on tape, both "dataset" and "file" are actually correct.) There is another extra-cost add-on called DFSMSrmm. And if you use that, that will automatically set the tape volume expiry date based on the dataset expiry date (for multi-dataset tapes, DFSMSrmm can be configured to use either the latest expiry date of all the datasets/files, or else in "FIRSTFILE" mode, it just uses the expiry date of the first dataset/file on the tape.) As well as multi-dataset tape volumes, you can also have multi-volume tape datasets (one dataset spans multiple tape volumes), and DFSMSrmm has some more config options to control how volume expiry dates are handled in that scenario. And then there are third party packages you can use instead of DFSMSrmm, such as Broadcom CA-1 or Broadcom TLMS.
Now, once your tape data set expiry date has been translated into a tape volume expiry date, that will go into the tape catalogue used by your tape library. And when the tape volume reaches its expiration date, the tape library management software may (again, depending on its configuration) mark it as a free tape, and then it may be overwritten by new data at any time. Or, maybe you are using a virtual tape library, where the tapes are actually files on a disk array, but the virtual tape library pretends to the mainframe to be an actual physical tape library – in which case the virtual tape library software may (again, configuration dependent) delete the file backing your tape volume once its expiry date is reached.
> Can you imagine someone putting in an expiration date a decade in the future and it goes unnoticed until it hits?
Since the expiry date is optional, it is commonly not actually set, except on temporary files, backup files, archival files, log files, etc. You can configure it to be mandatory or auto-populated, but you'd generally only do that for files/datasets whose names match certain patterns.
At most sites, for source code files, application binaries, configuration files, operating system files, etc., the expiry date is left blank.
> And presumably, like the other options, it can't be modified after you create the data set.
Not true, it can be. Although there are configuration options which control whether you are allowed to do that or not. A common configuration is that you can increase the expiration date of one of your own datasets, but only a privileged user can reduce it (bringing the expiry date forward).
> What a truly frightening option to have on a server file system.
HSM systems exist on Linux/Unix/Windows too, and most HSM systems have a file expiration date feature. And you'll find similar features in tape management systems, record/content/document management systems, email archiving systems, e-discovery systems, etc., all of which exist on those platforms too. The difference is that on mainframes and minicomputers (not just IBM), "expiry date" has commonly been a standard filesystem attribute – most commonly the system by default doesn't do anything with it, and you need add-on software to actually delete the expired files (if you really want to), but it is part of the core OS filesystem metadata. By contrast, on Linux/Unix/Windows, it isn't part of the core OS filesystem metadata, so even if you are using an HSM system and your file has an expiry date set, you'll need to use some proprietary API or non-standard extended attribute to get at it. That's the only real difference.
This was fascinating. Thank you for taking the time to write it.
That said:
> DFSMSrmm ... DFSMShsm ... DFSMSdfp
If rmm, hsm, and dfp mean something, IBM's documentation didn't explain the names to me. They are some of the most extraordinary acronyms, _mixed-case_ no less, that I've ever seen.
I can imagine cases where it would be really nice to have an expiration date on a filesystem. The behavior I would want to see for expired files would be: show up on a report, and maybe display in a different color in directory listings. It could be a useful reminder to update an expiring certificate, an expiring customer contract, etc.
There are now multiple options for GUI editors. When I was using the Mainframe from 2016 through 2020, I was using multiple Eclipse-based IDEs that provided the ability to edit datasets on the Mainframe, submit and view jobs, etc. They couldn't do everything you could do in the ISPF interface, but they made the Mainframe much easier to use.
> I'm seeing no compelling reason for a developer to bother to learn or use z/OS.
I agree.
> And I know you could say "well Linux is just as complicated."
I've written COBOL on z/OS in the past (the nineties). There's still COBOL in use today. But there's a reason none of Google, Amazon, NVidia, Tesla, Meta, Netflix, etc. were built on mainframes, z/OS, COBOL, JCL, etc.
Yet billions (tens of billions?) of devices are running Linux today. So saying "well Linux is just as complicated" would actually be quite stupid.
Something could be said, too, about the virtualization/containerization of all the things, and about how many VMs, containers and hypervisors are running on Linux.
So, complicated or not, it actually makes sense to learn how to use Linux.
> But there's a reason none of Google, Amazon, NVidia, Tesla, Meta, Netflix, etc. were built on mainframes zOS / COBOL / JCL / etc.
The main reason is that Linux is free of charge, and that Unix happened to be more used in academia. It has little to do with underlying technology.
> So saying "well Linux is just as complicated" would be actually quite stupid.
It is just as complicated, if you are looking at feature parity. Maybe there is less historical baggage but that comes with complications as well (think of the grumbles about systemd).
> The main reason is that Linux is free of charge, and that Unix happened to be more used in academia. It has little to do with underlying technology.
That's not true, or at least there certainly isn't a consensus about it. One of the narratives that is associated with the rise of Google is the use of commodity hardware and high levels of redundancy. Perhaps this attitude originated from some cultural background like linux in academia, but their rejection of mainframes and the reasoning surrounding it are extremely well documented[1]: "To handle this workload, Google's architecture features clusters of more than 15,000 commodity-class PCs with fault tolerant software. This architecture achieves superior performance at a fraction of the cost of a system built from fewer, but more expensive, high-end servers."
Name one business started this century that uses mainframes. If there were any compelling reasons to use them in the modern times, there would certainly be some companies using it. Mainframes are legacy cruft used by businesses that had them decades ago and are too cheap or entrenched to modernize their systems.
"Legacy cruft" can be code that's been providing business value for 50 or more years. The mainframe may be expensive, and IBM may love to nickel-and-dime you for every available feature, but it might still make business sense to keep using it. What's the point in rewriting all your code to move it off the mainframe if that will cost twenty times as much as maintaining the existing code while vastly increasing risk? While you may achieve cost savings by moving off the mainframe, they might take so long to accrue that it doesn't make business sense.
If there are any, you can be sure that they're not using z/OS. More likely would be one of those rack-mounted models that only run z/Linux (and possibly z/VM).
z/OS systems are rack-mounted nowadays too; they just take up the full 42U.
At least back in the POWER8 days those z/Linux systems were the fastest you could buy, and IBM was super happy to let you overclock them: their reps told me that was just more CPU sales for them.
My previous company had a large estate of Linux and mainframe applications. Ensuring that disaster recovery was implemented in the Linux applications was a nightmare, with different standards and different ways of doing things; on the mainframe it was built in.
While the mainframe may be old and out of fashion, it did have the capabilities that we are rediscovering in Cloud, containers, VMs and all...
I've wondered what might happen if IBM lowered costs on this hardware... If they offered a compelling deal, it's conceivable a startup might choose them over Linux. As it stands now, I find it near impossible to imagine any organization starting with a clean slate choosing a mainframe for anything. Cost combined with the work required to make the thing do the things is just way too much of an investment.
Many comments tout the uptime and reliability of the "mainframe", I'd argue we have many counterexamples built on Linux. Building a product with this level of reliability on Linux is expensive but still cheaper than a mainframe and the various support contracts, IMHO.
I started out working with an IBM AS/400 in college and eventually worked for a company that ran their business on one. Eventually market pressure forced that company to move to Windows servers accessed through Citrix from thin clients. In my opinion, this didn't make much material difference: it was still complicated, people on the floor complained about Citrix+Windows just as much as they did the old IBM terminals. Hardware costs and support contracts were less expensive but the cost of the software was much, much more expensive and the organization no longer had the ability to change any substantive functions. Just sayin', moving away from a mainframe isn't necessarily a clear win either.
Bear in mind that IBM has been making computers — and terminology — since the 1950s. They invented hard drives. So it seems wrong to say their terminology has “diverged”…from what?
IBM terminals like the 3270 operate almost like a web browser, with form fields implemented by the terminal. The computer sends a page of text, you use the terminal to edit it, and you submit data back. That’s why you have to move around the screen to type things in the right place, and thus why the editor has “line commands”.
There were no words for this stuff. People just made it up, and it took years for something to emerge as the de facto standard term for a collection of data persistently stored.
There's an interesting parallel with the emacs editor. Operations we now universally call "Cut" and "Paste" are "Kill" and "Yank". The better terms won out and emacs is stuck with Ctrl-K and Ctrl-Y as not-so-intuitive mnemonics for cut and paste.
I actually don't think they won out. Mindshare maybe, but I use kill and yank on macOS all the time. In fact, since they are enabled by default, I use Emacs keybindings all the time and everywhere on my Mac. Even the password entry screen in macOS has Emacs keybindings.
I always thought that Apple saw the superiority of these bindings, and that's why they're included by default.
As much as I love emacs, I think cut and paste is a better metaphor for text editing than kill and yank. Once the key combos are part of muscle memory, the value of mnemonics goes down, but they're very useful for adoption. This is an unfortunate barrier for emacs beginners. Even though the usual CUA keystrokes are available (and even the defaults in some popular starter configurations), the terminology creates some friction when reading the docs.
I agree the semantics are strictly superior though.
Mirrors my experience with z/OS. One thing I distinctly remember was when I tried to write JCL to move some datasets. It took me two days of reading documentation and trying things out. Finally I just gave up. It is not fun when you have no one to help and Google isn't very helpful. If you think Unix tools have no consistency, try JCL and z/OS. Considering how alien JCL is and the magic incantations I invoked with it, I'm convinced that there is Cthulhu in the mainframe machine.
Still, I'm quite proud I managed to write some JCL which saved us potentially days of manual work.
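For anyone who hits the same wall: the incantation for a plain sequential copy usually ends up looking something like this. A hedged sketch only - the data set names are invented, and IEBGENER is just one of several utilities that could do the job.

//COPYJOB  JOB (ACCT),'COPY A DATASET',CLASS=A,MSGCLASS=X
//* IEBGENER copies whatever is on SYSUT1 to SYSUT2.
//STEP1    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=MYUSER.OLD.DATA,DISP=SHR
//SYSUT2   DD DSN=MYUSER.NEW.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE),
//            DCB=*.SYSUT1
//* "Moving" then means a second step that deletes the original,
//* e.g. IEFBR14 with DISP=(OLD,DELETE).

DCB=*.SYSUT1 copies the record attributes from the input data set, which is one of those things nobody tells you and you only find after a day in the manuals.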
Another thing I remember from that time was that passwords were only 8 characters long and case-insensitive. My guess is z/OS is secure only by its obscurity. Though maybe this was just our installation. No idea to this day.
> Another thing I remember from that time was that passwords were only 8 characters long and case-insensitive. My guess is z/OS is secure only by its obscurity. Though maybe this was just our installation. No idea to this day.
Everybody nowadays uses a security add-on product with z/OS - most commonly IBM’s RACF, although some people use Broadcom (formerly CA)’s ACF2 or TopSecret instead. RACF allows a user to have either a “password” or a “pass phrase” or both or neither. For legacy reasons, a “password” indeed can be max 8 characters case-insensitive, but a “pass phrase” can be up to 100 characters and case-sensitive. And it also supports non-password based authentication mechanisms, including client certificates, smart cards, multi-factor auth, passtickets… some of that stuff is relatively new, but it isn’t all new. The bigger problem is you can offer all these more modern security features, but you can’t force customers to adopt them, especially when that adoption isn’t free (putting aside additional licensing costs for some of these features, there is also the person-time to configure it, test it, roll it out, etc)
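As a hedged illustration (the user ID and the phrase are invented), giving a user a case-sensitive pass phrase is a single RACF command; here it's wrapped in batch TSO so it can be submitted as a job:

//RACFJOB  JOB (ACCT),'SET PASS PHRASE',CLASS=A,MSGCLASS=X
//* IKJEFT01 runs TSO commands in batch; the ALTUSER command below
//* assigns the (invented) user JSMITH a pass phrase, which RACF
//* by default requires to be 14-100 characters with a mix of
//* alphabetic and non-alphabetic characters.
//TSOBATCH EXEC PGM=IKJEFT01
//SYSTSPRT DD SYSOUT=*
//SYSTSIN  DD *
  ALTUSER JSMITH PHRASE('correct horse battery staple etc')
/*

The old 8-character case-insensitive PASSWORD can still coexist with it, which is exactly the legacy-versus-modern split described above.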
I was using the Mainframe as an intern. Working in a Mainframe company, it still took me three months to be able to write JCL from scratch without needing help. It was a constant struggle until one day a switch flipped and I just understood the basics. I've never had another programming experience where I've gone from struggling to comfortable so quickly.
Recently saw an AS/400 on Facebook Marketplace. I was very tempted to pick it up, but I am trying to reduce AND knew dang well I would have no idea how to interact with the thing.
I recall in about 2009 working on a project where the client's inventory system ran on an AS/400, and I knew literally nothing about IBM mainframes at the time. All we wanted was for them to publish a simple JSON API we could poll hourly to update the front end we were building. In a conference call with umpteen of their engineers, they basically told us that was impossible. I thought this was ridiculous, but they were insistent.
Their counter proposal was that they would automate emailing us updates and we could parse and process those. I still have no clue how that could be easier for them but it sure would have been more work and infrastructure for us!
Eventually, after literally weeks of back and forth, they finally landed on letting us use FTP to read hundreds of XML files they would generate nightly. They were often mildly malformed or improperly encoded. It was simpler to deal with that than to ask them to fix it. It was not a very fun project, but a major learning experience looking back.
I was a young hotshot who had only recently been promoted to lead developer, and having to interact with these people who clearly thought I was dumb for not knowing anything about IBM mainframes certainly knocked me down a peg. Kind of humbling to see you don't know what you don't know once in a while.
> the clients inventory system ran on an AS/400 and I knew literally nothing about IBM mainframes
I'm going to be "that guy": The AS/400 (nowadays System i) is not a mainframe but the last standing member of the family of midrange (outside of IBM called minicomputer) systems. (NonStop and OpenVMS - and probably something else I don't know about - are still somewhat around, but nowadays they run on more or less microcomputer systems).
It's just IBM i now. I believe the progression (not including the System/38) was AS/400 -> AS/400e -> eServer iSeries -> eServer i5 -> System i5 -> System i -> IBM i.
NonStop has run on x86 systems for about 10 years now, but I don't think they were ever considered 'mid-range'. I think they were usually referred to as mainframes, although the hardware architecture was totally different.
I personally thought the AS/400 was midrange. That's how a co-worker characterized it when he left us to work on OS/400 development (way back in the 1980s).
IBM's own "History of AS/400" web page sort of implies it's midrange:
"For the first time, small businesses, city governments and other medium-size enterprises could set up their own computer networks and connect them to workstations, printers, file servers and even other networks — all running four times faster than what was previously possible."
Maybe System i is more mainframe-y than the original AS/400.
The question got me looking at old AS/400 manuals on bitsavers, and there is some interesting stuff. From the AS/400 handbook:
- Layered machine architecture. This insulates users from hardware characteristics. It enables them to move to new hardware technology at any time, without disrupting their application programs.
- Single-level storage. Main storage and disk storage appear contiguous. An object is saved or restored on the system via a device-independent addressing mechanism.
- Operating System, OS/400, is a single entity, fully integrating all the software components (relational database, communications and networking capabilities, etc.)
I agree with you that AS/400/i has always been called mid-range. I was actually referring to NonStop as usually being called "mainframe" not mid-range, or mini-computer, at least the early pre-RISC models. It's another interesting computer architecture.
For more on the AS/400, there's a book by its lead architect, Frank Soltis, on archive.org that goes into a lot of detail. It is really quite different from anything else.
With 128-bit pointers encoding security and provenance information. And a unified memory space: no files, just addresses, with RAM as a cache for the persistent disks. Might as well be from 2080.
> “There I just saved you a lot of time and aggravation. What IBM calls ‘data sets’ virtually every other OS on the planet calls files and directories.”
I understand the author’s pain, but sometimes the “I’ll just google things based on my existing assumptions” approach fails.
Imagine trying to learn SQL like this. You google for “how to create file in SQL” and later you conclude: “For some crazy reason files are called rows in SQL!” — In other words, the basic assumptions just don’t map, and you’d be better off reading the manual.
Even files aren’t quite as ubiquitous as it seems. Apple iOS didn’t offer a user-visible file system until 2017. The Unix file APIs work in a sandbox, but they’re not recommended.
>> every other OS on the planet calls files and directories
I think it is a useful distinction, the one between a file and a data set. In one case the OS is kept unaware of the logical structure of a file, while the user does not need to know about its physical arrangement; in the other case, things are almost exactly the other way around.
The IBM mainframe branch of evolution is different from UNIX and Windows. It is unfamiliar to most. I know the first time I sat down at a terminal to a UNIX machine I found it hostile, unintuitive and perplexing. vi is not easy to use when you know nothing about it to begin with. Yes, I fly on UNIX after a few decades, but I still remember my first introduction.
z/OS (which is also, but far from only, UNIX¹) came out in 2000 and is far more than this article lets on.
z/OS maintains backward compatibility and still supports CICS, COBOL, IMS, PL/I, IBM Db2, RACF, SNA, IBM MQ, REXX, CLIST, SMP/E, JCL, TSO/E, and ISPF, among others.
z/OS also ships with a 64-bit Java runtime and a C/C++ compiler based on the LLVM open-source Clang infrastructure.
z/OS can communicate directly via TCP/IP, including IPv6,[5] and includes standard HTTP servers (one from Lotus, the other Apache-derived) along with other common services such as SSH, FTP, NFS, and CIFS/SMB.
¹ UNIX (Single UNIX Specification) APIs and applications through UNIX System Services – The Open Group certifies z/OS as a compliant UNIX operating system – with UNIX/Linux-style hierarchical file systems.
From afar it seems easier than ever to learn the mainframe. IBM has free courses on z/OS these days (beyond requiring registration), and there are lots of PDFs hidden away in blog posts and whatnot on their site. Coursera has some courses too.
Reading about z/OS systems programming is one thing. Being able to practice with it on a real system is what would likely be missing. You have to have a safe environment to learn on and it's hard to get access to those.
Notice I said systems programming: a systems programmer is the one who would configure system settings (possibly for the whole LPAR). Is IBM letting people change settings on the LPAR as part of the course?
Somebody has to keep the lights on for stuff that is still out there running on COBOL when the current generation retires. It's already a bit of a problem when trying to port it over, as nobody can read it.
All over the world, new COBOL developers are made every day. Typically they aren't specifically COBOL developers; they also have a background in Java, C# or C++. It's common in institutions that rely on COBOL to have toolchains that generate code for both the mainframe and regular applications or SDKs. In many places it's no longer locked in with greybearded warlocks.