I’ve heard VMS has legendary reliability, but is there any reason to build new stuff on it in 2024? Surely Linux would be a better choice, since it’s free of cost and you can do whatever you want without being locked into a single vendor (VSI)?
If your company built something on it in 1985 and the risk of porting it to anything else is too high, you have to keep building on top of it.
Fortunately there's a lot less of this than there used to be. The last new VMS server SKU launched in 2004, and HP stopped production in 2015. Contrast that with IBM, which launches new hardware every few years and keeps updating the OS: the risk calculation is completely different. There are certainly still laggards on the platform, though, being led along by the barely funded VMS x86 project.
But that thing probably fits on a few floppies and doesn't even know about npm, so there aren't a lot of moving parts and no fear of malicious npm packages, ever. What's _technically_ wrong with that?
OpenVMS and all its predecessors are still useful and, I think, ideal for real-time control applications and clustered high-reliability setups. The OS is structured very differently from pretty much everything else, so you can treat an OpenVMS machine as a completely dedicated embedded system where direct control of the hardware is possible without having to write a complex kernel driver. It's a bit like crossing the low-level control guarantees of DOS with all the modern stuff you want: a network stack, Java, web servers, etc.
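To make "direct control of the hardware" concrete: from plain user-mode C you assign a channel to a device and talk to its driver through the $QIO interface, with no kernel module involved. A minimal sketch from memory (TT: is just your own login terminal, and error handling is pared down):

    #include <descrip.h>
    #include <iodef.h>
    #include <ssdef.h>
    #include <starlet.h>
    #include <stdio.h>

    int main(void)
    {
        $DESCRIPTOR(dev, "TT:");     /* the process's own terminal */
        unsigned short chan;
        struct { unsigned short status, count; unsigned int info; } iosb;
        unsigned char sense[8];      /* class/type/width/characteristics */

        /* Assign a channel: a user-mode handle straight to the driver. */
        int status = sys$assign(&dev, &chan, 0, 0);
        if (!(status & 1)) return status;  /* VMS: odd status = success */

        /* Ask the terminal driver for its current characteristics. */
        status = sys$qiow(0, chan, IO$_SENSEMODE, (void *)&iosb, 0, 0,
                          sense, sizeof sense, 0, 0, 0, 0);
        if ((status & 1) && (iosb.status & 1))
            printf("device class %u, type %u\n", sense[0], sense[1]);

        sys$dassgn(chan);
        return SS$_NORMAL;
    }

The same assign-a-channel pattern works against disks, tapes, network devices, whatever; the driver does the privileged work while your code stays in user mode.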
Despite having all the conveniences of a modern OS, the overall complexity is way lower than a typical Linux distribution. It was all designed by a single firm under tight resource constraints, so the amount of shared common code is way higher, the number of parts is way lower, and it's much more feasible to keep the whole design in your head all at once.
The clustering technology is also state-of-the-art, and has been for decades, even if you treat the cluster members as real-time control machines. Everything you need to build what most people think requires a whole k8s cluster and several Helm charts is already built into the OS, including the stuff most people don't think of until later, or try to reinvent on top of the DB: distributed transactions, propagating the OS security model into the application, industrial-strength audit logs, autodiscovery. It's all there and works out of the box. Pretty much everyone from a k8s or Borg background that I've had to onboard to legacy apps running on OpenVMS clusters is blown away by how simple everything is. There are fine-grained resource quotas and support for bin-packing workloads across the cluster, but they're rarely needed because everything is so efficient: applications all use a common runtime designed around async paradigms instead of six zillion third-party libraries and multiple C runtime libraries per container that can't share memory pages.
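To give a flavor of what "built in" means, here's roughly what taking a cluster-wide exclusive lock looks like in C with the distributed lock manager; the same call arbitrates across every cluster member, not just local processes. A sketch from memory, with a made-up resource name and minimal error handling:

    #include <descrip.h>
    #include <lckdef.h>
    #include <ssdef.h>
    #include <starlet.h>

    /* Lock status block: condition value, lock ID, optional value block. */
    typedef struct {
        unsigned short status;
        unsigned short reserved;
        unsigned int   lkid;
        char           valblk[16];
    } LKSB;

    int main(void)
    {
        $DESCRIPTOR(resnam, "MYAPP_CONFIG_LOCK");  /* hypothetical name */
        LKSB lksb;

        /* Take an exclusive lock; on a cluster this arbitrates across
           every member node, not just this machine. */
        int status = sys$enqw(0, LCK$K_EXMODE, (void *)&lksb, 0, &resnam,
                              0, 0, 0, 0, 0, 0, 0);
        if (!(status & 1) || !(lksb.status & 1)) return status;

        /* ... critical section: only one holder cluster-wide ... */

        return sys$deq(lksb.lkid, 0, 0, 0);
    }

Real code would also use the lock value block to pass a version stamp between holders, which is how a lot of the distributed coordination people reach for etcd to do gets done.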
Because it has fewer moving parts, almost never needs to be patched, and they kind of got everything right the first time, it's much more deterministic than any other non-embedded operating system you can practically use on modern hardware today. It would absolutely be my first choice, today, if I had to build some kind of highly reliable control system cluster that had to stay up beyond my lifetime. To this day I regularly see absolutely ancient installs that simply will not die, because operators just keep swapping out cluster members as they fail. Itaniums and Alpha Galaxies with RPS are extremely reliable by themselves too; their biggest reliability threat is the availability of replacement PSUs and memory. The OS was designed in a pre-Internet era where putting a control host out in the boonies somewhere to run a satellite ground station, a feeder dam for a hydroelectric system, or a military radar had to work, had to stay up, and had to be as reliable as the hardware of the time would allow. There really isn't anything else like it that will give you those kinds of guarantees in a disconnected, IoT-type edge environment yet also run on modern gear.
One of the major downsides is that learning how to work with the OS to do modern things isn't well documented anywhere (e.g. how to use logical names to emulate Linux namespaces/container technology). DECnet in particular is a real sticking point for everyone, even though Linux has supported it forever. I think I might be the only person I know with the enterprise networking, Linux, and OpenVMS background to do all the network plumbing required to integrate DECnet with a modern hybrid cloud or hyperconverged stack. It's not hard, but you do have to work through a checklist to get everything talking. It's getting harder to find Cisco gear that natively supports it too, but using VMs to NFV some VyOS routers or whatever to create brouters works just as well as it does for managing all the usual IP north-south and east-west flows in a big production system.
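Since the logical-name trick never seems to get written down anywhere: the idea is that every path, device, and service name an application touches goes through a logical name, and each process or job tree can carry its own definitions, which is conceptually close to giving a container its own view of /data via a mount namespace. A rough C sketch, with hypothetical names:

    #include <descrip.h>
    #include <lnmdef.h>
    #include <ssdef.h>
    #include <starlet.h>

    int main(void)
    {
        /* Define APP_DATA_DIR in the process-private table so only
           this process tree sees the mapping; both the logical name
           and the target directory here are made up. */
        $DESCRIPTOR(table,   "LNM$PROCESS_TABLE");
        $DESCRIPTOR(logname, "APP_DATA_DIR");
        static const char equiv[] = "DKA0:[MYAPP.SANDBOX]";

        struct {
            unsigned short buflen, itmcod;
            const void    *bufadr;
            unsigned short *retlen;
        } items[] = {
            { sizeof equiv - 1, LNM$_STRING, equiv, 0 },
            { 0, 0, 0, 0 }                    /* item list terminator */
        };

        int status = sys$crelnm(0, &table, &logname, 0, items);
        return (status & 1) ? SS$_NORMAL : status;
    }

Two instances of the same application can then open APP_DATA_DIR: and land in completely different places, with no code changes and no chroot-style machinery.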
Despite its age, DECnet is a big feature of OpenVMS because of how it supports the clustering technology, MOP, and things like LAT. LAT is a standardized remote serial protocol that lets you do stuff like add a serial port somewhere that the OpenVMS host can see on the network, and it shows up as a native comm device in the operating system. You can open it, toggle the control lines, all that stuff, just like a normal serial port connected to the host. This is unbelievably convenient for distributed control systems, because everything was designed by a single firm instead of e.g. trying to integrate Digi's remote serial drivers into your product or trying to reinvent autodiscovery of RFC 2217 devices. There are a lot more serial ports around than you might think: the QXDM port on your phone, BMCs, deaf TTYs, the console port on your RPis, etc. An infinitely expandable host capable of real-time control of hundreds of serial ports all at once isn't something I'd trust to any other operating system.
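To show how transparent that is: once the remote port is mapped to a local LTA device (LTA100: below is made up; the actual name depends on your LATCP setup), application code can't tell it isn't a directly attached serial line. A C sketch:

    #include <descrip.h>
    #include <iodef.h>
    #include <ssdef.h>
    #include <starlet.h>

    int main(void)
    {
        /* LTA100: is a hypothetical LAT device that a remote terminal
           server port was mapped to; nothing below knows it's remote. */
        $DESCRIPTOR(dev, "LTA100:");
        unsigned short chan;
        struct { unsigned short status, count; unsigned int info; } iosb;
        static const char cmd[] = "STATUS\r";   /* example payload */

        int status = sys$assign(&dev, &chan, 0, 0);
        if (!(status & 1)) return status;

        /* Write to the remote serial port exactly like a local one. */
        status = sys$qiow(0, chan, IO$_WRITEVBLK, (void *)&iosb, 0, 0,
                          (void *)cmd, sizeof cmd - 1, 0, 0, 0, 0);

        sys$dassgn(chan);
        return (status & 1) ? iosb.status : status;
    }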
You kind of had to have worked at a shop that built a lot of stuff on top of OpenVMS during the Internet era to pick up that knowledge, because I'm not aware of any books or other reading material that explains how modern things map onto the corresponding OpenVMS concepts. I should probably write a book or something, because it would be a shame if the underlying concepts faded from our collective memory and the open source community ended up reinventing all this stuff in a much less efficient way, spread over a dozen open source projects without a central architectural authority beating everyone up about centralizing common code.
Cutler and his original team really did deliver a complete OS that got everything right the first time, and I can't think of any modern requirements that OpenVMS can't handle from an architectural perspective. I guess the filesystem isn't capable of really big work like a huge z/OS data set or a petabyte-sized ZFS pool, but then you just use NFS with EFS, or a NetApp if you have a Fortune 500 IT budget. It sounds wrong, but OpenVMS I/O still sets speed records compared to everything else because of the first-class async I/O support in the OS and how simple the file I/O layer is. You can explain the entire write()-to-disk-driver sequence of events to someone in five minutes for VMS, but I can't imagine explaining just the Linux dcache in less than an hour. Just walking through all the filter drivers and everything else touching an IRP in a typical Windows machine is like a semester-long course.
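By "first-class async" I mean the system call interface itself is asynchronous: $QIO queues the request and returns immediately, and completion gets delivered to you, e.g. via an AST. A rough C sketch of the pattern (device name hypothetical; raw virtual-block writes like this need the right privileges or mount state, and real applications usually go through RMS):

    #include <descrip.h>
    #include <iodef.h>
    #include <ssdef.h>
    #include <starlet.h>

    typedef struct { unsigned short status, count; unsigned int info; } IOSB;

    static volatile int done = 0;
    static IOSB iosb;

    /* Completion AST: delivered in this process's context when the
       driver finishes, roughly where an io_uring CQE handler or a
       completion callback would sit on Linux. */
    static void write_done(void *ctx)
    {
        done = 1;
        sys$wake(0, 0);            /* wake the hibernating main line */
    }

    int main(void)
    {
        $DESCRIPTOR(dev, "DKA0:"); /* hypothetical disk device */
        unsigned short chan;
        static char block[512] = "payload";

        int status = sys$assign(&dev, &chan, 0, 0);
        if (!(status & 1)) return status;

        /* $QIO queues the write and returns immediately; no thread
           blocks while the driver works. */
        status = sys$qio(0, chan, IO$_WRITEVBLK, (void *)&iosb,
                         write_done, 0,
                         block, sizeof block, 1 /* starting VBN */,
                         0, 0, 0);
        if (!(status & 1)) return status;

        /* ... overlap other work here ... */
        while (!done) sys$hiber();

        sys$dassgn(chan);
        return iosb.status;
    }

Every I/O in the system works this way, which is why the whole stack stays explainable: one queuing primitive, one completion mechanism, end to end.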
> The Atom project as shown in the YouTube video is not a VMS Software Inc. product. As Andrew pointed out using OpenVMS on Atom or similar devices is different from Linux which is obtained freely. VMS Software is not further promoting the Atom project now or into the future. VMS Software is concentrating on OpenVMS in virtual environments, not on hardware devices.