
It seems obvious in retrospect, but persistent memory adds a pretty exciting new advantage for persistent data structures.
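A toy illustration of why the two pair well: a persistent (immutable) data structure never mutates old versions in place, so every prior version stays valid forever — exactly the property you'd want for structures living in non-volatile memory. A minimal sketch in Python, with ordinary heap objects standing in for pmem:

```python
# Minimal persistent (immutable) singly linked list: each "update"
# returns a new version that shares structure with the old one,
# and no old node is ever modified in place.
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    value: Any
    rest: Optional["Node"]


def cons(value: Any, lst: Optional[Node]) -> Node:
    """Prepend a value, producing a new list version."""
    return Node(value, lst)


def to_list(lst: Optional[Node]) -> list:
    out = []
    while lst is not None:
        out.append(lst.value)
        lst = lst.rest
    return out


v1 = cons(2, cons(1, None))   # version 1: [2, 1]
v2 = cons(3, v1)              # version 2: [3, 2, 1], shares v1's nodes

assert to_list(v1) == [2, 1]  # the old version is still intact
assert to_list(v2) == [3, 2, 1]
assert v2.rest is v1          # structural sharing: no copying
```

On persistent memory the same discipline (never overwrite, only append new versions) also sidesteps the torn-write problem, since old versions remain consistent even if power is lost mid-update.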

Another thought: As potentially paradigm changing technology like this becomes available will it ever make sense to redesign the OS?

I used to think so - once you have persistent primary memory, you can install operating systems and applications directly into primary memory, so there's no longer any point in having a distinction between "install" and "run/boot".

However, iOS and Android have shown that it's possible to do away with this distinction even with a traditional OS running underneath. So I now tend to think that instead what will happen is more continual evolutionary changes at the OS level to work better in a "boot once" environment, rather than a revolution.

> so there's no longer any point in having a distinction between "install" and "run/boot".

When Linux boots, the in-memory state changes quite a bit. Even the actual code gets modified during boot. The whole process takes well under a second. Linux does support "execute in place" (XIP), but it's barely a win, and I don't think it works on x86.

A more interesting idea is to put your OS installation on a DAX (direct access) filesystem.
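For readers unfamiliar with DAX: a file on a DAX filesystem can be mmap'd and then accessed with plain loads and stores, bypassing the page cache entirely. A sketch of the API shape in Python, using an ordinary temp file as a stand-in (on a real pmem-backed DAX mount, the same code would hit the media directly):

```python
# Sketch: memory-map a file and mutate it with plain byte stores.
# On a DAX filesystem backed by persistent memory, an equivalent mmap
# gives direct load/store access to the media (no page cache); here an
# ordinary temp file just demonstrates the programming model.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)               # size the backing file

with mmap.mmap(fd, 4096) as m:       # MAP_SHARED by default on Unix
    m[0:5] = b"hello"                # ordinary byte stores, no write()
    m.flush()                        # msync: request durability

with open(path, "rb") as f:
    assert f.read(5) == b"hello"     # the stores reached the file

os.close(fd)
os.unlink(path)
```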

Seriously, yes. If there was ever a time to rethink OS design, surely this is it.

That being said, operating systems like Linux tend to capture most of the value from these kinds of advances - often by dint of being able to simply 'get out of the way' if a sufficiently important user-space process wants access to the device.

But one would suspect that things have changed sufficiently from the 1970s to warrant a ground-up rethink. Core counts, distributed systems (the Plan 9 folks already took a swing at this in the 90s), nearly ubiquitous graphics/GPGPU accelerators, persistent memory, nearly ubiquitous access to 64-bit address spaces (at least for desktops and most phones) - you'd think something would change about design. I don't work in the area so I don't know what that is...

> Seriously, yes. If there was ever a time to rethink OS design, surely this is it.


Traditional servers are persistent: they never turn off. 500+ days of uptime is typical. And today, with VMs which at worst... hibernate... it seems like "never turning off" might be the norm.

On the contrary, as a security professional I’d be thrilled if servers had a lifespan of hours instead of weeks or months. Reimaging VMs/containers/machines from scratch frequently gives so many advantages.

When OS, system, or library updates happen, you can easily launch replacement servers on the updated stack, put them in the rotation, and decommission the old ones. This is so much simpler than trying to run OS upgrades in-place across an entire fleet. The longer a machine has been running between reboots, the lower my belief in its odds of upgrading and restarting cleanly.

Further, this regularly tests your load balancing setup and pretty much fundamentally gives you capacity to scale up and down as load permits. Problems will be discovered early on, instead of during crunch time when you have to scale or when a few of your machines go offline during peak hours.

Security-wise, you don't just get the benefit of fast, regular updates; you also get assurances that users haven't left stale data like unencrypted database exports, PII dumps, etc. lying around. Go onto a long-lived machine some day and check out users' home directories. That stuff is a gold mine if someone who wants to do harm gets on your systems.

Not to mention regular reimaging makes it harder for an attacker to establish a permanent foothold in your infra.

None of this has anything to do with fast persistent storage, but I sincerely hope the era of 500-day uptimes is waning.

You forgot: frequent rebuilds kill off any intrusion as the world reflashes -- unless attackers get into IoT or microcontroller packages.

I did mention that it makes it harder for an attacker to keep a foothold in your infrastructure, but I think I wasn't as clear as I wanted to be.

Yeah, it's bad that an attacker was able to reach a critical system at all, but it's a phenomenal defense if any of their beacons or remote-access tools last at most a few hours or days before being wiped. That makes an attacker's life much harder.

> None of this has anything to do with fast persistent storage, but I sincerely hope the era of 500-day uptimes is waning.

On the contrary. Persistent memory means that infinite uptime is the future. Which, as you note, is difficult. Resetting the OS every now and then to a known state is a good practice, although disruptive to a lot of workflows.

If anything, I consider your post to be an argument AGAINST persistent memory.

Persistent memory might enable those sorts of uptimes, but it doesn't inherently mandate it.

But traditional operating systems still assume RAM contents are volatile (because currently they are), most filesystems assume disks are glacially slow, etc.

A traditional spinning-rust HDD has an effective latency of ~10 ms. The NVMe version of 3DXP has an effective latency of ~50 us, or two orders of magnitude better. Not sure how low the DIMM version will go, but maybe another order of magnitude?

If so, we're talking three orders of magnitude difference. That would radically affect the assumptions going into storage algorithms. Suddenly it no longer pays to spend millions of instructions trying to avoid I/O. Batching of I/O isn't needed to the same degree. Complex syncing of memory and disk isn't needed. Etc etc.
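Back-of-the-envelope, using the figures above (the DIMM number is just the comment's guess, not a published spec):

```python
# Rough latency comparison from the figures in the thread.
# The DIMM figure is a hypothetical guess, not a measured number.
import math

hdd_latency = 10e-3   # ~10 ms effective latency, spinning HDD
nvme_3dxp = 50e-6     # ~50 us, NVMe 3D XPoint
dimm_guess = 5e-6     # speculative DIMM figure, one order lower still

# HDD vs NVMe 3DXP: 200x, i.e. roughly two orders of magnitude
assert round(math.log10(hdd_latency / nvme_3dxp)) == 2

# HDD vs hypothetical DIMM: 2000x, roughly three orders of magnitude
assert round(math.log10(hdd_latency / dimm_guess)) == 3
```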

> But traditional operating systems still assume RAM contents is volatile (because currently it is)

RAM is only volatile on startup. Certainly not when a VM hibernates and comes back.

> RAM is only volatile on startup.

No it isn't. Anything that needs to survive a power cycle needs to go to non-volatile storage. And this is assumed to be very, very slow.
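Concretely, today's durability contract looks like this: a plain write lands in the (volatile) page cache, and only an explicit fsync/msync pushes it down to stable media - and that syscall is the part everything assumes is slow. A minimal sketch:

```python
# The volatile/durable boundary on a conventional system: write() puts
# data in the kernel's page cache; fsync() is the explicit request to
# force it down to non-volatile storage so it survives a power cycle.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"must survive power loss")
os.fsync(fd)                     # flush dirty pages to stable media
os.close(fd)

with open(path, "rb") as f:
    assert f.read() == b"must survive power loss"

os.unlink(path)
```

With byte-addressable persistent memory, the equivalent guarantee could in principle come from CPU cache-flush instructions rather than a syscall into a slow storage stack.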

I don't think "computers stay up a long time these days" is an argument against doing OS research on order-of-magnitude-faster, byte-addressable persistent storage.

We seem to be doing pretty well with a bunch of abstractions from the 1970s, as well as with the idea of just building giant trapdoors into our hardware whenever these abstractions fail (e.g. most databases, DPDK in the network space, etc). It's not a crisis. It just seems like a pretty good time to do some basic OS research (aside from all the usual headwinds for that, e.g. massive complexity of underlying hardware, difficulty finding meaningful workloads for a "toy" OS, etc).

Your ideas around uptime are still stuck in the '70s. systemd updates require reboots. Then there are Spectre and Meltdown BIOS updates; gotta reboot for those. Oh, and the SSD and NIC firmware as well.

To think we formerly only had one devil in glibc. Now everything is constantly being updated and it's fine. We've moved on from the uptime as phallic measuring stick mantra. Patch, reboot, and stay secure.

Probably the optimal use of this technology is databases, since they rely most on random access to large persistent data.

Any operating system that's designed around this technology is probably going to look like a database.

Basically, boot to Postgres and all "files" are now SQL tables, stored in NVDIMM. Indexes are in DRAM, and critical nodes are in cache.

All data (system and user) is organized and opinionated: All photos are in a photo database, with tables for IPTC metadata. All music. All executable files. If you're browsing the web, it'll probably cache data in local SQL tables. etc..

I can envision using SQL stored procedures as actual apps, perhaps with an API to access graphics hardware, network, sound, etc..
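A toy sketch of the "boot to a database" idea using Python's built-in sqlite3, standing in for the Postgres-on-NVDIMM picture above (all table and column names here are invented, and an in-memory database stands in for NVDIMM-backed storage):

```python
# Hypothetical sketch: user data lives in opinionated tables instead
# of files. Schema and names are invented for illustration; an
# in-memory SQLite database stands in for NVDIMM-resident storage.
import sqlite3

db = sqlite3.connect(":memory:")

db.execute("""
    CREATE TABLE photos (
        id           INTEGER PRIMARY KEY,
        iptc_caption TEXT,
        pixels       BLOB
    )
""")

db.execute(
    "INSERT INTO photos (iptc_caption, pixels) VALUES (?, ?)",
    ("sunset", b"\x89PNG...fake bytes"),
)

# A "file open" becomes a query against the photo table.
caption, = db.execute(
    "SELECT iptc_caption FROM photos WHERE id = 1"
).fetchone()
assert caption == "sunset"
```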

The entire information world is either a database or a cache (or communication between them), layered on top of each other over and over. Every new storage technology typically ends up as yet another layer, acting as database or cache (or both).

This case is fairly unique in that it can actually remove a layer: RAM (typically a cache) isn't necessary if non-volatile storage is viable at the same speeds. But in general, new storage tech just adds another layer, and the software world reacts by rushing in as if to fill a void, creating new software to take advantage of it, which ends up being... another database or cache, often with tradeoffs similar to the surrounding layers. In the limit, I see more and more layers of cache/database until they merge into some kind of continuous data/cache field with a continuous tradeoff gradient between size and latency.

Hey, that sounds like a mainframe... I wonder what it'll take to get z/OS running on commodity hardware.

Mainframes have always led the way in computer architecture.

And yet z systems have now moved to PCI express, an interconnect designed for desktop PCs

They can afford to, considering how much one costs :)

Persistent data structures? Well yes, but that's just the tip of the iceberg. There are other scenarios, like powerful real-time analytics, that could benefit from 100TB of RAM immediately.

100TB systems at RAM speeds are theoretically possible without this new memory. For example, 64-bit systems could easily provide enough address space.

The problem is that, practically speaking, server systems often limit their address-bit capabilities. And other problems remain, not the least of which is the crazy price of 100TB of DDR4, physical slots, etc. The price would be crazy even for most enterprise projects.

So yes, this new generation of memory will be disruptive, but keep in mind that even though it's faster than SSDs, that's not nearly enough. I'm not positive, but IIRC it's still 2 or 3 orders of magnitude slower than conventional memory.

Does that mean this new wave of persistent RAM isn't useful and awesome? Not at all; I've already started using it.

But it does mean it's still at the stage where you have to analyze your scenarios carefully, see if it's a good fit for your architecture and environment, and benchmark your particular stack to verify assumptions and make sure it helps you the best way it can.

There will, almost inevitably, be someone who needs 101TB of memory. Then you're back to the same place where you need to scale out instead of up. If you asked cloud architects whether they'd rather have cheaper, lower-latency networking or faster, more expensive storage, you'd probably get the former most of the time.

Spark already works nicely with 100+TB datasets, and those can sit in memory across a thousand spot instances. Technology like TidalScale's HyperKernel can also merge multiple systems into a single addressable memory space at the OS level, so that you can run non-distributed applications across multiple commodity machines (like a reverse VM).

If 3D XPoint can offer price and speed competitive with traditional DRAM, then it will have a place in the market. Nobody has seen pricing or benchmarks for these yet. For Intel, however, this could expand their component share from CPU/chipset/network/storage to also include memory. That is pretty compelling, since it's a market they haven't monetized (not counting memory controllers) since the early days of Intel.

I would also think OSes that tend to treat their primary file systems as 'the distributed memory', like DragonFly BSD, would benefit significantly.

I am speculating, of course, but the whole HAMMER2 design in DragonFly BSD emphasizes a cross-machine, 'database-like' file system, with built-in transparent state snapshots, state branches, etc. [1].

So with this new type of persistent storage, DragonFly's HAMMER2 could erase the difference between 'persistent state' and in-memory-only state, thereby eliminating the need for reconciliations, application-specific backups, and application-specific distributed architectures.

[1] http://apollo.backplane.com/DFlyMisc/hammer2.txt

> Another thought: As potentially paradigm changing technology like this becomes available will it ever make sense to redesign the OS?

Realistically, its implications are much bigger for applications that depend heavily on persistent storage, like databases. They make tons of assumptions about persisting to block storage, whereas 3DXP could enable them to function entirely "in memory", so all that block-storage-specific optimization they have would now work against them. I'm just generalizing here, though.

Zero serialization. Imagine installing a program and always having it "running". It may be swapped out, but literally everything in it is ready to go when you switch to its window.

We have that right now, do we not? You don't have to quit apps except on reboot.

Except that many apps are so buggy you have to restart them often in practice. NVM won't change that, sadly.

I'd still want a way to kill it.

This is a huge PITA with mobile devices - I have no clue what code is, or isn't, being executed at any given time. Even if I force-kill an app, it has still most likely left some background service running, that will still use data, trigger GPS updates, wake the phone up, etc. What I wanted since the very day I first got my smartphone is to have PC-like control over applications.

In a perfect world of total ubiquity of wireless electricity, not to mention infinite CPU speeds and free and unlimited bandwidth, having everything running all the time in some way might be ok. As it is today, we still need the ability to kill software (and have it stay down), up to and including rebooting everything, to deal with obscure bugs in applications, OS and drivers. Not to mention being able to have some semblance of understanding of the device's state.

Leaking memory would be dangerous.
