The end of Optane is bad news (theregister.com)
270 points by RachelF on Aug 2, 2022 | 249 comments



Optane is magical and was mismanaged by both Micron and Intel. The article doesn't really express what we have lost.

The DIMMs were expensive and didn't have broad compatibility. They should have made them in a regular DDR3/4/5 format at half the cost of main memory, or with 2x to 4x the density for the same price. The density roadmap had them doubling in capacity with each generation: 1TB per DIMM slot.

The Register is right that it was the first really new thing to come along in 50 years. Reliability, performance, flexibility: all could have made huge strides under PMEM/Optane architectures. Intel thought it had a golden goose (it did) and that the top-profit-margin enterprise workloads would flock to it.

Intel has everything it needs to succeed except for internal politics. But that is true of all large corpse these days. FPGA fabric built into the Optane DIMM would have been magic.


The article does however mention the basic issue which led to Optane's failure (at least if I understand it correctly): that all current OSes have a concept of primary and secondary storage, whereas for getting optimal performance out of Optane, you need to have several levels of memory storage (which apparently have to be managed by the OS, as opposed to CPU caches): a program's code can stay in the Optane memory, while the stack and heap have to be in "real" RAM to reduce wear on the Optane memory when the program is executed? So Intel would have had to develop an OS for it, or worked on extending e.g. the Linux kernel in that direction.


FWIW this article, like The Register articles of olde, did bring this point up.

Multics had this concept back 60 years ago; like many useful Multics ideas (cough security) it was jettisoned in Unix. Most interestingly, Organick’s book describes a hierarchy of fast to slow, with the slowest being tape backup :-).

More recently HP’s “The Machine” was designed this way — the article touches on this but if you don’t know what they are talking about that HP paragraph looks like it was accidentally jammed in.

With the increased use of databases and key-value stores, plus the dominance of mobile platforms, this single-store model will probably be back.


> like The Register articles of olde

Coo, thanks!

That is what I'm going for, yes. :-) Or at least, when I have the time, it is, anyway...

(If it's not obvious, I wrote the linked article.)


El reg has had two truly great writers in its history; perhaps you'll be the third!


Oh my word. I am honoured.

JOOI: who are you thinking of? I rate Orlowski, even though I disagree with Andrew about a lot of stuff IRL.


> or worked on extending e.g. the Linux kernel in that direction.

IIRC Intel was the biggest code contributor to the kernel in 2021. So they already have the expertise.

Running code from Optane would likely be a mistake, though. Jumps would be very expensive due to the extra latency (unless they hit the cache). Perhaps it can work with page swaps but still.


There's already plenty of code to handle NUMA; it shouldn't have been too hard to add proper support for that.
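To make that concrete: Linux already exposes node-aware placement through libnuma, which is the kind of machinery a DRAM/Optane split could have built on. A minimal sketch in C (the node numbers, and the idea of Optane appearing as its own memory-only NUMA node, are assumptions about a particular setup; link with -lnuma):

    /* Allocate one buffer pinned to node 0 (imagine DRAM) and one to node 1
     * (imagine a pmem-backed, memory-only node). */
    #include <numa.h>
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this system\n");
            return 1;
        }

        char *hot  = numa_alloc_onnode(1 << 20, 0);  /* "fast" tier */
        char *cold = numa_alloc_onnode(1 << 20, 1);  /* "slow" tier */
        if (!hot || !cold) {
            fprintf(stderr, "allocation failed\n");
            return 1;
        }

        /* ... place write-heavy data in `hot`, read-mostly data in `cold` ... */

        numa_free(hot, 1 << 20);
        numa_free(cold, 1 << 20);
        return 0;
    }

As the reply below points out, though, this only captures the latency side of placement; nothing in a NUMA policy knows about write endurance.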


It adds another layer - while NUMA is about latencies, Optane memory is about latency AND how frequently the information is changed (to minimize wear on the device). There are some obvious signals (disk caches would be one), but richer semantics would help a lot to make it perform.


I am a bit ignorant about OS stuff, but isn't that managed by the filesystem? I think some people were using them as LOG drives in ZFS.


Be wary of mixing up Optane drives (what you seem to be referring to) and Optane memory (what the parent commenter means).


That's described in the article: they could be used in Linux, but only as a volume. Apparently (if I read the article correctly) that wasn't optimal for exploiting the higher performance.


As I understand it Linux's VFS and Block layer aren't really designed to exploit the type of throughput and latency that Optane provided.

What I'm curious about is what happened to all the PMem (Persistent Memory) and NVRAM (Non-volatile RAM) work that went into recent Linux kernels and libraries.


I thought that with DAX, it was possible to have a real file system, but access to the actual bits goes straight to the backing storage, not the page cache (as it would with traditional block devices). So the block layer and VFS inefficiencies would not necessarily matter, at least if you can use DAX.
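For anyone curious what that looks like in practice, here is a minimal sketch in C (it assumes a kernel and glibc recent enough for MAP_SYNC, a filesystem mounted with -o dax, and a made-up path under /mnt/pmem):

    /* Map a file on a DAX-mounted filesystem so loads and stores go straight
     * to the backing persistent memory, bypassing the page cache. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/mnt/pmem/data.bin", O_CREAT | O_RDWR, 0644); /* hypothetical path */
        if (fd < 0) { perror("open"); return 1; }

        size_t len = 4096;
        if (ftruncate(fd, len) != 0) { perror("ftruncate"); return 1; }

        /* MAP_SYNC: stores become durable once flushed from the CPU caches, no
         * msync() needed. MAP_SHARED_VALIDATE makes the kernel refuse the
         * mapping if the filesystem can't honour MAP_SYNC. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        memcpy(p, "hello, pmem", 12);   /* a plain store, no read()/write() calls */

        munmap(p, len);
        close(fd);
        return 0;
    }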


> Apparently (if I read the article correctly) that wasn't optimal for exploiting the higher performance.

That can be compensated by tuning the price point. If the performance isn't there, make it cheaper until it's better than the alternatives.


This is Itanium all over again


Itanium was a bet on compilers being able to extract parallelism from the source code. Maybe JIT optimizations, where code is rewritten on the fly, more or less the same way as a reordering buffer works, would have done the trick, but, in the end, Itanium fell short of its performance promises. I'm not sure compilers alone can do it.


The trouble with Itanium was that memory latency is a limiting factor. A CPU knows what is in the cache but the compiler doesn’t. In fact, a major use of parallelism in CPUs is to hide latency and that involves a flexibility of scheduling greater than bundling groups of instructions together. (Think of 3 instructions hanging because 1 needs to load something as opposed to a system which might be able to barrel on and run a few non-dependent instructions.)


> A CPU knows what is in the cache but the compiler doesn’t.

You could kind of try to sidestep that by inserting preload instructions ahead of the instructions that would use the values but I agree that doing that at runtime would probably be better. Maybe a JIT runtime could help, but, still, that's beyond the control of the compiler.


https://www.researchgate.net/publication/3044999_Beating_in-...

This one was never realized AFAIK, but it addresses a lot of the latency issues of the IPF designs due to their relatively static nature. It still does not take on the massive cost of reservation stations. I think being open will be a major requirement going forward. It is hard to build trust in a black box, and RISC-V is about to become a real alternative. Being modular, open and growing up from smaller systems seems like a winning strategy to me.


I can't wait to have a completely Windows-proof desktop again (even though my last SPARC still has an x86 coprocessor board, and my IBM PPC machine had a Windows NT port available for it).


I did some research on

https://en.wikipedia.org/wiki/Transport_triggered_architectu...

which is an approach to CPU design that puts specialized CPU design within the reach of quite a few people (say, on an FPGA), but its main weakness is that it is not smart at all about fetching. Although it is not hard at all to make that kind of CPU N-wide, you are certainly going to have all N lines wait for a fetch.

Seems to me though that that kind of system could be built with a custom fetcher that would let you work around some of the challenges.


Also a major issue was the big iron shared memory systems that Itanium targeted. They collectively fell out of favor with the trend towards Virtualised x64 commodity clusters and later the cloud. I’d love to see an EPIC strategy for a CPU targeted at smaller systems. Perhaps implementing webassembly based actors in hardware and support for messaging and pmem. Wasmcloud in hardware.


The way I see it, disaggregation will be a thing and we may be back to large shared-memory racks in cloud datacenters. While an individual cloud machine can range from small to very large (effectively one fully dedicated host), a rack-sized server enables much larger (and more profitable) offerings. Scaling up has limits, but it's very comfortable to have the option when needed.


Hardly an Intel specific problem (frowns at Alpha)


"corpse" - vs "corps" - kinda funny if it was intentional; maybe funnier if it wasn't


I mean, corpses do have everything in them to succeed; their cells just can't agree on actually doing that anymore.


Not to be pedantic but most corpses alive today don't have anything resembling cell walls let alone cells.


To be pedantic, most corpses aren't alive today.


To further the pedantry, parent didn't make any claims about most corpses. Just most corpses that are alive today.


I saw the typo, chuckled and left it in.


I'm not sure I really understand why Optane went in the DIMM slots instead of the PCIe slots. I know PCIe storage is usually NVMe, but you could also expose the storage as a large memory mapped region. You wouldn't need explicit support from the cpu/chipset/motherboard, although you probably want resizable BAR support, cause the region would presumably be pretty large, and maybe you want to do something for older systems that can't handle a giant region. From the software side of things, PCIe mapped memory is more or less the same as physical memory; from the hardware side, you'd be taking a different path to get there, so you'd have different bottlenecks, but PCIe5 is here with 2x the bandwidth and there's a limited number of applications that really max out their PCIe lanes.
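For what it's worth, the "big memory-mapped region over PCIe" model is already reachable from userspace on Linux by mmapping a device's BAR through sysfs. A rough sketch (the device address is made up, it needs root, and a real product would want a proper driver rather than this):

    /* Map BAR0 of a (hypothetical) PCIe device and poke it with plain
     * loads/stores. Every access pays the PCIe link latency discussed in the
     * replies, instead of going over the memory bus. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/sys/bus/pci/devices/0000:03:00.0/resource0",
                      O_RDWR | O_SYNC);
        if (fd < 0) { perror("open"); return 1; }

        size_t len = 1UL << 20;   /* map 1 MiB of the BAR */
        void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        volatile uint8_t *bar = map;
        bar[0] = 0x42;            /* store goes out over PCIe  */
        uint8_t v = bar[0];       /* load comes back over PCIe */
        (void)v;

        munmap(map, len);
        close(fd);
        return 0;
    }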


PCIe Optane was a thing and it achieved 10us latency whereas today's fastest SSDs get 40us. IIRC the DIMM version of Optane was <1us, literally an order of magnitude faster!

I'd expect PCIe to contribute ~3us latency and interrupt/context-switch/kernel to contribute ~5us. You could eliminate the latter, but 3us is still kinda slow.


An order of magnitude faster, but still a few cache hierarchies away from where the actual computing is happening.

But the real dealbreaker was that there are also similar indirections in the other direction. It's just so very, very common in modern architectures to not permanently store data on the first drive it hits. That first level of persistence is usually just some queue log where the write takes a short breath, safe from power failure for the first time, before getting passed on to some more distributed persistence story. Optane wouldn't change this general setup at all, because those layers exist for more reasons than just the power failure scenario. Optane might change implementation details of that first layer, but it wouldn't question the layers. Because Optane is only an answer to power failure, not to the data center burning down.


Optane is either the fastest SSD or the slowest RAM. If it is doing the job of RAM it might hurt performance, not help it. It’s not clear adding another level to the cache is really going to help.


For consumer notebooks, Optane could have been a fast swap drive. It's a pity manufacturers didn't get this done. Swap just needs to be fast enough to make application switching feel fast, and Optane could do that.


Far too specific; there are plenty of workloads where that wouldn't make a difference at all. And of those remaining, chances are that a bump to the next main memory price tier would completely change things, more so than Optane swap.

Where I'd expect the technology to really shine is in flash controllers: give that thing a generous serving of Optane cache to play with in its fancy wear leveling rites, power-loss-persistent and sufficiently closely integrated (same board) to allow software to confidently assume "committed" without the flash even touched if the write happens within a tight working set (you might want to set certain file system parameters up for "it's ok to write the same address many times"). You could even include the Optane in the advertised capacity! And the swap space use case would be included without even trying.


They kinda did that with Optane+flash 1TB M.2 consumer drives like the H10 and H20 series. Unfortunately they never merged the controllers so each one only gets 2 PCIe lanes and you need Intel's software (which is only compatible with Intel CPUs because why not) to do software RAID (which has a terrible reputation on Windows for losing data)

It achieved 14us latency (by acking writes before they hit flash) which is extremely good (better than today's fastest SSDs) but the 32G Optane cache doesn't go very far when most enthusiast consumers are gamers that blow the cache every time they download an update. A lotta games are also larger than 32G so if you use more than a few large programs, the cache just isn't large enough. So you end up with a product that's kinda expensive, but still not quite good enough to take the performance crown.


Wow, so close, yet still a total failure. So the data would have to hit PCIe when copying/moving between optane and flash?


Yep! With the older CPUs it was released alongside, I believe it was even worse: The data would go through PCIe into the CPU cache, then into DRAM, and then back down PCIe to flash while wasting a few CPU cycles along the way. It looked terrible on benchmarks (literally half the MB/s compared to cheaper flash SSDs that used all 4 lanes) but wasn't too bad in practice since NVME drives can do many requests in parallel.

The real issue was size. I've got 100M CPU cache (5800x3d), 24G VRAM (3090), 64G DRAM, and 2TB flash. I'd need at least 128G of Optane for it to make sense in my cache hierarchy and improve game load times. I could get the $3000 data center Optane SSDs, but that's kinda hard to justify when it's as expensive as my entire PC.


I mean... with modern memory compression and SSD speeds, is this really necessary? If you've got an NVMe SSD, you've most likely got enough bandwidth to never notice the swap kicking in (especially with a generous swappiness value). Sure there might be a couple workloads that could benefit from that kind of upgrade, but I think most people building memory-optimized rigs would rather just buy more memory, since it goes in the same DIMM slots anyway.


I mentioned consumer notebooks as the main target for such an optimization. Lots of "memory" for apps and real hibernation would be the main benefits.



Sure, but it looks like that was NVMe, not just a big bunch of bytes, so whatever magical future was enabled by having a big bunch of non-volatile bytes wasn't enabled by this. (Unless there was a way to change it into a big-bunch-of-bytes mode?)


My guess is that to get low latency you have to be interacting with a memory controller that expects low latency (like DRAM). I don't know how many cycles of difference it would make, but my impression is that PCIe trades away latency for throughput compared to DRAM.


IIRC permissible lane skew in PCIe is like 30 ns, which means that the latency due to deskewing alone in PCIe already approaches main memory latency.


Probably because the way they did it allowed it to participate in the caching hierarchy with MESI and all the rest of it.


[Article author here]

FWIW I tried to explain why that was the important aspect of the tech in the article.

I also covered it in some depth in my most recent FOSDEM talk, which is linked in the article. As is the script if you don't have time for the video.


I guess it's because DIMM is physically closer to CPU than PCIe, so you get better bandwidth thanks to that (think that the data travels at the speed of light - the closer the faster).


That is just not true, in any shape or form.

It wasn't mismanaged at all. Neither Intel nor Micron could make it cost competitive. Optane had its shining moment when DRAM and NAND (both commodity-like products) skyrocketed to 3x the price, at one point making Samsung the most profitable company, even surpassing Apple. There were roadmaps for power, reliability, performance and density improvement, literally everything except cost. Micron wasn't even the cheerleader; they were only happy to play along with Intel because Intel made the commitment to buy enough capacity for Micron to sustain the business.

And Optane could barely compete when DRAM and NAND prices were at their peak, making zero profit on most of its revenue. Imagine when DRAM and NAND prices are back to normal.

The future of memory is either extremely low power on mobile SoCs or ultra high bandwidth in servers with 128 cores. Neither of these fits into the Optane roadmap.

The market was so small that, despite Intel giving away its Optane product and selling it at zero margin, they still couldn't fill the minimum order commitment to Micron and had to pay the penalty.


[Article author here]

> Neither Intel nor Micron could make it cost competitive.

Cost-competitive with what?

I think there are two misconceptions here.

1. It is a kind of memory, not disk. But it is non-volatile large-scale RAM. There isn't any other terabyte-scale non-volatile memory for it to compete against. So how can it fail to compete on cost when there was nothing for it to compete with?

2. As memory it was extremely competitive with conventional volatile memory, being much larger. ITRO n times (for integer values of n, whatever that value may be) more Gb/$ is being competitive, IMHO.

Against flash storage: sure, more expensive, but orders of magnitude faster and orders of magnitude more write cycles -- that is highly competitive, isn't it?


I am very late to this reply.

At its peak, Optane Memory offered double (or nearly triple) the memory capacity for the same price, at practically zero profit margin. Once DRAM prices fell, this advantage was gone. Remember this was compared to high-capacity ECC DRAM modules, the highest-margin part of the whole DRAM market.

It was (and still is) a wonderful technology, but Optane Memory didn't serve a large enough market where non-volatile memory had an obvious advantage. As memory it offers much lower bandwidth and slower response times than DDR4, which means a substantial amount of server workloads do not fit Optane Memory's performance characteristics.

And again, neither Intel nor Micron was making any profit out of it, with no clear roadmap for cost reduction, compared to conventional DRAM moving to DDR5 and NAND moving to Z-NAND.

The common misconception is that high-write-cycle, random-read/write, non-volatile memory had its place in the market for a price. It turns out this is a classic market misfit.


Um. I have to ask: did you read the article? All of it?

Because what you're offering as a comment is sort of a backwards version of the argument I'm making in the article.

My argument is that it flopped because of the monoculture of C21 OS design that means we lack OSes that can take full advantage of persistent-memory computers.

Yours seems to be that the kit wasn't competitive. I think that's a conflation of multiple category errors.

[1] It is not in the same product category as either DRAM or Flash.

It's not RAM: RAM is volatile, Optane is nonvolatile. But it can be used as primary storage, that is, appearing in the CPU memory map. It is not secondary storage, that is, storage requiring any kind of block-based controller handling.

It's not Flash: Flash is not word-writable and thus cannot be primary storage.

Because of decades of technical debt in C21 IT, people do not know this vital primary/secondary distinction well. As a result they can only think in terms of two different types. Optane isn't just neither type: it's both. It blurs the distinction. That's why it was important tech.

[2] Thus it is a fallacy to compare the price, or performance, or price:performance of a new tech that eliminates the primary/secondary split with either primary or secondary storage.

It is traditional in IT to make comparisons with automobiles.

It's like criticising cars because they are not good bicycles and they are not good aeroplanes.

These are different vehicles with different characteristics for different types of transport. To say that cars are bad bikes, or bad aeroplanes, means that you have failed to understand that cars are not either.

Cars are better for some things than bicycles. They are better for totally different things than aeroplanes.

You are trying to judge cars by the criteria of bicycles and aeroplanes, because you've never seen a car before and you're not used to thinking about cars. You're used to 2 categories and this is not either. It's not in the middle. It's a different category.

That is what the article was about.


Optane was still cheaper than DRAM modules of the same capacity, and twice as large for the same price.

Really, I dream of Optane as a "super-fast swap drive" that "extends" RAM (which is what some manufacturers do with NVMe these days). Then it would be easy to have a notebook with 256G of "RAM" made up of 16G of actual DRAM and 256G of Optane swap. Most users would not notice any difference compared to an actual 256G of DRAM, since Optane is fast enough to make application switching fast. At least my Google Chrome and IntelliJ IDEA could live in peace in such a setup, each eating many gigabytes of "RAM", since I don't code and browse sites in the same microsecond.
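A minimal sketch of that tiering idea using the plain Linux swap interface (the device paths are made up, both partitions would need mkswap first, and this needs root); the kernel fills higher-priority swap space first:

    /* Register two swap devices with different priorities so the kernel
     * prefers the (hypothetical) Optane device over the ordinary SSD swap. */
    #include <stdio.h>
    #include <sys/swap.h>

    int main(void)
    {
        int fast = swapon("/dev/pmem-swap",        /* hypothetical fast tier */
                          SWAP_FLAG_PREFER | (100 << SWAP_FLAG_PRIO_SHIFT));
        int slow = swapon("/dev/nvme0n1p3",        /* hypothetical SSD swap  */
                          SWAP_FLAG_PREFER | (10 << SWAP_FLAG_PRIO_SHIFT));

        if (fast < 0) perror("swapon fast tier");
        if (slow < 0) perror("swapon slow tier");
        return (fast < 0 || slow < 0) ? 1 : 0;
    }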


16GB optanes with an ordinary SSD interface are pretty widespread in China and DIYers do use them for swap/pagefile. The problem is that they take up a whole M.2 or NVMe slot with the corresponding number of PCIe lanes…

These drives generally come from budget laptops where the optane is originally used as the system disk. Not a good idea anyways and computer shops usually swap them out. You pay about 28 CNY for 16 GB, which is 2x the average SSD price-per-gig.


DRAM-interfaced Optane wouldn't consume NVMe lanes. It should be soldered the same way ordinary DRAM is soldered in most non-professional (i.e. home or business) notebooks these days.

For example: the MacBook M1/M2 has a fixed, limited amount of RAM. But with Optane as swap, it could be four to eight times larger without users noticing a difference.


> The DIMMs were expensive and didn't have broad compatibility. They should have made them in a regular DDR3/4/5 format at half the cost of main memory, or with 2x to 4x the density for the same price.

Were the prices intentionally high or was that just because they couldn't compete with the economies of scale of DRAM manufacturers?


The prices were low if you view them as RAM, the densities were crazy. The prices were extremely high if you view them as a small cache SSD in front of your spinning rust, which was how they were marketed and sold.


You're asking for density or cheaper prices, but neither happened. The endurance improved, the IOPS improved, even latency a bit, but density didn't. Or it would've been cheaper than it is. I want to relate this to other forms of storage that went by the wayside, but I hope someone else will try again with a different roadmap and make it work.

FPGAs or SoCs with Optane for instant-on/low-latency applications would've been amazing.


> FPGAs or SoCs with Optane for instant-on/low-latency applications would've been amazing.

Looks like a company called Everspin is working on exactly that:

> "will provide FPGA system designers with extremely fast configuration, instant-on boot capability, and rapid updates of critical application parameters such as weighting tables in AI applications."

https://www.theregister.com/2022/08/02/kioxia_everspin_persi...


I don't think it's "gone forever"; it's just gonna sleep for a while, like "dumb terminals": now we have a hybrid with tons of cloud apps (and most things headed that way), local apps, and mixes of the two where it makes more sense. The same will happen with something like Optane. I'm surprised they didn't push it more in "larger" embedded systems, since it would have benefitted well from that NVRAM goodness, probably more so than PCs and cloud compute.


Is it possible to have DRAM connected to the bus, but without resetting the DRAM upon reboot? If that is possible, then you would have some form of persistence, provided that power never goes off.


Sadly, users usually reboot because some poorly written piece of software (probably written in C/C++ or Java) is out of whack and resetting RAM is the only way to fix it.

Also keeping RAM running all the time would eat electricity and generate heat.

But I like the general idea. It definitely would be a good option.


> users usually reboot because some poorly written piece of software

You can have that in Java too, in any runtime environment really. If your Java processes keep consuming too much memory, swapping starts; on Linux you will have the OOM killer; in the end the thing will reboot. Also operating systems can have bugs, hardware has bugs, nobody is perfect.

> Also keeping RAM running all the time would eat electricity and generate heat.

In a server room you wouldn't turn out the lights, ever.


Honestly, people reboot not because something is poorly written; they reboot because it is too difficult to understand how to solve the problem, so they go with the short-term solution instead.


> out of whack and resetting RAM is the only way to fix it.

This is not how this works. It doesn’t matter what language an application is written in, if a process is terminated, that frees all RAM consumed on any commonly used operating system.


>They should have made them in a regular DDR3/4/5 format

There was/is CXL


"large corpse" !!


> all large corpse

Freudian slip? :)


> How do you find a program to run if there are no directories? How do you save stuff, if there's nowhere to save it to? How do you compile code, when there is no way to #include one file into another because there are no files, and where does the resulting binary go?

The term for this is orthogonal persistence. Lots of people have thought about this before, in Lisp OSes, Squeak (Smalltalk), and the long-dead TUNES project[1]. You don't need special memory to do this, but I imagine it could help.

There actually is a modern mainstream OS that does this already. It's called iOS. No, I'm not joking. It effectively provides the illusion of persisted state, which is good enough for the user to believe it. No more saving files. Apps just do it. Automatically. Android does the same. Both also provide the illusion of always-on apps. You click a button and apps, typically, load so fast you don't notice whether they are loading from RAM or from flash storage. It's incredibly seamless and precisely what people had in mind 20 years ago.

[1] http://tunes.org/cliki/orthogonal_20persistence.html


Developers don't see that though. It only looks like that on iOS to users. Behind the scenes, apps are still written using files. Hell, Apple eventually had to add its own "Files" app because storing things and finding them again was getting complicated.


Orthogonal persistence was a big research area at my university when I was studying CS. The pitch was that it would be like garbage collection for your persistent data: just like GC means you never have to worry about manually deallocating stuff, orthogonal persistence would mean you never have to worry about manually serialising stuff.

I was skeptical because I could never figure out how they proposed to manage code update -- if your data is always live, how do you safely modify the code?

And I still haven't seen a good answer to that! Although I haven't studied this area in a long time, so maybe it has been solved? I'd be interested to read any links people have about this.

I think of those environments like Squeak as a kind of "object soup", where everything is in one place, rather than rigidly separated into source code, compiled binary, data schema, serialised data, live server, etc. I thought back when I first heard of orthogonal persistence, and still think now, that that rigid separation of concerns is essential for maintainability. It's good to have carefully managed source control, and it's good to have a carefully managed data schema.

Have any Squeak-like environments been really successful in large-scale systems or user-facing apps?

I'd say infrastructure as code has been a much more important innovation than orthogonal persistence; and it takes exactly the opposite approach, of explicitly serialising live stuff (your active deployment) into flat files.

Edit to add: maybe another way to put it is, the promise of orthogonal persistence is that you never need to reboot your computer. But that's a bad idea, because as we know, sometimes rebooting is (sadly) the only way to fix a problem.


There is no good upgrade path. Database migrations with existing data are a pain in the ass. Schema upgrades of capnproto are a pain in the ass, etc.

Separating working memory from persistent memory has obvious benefits in terms of mental compartmentalization.

Rebooting is good because it is like a fire drill, if you don't do it regularly, the real thing will happen one day and you won't be prepared.

It is the same with backups, you want to replay them regularly.

Of course that doesn't inherently invalidate the idea, it just means it isn't as useful as was previously thought.


As a user, I don't like the fact that apps now own my data.


That's the typical difference in expectations between users. Some users don't even want to think about what the computer is doing and want to avoid all the complexity, others don't want the computer to do any thinking for them because they might be aware of all the ways things could go wrong, and part of that is transparency and data ownership/control.

I'm definitely in the second group, whenever I see an app just saying "click this button to magically do the thing" I immediately worry about where the files are stored and how I'd back them up and how I could reverse the changes when (not if) something goes wrong.


I remember how Windows 95 was said to be document centric. It's now 2022 and we don't have documents to speak of.


Maybe to your point, I don't even understand what "document" means in this context. Can you explain?


https://invisibleup.com/articles/34/

> What Windows 95 wanted to do was to create a document-centric workflow where the application was out of the user's way and it only provided the means to let the user manipulate files.

> It needed to somehow take this file-centric workflow and bolt it onto application-centric DOS.


Another OS that did something like this, IIRC, was early versions of PalmOS. There was literally no distinction between storage and memory -- everything was stored in battery-backed RAM (except the OS, which was in ROM).


It was the same deal with some Windows Mobile 2003 (and presumably other) devices. Except it was Windows, so of course it had to have a weird interface. There was a slider in the control panel that let you change the allocation of RAM between "memory" and "storage". But the system would auto-adjust this allocation, so the slider's effect was temporary. It was quite strange.

See the first screenshot on this page:

https://www.pocketpcfaq.com/faqs/5.0/memory_management.htm


I remember you could somewhere (regular settings? Or perhaps some hidden registry tweak?) turn off the automatic memory allocation, so I moved as much software and as many documents as possible to the internal flash storage (still relatively tiny) or to an external memory card, and then moved the slider fully over to the left in order to have as much program memory as possible.


That also describes a TRS-80 Model 100.

It still calls files "files", but a more meaningful all-RAM distinction, besides the simple fact that everything is in battery-backed RAM, is that in most cases when a file is used, even for editing and not just reading, it is used in place rather than copied from a storage area to a working area.

This is a 1984 device with 32k of ram and 32k of rom.

Partly this is because of the tiny amount of available ram, none to waste on copies.

Partly this is because the cpu doesn't have a relative-jump instruction, so binaries are not easily relocatable. So binaries get compiled to run at a specific static address, sometimes modified at install-time to stack up next to whatever else you already have installed, and that's just where that binary lives forever after that whether it's currently being executed or not.

So that flat model is actually driven by a couple of severe limitations.

The "files" are really only files for display in the main menu and when exporting like saving to tape or disk. While on-device, the files are really just areas of ram. They're always there at whatever address they're at, whether they're being used right now or not. For instance there are such things as BASIC programs that read data stored in comments to do things the language doesn't normally provide for, because the comments are ultimately nothing but a predictable address in memory like any other.


> This is a 1984 device with 32k of ram and 32k of rom.

The original Palm Pilot wasn't all that much larger. Despite being released 12 years later, it only had 128 KB RAM and 512 KB ROM.

The PalmOS memory model was unusual. Rather than trying to explain it all here, there's a decent summary at: https://www.fuw.edu.pl/~michalj/palmos/Memory.html


So you are saving files, but not showing them to the user? Sounds like progress.


Don't worry, the apps know where the files are when you need them. And you only need them when you are using the app; after all, you will own nothing and you will be... wait, now that I think about it, you do have the files somewhere on your hard drive, right? So they're technically actually yours, right? My mistake, yeah, nothing to worry about here.

And there are some rare exceptions: those apps that upload your stuff to the cloud for security and for easy backups. Such a great technology, because if you ever lose your computer you can just buy another one and the 'files' will be magically there (as long as you keep paying the app subscription)! Hopefully the author also implemented the app in a way that it can be used offline.

Btw you will need the newest version of the computer because your apps updated so that's also great because new features are good! Your files are basically your creation, you can't just leave them behind.


Aaaah, and this is why we need Steve Jobs, or someone with his power, guts, vision. I think this is the kind of technology that Steve Jobs would have pushed for the new iPhone/iPod replacement. Similarly to how he pushed the "miniature mechanical disks" while the majority of MP3 players were struggling with 4/8 GB of flash memory.

I wish that whoever leads the Pixel would adopt this memory... shit, even Raspberry Pi or some other "underdog" could create an Android phone that used this instead of RAM/flash.


[Article author here]

Excellent point, thank you. I was aware of it, but I should have used the correct term, you're right.


There is an OS in current use that could have used Optane directly.

It used to be OS/400 on an IBM machine called AS/400. Now the machine is an "i-series", and I think emulated on a POWER machine.

Under OS/400, throughout the life of the machine, malloc will never return the same address twice. If you have an address (128 bits), you can see the last thing written there, in perpetuity. And, a program started up will run until it dies. Shut down, start up again, and the programs only notice time passed. If you get a new computer, you shut the old one down and start its image on the new hardware, and again the programs don't notice.

Amusingly, much of the OS was coded in pre-1990 C++, basically Java without the GC.


[Article author here]

Yes, and the FOSDEM talk that the article is partly based on mentioned that. I linked it from within this article.


> coded in pre-1990 C++, basically Java without the GC.

How do you get "basically Java without the GC" from "pre-1990 C++"?


From the language definitions. 1995 Java was, roughly, 1990 C++ with GC stuck on. It was not a thing of beauty. Its designers had absolutely no desire to code in it, themselves.


The "we can't figure out how to change Linux to use only primary storage" is real. I realized not too long ago that the biggest thing holding back the advancement of computing is software developers. They can't imagine novel ideas. Try to explain to a developer that dependency hell only exists at all because their own software/compilers/programming languages aren't tracking dependencies within their own application's functions, and they look at you like a dog looking at a human in a dog costume. (Most of them also think YAML is a configuration format and don't know what SDLC stands for, so I guess the dependency thing is way out there)


> I realized not too long ago that the biggest thing holding back the advancement of computing is software developers. They can't imagine novel ideas.

Hey - Software engineers can imagine novel ideas. The problem is that every year, there's more code in the OS, and there's more code in web browsers, and there are more applications which depend on all that stuff. Chrome has 25 million lines. At that size, it's really hard to innovate on how the browser works.

And that matters because some solutions require changing the operating environment. For example, I'd love to replace the filesystem with an append-only log, and replace files with CRDT based datasets. That would solve all sorts of problems. But doing so would require changing linux, and changing most existing software. Every year that passes, that gets harder and harder to do.

I don't know what the answer is here. Lots of people "solve this" by building further and further up the stack. Linux binaries not feature-rich enough for you? We could fix them, but let's use Docker instead. Oh, you want to make a desktop app? You could use the native APIs for that; but who can be bothered learning the native UI APIs? It's much easier to write a web app and ship a copy of Chrome on every platform.

We have to find a better answer than going "up the stack" every few years. But so long as people find it easier to write software than to read software, it's hard to imagine the system getting simpler over time.


> I'd love to replace the filesystem with an append-only log

How would you solve the problem of, "I have a finite amount of storage and my drive is full and I need to delete something to make space for this new thing I want"?


1) There are file systems that are append-only logs (to an extent); a decade ago I played around with NILFS/NILFS2.

2) The way finite storage is handled is that "deleted" content can be garbage collected, and you will then be able to loop around and write to those garbage-collected areas. Not fundamentally different from an SSD, which often has such a log file system at the hardware level.


I'd make it be an append-only log + some sort of kafka style compaction.

So for example you can tell it to compact all the parts of the log involving that file you've deleted. Create + Delete compacts to a no-op.
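A toy sketch of that compaction rule, just to show the idea (nothing here resembles a real log-structured filesystem, which would do this on disk with indexes rather than over an in-memory array):

    /* Keep only the newest record per key; drop a key entirely if its newest
     * record is a delete (tombstone), so create + delete compacts to nothing. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    struct entry {
        char key[16];    /* e.g. a file name           */
        bool deleted;    /* tombstone record           */
        int  value;      /* stand-in for file contents */
    };

    static size_t compact(struct entry *log, size_t n)
    {
        size_t out = 0;
        for (size_t i = 0; i < n; i++) {
            bool superseded = false;
            for (size_t j = i + 1; j < n; j++)
                if (strcmp(log[i].key, log[j].key) == 0) { superseded = true; break; }
            if (!superseded && !log[i].deleted)
                log[out++] = log[i];    /* keep only live, newest records */
        }
        return out;
    }

    int main(void)
    {
        struct entry log[] = {
            { "a.txt", false, 1 },
            { "b.txt", false, 2 },
            { "a.txt", false, 3 },   /* newer write to a.txt                  */
            { "b.txt", true,  0 },   /* b.txt deleted: create+delete -> no-op */
        };
        size_t n = compact(log, sizeof log / sizeof log[0]);
        for (size_t i = 0; i < n; i++)
            printf("%s = %d\n", log[i].key, log[i].value);   /* prints only "a.txt = 3" */
        return 0;
    }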


Some sort of smart FIFO system that combines old entries and then culls them?

Time Machine (Apple's automated backup thing) does something similar, at a simpler level.


Eventually it becomes overconstrained and too top-heavy, and a new, simpler set of abstractions is adopted. It feels like we are probably close to this happening now.

The Structure of Scientific Revolutions by Thomas Kuhn is a view of what I’m talking about but in the world of academic theory instead of technology architecture.


Brad Cox wrote “Planning the software industrial revolution” 30 years ago, but now just might finally be the time. It won't happen all at once, but the complexity of the modern stack is clearly unsustainable. Exciting possibilities!


Could you expand on why you think that solves dependency hell?

It sounds like you're describing tree-shaking, which is already commonplace, particularly in the JS ecosystem. Or LTO dead-code elimination, in compiled languages - neither of which solve dependency hell.


Dependency hell happens when code is subject to an external dependency. Within an application, you can determine which of your functions calls some other version of a function/dependency, can remove code, add code, etc. But you can't do that to other applications; you only control your own application's code and what it depends on/is built against/tested against. So every application is effectively locked into its own view of the world, and it just has to hope that never deviates, and that every system it runs on is set up exactly the same. You can play fast and loose with your code by trying to keep the same API/ABI as your underlying code changes, but subtle differences will eventually cause bugs/regressions when expected behaviors change.

The way to solve the problem of dependency hell is to version every function, and only call functions based on their versions, and ship every old version of every function. Then the application itself, or a dynamic linker, must find and execute the correct version of the function it needs to call. In this way, dependencies can change at any time, but every line of code will only ever call functions that they were originally built and tested against, because those versions are still knocking around somewhere in the dependency. You can upgrade a dependency 50 times but your application will still just keep calling the old version of the dependency's function, and so the behavior of your application remains the same while the dependency receives upgraded functions. You can upgrade any application at any time and it will just pull in the latest dependency, and continue to call the versions of that dependency that they need.

The basic concept for this already exists in glibc as you can ship version-specific functions and link/call a version-specific function. But literally no one uses it. To get widespread adoption and make the paradigm implicit/automatic, we would need to modify the programming language, the compiler, the way programs are executed, require lots more storage, probably new paradigms of caching and layering and shipping code. Package deps would become metadata in a build process, package management would become "scan all executables for dependent versions & download them all".
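For reference, the glibc mechanism being alluded to is GNU symbol versioning. A minimal sketch of a library shipping two versions of the same function (the names and version tags are invented for illustration):

    /* libfoo.c: two implementations of "foo" live in one shared library.
     * Binaries linked against the old library keep binding to LIBFOO_1.0;
     * newly linked binaries get LIBFOO_2.0 (the '@@' marks the default). */
    int foo_v1(void) { return 1; }   /* old behaviour, shipped forever */
    int foo_v2(void) { return 2; }   /* new behaviour                  */

    __asm__(".symver foo_v1, foo@LIBFOO_1.0");
    __asm__(".symver foo_v2, foo@@LIBFOO_2.0");

    /* Version script (libfoo.map), passed via -Wl,--version-script=libfoo.map
     * when building the shared library:
     *   LIBFOO_1.0 { global: foo; local: *; };
     *   LIBFOO_2.0 { global: foo; } LIBFOO_1.0;
     */

Old binaries keep binding to the 1.0 behaviour even after the library is upgraded, which is the per-function pinning described above, just done at the symbol level.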

NixOS is an attempt to work around all of this by pretending that the problem is just with the linker or the file tree or PATH or something. But it's simply impossible to resolve dependency hell entirely without versioned functions, due to rare but intractable intra-package dependency conflicts. And it doesn't address data model versions or network service interface version changes.

Containers are an attempt to work around all of this by literally having a different copy of the world for every containerized application. It works well enough, except again the boundary between containers is subject to application interface versions and data model versions changing. (The data model and code are basically the same thing since you need code to do anything with data) The only way to have dependency-hell-free containerized applications is if you version their interfaces/data models and have them call the correct version for other containerized apps [or network services], so again you're back to versions of functions.


> The way to solve the problem of dependency hell is to version every function, and only call functions based on their versions, and always ship every old version of every function. Then the application itself, or a dynamic linker, must find and execute the correct version of the function it needs to call.

If this is your vision, why would you dynamically link? If you statically link your code to the library functions it calls, the runtime environment can't be changed out from under you (well, not without a lot of work), and you'd presumably only build with the version of the library functions you like, so you'd be set there too. If you want to update a dependency, pull it in and rebuild.

I don't think this is a popular vision, because people want to believe that they can update to the latest OpenSSL and fix bugs without breaking things and sometimes, they can.

You still have a difficult problem when you share data with code you didn't fully control the linking of. If your application code needs to setup an OpenSSL socket, and then pass that socket to a service library, and the service library uses OpenSSL A.B.C-k and you use A.B.C-l, maybe that works, maybe it doesn't; if it doesn't, that's a heck of a problem to debug. Of course, it's even worse if you're not on the same minor version or across major versions.

While I'm picking on OpenSSL, because it's caused me (and others) a lot of grief, this kind of thing comes up with lots of libraries.


> people want to believe that they can update to the latest OpenSSL and fix bugs without breaking things

Yeah. It's a bug in the culture, really, and culture is much harder to change than software.

> problem when you share data with code you didn't fully control the linking of

Yeah, the data model needs to be versioned too. It's impossible to pass data between applications of different versions without the possibility of a bug. The options I'm aware of are A) provide that loose-abstraction-API and hope for the best, or B) provide versioned drivers that transform the data between versions as needed.

A is what we do today. B would be sort of like how you upgrade between patches, where to go from 6.3.1 to 9.0.0, you upgrade from 6.3.1 -> 6.4.0 -> 7.0.0 -> 8.0.0 -> 9.0.0. For every modified version of the data model you'd write a new driver that just deals with the changes. When OpenSSL 6.3.1 writes data to a file, it would store it with v6.3.1 as metadata. When OpenSSL 9.0.0 reads it, it would first pass it through all the drivers up to 9.0.0. When it writes data, it would pass the data in reverse through the data model drivers and be stored as v6.3.1. To upgrade the data model version permanently, the program could snapshot the old data so you could restore it in case of problems. (Much of this is similar to how database migrations work, although with migrations, going backward usually isn't feasible)
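A purely illustrative sketch of that driver-chain idea in C (not any real library's API): each driver only knows how to move the data model one version forward, and a record is walked up the chain until it reaches the current version.

    /* Each record carries the data-model version it was written with; each
     * driver fixes up just the delta for one version bump. */
    #include <stdio.h>

    struct record {
        int version;      /* data-model version this record was written with */
        int payload;      /* stand-in for the actual data                    */
    };

    static void migrate_1_to_2(struct record *r) { r->payload *= 10; r->version = 2; }
    static void migrate_2_to_3(struct record *r) { r->payload += 1;  r->version = 3; }

    typedef void (*migration_fn)(struct record *);
    static migration_fn drivers[] = { NULL, migrate_1_to_2, migrate_2_to_3 };

    #define CURRENT_VERSION 3

    static void upgrade(struct record *r)
    {
        while (r->version < CURRENT_VERSION)
            drivers[r->version](r);     /* walk the chain one step at a time */
    }

    int main(void)
    {
        struct record old = { .version = 1, .payload = 4 };    /* "old on-disk" data */
        upgrade(&old);                                         /* 1 -> 2 -> 3        */
        printf("v%d payload=%d\n", old.version, old.payload);  /* v3 payload=41      */
        return 0;
    }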


Who's going to write those migration drivers though? Not OpenSSL, because they don't think it's valid to link to multiple versions of their library in the same executable. But also, it will be difficult for it to be anybody else, because the underlying incompatible data structures were supposed to be opaque to the library users. Note that I'm talking about objects that only live in program memory, they're never persisted to disk.


This is the underlying problem: it's the software developers' philosophy and practice that are the limitation, not a technical thing. Doesn't matter if it's program memory or disk or an API or ABI, it's all about what version of X works with what version of Y. If we're explicit about it, we can automatically use the right version of X with the right version of Y. But we can't if the developers decide not to adopt this paradigm. Which is where we are today. :(


Sounds as though you're on the cusp of inventing semver.


I so agree with you.

Containers are just a very wasteful way to do static linking.


I get my teeth kicked in every other month by Docker gang or glibc gang or both for saying this ;) Careful!


Wasteful? Honestly, the docker images I use take up more RAM than they take up disk space. If I had to give up containers you would have to use VMs instead and those are significantly more wasteful.

Also, nothing stops you from putting your statically linked go app in a container which can then use e.g. kubernetes or nomad for horizontal scaling.


"The way to solve the problem of dependency hell is to version every function... always ship every old version of every function..."

This seems like hell to maintain and transitively speaking, a mountain of code. For this much work, I'd prefer to invest in obscenely detailed unittests that allow the team to retire old function-versions and everybody stay on HEAD.

That said, I can imagine cases where old-versions might be helpful for some period of time, and you could call them via their function name and commit hash... then a dependency detector only keeps old versions as needed, a warning tool detects ancient versions to consider retiring, etc. You'd need editor/debugger support so developers can see and interact with old versions, and I'm not sure how this works with raw text editors - perhaps the dependency detector copies the (transitive) code into a new subdirectory?


Yeah, I updated my comment to reflect that, you definitely would only need versions of functions on disk where some code depends on it. That's basically how container layers work.

I think CI/CD practices are really important to get rid of the old code. Automated build systems should be constantly downloading new versions of deps and running tests so devs can fix bugs quickly and release new versions that depend on new deps. The quicker that cycle happens, the quicker old code can disappear, new functionality can be implemented, bugs can be fixed, etc. Because everything would still use pinned versions of deps, it would still work exactly as it was tested, so you wouldn't be sacrificing reliability for up-to-date-ness. Speed and automation of that process are critical.


You might find a video about the Unison language interesting. (I'm not sold on their approach per se, but they do grapple with this problem and its logistics.)


No, this is not the silver bullet for solving dependency hell. Apart from the problem of maintaining all that older version code, the problem of creating different versions of the same function is already non-trivial. For imperative programming languages, you have preconditions on global state that depend on other functions to work correctly.

Say, you have a certain access sequence for locks that has to be obeyed by all functions in this library. Suddenly in some version, you want to add another lock or expose some API that does things with finer granularity. Suddenly the precondition for older code breaks, so there is no guarantee that users can sometimes call the old version of the API and sometimes call a newer version of the API.

I am not saying that the weird example I mentioned is a good way of programming; I just don't think that function versioning is the magic bullet for solving dependency hell. Intra-package dependency conflict is a difficult problem; neither the NixOS approach nor your approach can magically fix that.


Without a more detailed example I'm not sure what you mean (also I'm very tired), but it seems function versions would still solve that problem. A1 and B1 use library C1, and C1 has primitive locking. C2 is released with new locking. A2 is released which uses C2. B1 still runs against C1. If B1 tries to call A, it will call A1, and both still use C1. If, during development, A2 tries to call B, it will call B1, which, during the testing of A2, would fail (if B1 is incompatible with C2), so the developer of A2 would either have to implement a workaround, or upgrade to B2, and then A2 would be calling B2, which would work with C2.

If A and B are developed independently, but some day somebody makes A and B try to talk together, and they were never tested together, they could either 1. make a best effort to try to work and then die, 2. compare mutual dependency trees and try to walk the trees to find versions of themselves (A and B) that work with the same version of a dependency (C), or 3. simply not run at all because they don't know what version of each other to use.

In theory you could use CI/CD to build every version of every app against every version of every dep, store them all in a remote registry, so that whenever some weird combo of apps wanted to use something, they could look up a version of that app built against the right dep.


Yes, developers will need to implement a workaround in the case where A1 and B1 cannot use the same version of C1. So how does the function-versions approach solve the problem?

As for using CI/CD to build every version of every app against every version of every dependency: even if we just ignore the problem of combinatorial explosion, some software may be abandonware and some may be entirely new. The set of versions of dependencies that work with them may not even overlap, so the 'correct' combination may not even exist.

I think the function versioning approach is just a way to maintain backward compatibility; it is not required for backward compatibility, nor is it enough for it.


Thank you, this is what I believe as well. At work I solved the previous team's dependency hell in this way. Our app is installed on many computers in the organization, and it has dependencies that fortunately are mostly developed in-house, where I'm the one ultimately controlling the API and the ABI. I wanted to enable those dependencies to be developed without me recompiling everything, and vice versa. Previously everybody just recompiled everything, and this slowed development to a crawl since upgrades took forever.

This isn't a server, it's something that people install on their machines and are free to upgrade in whole, or in parts (upgrade just a single dynamic library to fix a bug), or not at all.

My solution was to use a "COM-inspired" versioning. I defined an interface-based ABI. Not a function-level versioning as suggested here, it's more coarse than that, but it was enough. See my comment here [0].

[0] https://news.ycombinator.com/item?id=31965368


I think you've just added a 10th circle.

More specifically, it seems like you're solving a subset of dependency issues (mainly version conflicts), while making the other issues worse (e.g. keeping sub-dependencies up-to-date, ballooning disk space usage, less easily cachable requests to package repos).


Thank you! ;-)

Well that's what dependency hell is, a subset of issues: https://en.wikipedia.org/wiki/Dependency_hell The other issues are of course important but require their own solutions. Keeping dependencies up to date [or, in the case of the above method, recompiling applications to use new function versions with security fixes] requires CI/CD, feature flags, auto-upgrading, avoiding drift, etc. Disk space and caching issues require storage with built-in delta layers/CoW/compression (which can also be applied to dependencies, e.g download the dependent versions of functions rather than keeping all of them on disk at once).

As we get more advanced, the hell will get deeper for sure.


> The way to solve the problem of dependency hell is to version every function, and only call functions based on their versions, and ship every old version of every function. Then the application itself, or a dynamic linker, must find and execute the correct version of the function it needs to call. In this way, dependencies can change at any time, but every line of code will only ever call functions that they were originally built and tested against, because those versions are still knocking around somewhere in the dependency.

That breaks as soon as you have a diamond situation. A depends on B and C, which each depend on D. Often you need that to be the same version of D. If B and C are allowed to depend on different versions of D, you get worse problems. There are languages that have tried this model and it has its advantages, but IME the cure is worse than the disease.

> The basic concept for this already exists in glibc as you can ship version-specific functions and link/call a version-specific function. But literally no one uses it.

It doesn't actually work properly, mainly because linux package management sucks. No-one cares about backwards compatibility on linux because you're either using a rolling release distro or you're using a traditional distro that will do all your compatibility checking and versioning for you and upgrade the global version of everything when they're good and ready. Windows has much better support for this kind of thing and that's why you don't really hear about "DLL hell" anymore.

This isn't the biggest thing holding back software development, not by a long shot. It's a side issue that we tweaked and can keep tweaking. But being able to version libraries and functions doesn't work, at least not alone; it breaks the idea of having a platform that mostly-independent components can build on.


> That breaks as soon as you have a diamond situation. A depends on B and C, which each depend on D. Often you need that to be the same version of D. If B and C are allowed to depend on different versions of D, you get worse problems.

Yeah :( If B1 and C2 are installed, they might depend on D1 and D2, respectively. But luckily, A can be built against B1 and C1, which can both depend on D1. Or, A can depend on B1 and C2, and B1 will talk to D1, and C2 will talk to D2; when B1 talks to C, it will talk to C1, because that's the version B1 was built/tested against; if C2 talks to B, it will talk to B2. However A was built and tested, the versions of functions that it used are what will be called. (There may need to be some deeper, ugly static mapping of function deps from compile time, but essentially it should be possible for every version of an application to record what dependencies it used and those dependencies, and for function calls to automatically download and call the correct dep, up/down/across the hierarchy)

The data model is the really sticky wicket, but I think migrations provide a path to deal with it, or just a loosely coupled API.

> mainly because linux package management sucks

Yes that too :) But if apps used versioned functions then package management wouldn't suck so bad!

> This isn't the biggest thing holding back software development

It's what keeps us from changing software frequently/easily, which I think is the biggest problem limiting advancement of the discipline. Changes cause bugs, bugs waste time, and fear of bugs makes us jump through hoops and keeps us from making the changes we need to move other things forward. People are trying to work around it by vendoring deps or statically linking but it's a bad hack because the app & data interfaces are still volatile.

For example, all Cloud technology is currently mutable. An S3 bucket as a whole concept cannot be snapshotted, changed, and then reverted automatically. That's a limitation of the service and the API. Making it immutable requires redesign, which would probably break existing stuff. Every single cloud service is like this. It will take 10+ years to redesign everything in the Cloud to be immutable. But imagine if we could just make it immutable tomorrow, and nothing would break, because old software dependent on S3 would keep using the old version! Make any change you want at any time and ship it right now and nothing will break. That's the future that nobody can imagine right now because they haven't seen it. Just like they haven't seen an OS with only primary storage.


> But luckily, A can be built against B1 and C1, which can both depend on D1. Or, A can depend on B1 and C2, and B1 will talk to D1, and C2 will talk to D2; when B1 talks to C, it will talk to C1, because that's the version B1 was built/tested against; if C2 talks to B, it will talk to B2.

That only works if B knew about C, or if there's a version of D that there are releases of both B and C that were built against. Neither of these can be relied on, especially when you bring in E and F and all the other dependencies that a real application will have (e.g. maybe B updated E before updating F, whereas C updated F before updating E).

> The data model is the really sticky wicket, but I think migrations provide a path to deal with it, or just a loosely coupled API.

The data model is the root of the problem, if you can't fix that you can't fix anything.

> For example, all Cloud technology is currently mutable. An S3 bucket as a whole concept cannot be snapshotted, changed, and then reverted automatically. That's a limitation of the service and the API. Making it immutable requires redesign, which would probably break existing stuff. Every single cloud service is like this. It will take 10+ years to redesign everything in the Cloud to be immutable. But imagine if we could just make it immutable tomorrow, and nothing would break, because old software dependent on S3 would keep using the old version! Make any change you want at any time and ship it right now and nothing will break.

But as long as there are still applications that mutate them, your buckets aren't immutable! Effectively you're just introducing a new, incompatible cloud storage system that client applications have to migrate to - but that's something you could already have done.

The problem isn't depending on old versions versus new versions. The problem is getting consensus on the semantics between all parts of your system, and being able to depend on an older version of something just moves the problem around.


> But if apps used versioned functions then package management wouldn't suck so bad!

Wishful thinking. The main issue with Linux package management is the fragmentation between distros and the fact that human volunteers can't keep up with upstream.


Are you thinking of something like this?:

https://www.unison-lang.org/


This sounds like a good fit for a patch-based version control system like Pijul?

https://news.ycombinator.com/item?id=29991417


That does not "solve" plugins. That does not allow bugfixes in libraries.

Edit: and worse, that's introducing plugin like problems in code that does not even use plugins.

Next idea?


Dependency hell exists because people still think static linking was not a great idea.

All this while Rust, go, and all the new kids on the block use only static linking. Because static linking is a great idea.

If you can update a library, you can also update all other software, and usually from the same source.


> All this while Rust, go, and all the new kids on the block use only static linking. Because static linking is a great idea.

No, it's more that they were designed by entities that distributed their software using static linking.


In Rust's case it was popular demand. Old Rust used dynamic linking a lot more. But our users demanded static linking (Go's popularity was a big driver of this), so the decision was made to do static linking by default.


Do you think Rust would have made more progress on stabilizing its ABI if dynamic linking was pursued?


No, at most we would've invested more in e.g. ensuring¹ you cannot accidentally (or intentionally) mix-and-match shared objects not built by the same build process, or in ways to reduce the cost² of safe ("full relro") dynamic linking.

¹ via forcing the dynamic linker to eagerly resolve some symbols, global constructors, etc. - see https://github.com/rust-lang/rust/issues/73917#issuecomment-... for some recent discussion

² such as exporting a single base symbol per dylib (with a hash of code/data contents in the symbol name, for integrity), and statically resolving imports (to constant offsets from that base symbol), when linking object files into dylibs/executables - because of how position-independent code indirects through the "GOT", in practice this would mean that the dynamic linker would only need to lookup one symbol per dependency .so instead of one per import (of which there can be hundreds or thousands)

Also, "dynamic linking" in Rust was never really "dynamic" or about "drop-in replacement to avoid rebuilds", it was only about the storage reuse of "shared objects", with the "late-binding" semantics inherent in dynamic linking a negative, not a positive.

For example, rustc, rustdoc, clippy, miri, etc. are all relatively small executables that link against librustc_driver-*.so - on a recent nightly that's 120MiB, with only static linking that'd be half a GiB of executables instead. The relationship is "one .so shared between N executables" not "one .so per library". Also, if we could remove runtime symbol lookup but keep the separate .so (like ² above), we totally would.

---

At the same time, Rust's continued evolution depends on being able to (almost) "never ask for permission" when changing internals that were never guaranteed to be one thing or another.

Rust recently fixed most platforms to use simpler and more performant locks based on futex(-like) APIs (see https://github.com/rust-lang/rust/issues/93740 and PRs like https://github.com/rust-lang/rust/pull/95035), and this meant that e.g. Mutex<T> has a 32-bit (atomic) integer where a pointer used to be.

An even more aggressive change was decoupling Ipv{4,6}Addr/SocketAddrV{4,6} from the C types they used to wrap internally (https://github.com/rust-lang/rust/pull/78802) - that one actually required library authors to fix their code in a few cases (that were very incorrectly reinterpreting the std types as the C ones, without this ever being officially supported or condoned).

Imagine trying to do that in C++. The stories I keep hearing from people involved in the C++ standard are tragic, more and more otherwise-uncontroversial improvements are being held back by poorly-motivated stubbornness around ABI stability. Large-scale users of C++, like Google, have long since migrated from C++'s standard library to their own replacements (e.g. abseil) for many of their codebases, and I kept hearing Google were "forking C++" (over the frozen ABI debacle specifically), long before it came out as "Carbon".

---

The technology to coherently³ avoid today's rebuild-the-world cost in compiled languages⁴ isn't here yet IMO.

³ as in, guarantee identical behavior to the full rebuild from scratch, without introducing inconsistencies that could cause logic errors, memory corruption, etc.

⁴ some/most C libraries of course being the biggest exception - and while it's possible to intentionally design libraries like this in any language that can do FFI, it's fundamentally antithetical to "zero-cost abstractions" and compiler optimizations - nowadays the latter are forced even on otherwise-patchable C dependencies via LTO, in the interest of performance

What I'm referring to is incremental recompilation with the following properties:

...

[oops, hit comment size limit! I'll post the rest separately]


Has the Rust compiler team considered using a busybox-style approach (single binary, multiple entry points) for rustc and friends? Even Windows supports hard linking, so AFAIK this should be feasible.


We explicitly support building against librustc_driver-*.so for both "custom drivers" (what we call those binaries I mentioned) and generally "rustc as a library" usecases. We should maybe rename it to librustc and remove as much potentially-confusing "driver" terminology as possible.

Pinning a nightly and installing the "rustc-dev" rustup component are both necessary, the former because internal APIs of rustc aren't stable, but it's a supported usecase.

Both clippy and miri are developed out-of-tree like that, and sync'd using `git subtree` (or `git submodule` but we want to move everything to subtree).


[continued from above due to size limit]

What I'm referring to is incremental recompilation with the following properties:

1. automatic correctness

` - that is, a compiler change not explicitly updating anything related to dependency tracking should at most result in conservative coarser-grained behavior, not unsoundly cause untracked dependencies

` - manual cache invalidation logic is a clear indicator a compiler isn't this - e.g. this rules out https://gcc.gnu.org/wiki/IncrementalCompiler (maybe also Roslyn/C# and the Swift compiler, but I'm not sure - they might be hybrid w/ automated coarse-grained vs manual fine-grained?)

` - the practical approach to this is to split workload into work units (aka tasks/queries/etc.) and then force information flow through centralized "request"/"query" APIs that automatically track dependencies - see https://github.com/salsa-rs/salsa for more information

` - research along the lines of ILC (Incremental Lambda Calculus) might yield more interesting results long-term

` - outside of a compiler, the only examples I'm aware of are build systems like tup (https://gittup.org/tup/) or RIKER (https://github.com/curtsinger-lab/riker) which use filesystem sandboxing (FUSE for tup, ptrace/seccomp-BPF for RIKER) for "automatically correct" build dependencies

2. fine-grained enough

` - at the very least, changes within function bodies should be isolated from other bodies, with only IPOs ("interprocedural optimizations", usually inlining) potentially introducing additional dependencies between function bodies

` - one extreme here is a "delta-minimizing" mode that attempts to (or at least warns the developer when it can't) reduce drift during optimizations and machine code generation, such that some fixes can end up being e.g. "patch a dozen bytes in 200 distro packages" and get quickly distributed to users

` - but even with history-agnostic incremental recompilation (i.e. output only depends on the current input, and past compilations only affect performance, not other behavior), function-level incremental linking (with one object file per function symbol) could still be employed at the very end, to generate a binary patch that redirects every function that grew in size, to additional segments added at the end of the file, without having to move any other function

3. propagating only effective "(query) output" changes

` - also called "firewalling" because it blocks irrelevant details early (instead of redoing all work transitively)

` - example 1: editing one line changes the source byte offsets of everything lower down in the file, but debuginfo doesn't need to be updated since it tracks lines not source byte offsets

` - example 2: adding/removing explicit types in a function should always cause type checking/inference to re-run, but nothing using those type inference results should re-run if the types haven't changed

4. extending all the way "down" (to machine code generation/object files/linking)

` - existing optimizing compilers may have a hard time with this because they didn't design their IRs/passes to be trackable in the first place (e.g. a LLVM call instruction literally contains a pointer to the function definition, with no context/module/pass-manager required to "peek" at the callee body)

` - can theoretically be worked around with one object file per function, which https://gcc.gnu.org/wiki/IncrementalCompiler does mention (under "Code Generation" / "Long term the plan is ..."), but that alone isn't sufficient if you want optimizations (I've been meaning to write a rustc proposal about this, revolving around LLVM's ThinLTO having a more explicit split between "summary" and "full definition")

5. extending all the way "up" (to lexing/parsing/macros)

` - this is probably the least necessary in terms of reducing output delta, but it can still impede practical application if it becomes the dominant cost - AFAICT it's one of the factors that doomed the GCC incremental experiment ("I’m pretty much convinced now that incremental preprocessing is a necessity" - http://tromey.com/blog/?p=420)

` - outside of "rebuild the world", this also makes a compiler much more suitable for IDE usage (as e.g. a LSP server)

My experience is mostly with the Rust compiler, rustc, whose incremental support:

- started off as coarse skipping of generating/optimizing LLVM IR

- has since evolved to cover 1./2./3.

- while many passes in the "middle-end" were already both "local" and "on-demand", sometimes allowing incrementalization by changing only a dozen lines or so, 4. is indefinitely blocked by LLVM (at best we can try to work around it), and 5. (incremental "front-end") has been chipped at for years with several factors conspiring to make it orders of magnitude more difficult:

` - macro expansion and name resolution being intertwined (with the former mutating the AST in-place while the latter being a globally stateful algorithm)

` - "incremental front-end" used to be seen as essential for IDE usage and that notion started falling apart in 2018 or so (see rust-analyzer aka "RA" below, though RA itself is not itself directly responsible, nor do I hold it against them - it's a long messy story)

` - this work (without the IDE focus, AFAICT) has mostly been driven by random volunteer work, last I checked (i.e. without much "official" organization, or funding, though I'm not sure what the latter would even be)

- its design has been reimagined into the "salsa" framework (https://github.com/salsa-rs/salsa - also linked above)

` - most notable user I'm aware of is the rust-analyzer (aka "RA") LSP, which hits 1./2./3./5. (4. isn't relevant, as RA stops at IDE-oriented analysis, no machine code output) - without getting too into the weeds, RA is a reimplementation of "Rust {front,middle}-end", and ideally eventually rustc should be able to also be just as good at 5. (see above why it's taking so long)

I am not aware of any other examples of industrial compilers having both 1. and 2. (or even academic ones but I suspect some exist for simple enough languages) - every time someone says "incremental compilation? X had that Y decades ago!", it tends to either be manually approximating what's safe to reuse (i.e. no 1.), or have file-level granularity (i.e. no 2. - the "more obviously correct" ones are close to the "separate compilation" of C/C++/many others, where a build system handles parallel and/or incremental rebuilds by invoking the compiler once per file), or both.

While 2./5. are enough to be impressive for IDE usage all by themselves, the interactive nature also serves to hide issues of correctness (lack of 1. may only show up as rare glitches that go away with more editing) or inefficiencies (lack of 3. leading to all semantic analyses being redone, may still be fast if only requested for the current function).

Another thing I haven't seen is distro build servers talking about keeping incremental caches around (and compiling incrementally in the first place), for builds of Rust "application" packages (most of them CLI tools like ripgrep, I would assume) - it's probably too early for the smaller stuff, but a cost/benefit analysis (i.e. storage space vs time saved rebuilding) would still be interesting.


> Dependency hell exists because people still think static linking was not a great idea.

Then someone decided to call it Docker.


Exactly, containers are static linking with many extra steps. And more memory and storage requirements.


If your app calls the tool yq, is it written for yq v3, or v4? v4 has breaking changes in its interface. Just statically linking your app won't deal with that, but a container with both your app and yq v3 will. What containers don't deal with is the interfaces between containers.


> All this while Rust, go, and all the new kids on the block use only static linking.

This is trivially disproven:

  $ ldd /usr/bin/rustc
   linux-vdso.so.1 (0x00007ffd3de9b000)
   librustc_driver-e4385bf403336f76.so => /lib64/librustc_driver-e4385bf403336f76.so (0x00007ff832a00000)
   libstd-d371b708c754b3bb.so => /lib64/libstd-d371b708c754b3bb.so (0x00007ff832886000)
   libc.so.6 => /lib64/libc.so.6 (0x00007ff832600000)
   libLLVM-14.so => /lib64/libLLVM-14.so (0x00007ff82be00000)
  [...]
Not only does rustc dynamically link against rust's "std" library, but also parts of rustc are implemented in a dynamic library ("librustc_driver").

The only reason projects written in Rust (other than rustc itself) don't use dynamic linking to other Rust code yet is that there still isn't a stable ABI for Rust code (for good reason: that allowed them to introduce not only reordering of struct fields and niche optimizations, but also passing and returning slices as a pair of registers instead of a pointer to a two-word structure on the stack). I fully expect that, as the language matures, a stable (possibly opt-in) ABI will be introduced, together with freezing some of its standard library (for instance, the contents of the Vec structure, and how its inline functions work), allowing for dynamic linking against Rust's "std" (and as a consequence, passing Rust types between separate dynamically linked modules compiled with different releases of Rust).

(As an aside: back when C was the "new kid on the block", it also used only static linking, dynamic linking of C code came later.)


> but also parts of rustc are implemented in a dynamic library ("librustc_driver")

Nit: 100% of rustc is found within librustc_driver-*.so (I mention it in https://news.ycombinator.com/item?id=32329062 - which also touches on stable ABI).

`rustc` just calls into the .so without any custom configuration: https://github.com/rust-lang/rust/blob/e141246cbbce2a6001f31...

(but rustdoc/clippy/miri all have additional configuration, passes, etc. specific to them)

Also, if you look at the `rustc` executable's size:

    $ du -bh ~/.rustup/toolchains/nightly-*/bin/rustc
    17K     /home/eddy/.rustup/toolchains/nightly-2018-01-01-x86_64-unknown-linux-gnu/bin/rustc
    2.9M    /home/eddy/.rustup/toolchains/nightly-2021-12-02-x86_64-unknown-linux-gnu/bin/rustc
    2.9M    /home/eddy/.rustup/toolchains/nightly-2022-01-13-x86_64-unknown-linux-gnu/bin/rustc
    2.7M    /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/rustc
17K is what it should be, I'm not sure where the extra 2.7M are coming from - my current suspicion is that it's entirely statically linked jemalloc, which I guess we didn't want to impose on all users of rustc_driver, but it might be worth doing it anyway.


Don't be that misleading. That's the rust compiler.

You should analyze an executable created by the rust compiler instead.

AFAIK only glibc and libgcc are linked dynamically in a program written in rust, and there's a way to statically link these.


The Rust compiler is an executable created by the Rust compiler.


IIUC though, the Rust compiler executables are highly unusual in that they dynamically link their main top-level dependency. The vast majority of executables built with cargo do not, and I'm not sure how easy it is for a cargo project to do what the compiler does, i.e. build a top-level crate as a DLL and dynamically link that, while using plenty of static linking within that DLL.


And one of the very few, if not the only one, that has to link with LLVM.


That's true, but many of them that use C libraries dynamically link to those libraries, even if it's not LLVM. OpenSSL is classic example.


This entire conversation has managed to diminish my enthusiasm for Rust. =/


You can write C libraries in Rust. The entire operating system could be written in Rust and there would still be dynamic linking between programs.

What you are complaining about is ABI stability between versions of Rust but you are hiding that by arguing steveklabnik is violating the spirit of your secret ideals.


Dependency hell strikes me as one of those things where people keep trying to apply technological solutions to a sociological problem and failing over and over again. If we can’t produce stable APIs or recognize that Hyrum’s Law is a sign of fatally-flawed engineering, I don’t think that any amount of versioning is ultimately going to help.


Why do we _want_ to change Linux to use only primary storage? Having a hard distinction between persistent and non-persistent memory is insanely useful. What would we do when something goes wrong if rebooting stops being a thing?


There are a couple aspects that are interesting. With Optane, there is still a distinction between persistent and volatile primary storage - you don't want things that change quickly to be on persistent memory because it'll be slower to update and wear down the device. What you gain is fast memory addressable persistent storage.

As for rebooting not being a thing, the IBM AS/400 and its offspring (currently named IBM i) have been doing it for decades now. They are very alien when compared to Unix, but still quite cool. And extraordinarily mainframe-like reliable.

There's a reason they cost a small fortune.


And when you track fine-grained dependencies that way the result is that upgrading an insecure version of a library still leaves a dozen other copies in memory that leave you still vulnerable.

Dependency hell isn't just dealing with incompatible dependencies. It has another side of forcing upgrades when they are needed for very, very good reason.


The old version of the function could get security patches.

There could be a dialog that warns the user and offers to disable functionality.

We will have to eventually do an accurate permission system anyway.


Who's going to write the patches?

Out-of-support branches aren't suddenly going to start being supported again, just because you've changed how dependencies are managed.

Lets say there are 10 versions of my function. A critical vulnerability is discovered, and I publish a fix in version 11. I hire someone to backport the fix to all 10 historic versions of the function.

Now I have 21 distinct versions of the same function.

Oh no, another critical vulnerability has been discovered!

I fix it in version 22, and hire a team to backport the fix to all 21 other versions.

Now I have 43 versions of the same function!

Oh no...

Sure, you could be a bit more pragmatic about which history nodes you backport to, but it's still fundamentally O(n^2). It's hard enough to manage backports at a whole-project level, let alone per-function.


> The old version of the function could get security patches.

But then it wouldn't be the old version of the function any more, but a brand new function. If you allow arbitrarily changing 'old' function then you just lost all the guarantees of stability and reproducability that this approach was supposed to offer.



We already do the function tracking with left-pad and isOdd. I don't like it.

Frankly, what we actually need is a unified package format like we have for containers rather than specific micro optimizations that nobody really asked for. If gradle, npm, pip, cargo, pacman, apt all speak the same package format you only need one registry for all languages. After that you can think about building a unified package manager. After that you can think about a unified build system.


It's a pretty consistent pattern that hardware can do things and software fails to utilize them.


You forgot to include the link to your github where you showed us all how it's done.


Very sad to hear this news, but not so unexpected.

As a distributed database engineer, we have been working with some universities and exploring the optimization of databases on Intel Optane for a long time, and have even published some papers.

But everything was only in the experimental stage; it was difficult for us to use the research successfully in production, because nearly all of our customers hadn't bought Intel Optane. They thought Optane was too expensive and didn't find suitable scenarios. As far as I know, the only success story came from one of our customers, who used Optane for Redis and got acceptable latency with better cost-effectiveness.

Things got worse as the cloud rose. We found that we can't use Optane in the cloud, so our later priority was to explore optimization on cloud disks like EBS gp3, etc.


I never understood what problem single-level storage was actually supposed to solve. In the real world you still need a notion of startup and shutdown because programs crash, memory gets corrupted, things get screwed up, hardware changes, etc., so it's not like we were going to avoid those. So, from a technical standpoint, why exactly would I want to merge my RAM with my SSD? The only thing I can think of is that it lets you achieve genuine "zero-copy" usage of data. Which, I mean, is cool. But is the overhead of copying read-only data from secondary storage to RAM really such a big problem for >90% of users to be justification for changing everyone's very notion of computing?


> But is the overhead of copying read-only data from secondary storage to RAM really such a big problem

It's not a problem that will affect every developer but it is a critical problem.

Many use cases need access to data in both a fast (~ memory) and durable (~ SSD) way.

Copying data from SSD to memory helps with performance but the minute you modify that data then you've lost durability. Databases, file systems, caches etc are examples of critical use cases that are used by all developers that would benefit from something like Optane.
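
To make that concrete, this is roughly what the programming model looks like with PMDK's libpmem; a minimal sketch assuming a DAX-mounted filesystem at /mnt/pmem, with error handling mostly trimmed:

    /* write a record durably without write()/fsync(): the data lives in a
       memory-mapped persistent file and is made durable with cache-line
       flushes via pmem_persist() */
    #include <libpmem.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;
        char *addr = pmem_map_file("/mnt/pmem/hello", 4096, PMEM_FILE_CREATE,
                                   0666, &mapped_len, &is_pmem);
        if (addr == NULL) { perror("pmem_map_file"); return 1; }

        strcpy(addr, "hello, persistent world");  /* ordinary store, memory speed */
        if (is_pmem)
            pmem_persist(addr, mapped_len);       /* user-space flush, no syscall */
        else
            pmem_msync(addr, mapped_len);         /* fallback on non-pmem storage */

        pmem_unmap(addr, mapped_len);
        return 0;                                 /* build: cc file.c -lpmem */
    }
The same bytes are both the working copy and the durable copy; there's no separate page that has to be written back.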


What percentage (say) speed improvement would you expect for typical workloads on database, file systems, etc. from a unified memory system?


So there are only a few examples of existing systems being re-written for Optane.

There was Redis [1] which was around ~10% slower than the memory only version but gained you durability. And there was an implementation of Spark Shuffle [2] which was 1.3x - 3.2x faster but that isn't really stressing I/O as much as other use cases.

For a filesystem you can store the entire metadata in Optane so EXISTS/LIST type operations and anything involving a bloom filter would see the full benefit e.g. order of magnitude better than NVME.

[1] https://github.com/pmem/pmem-redis

[2] https://databricks.com/session_na20/accelerating-apache-spar...


I'm sure it's amazing for heavily loaded databases but that's a pretty small fraction of computers.

For a filesystem I'm less sure about the benefits. Most of the waiting time I see is for the CPU, even if the information is already in memory. What I need is better code implementing the filesystem, not a hardware bump. And even if you go all-in on optane metadata, you only need to replace 1% of your NAND.

I do think there's some really nice potential, but almost all of what I'm interested in can be done with tiny amounts.


Some things are amazingly slow, though; look at how long it takes to install software, for instance. That's rarely CPU-bound at all, but somehow is influenced by I/O mismanagement in the various installers.


Installing software is usually CPU bound on Windows, as decompression, anti-malware scanning, and signature validation of every file write limit throughput. Filesystem metadata management is also a bottleneck.

Other platforms may not have the anti-malware scan but do similar things.


Wow, the Redis case is even worse than I expected! I would've thought maybe a 2x improvement would be normal.

Also I think you're conflating workloads with operations. Sure, the occasional operation like EXISTS or LIST might be faster, but surely most workloads do a lot more than checking a trillion things for existence?

I feel like everything you wrote just makes the case against Optane better than even I could. There seems to be little if any clear performance benefit (being generous here, given slowdowns are also possible as you noted!) to most people's workloads to justify upending everyone's current model of computing. Something like this would probably need to deliver at least an order of magnitude of visible speedup in typical use cases for people to consider it. Which isn't to say some niche workloads might not see 2+ orders of magnitude performance improvement, but the rest of the world clearly won't; why should they have to pay the price for niche use cases?


Why would you expect Optane Redis to be faster than normal Redis ?

Redis is an in-memory system where you are compromising performance for durability. Optane Redis gets you the best of both worlds.

Examples where you are comparing Optane against SSD is where you do see significant improvements especially for smaller bits of data.

And EXISTS/LIST operations are more than just "occasional" operations for data storage systems.


> Why would you expect Optane Redis to be faster than normal Redis ?

Sorry for the confusion. To clarify, my sentences were separate; I wasn't saying I expected 2x for Redis specifically. I was just saying I didn't expect a 10% slowdown for Redis, and that I expected a 2x improvement typically (not necessarily for Redis).

> And EXISTS/LIST operations are more than just "occasional" operations for data storage systems.

Again, communication issue. I wrote "occasional" in the sense of "a small set of operations", not "infrequent operations". As in, you're going through a list of all operations a DB supports, and occasionally one pops out as potentially substantially benefiting from Optane.

Regardless, your argument misses the point I'm making. The point was: how much of the total workload time do they take up. Even if Optane brought down EXISTS/LIST latency to zero, your workload (including all OS/network/client/etc. overhead) would literally have to be 90% composed of (i.e. overwhelmingly dominated by) EXISTS/LIST checks to get an order of magnitude speed improvement for the user.


> I didn't expect a 10% slowdown for Redis

Pretty sure you misunderstood. If you forced redis use an SSD (persistent storage) for everything redis normally uses DRAM for and only observed a 10% slowdown, it would be a goddamn miracle!

> how much of the total workload time do they take up

If you're talking about a read-heavy workload, the only good thing about Optane is that it's a little cheaper than DRAM. But those workloads are easy to scale (just buy 2x caches to get 2x throughput) so they're often not worth discussing.

Also reposting my comment from above:

> PCIe Optane was a thing and it achieved 10us latency whereas today's fastest SSDs get 40us. IIRC the DIMM version of Optane was <1us, literally an order of magnitude faster!


I didn't look into the details of the Redis port to Optane, but I work on GemFire, which is a similar in-memory system. I would expect one big benefit not captured by the 10% number to be startup time. Redis uses an operations log; on startup you have to replay the operations, which can take a while, and you have to clean up the log periodically. So startup / recovery should be quicker, and you now have a whole source of complexity and bugs you don't need to worry about, because the hardware and OS should solve it for you.


Sure, but again, that is clearly far, far too minor of a benefit to justify the radical change we are proposing to everyone's model of computing.


I'm not sure it's all that radical. Sure, you can think of ways to completely clean slate redesign various applications and systems. But equally there are reasonably straightforward ways to integrate it into existing systems to achieve good speedups.


Are changes to everyone's model of computing actually radical? I mean, I understand that taking advantage of the fact that memory actually persists is somewhat non-trivial when done via byte-addressable accesses, and that even with libraries that will be doing these things, developers gain a new set of properties to worry about, but... isn't this all optional? Why can't software treat persistent memory as just a lot denser RAM?


I recommend looking at this presentation from Oracle for an example of the benefits of PMem: "Under the Hood of an Exadata Transaction – How We Harnessed the Power of Persistent Memory"

https://www.youtube.com/watch?v=ertF5ZwCHP0


> I never understood what problem single-level storage was actually supposed to solve.

Briefly stated, it turns fsync() into a near-noop. That seems like it could be a pretty big deal for some workloads.


> programs crash, memory gets corrupted...

Yeah I think you would still need an "erase all caches and reboot" option but there's no theoretical reason you couldn't have that. The main reason you have to restart desktop OSes so often is because they're ancient and don't isolate components properly. How often do you have to restart your phone? A couple of times a year maybe?

But I agree with your point - it does seem like a very cool idea but practically wouldn't make a huge amount of difference and basically requires an entirely new incompatible OS.

I wonder if it would have been more successful in phones actually. iOS already doesn't have a user-accessible filesystem and Android is moving in that direction.


> How often do you have to restart your phone? A couple of times a year maybe?

Me? Like once a month at least, probably twice. Could be as frequent as multiple times in an hour. Just depends on the reason. It could include anything from "battery ran out before I charged" (a few times a year maybe) to "my phone is crashing/behaving erratically" (could be every few weeks) to "Android didn't refresh its MTP database live and won't show the file I added till I reboot" (could be from 5 mins ago). Not an exhaustive set, just listing a few examples.

> I wonder if it would have been more successful in phones actually.

Interesting idea. Maybe? I wonder how much the performance improvement would be for the end-user.


Ok that's not common by any means. Maybe a custom ROM or a new phone is in order?

Personally I reboot my iphone once a year. It also reboots overnight every 3 months or so for software updates, but that's often not observable because apps restore their state.


> Ok that's not common by any means.

Pretty sure it's not uncommon on Android for people to reboot every couple weeks or so: https://www.reddit.com/r/GalaxyS21/comments/prdfpf/how_often...

> Maybe a custom ROM or a new phone is in order?

Not due to the above (the reboot need isn't frequent enough to bother me), but for other reasons, possibly? It's not high priority for me but I've been thinking about it. Thing is, I love my phone otherwise. An insane amount of resources go into making hardware like this, and the planet's already trashed enough as is; I hate throwing out hardware that works fine just for random software glitches I can easily put up with.

>Personally I reboot my iphone once a year.

Maybe iPhone is better about this?


Huh? I probably reboot my phone once a week at least. I would argue once a year is far from normal.


> that's not common by any means

Pretty common, I second the GP.

> I reboot my iphone once a year.

I wonder how many times per year the phone reboots by itself to install OS updates?


I literally answered that in the next sentence after looking up the iOS release log.


> iOS already doesn't have a user-accessible filesystem

iOS actually did add some sort of native file explorer some time ago – no idea how comprehensive it is, but I guess it shows that even Apple couldn't entirely get rid of this.

> and Android is moving in that direction.

… and I absolutely hate it. Though I think it's not so much getting rid of files, as simply a half-assed attempt at sandboxing with a completely incompatible new API, various bugs (performance and otherwise), and breakage (flexibly exchanging multi-file file format files [1] between arbitrary apps is more or less dead if you follow the new rules, though in that case no sandboxing solution on any OS seems to get that right – as far as I'm aware only macOS even attempts to offer some sort of solution for that problem, but even that only solves part of the problem).

[1] Like locally mirrored HTML files with multiple pages or separately stored subresources (JS/CSS/media files/…), or movies + subtitles, or multi-part archives, or…


> How often do you have to restart your phone? A couple of times a year maybe?

Every month, due to the monthly security updates which are delivered via a firmware update.


About the revolutionary aspect of it: primary memory instead of files. Suppose we overcome the problems mentioned in the article; it still doesn't do anything for current software doing normal tasks. E.g., how do you back up primary memory? The only way would be for the application that owns the memory to do it. How do you share files between two processes? By adding memory sharing into your application. So, many of the tasks that are normally taken care of by the OS would need to move into your application. That's not feasible. So it needs a rather different OS and run-time environment, which don't exist, plus a huge rewrite of your apps, along with the services they need, before it can even be considered.

But not only that: it doesn't solve a problem big enough to make it commercially viable. It might be nice for something like Amazon's Redshift, but not for our mundane web servers and accounting programs, nor for the apps on our laptops and telephones. It's a solution in search of a problem.


Related:

Why Intel killed its Optane memory business - https://news.ycombinator.com/item?id=32272648 - July 2022 (6 comments)

Intel to Wind Down Optane Memory Business - https://news.ycombinator.com/item?id=32270799 - July 2022 (6 comments)

Intel Kills Optane Memory Business - https://news.ycombinator.com/item?id=32270584 - July 2022 (1 comment)


SSDs are already insanely fast, and the file model has served us well enough. It makes persistence explicit.

Considering that rebooting is currently the primary way of fixing any malfunctioning device, blurring RAM and disk doesn't seem that helpful. Things will crash and you will still need that distinction to some degree, no matter what the storage tech is. Some stuff is volatile and some stuff isn't.

We still need to explicitly define what happens when we reboot.

What makes optane really interesting to me is the endurance. They seem like they would be great as traditional SSDs hosting databases.


Not just “traditional SSDs”, which are block-addressable.

A byte-addressable Optane RAM would enable a database that no longer “thinks” in pages, but can independently persist much smaller pieces of its internal structures, possibly leading to new paradigms compared to what we have now. Perhaps write-ahead logs, B-Trees and other stuff can be implemented better when you are no longer limited to writing entire pages?
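
As a sketch of what "not thinking in pages" could look like: append a small log record and flush only the cache lines it touches plus an 8-byte tail pointer, instead of forcing out a whole page. This assumes libpmem again, with `w` pointing into a region obtained from pmem_map_file(); the names are hypothetical:

    #include <libpmem.h>
    #include <stddef.h>
    #include <stdint.h>

    struct wal {
        uint64_t tail;              /* offset of the next free byte */
        char     records[1 << 20];
    };

    void wal_append(struct wal *w, const void *rec, size_t len)
    {
        /* copy + flush the record itself: a few cache lines, not a 4 KiB page */
        pmem_memcpy_persist(w->records + w->tail, rec, len);

        /* publish it by bumping the tail, then persist just those 8 bytes;
           a crash before this point simply leaves the record invisible */
        w->tail += len;
        pmem_persist(&w->tail, sizeof w->tail);
    }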


AFAIK they were used for databases


This is a very silly analysis. Lots of smart people took a good, hard look at 3DXP and it just never made economic sense. On top of the cost of the modules, and the cost of the high-spec processor that can use it, it also comes with the opportunity cost of not getting to use a DIMM slot for DRAM. It wasn't the failure of the industry to imagine a better future, it was the failure of the product to make economic sense.


A lot of this was Intel’s own misbegotten attempt to force the use of higher profit processor SKUs which were the only ones allowing enough memory slots/address space to be useful. Also the Optane interface is incredibly complex with lots of funny tradeoffs having more to do with imposed market segmentation than technology. I can’t comment on 3DXP cost effectiveness but it was always funny how Intel ducked admitting the use of Ovshinsky IP.


MBAs eating the world.


This is by far the most informative article I've seen from The Register; I'd have expected this article to come from AnandTech or similar news outlets.

Back in 2021, Linus was asked in an interview what is most special about Linux, and he pointed to the Linux file system as the fastest in the business [1]. Now imagine an OS (Linux or not) having a first-class in-memory file system: it would blow away any conventional OS file system, and Optane could provide this possibility at an affordable cost. I know that you can get several terabytes of RAM with Xeon CPUs now, but at the moment the cost is prohibitive.

The above-mentioned feature has a killer application, namely non-linear time-frequency analysis [2]. This quadratic-complexity algorithm needs a huge amount of working memory in order to work effectively. I have a paper on this topic applied to the biomedical field, and it has been trending at the top of Google search results for several years now due to its accuracy, specificity and sensitivity of more than 99%. Apparently it's a sleeper paper, because other researchers avoid citing it due to its very high accuracy on commonly available datasets, hence its lower ranking (still high, but not top or front page) on the subject in Google Scholar. During a meeting with Micron engineers earlier this year I conveyed my condolences to them.

[1]https://www.tag1consulting.com/blog/interview-linus-torvalds...

[2]https://en.m.wikipedia.org/wiki/Bilinear_time%E2%80%93freque...


The second part of your comment hints at the trouble with your first: Optane didn’t provide major benefits over existing storage architectures when used for a block-addressable file system. It had significant density benefits when used as RAM, but at an equally significant latency penalty.

To really see wins, you needed an application data model that could benefit from the high density/single node capacity, byte-addressability, and persistence. Basically an in-memory database, either general purpose or embedded in a specialized data processing application.

Those are premium, niche markets, and could never support the demand Intel needed to make commercial sense.


> Those are premium, niche markets, and could never support the demand Intel needed to make commercial sense.

That's exactly the sort of thing that people said about the Mac in the mid-1980s, and Windows in the early 1990s, and so on. :-)


> This is by far the most informative article coming from The Register, I'd had expected this article coming from AnandTech or similar news outlets.

Thank you. :-)

I am trying to do some more in-depth analytical pieces for the site, and I am delighted that several of them have done very well -- not only on HN.


~ alternative:

"Last week Intel killed Optane. Today, Kioxia and Everspin announced comparable tech"

"Rumors of storage-class memory's demise may have been premature"

https://www.theregister.com/2022/08/02/kioxia_everspin_persi...


~ alternative:

"Kioxia Corporation, the world leader in memory solutions, today announced the launch of the second generation of XL-FLASH, a Storage Class Memory (SCM) solution based on its BiCS FLASH 3D flash memory technology, which significantly reduces bit cost while providing high performance and low latency. Product sample shipments are scheduled to start in November this year, with volume production expected to begin in 2023.

The second generation XL-FLASH achieves significant reduction in bit cost as a result of the addition of new multi-level cell (MLC) functionality with 2-bit per cell, in addition to the single-level cell (SLC) of the existing model. The maximum number of planes that can operate simultaneously has also increased from the current model, which will allow for improved throughput. The new XL-FLASH will have a memory capacity of 256 gigabits.

Kioxia's second generation XL-FLASH memory solution is engineered to bring higher performance and lower cost for data centers and enterprise servers and storage systems. In the future, it may also be possible to apply the product using CXL (Compute Express Link). Kioxia will continue to develop cutting-edge technologies and products to meet the needs of the expanding SCM market.

Kioxia will feature the second generation XL-FLASH during its keynote presentation at the Flash Memory Summit 2022 in Santa Clara, California on August 2. "

https://www.techpowerup.com/297385/kioxia-launches-second-ge...


XL-FLASH is in the 10us latency range? That's competitive with Optane over NVMe but I wouldn't really call it an alternative to the DIMMs that could do 0.3us.

I'm definitely curious about how fast it will run over CXL.


> XL-FLASH is in the 10us latency range?

we will get more information in the next 24h

"Kioxia will feature the second generation XL-FLASH™ during its keynote presentation at the Flash Memory Summit 2022 in Santa Clara, California on August 2." !


Just to be picky, if I remember it correctly, IBM's AS/400 (and its descendants, IBM i) were based on the idea of all storage being directly addressable by the CPU. When I say AS/400 was ahead of its time and, in some aspects, ahead of ours, I am not entirely joking.


This is definitely a case where hardware surpassed software in that classic “debate”. Intel probably needed to do a reference implementation in Windows/Linux for it to gain teeth though.


Indeed. Sometimes the "build it and they'll come" doesn't work at all.

If Itanium had excellent Linux support on day one it could have been a different story.

To this day I complain loudly IBM has no entry-level desktop/deskside POWER machines that are appealing to people who would buy a generic x86 workstation. If all they have is hardware for current clients to upgrade to, the platform becomes legacy pretty quickly.

I will go as far as saying ARM servers would never have happened were it not for pioneering stuff like the Raspberry Pi and other small hobbyist boards that served as an entry level and proved ARM was viable as a Linux host.


Making a version affordable and available to small time randos would probably have helped as well. Was it even possible to buy a complete system at something other than call-for-pricing pricing?


Few people have a problem that Optane directly solves. Its utility is narrow in the best case and you would need to specifically engineer a software stack to take advantage of it for some small part of the memory address space.

In many cases it is desirable for memory to be volatile. If high-throughput paging mechanics are required, like databases, this technology doesn't serve much of a purpose. Storage densities per server have only been increasing. It has some obvious utility for transaction logs and such but falls into a notorious dead spot in the market -- people with low volumes don't need the performance and people with high volumes can achieve similar results by other means for a price that isn't that different from the Optane premium.

I've tried to find a use for it, and I haven't been able to. That doesn't mean a use case doesn't exist, but if it was "the biggest new idea" you'd think the use case would be more obvious for more applications.


The use case is replacing flash, because modern flash cells die after a handful of writes. Surely there's a market for ultra-durable SSDs.


That math rarely pencils out. Most companies with large storage budgets have figured out that you can design write-intensive storage engines for cheap read-optimized flash that will last several years without an issue despite the relatively low number of write cycles. This is easy to verify on a cocktail napkin if one can assume the software is efficient with its use of storage bandwidth. In most applications with sane storage architectures, you might physically rewrite an SSD once a month, and often less, despite relatively high write intensity.
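
The napkin version, with made-up but ballpark figures (not the spec of any particular drive):

    /* cocktail-napkin endurance check; all numbers are assumptions */
    #include <stdio.h>

    int main(void)
    {
        double capacity_tb        = 2.0;    /* assumed drive size */
        double pe_cycles          = 1000.0; /* assumed rated program/erase cycles */
        double rewrites_per_month = 1.0;    /* "physically rewrite it once a month" */

        double lifetime_tb = capacity_tb * pe_cycles;                 /* ~2000 TB */
        double yearly_tb   = capacity_tb * rewrites_per_month * 12;   /* ~24 TB   */
        printf("~%.0f years to wear-out\n", lifetime_tb / yearly_tb); /* ~83      */
        return 0;
    }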

Quite a lot of software is excessively wasteful with storage writes, in which case it is more of an issue. That says more about the software than the storage, and can be addressed by not being profligate.


What do you mean by 'wasteful' with storage writes? That there is some unintentional write amplification going on for whatever reason?


>you would need to specifically engineer a software stack to take advantage of it for some small part of the memory address space.

didn't they already do that in the kernel and/or drivers?

Because Optane had customers; it wasn't some crazy lab tech, it was actually deployed in prod envs.


The article makes a case, to me at least, that Unix’s “everything is a file” mindset may be increasingly a liability.


[Author here]

Yes, that's exactly right.

The mindset was apt when RAM was tiny and expensive, and the only way to get large amounts of space was disks.

That is no longer true.


I'm really sad to see optane go. Byte-addressable, high-endurance, consistent low latency. It was the platonic ideal of solid state storage. But flash is more than good enough and it's cheap. Even pretty good quality NVMe drives are approaching $100/2TB.


Might have been more successful if it hadn't been burdened by Intel's IP and licensing.


Maybe... I think the bigger problem is that Intel was banking on corporate/commercial customers to buy these cards, and in the end they were only able to trick a few gamers and pro users into buying one. It really raises the question of which workloads are memory/bandwidth constrained, and how this helps them. It reminds me of Apple's impressive memory bandwidth figures for M1 Pro/Max, which were slightly soured when only a scant few real-world tasks fully took advantage of it.

We don't really need memory-mapped storage, IMO. If we do, it's perfectly attainable with swap/virtual memory (and pretty damn fast with newer NVME specs). I think Intel found something cool, didn't know how to market it, and ended up placing a bad bet on fairly domain-specific technology.


> It reminds me of Apple's impressive memory bandwidth figures for M1 Pro/Max, that were slightly soured when only a scant few real-world tasks fully took advantage of it.

Tangent, but... that's kind of missing the point. The M1 SoCs have tons of memory bandwidth to keep the GPU fed. The fact that this also gives the CPU more memory bandwidth than it can possibly use is a convenient benefit.


Oh, for sure; it's entirely a byproduct of engineering an SOC that needs high GPU bandwidth. That doesn't stop Apple from marketing it as a CPU boon though, and it certainly didn't stop starry-eyed HN readers from losing their minds over a spec that only a small handful of people care about. Even price-to-performance, a figure that hasn't been relevant for nearly 2 decades of commodity computing, is a better metric to advertise than memory bandwidth. People simply won't notice the difference.


Almost all workloads are memory-bandwidth constrained; that is the problem of our era. It is one of the biggest drivers of M1 perf across the board.

Optane is orders of magnitude faster than virtual memory, it is a complete OS bypass. I think you are conflating Optane used in an SSD and Optane on the memory bus. SSD Optane was a way to package and sell it before the rest of the hardware and software caught up for DIMM-based Optane.

edit, removed snark.


> Almost all workloads are memory bw constrained, is the problem of our era.

Well, sure; I think that sentiment applies to networking/CPU maximization too. But, much like we discovered with SSDs, there is a point of diminishing return for most people. Pretty much anything running an NVMe SSD won't experience significant storage throttling. Anyone connected to wired broadband will have roughly the same experience as everyone else on it. Since so much software is designed around nominal specifications, I don't really notice the M1's memory bandwidth in regular use. Text editing, app launching, smoothness and usability... it's all roughly the same as my throwaway $300 Thinkpad that has a nicer keyboard and no ARM contrivances.

> Optane is orders of magnitude faster than virtual memory

Not really... Intel has only claimed that Optane is 3-4 times faster than flash-based NVMe, which still puts it at >10x slower than the speeds of a trashy DDR4 2133 MHz DIMM. If it's supposed to replace memory, it's both slower and more expensive. If it's intended to be a storage volume, most people would probably get better performance out of a ramfs or tmpfs.


> Intel has only claimed that Optane is 3-4 times faster than flash-based NVMe

Which you can't talk to directly; you have to traverse the OS and the cache.

Optane sits directly on the memory bus, reads and writes can bypass the OS completely. We aren't talking about the same things.

It sounds like there is something wrong with your M1.

https://swanson.ucsd.edu/data/bib/pdfs/2019arXiv-AEP.pdf


Optane was a very interesting technology, but I don't think it was as revolutionary as the article makes it out to be. First, Optane does not make the memory hierarchy simpler, as you still need fast RAM to solve the write endurance problem. Second, it is not obvious that Optane would be cost-competitive with SSDs. Third, you don't need a tech like that to implement a persistent memory system like the one described in the article: just use RAM backed by an SSD and a battery/capacitor to flush the dirty pages in the event of power failure. I think Apple does something like this in their newer devices.

This indeed leaves Optane as a limited-use enterprise persistent cache technology.


Optane by itself wasn't a product that acted as a solution to real-world problems. It needed to be incorporated into a larger stack.


It needed a database engine to take advantage of it.


Persistent memory though is a solution to a very common problem.

Especially for anything distributed you want that metadata to be super fast but also durable.


I think Intel killing Optane the way it did is premature, but it does make some sense in the long run. If CXL takes off, the best parts of Optane will live on in an industry-wide standard. However, I am having a hard time understanding the nuances between different CXL versions and implementations, so it might end up as an example for the famous XKCD standards post. I was under the impression that Optane was just starting to gain a bit more traction, with major players like VMware acknowledging persistent memory through Project Capitola. Let's hope that Intel axing Optane opens a door for industry-supported pmem so that AMD systems can also benefit from it.



>Optane presented a radical, transformative technology but because of this legacy view, this technical debt, few in the industry realized just how radical Optane was. And so it bombed.

No. It bombed because Intel is not in the habit of reducing prices. Why optimize production and sell more when you can milk the top 1% of your corporate clients? The P4800X was introduced in 2017 at $1.5K for 375GB; guess how much cheaper it is now :)


I wonder how much of this is to do with cloud support.

AWS has never supported it. And on Azure/GCP it was only on expensive, high end types designed for SAP HANA and other niche use cases.

A shame, as there were some interesting projects, e.g. DAOS [1], that I would have loved to try out.

[1] https://docs.daos.io/v2.0/


> But Intel made it work, produced this stuff, put it on the market… and not enough people were interested, and now it is giving up, too.

That sounds like a dubious claim to me, at least if we take the hype at face value.

If Optane was really that awesome, why not build a database on top that's either much faster or much cheaper than what we have now? Then sell that to a cloud vendor. The end user is only interested in the wire protocol (and up the stack from there) that they speak to the database, and not the underlying OS.

I cannot imagine Intel going to one of the three major cloud vendors, offering a 10x improvement (in either cost or performance) for one of their larger offerings, and the cloud vendor simply saying "nope, not interested".

Maybe the practical improvements simply weren't there?


Optane was very cool, but at the same time it quickly turned into a very niche product. If I'm not mistaken, one of its primary purposes was to accelerate spinning disk drives. However, I don't think that's needed nowadays, as hard drives aren't really that horrible any more and SSDs are getting cheaper every year.

If Intel had really focused on what they were trying to do, I think they would have had success accelerating data center drives.


The very laptop I'm typing this on has 256 GB of Optane memory. I'll never see its full potential, will I?


That seems pretty unlikely. Do you mean it has a 256GB SSD with a small amount of Optane memory fronting it, integrated on an NVMe device?

256GB of Optane persistent memory would cost north of $2500.


My desktop has a 480GB Optane disk, it cost around $500 4 years ago. There was a prosumer lineup of almost affordable Optane disks - https://ark.intel.com/content/www/us/en/ark/products/123626/...


Yeah but that’s not the byte-addressable looks-exactly-like-DRAM stuff the article is going on about.


What do you mean? The disk only has Optane; there's no NAND or anything behind it like some of the lower-end Optane SSDs.


Optane is/was sold as PCIe/NVMe block devices; I use those for durability compared to TLC flash. It was also sold in a DIMM format hanging off a memory controller and presented as (magically persistent) pages/bytes to the OS/application. They're different price points.


I always wondered what it would have looked like if Optane had a frontend of regular memory (infinitely rewritable) and just had a special command to "sync" it back to persistent memory... then the OS wouldn't need to be made aware, except for the occasional sync call to persist.
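
That pattern is more or less achievable in user space today: absorb every write in an ordinary DRAM buffer and only copy it down to the persistent mapping when the application asks for a "sync". A minimal sketch, assuming r->pmem is the start of an mmap'ed pmem/DAX region (page-aligned, so msync works on it); the struct and function names are hypothetical.

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>

    struct region {
        char   *dram;   /* fast, endlessly rewritable working copy */
        char   *pmem;   /* start of an mmap'ed Optane/DAX region   */
        size_t  len;
    };

    /* Normal writes touch DRAM only, so the pmem media sees no wear. */
    static void region_write(struct region *r, size_t off, const void *src, size_t n)
    {
        memcpy(r->dram + off, src, n);
    }

    /* The hypothetical "special sync command": copy the working set down
     * and ask the kernel to make it durable. */
    static void region_sync(struct region *r)
    {
        memcpy(r->pmem, r->dram, r->len);
        msync(r->pmem, r->len, MS_SYNC);
    }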


What I really want: a connector that allows me to put DDR3 modules on a SATA port and use it as a swap partition.


I remember some 10+ years ago there were PCIe cards you could stick DDR2 modules into, but I can't seem to find a modern equivalent...

Edit: found the old one, though: the Gigabyte i-RAM

https://en.m.wikipedia.org/wiki/I-RAM


I'd love to have a simple DDR3 to SATA connector and get cheap upgrades for my laptop.


please ewaste the old ram honey, the kids are getting too old to share a bedroom


I was really looking forward to using Optane as an L2ARC and database cache, along with a boot drive.


The “cheap” Optane NVMe devices work great as L2ARC and/or ZIL. I also use one as temp storage for streaming NVR data and post-processing. I'd buy another today for the durability properties alone.


Any hope they will license the tech or pivot and offer it under a different brand and marketing approach?

I was very interested in the tech but never quite understood how to utilize it, so it dropped off my radar.


This doesn't give me great hopes for Processing In Memory... Either they get full compatibility with existing DRAM at the same performance, or they won't see mass adoption.


In contrast, I would say the continuing advancement of DDR, PCIe, and SSDs toward higher data rates for ingestion and retrieval, while also becoming more power efficient, is pretty good news.


I love my optane drives. One of the biggest quality of life upgrades for my desktop.


> Optane kit is as big and as cheap as disk drive

Wait what? It's always been at a significant premium per GB over both enterprise HDDs and high-speed SSDs.

The cheapest DIMM module is 128GB for > $450 a pop.

Even in the enterprise range you'll get 1TB of fast and durable SSD with PLP for that, and that's if I restrict myself to things that are in stock right now. Even if you got the Optanes at 75% off, that'd still be almost twice the price.

There's never been a point in the past when they've been significantly cheaper, either.
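
For what it's worth, the arithmetic behind that comparison, using the figures quoted above (roughly $450 for a 128GB Optane DIMM versus $450 for a 1TB enterprise SSD with PLP):

    #include <stdio.h>

    int main(void)
    {
        double optane_per_gb = 450.0 / 128.0;          /* ~$3.52/GB */
        double optane_75_off = optane_per_gb * 0.25;   /* ~$0.88/GB */
        double ssd_per_gb    = 450.0 / 1000.0;         /* ~$0.45/GB */

        printf("Optane DIMM       : $%.2f/GB\n", optane_per_gb);
        printf("Optane at 75%% off: $%.2f/GB\n", optane_75_off);
        printf("Enterprise SSD    : $%.2f/GB (ratio ~%.1fx)\n",
               ssd_per_gb, optane_75_off / ssd_per_gb);
        return 0;
    }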


Yes, OK, I will own up to that. It was poorly phrased.

"Cheaper than RAM, but in the size range of SSDs, with performance in between the two," would have been a better way to express it.


I'm cynical. If this is as powerful as the article by The Register makes out, it would likely be adopted by the community in some form? Saying it's the death of this technology is probably short-sighted.


Huh?



