Traditional operating systems such as Linux and Windows are 100% dead once non-volatile memory comes along in force. Paradigm shift time.
There is pretty much no reason to use filesystem APIs, or a filesystem at all, any more. You just keep your data in the process address space - it's just not going to go anywhere. Just make a data segment persistent across processes and you can survive restarts. If you back up, you can just dump the address space. Screw hard disks as well. I imagine some form of RPC will be in place between processes so they can talk to each other, and that is it. Lots of small Redis instances would be a similar concept.
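Roughly, something like this minimal sketch (the device path and the struct are made up for illustration, and it assumes a Linux-style mmap() over an NVM device exposed to userspace): a structure sits in persistent memory, you map it into your address space, mutate it in place, and it is still there after a restart - no filesystem API involved.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    struct counters { long restarts; };          /* illustrative persistent structure */

    int main(void) {
        int fd = open("/dev/dax0.0", O_RDWR);    /* hypothetical NVM device */
        if (fd < 0) return 1;
        /* Map a page of persistent memory straight into the address space. */
        struct counters *c = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
        if (c == MAP_FAILED) return 1;
        c->restarts++;                            /* mutate in place; survives a restart */
        printf("restart #%ld\n", c->restarts);
        close(fd);
        return 0;
    }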
Imagine an MP3 server process which can provide persistence in the heap, metadata, and decoding services, and you're there.
It'll be like a small internet inside your machine.
Lisp would fit nicely in this world. Imagine a persistent root environment: load a defun once and it's there forever. Terracotta do something similar with Java.
Then again I could just be insane.
And as for dropping "files", I think that's missing the point. Files are just pickled streams, and there will always be stream metaphors in computing. How else do you represent "inputs" to your program (which, remember, still has to run "from scratch" most of the time, if only on test sets)?
I guess I see this as a much narrower optimization. A computer built out of perfectly transparent NVRAM storage is merely one with zero-time suspend/restore and zero suspended power draw. That doesn't sound so paradigm breaking to me, though it sure would be nice to have.
Assuming that NVRAM becomes dense enough to replace storage in practice -- which is a big assumption, but it's happened to tape and is happening to hard drives right now -- concepts like launching a program, opening and closing a file, even booting will become mostly academic. Certainly they'll be of no crucial interest to users, to whom the distinction between what something is and what it does has never made that much sense.
Sure you could apply all the same abstractions over the top, but if you were designing your OS from a blank slate, why on Earth would you? And it will only be a matter of time before one of those blank-slate OSes is compellingly superior enough to the old-school paradigm, and users will start switching en masse.
> DRAM is hardly transparent already, c.f. three levels of cache on the die of modern CPUs
Keep in mind I'm talking about interface, not implementation. My code might care about cache misses, but my users have no reason to (except in the very, very aggregate). We leak a user-facing distinction between storage and memory because the difference is too significant to pretend it doesn't exist.
I actually wrote in defense of the traditional filesystem model in another post on this thread. Just because I don't agree with your reasoning doesn't mean I don't agree with your conclusion.
Consider Redis, which is a great example of this. How do you store Redis' data efficiently? Well, it turns out that AOF files are slow to start up from and disk-backed virtual memory is slow. The problem goes away instantly with NVM - the job is done with no filesystem API used.
(^+ and B-frames, but let's not overcomplicate things)
The most important property of a filesystem, or of the Unix-style pipes and filters that permeate them, is that serialization is a fundamental component of persistence and communication. Any finite quantity of space can be addressed by a linear scheme, and any unbounded sequence is inherently linear. Consider what happens as soon as you want to send a stream of data across the wire. Or checkpoint an operation. You need to store stuff in a non-volatile and linear form. And since data outlives code, you had better invest time in thinking about that linear form.
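For example (a hedged sketch; the record layout here is invented, not anything from the thread): even if your working data lives in persistent memory as a graph of pointers, the moment you want to checkpoint it or push it down a socket you have to flatten it into a defined, linear byte sequence - which is exactly the job files and pipes already do.

    #include <stddef.h>
    #include <string.h>

    struct track { unsigned int id; unsigned short title_len; char title[256]; };

    /* Flatten one record into a length-prefixed, big-endian byte stream,
       suitable for a socket write or an append-only checkpoint. */
    size_t serialize_track(const struct track *t, unsigned char *out) {
        size_t n = 0;
        out[n++] = (unsigned char)(t->id >> 24);
        out[n++] = (unsigned char)(t->id >> 16);
        out[n++] = (unsigned char)(t->id >> 8);
        out[n++] = (unsigned char)(t->id);
        out[n++] = (unsigned char)(t->title_len >> 8);
        out[n++] = (unsigned char)(t->title_len);
        memcpy(out + n, t->title, t->title_len);
        return n + t->title_len;
    }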
Now, if you use immutable data structures (i.e. non-cyclical) with a proper linearization and a corresponding reader/printer (like in Clojure), you can get many of the same benefits, but you still have a whole bunch of other things to worry about. Just look at the weird things that Clojure needs to do with print-dup and the like.
I'm not saying that there isn't a kernel of a good idea in here. I'm just saying that it's going to have a lot more in common with filesystems and traditional shells than you'd first expect coming from a Lisp REPL school of thought.
No wait, sorry it doesn't. I agree!
Don't agree. Filesystems are used more for organizational purposes than anything else. Most of my folders are logically named and organized by the kinds of things they contain, not by anything physical. The filesystem API will probably evolve into something more tag-based than hierarchical, but you never know. The hierarchical folder metaphor is re-used in tons of places where there's really no physical need for it. Why? Because there's a logical need for it.
Additionally, filesystem APIs provide lots of useful abstractions such as opening resources, closing them, reading from them, writing to them, appending to them, etc. When you go to do your backup in process space, which data is in a self-consistent state that can be copied? You'll need something like a filesystem API to coordinate.
Ever heard of STM?
The web is a great example of a system where there's no particular need to store things like you do in a filesystem. The data for most sites is stored in various SQL and NoSQL databases... yet we still predominantly see hierarchical paths used for resource identification. I wonder why?
Ooh it even wrapped.
I.e., current operating systems won't die, and filesystem or filesystem-like APIs won't be going anywhere for a long, long time.
The Windows NT kernel is primarily a filesystem-backed address space for committed RAM. Originally you actually had to have a pagefile at least as large as physical RAM. Except for nonpageable kernel structures, all the program-accessible RAM was part of a memory-mapped file. [[EDIT: There's plenty of text from Microsoft that implies this (e.g. "you should set the size of the paging file to the same size as your system RAM plus 12 MB...because Windows NT requires "backing storage" for everything it keeps in RAM"); however, offline discussions have convinced me that it was never strictly true.]]
This was back when drive capacity was "more larger" than RAM capacity and disk bandwidth was "less slower". The kernel has evolved away from this design a bit, but it did bring a certain purity. For example, the filesystem cache and the virtual memory paging system could be largely the same thing.
> You just keep your data in the process address space - it's just not going to go anywhere. Just make a data segment persistent across processes and you can survive restarts.
This is more or less what happens when the kernel bluescreens and the page file was at least as large as RAM. It makes debugging kernel crashes easier. (Spare me the Windows jokes, please; I'm not advocating for it, just saying that this part of it, which no one ever sees, had a relatively elegant design, opposite to what the crazy futurist rant suggests.)
• Tape drives still make sense for people who store insanely large data sets with extremely infrequent (or zero) access.
• I do not mourn my tape based backups for a single instant when using my rsync based backups. Life is good.
• Despite tape being dead for most of us, we all know a program called "tar". We still make tape archives, just on other storage.
The paradigm survives on virtual tapes because it is a useful cognitive model. Sure I could make a block file be a virtual disk and put a filesystem in it and send you the files that way, but you'd rather have a good old "tape" archive.
Likewise, disk filesystems are not going to go away if disks go away. They are too useful for reasoning about problems.
Scratch the surface of an iPad. It is full of files, yet empty of disks. Go to Linux, land of speciation, and try to find a persistent storage system for flash memory which does not treat it as a disk. You will find some filesystems optimized for flash, but you will come up (nearly) empty looking for a completely new way of looking at storage.
Persistent full speed RAM should be enough of a change to spawn that new thing, but I'll bet people keep the "real" copy in a filesystem for a long time. When that alpha particle corrupts a bit and trashes your clever RAM based data structure, what are you going to do? I'll reload from my file.
If you use a safe programming language which doesn't piss all over memory (e.g. Haskell, Python, Ruby, Lisp, etc.) you won't need to reboot. Just reload the broken function into your environment and carry on. As for data, the same thing can happen to your disk...
The filesystem only exists because we couldn't keep data in-process, due to the cost of memory. Because of this we cram everything through the filesystem API whether it fits or not.
It's why our machines are slow and primitive.
Consider the case of video - it is better represented as streams of audio, picture, and metadata. To get this out of a filesystem, we have to mux them all into a single stream and then demux them again afterwards and hand them over to other APIs to process. That whole step just wouldn't need to exist.
One point of a filesystem is to have a consistent state that you can recover from after an errant process stomps on memory, or your machine suffers a kernel panic, or your memory becomes so fragmented that you can't even read a 100mb data source, or any other number of issues that can only be resolved by rebooting or reloading.
Once you've committed to non-volatile memory and ditched files, you're tightrope walking without a safety net, at the mercy of the next system level bug. I'd rather know that my data is safe and double-backed up at multiple physical locations, with recoverable history. Files give me that in a well supported, (mostly) system agnostic manner.
Not true. Video components are packaged together because they go together logically. When you want to send the video to another machine or give a copy to a friend, it's entirely logical and useful that the various components are packaged together in some way.
Seems like you were talking about a lot more than just one particular facet of filesystems when you made the futurist proclamation above.
My idea is that communication should be transparent.
I've worked on kit that is never upgraded or rebooted. It's active 24/7, 365 days a year, and is expected to work for 30 years.
I expect the same thing from a normal computer, especially considering the engineering budget is larger for them.
This is a man who's never used Xmonad (which is written in Haskell) for an extended period of time before. If you use it for a long enough time, it gets slower and slower until workspace changes start taking whole seconds.
Eventually you say fuck it and reboot...
...'cept you can't do that without a filesystem with a "base state" to reboot from.
It turned out I had a memory leak in my configuration file :(
xmonad is probably just a turd.
In fact we don't really need processes, just suitable environments set up for the execution of code. For example, stop a Lisp function from calling eval:
(defun no-eval-wrapper (fn)
  ;; sketch only: shadowing EVAL like this isn't portable CL and won't affect code compiled elsewhere
  (flet ((eval (form) (declare (ignore form)) nil))
    (funcall fn)))
Then why not use filesystem APIs to talk about these structures? There's nothing about opendir() that requires it be backed by a block device instead of nvram.
Files are data, and structures are partly code. You can't do much with code because of Turing completeness.
See also http://en.wikipedia.org/wiki/Single-level_store
Haha, this is exactly how the IBM AS/400 works!
I bow to you, IBM (but not the COBOL bit).
Hopefully current OSs will be re-imagined anyway, just due to the vast difference in access times.
And NVM tech will probably be more limited in capacity relative to current RAM than SSDs are relative to spinning HDs. Replacing even 30% of RAM/storage in 5 years seems like a stretch.
HDs haven't replaced tapes yet, and probably won't for slow-read, massive-storage use. They handle the general-use functionality of 20GB tapes, but that's some way off from current multi-terabyte tapes. And those might increase in capacity faster than HDs. So server OSs will have to maintain compatibility.
This implies there is something rearing at the starting gate to replace them.
(I'm going to try - 10 years of embedded followed by 10 years of business-facing work is a good foundation, and I've spent the last 15 years on the problem in my mind, waiting for the technology to arrive.)
I doubt they'd be dead, they just won't be able to do the new things that NVM would allow you to do (at first, anyway).
At a minimum, mobile battery life could be better, depending on how much power is usually used to keep things going. Get an event from a radio or button and the system can instantaneously wake up.
The magic trick that the NDS does when you close its lid will become universal.
<tinfoil hat on>
The whole path/file thing is really useful for distributed computing.
Also, what would users use for object naming (especially across address spaces)? How do I search for a presentation or spreadsheet without a reference to it?
b) No it's not. Distributed computing is normally message-based.
c) You use an index in (a) and pull it into your local scope.
b) Yes, but in those messages are references to data. How do you reference that data across computer address spaces? Right now, we use a URI that is based on host/path/resource semantics. How do you get path/resource without a filesystem-like construct?
c) I'm not following you.
b) The above solves that. You just read an address and it pulls that block over the network into your local address space. You write to it, and it pushes it back.
(STM comes in here as you can semantically wrap such things in transactions).
c) Someone gives you a pointer to the root of a catalogue, or there is one built in; then you can navigate the data structure, be it a full-text index or a linear linked list, to find the data you need. There is no filesystem.
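Very roughly, and purely as an illustration (all the names here are invented, nothing standard): the "catalogue" is just a data structure you walk with pointers, with no open() or path lookup in sight.

    #include <stddef.h>
    #include <string.h>

    struct entry { const char *name; void *data; struct entry *next; };
    struct catalogue { struct entry *head; };   /* persistent root handed to every process */

    /* Finding data is a structure walk, not a filesystem lookup. */
    void *catalogue_find(const struct catalogue *root, const char *name) {
        for (const struct entry *e = root->head; e != NULL; e = e->next)
            if (strcmp(e->name, name) == 0)
                return e->data;
        return NULL;
    }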
I think someone related my ideas to AS/400 which TBH after doing some reading is a pretty good comparison, although I'd do it on a larger scale.
This is really "fringe" computer science if you want to call it that. It's pushing thing boundaries of what is possible intentionally.
Yes, only we want that data shared among many processes, with abstractions, names, rich metadata, etc. (Not to mention different machines, backups, etc.)
It's not like we are currently forced to use filesystems because memory is volatile.
Actually, you've got it backwards: we use filesystems specifically on NON-VOLATILE memory, that is, hard disks.
I think the need for such a well defined and accepted user organizable data store isn't going away regardless of the underlying storage medium.
I like that my file system can be reasoned about in very concrete ways. I'm okay with using a tag-based system like Gmail, but I find it very flat and more difficult to organize than a traditional hierarchy of folders.
And btw, GET OFF MY LAWN!
Then fun things start to happen: you no longer need disk buffers, since your RAM is your disk. And mmap()ing no longer consumes "RAM", because you just map a part of your file to whatever virtual address you want. You need no swap. You never have to swap in or out. You never have to sync your disks or track buffer dirtiness (except at the CPU cache level), because your buffer is part of your file - once you write to it, the file is already updated.
Of course, software will have to adapt: prefer mmap()ing a file to read()ing one; use persistent memory structures. The goal is to minimize copy-on-write, by either read-only access or safe in-place data transformation.
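Something along these lines, as a sketch (the function name is made up, error paths are minimal, and msync()/durability details are glossed over): map the file and work on the bytes where they sit, instead of read()ing them into a second buffer and write()ing them back.

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a whole file read-write; callers then transform it in place. */
    char *map_whole_file(const char *path, size_t *len) {
        int fd = open(path, O_RDWR);
        if (fd < 0) return NULL;
        struct stat st;
        if (fstat(fd, &st) < 0) { close(fd); return NULL; }
        *len = (size_t)st.st_size;
        char *p = mmap(NULL, *len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                               /* the mapping stays valid after close */
        return p == MAP_FAILED ? NULL : p;
    }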
If it costs more, you could use it as a persistent block cache (one that survives reboot), but that doesn't make much economic sense, since nobody is willing to pay money for just faster start-up and warm-up after a reboot. If you have 8GB of NVM, you could just read 8GB off disk in something like 30 seconds and be settled with regular RAM.
(Disclaimer: I'm working on a solution to this.)
I don't think it will be too difficult for security software to wipe their keys from memory before shutting down, and many programs already do this. But so much more would remain vulnerable unless the decrypted data structures were also wiped from memory. Implementing effective security with NVRAM-equipped computers might therefore negate much of the benefit of using NVRAM in the first place.
(Disclaimer: I am biased since I am working on this.)
And, for now, it won't change filesystems much - unless you can get a similar amount of it as a disk (or maybe a compromise: say, today around 8GB of RAM and 1TB of HD is common, so if you can get around 128GB of NVM it can be your new 'SSD').
It is, of course, a very important development, and may make things faster.
Edit: according to reviews it uses compression to boost its bandwidth (but not capacity). Still seems like a decent tradeoff.
Too bad it doesn't fit my MBP =)
Isn't this essentially NVRAM? What are the downsides to it?
Battery-backed cache has probably been around even longer in write caches for large RAID systems.
The 5% power and shutdown is also how MacBook laptops handle sleep. Sleep mode is a low-power, nothing-but-RAM mode, and when the battery gets too low, it goes into 'safe sleep', basically a dump-to-disk, all-powered-off hibernate.
The main reason it's not all that common is that for the sort of workloads you're prepared to pay for a shitload of RAM, you're probably just using it as a cache for a DB or some monster app, and actually keeping it around isn't that much of a priority. You've got failover somewhere else in the stack, and it's one less thing to buy and maintain.
The other critical flaw is that there is a (potentially huge) performance hit in presenting as a disk vs hanging off the northbridge MMU. Even the latest in fancy new SATA is hilariously slow compared to the actual memory bus (6Gbps for SATA3 vs maybe 100Gbps for DDR3), and having all the filesystem abstraction on top, as the titular article of this thread mentions, is a whole lot more overhead.
So yeah. We can. Sometimes people do. But it's probably easier and better to just stick it in the actual RAM slots, and use it differently for everything except 5-second boot times.
Bye bye WAL
Bye bye fsync
Hello NVM replication
I think I'm gonna cry