Ask HN: I'd like an OS/FileSystem that Never Deletes Anything. Does it exist?
15 points by choletentent on Feb 5, 2022 | 31 comments
Forgive me for the ignorance of my question.

I'm tired of creating backups and of losing data. Let's be honest, sometimes it happens. Does a filesystem that never deletes/overwrites anything exist?

I envision something like an append-only filesystem where all the files on your hard drive are under some OS-level 'version control' and you can roll back to any version you want. Only files in my home dir would be enough. Obviously once space runs out it would start overwriting the oldest versions.




The expensive version of what you are describing is a vaulting appliance but I won't continue down that path unless you are a bank.

The affordable version of what you are describing is a set of NAS-class servers that use something like DRBD or Ceph for multi-node replication, plus rsnapshot for local snapshots/versioning if you are on LVM. If using LVM, rsnapshot should be configured to save its snapshots somewhere the remote clients cannot write to.

Look into setting up Ceph clusters if you plan to build something that will need to scale really large on individual volumes across multiple nodes. Ceph supports creating snapshots. There are pros and cons to solutions such as DRBD, LVM and Ceph. That would be a topic in and of itself. Others here are mentioning ZFS and that is also a popular solution but I have never used it and can't comment on it.

In summary, if I were asked to build something that would grow to an unknown size, require snapshots to roll files back to a specific date, and be fault tolerant, I would go with Ceph. As a bonus feature, there are libraries and tools to present Ceph volumes as S3-like buckets. There are Ansible playbooks for setting up Ceph clusters. Ceph has some really good security features as well.
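To give a flavor of the snapshot side, here's a minimal sketch of driving RBD snapshots from Python (the pool/image name "rbd/home" is a made-up placeholder; the rbd CLI subcommands are the standard ones):

    import subprocess
    from datetime import datetime, timezone

    # Hypothetical pool/image name; adjust to your cluster.
    IMAGE = "rbd/home"

    def take_snapshot():
        # Timestamped snapshot name, e.g. rbd/home@2022-02-05T12-00-00
        name = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
        subprocess.run(["rbd", "snap", "create", f"{IMAGE}@{name}"], check=True)
        return name

    def rollback(name):
        # Roll the image back to the state captured by an earlier snapshot.
        subprocess.run(["rbd", "snap", "rollback", f"{IMAGE}@{name}"], check=True)

In practice you'd run take_snapshot() from a cron job or systemd timer at whatever granularity you want to be able to roll back to.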


> Does a filesystem that never deletes/overwrites anything exist?

> Obviously once space runs out it would start overwriting the oldest versions.

You can’t have both.

Also, your system wouldn’t prevent data loss on hardware failures.

What you describe is/may be a WORM device (https://en.wikipedia.org/wiki/Write_once_read_many)


Digital's VMS/OpenVMS had something that was a fair way towards this, in that a file was created with a version number (myfile.txt;1) and when modified, a 2nd file (myfile.txt;2) was created, and so on...

In practice we used to work on things then purge when happy with the final changes, as we didn't have that much 'spare' disk, but it was a nice feature.


S3FS with versioning enabled on the S3 bucket would be a pretty low friction way to implement this. You'd also have to enable cross-region replication to deal with your "backups" being in the same place as your working copies.
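If you go that route, enabling versioning is a single API call. A minimal boto3 sketch (the bucket name is a placeholder):

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-versioned-files"  # placeholder bucket name

    # Turn on versioning: overwrites and deletes keep the old object versions.
    s3.put_bucket_versioning(
        Bucket=BUCKET,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Later: list every stored version of a given key.
    resp = s3.list_object_versions(Bucket=BUCKET, Prefix="notes.txt")
    for v in resp.get("Versions", []):
        print(v["VersionId"], v["LastModified"], v["IsLatest"])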

OpenVMS also has a versioning filesystem and an NFS server. The x64 version is available as a private beta, but it's not cheap.

Personally, I just keep most stuff in a Google Drive folder, and important stuff gets the keep-versions-forever bit turned on. Everything else goes in a git tree, and I have some automation to clone it to a reliable utility server when the OS thinks there's plenty of bandwidth available.

It's worth noting that the Google Drive solution is the only one that I've been able to get to work reliably across Linux, macOS, Windows, Android, and iOS. I've tried OneDrive, Dropbox, and iCloud Files, but they all have various problems with conflict resolution, annoying upselling, or limited file type support for versioning.


S3FS can also be paired with MinIO (1) for a self-hosted solution. MinIO does appear to support versioning (2).

Also, it appears VSI will offer free community licenses for OpenVMS x64 in the future (3).

(1) https://min.io

(2) https://min.io/product/object-versioning-bucket-versioning

(3) https://vmssoftware.com/about/news/2020-07-28-community-lice...



I imagine something does exist that does this, but I do not know what it is called.

I know that overlayfs exists, which could enable this sort of thing by capturing each individual change to the filesystem as a new layer. https://www.kernel.org/doc/html/latest/filesystems/overlayfs...

Btrfs and ZFS can do snapshotting, and I would imagine they could be configured to snapshot at certain intervals or after individual changes to the filesystem or a section of the filesystem, like watched directories.

https://wiki.archlinux.org/title/Btrfs#Snapshots

https://docs.oracle.com/cd/E19253-01/819-5461/gbcya/index.ht...
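As a rough illustration of the Btrfs route, a minimal sketch of a one-shot snapshot script you could run from cron (the paths are assumptions: /home as a Btrfs subvolume, /.snapshots as an existing directory on the same filesystem):

    import subprocess
    from datetime import datetime

    SOURCE = "/home"        # assumed to be a Btrfs subvolume
    DEST = "/.snapshots"    # assumed to already exist

    name = datetime.now().strftime("%Y%m%d-%H%M%S")
    # '-r' makes the snapshot read-only, so old versions can't be altered.
    subprocess.run(
        ["btrfs", "subvolume", "snapshot", "-r", SOURCE, f"{DEST}/{name}"],
        check=True,
    )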

There's also filesystem-in-userspace tech like FUSE which could enable something like this. https://www.kernel.org/doc/html/latest/filesystems/fuse.html


On Windows, since 8.1, there is the built-in File History. https://support.microsoft.com/en-us/windows/backup-and-resto...

On macOS, there is Time Machine built in. https://support.apple.com/en-us/HT201250

ZFS does have the concept of snapshots, but I would not consider that user friendly or useful for version tracking.

There are plenty of version control systems out there, like Git; again, not user friendly for what you want.

Dropbox/OneDrive/Google Drive provide various levels of version tracking of changes.

I personally use Backblaze (30 days, continuous) and Tarsnap (45 days, nightly) for both off-site and temporal recovery when I really need it. Backblaze is used for "desktop" systems, Tarsnap for "server" systems.


" Set up a drive for File History Before you start using File History to back up your files, you need to first select where your backups are saved."

This is just backup.


> On Windows, since 8.1, there is the built-in File History.

Why am I only hearing about this today? That's an amazing feature to support.


Not exactly what you're asking for, but how about:

* use ZFS (or any snapshot capable file system)

* trigger snapshots at regular (short) intervals (as chosen by you).

ZFS snapshots are very low cost, and you can browse them using your normal file tools. You could reasonably snapshot every (say) 5 minutes, but I wouldn't try every second :)
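A minimal sketch of that loop, including the "overwrite the oldest versions" pruning from the original post (the dataset name tank/home is a placeholder; tools like zfs-auto-snapshot or sanoid do this properly):

    import subprocess
    import time
    from datetime import datetime

    DATASET = "tank/home"   # placeholder dataset name
    INTERVAL = 300          # snapshot every 5 minutes
    KEEP = 288              # retain roughly one day of 5-minute snapshots

    def list_snapshots():
        # Oldest first, thanks to '-s creation'.
        out = subprocess.run(
            ["zfs", "list", "-t", "snapshot", "-o", "name", "-H",
             "-s", "creation", DATASET],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.split()

    while True:
        name = datetime.now().strftime("auto-%Y%m%d-%H%M%S")
        subprocess.run(["zfs", "snapshot", f"{DATASET}@{name}"], check=True)
        # Prune the oldest snapshots once past the retention budget.
        for old in list_snapshots()[:-KEEP]:
            subprocess.run(["zfs", "destroy", old], check=True)
        time.sleep(INTERVAL)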


> I'm tired of creating backups and of losing data. Let's be honest, sometimes it happens. Does a filesystem that never deletes/overwrites anything exist?

I use Linux and symlink the home folder to a Dropbox folder. There is an extended history option [1], so you can go back a year for any file. I think it's a reasonably close solution if you do not mind the cost and syncing to the cloud.

[1] https://help.dropbox.com/files-folders/restore-delete/extend...


Look up "append only" filesystems. That'll get you most of the way there, but you'll have to manually reset things to get rid of old files. Or just use a competent backup system.


Let's say you are editing a source code file. You save and you run. Then you tweak, save, and run again. But now you would like to go back two changes, and you did not make a commit. Very recent changes are not captured by backup systems, no matter how competent they are.

I'd like to have all my files under git, for every save - and frictionless. I know it is a lot to ask.
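For what it's worth, the frictionless part can be approximated with a file watcher that commits on every save. A minimal sketch using the third-party watchdog package (the watched path is a placeholder, and branch handling is deliberately naive):

    import subprocess
    import time
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    WATCHED = "/home/me/projects"  # placeholder path; must be a git repo

    class CommitOnChange(FileSystemEventHandler):
        def on_modified(self, event):
            # Ignore directory events and git's own bookkeeping.
            if event.is_directory or "/.git" in event.src_path:
                return
            subprocess.run(["git", "-C", WATCHED, "add", "-A"], check=True)
            # check=False: 'nothing to commit' exits non-zero and is fine here.
            subprocess.run(
                ["git", "-C", WATCHED, "commit", "-m",
                 f"autosave: {event.src_path}"],
                check=False,
            )

    observer = Observer()
    observer.schedule(CommitOnChange(), WATCHED, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        observer.stop()
        observer.join()

The git-wip suggestion below is a more polished version of the same idea.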


I remember VMS on the VAX did this; it was a major pain in the neck, as it was really easy to run out of disk space (the quota on my account was 1 MB, which was somewhat small even in those days). But consider whether this is really what you want. Every habitual save is another version. Every recompile duplicates all the intermediate object files. If you are using Unix and you have /tmp mounted on a versioned filesystem, every piped command like less or sort will create a new file. If your system uses pagefiles instead of a swap partition (e.g. macOS, Windows), that's going to be really unpleasant.

I've been developing for many years and I can't remember the last time I ran into this problem, so you can try my approach. When I'm debugging something and unsure of my solution, I comment out the original code and put the potential replacement below. If that doesn't work, do the same. I'll sometimes have three or four potential versions. Also, I generally commit every time a task is finished. During development the tasks might be pretty granular ("implemented underlines, strikethroughs, and custom fg/bg colors for text"), but each bug fix gets its own commit ("bug-2309: text was offset incorrectly on Win32"). Between these two, I rarely run into the problem.


Look up "git wip", configure your editor to use it and it keeps an unintrusive 'second branch' in the background.

https://atom.io/packages/git-wip

"git-wip is a script that will manage Work In Progress (or WIP) branches. WIP branches are mostly throw away but identify points of development between commits. The intent is to tie this script into your editor so that each time you save your file, the git-wip script captures that state in git."


> Very recent changes are not captured by backup systems, no matter how competent they are.

Depends; there's backup software for Windows that will watch for file changes and back up immediately. Usually enterprise-targeted. You could certainly do that on other OSes, possibly with less nice APIs; a half measure would be taking filesystem snapshots continuously, or once a minute, or ???.

There are also filesystems you can use on CD-R/DVD-R that would give you the option to go back to any version, although that doesn't sound pleasant.


This sounds a lot like what you're describing: https://github.com/tkellogg/dura


For source code, check out Local History if you're using JetBrains IDEs. If not, there are similar plugins for other editors too.


I think NILFS [0] fits your description, though it is somewhat obscure. It creates a checkpoint with every write, which can be mounted after being turned into a snapshot.

Although the website seems to have not been updated in a while, its mailing list appears to be active. [1]

[0] https://nilfs.sourceforge.io/en/
[1] https://marc.info/?l=linux-nilfs
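The workflow with nilfs-utils looks roughly like this (a minimal sketch; the device and mount point are placeholders, and everything needs root):

    import subprocess

    DEVICE = "/dev/sdb1"      # placeholder NILFS2 device
    MOUNTPOINT = "/mnt/past"  # placeholder mount point

    # 'lscp' lists the checkpoints NILFS created on each write.
    print(subprocess.run(["lscp", DEVICE],
                         capture_output=True, text=True).stdout)

    def mount_checkpoint(cno: int):
        # Promote the checkpoint to a snapshot so the GC won't reclaim it...
        subprocess.run(["chcp", "ss", DEVICE, str(cno)], check=True)
        # ...then mount it read-only to browse the tree as of that point.
        subprocess.run(
            ["mount", "-t", "nilfs2", "-r", "-o", f"cp={cno}",
             DEVICE, MOUNTPOINT],
            check=True,
        )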


Maybe one option would be to host a WebDAV share with the Delta-V extension (RFC 3253) backed by Subversion. An alternative to Subversion might be Nextcloud, which supports automatic versioning and WebDAV, but I do not know if it supports versioning through WebDAV.

I also desire an easy solution to this use case.


To expand a bit on the issues with the solution above: the problem with WebDAV/Delta-V is that very few servers and very few clients support it, and it is also difficult to find documentation on whether it is or isn't supported by a piece of software.

Here is more information about WebDav/Delta-V: https://www.linuxtopia.org/online_books/programming_tool_gui...

This part describes what you and I desire with caveats: https://www.linuxtopia.org/online_books/programming_tool_gui...

Here is a discussion on why IIS never implemented it: https://social.msdn.microsoft.com/Forums/en-US/a9bf95b1-36ac...



Even if the perfect version of this exists, it will _not_ save you from needing backups. The FS can always screw up or the drive(s) go bad.


Is Apple Time Machine dead? It's been many years since I last used it. But anyway,

> I'm tired of creating backups

Have you tried automating them? There are thousands of free and paid tools to do that. Set up once, forget until the next loss.


I hope this hypothetical OS never needs to use a swap file.

More seriously, in a time of unobtrusive automated backups and dirt-cheap storage space, why is this such an issue that you want this kind of system?


Domain OS had a versioning filesystem. You could always load an older version of your file in the editor.


Something similar can be achieved today with a combination of autosaves or version saves at the application level, as well as Time Machine backups at the OS level.


I'd like the opposite: an FS which really deletes everything you command it to, so no recovery tool would find there was ever anything (let alone recover it).


I think you can do this with Fossil and Venti.


Check out Plan9’s Venti service?




