Which operates on the same idea, with up to a 10 GB local cache, and will stream out larger files. It supports a huge number of storage back ends (Dropbox, Google Drive/GCS, Amazon S3/Drive, OneDrive, SFTP, etc.) and can also pin files and trees into the cache.
(Sorry, don't mean to attack anyone or put too much blame, but let's call it what it is.)
To get back to our argument: By the time you said "coming soon" you didn't know if it was coming at all, so I would call that a bit of a lie.
It's the same idea as setting up a landing page for a product that doesn't exist and seeing if people click "pricing" or "buy now" to determine if there is a market for it.
I am building Zero to store my personal pictures and videos, and I feel like a 10 GB local cache means that only the last month of them would be local, which seems very limited.
Are there any technical reasons why the cache cannot be 500GB?
Working on both issues.
But first I will add some instructions on how to run it.
It seems to me that what we really want is a cloud file system with a local cache (like Dropbox or iCloud, conceptually), so that if our local device is vaporized we have a pretty much up-to-date logical store alive and well (and we can work on any number of machines). The word “swapping” seems to me to be based on the virtual memory model, which means that if anything goes wrong you have two disconnected piles of crap.
At a file level you could theoretically have a giant file that is never wholly local, but how useful is this as a feature in real terms?
Of course you don't get a nice mirror of your files right in the cloud, unless you run a separate server that reconstructs it and makes it available as traditional buckets.
From what I tested, restic has friendlier command-line options, but duplicacy is technically superior at this point (restore works way faster).
Use something like inotify to record changed files and a background worker to sync them immediately. Like Dropbox does.
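A minimal sketch of that approach, using Python's watchdog library (which sits on top of inotify on Linux); the watched path and upload_to_cloud() are placeholders, not anything from Zero:

    # Sketch: record changed files via inotify (through watchdog) and push
    # them from a background worker. upload_to_cloud() is a hypothetical stub.
    import queue
    import threading
    import time

    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    changed = queue.Queue()

    class ChangeRecorder(FileSystemEventHandler):
        def on_modified(self, event):
            if not event.is_directory:
                changed.put(event.src_path)

    def upload_to_cloud(path):
        print("would upload", path)  # placeholder for the real uploader

    def sync_worker():
        while True:
            upload_to_cloud(changed.get())

    threading.Thread(target=sync_worker, daemon=True).start()

    observer = Observer()
    observer.schedule(ChangeRecorder(), "/data/watched", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        observer.stop()
        observer.join()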
And yes, performance is a disaster right now, simply because the code is not optimized at all. But the sync to the cloud happens in the background, so it should not affect your performance unless you have a "cache miss".
What about often-locally-changed data that are part of a coherent set, the classic case being a file used by a database engine to store data? We nearly always need to mirror/back up a consistent version of it (just after a successful top-level transaction, the upper-level "COMMIT" in the SQL world), but AFAIK, for the time being, HSM+backup software cannot detect such a state. One could try trapping existing system calls (fsync and co., in order to copy data to the remote storage in a synced state), but this is not robust, because their semantics are not "upon return of this call, the whole dataset (across all files) is consistent".
Moreover, if the application using the DB engine is not perfect, such inconsistency may reside at the application level: after a COMMIT the file is consistent for the DB engine, but not for the application.
I wonder if some users of such HSM+backup software felt some major disappointment after restoring an inconsistent version of such a file. Even a minor loss (garbled index) may be hard to detect and lead to a "fork" of the data.
A dedicated system function called to signal "in my set of opened files, the data are consistent" would be useful but is, AFAIK, missing; and even if someone adds it to some libc/kernel, it will only be useful once application code actually calls it.
The usual kludge is a procedure: "order the engine to sync its data; throttle the engine into 'no write' mode; create a read-only snapshot; back up the snapshot; unthrottle the engine; delete the snapshot", which is not exactly "transparent".
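For what it's worth, some engines do expose a cooperative way to get a consistent copy. A quick sketch with SQLite's online backup API (via Python's sqlite3), which hands you a consistent snapshot without copying the live file at the filesystem layer; the paths here are made up:

    # Sketch: ask the engine itself for a consistent copy instead of copying
    # the live file underneath it. Uses SQLite's online backup API.
    import sqlite3

    src = sqlite3.connect("/data/app.db")          # the live database (hypothetical path)
    dst = sqlite3.connect("/backups/app-copy.db")  # destination for the consistent copy
    src.backup(dst)  # consistent snapshot, even with concurrent writers
    dst.close()
    src.close()

That only solves it per engine, of course; it doesn't help a generic HSM/backup layer that knows nothing about the application.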
Don’t perform data operations at the wrong layer.
With Zero, all local data is eventually synced to the cloud, but usually this only happens once the local file has been idle for a while.
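Not Zero's actual code, just a sketch of what "idle for a while" could mean in practice (the threshold, the dirty-path set, and upload() are all placeholders):

    # Sketch: treat a file as "idle" once its mtime is older than IDLE_SECONDS,
    # and only then hand it to the uploader.
    import os
    import time

    IDLE_SECONDS = 300  # arbitrary threshold for illustration

    def is_idle(path):
        return time.time() - os.path.getmtime(path) > IDLE_SECONDS

    def sync_pass(dirty_paths, upload):
        # dirty_paths: a set of paths previously recorded as changed
        for path in list(dirty_paths):
            if os.path.exists(path) and is_idle(path):
                upload(path)
                dirty_paths.discard(path)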
This would be an actual block device, not a FUSE file system.
However, the software tries to predict which files will be accessed, and doing this at the block level would be much harder (see the toy sketch below).
Plus, I had to learn FUSE to write this, and I thought I'd start off easy :D
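To illustrate what I mean by file-level predictions: at the file layer you can see paths and access times, which a block device never exposes. A toy recency heuristic (purely illustrative, nothing like what Zero actually ships, and atime is a crude signal anyway given relatime):

    # Toy sketch: rank files by last access time and keep the top N "warm".
    import os

    def hottest_files(root, keep=100):
        candidates = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    candidates.append((os.path.getatime(path), path))
                except OSError:
                    continue
        candidates.sort(reverse=True)  # most recently accessed first
        return [path for _atime, path in candidates[:keep]]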
It was better than walking around with an external drive. But it was super slow. After I upgraded the laptop, I had no further use for it.
Not at all interesting IMO. WebDAV and Nextcloud already work (just like rsync). What's interesting is applying some kind of encryption on top of it. For that, I use Cryptomator, which also works on mobile devices.
The other reason is Linux support.
Have you done any performance testing to compare zbox vs xfs/ext4/zfs/whatever on the same system? I saw the benchmark in the readme, but that doesn't necessarily show how much overhead or loss of performance there is over the native filesystem.
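The kind of comparison I'm after is roughly this (a crude sketch, not a proper benchmark; the mount points are made up, and if zbox isn't exposed as a regular path the same loop would have to be ported to its own API):

    # Crude sketch: time sequential writes of the same data into two
    # directories (e.g. a zbox-backed path vs. a native ext4/xfs path).
    import os
    import time

    def write_throughput(directory, total_mb=256, chunk_mb=4):
        chunk = os.urandom(chunk_mb * 1024 * 1024)
        path = os.path.join(directory, "bench.tmp")
        start = time.time()
        with open(path, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.time() - start
        os.remove(path)
        return total_mb / elapsed  # MB/s

    for d in ("/mnt/zbox", "/mnt/native"):  # hypothetical mount points
        print(d, round(write_throughput(d), 1), "MB/s")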
For ~GBs of data per day, is it necessary to use something that avoids having a full local copy? I'd have thought you could have a full local mirror backed up with Dropbox, Backblaze, rsync, rclone, etc.
The code is good; maybe you'll find inspiration there.
It allows you to configure an arbitrary cache size; I've been using it with a 60 GB local cache.
I'd primarily like to use this to back up a couple of Proxmox hosts.
Local caching is limited to 10% of your disk space (if I remember correctly).
Cool project though. Will definitely keep it on my radar.
Btw, I really love the idea of Keybase; I hope they take off.
Same here, they're doing some really good work.
I just find S3 very expensive for long-term storage of personal data.