
Show HN: Zero – Local file system transparently swapping to the cloud - konschubert
https://github.com/KonstantinSchubert/zero
======
hemancuso
Very cool project! Close to my heart. I write ExpanDrive

[http://www.expandrive.com](http://www.expandrive.com)

It operates on the same idea, with up to a 10GB local cache, and streams
larger files. It supports a huge number of storage back ends (Dropbox, Google
Drive/GCS, Amazon S3/Drive, OneDrive, SFTP, etc.) and can also pin files and
trees into the cache.

~~~
sz4kerto
Bought a license maybe 4-5 years ago because the website said that Linux
support is coming very soon. It's been coming since then. Tried to sign up for
the beta but didn't even get an answer. :(

~~~
hemancuso
Originally said "coming soon" to see if anyone was interested. Turns out they
are!

~~~
gdfasfklshg4
Can you explain a little more what you mean by this?

~~~
pvinis
It means that it was not actually "coming soon". It said that, just to see if
many people are interested in that feature. If there were many people
interested, development of that feature would start. If not many people were
interested, I guess the text would have been just removed. Makes sense?

~~~
arthurcolle
Pretty convoluted to me...

~~~
konschubert
No, it's simple, it's just a lie.

(Sorry, don't mean to attack anyone or put too much blame, but let's call it
what it is.)

~~~
hemancuso
Technically the lie really only extends to the "soon" part, since it is
coming. And software regularly has a list of future features that often don't
ship. Would love to chat if you were ever interested!

~~~
konschubert
Hi, I am always up for a chat, my email is mail@konstantinschubert.com. :)

To get back to our argument: At the time you said "coming soon" you didn't
know if it was coming at all, so I would call that a bit of a lie.

------
konschubert
Author here. Very glad to see that nobody has noticed how slow it is and
nobody has made a comment about how messy the code is in some places :D

Working on both issues.

But first I will add some instructions on how to run it.

~~~
ohiovr
I think you should try funding your project by encouraging users to follow an
affiliate link to buy storage from backblaze.

------
Tloewald
Interesting idea. But is the cloud version a complete image (perhaps out of
sync)? If so then it’s a performance disaster, if not it’s very fragile.

It seems to me what we really want is a cloud file system with local cache
(like Dropbox or iCloud conceptually) so that if our local device is vaporized
we have a pretty much up to date logical store alive and well (and we can work
on any number of machines). The word “swapping” seems to me to be based on the
virtual memory model which means that if anything goes wrong you have two
disconnected piles of crap.

At a file level you could theoretically have a giant file that is never wholly
local, but how useful is this as a feature in real terms?

~~~
nine_k
I think Borg or Tarsnap use the right approach here: a map of blocks, updating
a file updates only the changed block(s). It balances the efficiency of
updates and the completeness of the copy. Sort of like FAT filesystem, only
with block-level deduplication built in.

Of course you _don't_ get a nice mirror of your files right in the cloud,
unless you run a separate server that reconstructs it and makes it available as
traditional buckets.
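
To make the block-map idea concrete, here is a minimal sketch. It is not how Borg or Tarsnap are actually implemented: the `BlockStore` class, the fixed 4-byte block size, and the helper names are all assumptions chosen for illustration. It only shows that rewriting one block of a file stores one new block, while unchanged blocks are shared.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


class BlockStore:
    """Hypothetical content-addressed store: block hash -> block bytes."""

    def __init__(self):
        self.blocks = {}
        self.uploads = 0  # counts blocks that actually had to be stored

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:  # deduplication: identical blocks are stored once
            self.blocks[digest] = data
            self.uploads += 1
        return digest


def store_file(store, content):
    """Split content into fixed-size blocks and return the file's block map."""
    return [
        store.put(content[i:i + BLOCK_SIZE])
        for i in range(0, len(content), BLOCK_SIZE)
    ]


def read_file(store, block_map):
    """Reassemble a file from its block map."""
    return b"".join(store.blocks[digest] for digest in block_map)


store = BlockStore()
v1 = store_file(store, b"aaaabbbbcccc")   # three new blocks
uploads_after_v1 = store.uploads
v2 = store_file(store, b"aaaaXXXXcccc")   # only the middle block changed
new_uploads_for_v2 = store.uploads - uploads_after_v1
```

The second version reuses the first and last block hashes from `v1`, so only the changed block is stored again. The store itself holds opaque blocks keyed by hash, which is why it is not browsable as a mirror of the files.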

~~~
h1d
restic and duplicacy are the newer implementations of block level dedup
encrypted backup.

From what I tested, restic has friendlier command line options but duplicacy
is technically superior at this point (restore works way faster)

~~~
mappu
Restic's restore isn't parallelized at all, whereas its backup is. It should
be straightforward to improve the restore performance.

[https://github.com/restic/restic/pull/1719](https://github.com/restic/restic/pull/1719)

------
mkl
I've been using SyncThing [1] recently, which does a similar thing but between
your own devices (anything from Android to desktop to servers in the cloud).
I've been using it on Linux and Windows, and it seems pretty good.

[1] [https://syncthing.net](https://syncthing.net)

~~~
luhn
I don't think Syncthing is the same. As far as I recall, it stores the full
sync folder to disk, similar to traditional sync services. This project only
stores a cache on the disk.

~~~
mkl
You are correct. With SyncThing each device gets a full copy of whatever
folders you've asked it to sync there. The FUSE aspect of Zero lets it do
just-in-time file transfer and have no solid upper limit on storage, saving
bandwidth and disk space at the cost of portability, latency, and redundancy.

------
ArtWomb
Gcloud also has a FUSE adapter for cloud storage buckets

[https://cloud.google.com/storage/docs/gcs-fuse](https://cloud.google.com/storage/docs/gcs-fuse)

------
mbrumlow
Just wondering. Would anybody want a block device backed by S3 or other object
storage? Local cache with snapshots that can be rolled back? Maybe giving you
an exabyte of addressable storage?

This would be an actual block device, not a FUSE file system.

~~~
chrisper
You mean like iSCSI in the cloud?

~~~
mbrumlow
Yeah like iSCSI backed by the cloud but with local cache for recently used
data. And the ability to roll the entire device back to a state in the past
(depending on how big you want your S3 bill to be).

------
Fnoord
So, what all these filesystems do is keep a local cache (in RAM or on SSD
for better performance). The successors of NFS in the Linux kernel have been
doing that all along.

Not at all interesting IMO. WebDAV and Nextcloud already work (just like
Rsync). What's interesting is applying some kind of encryption on top of it.
For that, I use Cryptomator [1] which also works on mobile devices.

[1] [https://cryptomator.org](https://cryptomator.org)

------
swalsh
I built something similar years ago when I had a laptop with a really tiny
hard drive. I would move a bunch of files to a cloud server, and left a bunch
of files with no data on the hard drive. Then I wrote a file system filter
driver which would watch for the files opening, catch the open request, and
quickly download.

It was better than walking around with an external drive. But it was super
slow. After I upgraded the laptop, I had no further use for it.

------
Shorel
This is one of the reasons I use pCloud. It adds a huge disk device that is
actually remote and only the recent files are locally cached.

The other reason is Linux support.

~~~
ptman
Lifetime prices for online services? Sounds really sketchy.

~~~
adrianN
Lifetime of the service perhaps.

------
burmecia
Nice work! I also made a similar file system, Zbox:
[https://github.com/zboxfs/zbox](https://github.com/zboxfs/zbox). The
difference is that Zbox is an in-app file system focused on privacy, so FUSE is
intentionally not supported. Although it already supports a key-value store, I
am currently trying to extend its capability to cloud storage.

~~~
johntash
Zbox looks pretty interesting, thanks for linking it!

Have you done any performance testing to compare zbox vs xfs/ext4/zfs/whatever
on the same system? I saw the benchmark in the readme, but that doesn't
necessarily show how much overhead or loss of performance there is over the
native filesystem.

~~~
burmecia
No, I didn't. And I don't think it is necessary, as Zbox is much more like an
'application-level' fs, so it obviously can't match a system-level fs.

------
quantumwoke
Awesome idea! Recently I've been looking for a solution to automatically back
up ~GBs of scraped data that is updated daily. Is this solution trustworthy
enough? I was burned by OneDrive silently deleting data on a previous attempt.

~~~
tempay
The README states "Do not use in production" so I wouldn't trust it yet.

For ~GBs of data per day is it necessary to use something that avoids having a
full local copy? I'd have thought you could have a full local mirror backed up
with Dropbox, Backblaze, rsync, rclone, etc.

~~~
quantumwoke
Pricing and reliability are my two main problems. I have access to 1TB of
OneDrive, but after it silently deleted 40GB of irretrievable data I will never
trust it again. Pricing-wise, Dropbox, GDrive, and others seem too expensive.
Backblaze is the current frontrunner for sure, particularly with its
deduplication facilities.

------
__adrien
Reminds me of
[https://github.com/Azure/azure-storage-fuse](https://github.com/Azure/azure-storage-fuse)

The code is good, maybe you'll find inspiration there.

------
tader
Why would one prefer this over s3ql
([https://bitbucket.org/nikratio/s3ql/](https://bitbucket.org/nikratio/s3ql/))?

~~~
konschubert
Zero has a cache where it keeps the most recently accessed files. For example,
your latest 100GB of raw video recordings. Does s3ql do that? I skimmed over
their documentation and could not see it but maybe I didn't look long enough?
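
For readers unfamiliar with the idea, a cache of the most recently accessed files can be sketched as a least-recently-used (LRU) structure bounded by total size. This is an illustrative sketch only, not Zero's actual code: the `FileCache` class, its `touch` method, and the byte sizes are hypothetical.

```python
from collections import OrderedDict


class FileCache:
    """Hypothetical LRU cache of local file copies, bounded by total bytes."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.sizes = OrderedDict()  # path -> size, least recently used first
        self.evicted = []           # paths whose local copy was dropped

    def touch(self, path, size):
        """Record an access, then evict least recently used files if over capacity."""
        self.sizes.pop(path, None)
        self.sizes[path] = size  # re-insert at the "most recent" end
        while sum(self.sizes.values()) > self.capacity_bytes and len(self.sizes) > 1:
            old_path, _ = self.sizes.popitem(last=False)
            # A real system would have to confirm the file is in the cloud
            # before dropping the local copy; that step is omitted here.
            self.evicted.append(old_path)


cache = FileCache(capacity_bytes=100)
cache.touch("a.raw", 40)
cache.touch("b.raw", 40)
cache.touch("a.raw", 40)   # a.raw becomes the most recently used
cache.touch("c.raw", 40)   # over capacity: b.raw is the least recently used
```

After the last call, `b.raw` has been evicted while `a.raw` and `c.raw` stay local, since `a.raw` was accessed more recently than `b.raw`.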

~~~
tader
"S3QL splits file contents into smaller blocks and caches blocks locally." --
[http://www.rath.org/s3ql-docs/about.html#features](http://www.rath.org/s3ql-docs/about.html#features)

It allows you to configure an arbitrary cache size, I've been using it with
60GB local cache.

~~~
konschubert
But are writes and reads really 100% local or do they require synchronous
networking?

------
starkruzr
I can't see how this is used. Can we get a "here's how to set up the file
system" guide somewhere? I see the bit about the config file. Then what?

I'd primarily like to use this to back up a couple of Proxmox hosts.

~~~
konschubert
Yes, I'll add a guide and more description but please don't use it yet, it's
work in progress.

------
hilyen
I'm not seeing it mentioned, but does it support encrypting files/folders on cloud storage?

~~~
yjftsjthsd-h
Worst case, could layer with encfs or equivalent. Be _very_ careful to
understand the exact threat model that covers (for starters, it leaves you
painfully exposed to metadata issues), but it would work easily enough.

~~~
konschubert
My plan is to use it with fusecrypt for now and eventually include encryption
directly.

------
fazilakhtar
Keybase does something similar with their file system kbfs (which is
encrypted); they use FUSE too.

Local caching is limited to 10% of your disk space (if I remember correctly).

Cool project though. Will definitely keep it on my radar.

~~~
konschubert
Does Keybase sell cloud storage packages?

Btw, I really love the idea of keybase, I hope they take off.

~~~
fazilakhtar
No, they have 250GB per user. IMO that is good for backups for now till they
maybe start selling more storage?

Same here, they're doing some really good work.

------
corybrown
Been using rclone mount for something similar, curious how this compares

~~~
konschubert
I think with rclone you need as much local space as you take up in the cloud?

~~~
dpacmittal
Rclone has a cache backend in the latest version which works exactly like OP.

~~~
konschubert
Cool! Do you have a link?

------
ohiovr
Very cool idea. I will be keeping an eye on this one. 1 TB of practical
storage for 5 bucks a month. You could have a petabyte of storage for $5000 a
month!
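
The arithmetic, as a quick sketch. It assumes the flat $5 per TB per month figure above and decimal units (1 PB = 1000 TB), and it ignores any download or request charges a provider might add; the names are made up for illustration.

```python
PRICE_PER_TB_MONTH = 5   # USD, the figure quoted above
TB_PER_PB = 1000         # assumption: decimal units


def monthly_cost(terabytes):
    """Monthly storage cost in USD at a flat per-terabyte price."""
    return terabytes * PRICE_PER_TB_MONTH


petabyte_cost = monthly_cost(TB_PER_PB)
```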

------
j88439h84
rclone.org is a great tool for syncing to/from cloud storage.

------
mirceal
are there any plans to support S3?

~~~
konschubert
Hi, the back ends are in principle pluggable, so I'm very happy to incorporate
a PR for an S3 back end. I may also do it myself eventually.

I just find S3 very expensive for long-term storage of personal data.

------
alexnewman
Minio can do disk caching

------
balkierode
NFS again?

