
Writing a FUSE filesystem in Python - StavrosK
http://www.stavros.io/posts/python-fuse-filesystem/
======
hemancuso
I write ExpanDrive, [http://www.expandrive.com](http://www.expandrive.com) -
which mounts SFTP/S3/Dropbox/Openstack Swift/Google Docs/Box on Mac/Windows &
soon Linux all in Python/FUSE.

Ctypes is the way to go for interacting with libfuse. Even better would be
speaking the FUSE protocol directly to /dev/fuse*, but that is rather
complicated by our Windows support.
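
For readers who have not used ctypes this way, here is a minimal sketch of the idea, not ExpanDrive's actual code. It assumes a libfuse 2.x shared library that `ctypes.util.find_library("fuse")` can locate; `load_libfuse` is a hypothetical helper name, and `fuse_version` is the version query exported by libfuse.

```python
import ctypes
import ctypes.util


def load_libfuse():
    """Return a ctypes handle to libfuse, or None if it cannot be loaded."""
    path = ctypes.util.find_library("fuse")  # assumption: libfuse 2.x naming
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


lib = load_libfuse()
if lib is None:
    print("libfuse not found on this system")
elif hasattr(lib, "fuse_version"):
    # Declare the return type before calling into C.
    lib.fuse_version.restype = ctypes.c_int
    print("libfuse version:", lib.fuse_version())
```

A real binding would go on to declare the `fuse_operations` struct and its callback types, which is the part that libraries such as the one in the post handle for you.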

~~~
casca
For those who don't know the space, ExpanDrive used to be SftpDrive. I've been
recommending it for years to people who are not technical for accessing sftp
shares and have never been let down. Thanks for a great product and sharing
some detail about how it works.

~~~
hemancuso
Thanks :)

------
aray
Neat project and great post. One thing I'd like to point out is that there is
_another_ mechanism for implementing userspace filesystems in Linux: V9FS.
FUSE has a lot of documentation -- much of it in various stages of bitrot --
and many of the comments here point to other FUSE projects as well!

The underlying mechanisms for FUSE are pretty convoluted and difficult, so
anything much beyond a trivial example can break the abstractions afforded by
libraries like the one used in the post.

V9FS is in contrast exceedingly simple and basically unchanged. And, yes, it's
based on the Plan 9 filesystem protocols :)

[http://v9fs.sourceforge.net/](http://v9fs.sourceforge.net/)

~~~
marbu
This is interesting. I'm aware of the fact that some plan 9 protocols were
ported to linux, but are you suggesting that this v9fs is a replacement for
fuse in common use cases? Do you know of a simple comparison of fuse and
v9fs implementations of the same userspace filesystem?

~~~
aray
V9FS and FUSE both enable userspace filesystem drivers. They are actually
pretty interchangeable for 99% of use cases, and the rest of those cases are
where you'd directly interact with both, so it wouldn't be fair to compare a
FUSE library abstraction with V9FS.

FUSE is too complicated to directly interact with, so everyone (OP included)
is using libraries and abstractions on top of it. So what you'd really be
comparing is that abstraction, as well as possibly how it limits actual FUSE
interaction. Additionally, the performance of these abstractions is hard to
predict, as many do lots of work (threads/IPC/etc) silently in the background,
so the apparent workload on your filesystem driver is far from actual
workload.

V9FS is simple enough that drivers often just speak it directly. The protocol
is a simple RPC mechanism, and parallelism happens naturally.
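
To give a feel for how small the wire format is, here is a rough sketch that builds the first message of a 9P2000 session, Tversion, by hand. The layout (little-endian `size[4] type[1] tag[2] msize[4] version[s]`, with strings carrying a 2-byte length prefix) follows the 9P2000 protocol; the function names and the `msize` default are illustrative assumptions, not part of any library.

```python
import struct

TVERSION = 100   # 9P2000 message type for Tversion
NOTAG = 0xFFFF   # tag used for version negotiation


def pack_string(text):
    """Encode a 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def build_tversion(msize=8192, version="9P2000"):
    """Build a Tversion message: size[4] type[1] tag[2] msize[4] version[s]."""
    body = struct.pack("<BHI", TVERSION, NOTAG, msize) + pack_string(version)
    # The size field counts itself, hence the extra 4 bytes.
    return struct.pack("<I", 4 + len(body)) + body


def parse_header(message):
    """Return (size, type, tag) from the fixed 7-byte message header."""
    return struct.unpack_from("<IBH", message)


message = build_tversion()
print(len(message), parse_header(message))
```

Every other request has the same shape: a size, a type byte, a tag that pairs the reply with the request, then type-specific fields. Because replies are matched by tag, a client can have several requests outstanding at once.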

As far as a direct comparison example, I don't know of one.

------
seiji
I did one of these a while ago that lets you mount a redis server as a local
filesystem:
[https://github.com/mattsta/redisfuse](https://github.com/mattsta/redisfuse)

It's a great way to do FS-like things ("edit X in vim", "run aspell across all
keys in Y") on very non-file-like objects.

~~~
StavrosK
Oh, wow, that's a fantastic idea! Well done, I will play around with it. I
love the idea of exposing disparate datasets as filesystems.

~~~
cdjk
You should take a look at Plan 9. Everything is a file, and services are
implemented as file servers - including the windowing system (rio).

Some of the ideas are available in Plan 9 from User Space.

~~~
StavrosK
I've seen that, it sounded like a great idea to me when I was looking at it.
Too bad Plan 9 never caught on.

------
notacoward
FWIW, you can also write a GlusterFS "translator" in Python.

[https://forge.gluster.org/glupy](https://forge.gluster.org/glupy)

This allows you to add your own functionality alongside everything GlusterFS
already has - e.g. distribution, replication, handling for all sorts of
annoying VFS/POSIX special cases - instead of having to do everything from
scratch yourself. I'm not saying it's the right option for everyone who might
just use FUSE directly, but it might be an option to consider.

Disclaimer: I'm the original author, though others have taken over since.

------
lamby
Another "me too", but I really enjoyed writing this:
[https://chris-lamb.co.uk/projects/aptfs](https://chris-lamb.co.uk/projects/aptfs)

~~~
StavrosK
Hey, I love "me too"s, they are all interesting projects. That's a very novel
way of managing packages, does it work if you copy all the symlinks out and
then restore them (so you can back up/restore your installed packages)?

~~~
lamby
It's simply a view of "remote" APT repositories, not your local package setup.

~~~
voltagex_
That's a really cool utility - can it be used with a caching proxy to reduce
network traffic?

~~~
lamby
Yes, assuming that caching proxy is a regular APT proxy.

------
frio
What kind of performance did you get out of this? I wrote a naive
implementation of a FUSE filesystem in Python a while back -- it read the
sickbeard DB and built a directory of the last snatched programs. The
performance was pretty shocking, but workable for a single user. I always
assumed that was down to Python, so I'd be interested to hear how well yours
performs.

~~~
StavrosK
I haven't tried it at all, since it just needs to be faster than my Internet
connection (it's for backups), but you can just run the file provided there
and it should work properly. Please post a benchmark if you do (I'm not at
home now)!

~~~
frio
I'll give it a try tonight :)!

------
wizzardy
If anyone is interested in seeing a real-world open-source FUSE example (but
written in C), take a look at our project RioFS
([https://github.com/skoobe/riofs](https://github.com/skoobe/riofs)): a
userspace filesystem for Amazon S3 buckets that runs on Linux and MacOSX
(FreeBSD support is coming as well).

~~~
StavrosK
Yes! Fantastic, thank you, I've been looking for something like that.

------
bambambazooka
Fun fact: bup has a fuse module to access saved files:
[https://github.com/bup/bup/blob/master/Documentation/bup-
fus...](https://github.com/bup/bup/blob/master/Documentation/bup-fuse.md)

~~~
StavrosK
That's pretty great, I think I will use bup when EncFUSE is done, rather than
rdiff-backup. It's also very actively maintained, which is great.

~~~
IgorPartola
I found rsnapshot to be far superior.

~~~
StavrosK
Interesting, in what way?

~~~
IgorPartola
It requires less wheel inventing as it is mostly declarative. It provides a
higher level of abstraction vs rdiff-backup. Finally, it is more like Apple's
Time Machine since it uses hardlinks to deduplicate files. This means that you
can browse all your backups in parallel and do not have to worry that one of
the incremental backups being corrupt means that all subsequent ones are
corrupt as well.
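
A small sketch of the hardlink idea, for anyone who has not seen it. This is not rsnapshot's implementation, only an illustration under simple assumptions: regular files only, and a hypothetical `snapshot` helper that links a file to the previous snapshot's copy when the contents are identical and copies it otherwise.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Copy source into target, hardlinking files unchanged since previous."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(target, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.join(target, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)        # unchanged: share the same inode
            else:
                shutil.copy2(src, new)   # new or changed: store a full copy


# Demo in a throwaway directory: two snapshots, one file changed in between.
base = tempfile.mkdtemp()
src_dir = os.path.join(base, "data")
os.makedirs(src_dir)
for name, text in (("a.txt", "same"), ("b.txt", "old")):
    with open(os.path.join(src_dir, name), "w") as handle:
        handle.write(text)

snap0 = os.path.join(base, "snap.0")
snap1 = os.path.join(base, "snap.1")
snapshot(src_dir, None, snap0)
with open(os.path.join(src_dir, "b.txt"), "w") as handle:
    handle.write("new")
snapshot(src_dir, snap0, snap1)

a_shared = os.path.samefile(os.path.join(snap0, "a.txt"), os.path.join(snap1, "a.txt"))
b_shared = os.path.samefile(os.path.join(snap0, "b.txt"), os.path.join(snap1, "b.txt"))
print(a_shared, b_shared)
```

Each snapshot directory is a complete, browsable tree, and deleting an old one never affects the others, because a file's data stays on disk until its last link is removed.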

~~~
StavrosK
That's good to know, thank you. It looks like the only drawback is that rdiff-
backup stores file diffs as well, so rsnapshot will have to store the entire
file again if you change one byte in it, but, since most of my files will
never change, it sounds very good for my use case, thank you.

~~~
IgorPartola
Yes. Another drawback is the lack of encryption support, which may or may not
be easy to work around.

~~~
StavrosK
Well, I'll be using the EncFUSE script from this thread to provide the
encryption, so it should be quite easy to work around it (just point it to the
encrypted FUSE mountpoint).

------
gpsarakis
Nice post! You mentioned that you implemented this wrapper for backups. Apart
from creating a virtual file system for sandboxing perhaps, isn't this
generally slower? Maybe I am not getting the exact purpose of this task.

~~~
StavrosK
Slower than what? This is for creating encrypted versions of your files before
you back them up, so your backup program will only see encrypted files.

------
nickthemagicman
Ha working on a fuse file system for one of my projects in grad school.

------
j_s
Does encryption prior to backup mean less de-duplication?

~~~
StavrosK
It does. encbup uses CBC mode, which means that you'll only have per-file
deduplication, but EncFUSE uses the less secure ECB mode, which means
per-block deduplication (almost as good as unencrypted).
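
To illustrate why the mode matters for deduplication, here is a toy sketch. It does not use the ciphers from encbup or EncFUSE: `toy_encrypt_block` is a made-up 4-round Feistel network built on SHA-256, included only so the example runs with the standard library, and it is not secure. The input length is assumed to be a multiple of the block size, so no padding is handled.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes


def toy_encrypt_block(key, block):
    """Toy 4-round Feistel cipher over one 16-byte block. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def blocks(data):
    """Split data into BLOCK-sized chunks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb(key, data):
    """ECB: every block is encrypted independently."""
    return b"".join(toy_encrypt_block(key, b) for b in blocks(data))


def cbc(key, data, iv):
    """CBC: each block is XORed with the previous ciphertext block first."""
    out, prev = [], iv
    for b in blocks(data):
        prev = toy_encrypt_block(key, bytes(x ^ y for x, y in zip(b, prev)))
        out.append(prev)
    return b"".join(out)


key = b"k" * 16
plaintext = b"A" * BLOCK * 4  # four identical plaintext blocks
ecb_unique = len(set(blocks(ecb(key, plaintext))))
cbc_unique = len(set(blocks(cbc(key, plaintext, os.urandom(BLOCK)))))
print("distinct ciphertext blocks, ECB:", ecb_unique, "CBC:", cbc_unique)
```

Under ECB, identical plaintext blocks always produce identical ciphertext blocks, so a block-level deduplicator can still collapse them. That same property is what makes ECB leak structure. Under CBC with a fresh IV, a changed block changes everything after it, so only whole-file duplicates can be detected, and only if the IV is reused or derived from the content.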

