
Show HN: ZFS Implementation in Python - alcari
https://github.com/alcarithemad/zfsp
======
lunixbochs
Note: this was implemented without referencing any ZFS source code and should
not be subject to the CDDL.

~~~
josteink
So we port this python to something not slow, and all the kernel-people can
shut up about ZFS being terrible ;)

~~~
j88439h84
Don't forget, PyPy is fast.

~~~
cure
PyPy is faster than Python, yes. But Go, C and many other (compiled) languages
are way faster than PyPy. Plus, if you use a language like Go or Rust then you
avoid Python's GIL and you'll have much more reasonable memory usage. Best of
all, deploying is a matter of copying a binary, rather than having to deal
with the absolute disaster that is Python packaging.

~~~
nine_k
Go? A GC'd language in kernel? (Well, yes, this has been done, from Lua to
Haskell, but only experimentally.)

~~~
weberc2
Python is also a GC’d language...

~~~
twa927
CPython is mostly reference-counted.

~~~
int_19h
With a synchronous garbage collector for cycles. Which is like the worst of
both worlds, since you get the constant overhead of refcounting, plus
unpredictable interruptions of unspecified duration that can happen every time
a new object that might contain references to other objects is created.

To be fair, the GC can be disabled. But it's only safe to do so when you know
there are no cycles, and even when such guarantee can be had for your own
code, I've never seen a library guarantee that to API clients.
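
A minimal sketch of that behaviour, assuming CPython (other interpreters such as PyPy manage memory differently, so the result may not hold there). `Node`, `make_cycle` and `demo` are hypothetical names made up for illustration; only `gc` and `weakref` are real standard-library modules.

```python
import gc
import weakref


class Node:
    """Tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Build a two-node reference cycle and return a weak reference to one node."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


def demo():
    gc.collect()   # start from a clean slate
    gc.disable()   # turn off the automatic cycle collector
    try:
        ref = make_cycle()
        # Refcounting alone cannot free the cycle: each node keeps the other alive.
        alive_without_gc = ref() is not None
        # An explicit collection still runs while automatic collection is disabled.
        gc.collect()
        alive_after_collect = ref() is not None
    finally:
        gc.enable()
    return alive_without_gc, alive_after_collect
```

With the collector disabled the cycle stays alive until something calls `gc.collect()` explicitly, which is why disabling it is only safe when you know nothing creates cycles.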

------
ivanbakel
Pierre Menard, author of The Filesystem.

But I'm surprised this is possible without a specification - how can you test
a filesystem through hexdumps? The effects of some operations are going to be
pretty far-reaching, surely?

~~~
aerovistae
One of my favorite short stories, and nobody else has ever read it. So glad to
see someone else reference it.

~~~
hueving
Link?

~~~
iiv
The original title is "Pierre Menard, Author of the Quixote":
[http://www.coldbacon.com/writing/borges-
quixote.html](http://www.coldbacon.com/writing/borges-quixote.html)

------
mfsch
Does someone know whether it would be legal for someone to go through the ZFS
code and write a specification of the features this author hasn’t figured out
yet? I.e. could someone write a detailed description of the missing
functionality that doesn’t include any details about the implementation so
other people can implement it in non-CDDL code?

~~~
atomicwrites
That's called a clean room implementation and was the standard way to make
x-compatible products (for example, the BIOS on an IBM PC clone). Not
sure what the current legal standing of that method is.

EDIT: Ninjad because I left the reply in a tab without posting.

~~~
zymhan
Reverse engineering is legal in the US, but you had better have detailed
records proving no one who knew the insides of the original product ever
influenced the clone. And be prepared to explain that in court.

------
AlexanderDhoore
What?! How?! Why?!

This is the greatest thing ever. I wish I could just write code for the fun of
it. Every time I wonder whether people will use it and give up before I even
get started.

~~~
mirceal
why not? writing code can fall into one of a few buckets. one of them is play.

~~~
anon4242
Exactly! Write code that you find interesting and/or need for something and
then share it. If someone uses your stuff, then great, if not, at least you've
become a slightly better programmer! It's a win-win!

------
lelf
No ARC/L2ARC?

Edit: of course not. This is actually just a ZFS user-facing "front-end",
not a _ZFS implementation_.

~~~
lunixbochs
It's capable of doing IO against a real ZFS array without any other code. ARC
is an implementation detail and not necessary for correctness. If you removed
ARC from ZoL it would still work, just slower. ARC is far from the most
interesting milestone for a reimplementation effort because an ARC
implementation doesn't need to be anything like the Sun version internally, as
long as it offers similar performance.
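
To illustrate the "not necessary for correctness" point, here is a rough sketch with a toy in-memory block device and a plain LRU read cache standing in for ARC. This is not ARC and not code from zfsp or ZoL; `BlockDevice`, `CachedDevice` and `compare` are hypothetical names invented for the example.

```python
from collections import OrderedDict


class BlockDevice:
    """Hypothetical in-memory stand-in for a disk: blocks addressed by number."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.reads = 0  # how many reads actually reached the "disk"

    def read(self, n):
        self.reads += 1
        return self.blocks[n]


class CachedDevice:
    """Read-through LRU cache in front of a BlockDevice (plain LRU, not ARC)."""

    def __init__(self, device, capacity):
        self.device = device
        self.capacity = capacity
        self.cache = OrderedDict()

    def read(self, n):
        if n in self.cache:
            self.cache.move_to_end(n)       # mark as most recently used
            return self.cache[n]
        data = self.device.read(n)          # miss: fall through to the device
        self.cache[n] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data


def compare(pattern, capacity=2):
    """Replay one read pattern with and without the cache.

    Returns (same_data, uncached_device_reads, cached_device_reads).
    """
    blocks = {i: bytes([i]) * 4 for i in range(8)}
    raw = BlockDevice(dict(blocks))
    cached = CachedDevice(BlockDevice(dict(blocks)), capacity)
    same = [raw.read(n) for n in pattern] == [cached.read(n) for n in pattern]
    return same, raw.reads, cached.device.reads
```

Calling `compare([0, 1, 0, 1, 2, 0])` returns identical data on both paths; the cache only changes how many reads hit the device (6 without it, 4 with it).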

This project is cool not because you're going to run the Python in your kernel
today, but because someone can use it as a documented reference implementation
of all of the data structures and transactions that is not covered by the
CDDL, so another implementation based on this can live in the Linux kernel
without problem.

~~~
hnlmorg
If the GPLv2 GRUB ZFS code[1] wasn't enough to get someone started then I
doubt this will make any difference in porting ZFS to GPL given there would be
more work involved in turning this into a usable kernel driver.

Not taking anything away from the work that the author has done though. It's a
nice project. I just think a little pragmatism is needed before we get carried
away with the ZFS GPL comments.

[1] [https://blogs.oracle.com/solaris/zfs-under-gplv2-already-
exi...](https://blogs.oracle.com/solaris/zfs-under-gplv2-already-exists-no-
kidding-v2)

------
4oo4
This is awesome! Do you plan on blogging anything about how you went about
reverse-engineering?

------
PaulHoule
i love userspace implementations of filesystems.

note that the issues are entirely different from those with a kernel
implementation since you aren't having to think about page cache et al.

~~~
burmecia
There is a userspace filesystem implementation, ZboxFS:
[https://github.com/zboxfs/zbox](https://github.com/zboxfs/zbox).

------
gaze
Hell yeah, dude. This is awesome.

------
foxhop
Hey alcari!

I know you from ICV - we used to hang out online on the forums and IRC.

Nice work on this project, I'm looking forward to diving into the codebase!

~~~
rashkov
Just curious, what is / was ICV?

~~~
foxhop
It was / is a community of people who formed around a book called 1337 h4x0r
h4ndb00k by tapeworm.

ICV was the name of the forum for the book, now defunct (icodeviruses.com)

[https://www.amazon.com/1337-h4x0r-h4ndb00k-tapeworm/dp/06723...](https://www.amazon.com/1337-h4x0r-h4ndb00k-tapeworm/dp/0672327279)

------
RantyDave
This is all kinds of funny. I'm awash with awe and admiration.

------
adamnemecek
There's also TFS, a ZFS-inspired FS in Rust: [https://github.com/redox-
os/tfs](https://github.com/redox-os/tfs)

------
garmaine
Simultaneously fucking awesome (that you pulled it off at all), and fucking
useless (performance...).

Thanks for sharing though. Maybe could be useful in making a suite of zfs
inspection tools?

Is the OP here? How difficult would it be to create a ZFS reshaping tool, allowing
for the offline expansion of a vdev?

