
From Filesystems to CRUD and Beyond - ingve
https://www.cloudatomiclab.com/post-crud/
======
mwcampbell
I'm aware of one filesystem implemented on top of S3 using FUSE that makes a
serious effort to be fully POSIX-compliant: ObjectiveFS [1]. It doesn't map
files directly to S3 objects, so you can't access your ObjectiveFS data
directly using other tools. As I understand it, it treats S3 objects more like
the blocks in a log-structured filesystem. I've successfully used it to run
applications that aren't "cloud-native" in an environment where VMs are (at
least theoretically) ephemeral.

[1]: [https://objectivefs.com/](https://objectivefs.com/)
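
To make the "S3 objects as blocks in a log-structured filesystem" idea concrete, here is a toy sketch. It is not ObjectiveFS's actual design or on-disk format, which this comment does not describe; the class name, the key layout and the in-memory dict standing in for the object store are all assumptions made up for illustration.

```python
class LogStructuredStore:
    """Toy model: blocks are appended as immutable objects; an index points at the newest."""

    def __init__(self):
        self.objects = {}   # stands in for the object store: key -> bytes
        self.index = {}     # (path, block_no) -> key of the latest version
        self.next_seq = 0   # monotonically increasing log position

    def write_block(self, path, block_no, data):
        # Never overwrite an existing object; always append a new one to the log.
        key = f"log/{self.next_seq:08d}"
        self.next_seq += 1
        self.objects[key] = data
        self.index[(path, block_no)] = key
        return key

    def read_block(self, path, block_no):
        # Follow the index to whichever object holds the latest version.
        return self.objects[self.index[(path, block_no)]]


fs = LogStructuredStore()
first_key = fs.write_block("/etc/motd", 0, b"old contents")
second_key = fs.write_block("/etc/motd", 0, b"new contents")
latest = fs.read_block("/etc/motd", 0)
```

Note that the object keys say nothing about file names, which is consistent with the point above that other tools cannot read the data directly. A real system would also need to persist the index and garbage-collect superseded objects.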

------
gumby
BTW Multics didn't have the first filesystem; its design was influenced by
experience with earlier systems like CTSS, also developed at MIT. The AI lab's
ITS OS also had a filesystem, though inferior in many regards to Multics's
design.

An interesting side point relevant to the article: the original intent of
Multics was that _pages_ would be the basic data structure, with the
filesystem essentially merely a way to keep track of groups of pages when they
were not otherwise in use or when a way was needed to refer to them. (The
actual initial implementation fell short of this ideal.)

~~~
justincormack
(author here) Thanks. I found it quite hard to find references to early
filesystem papers; I must try harder.

~~~
gumby
Much of that early information may not be digitized.

~~~
blattimwind
Even if it is, it may not be widespread knowledge. E.g. WoFS, the first
embodiment of many ideas found in ZFS and friends, is virtually unknown,
despite being developed in the late 80s, and with documents available online.

------
KaiserPro
"so in summary, don’t use filesystems for large distributed systems."

Don't use a _normal_ filesystem for large distributed systems.

Being able to seek() on a file (without having to download it first) is
something that is very underrated these days.
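
A minimal Python sketch of what seek() buys you on an ordinary file: only the requested bytes are read, however large the file is. The helper name `read_range` and the throwaway demo file are made up for illustration.

```python
import os
import tempfile


def read_range(path, offset, length):
    """Read `length` bytes starting at `offset` without reading the rest of the file."""
    with open(path, "rb") as f:
        f.seek(offset)         # jump straight to the byte offset
        return f.read(length)  # read only the requested bytes


# Build a throwaway 1 MiB file so the example is self-contained.
size = 1024 * 1024
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (size - 5) + b"hello")
    demo_path = tmp.name

tail = read_range(demo_path, size - 5, 5)  # the last five bytes of the file
os.remove(demo_path)
```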

Just like databases, clustered filesystems also conform to CAP. You need to
pick which part of the triangle you need.

I really like GPFS, as it has lots and lots of hooks (like S3) for events. It
also has "HSM" which allows you to optimise which part of CAP you want based
on arbitrary parameters (for example, I've seen a rule that said: if the file is
an image, larger than 1000 pixels, and red, put it on long-term storage).

~~~
zokier
> just like databases, clustered filesystems also conform to CAP

Instead of being like databases, I like to think of filesystems (clustered or
not) as one type of database. Of course, that is more of a mental trick than
a deep insight into anything.

------
blt
I don't really understand this post. It reads like a neutral description of
differences between kv/content addressed storage and file systems, but then
concludes that you shouldn't use file systems for distributed systems. Further
motivation is needed.

------
toolslive
> I quite like the term “value store” for these.

[https://en.wikipedia.org/wiki/Content-addressable_storage](https://en.wikipedia.org/wiki/Content-addressable_storage)

~~~
gilbetron
He literally talked about CAS right before that. Here's the whole quote: 'The
core part of git is a content addressed store. I quite like the term “value
store” for these.'
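
For anyone unfamiliar with the idea, a content-addressed ("value") store can be sketched in a few lines of Python. This is a toy in-memory version that hashes the raw bytes with SHA-256; it is not git's object format, which uses its own header and hashing scheme. The class and method names are made up for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store: the key is the hash of the value."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data  # identical content always lands on the same key
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Integrity check: re-hash the value and compare it with its address.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored object does not match its address")
        return data


store = ContentStore()
k1 = store.put(b"hello world\n")
k2 = store.put(b"hello world\n")  # storing the same bytes again yields the same key
```

There is no update operation: changing a value produces a different key, which is why "value store" is a reasonable name for it.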

------
blattimwind
Hash-based cache-busting is pretty much CAS, and a widespread practice. Using
CAS for user uploaded content is somewhat common, too.
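
A rough sketch of hash-based cache-busting in Python, assuming the common convention of embedding a short content digest in the filename. The function name `busted_name` and the 8-character digest length are arbitrary choices for illustration, not any particular build tool's behaviour.

```python
import hashlib
from pathlib import PurePosixPath


def busted_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension, so the URL changes with the content."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(filename)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


v1 = busted_name("static/app.js", b"console.log(1);")
v2 = busted_name("static/app.js", b"console.log(2);")  # different content, different name
```

Because the name is derived from the content, a given URL never changes its meaning and can be cached indefinitely, which is the CAS property.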

------
bitwize
uhhhhh

CRUD has been around forever. Like literally, since the mainframes.

