And specifically:
Or the Red Hat-sponsored HekaFS:
I think Ori is confusing people by calling itself a file system instead of a sync/backup system. Sure, it's implemented using FUSE, but it doesn't behave like a traditional network file system.
All a fs is really is a specialized key-value store. Redis is a file system.
How exactly is data stored? Is the data stored encrypted should a device be stolen? Is it easy to "revoke" trust from devices storing parts of one's data?
Again, very little on the security of the overall system is actually described.
I hope the authors posted this or are here?
> Our data model, like that of Git, forms a Merkle tree in which the Commit forms the root of the tree pointing to the root Tree object and previous Commit object(s). The root Tree object points to Blobs and other Tree objects that are recursively encapsulated by a single hash. Ori supports signed commit objects, by signing the serialized Commit object with a public key. The Commit object is then serialized again with the key embedded into it, thus forcing subsequent signatures to encapsulate the full history with signatures. To verify the signature we recompute the hash of the Commit object omitting the signature and then verify it against the public key.
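For anyone who wants to see the shape of that, here is a rough sketch of a Git-style Merkle chain with signed commits. To be clear, this is not Ori's code or its on-disk format: the JSON serialization, the helper names (`h`, `blob`, `tree`, `commit`, `sign`, `verify`) and the field names are all made up for illustration, and HMAC is used as a stand-in for a real public-key signature so the example runs on the standard library alone.

```python
import hashlib
import hmac
import json


def h(obj: dict) -> str:
    """Hash a JSON-serialized object; sorted keys keep the hash stable."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def blob(data: bytes) -> str:
    """A blob is identified by the hash of its contents."""
    return hashlib.sha256(data).hexdigest()


def tree(entries: dict) -> str:
    """entries maps a name to the hash of a blob or of another tree."""
    return h({"type": "tree", "entries": entries})


def commit(root_tree: str, parents: list, message: str) -> dict:
    """A commit points at the root tree and at its parent commit hashes."""
    return {"type": "commit", "tree": root_tree,
            "parents": parents, "message": message}


def sign(c: dict, key: bytes) -> dict:
    # ASSUMPTION: HMAC stands in for a real asymmetric signature scheme.
    sig = hmac.new(key, h(c).encode(), hashlib.sha256).hexdigest()
    return {**c, "signature": sig}


def verify(signed: dict, key: bytes) -> bool:
    # Recompute the hash of the commit with the signature omitted,
    # then check the signature against that hash.
    unsigned = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, h(unsigned).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


key = b"demo-key"  # hypothetical key, for illustration only

t1 = tree({"notes.txt": blob(b"hello")})
c1 = sign(commit(t1, [], "first"), key)

# The parent hash covers c1 *including* its signature, so signing c2
# also commits to the signed history before it.
t2 = tree({"notes.txt": blob(b"hello, world")})
c2 = sign(commit(t2, [h(c1)], "second"), key)

tampered = {**c2, "tree": t1}  # swap the root tree without re-signing
```

Here `verify(c2, key)` succeeds and `verify(tampered, key)` fails, which is the integrity property the quote describes. It says nothing about confidentiality or revoking a device, which is what the questions above are really about.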
The filesystem they describe has a few other very interesting features that paint it as a possible replacement for something like Dropbox, or even a network filesystem like NFS. You should check it out.
I mean, we technically shouldn't trust our HDDs and should do data integrity checking at every level.
I'm interested in what they mean here by security, and what things they support beyond the normal integrity features of most modern file systems (zfs, btrfs).
If that's the main security claim, then I feel that not much is new here, or even "secure."
At the conference, and in the paper title/abstract, it did not seem that _security_ was a focal point, or even a point, of their design.
I understand already that they protect the _integrity_ of data---but I feel like that should be a given for any modern file system anyways.
I am more specifically interested in the security of the file system in general. How does it secure data in a distributed setting? Is there anything new and cool there?
1. Git for code, obviously, just not binaries larger than 25 MB. Profound network effects (GitHub, hi) will keep this true indefinitely.
2. For versioning large binaries, I go back and forth between boar and git-annex, for the desultory reason that I can't really get down with either the symlink insanity of git-annex or the old-school server/client svnish model of boar.
2. CouchDB, which I use for (1) couch apps, (2) a backup medium, and (3) the main document store for my company. Couch is great, but is emphatically not a filesystem -- you get CRUD, but few of the niceties you'd get with a 'real' FS. That's okay -- the whole point of Couch is worse-is-better network-native data storage -- but I can only be pacified so many times by a recitation of the CAP theorem. At some point, you want a POSIX-compliant filesystem, and you're willing to bite many bullets to get there, including those that Couch made a name for itself by dodging! In other words, you'll take a hit (probably on partition tolerance) in order to get, say, transactional atomicity, and so you'll toss Couch and go with a SQL database, at least for some use cases. (Don't get me wrong -- you can pry CouchDB from my cold dead paws -- but I also know when *not* to use it.)
3. There are many ways of implementing file versioning in an RDBMS, and between Concrete5, Owncloud, and the innumerable sqlite databases scattered throughout my system, I'm sure all are in use, somewhere.
4. My main file server runs btrfs, and sometimes zfs, and the excellent 'snapper' tool that comes with OpenSUSE helps me maintain and access fs-layer diffs. My home directory is regularly snapshotted in this manner.
5. For backing up the Linux machines on my network, I use rsnapshot, which uses hard links to create differential backups. It's one of those old-school solutions that just happens to work great.
6. Google Drive and Dropbox both offer versioning, so at least when I have these installed (not at the moment) I arguably have a sixth source of version history.
7. My VMs are all getting snapshotted on the daily too, and I use bedup to manage the btrfs partition where they are stored. The net effect is storage-equivalent to differential backups, although admittedly less efficient to create, at least on btrfs (deduplication occurs offline). But then again, having an image is the gold standard for backups for good reason.
8. And then there were eight. Welcome, ori. But you wouldn't happen to speak git, couch, btrfs, zfs, vmdk, and sql, would you? Because that would make my life a lot easier.
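On item 5, the link trick behind rsnapshot-style differential backups is simple enough to sketch. This is not rsnapshot's actual implementation, just a minimal illustration of the idea under my own assumptions: the `snapshot` function, the directory names and the demo files are all hypothetical. Files that are unchanged since the previous snapshot are hard-linked (same inode, no extra space), and only new or changed files get a fresh copy.

```python
import filecmp
import os
import shutil
import tempfile
from typing import Optional


def snapshot(source: str, previous: Optional[str], target: str) -> None:
    """Copy `source` into `target`, hard-linking files unchanged since `previous`."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the inode with the old snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


def inode(path: str) -> int:
    return os.stat(path).st_ino


# Demo in a throwaway directory (all names here are made up).
root = tempfile.mkdtemp()
src = os.path.join(root, "home")
os.makedirs(src)
for name, text in [("a.txt", "unchanged"), ("b.txt", "version 1")]:
    with open(os.path.join(src, name), "w") as f:
        f.write(text)

snap0 = os.path.join(root, "snap.0")
snap1 = os.path.join(root, "snap.1")
snapshot(src, None, snap0)                 # first snapshot: full copy

with open(os.path.join(src, "b.txt"), "w") as f:
    f.write("version 2, edited")
snapshot(src, snap0, snap1)                # second snapshot: links + one copy

a_shared = inode(os.path.join(snap0, "a.txt")) == inode(os.path.join(snap1, "a.txt"))
b_shared = inode(os.path.join(snap0, "b.txt")) == inode(os.path.join(snap1, "b.txt"))
with open(os.path.join(snap0, "b.txt")) as f:
    old_b = f.read()                       # the old snapshot keeps the old contents

shutil.rmtree(root)
```

Every snapshot directory looks like a full copy, but only `b.txt` costs extra space. The granularity is the whole file, which is part of why this approach handles large, frequently modified images poorly.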
Personally, I like zfs and rsnapshot. That is what I use on my NAS and it seems to work without me spending time on maintaining it after the initial setup. Why rsnapshot and not snapshots of the zfs volumes? Because I want to keep lots of backups and my understanding is that zfs loses performance after you cross into a few dozen snapshots on the type of hardware I have.
Now, I never understood what backing up your home directory is for. Why? I keep any documents on the NAS (thanks, VPN and ssh, for making life easy here), dot files on GitHub, and source code in git with origin on either GitHub or the NAS depending on whether it is private. Chat logs are going to be my one exception, but perhaps the servers will back them up for me instead, a la GChat.
I have two problems. First, there are often bizarre interactions between versioning systems when they are hosted atop one another, breaking encapsulation in unexpected ways. You can't use .vmdk snapshotting on btrfs, for example, because the performance is whatever the opposite of 'breathtaking' is. (Sighgiving?) Meanwhile, a git repo on a Dropbox share is going to chew itself to pieces the first time there's an fs-level merge conflict in .git. I could go on.
The second problem is the flip side of the first -- there is no way of importing and exporting history between versioning systems. I shouldn't have to worry about git and Dropbox: Dropbox should offer itself as a UI for git.