
Casync – A tool for distributing file system images - Nekit1234007
http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html
======
tkfu
I'm not sure I buy the embedded/IoT use case; OSTree is a really good model
there and is more featureful. The "well, if your filesystem image delta
happens to be in the form of a lot of very small files it's not so great for
CDNs" doesn't strike me as a terribly good reason to give up everything OSTree
gives you (especially with stuff like the meta-updater [1] Yocto integration).

[1] [https://github.com/advancedtelematic/meta-updater](https://github.com/advancedtelematic/meta-updater)

(Full disclosure: I work for Advanced Telematic, the creators and maintainers
of the meta-updater Yocto layer.)

~~~
poettering
Well, I am pretty sure IoT devices should be designed with security in mind,
and that means that they need to be protected against offline modification.
And that's something OSTree can't really deliver, but dm-crypt can. And casync
works pretty well for delivering dm-crypt enabled disk images.

I think OSTree is great — but for embedded devices that are installed in the
wild, humm, uh, I don't think so? I am pretty sure there are better options
than that.

~~~
thinkMOAR
Please elaborate on 'need to be protected against offline modification'?

~~~
poettering
Think of cell towers or wind power turbines: they both are primary hacking
targets in today's world, and they are placed in the wild, in uncontrolled and
unprotected locations. This means more or less anybody can just walk by,
temporarily cut the power source, take the harddisk out, plug it into their
hacking laptop, install an OS trojan on it, place it back into the original
device and restore the power. From the PoV of the cell company or the power
company this was just a short power cut, and nothing changed. In reality the
system was just hacked. And in order to protect yourself against that OSTree
can't help you, because disk accesses aren't validated. The only validation
takes place during downloading. dm-verity OTOH will protect every single
access, and if deployed properly then such "offline" modifications to the OS
will result in the device not booting anymore, which is much preferable over
accepting that the device was hacked with no scheme to detect it.
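
To make the contrast concrete, here is a minimal Python sketch of verify-on-every-read versus verify-once-at-download. It is a toy, not dm-verity's actual on-disk format: the flat one-level hash table, `BLOCK_SIZE`, and the `VerifiedImage` class are all assumptions made up for illustration (dm-verity itself uses a multi-level hash tree inside the kernel).

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, chosen for this sketch


def block_hashes(image):
    """Return one SHA-256 digest per BLOCK_SIZE block of the image."""
    return [hashlib.sha256(image[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(image), BLOCK_SIZE)]


def root_hash(hashes):
    """Collapse the per-block digests into a single trusted value."""
    return hashlib.sha256(b"".join(hashes)).digest()


class VerifiedImage:
    """Hypothetical wrapper that checks every block each time it is read."""

    def __init__(self, image, hashes, trusted_root):
        # The hash table itself is untrusted storage, so check it against
        # the root, which is assumed to come from a trusted source.
        if root_hash(hashes) != trusted_root:
            raise ValueError("hash table does not match trusted root")
        self.image = image
        self.hashes = hashes

    def read_block(self, index):
        start = index * BLOCK_SIZE
        data = bytes(self.image[start:start + BLOCK_SIZE])
        # Verification happens on access, not only at download time.
        if hashlib.sha256(data).digest() != self.hashes[index]:
            raise IOError(f"block {index} failed verification")
        return data
```

Flipping a single byte of the image after it was "downloaded" makes `read_block` raise for that block, while untouched blocks still read fine. A download-only check would never notice the change.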

And it's not just cell towers or wind power turbines: pretty much any device
which is around people not unconditionally trusted needs to be protected
against such offline modifications. In fact, people who today build cars, TVs,
surveillance cameras or anything else like that and do not deploy dm-verity in
some form, to make sure the devices cannot be modified offline without anyone
noticing, are just participating in turning IoT into the Internet of Shit.

~~~
thinkMOAR
But physical access == game over? Whatever software layer you add imho.

Wouldn't it be easier to simply dunk the whole device in some epoxy preventing
access to the hardware with some anti-tamper deadman switch?

~~~
poettering
Trusted boot and TPMs with remote attestation exist precisely to ensure that
physical access does not mean game over. It's all there, people just need to
make use of it in their systems. And yes, trusted boot and TPMs have issues, but
without all this the attack surface is massive, and I think needlessly so.

~~~
thinkMOAR
(Trusted boot and TPMs are afaik already compromised, although you need to
bring a near rocket scientist to do it.)

I will always think physical access is game over, whatever 'rocket science' or
re-invented old principles people come up with software-wise. I'm not sure, but
that probably holds for hardware too; software is just easier to mangle.

And indeed yes, security is layers, layers that make an attack more difficult,
and having many layers to choose from is great.

Also, I hadn't really heard of OSTree before; I'm reading up on both for a
future project.

------
dom0
If you read the internals description it could just as well be about Borg,
very similar principles here, though the application is very different.

By the way, both buzhash and SHA-256 are kinda poor choices for a new system,
especially one that targets servers.

~~~
rdtsc
Yep, Borg is the first thing I thought of. It already does a lot of this and
more: encryption, configurable encoding, a rolling hash computed by the
Buzhash algorithm and so on.
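
For anyone who hasn't seen content-defined chunking before, a rough Python sketch of the rolling-hash idea follows. The table seed, window size, mask width, and size limits are arbitrary values picked for the example, not Borg's or casync's actual parameters, and `split_chunks` is a hypothetical helper.

```python
import random

WINDOW = 48            # bytes covered by the rolling hash (assumed value)
MASK32 = 0xFFFFFFFF

# Fixed pseudo-random substitution table: one 32-bit word per byte value.
_rng = random.Random(0)
TABLE = [_rng.getrandbits(32) for _ in range(256)]


def rotl(x, n):
    """Rotate a 32-bit word left by n bits (n is taken modulo 32)."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def split_chunks(data, mask_bits=10, min_size=256, max_size=8192):
    """Cut data where the buzhash of the last WINDOW bytes hits the mask."""
    mask = (1 << mask_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Roll the new byte in.
        h = rotl(h, 1) ^ TABLE[byte]
        pos = i - start + 1  # length of the current chunk so far
        if pos > WINDOW:
            # Roll out the byte that just left the window; by now it has
            # been rotated WINDOW times in total.
            h ^= rotl(TABLE[data[i - WINDOW]], WINDOW)
        if (pos >= min_size and (h & mask) == 0) or pos >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because a cut depends only on the bytes inside the window, inserting a few bytes at the front of a file shifts everything but leaves most chunk boundaries, and therefore most chunks, unchanged. That is what lets a store deduplicate across versions.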

Maybe it wasn't geared for CDN delivery during restores but otherwise I've
been impressed by borg so far (haven't deployed it in production, only played
with it locally though).

[https://github.com/borgbackup/borg](https://github.com/borgbackup/borg)

This is a description of the internal design:

[http://borgbackup.readthedocs.io/en/stable/internals.html](http://borgbackup.readthedocs.io/en/stable/internals.html)

~~~
dom0
The "latest" version of that page has seen significant additions:
[http://borgbackup.readthedocs.io/en/latest/internals.html](http://borgbackup.readthedocs.io/en/latest/internals.html)

~~~
rdtsc
Thanks. That is a better reference indeed. It's got nice diagrams as well.
Can't edit my post any longer, so hopefully others will just see your message.

------
tomfitz
Great. The chunked model (inspired by Borgbackup/Tarsnap) seems preferable to
Docker layering, and diff-based approaches.

As far as I can tell, the advantages compared to Borgbackup seem to be:

* casync offers control over which FS metadata is included

* casync, the server, exposes chunks over HTTP

* casync, the library, is written in C so is more easily used by systems software.

I'm betting we'll see machinectl integration. Excellent!

~~~
tomfitz
Oops. Just realised my comment contains a mistake.

casync does not act as a server. Its on-disk representation and client
behaviour are designed in such a way that the server need only serve static
files. This makes deployment easy.
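
A minimal Python sketch of why static files are enough, under some loud assumptions: the directory layout, `chunk_path`, `publish`, and `fetch` are all invented for illustration and are not casync's real format or API, and a local directory stands in for the HTTP server.

```python
import hashlib
import shutil
from pathlib import Path


def chunk_path(store, digest):
    """Hypothetical layout: <store>/<first two hex digits>/<full digest>."""
    return Path(store) / digest[:2] / digest


def publish(store, chunks):
    """Write each chunk under its own hash and return the index (digest list)."""
    index = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        path = chunk_path(store, digest)
        if not path.exists():  # identical chunks are stored only once
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(chunk)
        index.append(digest)
    return index


def fetch(index, remote, cache):
    """Rebuild an image from an index, downloading only the missing chunks."""
    downloaded, parts = 0, []
    for digest in index:
        local = chunk_path(cache, digest)
        if not local.exists():
            local.parent.mkdir(parents=True, exist_ok=True)
            # Stand-in for a plain HTTP GET of a static file.
            shutil.copyfile(chunk_path(remote, digest), local)
            downloaded += 1
        data = local.read_bytes()
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"chunk {digest} is corrupt")
        parts.append(data)
    return b"".join(parts), downloaded
```

All the logic (which chunks are needed, which are already cached, whether a chunk is intact) lives in the client. The "server" only has to hand out files by name, so any web server or CDN would do.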

------
RachelF
All these great Unix based tools make me wish I did not have to work on
Windows Servers all day.

------
the_arun
What is the difference between Casync & rclone
([https://rclone.org](https://rclone.org))?

~~~
striking
rclone is a nice cloud cloning solution for files and folders. casync is
intended to clone and delta entire filesystems, in a way that makes them
nicely deployable. rclone is very cloud focused while casync says nothing of
the details of how images are served.

casync also has fs composition, a multitude of recorded file attributes,
automatic reflinking/hardlinking, uid/gid shifting, and so much more.

tl;dr: rclone is for files, casync is for entire filesystems/deployments.

~~~
the_arun
Thanks for explaining

------
peterwwillis
I think this is a very useful tool if working with arbitrary file changes on
block devices, but it's still very low level and would need a crapton of
modification/wrapping to make it useful in a complex system. I would rather
use Kickstart or the like to distribute changes intelligently, or barring
that, rsyncing hardlinked directory trees, or zsync (but RPM/Yum would really
be ideal due to the features gained)

------
JustinGarrison
I'd be really interested in creating full disk images and then cloning to
another disk. If that is a use case (I think I skimmed the post correctly) it
could be very useful and more performant than dd and similar disk cloning tools.

------
nwmcsween
It sounds like a sort of half-done distributed filesystem; there are a lot of
similarities.

------
abhineet97
This sounds like a centralized BitTorrent.

