
4Q: the final archive format - bpierre
https://github.com/robey/4q
======
creshal
Final? Really? What if I need xattrs and Posix ACLs (or I use Windows and want
NT ACLs and streams; or forks under OSX; …)? Hard-coded encryption algorithms
also don't seem particularly future safe.

~~~
keenerd
No forward error correction either. It can't be considered "final" without
some form of bitrot protection.

And no, storing it on a ZFS or BTRFS volume with error recovery enabled does
not count. (They don't use FEC, they use 1960s triplicate storage. Hugely
wasteful of space, does nothing to protect against transmission errors and can
still be corrupted by two of the exact wrong bits being damaged.)

Storing it on a medium that does use FEC also does not count. I want a per-file
tunable FEC knob, not one vendor-determined setting. And as history has shown,
it needs to be done through FOSS code and not trade secret firmware.
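
To make the "tunable knob" idea concrete, here is a minimal sketch of the simplest possible erasure code: one XOR parity block per group of data blocks. It is not what any real archiver does (real FEC schemes such as Reed-Solomon tolerate more than one loss per group), and the names `add_parity`, `repair`, `block_size` and `group_size` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def add_parity(data: bytes, block_size: int = 4, group_size: int = 3):
    """Split data into blocks and append one XOR parity block per group.

    group_size is the hypothetical knob: smaller groups cost more space
    but survive more scattered losses.
    """
    padded = data + b"\0" * (-len(data) % block_size)
    blocks = [padded[i:i + block_size] for i in range(0, len(padded), block_size)]
    groups = []
    for i in range(0, len(blocks), group_size):
        group = blocks[i:i + group_size]
        groups.append(group + [xor_blocks(group)])
    return groups


def repair(group):
    """Rebuild at most one missing block (marked None) in a group."""
    missing = [i for i, block in enumerate(group) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity cannot recover two lost blocks")
    if missing:
        group[missing[0]] = xor_blocks([b for b in group if b is not None])
    return group


groups = add_parity(b"hello archive!", block_size=4, group_size=3)
groups[0][1] = None  # simulate losing one block to bitrot
repair(groups[0])
```

With `group_size=3` the overhead is one extra block per three; a second lost block in the same group is unrecoverable, which is why a per-file setting would matter.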

~~~
gtirloni
Is there any FS that implements FEC?

~~~
creshal
Not that I'm aware of. The device layer usually does it already and it's
_supposed_ to be sufficient.

~~~
gwern
> it's supposed to be sufficient.

Supposed to. If you do any backups to DVD/BD and still want to get your data
back in 5 or 10 years, though, you'd be well advised to do _some_ sort of FEC:
burn multiple copies of each disc, generate a bunch of PAR2 files, whatever.

(You might want to do that for backups on hard drives too. Yeah, maybe the
hard drive firmware is supposedly taking care of any errors below the block
level and you're not too worried about bitflips, but that just means you'll
lose entire blocks and files when you lose something.)

------
al2o3cr
The README implies that `tar` is "not streamable". Someone needs a history
lesson on what it was originally used for...

~~~
FooBarWidget
Tar is not streamable _when compressed with gzip_. 4Q supports compression on
a per-file basis and thus can support streaming and compression at the same
time. Zip also supports compression on a per-file basis, but it requires an
index and thus is not streamable.
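
A rough illustration of the zip index point, using only Python's standard `zipfile` and `struct` modules: the end-of-central-directory record sits at the very end of the file and points back at the index, so a reader that relies on it has to see the tail first. The sketch assumes a small archive with no comment and no Zip64 records.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", b"first file")
    zf.writestr("b.txt", b"second file")
raw = buf.getvalue()

# With no archive comment, the end-of-central-directory record is the
# last 22 bytes. Fields: signature, two disk numbers, two entry counts,
# central directory size, central directory offset, comment length.
eocd = raw[-22:]
signature, _, _, _, entries, cd_size, cd_offset, _ = struct.unpack("<4s4H2LH", eocd)

print(f"{entries} entries, index at byte {cd_offset} of {len(raw)}")
```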

~~~
falcolas
Funny, I've done plenty of gzip compressed tar streams:

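    # Receiver: GNU tar needs -z to read a gzip stream from a pipe.
    nc -l 7000 | tar -xzf -
    # Sender: write a gzip-compressed archive to stdout.
    tar -czf - * | nc 1.2.3.4 7000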

Files appear one by one on the remote end.
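
The same behaviour can be reproduced without a network. This is a minimal sketch using Python's `tarfile` stream mode (`"r|gz"`), which only ever calls `read()` on its input; `PipeLike`, `make_tgz` and `stream_members` are hypothetical helper names standing in for the socket and the two ends of the pipe.

```python
import io
import tarfile


class PipeLike:
    """Expose only read(), like a socket or pipe: no seek(), no tell()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def make_tgz(files):
    """Build a gzip-compressed tar archive in memory."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w:gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return out.getvalue()


def stream_members(stream):
    """Yield (name, content) pairs in archive order, reading forward only."""
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        for member in tar:
            yield member.name, tar.extractfile(member).read()


archive = make_tgz({"a.txt": b"first", "b.txt": b"second"})
received = list(stream_members(PipeLike(archive)))
```

Each member is handed over as soon as its bytes have been decompressed, which matches the "files appear one by one" observation.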

------
dajobe
Things it doesn't support: symlinks, POSIX ACLs (xattrs). The first one makes
it a certain failure for archival use. The hardcoded link to an external crypto
service, keybase, makes it a failure for long-term use.

~~~
qrmn
"The final archive format" is a very big promise that 4q doesn't keep right
now. It falls short of 7z, RAR and tar.xz, and certainly isn't ready to
replace them at the moment.

I'm not too familiar with CoffeeScript, but it doesn't seem like a good choice
of language for writing an archiver. There's no actual draft file format spec I
can see, either? But from a first pass, I have the following comments:

Crypto: Encrypted blocks: AES-256-CBC, random IV, with no MAC (!!!). You
_need_ to look at that again: that could be a Problem. Hashed blocks:
SHA-2-512. Maybe OK (how's length encoded? Look out for extension attacks).
That crypto is 14 years old and missing a vital piece (the MAC): not "modern". Modern
choices would include ChaCha20-Poly1305 (faster, more secure, seekable if you
do it right); hashes like BLAKE2b (as the new RAR already does); signing
things with Ed25519. Look into that kind of thing. You need a crypto overhaul.
The keybase.io integration is a nice thought for a UX - but is an online
service in invite beta really ready for being baked into an archive format?
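
To illustrate the missing-MAC point, here is a minimal encrypt-then-MAC style sketch using keyed BLAKE2b from Python's `hashlib`. It only covers the authentication half: the "block" is a stand-in for ciphertext, since the standard library has no AES, and `seal` / `open_sealed` are hypothetical names, not anything from 4Q.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # bytes of authentication tag appended to each block


def seal(key: bytes, block: bytes) -> bytes:
    """Append a keyed BLAKE2b tag to an (already encrypted) block."""
    tag = hashlib.blake2b(block, key=key, digest_size=TAG_LEN).digest()
    return block + tag


def open_sealed(key: bytes, sealed: bytes) -> bytes:
    """Check the tag in constant time; refuse to return tampered data."""
    block, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hashlib.blake2b(block, key=key, digest_size=TAG_LEN).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return block


key = os.urandom(32)
sealed = seal(key, b"ciphertext bytes would go here")
tampered = bytes([sealed[0] ^ 1]) + sealed[1:]  # flip a single bit
```

`open_sealed(key, tampered)` raises instead of handing back modified data, which is the property unauthenticated CBC lacks. Keyed BLAKE2b is also not subject to length-extension attacks.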

Packing: LZMA2 is pretty good: 7z and xz already use that. For a fast
algorithm, Snappy is not as good as LZ4, I understand? Neither is the last
word in compression. Text/HTML/source code packs much better with a PPM-type
model, like PPMd (7z has that, too; RAR did as well, but removed it recently), but
you need to weigh up the decompression memory usage. ZPAQ's context model
mixing can pack tighter, but that's much more intensive and while I like
extensibility, I don't like the ZPAQ archive format having essentially
executable bytecode.
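
For anyone who wants to see the speed/ratio trade-off on their own data, a quick sketch with Python's standard library follows. It has no Snappy, LZ4 or PPMd bindings, so zlib at level 1 stands in for the "fast" codec here (an assumption for illustration only), and the repetitive sample text is far more compressible than real files.

```python
import lzma
import zlib

sample = b"The quick brown fox jumps over the lazy dog.\n" * 2000

packed = {
    "zlib level 1 (fast)": zlib.compress(sample, 1),
    "zlib level 9": zlib.compress(sample, 9),
    "lzma / xz": lzma.compress(sample),
}

for name, blob in packed.items():
    ratio = len(blob) / len(sample)
    print(f"{name:22s} {len(blob):6d} bytes ({ratio:.2%} of input)")
```

Swap `sample` for the contents of a real file to get numbers that mean something.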

Other missing features that other archivers have: Volume splitting? Erasure
coding or some other FEC? Can you do deltas? (e.g. binary software updates)

You've got some pleasant UX ideas for a command-line archiver (compared to
some other command-line archivers!), but sorry, I don't think you're ready for
1.0.

------
simondelacourt
Somehow this makes me think of this xkcd comic
https://xkcd.com/927/

~~~
edward
I like the mouse-over text: "Fortunately, the charging one has been solved now
that we've all standardized on mini-USB. Or is it micro-USB? Shit."

Now we're going to transition to USB-C.

~~~
creshal
Unless we're going to transition to DockPort to tunnel USB over DisplayPort.

~~~
lsaferite
Fairly certain they dropped DockPort as USB Type-C covers all the
functionality already.

~~~
creshal
I hope so. But when did that ever stop anyone?

------
joepie91_
Why would you make an archive format depend on a third-party service
(keybase)? That's just a terrible idea for data longevity.

------
edem
I really hope that this does not go mainstream or I will have to install yet
another archiver tool... I really don't understand why people use 7zip for
example when storage is cheaper than ever. Just use tar and get on with your
life.

~~~
ojanik
Maybe because it doesn't have the Unicode issues that zip has?

~~~
kozak
Put "cu=on" (without quotes) in the "Parameters:" box of 7-Zip's "Add to
Archive" window, and all your Unicode issues are solved.

------
whitingx

      "the final archive format"
    

https://xkcd.com/927/

;)

------
__michaelg
How is that any better than modern archive formats like 7z?

------
fleitz
What are the advantages of this over tar?

~~~
StavrosK
The four things in the first paragraph of the link?

~~~
anon4
Tar already has the first two, and even POSIX xattrs (which this doesn't
preserve), the third seems useless (seems being the key word here, some people
might find it useful), and I'd rather just use a program that will encrypt the
archive for me (i.e. have a .tar.xz.enc).

One advantage this could have over the above is random access to any file;
with the above scheme, you might have to linearly decrypt and decompress the
entire archive up to that file.

~~~
rakoo
Technically speaking, the tar utility cannot compress or encrypt per file,
but the _tar format_ can be used for this, and since we're talking about
formats, the tar format _can_ accommodate the requirements. It's just that
there's no tool doing it at the moment.

(A counterpoint: while each file could be compressed and encrypted, there's
nothing in the tar format that explicitly says so, meaning that each file
would have to be probed to determine whether it was compressed or encrypted.)
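
A small sketch of what that would look like in practice, using Python's `tarfile` and `gzip`: members are compressed individually inside a plain tar, and the reader has to sniff each body for the gzip magic bytes because the header carries no flag. `pack` and `unpack` are hypothetical helpers, and encryption is left out.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"


def pack(files, compress_names):
    """Write an uncompressed tar whose selected members are gzipped."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        for name, payload in files.items():
            if name in compress_names:
                payload = gzip.compress(payload)
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return out.getvalue()


def unpack(archive):
    """Read every member, probing each body for the gzip magic bytes."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar:
            body = tar.extractfile(member).read()
            # Probe: nothing in the tar header says "compressed", so an
            # ordinary file that happens to start with 1f 8b would be
            # misidentified here.
            if body[:2] == GZIP_MAGIC:
                body = gzip.decompress(body)
            result[member.name] = body
    return result


files = {"notes.txt": b"plain text " * 100, "photo.raw": b"\x00\x01\x02\x03"}
archive = pack(files, compress_names={"notes.txt"})
restored = unpack(archive)
```

The probing step is exactly the weakness: a dedicated format can record the codec per file instead of guessing.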

------
pjc50
It's in CoffeeScript! I wasn't expecting that, I was expecting C, Go, or Rust.

------
sylvinus
If it's final I guess the TODO section should be empty ;-)

------
mahouse
Click the language bar of GitHub, see CoffeeScript, close the tab

