Show HN: Moochacha, quantum-safe file encryption (analyzed by Frama-C)

markdascher · on Feb 24, 2023

This started out as an excuse to try writing C code safely. Eventually stumbled upon Frama-C, which is amazing.

But anyway, moochacha is a simple file encryption command based on libsodium's Argon2id, BLAKE2b, and ChaCha20-Poly1305. Nothing fancy, just symmetric encryption with a keyfile. It also supports optional padding to conceal file sizes.

After goofing around for months, I've just about reached the limit of my abilities, and am tempted to actually use it. Please talk some sense into me! See the README for a full specification. The UI could certainly be improved, but I'm mostly curious if there's anything unsafe about it.

Also, am I reinventing the wheel here? Similar tools either involve asymmetric algorithms or derive encryption keys directly from the password. I want the output to be safe even with a bad password, and even if quantum computing becomes absurdly successful.

chasil · on Feb 24, 2023

Can you make a mode that is compatible with the -chacha option to (modern) "openssl enc" output?

That level of interoperability would greatly increase confidence, I think, but this doesn't use Argon2id and BLAKE2b.

  $ echo foobar > .pwfile

  $ openssl enc -chacha20 -e -pbkdf2 -pass file:.pwfile -in sensitive.db -out sensitive.cha

  $ openssl enc -chacha20 -d -pbkdf2 -pass file:.pwfile -in sensitive.cha -out test.db

  $ diff test.db sensitive.db
  $

markdascher · on March 1, 2023

It's a good idea, and some sort of easily-verified compatibility with openssl was actually an early goal. But I gave up on that since the algorithms are so different. A compatible mode would end up sharing very little code, so it wouldn't say much about moochacha itself.

But good news! I've come up with... something.

https://codeberg.org/markdascher/moochacha/src/branch/main/o...

That strings together a combination of argon2 and openssl commands to decrypt a file. It's extremely unsafe though, since all of the secrets end up in command-line arguments. Also some of those commands might be new in OpenSSL 3.

chasil · on March 2, 2023

I will take a look tomorrow.

I recently discovered exactly how unsafe PuTTY's "psftp -pw" option is (20+ years after introduction), so purging argv is always prudent.

https://www.linuxjournal.com/content/putty-scripted-password...

junon · on Feb 24, 2023

Frama-C is indeed pretty cool, as long as your application is pretty self contained. Using frama-c on code that runs e.g. in the kernel breaks its models too much.

naasking · on Feb 24, 2023

Cool, I've been meaning to try out Frama-C and other verification tools. Have you tried CBMC or similar tools at all?

markdascher · on Feb 24, 2023

I haven't yet. This was my path to Frama-C:

1. Try writing C code that's "obviously safe." Gave up on that pretty quickly.

2. Maybe if I make all buffers global, I can verify some basics with a script. Gave up when I learned that "abstract interpretation" already exists, and it's way more advanced than anything I could whip up. (But kept the global buffers for now...)

3. The only example I found for a while was https://github.com/NASA-SW-VnV/ikos, but it needed an old version of Clang, and my laptop didn't have enough memory to build it.

4. Found a couple other examples that were clearly someone's research project.

5. Finally found Frama-C by searching through Fedora packages. And it's exactly what I needed.

There's definitely a learning curve, and it's not always obvious how to get it to do what you want. But I can at least install it, there's a nice GUI, and it's thoroughly documented.

I didn't hear of CBMC until later, but I see there a Fedora package for that too. Should probably take a crack at it!

fdupress · on Feb 25, 2023

You could also try VeriFast [0], which is closer to Frama-C than CBMC. (VeriFast and Frama-C are deductive verification tools, CBMC is a bounded model checking tool.)

It is slightly less approachable than Frama-C because it uses separation logic, but it's slightly more approachable because it uses symbolic execution, which allows it to display an actual execution trace that causes the failure. (You can then inspect that and decide whether your code or model is wrong.)

[0]: https://github.com/verifast/verifast

naasking · on Feb 25, 2023

Good reference, I had actually starred that on github to refer to later and totally forgot about it.

Do you know of a review of the various open source verification tools available for C? The ones I'm aware of: CBMC, Frama-C, VeriFast, ESBMC.

I've found papers like [1] which benchmark various model checkers, but no evaluation of the verification effort required for the various approaches.

Edit: the tutorial for VeriFast does not make it look very usable compared to Frama-C's ACSL language. Their tutorial doesn't explain a bunch of the syntax or why they made certain choices, so it all feels very unergonomic.

Edit 2: The SV-COMP competition on software verification [2] tests a whole bunch of tools, but Frama-C and VeriFast are not among them. Looks like CPAChecker [3] has the best power-to-weight ratio if you're looking for one tool to learn with the most errors caught across all benchmarks.

[1] https://arxiv.org/abs/2003.11689

[2] https://sv-comp.sosy-lab.org/2020/

[3] https://cpachecker.sosy-lab.org/

fdupress · on Feb 25, 2023

No, sorry; I'm not aware of proper comparisons for C verification tools, or at least recent and maintained tools.

SV-COMP tends to propose a huge number of relatively simple tasks, which attracts model checkers and static analysers. Frama-C and VeriFast are functional verification tools. You can use them as static analysers, but there's a barrier to entry in terms of minimal required annotations that might not be present for tools designed for static analysis.

intelVISA · on Feb 25, 2023

Cool as heck. Frama-C is quite tempting, could you share any resources on how you got started w/ it?

rwmj · on Feb 25, 2023

Here's an engineering tech talk I did at Red Hat about Frama-C. It was an internal talk but I don't suppose anyone cares ...

http://oirase.annexia.org/tmp/2020-frama-c-tech-talk.mp4 [134M, mp4]

tptacek · on Feb 24, 2023

Help me understand how this is quantum-safe in any sense that all other symmetric encryption is? I'm not critiquing, I'm just giving you the data point that I'm probably missing something here.

monero-xmr · on Feb 24, 2023

Isn’t all symmetric encryption using one time pads safe from quantum decryption? It’s a weird thing to lead with considering all symmetric is safe

tptacek · on Feb 24, 2023

I don't pretend to understand the "keyfile" thing here. I guess that's what I'm asking: if this is at bottom just a symmetric key cryptosystem, like, even an encrypted DMG is "safe" in that sense.

markdascher · on Feb 24, 2023

You're right. The honest answer is that my nose has been in this for so long that I've lost track of what's interesting and what isn't. =)

A big reason to mention quantum safety is just to make it an explicit goal. If I've accidentally chosen something that even smells uncertain, it's a bug. There are plenty of ways for symmetric key cryptosystems to accidentally weaken themselves, like leveraging a 128-bit intermediate key at some stage of the process.

Now, that might not even weaken things realistically, for all sorts of reasons. But I want something that you look at and say "yeah that's obviously fine," without having to think much. No surprises.

bawolff · on Feb 24, 2023

Its pretty clickbaity to call symmetric encryption "quantum safe". Of course its quantum safe, the whole quantum crypto issues are with asymmetric crypto not symmetric*

* yes, yes, you want longer key lengths because of grover, but realistically that's not a threat anytime soon and also trivial to work around.

zxexz · on Feb 24, 2023

I love this, thank you. It’s a topic I’ve oft thought about, and I’m having fun reading through the code - I was just thinking earlier today that I need to brush up on my C, as I was going through some of my own, old, code, I can barely follow.

Thanks for posting!

ThePhysicist · on Feb 24, 2023

Looks really cool. A bit surprised no one here told you to "not roll your own crypto yet". Great that you specified the protocol and threat model in any case. Why 12288 bytes for p0?

markdascher · on Feb 24, 2023

> Why 12288 bytes for p0?

Good question! Tripling the first block's size (12288=3x4096) happened the most recent commit, to make all of the Padmé sizes possible.

https://codeberg.org/markdascher/moochacha/commit/3145d7fb95...

The original encrypted format was more straightforward: 32 + 4112 + 4112 + ... + (last block, less than 4112) bytes. That initial 32-byte seed messed up the math just enough that a 2 MiB output was impossible. (It would require the last block be exactly 4112 bytes, which isn't allowed.)

If those extra 32 bytes weren't there, then it's a lot easier to calculate the impossible sizes, since they're just multiples of 4112. And those never collide with Padmé, at least for all 64-bit file sizes.

I like to think of that first block as 32+12288+16, getting block boundaries lined back up on even multiples of 4112.

progre · on Feb 24, 2023

I think most people mean "don't roll yor own and use it in production"

Writing your own crypto for fun should be encuraged.

latenightcoding · on Feb 24, 2023

Author is not rolling their own crypto. They are using libsodium.

touisteur · on Feb 24, 2023

Almost as fun as SPARK's own Rod Chapman doing some tweetnacl porting and proof https://blog.adacore.com/sparknacl-two-years-of-optimizing-c...

stan_kirdey · on Feb 24, 2023

nothing is quantum safe until there is quantum compute