Ditching OpenPGP, a new approach to signing APT repositories (debian.org)
106 points by FiloSottile 39 days ago | 56 comments



I took a look at the design and think there are a few issues with the format as proposed.

# The public key is stored with the signature.

This should be stored separately. A public key found here is too tempting to use, rendering the signature worthless. Authenticating it would be OK, but low value. This is unauthenticated. A "key ID" should be used instead if the intention is to support lookups among multiple keys.

# The algorithm is stored with the signature.

This is slightly less bad than above, but still bad. Attacker-controlled algorithms have been used repeatedly in "downgrade" attacks. Agility is bad, but if you must support multiple algorithms, store this with the public key (somewhere else). Some info here: https://github.com/secure-systems-lab/dsse/issues/35

I didn't look at the sub-key protocol in detail. The ephemeral key for every release is an interesting choice. The root key is "offline". But if it must be brought online to sign a new ephemeral key for every release anyway, you might as well just use it to sign the release itself.

Using minisign/signify like OpenBSD does and keeping things very simple makes sense to me. The complexity designed into this system (sub-keys, multiple algorithms and signatures) starts to stretch the bounds to where TUF (https://theupdateframework.io/) might make sense. TUF is very complex and not worth it for most projects, but Debian is exactly what TUF is designed for.


> The complexity designed into this system [...] might make sense. TUF is very complex and not worth it for most projects, but Debian is exactly what TUF is designed for.

I disagree that TUF is too complicated for most projects. While our documentation, tutorials, and tooling can be better, the setup is about as complicated as, say, devising an in-toto root layout. Most open source projects should really just worry about subscribing to something like PEP 480 and signing with one-time Fulcio keys. But I think we are largely on the same page here: yes, please just use minisign/signify if you want simplicity, but if you want resilience against nation-state attacks, you need something like TUF (coupled with in-toto and sigstore). We are happy to advise.


> I disagree that TUF is too complicated for most projects. While our documentation, tutorials, and tooling can be better, the setup is about as complicated as ...

I've heard great things about TUF, but if you want people to adopt it, then it seems like the documentation, tutorials, and tooling should be first-class citizens.


Thanks for your comment. I completely agree, and we are working on it. If you have any suggestions for documentation/tutorials/tooling you would like to see, I'd be happy to add them to the list.

We are actively working to improve the reference implementation, to make it easier to maintain (easier-to-read code, type annotations, generally more Pythonic, cleaner design) and use (a cleaner, documented API; easier to plug in your own implementation of things a content update system might already have an opinionated implementation of, e.g. the network communication stack).

We hope to build more tools on top of the cleaned up reference implementation once it is feature complete.

For the specification itself, we recently switched to publishing a rich HTML document with cross-linking, syntax highlighting, ToC, etc. https://theupdateframework.github.io/specification/latest/ and added a new section covering some of the repository operations https://theupdateframework.github.io/specification/v1.0.19/#...


Interestingly, the reference implementation doesn't seem to include the algorithm field; it's hard-coded. That means the reference implementation is inconsistent with the described format.


Yes, the implementation lags behind a bit. The project's been stale for a few months.


I don't understand your point about the public key. The goal is to be able to verify all signatures; whether we trust a signature or not is configured inside APT by specifying a list of trusted primary keys.

> # The algorithm is stored with the signature.
> This is slightly less bad than above, but still bad

Which algorithms are trusted is a client side decision. We do need the string to be able to parse the signature itself. signify also has it, except that it includes it inside the base64 as the first two bytes.
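
For illustration, this is roughly what that looks like when pulled apart in Python; a minimal sketch assuming a standard signify signature file, not anything from the proposal:

  import base64

  def parse_signify_sig(sig_file_text: str):
      # A signify .sig file is an "untrusted comment:" line plus one base64 line.
      comment, b64 = sig_file_text.splitlines()[:2]
      blob = base64.b64decode(b64)
      algorithm = blob[:2]      # b"Ed": the algorithm marker lives inside the base64
      key_number = blob[2:10]   # 8-byte key number used to match the right public key
      signature = blob[10:74]   # 64-byte Ed25519 signature
      if algorithm != b"Ed":
          raise ValueError("unsupported algorithm")
      return key_number, signature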

I do not believe there is a need to be able to configure the trusted algorithms on a per primary key basis (aka storing it with the public key).

Basically, the problem we have with GPG is that it trusts algorithms for far too long. So we had to implement lists of trusted algorithms inside APT ourselves. We still have no way to reject 1024-bit RSA keys, though, because GPG does not tell us the key size.

We can assign algorithms trust levels and prevent downgrades to weaker keys alongside the other downgrade attack checks for Release files we already have in APT.
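
As a rough sketch of what that could look like client-side (hypothetical names and levels, not APT code): each algorithm gets a trust level, the client remembers the strongest level it has seen for a repository, and a Release file that drops below that fails, much like the other Release file downgrade checks mentioned above.

  # Hypothetical trust levels; higher is stronger, 0 means "never accept".
  ALGORITHM_TRUST = {"rsa1024": 0, "rsa2048": 1, "rsa4096": 2, "ed25519": 3}

  def check_algorithm_downgrade(repo_state: dict, repo: str, algorithms_used: list) -> bool:
      best = max((ALGORITHM_TRUST.get(a, 0) for a in algorithms_used), default=0)
      if best == 0:
          return False                 # only unknown/untrusted algorithms present
      if best < repo_state.get(repo, 0):
          return False                 # repository downgraded to a weaker algorithm
      repo_state[repo] = best          # remember the strongest level seen so far
      return True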

> I didn't look at the sub-key protocol in detail. The ephemeral key for every release is an interesting choice. The root key is "offline". But if it must be brought online to sign a new ephemeral key for every release anyway, you might as well just use it to sign the release itself.

Yes, well, if you go with the subkey approach you'll always have a subkey, so you can't sign directly with a primary key.

I think there's some confusion here in that people believe both formats should be supported at the same time, but they are two separate proposals, and only one should make it into real code.

> Using minisign/signify like OpenBSD does and keeping things very simple makes sense to me. The complexity designed into this system (sub-keys, multiple algorithms and signatures) starts to stretch the bounds to where TUF (https://theupdateframework.io/) might make sense. TUF is very complex and not worth it for most projects, but Debian is exactly what TUF is designed for.

I said this 4 years ago: we don't want TUF. TUF does not provide anything useful here, but adds (and duplicates) a lot of complexity. It's the opposite of what we're trying to achieve.

https://blog.jak-linux.org/2017/08/17/why-tuf-does-not-shine...

Add to that that we'd have to rewrite the whole thing to use YAML/deb822 instead of JSON files for repository format compliance.


> The goal is to be able to verify all signatures; whether we trust a signature or not is configured inside APT by specifying a list of trusted primary keys.

Why is it a goal to be able to verify signatures from untrusted primary keys? At best, this conveys zero information. At worst, it may confuse people into thinking that the signature is safe when it really isn't. It also opens your format up to JWT-style vulnerabilities.

Generally, to avoid JWT-style vulnerabilities, you should be using some sort of key ID attached to the signature to look up the key in the list of trusted keys. If the key is not known, you just ignore the signature. If it is known, you know from the key itself what kind of key it is, and thus how to parse the signature. Enforce algorithm deprecation when deciding whether to trust a key, rather than when parsing a signature.
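
A minimal sketch of that flow (names are made up, this isn't the proposal's API): the key ID only selects a key from the local trust store, and the algorithm and parsing rules come from the trusted key itself, never from the attacker-controlled signature:

  from dataclasses import dataclass
  from typing import Callable

  @dataclass
  class TrustedKey:
      key_id: str
      algorithm: str                          # fixed when the key was added to the trust store
      verify: Callable[[bytes, bytes], bool]  # algorithm-specific verifier bound to this key
      deprecated: bool = False

  def verify_release(data: bytes, signatures, trust_store: dict) -> bool:
      # signatures: iterable of (key_id, raw_signature) pairs taken from the Release file
      for key_id, raw_sig in signatures:
          key = trust_store.get(key_id)
          if key is None or key.deprecated:
              continue                     # unknown or deprecated key: ignore, never "verify anyway"
          if key.verify(raw_sig, data):    # parsed and checked according to the *key's* algorithm
              return True
      return False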

This talk by Sophie Schmieg at Real World Crypto 2021 is extremely informative about this: https://youtu.be/CiH6iqjWpt8?t=1028


Being able to validate the consistency of a repository without needing additional files is a worthwhile goal. We also validate MD5 sums when downloading, despite them not being trusted (so they can only reject on failure). If there is an invalid signature, it should fail verification whether we trust it or not.

The other goal of the design - with the subkeys - is to be able to start signing repositories with a new subkey, without having to distribute the new subkey to users out-of-band.

The talk is unfortunately hard to follow due to inadequate audio quality. It's also not the right medium to bring this across; you'd want a paper of up to 3 pages.

gpg has the same distinction between valid signatures and good signatures, though you need to distribute the key out of band there.

I don't see how this can confuse people. Nobody else should be implementing this specification.


Exactly this. The flow should be:

* Look up keys that I trust

* Check if the signature verifies against any of those keys

Key hints can aid in selection here if you have many trusted keys, but you can also just loop through.

Not:

* verify the signature

* check if the key matches one I trust

The second is much more dangerous: a "valid" signature from an untrusted key is just as bad as an "invalid" signature from a trusted key. There's no reason to treat these states as different.


The point is not about valid signatures from untrusted keys, it's about invalid signatures from untrusted keys.

Some repositories are signed with multiple keys. People building the repositories might only have one key T as trusted, and then mess up signing with the other key U (imagine that one is an offline key - this is how debian stable releases are signed).

By verifying all signatures and rejecting any invalid one, it will be immediately obvious if the signature by U is wrong.

Also, as mentioned before, the scheme allows for rapid subkey replacement in case of a potentially compromised subkey.
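
In sketch form, the verify-everything policy described above reads to me like this (made-up names, not the actual APT code): every signature present must verify, and on top of that at least one must come from a trusted key.

  def check_release(data: bytes, signatures, trusted_key_ids: set) -> bool:
      # signatures: iterable of objects with .key_id and .verify(data) -> bool
      have_trusted = False
      for sig in signatures:
          if not sig.verify(data):
              return False              # any invalid signature fails the file, even from an
                                        # untrusted key, so a botched second signature is caught
          if sig.key_id in trusted_key_ids:
              have_trusted = True
      return have_trusted               # and at least one signature must be from a trusted key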


gpg makes precisely this distinction between valid signatures and good signatures, to separate signatures that are technically valid from signatures that should actually be trusted.

e.g. a valid signature can be expired, so it's not good, and things like that.

So, the argument is flawed.


I don't expect to change your mind about TUF, though I would love to talk about how Debian could adopt pieces of the TUF design where it makes sense _and_ how TUF might provide affordances to better suit Debian.

For other readers, I think it's important to point out that:

1. TUF does not require use of JSON, or any other specific format.

> Implementers of TUF may use any data format for metadata files as long as all fields in this specification are included and TUF clients are able to interpret them without ambiguity. (from https://theupdateframework.github.io/specification/v1.0.19/#...)

2. The blog post you link to says that TUF does not handle key revocation. That's untrue. TUF does key revocation explicitly, by replacing the listed keys in the root (for top-level roles) and targets (for delegated roles) metadata, and implicitly through expiration times.


I believe the concern with the public key is that by adding it directly to the signature format, instead of a key identifier, you're giving a cue to naive implementors that they should trust the key that appears on the signature, which is a bug that has happened with SAML implementations in the past.

I'm just guessing, though.


Yeah, roughly this. It's only a matter of time before someone gets a signature verification error, then sees the public key here and adds it to their apt policy to get past the error.


That's no different from the apt-key adv --recv-keys <key id> that people do now on signature failures, so I don't see that as a reasonable argument.

We can only show the key id in the output, though, and then people have to go search on the web to get the full key (or they echo $signature | base64 -d | head -c32 | base64; but really, who does that).


It's an appeal against specifying what has been historically an anti-pattern in signature formats; it's caused problems for SAML (and XMLDSIG generally) and JWT already. It won't bite you hard, at least any time soon, because of how few verifying codebases your project deals with. But it's not what you'd call a best practice.

Compare this to the lack of verification criteria (cf HdV's post) --- the signature ambiguity is (sorry) unambiguously a wart on the format and one you should be able to agree, ceteris paribus, should be fixed. But it's probably less likely to cause a real world problem than this design choice.


I think there's a reasonable compromise of embedding the key id of the primary key, but shipping the subkey completely.

This way we lose the ability to check total consistency without public keys present; but we have the technological requirement you want for the user to trust the primary key (and no way to construct it from the signature).

We discussed the signature format in great detail, and looked at multiple options:

  Ed25519-Signatures:
   base64<sig1>
   base64<sig2>

  Signatures:
    ed25519:base64<sig>

  Signatures:
    ed25519 base64<sig>

  Signatures:
    base64<ed25519 | sig>
The first option was rejected by people for reasons I can't remember, and the next 2 are equivalent, albeit the one with : is easier to copy-paste (who does that for an apt signature?).

The last one we did not like because it made the code more complex (we'd have to decode to a buffer first and then copy into the appropriate struct), and also because it's not clear from looking at the plaintext which algorithms it has been signed with.


Also, sure, the key type is also stored with the public key, and the signature type must then find a public key of the same type; so it looks like ed25519:<base64>.


I don't understand this reaction to TUF, at all. The post you link to explains that "APT addresses all attacks TUF addresses", which implicitly recognizes the value in TUF's design.

If you're going to rebuild the entire trust and signature system of APT, why roll your own crypto instead of adopting the model designed by an academic that has been proven useful in other situations (e.g. Docker and Notary)?


> # The algorithm is stored with the signature.
> This is slightly less bad than above, but still bad

So I just also realized that for the subkey scheme, it matters even less, because the algorithm is effectively part of the subkey packet.

It's not encoded in there, but how high is the chance that a different algorithm would produce the same signature for it?

We really just need the algorithm field to be able to determine how to parse that, but it reinforces itself by virtue of the subkey being signed by the primary key itself.


Please note that the spec is ahead of the implementation, and the project has been stale for 4 months. For the spec and implementation to stay in sync, they'd have to be in the same repo, I guess. For now, the spec is driving and the implementation is catching up with it.

For more background, see

https://mastodon.social/@juliank/105679564085465156


Can anyone with more expertise comment on the use of cryptography.hazmat, which is apparently (1) a frontend to openssl, and (2) "full of land mines, dragons, and dinosaurs with laser guns"?


It's "hazmat" because cryptography is dangerous, and the library authors would strongly recommend you use an opinionated library that does the cryptography for you, like Nacl, where you don't think about which crypto primitives to use, but rather they're all set up for you. Here, they're working with an existing format and have specific requirements; they're doing the thing the "hazmat" library exists to facilitate.


Do we ever learn anything from past security breaches? What is the compelling security reason to ditch pgp?

The _one_ thing it is good at is long-term identity. There are ECDSA keys available now in PGP, and Ed25519 wouldn't be a stretch.

This looks and smells like NIH syndrome. PGP has had a lot of time to bake. Let's make incremental improvements, not start all over because you don't understand history.

Oh, and going all-in on one algorithm is stupid. Thinking there will never be problems with Ed25519 is naive. PGP was designed to be extended.


These ideas are pretty close to discredited in cryptography engineering. Diversity in signature algorithms doesn't help when everyone uses the same algorithm anyway, and, more importantly, a negotiated algorithm makes it harder to recover from a new attack on an algorithm, not easier: with a non-negotiable algorithm, you version the entire repository, and can reliably rule out vulnerable signatures. Negotiation and downgrade attacks have been endemic in "ciphersuite" schemes.

There's a reason the field is moving towards simple, modern schemes like this one, and Debian has more experience than most in trying to make broken, crufty old PGP work.


So, no.


It's OK to just say "I have no idea what you're talking about".


Doesn't appear to have any domain separation. This means that if a common private key is reused in another application, you may be able to use that application as an oracle to produce signatures it doesn't intend to issue. Say you later want to have packagers sign with the same keys for security announcements. Obviously not reusing keys is good, but there are reasons people do that (e.g. to share the trust of an existing key). The lack of domain separation means stuff like a subkey itself is a valid payload signed by the master key. The modern advice is to personalize the hash function for each distinct application and use within the application.
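
To make the domain separation point concrete, a generic sketch (not something the proposal specifies): every distinct use of a key prefixes the message with its own context string, so a signature produced for one purpose can't be replayed as another.

  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # Hypothetical context strings, one per distinct use of the key.
  RELEASE_CONTEXT = b"apt-release-signature-v1\0"
  SUBKEY_CONTEXT = b"apt-subkey-certification-v1\0"

  def sign_release(key: Ed25519PrivateKey, release_file: bytes) -> bytes:
      return key.sign(RELEASE_CONTEXT + release_file)

  def sign_subkey(key: Ed25519PrivateKey, subkey_packet: bytes) -> bytes:
      return key.sign(SUBKEY_CONTEXT + subkey_packet)

  # With the contexts in place, a signed subkey packet can no longer double
  # as a valid Release file signature from the same primary key.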

The signer is set up to have the whole input in memory at once. The protocol design makes it tricky to do otherwise without at best using two passes. This could be avoided if anyone cared, e.g. by using a signing scheme with prehashing (which has its own obscure tradeoffs).

This text says very little about validating trusted keys. A signature where you just take the pubkey from the input and don't validate it is essentially just a really computationally expensive hash and doesn't itself provide security. The text about validating all signatures regardless of trusted or not could be read as saying that you just accept stuff with valid signature regardless if they were being signed by the Evil League of Evil instead of an approved source. :)

It uses ed25519 but doesn't do anything about inconsistent validation criteria ( https://hdevalence.ca/blog/2020-10-04-its-25519am ), different implementations consider non-overlapping sets of signatures valid. If you want to have automated klaxons on signature failures, this ought to be addressed for the same reason as "repositories do not fail for a subset of users". (though it's not fatal in this application as it is in consensus systems).

The format will be a little clunky if used with post-quantum signatures, e.g. sphincs+ would have a couple kilobytes of signature data stuffed on a line. I don't have a real recommendation there other than saying you should probably have test vectors that have some couple-kilobyte-long examples.

No obvious affordance for additional signed metadata, which is useful for forensics. So, for example, if a project has multiple parties with access to the keys they may want to tag which user(s) authorized the signature or included a timestamp. If their signing is performed by a HSM (e.g. something like USB armory that allows no user access) the consistency of their metadata could be preserved. This could probably be kludged on by requiring the metadata just be inside the payload somewhere.

I like that there are subkeys with expiration. It's not entirely clear to me that it's supported to hand out multiple subkeys to different parties all with the same serial, and they stay all equally valid until a key is observed with a newer serial (at which point all subkeys need to be updated)? "Subkeys may reuse/share serial numbers provided" is confusing on this point. Good security practices suggest that if there are multiple places that can sign each should get its own subkey (so if there is a compromise you can isolate it), and this might mean that there are multiple subkeys in parallel all valid for a given trusted master.

Personally, I'm a little disappointed to see a proposal for repository signing that doesn't include affordances for something like opentimestamps or a certificate-transparency like facility (e.g. to validate that parties aren't making backdoored alternative versions for private distribution). But I guess these could be overlaid as additional signature types and the idea is just to phase out pgp, not advance the security state of the art forward.

Authors of this proposal should be aware the NSA has made a formal recommendation against the use of ECC with curves smaller than 384 bits. There are plenty of applications where compatibility, performance, or computational constraints make it attractive to stay with 256-bit options. But if I were doing something entirely greenfield where a double-size key/signature would be a non-issue, I'd be really tempted to use ed448 instead: if nothing else, if there are problems in the future you at least didn't violate existing credible advice. :) (Really I'd want to use sphincs+ for something like this, but it's probably too premature). Considering how close to security theater package signing often is in practice, I'm not sure it's worth worrying too much about whether the underlying cryptosystem could be stronger -- it's not the limiting factor.


> The lack of domain sep means stuff like a subkey itself is a valid payload signed by the master key

Well. There are two proposals. If we go with the subkey proposal, we'd not implement the simple protocol, solving that issue.

> Signer is setup to have the whole input in memory at once

It's fine.

> This text says very little about validating trusted keys. A signature where you just take the pubkey from the input

That would be silly. APT would maintain a list of allowed public primary keys. The spec hints at that with "Clients shall verify all signatures from supported algorithms in the Release file, whether a given key is considered trusted or not."

> No obvious affordance for additional signed metadata, which is useful for forensics. So, for example, if a project has multiple parties with access to the keys they may want to tag which user(s) authorized the signature or included a timestamp. If their signing is performed by a HSM (e.g. something like USB armory that allows no user access) the consistency of their metadata could be preserved. This could probably be kludged on by requiring the metadata just be inside the payload somewhere.

We don't have a use for that. The feature set of GPG is already over the top.

> It's not entirely clear to me that it's supported to hand out multiple subkeys to different parties all with the same serial, and they stay all equally valid until a key is observed with a newer serial (at which point all subkeys need to be updated)

Yeah, the serial exists to revoke all previously issued subkeys, effectively.
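
My reading of that, as a sketch (hypothetical names, not the spec's wording): the client remembers the highest serial it has seen per primary key, parallel subkeys with the same serial stay valid, and anything older than the newest observed serial is rejected.

  highest_seen_serial = {}   # primary key fingerprint -> newest serial observed so far

  def subkey_acceptable(primary_fpr: bytes, subkey_serial: int) -> bool:
      newest = highest_seen_serial.get(primary_fpr, 0)
      if subkey_serial < newest:
          return False                   # an older serial means this subkey generation was rolled over
      highest_seen_serial[primary_fpr] = max(newest, subkey_serial)
      return True                        # equal serials remain valid in parallel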


> APT would maintain a list of allowed public primary keys.

You've blatantly forgotten the implication of TOFU regarding new Debian keys. Without a way to certify Debian's keys, there's no way to verify that installation media are correct any more. With PGP, there was the web of trust. Now there is TLS at best, no?

It seems the proposed specification is missing a crucial piece: key signing.


> Well. There are two proposals. If we go with the subkey proposal, we'd not implement the simple protocol, solving that issue.

Avoiding that concrete exploit.

The lack of domain separation design flaw remains. :)

As typical for design flaws it may not be meaningfully exploitable at any given time. Doesn't mean it isn't a design flaw.


> It uses ed25519 but doesn't do anything about inconsistent validation criteria ( https://hdevalence.ca/blog/2020-10-04-its-25519am ), different implementations consider non-overlapping sets of signatures valid. If you want to have automated klaxons on signature failures, this ought to be addressed for the same reason as "repositories do not fail for a subset of users". (though it's not fatal in this application as it is in consensus systems).

This is annoying, but ultimately I guess not super relevant, as it needs to verify with APT, and that's about it in terms of requirements. And APT doesn't change its crypto backend all the time, and we don't need to care about alternative implementations. It does mean we can't switch the crypto library inside APT, I guess.

Does ed448 have the same issue? Do we have hardware tokens that support ed448?

The C++ code right now uses libnettle to verify signatures for the PoC.


They do say "there may not be any practical ed448 implementations", but the cryptography.hazmat.primitives.asymmetric library used by the reference implementation has an ed448 module that could very likely slide right in. They would however need to change some hardcoded lengths.

I do feel their discussion about not using separators (to avoid coding string splitting) actually makes it more complicated to code around variable lengths.

(I'm finding that to be a great read on validation)


> They do say "there may not be any practical ed448 implementations", but the cryptography.hazmat.primitives.asymmetric library used by the reference implementation has an ed448 module that could very likely slide right in. They would however need to change some hardcoded lengths.

But that's not what that sentence means at all. It meant that APT might start just with support for ed25519, as IIRC I was told that ed448 is overkill right now.

I have successfully signed and verified signatures with Ed448 using the python-aptsign branch for it.

> I do feel their discussion about not using separators (to avoid coding string splitting) actually makes it more complicated to code around variable lengths.

I don't follow


For reference, why I don't follow: with the two-word approach it's easy to determine the algorithm and then parse the base64 directly into a struct.

like in C++, I can

  std::string algo, encoded;
  in >> algo >> encoded;
  if (algo == "ed25519") { base64_decode(encoded, ed25519_pointer); return std::move(ed25519_pointer); }
  else if (algo == "ed448") { base64_decode(encoded, ed448_pointer); return std::move(ed448_pointer); }

Where the _pointer are std::unique_ptr of Ed25519Signature and Ed448Signature which are each subclasses of Signature, and the return type hence std::unique_ptr<Signature>.


It's only with the later clarifications in this thread that I've understood that "key option 1" and the second option with subkeys are alternative options for the standard, and that you won't need to support both.

Both of these options have different lengths, and if both were supported you'd have to split based on... magic?

It's been 20 years since I looked at C++, so I don't understand how you're splitting the public key from the signature given they are one concatenated blob, but that could just be me.


The C++ code isn't there yet; the Python one just tries to decode as one and, if that fails, falls back to the other, for testing purposes.

In C++ the blob can just be directly decoded into a struct, though. e.g. the initial format looked like this:

  struct signature
  {
     uint8_t primary_key[ED25519_KEY_SIZE] = {};
     struct
     {
        uint8_t pubkey[ED25519_KEY_SIZE];
        unaligned_little<uint64_t> expiry;  // seconds since epoch
        unaligned_little<uint32_t> serial;  // serial number
     } subkey;
     uint8_t subkey_signature[ED25519_SIGNATURE_SIZE] = {};
     uint8_t signature[ED25519_SIGNATURE_SIZE] = {};

     bool decode(std::string_view view);
  };
and then signature::decode effectively just needs to call a

  base64_decode(this, sizeof(*this), view.data(), view.size()).
(given a base64 decode function that allows a fixed-size output buffer)

You can see there are magic unaligned integer types in there so that the whole thing is not architecture specific.

The entire sample is in

https://gist.github.com/julian-klode/4514ce39d3dc62647b502e5...

Obviously it does not fully match the current spec.

If you wanted to support both structs, you could just try decoding as one or the other, or change the spec to use ed25519 and ed25519-subkey algorithm names :D
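
For comparison, the same fixed-layout decode fits in a few lines of Python with struct, mirroring the sample struct above (which, as noted, doesn't fully match the current spec):

  import base64, struct

  # 32-byte primary key, 32-byte subkey, u64 expiry, u32 serial, 64-byte subkey sig, 64-byte sig
  LAYOUT = struct.Struct("<32s32sQI64s64s")   # little-endian, no padding: 204 bytes in total

  def decode_signature(b64: str):
      blob = base64.b64decode(b64)
      if len(blob) != LAYOUT.size:
          raise ValueError("wrong signature length")
      primary_key, subkey_pub, expiry, serial, subkey_sig, sig = LAYOUT.unpack(blob)
      return primary_key, subkey_pub, expiry, serial, subkey_sig, sig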


great news!


Why not use the Rust implementation[1] of PGP instead? I feel that writing something new and complex like this in a non-statically-typed language (Python in this case) is a recipe for disaster. Someone suggested using TUF, but it's also in Python and prone to shifting compile-time errors to runtime.

[1] https://sequoia-pgp.org/


Because even memory-safe implementations of PGP are saddled with the complexity and legacy cruft of the protocol itself, and the thing the Debian project is actually looking to do is exactly what the signify/minisign family of signature schemes does, using exclusively modern cryptography, with the simplest possible interface.


Makes sense, if they don't plan for it to become more accommodating and complex over time.


Ironically, by using a modern, stripped down signing system like this, they've probably increased their opportunities to do interesting things down the road, since they don't have all the possible weird interactions with the PGP system to consider.


We still have to trust 1024-bit RSA keys because GPG does not allow us to say they're unsafe.


You can tell how long a key is just by looking at it. You can then say anything you want. The use of particular keys is a Debian policy issue. It has nothing to do with the use of a particular message format like OpenPGP.


That's not how it works, no. This is a gpg-specific issue, as it does not communicate the key size over the status fd.

We don't parse the keys ourselves for obvious reasons.


Dunno about that, but --with-colons certainly works to show the length of a key.


I am pretty sure you can mark them yourself. But the fact that they are generally still used, well, that's called backwards compatibility, and that's needed.


I would bet that if Debian had some sane requirements, Sequoia could fulfill them.

Sequoia is a rewrite which avoids the legacy cruft of the old OpenPGP, and they already have a library, a CLI tool, and tests.


That's an absurd statement. OpenPGP _is_ the legacy cruft. It's made for _emails_. It makes no sense elsewhere. The encoding for clearsigned signatures is overly complex.

And Rust is inappropriate in any case for now; it's not allowed in Ubuntu main.


> That's an absurd statement. OpenPGP _is_ the legacy cruft. It's made for _emails_. It makes no sense elsewhere. The encoding for clearsigned signatures is overly complex.

No. It's a message encryption format, and the namespace is email addresses, but then most logins nowadays are identified via some email anyway, so that's not true. And the encodings are fixable and doable in a good library, so there's no misuse.

> And Rust is inappropriate in any case for now, it's not allowed in Ubuntu main.

That's new to me, but possible. Why? Any link?


Rust is a security nightmare. We'd need to add over 130 packages to main for Sequoia, and then we'd need to rebuild them all each time one of them needs a security update.


That's a problem for Ubuntu then, not Debian.


We're not going to have two different signing implementations. The same security concerns apply to Debian too, of course.

Having 130+ packages to rebuild for a critical security update is no fun.


Does Debian enforce a particular implementation of OpenPGP? For all we know, Sequoia is already in use. OpenPGP is valuable mostly as a popular and widely standardized format. There are lots of implementations.


You have the choice between GnuPG 1 and GnuPG 2 :D



