
Ask HN: Is there a “ground-up” explanation of PGP/GnuPG? - kqr
Understanding how git works internally "from the ground up" has been incredibly helpful in my everyday work; things like blobs, commit objects, hashes and how they connect to form the git experience as I know it. Where I had been cargo-culting along previously, it all became clear once I understood the fundamental model of what was going on underneath the interface.

I feel like the same thing could apply to PGP/GnuPG. I am cargo culting my way along but I feel like I would feel much, much, much more comfortable if I knew how it worked from the ground up.

I have loose ideas of asymmetric cryptography and trust circles and such, but nothing concrete to hinge my actions upon, so I mostly try different permutations of command line arguments until GPG appears to do what I want it to do.

Is there a "from the ground up" good guide to PGP that allows me to break out of this pattern?
======
lucb1e
I don't know of any full explanations where they dissect the data. However, you
mentioned (in another comment) that you are familiar with RSA already, so
assuming a basic code and crypto background, and from what I know on a
high level, PGP messages are something like this:

    
    
        print(number of recipients, algorithm used, etc.)
        for each recipient:
            print(RSA_encrypt(symmetric_key, recipient.public_key))
        print(AES_encrypt(message + hash(message), symmetric_key))
    

Typically if you send an email from a@example.com to b@example.com, it will
find the two public keys from both parties and encrypt the symmetric key for
both. The sender obviously wrote it, but they might want to read it back so
the symmetric key is also encrypted with their public key, as well as the
recipient's.

A random symmetric key is chosen to encrypt the message, since it would be
silly to encrypt the whole message for each recipient again and again. And
even if there's only one recipient, random key generation plus symmetric key
encryption is typically faster than encrypting the whole message with
asymmetric crypto (unless the message is just a few bytes, in which case it's
fast regardless).

File encryption probably works the same way, except you're typically the sole
recipient.
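
A minimal Python sketch of the pseudocode above may make it more concrete. Everything in it is a stand-in chosen for illustration and not what GnuPG actually does: textbook RSA without padding on small Mersenne primes replaces real RSA, a SHA-256 keystream XOR replaces AES, and the names make_toy_keypair, keystream_xor, encrypt and decrypt are hypothetical. It is not secure.

```python
import hashlib
import secrets


def make_toy_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes. Toy only: no padding, tiny keys."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)


def keystream_xor(data, key):
    """Stand-in for AES: XOR data with a SHA-256-derived keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + (start // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[start:start + 32], pad))
    return bytes(out)


def encrypt(message, recipient_public_keys):
    session_key = secrets.token_bytes(16)  # fresh random key per message
    key_as_int = int.from_bytes(session_key, "big")
    # One small wrapped session key per recipient ...
    wrapped_keys = [pow(key_as_int, e, n) for (n, e) in recipient_public_keys]
    # ... but the (possibly large) message is encrypted only once.
    body = keystream_xor(message + hashlib.sha256(message).digest(), session_key)
    return wrapped_keys, body


def decrypt(wrapped_key, body, private_key):
    n, d = private_key
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    plain = keystream_xor(body, session_key)
    message, digest = plain[:-32], plain[-32:]
    if hashlib.sha256(message).digest() != digest:
        raise ValueError("integrity check failed")
    return message


# Sender a@example.com and recipient b@example.com (toy key values).
a_pub, a_priv = make_toy_keypair(2**61 - 1, 2**89 - 1)
b_pub, b_priv = make_toy_keypair(2**107 - 1, 2**127 - 1)
wrapped, body = encrypt(b"hello", [a_pub, b_pub])
print(decrypt(wrapped[0], body, a_priv))  # the sender can read it back
print(decrypt(wrapped[1], body, b_priv))  # and so can the recipient
```

Adding a recipient only adds one more wrapped key; the encrypted body stays the same size, which is the point of the random symmetric key.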

Signatures are done by hashing the message and transforming that hash with
your private key (with RSA this is usually described as "encrypting" the
hash), which everyone can check with your public key. Since you're the only
person with the private key, you are the only person who could have produced
that signature, and since finding a different message with the same hash is
infeasible for a good hash function, you must have wanted to sign this text.
(N.B. With RSA, either key can undo what the other one did; you just can't use
the same key to decrypt if it was already used to encrypt and vice versa.
Other signature algorithms, such as DSA, don't work by encryption at all.) The
hash is used rather than the full message both for speed and because it makes
your signature a lot shorter.
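
The hash-then-sign idea above can be sketched in a few lines of Python. This is textbook RSA on toy Mersenne primes, with hypothetical helper names (make_toy_keypair, sign, verify). Real OpenPGP RSA signatures also pad the digest before the private-key operation, which is left out here, so treat it as an illustration only.

```python
import hashlib


def make_toy_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes. Toy only, not secure."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)


def digest_as_int(message):
    """SHA-256 of the message, read as a big-endian integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message, private_key):
    n, d = private_key
    # Only the short digest goes through the slow private-key operation.
    return pow(digest_as_int(message), d, n)


def verify(message, signature, public_key):
    n, e = public_key
    # Anyone can undo the private-key operation with the public key
    # and compare the result with their own hash of the message.
    return pow(signature, e, n) == digest_as_int(message)


# The modulus must be larger than the 256-bit digest, hence the bigger primes.
public_key, private_key = make_toy_keypair(2**127 - 1, 2**521 - 1)
signature = sign(b"I wrote this", private_key)
print(verify(b"I wrote this", signature, public_key))        # True
print(verify(b"I did not write this", signature, public_key))  # False
```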

Did I miss anything, at least from a crypto standpoint (since I don't know
details of the file structure)?

------
epaulson
Dave Steele's blog:

[https://davesteele.github.io/gpg/2014/09/20/anatomy-of-a-
gpg...](https://davesteele.github.io/gpg/2014/09/20/anatomy-of-a-gpg-key/)
[https://davesteele.github.io/gpg/2015/08/01/intermediate-
gpg...](https://davesteele.github.io/gpg/2015/08/01/intermediate-gpg/)

------
hannob
It depends a lot on from which angle you want to understand it. There's a
difference between "understanding the variety of command line options" vs.
"understanding the meaning of the raw data structures". I learned quite a bit
by looking up things in the RFC:
[https://tools.ietf.org/html/rfc4880](https://tools.ietf.org/html/rfc4880)

~~~
kqr
I'm hoping that by understanding the meaning of the raw data structures, I can
ask much more educated questions when I am faced with a new operation I want
to perform using GnuPG. The idea is that instead of asking myself "what
happens if I set a new, future expiration date on a revoked key?" with no
clear answer, I could just think in terms of (e.g., I have no idea how it
really works) "since the expiration date is set as an optional header
extension to the key data structure, and the revocation bit is maintained in
the mandatory header, the revocation bit takes precedence over any other
extension headers, including the new expiry date I accidentally set."

This way, knowing the raw data structures makes it easier for me to figure out
which command line arguments I want, if you will.

~~~
kazinator
That's like saying that if I understand the layout of the Ext4 filesystem, I
can ask much more educated questions about how to write shell scripts or other
kinds of programs to manipulate files in various ways.

------
daxelrod
I haven't read it, but "PGP: Source Code and Internals", by PGP's author
Philip Zimmerman is worth a try. I've read his other book "Official PGP User's
Guide" and learned quite a bit.

[https://www.amazon.com/PGP-Internals-Philip-R-
Zimmermann/dp/...](https://www.amazon.com/PGP-Internals-Philip-R-
Zimmermann/dp/0262240394)

~~~
JshWright
Apparently you didn't learn the correct number of n's in Zimmermann ;)

(prz is very particular about his n's)

~~~
daxelrod
Mea culpa. I could try to blame autocorrect, but the fault is entirely mine.

------
kazinator
> _Understanding how git works internally "from the ground up" has been
> incredibly helpful in my everyday work;_

I'm a very competent git user and don't know how blobs work (nor care). I
haven't hit a problem scenario so far in which I had to dissect a blob.

> _I feel like the same thing could apply to PGP/GnuPG._

If you don't know _crypto_, the internals of GnuPG are a bad way to learn it.

I would recommend reading a book, such as _Applied Cryptography_ by Bruce
Schneier.

Someone who reads that should be a much better informed, much more
sophisticated _user_ of crypto, whether it be an application like GnuPG or
some cryptographic programming library or communication protocol. A developer
who reads that book should have the know-how to implement some crypto and spot
some crypto-related security flaws.

How GnuPG stores things in various formats is less important than the
semantics of those things: like what is a private key, what is a signature and
so on. You need to understand what is happening when you, say, verify a
signature; just not necessarily at the bit level.

------
artemist
If you have time and are fine with it being a bit dry, you can read RFC4880
[0], the RFC for OpenPGP.

This is something I have done some work on (I wrote a basic implementation a
while ago in an attempt to understand it [1]), but I don't have a nice writeup.

An OpenPGP file, whether it is a public key or encrypted file, consists of a
list of packets. Generally it is a binary file, but an armored file consists
of this binary in base64 and then a checksum. You can get these packets with
gpg --list-packets <file>

Example output from a signed and encrypted file:

    
    
      gpg: encrypted with 2048-bit RSA key, ID 09FBFEF359DD186F, created 2016-11-30
            "asdfas <sdfasdfasd@asdfasd.asdf>"
      # off=0 ctb=85 tag=1 hlen=3 plen=268
      :pubkey enc packet: version 3, algo 1, keyid 09FBFEF359DD186F
    	  data: [2047 bits]
      # off=271 ctb=d2 tag=18 hlen=3 plen=377 new-ctb
      :encrypted data packet:
    	  length: 377
    	  mdc_method: 2
      # off=293 ctb=a3 tag=8 hlen=1 plen=0 indeterminate
      :compressed packet: algo=2
      # off=295 ctb=90 tag=4 hlen=2 plen=13
      :onepass_sig packet: keyid 0D3B106118D1EFBE
      	version 3, sigclass 0x00, digest 8, pubkey 1, last=1
      # off=310 ctb=ac tag=11 hlen=2 plen=19
      :literal data packet:
      	mode b (62), created 1480523012, name="file.txt",
      	raw data: 5 bytes
      # off=331 ctb=89 tag=2 hlen=3 plen=284
      :signature packet: algo 1, keyid 0D3B106118D1EFBE
      	version 4, created 1480523012, md5len 0, sigclass 0x00
      	digest algo 8, begin of digest 05 c4
      	hashed subpkt 2 len 4 (sig created 2016-11-30)
      	subpkt 16 len 8 (issuer key ID 0D3B106118D1EFBE)
      	data: [2046 bits]
    

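Each pubkey enc packet contains the symmetric session key, encrypted to one
recipient's public key. The encrypted data packet holds the data encrypted
with that session key; here, decrypting it reveals the compressed, one-pass
signature, literal data and signature packets listed after it.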
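
To connect the ctb, tag, hlen and plen columns in that listing to actual bytes, here is a small Python sketch of packet header parsing as I read RFC 4880 section 4.2. The function name parse_packet_header is made up, and the length octets in the usage lines are reconstructed from the hlen/plen values in the listing rather than copied from the real file, so treat them as assumptions.

```python
def parse_packet_header(data):
    """Return (tag, header_length, body_length) for one OpenPGP packet header.

    body_length is None for indeterminate (old format) or partial
    (new format) lengths, which this sketch does not follow further.
    """
    ctb = data[0]
    if not ctb & 0x80:
        raise ValueError("bit 7 of the first octet must be set")

    if ctb & 0x40:
        # New format: tag in bits 5-0, length encoded in the following octets.
        tag = ctb & 0x3F
        first = data[1]
        if first < 192:
            return tag, 2, first
        if first < 224:
            return tag, 3, ((first - 192) << 8) + data[2] + 192
        if first == 255:
            return tag, 6, int.from_bytes(data[2:6], "big")
        return tag, 2, None  # partial body length

    # Old format: tag in bits 5-2, length type in bits 1-0.
    tag = (ctb >> 2) & 0x0F
    length_type = ctb & 0x03
    if length_type == 3:
        return tag, 1, None  # indeterminate length
    size = (1, 2, 4)[length_type]
    return tag, 1 + size, int.from_bytes(data[1:1 + size], "big")


# ctb=85 tag=1 hlen=3 plen=268 (pubkey enc packet, old format)
print(parse_packet_header(bytes([0x85, 0x01, 0x0C])))
# ctb=d2 tag=18 hlen=3 plen=377 new-ctb (encrypted data packet)
print(parse_packet_header(bytes([0xD2, 0xC0, 0xB9])))
# ctb=a3 tag=8 hlen=1 plen=0 indeterminate (compressed packet)
print(parse_packet_header(bytes([0xA3])))
```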

When I have more time, I may do a more useful writeup on my site, but
currently I am too busy.

[0]
[https://www.ietf.org/rfc/rfc4880.txt](https://www.ietf.org/rfc/rfc4880.txt)
[1] All I could find was my file parsing code, I dumped it at
[https://github.com/artemist/mupg](https://github.com/artemist/mupg)

------
avmich
If you have some pointers to explanations of Git internals, similar to what
you're looking for with PGP/GnuPG, can you provide them? That would be useful
and illustrative :) .

~~~
jzl
Scott Chacon's videos are priceless. (He's the co-author of the git book.)
They explain exactly how git works internally in the simplest terms possible.
Either or both of these:

Introduction to Git - talk by Scott Chacon
[https://www.youtube.com/watch?v=xbLVvrb2-fY](https://www.youtube.com/watch?v=xbLVvrb2-fY)

Introduction to Git with Scott Chacon of GitHub
[https://www.youtube.com/watch?v=ZDR433b0HJY](https://www.youtube.com/watch?v=ZDR433b0HJY)

The latter is newer but a bit longer.

Don't be fooled by the video names: these are introductions to the internals,
not just the interface.

------
knweiss
An Advanced Introduction to GnuPG by Neal H. Walfield

[https://begriffs.com/posts/2016-11-05-advanced-intro-
gnupg.h...](https://begriffs.com/posts/2016-11-05-advanced-intro-gnupg.html)

------
enzolovesbacon
Although it doesn't cover PGP/GnuPG, I found "The Architecture of Open Source
Applications" to be very interesting and something that should be more widely
known.

My work required me to read the ITK and VTK parts. Git and GDB are also very
nice.

[http://aosabook.org/en/index.html](http://aosabook.org/en/index.html)

~~~
kqr
Ooh, ooh, ooh they're partly by Greg Wilson! I'm a huge fan of him! Shame they
don't cover GnuPG. :(

------
jzl
This is a great pair of videos that intuitively explain Diffie-Hellman key
exchange and then the RSA algorithm in a way that requires no previous
knowledge of cryptography:

"Public key cryptography - Diffie-Hellman Key Exchange"
[https://www.youtube.com/watch?v=YEBfamv-_do](https://www.youtube.com/watch?v=YEBfamv-_do)

"Public Key Cryptography: RSA Encryption Algorithm"
[https://www.youtube.com/watch?v=wXB-V_Keiu8](https://www.youtube.com/watch?v=wXB-V_Keiu8)

Good coverage of the fundamentals before attempting a deeper understanding of
PGP specifically.
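
For a taste of the Diffie-Hellman idea before watching, here is a toy Python sketch with tiny textbook numbers (p = 23, g = 5 and the two private values are arbitrary illustrative choices, far too small for real use). It only shows that both sides arrive at the same secret without ever sending it.

```python
# Public parameters, known to everyone (toy-sized).
p, g = 23, 5

# Each side picks a private number and keeps it secret.
alice_private = 6
bob_private = 15

# Each side publishes g^private mod p.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines the other's public value with its own private one.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print(alice_public, bob_public)   # 8 19
print(alice_shared, bob_shared)   # 2 2
```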

------
lmm
I found _The Code Book_ a helpful read, though it's very much a high-level
overview.

If you want to really understand what's going on at low level, one option is
to just read the RFC and follow the references.

~~~
Kadin
I was going to recommend Singh's book as well. I have an older edition of it,
and some of it is certainly out of date (I think it has a whole chapter on
Freenet, which may still be around but these days has been eclipsed by Tor),
but the core principles are quite good. It's among the best explanations of
asymmetric cryptography that I've read, anyway.

------
acqq
The low level is more than well covered; it's the actual use of GPG in
different scenarios that isn't discussed enough.

To understand the low level you have to learn enough cryptography. For
example, to understand the logic of the RSA algorithm, read:

[https://simple.wikipedia.org/wiki/RSA_(algorithm)](https://simple.wikipedia.org/wiki/RSA_\(algorithm\))

~~~
kqr
How RSA works belongs to those snippets of random information that I do
possess, but can't reliably link together to get a full picture of the PGP
experience.

~~~
acqq
So what are you missing? There are of course other crypto primitives involved:
even when public key algorithms are used, the whole message is encrypted with
a symmetric cipher, and only the key for the message is encrypted with
public key cryptography. An example of a symmetric cipher is AES:

[https://en.wikipedia.org/wiki/Advanced_Encryption_Standard](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard)

but note that older GPG versions used weaker symmetric ciphers by default.

There's the RFC about the format of the OpenPGP message:

[https://tools.ietf.org/html/rfc4880](https://tools.ietf.org/html/rfc4880)

------
hackermailman
[https://www.nostarch.com/pgp.htm](https://www.nostarch.com/pgp.htm) for the
command line overview from a sysadmin perspective, then look at the PGP/public
key crypto section here (or read the whole thing):
[https://www.crypto101.io/](https://www.crypto101.io/)

------
gmluke
I wouldn't call it 'ground-up' but you may find this useful:
[https://www.gnupg.org/gph/en/manual.html](https://www.gnupg.org/gph/en/manual.html)

