How to Spoof PDF Signatures (web-in-security.blogspot.com)
222 points by furcyd 19 days ago | 45 comments

Cool. So it seems like PDF signatures are broken by design, not just implementation. If you can just append unsigned content to a signed PDF, you are toast.

Interestingly, Adobe has created a cottage industry of certificate providers and has strict security requirements like use of HSMs to prevent private keys from being exposed. As you can imagine, these certificates are stupidly expensive for what they do.

> Cool. So it seems like PDF signatures are broken by design, not just implementation. If you can just append unsigned content to a signed PDF, you are toast.

I don’t know that I’d say that. Seems like it'd be useful behaviour to be able to make post-signature modifications so long as it’s clear what they are.

As a concrete example, it’s the obvious way to implement a documented contractual negotiation: party A drafts a contract & signs it, then party B makes modifications & signs the bundle of (original contract, signature, modifications), then party A countersigns party B’s modifications & signature, indicating acceptance.

Or maybe I’m missing something?

I seem to remember that you could amend a signed PDF by adding a white rectangle over something you want to change (e.g. a pass/fail entry) and then placing the new text over the rectangle.

The original text was there, and hadn't been modified so the signature was still valid, but the superficial appearance of the document was misleading even if it was possible to identify the tampering if you knew what to do (which I suspect 99.9% of people reading PDFs would never bother doing).

Yeah, that would be bad — hence my proviso ‘so long as it’s clear what [the post-signature modifications] are.’

If modifications aren’t clearly called out, then that’s a (bad) bug in the viewer.

I think that's where certification was supposed to come in (it's been 10+ years since I worked with this stuff, so my recollection is a bit vague): you create the main structure of the document, lock that down with certification, and then allow people to complete existing fields, including adding signatures.

A PDF Signature can embed a signature reference which specifies things about what changes to the document are allowed without invalidating the signature. It is possible to make a signature that doesn't allow any changes to the file, or allows changes to annotations, or doesn't allow signature fields to be modified.

To learn more, search for "DocMDP" and "FieldMDP".
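For readers who don't want to dig through the spec, here is a rough sketch of the three DocMDP permission levels as I understand them from ISO 32000-1 (the `/P` entry of the DocMDP transform parameters). The `change_allowed` helper and its change labels are made up for illustration and are not part of any PDF library.

```python
from enum import IntEnum


class DocMDPPermission(IntEnum):
    """Values of /P in the DocMDP transform parameters (per ISO 32000-1)."""
    NO_CHANGES = 1               # any change invalidates the signature
    FORM_FILL_AND_SIGN = 2       # form filling, page templates, signing
    ANNOTATE_FORM_FILL_SIGN = 3  # level 2 plus annotation create/delete/modify


# Hypothetical change labels, used only by this sketch.
_ALLOWED = {
    DocMDPPermission.NO_CHANGES: set(),
    DocMDPPermission.FORM_FILL_AND_SIGN: {"form_fill", "sign"},
    DocMDPPermission.ANNOTATE_FORM_FILL_SIGN: {"form_fill", "sign", "annotate"},
}


def change_allowed(permission: DocMDPPermission, change: str) -> bool:
    """Return True if `change` is permitted after certification."""
    return change in _ALLOWED[permission]
```

A validator would still have to work out what kind of change an incremental update actually made, which is the hard part and is not shown here.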

The issue isn't the functionality, it's the design of the format. This design will lead to an endless slew of possible attacks. It is just a poor way of doing things.

What you want is to sign the whole file. It looks like the design here is a hack using the existing PDF format instead of adding a container to the PDF. If you want multiple parties signing the same document, you should have a container that allows you to append additional signatures to the container. The idea that you need to have a squiggly line inside the PDF to show that it's valid is really dumb in this context anyways. Instead the PDF reader should show you that "@raul signed this document on 2019-03-06 at 11:47:55 EST, here is his GPG signature", and that's it. No squiggly lines needed.

Pretty much all you talk about is supported inside PDF though.

But that’s not the point. You can’t treat the cryptographic signature as both a signature and content. Otherwise you are opening yourself up to all these different attacks.

I think you can use pretty much any certificate to sign PDFs - there is a separate option to certify PDFs that does indeed require special certificates and use of HSMs:


Edit: As an aside, it is amazing the number of people who think that PDFs can't be edited (and are amazed when even MS Word can do it these days).

> these certificates are stupidly expensive for what they do

Not everywhere, in my country you can get them for free, from institutional CAs included in the EU Trusted Lists that Adobe trusts: https://ec.europa.eu/digital-single-market/en/eu-trusted-lis...

Well, Adobe Reader 9 seems to catch all three attacks. Maybe the design is dumb but it can be saved: Don't accept a signature that excludes anything from hashing except the signature and certificates themselves. Although I hope that visual elements can't reference into the byte range of the signature/certificates.

From looking at the table at the end of the article, it would seem it is difficult to get PDF signature verification completely correct, but it is possible.

That said, having worked on PDF signature generation in the past, the spec is very complicated, partially due to the structure of PDF documents and how they can be appended, including signed additions.

PDF Signatures violate the The Cryptographic Signature Doom Principle, which is that your signing scheme will lead to doom unless the signature wraps the entirety of your document's binary representation. Same for XML signatures.

Violating the TCSDP doesn't mean that your scheme is technically wrong, but it does mean that implementations will often be wrong in funny ways.

"the The"

It's so important it requires two articles.

Especially complicated if you want the "LTV" thing to show up in Acrobat.

As far as I could determine (I've done it and Acrobat is happy), it requires fetching and embedding the necessary CRLs for the TSA (timestamp authority) cert chain before getting the signature, but you don't have the certs with the CRL URLs until after you get the signature. And the URLs can change over time (our TSA just switched its cert chain, but they did notify us ahead of time).

>As you can imagine, these certificates are stupidly expensive for what they do.

For a US-based certificate, you're looking at roughly $2,000 for 10k signatures via an Adobe-trusted Entrust certificate. Not cheap, correct.

> President Bill Clinton enacted a federal law facilitating the use of electronic and digital signatures in interstate and foreign commerce by ensuring the validity and legal effect of contracts. He approved the eSign Act by digitally signing it. (emphasis mine)

Am I the only one bothered by the fact that there's a bug in that procedure? Before he signed the act, digitally signing wasn't legal, and the act doesn't become law until it's signed, so... I understand why they did this, but as an engineer, I find this problematic XD

Or maybe it's a neat, built-in, self-destruct sequence for the law. i.e. The law is only valid if we're confident it's his signature. As soon as digital signatures are proven insecure, we can no longer be certain the act was signed, therefore it's not law.

Probably not the case, and just a PR gimmick, but still. I like to dream. :)

> there's a bug in that procedure

Are you sure? I don't think there is!

> Before he signed the act, digitally signing wasn't legal,

Is that true? I think this is the source of error here. In your quote citation (the full extent of my knowledge on this topic), it says that the bill being signed is to facilitate the use of those signatures, and to ensure the validity. It doesn't state that it is making e-signatures legal for the first time ever, it's stating that it is clarifying that they are for-sure good-to-go.

I don't think there is a bug in the procedure.

I caught that too, but the president doesn't have to sign a bill for it to become law. If it's unsigned for 10 days and Congress is in session, then it takes effect.

It seems the issues are twofold:

1. PDF file format doesn't support a "whole document signature" as a concept. Instead, it only supports signing arbitrary fragments of the document.

2. PDF tooling/software doesn't warn the user when a PDF contains some signed and some unsigned fragments. Or where fragments have been signed with different certificates.

I think that is the gist of it?

So, the document is a catalog of numbered objects. Think dictionaries and lists. The objects point to other objects and there is a table of contents near the end of the file showing where to find everything.

To be backwards compatible, the signature is one of these objects.

1. It's a whole document signature, but you can't sign the signature because you don't have it yet. So you have to leave a hole for the signature data.

2. The specification for where this hole is (/ByteRange) should be in the signed data, but some viewers do not appear to verify this.

So they're sticking a fake byte range on the signature, extending the hole enough to cover said fake byte range and additional objects (modified pages) and a fake table of contents (at the original offset from beginning of the file).

I'm describing one of the exploits as I understand it.
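A minimal sketch of the sanity check a viewer could run on /ByteRange, assuming a file with a single signature whose hole contains nothing but the hex string of /Contents. The function name and the regex-based lookup are my own simplifications; a real validator would parse the signature dictionary properly and also confirm that /ByteRange itself sits inside the signed bytes.

```python
import re

# Matches "/ByteRange [start1 len1 start2 len2]".
BYTE_RANGE_RE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def byte_range_covers_file(pdf: bytes) -> bool:
    """Check that the first /ByteRange leaves out only one hex string.

    Assumption: the file carries exactly one signature, so its two
    ranges must span the whole file apart from the /Contents value.
    """
    match = BYTE_RANGE_RE.search(pdf)
    if match is None:
        return False
    start1, len1, start2, len2 = (int(g) for g in match.groups())
    # First range must start at byte 0; second must end at end of file.
    if start1 != 0 or start2 < len1 or start2 + len2 != len(pdf):
        return False
    # The hole may hold "<hex digits>" and nothing else, so there is
    # no room to hide extra objects or a fake table of contents.
    hole = pdf[len1:start2]
    return re.fullmatch(rb"<[0-9A-Fa-f]*>", hole) is not None
```

If the hole has been stretched to swallow extra objects, the last check fails, and if anything is appended after the second range, the length check fails.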

To further complicate things, you can modify pdf files by appending more objects and a new TOC at the end. Think of this as append-only versioning.

Historically this is used for editing (giving you a revision history) or form filling (adding the text you typed in), but it also is used to add additional signatures. (You have to add to the already signed document if you want a second person to sign it.)

A signature on Version 2 of a file would encompass the entire file including the embedded Version 1 and its signature.

Readers like Acrobat can show you each version of the file. And they can show you the version of the file as a given person signed it.
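To make the append-only versioning concrete, here is a rough sketch that recovers each revision by cutting the file after every `%%EOF` marker. It assumes the marker never appears inside a stream or string, which a real parser cannot assume; `split_revisions` is a name I made up.

```python
def split_revisions(pdf: bytes) -> list[bytes]:
    """Return the file as it looked after each revision, oldest first."""
    marker = b"%%EOF"
    revisions = []
    pos = pdf.find(marker)
    while pos != -1:
        end = pos + len(marker)
        # Keep the line ending that follows the marker, if there is one.
        if pdf[end:end + 2] == b"\r\n":
            end += 2
        elif pdf[end:end + 1] in (b"\r", b"\n"):
            end += 1
        # Every revision is a prefix of the final file.
        revisions.append(pdf[:end])
        pos = pdf.find(marker, end)
    return revisions
```

Each element is a prefix of the next one, which is why a signature on Version 2 also covers Version 1 and its signature.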

Correct. And #1 is a really big problem because, while theoretically you could make a perfect piece of software that accounts for all fragments in current and future PDF format versions, it's not likely.

A bunch of universities are now releasing their official academic transcripts as signed PDFs, which I think is a very clever approach. The signatures have proper trusted certificates issued by major CAs, which makes verifying them very easy.

If you can spoof the signatures, well, now we can have people saying they got 4.0 GPAs from Stanford with legitimate signatures and all...

And yet, wouldn't any sane verifier just call up the school...?

Having the POC pdf files would be interesting (together with a plain invalidly signed pdf). I don't even know if these pdf readers check signatures: okular, evince, mupdf, qpdfview, sumatra.

POC files: https://www.pdf-insecurity.org/signature/viewer.html

And most of the readers you mention don't support signature verification - so they're perfectly secure from this attack ;-)

pdfsig from poppler-utils is vulnerable to SWA. I expect any poppler-based PDF readers (that actually bother with signatures) to have the same vulnerability.

edit: mutool sign is also vulnerable to SWA.

While this research is, as some have pointed out, mostly about implementation deficiencies in signature checking code, I want to point to my own earlier research that shows that the PDF standard is actually also inherently broken, as the method that is used to transform the document into the byte sequence that is fed into the signature mechanism is not reversible:


So, please don't think it's just a problem of incompetent implementations. Yes, these newly-found vulnerabilities are embarrassing and shouldn't have happened, regardless of how terrible the standard is, but just implementing the standard correctly (as far as that is even possible, given how vague it is in many regards, lacking a formal grammar and all that) won't result in cryptographically sound signatures.

In our time, forging a signature on a PDF file is a very simple task. I came across this question when there was a need for a fake signature, and a simple but very useful online editor of https://pdfmovavi.com/sign-pdf helped me. Since then there was no need for this, but in any case, now I know how to arrange it.

@angealbertini does a lot of cool work in this area. Check out this repo with examples of hash collisions for PDF, MP4, PNG, GIF, etc. https://github.com/corkami/collisions

We actually found some PDF signature collisions in the wild in Polar (https://getpolarized.io/).

One user found that a few of the documents he added conflicted with the signature of other PDFs.

I knew that the PDF signature wasn't strong but didn't think we would actually have real world conflicts.

We're already moving to SHA256 of the raw binary data instead of the fingerprint for newer documents added to your repository.

We're going to use the actual strong hashcode to enable "distributed" annotations so that two people annotating the same document can easily discover each other and share comments, highlights, flashcards, etc.

I'm pretty damn excited about this feature actually.
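For anyone curious, hashing the raw binary data of a document looks roughly like this. It is a generic sketch using Python's standard `hashlib`, not Polar's actual code; `read_chunks`, `sha256_hex` and `document_id` are names I picked for illustration.

```python
import hashlib
from typing import BinaryIO, Iterable, Iterator


def read_chunks(stream: BinaryIO, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the stream's raw bytes in fixed-size chunks."""
    while chunk := stream.read(chunk_size):
        yield chunk


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash the concatenation of all chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def document_id(path: str) -> str:
    """Identify a file by the SHA-256 of its raw bytes."""
    with open(path, "rb") as f:
        return sha256_hex(read_chunks(f))
```

Two people get the same ID only if their files are byte-for-byte identical, so a re-saved or re-downloaded copy with different bytes would count as a different document.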

Your link doesn't seem to have any such "PDF signature collisions", it's just a link to some product you're presumably hawking.

The article ("How to Spoof PDF Signatures") doesn't find any collisions or similar cryptographic weakness. Instead, as you'd expect, what they found was that software doesn't actually implement the cryptography very well. So what is in principle a perfectly good secure signature system can be "spoofed". If you use fixed software, the problem goes away.

It seems that this wouldn't be an issue if they had made the signature a separate file and had it hard coded to check the entire pdf file.

Or if at least it checked the whole file and just treated the exact block of signature bytes within as a set value (like Authenticode).
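A rough sketch of that second idea: hash every byte of the file except one block at a known position reserved for the signature. This is only an illustration of the principle, not Authenticode's actual algorithm; the function name is made up, and mixing the block's offset and length into the digest is my own addition so the block can't be moved or resized unnoticed.

```python
import hashlib


def digest_excluding_block(data: bytes, start: int, length: int) -> str:
    """SHA-256 over the whole file except data[start:start + length]."""
    if start < 0 or length < 0 or start + length > len(data):
        raise ValueError("signature block lies outside the file")
    digest = hashlib.sha256()
    # Bind the position and size of the excluded block into the hash.
    digest.update(start.to_bytes(8, "big"))
    digest.update(length.to_bytes(8, "big"))
    digest.update(data[:start])            # everything before the block
    digest.update(data[start + length:])   # everything after the block
    return digest.hexdigest()
```

The digest is the same before and after the signature bytes are filled in, but any other change, including appended data, produces a different value.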

I am looking into how to implement PDF signing with certificates for a side project I am trying to turn into a business (think signing an offer letter). Can anyone point me to some good resources for learning more? I have only a basic understanding of cryptography. Or if you'd be willing to chat sometime, I'd really appreciate it.

HelloSign has a pretty nice workflow!

NaCl/libsodium is a very highly regarded crypto library.

The Digital Signature of a message is based on the content of the message and the private key of the signer. Therefore a digital Signature is bound to a particular user and a specific message.

So, if I copy your digital signature and attach it to my message, the receiver (relying party) would be able to (instantly) determine (through a PKI-enabled application) that I have not signed the message.

I don't think you read the article.

It's hard to take a post like this seriously with so many spelling errors.

Or with statements like "We are quite familiar with the security of message formats like XML and JSON."

Perhaps XML and JSON are secure by virtue of the fact that their built-in security features can never be broken.

Not immediately obvious from the text indeed, but I'm certain they mean (the joy that is) XMLSEC and JWS here. Not sure about JWS but XMLSEC certainly has similarities to the format described here: you define exactly what parts are included in the signature.

English isn't the author's first language from the feel of it.
