This was a fixed-prefix (a.k.a. identical-prefix) collision attack. That means that for a fixed P they can make two documents (P | A | anything) and (P | B | anything), where A and B are different and the "anything" is the same in both, such that
SHA1(P | A | anything) = SHA1(P | B | anything)
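The Merkle-Damgård construction (used in MD4, MD5, SHA1 and SHA2, but not in SHA3 and some other modern hashes) processes the message block by block, carrying only a fixed-size internal state forward. That's the same property that makes length extension possible, and it means that if you can collide two equal-length documents then you can add the same suffix to both and also get a collision.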
This is why there's already a web site where you feed it images and it makes a "different" colliding PDF: it's just reusing Google's result with a different suffix after the 128-byte collision near the start.
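To see the suffix property concretely, here's a toy sketch. It is not SHA1 and not the real attack: `toy_md`, `compress` and `find_collision` are names I made up, and the hash has a deliberately tiny 16-bit state so a brute-force birthday search finds a colliding block pair (A, B) after a fixed prefix P almost instantly. The only point is that once the internal states match, any shared suffix keeps the digests equal.

```python
import hashlib
from itertools import count

BLOCK = 8   # bytes per message block (toy value)
STATE = 2   # bytes of internal state: 16 bits, trivially collidable
IV = b"\x00" * STATE


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:STATE]


def absorb(state: bytes, data: bytes) -> bytes:
    """Run the compression function over whole blocks of data."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_md(msg: bytes) -> bytes:
    """Merkle-Damgard hash: pad with 0x80, zeros, and the message length."""
    padded = msg + b"\x80"
    padded += b"\x00" * (-(len(padded) + 4) % BLOCK)
    padded += len(msg).to_bytes(4, "big")
    return absorb(IV, padded)


def find_collision(prefix: bytes) -> tuple[bytes, bytes]:
    """Birthday-search two distinct blocks that collide after `prefix`."""
    start = absorb(IV, prefix)  # prefix must be a whole number of blocks
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        out = compress(start, block)
        if out in seen:
            return seen[out], block
        seen[out] = block


P = b"%PDF-1.3" * 2            # fixed prefix, block-aligned
A, B = find_collision(P)
for suffix in (b"", b"image one", b"a completely different ending"):
    print(toy_md(P + A + suffix) == toy_md(P + B + suffix), suffix)
```

The search only ever runs once per prefix. After that, every new suffix is free, which is the whole trick behind reusing one published collision for arbitrarily many "different" document pairs.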