

First Look Media Releases PDF Redact Tools - patrickod
https://firstlook.org/code/project/pdf-redact-tools/

======
bazzargh
The PDFs this produces are simply collections of PNGs, so they won't be
accessible. It's always a compromise, though. If you edit the PDF by adding
black boxes and removing hidden objects, you may still leak data via the
tagged PDF text, which doesn't have to match what's on the page exactly. So
converting to PNG isn't a terrible idea, but it would be nice to combine this
with something that OCRs the converted PNGs, e.g.

[https://github.com/fritz-hh/OCRmyPDF](https://github.com/fritz-hh/OCRmyPDF)

(which uses Tesseract under the hood). The other thing this is missing,
compared with the commercial redaction tools I've used, is the ability to
assist in the redaction, e.g. removing SSNs, phone numbers, and all
occurrences of key phrases.
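
To make the "assisted redaction" idea concrete, here is a rough sketch of the detection step only. It assumes you already have plain text for a page (for example from an OCR pass) and flags candidate spans for a human to review. The patterns are simplified, US-centric assumptions, and the names `Hit`, `find_redaction_candidates` and `mask` are made up for illustration; none of this is part of PDF Redact Tools or OCRmyPDF.

```python
import re
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    kind: str   # "ssn", "phone" or "phrase"
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str   # the matched text itself


# Simplified patterns (assumptions): they will miss unusual formats and
# may flag things that are not really SSNs or phone numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)


def find_redaction_candidates(text: str, phrases: Iterable[str] = ()) -> List[Hit]:
    """Return candidate spans to redact, sorted by position in the text."""
    hits: List[Hit] = []
    for kind, pattern in (("ssn", SSN_RE), ("phone", PHONE_RE)):
        for m in pattern.finditer(text):
            hits.append(Hit(kind, m.start(), m.end(), m.group()))
    for phrase in phrases:
        if not phrase:
            continue  # an empty phrase would match at every position
        # Key phrases are matched literally, ignoring case.
        for m in re.finditer(re.escape(phrase), text, re.IGNORECASE):
            hits.append(Hit("phrase", m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: (h.start, h.end))


def mask(text: str, hits: Iterable[Hit], fill: str = "X") -> str:
    """Overwrite every hit with a single fill character, keeping the length."""
    chars = list(text)
    for h in hits:
        chars[h.start:h.end] = fill * (h.end - h.start)
    return "".join(chars)


if __name__ == "__main__":
    sample = "SSN 123-45-6789, call (555) 123-4567 about Project Falcon."
    found = find_redaction_candidates(sample, phrases=["project falcon"])
    for hit in found:
        print(hit)
    print(mask(sample, found))
```

Masking a text string like this only helps you preview what would be removed. The actual redaction still has to happen on the rendered page image, for the reason given above: text stored in the PDF doesn't have to match what's on the page.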

