
That's why I archive my document scans as one-bit-per-pixel PNGs. Each page ends up being 20KB-50KB at 150 PPI. I figure that there will always be a way to get the pixels out of a PNG. PDF is a more complex and dynamic standard.
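
To illustrate the "always a way to get the pixels out" point, here is a minimal sketch of writing and reading a one-bit grayscale PNG with nothing but the Python standard library. The function names are made up for this example, the reader only handles the non-interlaced, filter-type-0 files that this writer produces, and real scanner output would need the other PNG filter types too. For scale, assuming a US Letter page: 150 PPI gives 1275x1650 pixels, or 160 packed bytes per row and 264,000 bytes uncompressed, so 20KB-50KB is roughly 5x-13x compression.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def write_1bit_png(rows):
    """rows: list of equal-length lists of 0 (black) or 1 (white)."""
    height, width = len(rows), len(rows[0])
    raw = bytearray()
    for row in rows:
        raw.append(0)  # filter type 0 (None) for every scanline
        for i in range(0, width, 8):
            bits = row[i:i + 8]
            byte = 0
            for b in bits:
                byte = (byte << 1) | (b & 1)  # leftmost pixel in the high bit
            raw.append(byte << (8 - len(bits)))  # pad a short final byte
    # bit depth 1, color type 0 (grayscale), no interlace
    ihdr = struct.pack(">IIBBBBB", width, height, 1, 0, 0, 0, 0)
    return (PNG_SIGNATURE
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(bytes(raw), 9))
            + png_chunk(b"IEND", b""))


def read_1bit_png(png):
    """Recover the pixel rows from a PNG produced by write_1bit_png."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, width, height, idat = 8, None, None, bytearray()
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + data + CRC
        if kind == b"IHDR":
            width, height, depth, color, _, _, interlace = struct.unpack(
                ">IIBBBBB", data)
            if (depth, color, interlace) != (1, 0, 0):
                raise ValueError("sketch handles 1-bit grayscale only")
        elif kind == b"IDAT":
            idat += data
        elif kind == b"IEND":
            break
    raw = zlib.decompress(bytes(idat))
    stride = (width + 7) // 8  # packed bytes per scanline
    rows = []
    for r in range(height):
        start = r * (stride + 1)
        if raw[start] != 0:
            raise ValueError("sketch handles filter type 0 only")
        line = raw[start + 1:start + 1 + stride]
        rows.append([(line[i // 8] >> (7 - i % 8)) & 1 for i in range(width)])
    return rows
```

The whole format fits in a few dozen lines plus a DEFLATE implementation, which is the argument for it as an archival container.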



That's true, but PNGs can't have a text layer for searchability.


I'm pretty sure text files can reside in the same folder as PNG files.


Yes, but not all documents are trivial to convert to a text file because the layout can be quite complex. A PDF file can have little bits of text floating anywhere and when you search inside the file, you can see it highlighted at its actual position.


I've had to work with PDF files before, and they're absolutely horrible, precisely because of what you state: "little bits of text floating anywhere". Or things like disjoint, ungrouped lines for table drawing, instead of a generic table with formatting, width/height, etc.

I generally agree with you, though: PDF/A is quite a good way of storing documents for the long term. But that doesn't mean that PNG files along with text files, even with x:y coordinates next to the pieces of text, aren't a feasible alternative.
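
A rough sketch of what that sidecar idea could look like, assuming a made-up format of one `x<TAB>y<TAB>text` line per text fragment, with pixel coordinates into the matching PNG. The format and the function names are hypothetical, not any existing standard.

```python
def parse_sidecar(text):
    """Parse 'x<TAB>y<TAB>text' lines into (x, y, text) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        x, y, fragment = line.split("\t", 2)
        entries.append((int(x), int(y), fragment))
    return entries


def find_text(entries, query):
    """Return every fragment containing query, with its page position."""
    needle = query.lower()
    return [(x, y, frag) for x, y, frag in entries if needle in frag.lower()]


# Hypothetical sidecar for page-0001.png
sidecar = "120\t80\tInvoice No. 1234\n120\t140\tTotal due: 56.00\n"
hits = find_text(parse_sidecar(sidecar), "total")
```

Here `hits` holds the position to highlight on the page image, which covers the "see it at its actual position" use case without a PDF text layer.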


Consider archiving them as DjVu (http://djvu.sourceforge.net/). One-bit-per-pixel DjVu pages at 150 PPI will likely come to 2-5KB each instead.

DjVu also supports a text layer, just like PDF.

Note that 150 PPI is barely better than fax, so your documents will likely look 'faxed' if you ever have to output hard copies for some reason.


DjVu is patented. As a result, it is very possible that it will never achieve enough critical mass to be suitable for long-term archiving.


People who worry about these things professionally would generally veer towards TIFF if PDF were insufficient. PDF/A does things like embedding fonts and avoiding proprietary compression and encryption, to head off likely long-term failure scenarios.

In the US, permanently retained documents like court records are kept in PDF. It will be around.


TIFF. Often expanded as "Thousands of Incompatible File Formats".


PDF is certainly more complex, but it's an ISO standard and even has an archival version: PDF/A.


FWIW, PNG is ISO/IEC 15948.



