Ask HN: PDF format sucks. Why didn't the alternatives catch on?
3 points by behnamoh on Aug 11, 2023 | 6 comments
PDF is full of bugs. Editing PDF files (even a simple task like replacing a font with something else) is painfully difficult, ugly, and hacky.

Why hasn't the industry just switched to better solutions yet? Is it because PDF is the thing you get regardless of the typesetting program you used (e.g., Word, LaTeX, Markdown, HTML->PDF via "Save as PDF", etc.)? It seems to me that there must be a better way.



The content of a PDF file is not like the content of, say, an HTML or ODT file. With the latter you have plain text with formatting instructions, and the application has to do all the layout work itself: glyph positioning (which is already a hard task), paragraph layout (Where to break the lines? How to handle widows? ...) and so on.

A PDF file is essentially pre-rendered. The application creating the PDF has to do all the layout work mentioned above, and the PDF itself just contains instructions saying which glyph should be rendered at which exact position on the page.

This makes displaying or printing a PDF much easier (though it is still a hard task). It is also the reason why editing PDFs is hard: the additional information, like what is a paragraph, what is a heading ..., is usually not available.
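To make that concrete, here is a rough sketch of what a PDF producer ends up doing. Everything in it is a simplifying assumption for illustration: the function name `layout_to_content_stream`, the fixed glyph width, the greedy line breaking, and the font resource name `/F1` are all made up, and a real producer uses actual font metrics and emits a lot more than this. The operators themselves (`BT`, `Tf`, `Tm`, `Tj`, `ET`) are ordinary PDF text operators.

```python
def layout_to_content_stream(text, page_width=200.0, font_size=10.0,
                             char_width=6.0, left=20.0, top=700.0):
    """Greedy line breaking, then emit PDF text operators.

    Assumptions (not how a real producer works): every glyph is
    char_width units wide, and /F1 is a font resource defined elsewhere.
    """
    leading = font_size * 1.2          # distance between baselines
    ops = ["BT", f"/F1 {font_size:g} Tf"]
    x, y = left, top
    for word in text.split():
        width = len(word) * char_width
        # Break the line if the word would overflow the text column.
        if x > left and x + width > left + page_width:
            x = left
            y -= leading
        # Escape the characters that are special inside PDF string literals.
        escaped = (word.replace("\\", "\\\\")
                       .replace("(", "\\(")
                       .replace(")", "\\)"))
        # Set an absolute position, then show the word at that position.
        ops.append(f"1 0 0 1 {x:g} {y:g} Tm ({escaped}) Tj")
        x += width + char_width        # advance past the word plus a space
    ops.append("ET")
    return "\n".join(ops)


if __name__ == "__main__":
    print(layout_to_content_stream("Hello PDF world", page_width=70))
```

The output is only a list of "put this string at these coordinates" instructions. Nothing in it says that "world" was pushed to the second line of the same paragraph, which is roughly why changing a font or rewording a sentence after the fact means redoing the layout without the information the original application had.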

FYI: Tagged PDF has all that structural information and there are developments to allow e.g. reflowing of PDFs on smaller devices.


Better in what way? Sure, editing a PDF sucks, but why are you even doing that? The purpose of a PDF is to preserve formatting precisely. If you don't need that, you could use something else. You could output everything as an EPUB file, which uses HTML.


PDFs don't do an excellent job of preserving formatting either. Over the years I've looked at lots of PDF files that look quite different on screen and on paper.


Pretty much the whole book publishing industry runs on PDFs, and if they weren't reliable, that would change.


I'm also told that the publishing industry depends almost solely on "Word".

If this dependence on "Word" is true, then I bet that if a PDF malfunctions, they just shrug their shoulders, tell themselves "that's just the way it is", and move on. Because that's what you have to do when "Word" does something goofy, and "Word" is the Standard Word Processor.


An impressive feature set, widely available tools, standardization, and marketing:

https://en.wikipedia.org/wiki/History_of_PDF



