They are widely used and in a terrible format that's hard to work with (programmatically). So yes, there are problems.
My particular pet peeve is that many rendering platforms suffer from floating point rounding shenanigans, such that rendering/rasterizing an 8.5x11 doc at 300 dpi results in the height being off by a pixel :/
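To make the off-by-one concrete, here's a rough sketch of how it can happen. The assumptions are mine: page sizes are in PDF points (72 per inch, so US Letter is 612 x 792 pt), and the renderer precomputes a float scale factor and rounds up. `pixels_naive` and `pixels_exact` are made-up names for illustration, not any real renderer's API.

```python
import math
from fractions import Fraction

POINTS_PER_INCH = 72  # default PDF user-space unit: 1 pt = 1/72 inch


def pixels_naive(points, dpi):
    """Precompute a float scale factor, then round up (rounding-sensitive)."""
    scale = dpi / POINTS_PER_INCH      # 300 / 72 has no exact binary representation
    return math.ceil(points * scale)   # any tiny excess over a whole number adds a pixel


def pixels_exact(points, dpi):
    """Same computation in exact rational arithmetic, so no stray excess."""
    return math.ceil(Fraction(points) * Fraction(dpi) / POINTS_PER_INCH)


# US Letter height: 792 pt, i.e. 11 in, which should be 3300 px at 300 dpi.
print(pixels_exact(792, 300))   # 3300
print(pixels_naive(792, 300))   # can come out one pixel off, depending on float rounding
```

Multiplying before dividing (`points * dpi / 72`) sidesteps it for this particular case because the intermediate values are exact integers, but that's luck of the numbers rather than a general fix.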
My issue with PDFs is that they are hard to read on smaller screens, because the format isn’t natively easy to reflow.
I guess another issue today is RAG for AI: how can a PDF be chunked into meaningful pieces? Funny to see old methods such as document layout analysis being used for RAG.
Not necessarily. The clarity offered is helpful if you need to cite something, e.g. for a paper, and know exactly which page it's coming from - though this only works for book scans and is a somewhat specific use.
Or when you need everyone to have exactly the same words on exactly the same page. Quite helpful for standardization.
> e.g. for a paper, and know exactly which page it's coming from - though this only works for book scans and is a somewhat specific use.
I'd include that as "intended to be printed". I should have said "unsuitable for anything that isn't printed matter", though.
In the purely electronic world, there are much better ways of achieving the standardization you're talking about that don't come with the rather large downsides of PDFs.
Problem with docx is that there's no real way to _just_ view a file. I don't really want to spin up an entire word processor just to read the meeting minutes. I honestly much prefer a PDF to docx.
Someone could make a docx viewer easily enough if there was demand for it. It's just a zip file with XML files inside holding the contents. It's no LaTeX but it's serviceable.
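For what it's worth, a bare-bones text dump really is only a few lines. This is a minimal sketch, not a real viewer: it assumes the main body lives at `word/document.xml` with paragraphs in `w:p` and text runs in `w:t`, and `docx_paragraphs` is just a name I picked for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path_or_file):
    """Return the plain text of each paragraph in a .docx file."""
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```

It ignores styling, images, headers, footnotes and anything nested (text boxes could show up twice), which is exactly the part that makes a proper viewer more work than it first looks.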