
Especially with PDFs, my "sanitization" can be your "stripped away all the fonts and functionality - might as well have given me a plain .TXT", and vice versa.


"might as well have given me a plain .TXT"

Yes, please - that sounds fantastic.


I agree - but (1) it's surprisingly complicated as a general solution (positioning and such), and (2) it's not really a solution for the usual end user (who might appreciate a JPEG instead).


(btw there's `pdftotext`, which is pretty good in most cases)
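
For anyone who wants to script it, here is a minimal sketch of calling `pdftotext` from Python. It assumes the Poppler (or Xpdf) build of the tool is on your PATH; the helper names `build_pdftotext_cmd` and `pdf_to_text` are made up for illustration. `-layout` asks it to keep the physical layout as best it can, and passing `-` as the output file sends the text to stdout.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, keep_layout=True):
    """Build the argument list for pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout tries to preserve the original physical layout of the page
        cmd.append("-layout")
    # "-" as the output file name makes pdftotext write to stdout
    cmd += [str(pdf_path), "-"]
    return cmd


def pdf_to_text(pdf_path, keep_layout=True):
    """Run pdftotext and return the extracted text (hypothetical helper)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    # Decode leniently; extracted text is not always clean UTF-8
    return result.stdout.decode("utf-8", errors="replace")
```

Usage would be something like `print(pdf_to_text("report.pdf"))`, where `report.pdf` is a placeholder. Multi-column or heavily positioned pages may still come out scrambled, which is the "positioning and such" problem mentioned above.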



