I've noticed that when I view a PDF of an old book or article, I often can't select or copy the text. I assume this is because (1) text selection is disabled somehow, or (2) the document is essentially just a collection of page images. Does software exist that can convert a printed page with a lot of math notation into a truly digital document? I'm looking for the same level of quality as TeX. Thanks!
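The two guesses in the question can be roughly told apart without any PDF library. Below is a minimal sketch, assuming that scanning the raw bytes for a few standard PDF name tokens is good enough for a first look. The function names `inspect_pdf_bytes` and `guess_cause` are made up for illustration, and the heuristic can be fooled: newer PDFs may store font and image dictionaries inside compressed object streams, where these markers are not visible, and an encrypted file is not necessarily copy-restricted.

```python
import re


def inspect_pdf_bytes(data: bytes) -> dict:
    """Look for three telltale name tokens in raw PDF bytes (crude heuristic)."""
    return {
        # An /Encrypt entry means the file is encrypted, which is the usual
        # mechanism by which copy permissions are restricted.
        "encrypted": re.search(rb"/Encrypt\b", data) is not None,
        # Font dictionaries or font resources suggest real text is present.
        "has_fonts": re.search(rb"/Font\b", data) is not None,
        # Image XObjects with no fonts at all suggest scanned pages.
        "has_images": re.search(rb"/Subtype\s*/Image\b", data) is not None,
    }


def guess_cause(data: bytes) -> str:
    """Map the markers onto the two hypotheses from the question."""
    flags = inspect_pdf_bytes(data)
    if flags["encrypted"]:
        return "possibly copy-restricted"
    if flags["has_images"] and not flags["has_fonts"]:
        return "probably scanned images"
    if flags["has_fonts"]:
        return "probably has a text layer"
    return "unknown"
```

To try it on a file, read it in binary mode and pass the bytes in, for example `guess_cause(open("book.pdf", "rb").read())`, where `book.pdf` is a placeholder path. If the answer is "probably scanned images", text extraction tools will find nothing to extract and OCR is needed instead.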
Apache PDFBox (Java): https://pdfbox.apache.org
Previous discussion: https://news.ycombinator.com/item?id=11327493
For a list of other tools, see http://okfnlabs.org/blog/2016/04/19/pdf-tools-extract-text-a...
