This assumes you are cleaning up the output of `pdftotext`, which in my experience is the best command-line tool for extracting plain text from PDFs.