Hacker News new | past | comments | ask | show | jobs | submit login

You could use Libreoffice's command line interface to convert from .doc to a more manageable format.

  lowriter --convert-to odt some-document.doc
odt is not the only supported target, but doc --libreoffice--> odt --pandoc--> plain seems to give better results than e.g. doc --libreoffice--> txt or doc --libreoffice--> docx --pandoc--> plain.

if that's the case, i'll stick with catdoc. my use case is to create a full text search index of the content, trading libre office cli for catdoc, i'd rather just stick with catdoc, but thanks.

Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact