So anyway, a slightly long-winded way of saying sorry that it didn't work out as a business (though I was about to sign up!) and many, many thanks for open-sourcing it. I'll be installing it in the AM and am deeply grateful.
Aspose is pretty good, as it does not require Office interop to function. We've hit some limitations with it at work, such as dealing with PDF attachments and larger file sizes.
But anyway, yeah. I just wanted to note that I think the parent poster works there. I made a similar response a while back and they actually tweeted me in response to it, so I know they're on here. :)
I have something similar, but with a slightly different focus (image->image and pdf->image using Ghostscript/ImageMagick) here:
It would be good to combine it all into a single service.
It's a pity that Docverter didn't work out (so far?). Have you tried broadening the range of formats you cover a bit? Obviously (as I'm also working in the same space) I think there is a real need to fill here. Good luck!
For example: I'd very gladly pay a small monthly fee to use your API for my invoicing system which I'm working on right now. I already have a load of open source tech I rely on, and some things just make more sense to pay for (e.g. using GitHub instead of self-hosting something like GitLab - it's a tiny monthly fee that saves me hours of hassle). I'd much rather use a hosted API that I can integrate in minutes than spend hours potentially faffing about with installing stuff from the repository, and worrying about keeping it up to date, explaining it to outsourcers/team members, etc.
If I do end up using your solution in my biz, I'll gladly donate - make sure you have donate buttons up and prominently displayed!
As for keeping a hosted version running, I'll consider it. Your reasons for wanting one are the same reasons I built it in the first place.
I did talk to some companies who could use it to speed up SEC/EDGAR submissions, but they were trying to get it almost for nothing and still wanted customizations.
If it's multiple companies and they all want the same thing, I would do it.
That said, it's all about business development. Gotta put your sales hat on! Tell them you can do customizations, but to match their price you'll need a recurring $y amount for at least z months, with the first month free as a trial.
Of course I'm making a lot of assumptions about your willingness to add features and give the first month free... gotta start somewhere!
I sent in a little PR with some URL changes: https://github.com/Docverter/docverter/pull/1
Best part of LibreOffice?
1) It's fast
2) You can call it via a simple command to convert a file to another format, so it's as easy as 1-2-3 to integrate into your code. Convert to HTML, PDF, whatever, you name it :)
3) It's free
Also, just check out pandoc. You'll love it. Abiword works too.
$ pandoc -o output.html input.txt
$ abiword --to=doc filename.odt
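For anyone who wants to wire those commands into code, here is a rough Python sketch. It assumes the LibreOffice binary is on the PATH as `soffice` and uses its `--headless --convert-to` mode; the helper names (`libreoffice_cmd`, `pandoc_cmd`, `abiword_cmd`, `convert`) and the sample file names are made up for illustration, so treat it as a starting point rather than a tested integration.

```python
import subprocess
from typing import List


def libreoffice_cmd(src: str, target_format: str, outdir: str = ".") -> List[str]:
    """Build a headless LibreOffice conversion command.

    Assumption: the LibreOffice binary is installed and named `soffice`.
    """
    return ["soffice", "--headless", "--convert-to", target_format,
            "--outdir", outdir, src]


def pandoc_cmd(src: str, dest: str) -> List[str]:
    """Build the pandoc command from the comment above (format inferred from extensions)."""
    return ["pandoc", "-o", dest, src]


def abiword_cmd(src: str, target_format: str) -> List[str]:
    """Build the AbiWord command from the comment above."""
    return ["abiword", f"--to={target_format}", src]


def convert(cmd: List[str], timeout: float = 120.0) -> None:
    """Run a conversion command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True, timeout=timeout)


if __name__ == "__main__":
    # Only prints the commands; pass one to convert() to actually run it.
    print(libreoffice_cmd("report.docx", "pdf", "out"))
    print(pandoc_cmd("input.txt", "output.html"))
    print(abiword_cmd("filename.odt", "doc"))
```

Passing the command as a list (rather than a shell string) avoids quoting problems with odd file names, and the timeout keeps a stuck converter from hanging whatever is calling it.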
I'll be opening an API soon :)
Any "lessons learned" you can share on the venture?
Fortunately, the tables were the only parts I wanted! I needed to get them from the PDF into text (csv) form. So, from Word, I copied the tables, pasted them into Excel, and saved that as csv. Easy as 1-2-3-4-5!