

Show HN: Docverter, the hosted document conversion service, is now open source - zrail
http://bugsplat.info/2012-11-23-docverter-is-now-open-source.html

======
nlh
A remarkably odd bit of timing: I saw your initial announcement 6 weeks ago,
bookmarked your site for a project I've been working on, and just tonight
finally got to the stage where I'm ready to use Docverter. I went to the site
and was puzzled, because I'd remembered that this was a paid service, and I
couldn't figure out whether I'd bookmarked the wrong link or had gone crazy.
Then I checked the GitHub repo and saw everything committed ~10 hours ago, and
lastly, checked HN only to see the #1 story is your announcement. It usually
goes in reverse order! :)

So anyway, a slightly longwinded way of saying sorry that it didn't work out
as a business (though I was about to sign up!) and many many thanks for open-
sourcing it. I'll be installing in the AM and am deeply grateful.

~~~
zrail
I'll gladly accept your money if you'd still like to pay :)

~~~
nlh
Done and done :)

------
latchkey
Too bad you couldn't make it work as a business.

I have something similar, but a slightly different focus (image->image and
pdf->image using Ghostscript/Imagemagick) here:

<https://github.com/lookfirst/convert>

It would be good to combine it all into a single service.
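
For anyone curious what the pdf->image step looks like, here is a minimal
sketch of shelling out to Ghostscript from Python. It is not taken from the
linked repo; the function names, the `page-%03d.png` naming pattern, and the
150 dpi default are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def ghostscript_png_command(pdf_path, out_dir, dpi=150):
    """Build a Ghostscript command that renders each PDF page to a PNG.

    The helper name, output pattern, and default dpi are illustrative
    assumptions, not something from the linked project.
    """
    # Ghostscript itself expands %03d to the page number (001, 002, ...).
    out_pattern = str(Path(out_dir) / "page-%03d.png")
    return [
        "gs",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # non-interactive, restricted file access
        "-sDEVICE=png16m",                  # 24-bit colour PNG output device
        f"-r{dpi}",                         # render resolution in dpi
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def render_pdf_pages(pdf_path, out_dir, dpi=150):
    """Run the command above; requires the `gs` binary on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ghostscript_png_command(pdf_path, out_dir, dpi), check=True)
```

Keeping the command construction separate from the `subprocess.run` call
makes it easy to log or inspect the arguments before anything is executed.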

~~~
e98cuenc
Shameless plug: I have a web service (<http://thumbr.it/>) similar to your
project, but I also added Instagram-like filters for the images, and it
converts Word and PDF files to images. I'm adding HTML and PDF as output
formats and HTML as an input format.

It's a pity that Docverter didn't work out (so far?). Have you tried
broadening the range of formats that you cover a bit? Obviously (as I'm also
working in the same space) I think there is a real need to fill here. Good luck!

------
josscrowcroft
My advice, for what it's worth, and based on my experience building and
running <https://openexchangerates.org>, would be to continue to offer it free
(in your case, open sourced) but provide the hosted version as a service.

For example: I'd very gladly pay a small monthly fee to use your API for my
invoicing system which I'm working on right now. I already have a load of open
source tech I rely on, and some things just make more sense to pay for (e.g.
using GitHub instead of self-hosting something like GitLab - it's a tiny
monthly fee that saves me hours of hassle). I'd much rather use a hosted API
that I can integrate in minutes than spend hours potentially faffing about
with installing stuff from the repository, and worrying about keeping it up to
date, explaining it to outsourcers/team members, etc.

If I do end up using your solution in my biz, I'll gladly donate - make sure
you have donate buttons up and prominently displayed!

~~~
zrail
Donate buttons are a good idea, thanks. I'll have one up later today.

As for keeping a hosted version running, I'll consider it. Your reasons for
wanting one are the same reasons I built it in the first place.

------
z-factor
I've implemented a very nice Word-to-HTML converter previously, but market
research has shown that people are barely willing to pay for such a service.
Maybe related consulting services can make it worthwhile for you. Good luck!

~~~
namank
This is a B2B play. Talk to companies that would find this feature useful.

~~~
z-factor
Any suggestions as to what companies to try?

I did talk to some companies who could use it to speed up SEC/EDGAR
submissions, but they were trying to get it almost for nothing and still
wanted customizations.

~~~
namank
Customizations..? Is this SaaS or a standalone package?

If it's multiple companies and they all want the same thing, I would do it.

That said, it's all about business development. Gotta put your sales hat on!
Tell them you can do customizations but to match their price, you'll have to
do recurring $y amount for at least z number of months with the first 1 month
free for trying it.

Of course I'm making a lot of assumptions about your willingness to add features
and giving the first month free...gotta start somewhere!

------
themgt
Thanks for this. It boots pretty much out-of-the-box on our platform thanks to
the Heroku support: <http://docverter.a.pogoapp.com/> - very cool buildpack
use BTW

I sent in a little PR with some URL changes:
<https://github.com/Docverter/docverter/pull/1>

~~~
zrail
Thanks, merged.

------
frozenport
Is it illegal to run a Microsoft Office web based front end? Install a copy of
Office 2010 and have it scripted to open and export files.

~~~
mongol
It is not illegal, but it violates the license.

~~~
carlospaulino
Can you expand on this?

~~~
ams6110
Probably meaning it is not illegal in the criminal sense but violates the
terms of the license (which is a civil/contract law issue).

------
z0a
This is neat! Thanks for making it open source.

------
zrail
Thanks for the incredible response, everyone. For the people wanting to
donate, I've put up a Stripe-powered donation page here:
<https://docverter-donate.herokuapp.com/donate>

------
earroway
Thank you for making the product open source.

Any "lessons learned" you can share on the venture?

------
aleemb
Is there something good for PDF-to-EPUB conversion?

~~~
zrail
Not really. The problem is that PDF is basically a destination format.
Converting to PDF strips all of the semantics out of it, leaving you with
plain text, fonts, and boxes. The latest versions of the official Adobe
Acrobat Reader are able to convert PDF to Doc but I have no idea what the
quality is like.

~~~
sliverstorm
It's probably possible to do, but nobody's needed one badly enough to do it.

~~~
zrail
There is actually an Apache project that can extract the text from a PDF. It
does a passable job, but like I said all of the formatting is gone.

<http://pdfbox.apache.org/userguide/text_extraction.html>

~~~
Toshio
There is a very good pdf-to-html converter at [0], so it's a two-step process.

[0] <https://github.com/coolwanglu/pdf2htmlEX>
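
A rough sketch of what that two-step process might look like when scripted.
The comment above doesn't name a tool for the second (HTML to EPUB) step, so
pandoc here is purely an assumption, as are the function names and output
filenames. pdf2htmlEX produces layout-faithful HTML, so the resulting EPUB may
well need manual cleanup.

```python
import subprocess
from pathlib import Path


def pdf_to_epub_commands(pdf_path):
    """Return the two commands for a PDF -> HTML -> EPUB pipeline.

    Step 1 uses pdf2htmlEX; step 2 uses pandoc, which is an assumed choice
    for the HTML -> EPUB conversion. Output files land in the current
    working directory, named after the input PDF.
    """
    stem = Path(pdf_path).stem
    html_name = stem + ".html"
    epub_name = stem + ".epub"
    to_html = ["pdf2htmlEX", str(pdf_path), html_name]
    to_epub = ["pandoc", "-f", "html", "-t", "epub", "-o", epub_name, html_name]
    return [to_html, to_epub]


def convert(pdf_path):
    """Run both steps in order; needs pdf2htmlEX and pandoc on PATH."""
    for cmd in pdf_to_epub_commands(pdf_path):
        subprocess.run(cmd, check=True)  # stop at the first failing step
```

Calling `convert("book.pdf")` would, under these assumptions, leave
`book.html` and `book.epub` in the current directory.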

------
shock
Thank you!

