
24 Days of Hackage: pandoc - mercurial
http://ocharles.org.uk/blog/guest-posts/2013-12-12-24-days-of-hackage-pandoc.html
======
tikhonj
Pandoc is a wonderful tool and by extension a wonderful library. It takes a
simple idea--convert between different document formats--and executes it very
well. It then turns out that this simple idea is extremely handy.

This article about the API shows how it uses a great design: the actual tool
is written as a frontend to a generic library. This way, the actual logic is
not tied to the frontend at all; the dependency flows from the logic to the
interface. Since it's designed as a library first, the API is a first-class
citizen. Using it never feels like scripting an app, because it isn't.
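
As a rough illustration of the "frontend over a generic library" shape described above, here is a minimal sketch. It is not Pandoc's code: `to_html` and `main` are hypothetical names, and the converter is a toy that only knows `# ` headings and plain paragraphs and does no HTML escaping. The point is only that the logic is a pure function and the frontend does nothing but I/O.

```python
import sys


def to_html(markdown):
    """Library core: a pure function with no I/O.

    Toy converter (an assumption for this sketch): blocks are separated by
    blank lines; a block starting with '# ' becomes <h1>, anything else <p>.
    """
    blocks = [b.strip() for b in markdown.split("\n\n") if b.strip()]
    out = []
    for block in blocks:
        if block.startswith("# "):
            out.append(f"<h1>{block[2:]}</h1>")
        else:
            # Join wrapped source lines into a single paragraph.
            out.append(f"<p>{' '.join(block.splitlines())}</p>")
    return "\n".join(out)


def main(argv, stdin=sys.stdin, stdout=sys.stdout):
    """Frontend: argument handling and I/O only; all logic lives in to_html."""
    if len(argv) > 1:
        with open(argv[1], encoding="utf-8") as handle:
            text = handle.read()
    else:
        text = stdin.read()
    stdout.write(to_html(text) + "\n")
    return 0
```

Anything else (a site generator, a test, another tool) can import `to_html` directly without going through the command-line wrapper.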

I really like this way of separating concerns. It makes the "meat" of the
program so much more useful and reusable. Moreover, there is no reason not to
use this approach for GUI apps as well, but I see it all too rarely.

It would be particularly nice for IDEs. There's no reason why all the great
refactoring and analysis tools need to be tied to a given frontend. I'd rather
have the two clearly split to make the backend much easier to reuse and
repurpose.

This sort of design is implicitly encouraged by Haskell (as in the case of
Pandoc). The language naturally pushes you to carefully split the logic and
the interface, partly through managing IO explicitly. Even the build tool
pushes you in this direction, making it very easy to build both a library and
an executable at the same time. Of course, this is also very easy to do in
other languages, although I've found it a bit easier to mix concerns in an OOP
setting. It's just too easy to add a render method to your Document class
instead of maintaining a strict separation; this is especially true if
Document contains extensive private state.

I really hope future tools will continue with this sort of split design so
that building new things on top of them continues to be easy. I've already
benefited from this with Pandoc: the static site generator I use is built on
top of Pandoc, giving it some very nice capabilities. Since it uses Pandoc as
a library, it's more powerful than just passing files through the tool--the
configuration can also use and depend on variables in the Pandoc templates,
for example, which is very useful and requires programmatic access to Pandoc's
internals.

------
gwern
Pandoc is a great tool for anyone using Markdown to write, too, above and
beyond the conversion capabilities. I use it to write my website gwern.net
using Pandoc+Hakyll, but I also use Pandoc for other things because of its
API.

I generate my book review page
([http://www.gwern.net/Book%20reviews](http://www.gwern.net/Book%20reviews))
from my GoodReads CSV export, but that's not all.

I also have functions to add support for Wikipedia links in my Markdown files
(so I can write '[Adolf Hitler](!Wikipedia)' rather than '[Adolf
Hitler](http://en.wikipedia.org/wiki/Adolf_Hitler)', which is convenient and a
huge time-saver for longer, more complex article titles).
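
A rough sketch of the link shorthand described above. This is not the actual implementation, which works on the document through the Pandoc API. As a stand-in, the sketch rewrites the Markdown source with a regular expression. The `!Wikipedia` marker and the target URL shape come from the comment; the function name and the regex are assumptions.

```python
import re
from urllib.parse import quote

# Matches inline links whose target is the literal marker "!Wikipedia",
# e.g. [Adolf Hitler](!Wikipedia).
WIKI_LINK = re.compile(r"\[([^\]]+)\]\(!Wikipedia\)")


def expand_wikipedia_links(markdown):
    """Replace [Title](!Wikipedia) with a full en.wikipedia.org link."""
    def repl(match):
        title = match.group(1)
        # Wikipedia article URLs use underscores in place of spaces;
        # quote() percent-encodes anything else that is unsafe in a URL path.
        slug = quote(title.replace(" ", "_"))
        return f"[{title}](http://en.wikipedia.org/wiki/{slug})"
    return WIKI_LINK.sub(repl, markdown)
```

Ordinary links are left untouched, so the shorthand can be mixed freely with regular Markdown links.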

And besides that, I have static checks of my pages via the Pandoc API: I have
2 scripts which parse Markdown files, 'markdown-length-checker' and
'markdown-footnote-length'. The length checker looks for the case where I have
too many spaces and a line suddenly becomes a fixed-width line of 200
characters or something awful like that (surprisingly easy to do with lists).
The footnote checker parses the file for all the footnotes, extracts them, sees
whether their bodies are longer than 2400 characters, and warns if any are
(since 2400 characters means the footnote needs some work).
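
A minimal sketch of the footnote check, under stated assumptions. The real script parses the file with Pandoc; this stand-in scans the Markdown source directly and only understands the `[^label]: text` definition form, with continuation lines indented by four spaces or a tab. The 2400-character threshold is from the comment; the function names are hypothetical.

```python
import re

MAX_FOOTNOTE_CHARS = 2400  # threshold quoted in the comment above

# Start of a footnote definition such as "[^1]: body text".
FOOTNOTE_START = re.compile(r"^\[\^([^\]]+)\]:\s?(.*)$")


def footnote_bodies(markdown):
    """Collect footnote definitions as {label: body}.

    Simplification: a definition continues over following lines that are
    blank or indented by four spaces or a tab, and ends at the first
    non-blank line without that indent.
    """
    notes = {}
    current = None
    for line in markdown.splitlines():
        start = FOOTNOTE_START.match(line)
        if start:
            current = start.group(1)
            notes[current] = [start.group(2)]
        elif current is not None and line.startswith(("    ", "\t")):
            notes[current].append(line.strip())
        elif current is not None and line.strip() == "":
            continue  # a blank line may separate paragraphs inside a footnote
        else:
            current = None
    return {label: " ".join(part for part in parts if part)
            for label, parts in notes.items()}


def long_footnotes(markdown, limit=MAX_FOOTNOTE_CHARS):
    """Return (label, length) for every footnote body longer than `limit`."""
    return [(label, len(body))
            for label, body in footnote_bodies(markdown).items()
            if len(body) > limit]
```

Working on Pandoc's parsed document instead of raw text would avoid the simplifications here, for example inline footnotes and footnote markers inside code blocks.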

------
bcjordan
Fell in love with Pandoc after seeing my friend's online course notes[0]
processed using it. Since then I have been playing with it to simul-publish
HTML[1] and PDF[2] cheat sheets from markdown for a course I'm working on[3].

It's worked great. The one thing it's missing is a good way to do image size
adjustment using Markdown (but it appears to be in progress).

[0]: [http://www.asimihsan.com](http://www.asimihsan.com)

[1]:
[https://googledrive.com/host/0BxJBBicEiS8aYlhGb1Y0cXFBUlU/ou...](https://googledrive.com/host/0BxJBBicEiS8aYlhGb1Y0cXFBUlU/output.html)

[2]:
[https://googledrive.com/host/0BxJBBicEiS8aYlhGb1Y0cXFBUlU/ou...](https://googledrive.com/host/0BxJBBicEiS8aYlhGb1Y0cXFBUlU/output.pdf)

[3]: Warning: those are old, buggy copies; the notes have since been updated.

~~~
danieldk
_Fell in love with Pandoc after seeing my friend's online course notes[0]
processed using it._

For websites, it's even more brilliant in conjunction with Hakyll, a static
site generator that allows you to create site generation logic via an EDSL.

[http://jaspervdj.be/hakyll/](http://jaspervdj.be/hakyll/)

------
anonfunction
Looks great, but the barrier to entry is high, as I (and, I'm sure, others
here) have no experience with Haskell.

Have you considered a hosted solution with an API? I'm sure plenty of people
would be willing to pay for it on an API marketplace, for instance.

~~~
zrail
Check out Docverter[1]. It's a host-your-own pandoc + HTML-to-PDF API. It
doesn't support all of pandoc (for example, PDFs are generated using Flying
Saucer instead of LaTeX), but it's an extremely useful subset.

(disclaimer: I wrote docverter, it's open source)

[1]: [http://www.docverter.com](http://www.docverter.com)

~~~
mercurial
Looks pretty cool. But it's not monetized?

