
Writing a Book with Pandoc, Make, and Vim - halst
https://keleshev.com/my-book-writing-setup/
======
leephillips
I’m always interested in reading articles like this, as I like to see the
setups that people come up with to produce books and documents. I didn’t know
about set virtualedit=all in vim!

If you learn how to extend Pandoc with your own filters, which you can write
in several languages, there is no limit to what you can do. Here’s the
description, published in the sadly defunct _Linux Journal_ , of the system I
created to help me write a book about gnuplot:

[https://lee-phillips.org/panflute-gnuplot/](https://lee-phillips.org/panflute-gnuplot/)
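
For readers who have not seen a Pandoc filter, here is a minimal sketch of the idea. It is not the panflute-based system from the linked article: it uses only the standard library and works on the JSON form of the document that Pandoc hands to a filter. The `gnuplot` class name and the `plots/<id>.svg` naming scheme are assumptions made for illustration, and the step that would actually run gnuplot is left out.

```python
import json


def walk(node, action):
    """Apply `action` to every AST element (a dict with a "t" key), depth first."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node)
            if replacement is not None:
                return replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def gnuplot_to_image(elem):
    """Swap a code block carrying the class `gnuplot` for a paragraph with an image."""
    if elem["t"] != "CodeBlock":
        return None
    (ident, classes, _keyvals), _script = elem["c"]
    if "gnuplot" not in classes:
        return None
    # Hypothetical naming scheme; a real filter would run gnuplot on _script here.
    target = f"plots/{ident or 'plot'}.svg"
    image = {"t": "Image", "c": [["", [], []], [], [target, ""]]}
    return {"t": "Para", "c": [image]}


def apply_filter(document_json):
    """Take the document as a JSON string and return the transformed JSON string."""
    return json.dumps(walk(json.loads(document_json), gnuplot_to_image))
```

A script that reads standard input, passes it through `apply_filter`, and writes the result to standard output could then be handed to Pandoc with `--filter`.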

~~~
ddrt
Everything in that link can be done with InDesign, given the data. Doing the
same from the console or with alternative applications would mean hacking the
InDesign app or finding some IFTTT-style automation when needed, then saving
the result as a high-res image and referencing it as a link in your console
layout doc. At the end it would have to compile as an image into something
(might as well be InDesign), and at that point why not just lay out the book
with InDesign from the start? You would write the book in a text doc with some
tagged markdown for rules, link the text so it flows into styled text boxes
that have rules assigned to them, and generate all the charts and sheets
necessary to complete it. Visual communication isn’t a strong point of code
interfaces.

~~~
leephillips
I’m not sure I understand your comment, but I believe InDesign is a
proprietary, closed-source product, probably driven mainly through a GUI. My
goal was to write my book in vim. All I need to do is type, and the book comes
out, including a visual index of all the plots in the book. Every link in the
chain, and every tool I used, is open source (and free). The result is
_exactly_ what I want. To each his own, but the project, described in my
article, is to create an interface for me as an author. That interface is
typing in vim, using a set of tags I created for the purpose.

------
airstrike
I know this is tangential, but I would love for someone to talk about writing
a more visual type of book, full of images, tables and charts for the business
world.

A table like the one in the first screenshot of this post works well because
the author is not repeatedly iterating on it, there's very little text and
information flows top-to-bottom very neatly. That's great, but it's also
extremely basic.

Take a look at something like
[https://www.jpmorgan.com/jpmpdf/1320605428574.pdf](https://www.jpmorgan.com/jpmpdf/1320605428574.pdf)
and imagine writing _that_. How do you lay things out on a page? How do you
make content fit a layout? There's no grid.

The reality is people use PowerPoint to do that, but PowerPoint is a slide
authoring tool that assumes you have a few bullets, maybe one or two images
per slide.

Dense presentations make its shortcomings obvious and quite painful.

It boggles the mind that with all of the resources dumped into CSS/JS and web
development in general, nobody has leveraged that experience to build an
authoring tool that's 21st-century ready, with version control, with a clear
separation but nonetheless linked relationship of raw data, actual content
output and formatting and final publishing into PDF.

What am I missing?

EDIT: one more example for good measure
[https://www.jefferies.com/CMSFiles/Jefferies.com/files/W%201...](https://www.jefferies.com/CMSFiles/Jefferies.com/files/W%201420%201%20Al%20Noor%20Hospitals%20Group.pdf)

~~~
Symbiote
Those aren't books, they are presentation slides.

Using PowerPoint, for every slide the author chose (potentially) a different
PowerPoint template (2×1 columns, 2×2 etc). They have complete freedom to
"break" the structure, such as with callouts pointing to the "other" column,
images going beyond the margins.

An automatic template removes this flexibility, but allows scripting or
rebuilding the document with different text/data. That's the compromise.

Remark.js achieves some of the most basic parts of this, but would need some
fiddling to add some CSS grid support and/or default templates:
[https://remarkjs.com/](https://remarkjs.com/) (Except for being ugly,
[http://mobmad.github.io/js-tdd-erfaringer/](http://mobmad.github.io/js-tdd-erfaringer/)
shows some possible structure with Remark.js).

~~~
marvindanig
Going by the strict definition of a book [1], a file, a webpage or a website
isn’t a book either.

[1]
[https://en.m.wikipedia.org/wiki/Book](https://en.m.wikipedia.org/wiki/Book)

------
Klasiaster
It's even possible to replace (Xe)LaTeX with WeasyPrint¹, a Python HTML-to-PDF
converter. It supports two columns via CSS, automatic CSS hyphenation, CSS page
counters and embedded SVGs. I just needed an HTML header with CSS in the
markdown file.

    
    
        $ pandoc --filter pandoc-citeproc --csl ieee.csl --bibliography=paper.bib --smart --normalize -f markdown+multiline_tables+inline_notes -t html5 -V margin-top:0.5in -V margin-bottom:0.5in -V margin-left:0.5in -V margin-right:0.5in -o output.html input.md
        $ python3 -c "from weasyprint import HTML; HTML('output.html').write_pdf('output.pdf', presentational_hints=True)"
    

For LaTeX-style math equations I added mathjax-pandoc-filter² as filter to the
pandoc args:

    
    
        --filter ~/node_modules/.bin/mathjax-pandoc-filter -Mmathjax.centerDisplayMath -Mmathjax.noInlineSVG
    

¹ [https://weasyprint.org/](https://weasyprint.org/)

² [https://github.com/lierdakil/mathjax-pandoc-filter](https://github.com/lierdakil/mathjax-pandoc-filter)

~~~
leephillips
This is a very interesting (open source) project that I didn’t know about;
thank you for mentioning it.

But it doesn’t replace LaTeX, as it doesn’t produce the same results. A glance
at the sample documents reveals the ugly typography resulting from the word-
processing layout strategy employed in web browsers. This is confirmed in the
documentation. So this could be useful if you have an existing set of HTML
pages that you need to convert to PDFs, but, if you’re starting a project
where you want to produce both HTML and PDF, this should not be part of the
solution.

~~~
j88439h84
I can't tell the difference between this layout quality and LaTeX. What are you
noticing?

~~~
leephillips
The first things that jump out are the large and uneven gaps between words and
the “color” variations among paragraphs. What I mean by the “word-processing
layout strategy” is the algorithm where, when you run out of space on a line,
you simply break the line at the end of the previous word, fill up the space
(for justified text) by expanding the spaces between words, and begin the next
line. When you get to the end of the paragraph you go on to the next one. The
TeX layout engine, in contrast, makes several passes over each paragraph,
adjusting the line breaking (including hyphenation) in order to optimize its
appearance (which includes such things as trying to avoid successive
hyphenated lines); then, when the page is set, it goes over the entire page to
try to equalize the density, or color, among paragraphs.
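
The difference between the two strategies can be shown with a toy example. This is only a sketch under simplifying assumptions: monospaced text, no hyphenation, and a cost equal to the squared leftover space on every line except the last. TeX's actual paragraph builder (the Knuth-Plass algorithm) scores stretchable spaces and hyphenation penalties, which this does not attempt. `greedy_breaks` is the word-processor approach described above; `optimal_breaks` considers the whole paragraph before committing to any break.

```python
def greedy_breaks(words, width):
    """Fill each line as far as possible, then move on (first-fit)."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        if current and len(" ".join(candidate)) > width:
            lines.append(current)
            current = [word]
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def optimal_breaks(words, width):
    """Choose breaks that minimise total squared slack over the whole paragraph."""
    n = len(words)
    best = [None] * (n + 1)  # best[i] = (cost, next_break) for words[i:]
    best[n] = (0, n)
    for i in range(n - 1, -1, -1):
        length, choice = -1, None
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width and j > i + 1:
                break  # words[i:j] no longer fits on one line
            slack = max(width - length, 0)
            cost = (0 if j == n else slack ** 2) + best[j][0]  # last line is free
            if choice is None or cost < choice[0]:
                choice = (cost, j)
        best[i] = choice
    lines, i = [], 0
    while i < n:
        j = best[i][1]
        lines.append(words[i:j])
        i = j
    return lines


def raggedness(lines, width):
    """Sum of squared leftover space on every line but the last."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])
```

For the words `aaa bb cc ddddd` at width 6, the greedy version strands `cc` alone on a line, while the optimising version moves `bb` down to sit next to it, which gives more even lines.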

~~~
yiyus
Maybe you already knew about it, but the microtype package improves the aspect
of your documents even more:
[https://ctan.org/pkg/microtype](https://ctan.org/pkg/microtype)

------
RMPR
To emulate the live preview, there is a neat piece of software called entr[1].
From their main page, you can do something like:

    
    
        ls | entr make
    

And whenever you save a change, the build is triggered and the preview is
updated.

[1]: [http://eradman.com/entrproject/](http://eradman.com/entrproject/)

~~~
JoshMcguigan
The pipeline approach here is interesting, but it seems you'd be on your own
for filtering out changes in the build directory, etc. I typically use
watchexec [1] for this.

    
    
        watchexec make
    

By default, watchexec will filter out changes in files based on `.gitignore`.

[1]:
[https://github.com/watchexec/watchexec](https://github.com/watchexec/watchexec)

~~~
RMPR
This is pretty good, didn't know about watchexec, but you can achieve the same
by choosing carefully the command you pipe from, for example:

    
    
        ls *.md | entr make
    

This will only trigger the build if a Markdown file is modified, which is what
I think the author is interested in.
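
The same "only rebuild when a Markdown file changes" idea can be sketched in Python. This polls modification times, which is a crude stand-in for what dedicated watchers do; the `.md` suffix, the one-second interval, and the `make` command are illustrative defaults, not taken from either tool.

```python
import os
import subprocess
import time


def snapshot(directory, suffix=".md"):
    """Map every matching file in `directory` to its modification time."""
    return {
        entry.path: entry.stat().st_mtime_ns
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(suffix)
    }


def changed_files(before, after):
    """Return the sorted paths that are new or modified between two snapshots."""
    return sorted(path for path, mtime in after.items() if before.get(path) != mtime)


def watch(directory, command=("make",), interval=1.0):
    """Run `command` whenever a watched file changes. Loops until interrupted."""
    previous = snapshot(directory)
    while True:
        time.sleep(interval)
        current = snapshot(directory)
        if changed_files(previous, current):
            subprocess.run(list(command), check=False)
        previous = current
```

Because only files matching the suffix are tracked, output written to a build directory or a regenerated PDF does not retrigger the build.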

------
frozenlettuce
After reading a couple posts here on HN about building a "second brain", I
found a surprisingly effective setup to do that:

- Vim with vimwiki
([https://github.com/vimwiki/vimwiki](https://github.com/vimwiki/vimwiki))

- A private Gitlab repo

- A simple cron job to commit all changes in `~/.vimwiki` to my private repo

And that is it! It would be possible to publish the wiki on the web using
Gitlab pages, but so far it is working nicely for me.

------
ggambetta
Similar story here. I wrote and self-published a novel, both for e-readers and
paperback, using only open-source tools, mainly around Pandoc. I wrote some
more details here:
[https://gabrielgambetta.com/tgl_open_source.html](https://gabrielgambetta.com/tgl_open_source.html)

------
Finnucane
"SVG is well supported with EPUB"

SVG is part of the standard, but not well supported by all epub reading
systems. Some displays will fail, some will display as small non-scalable
images. Apple's iBooks reader is one of the better ones in that regard.

------
LeonM
I was building a new API recently, and was looking for a good documentation
solution.

The commercial cloud based solutions (Gitlab, Confluence, et al) are pretty
good, but you have to keep paying or your documentation disappears. Self
hosted Wiki or documentation solutions were also out, due to the pain of
migrating content in and out.

We ended up with a very simple solution of Markdown + CSS + Pandoc + make.
Pandoc takes the CSS and MD files as input, and outputs HTML. The MD files are
in the API repository, and deployment has been set up so that the latest
documentation is deployed automatically with each API update.
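
A rough sketch of what the make rule in such a setup boils down to, written in Python for illustration. The file names and the `public` output directory are hypothetical; the Pandoc options used (`--standalone`, `--css`, `--output`) are standard ones, but the exact invocation in the setup described above is not known.

```python
from pathlib import Path


def html_target(source, out_dir):
    """Place the HTML output next to its siblings, named after the source file."""
    return Path(out_dir) / (Path(source).stem + ".html")


def pandoc_command(source, css, out_dir):
    """Build the argument list for converting one Markdown file to standalone HTML."""
    return [
        "pandoc", "--standalone", "--css", str(css),
        "--output", str(html_target(source, out_dir)), str(source),
    ]


def needs_rebuild(source, target):
    """Mimic make: rebuild when the target is missing or older than its source."""
    source, target = Path(source), Path(target)
    return not target.exists() or target.stat().st_mtime < source.stat().st_mtime
```

The argument list can be passed to `subprocess.run` for each source file where `needs_rebuild` returns true.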

~~~
bryan2
Excuse me if this is a dumb question but did you consider swagger?

~~~
LeonM
There are no dumb questions.

I did have a look at swagger, but it felt way too bloated and complex for what
we wanted. With Markdown we know that even in 10 years' time, when services like
swagger are long gone, it'll be possible to view markdown files. Also, there
is barely any learning curve with Markdown.

------
pianomanfrazier
For any interested, here is my Pandoc book writing setup.

I have a couple bash scripts that I use to call pandoc to generate PDFs, HTML,
or ePub.

Here is the repo
[https://gitlab.com/pianomanfrazier/pandoc-markdown-book](https://gitlab.com/pianomanfrazier/pandoc-markdown-book)
and here is my blog post
[https://pianomanfrazier.com/post/write-a-book-with-markdown/](https://pianomanfrazier.com/post/write-a-book-with-markdown/)

------
DyslexicAtheist
pretty cool. I'm using a similar setup that allows a real-time preview of
every change by means of the `entr`[1] command and gets triggered by saving
the markdown.

    
    
      ls ./presentation.md |entr -c bash -c "pandoc --pdf-engine=xelatex --toc -N presentation.md -t beamer -o presentation.pdf; killall -HUP mupdf"
    

this would reload the pipeline and update the content of the pdf output. easy
as 1-2-3 (no Makefile though which would be another step).

[1]
[https://www.systutorials.com/docs/linux/man/1-entr/](https://www.systutorials.com/docs/linux/man/1-entr/)

------
asicsp
Nice and thanks for sharing your setup. The footer is very informative, but I
use GitHub style markdown, need to check if there's some workaround. For epub
customization, this article [0] might help. Good luck for your book.

Here's how I generate PDF with pandoc+xelatex [1]. I use gvim as my editor and
have mapped a key (which then executes a shell script) to generate the book.

[0] [https://cmichel.io/how-to-create-beautiful-epub-programming-...](https://cmichel.io/how-to-create-beautiful-epub-programming-ebooks/)

[1] [https://learnbyexample.github.io/tutorial/ebook-generation/c...](https://learnbyexample.github.io/tutorial/ebook-generation/customizing-pandoc/)

------
sevensor
I did this too, although the makefile presented in the article is _much_
cleaner than mine was. Definitely recommend XeLaTeX. You're not going to get
very far without unicode. I had to drop down to LaTeX often to control
formatting, but Pandoc helpfully lets you do that.

------
andrepd
>It allows to move the cursor past the last character. If you insert a new
character there, it is automatically padded with spaces. It is easier to see
it than to explain it:

>My first programming environment was Turbo Pascal, and this is exactly how
the cursor works there, which I grew accustomed to.

Holy shit! What a rush of memories reading that unlocked :)

------
maxmunzel
Vim + Pandoc + Beamer + pdfpc is also the best way to write Presentations I
have found so far:

[https://github.com/maxmunzel/talk-algorithms-for-np-hard-pro...](https://github.com/maxmunzel/talk-algorithms-for-np-hard-problems)

------
btreecat
Nice to learn how others approach this task. Curious, with so many extensions
to markdown, why not use something like AsciiDoc or reStructuredText instead?

~~~
Rotareti
Same here. I don't understand why AsciiDoc doesn't get more attention.

------
ggerules
A couple of things that might be of interest:

1) pandoc is awesome.

2) There are integrated development environments that allow you to write in
markdown and output to PDF, HTML, and Word with the flick of a switch. RStudio
with knitr, bookdown, and markdown has some nice functionality. Plus you can do
graphs and drawings and embed them in the rmd (R Markdown) text.

3) There is an earlier post on HN from Gilles Castel on how to speedily write
text through the ultisnips package. Very much a game changer in how I use vim
to work with anything text related.

[https://castel.dev/post/lecture-notes-1/](https://castel.dev/post/lecture-notes-1/)

Nice post!

~~~
lozf
> [https://castel.dev/post/lecture-notes-1/](https://castel.dev/post/lecture-notes-1/)

Very nice! Thank you.

------
BooneJS
Can you refer to figures and have the name rendered? For instance, a piece of
text referring to a figure and the figure’s label will both get rendered to
“Figure 2.8” regardless of paragraph edits and figures inserted or deleted
before it?
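
To make the question concrete, the behaviour being asked for amounts to a two-pass numbering step like the sketch below. The `{#fig:label}` and `@fig:label` markers are a made-up syntax for this illustration, not a claim about what the article's setup or any particular tool supports.

```python
import re

LABEL = re.compile(r"\{#fig:([\w-]+)\}")  # marks where a figure is defined
REF = re.compile(r"@fig:([\w-]+)")        # marks a reference to a figure


def number_figures(chapters):
    """Render labels and references as "Figure <chapter>.<n>" across all chapters."""
    numbers = {}
    # Pass 1: assign numbers in document order, restarting in each chapter.
    for chapter_no, text in enumerate(chapters, start=1):
        for figure_no, match in enumerate(LABEL.finditer(text), start=1):
            numbers[match.group(1)] = f"Figure {chapter_no}.{figure_no}"

    def render(match):
        # Unknown labels are left untouched so they are easy to spot.
        return numbers.get(match.group(1), match.group(0))

    # Pass 2: substitute both the labels and the references.
    return [REF.sub(render, LABEL.sub(render, text)) for text in chapters]
```

Because numbers are recomputed from the labels on every build, inserting or deleting a figure earlier in a chapter renumbers every later figure and every reference to it.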

~~~
Symbiote
For anything beyond the most basic document, use AsciiDoc rather than
Markdown.

I prefer to use the Asciidoctor toolchain, but it's compatible with AsciiDoc.

------
jojo14
Nice article. I'm all for it! I use Pandoc and Makefile as well. Except I use
Emacs and Inkscape for SVG graphics. IMHO this is the way to produce documents
in the 21st century.

------
paultopia
Pandoc and make totally. I have a personal workflow for all my academic
articles, which is expanding to books, and which involves chapters stored in
individual markdown files in github, pandoc to build with a makefile to tie it
all together[1], and zotero spitting out CSL json bibliographies, etc.

Honestly, though, I find writing in Markdown in vim/emacs to be really
unergonomic. The unit of writing in code is the line (or the function or the
block or the s-expression or whatever depending on language and task---
something comprehensible to vim and emacs anyway); but with prose in Markdown
the unit of writing is much less defined---sometimes sentence, sometimes
paragraph, sometimes clause... and the movements just don't work for me there.

So I just do the writing in Sublime Text. Seems to work for me.

[1] But I'm not nearly good enough with make to do complicated things like
back up or build every file in a directory. I hacked together my own backup
utility @
[https://github.com/paultopia/writingBackup](https://github.com/paultopia/writingBackup)

------
zmmmmm
I used LiterateMarkdown [1] to write a PhD thesis in Markdown with reasonable
success. Its main feature is being able to read Jupyter notebooks and include
computations and plots using R, python and Groovy inline, so pretty handy for
a thesis with lots of data analysis.

A beauty of using Pandoc is you can translate to Word for people who insist on
that for review / edit / comments.

One tip: on OSX I use Skim [2] as the PDF previewer. It is not a particularly
great or special PDF viewer except for one thing: it does live update of the
PDF without shifting the position of the page, even if you are zoomed in etc.
This means you can work on a section iteratively and watch it update live as
you work, which is pretty handy for proof-reading what you are writing.

[1] [https://bit.ly/2XzTpSy](https://bit.ly/2XzTpSy)

[2] [https://skim-app.sourceforge.io/](https://skim-app.sourceforge.io/)

------
roland35
Thanks for sharing this workflow and also for writing this book! I sometimes
think I would like to write a book about microcontroller basics (on which I've
collected knowledge from countless blogs and white papers), but I know it's a
huge project!

Also it is cool that you can run draw.io yourself! I've used yed for
documentation but this looks nice and is capable.

------
0az
I use something very similar for mathematical homework and notes: MacVim, with
a Makefile that runs Pandoc with the Eisvogel template.

I also have a script that runs fswatch to run make on save.

Didn't know about virtualedit, though: tables are going to be so much easier
now.

------
snide
Somewhat related. I highly suggest the Goyo plugin for Vim if you want
distraction free writing.

[https://github.com/junegunn/goyo.vim](https://github.com/junegunn/goyo.vim)

------
napsy
This reminds me of my own project to use pandoc for generating blog posts
[https://outfloor.org/](https://outfloor.org/)

------
JoshMcguigan
Thanks for sharing. Your book sounds very interesting to me. I like that it
targets generating real ARM assembly.

I've signed up for updates, looking forward to the release!

------
hmcamp
Well done. Thanks for sharing your process! Looking forward to the completed
book.

------
MattBlissett
Here's a result [1] from a system I've put together, primarily using
AsciiDoc(tor) and PO4A[3], to allow us to write a source document then
translate it into multiple languages. It produces HTML and PDF, but ePUB is an
option too.

Using AsciiDoc rather than Markdown has several benefits. The language
supports many common book features, especially for technical books, like those
"! Warning here" callouts, cited quotes, captioned figures/tables/codeblocks,
internal links, I think even an index. It's also a lot more stable; I'm not
concerned that there will be significant syntax changes in 5 years time. The
user manual [2] is the quickest way to see what AsciiDoc can do.

PO4A is an adaptation of GNU GetText to use on prose. PO4A's output can feed
into a typical translation workflow -- distributing the files, or using online
translation services. It mostly supports AsciiDoc, though there are some bugs,
and outputting a PO file directly from AsciiDoctor (with a plugin) might be
better -- PO4A parses AsciiDoc itself.

The code is at [4]. It's in slow development when necessary for new documents;
I don't particularly intend to polish it for release or wider use.

KiCAD's documentation [5] was the best example of something similar (AsciiDoc +
PO4A) to what I've put together.

The missing pieces, which are closely related, are translatable and flexible
diagrams. AsciiDoctor supports plenty of diagram tools, but none of them can
do this. For example, the diagram at [6] is an SVG, which (since it's XML) can
be translated using PO4A. However, in French the longer text spills out of the
boxes. The previous diagram is an image, for this reason.

 _Is there an open-format (preferably open source) diagramming tool, which
supports wrapping text, and even resizing "too long" text?_ I would be very
interested!

[1] [https://docs.gbif.org/collections-idea-paper/](https://docs.gbif.org/collections-idea-paper/)
or (in progress)
[https://docs.gbif.org/effective-nodes-guidance/1.0/](https://docs.gbif.org/effective-nodes-guidance/1.0/)

[2] [https://asciidoctor.org/docs/user-manual/](https://asciidoctor.org/docs/user-manual/)

[3] [https://po4a.org/](https://po4a.org/)

[4] [https://github.com/gbif/gbif-asciidoctor-toolkit/](https://github.com/gbif/gbif-asciidoctor-toolkit/)

[5] [https://gitlab.com/kicad/services/kicad-doc](https://gitlab.com/kicad/services/kicad-doc)

[6] [https://docs.gbif.org/effective-nodes-guidance/1.0/en/#box-e...](https://docs.gbif.org/effective-nodes-guidance/1.0/en/#box-example-staff-roles)

------
marvindanig
Books are not files though!

Pandoc is great. make and vim are great too, but as you can see these tools
will produce PDF files, HTML files, text files, markdown files and a lot of
jargon that the readers simply aren’t interested in. I mean normal readers
here and not tech folks holed up inside a terminal with a homebrew theme.

