
Writing a Book with Unix - webappsecperson
https://joecmarshall.com/posts/book-writing-environment/
======
tombert
I will not name the company, but a book deal I was working on fell through
because I refused to install MS Office (since I didn't have a Mac and I
refused to install Windows), they refused to accept markdown or LaTeX, and I
couldn't get their template working in LibreOffice.

The part I find funny was that the book was about doing network server
development with Haskell...on Linux.

~~~
johannes1234321
There is a reason for Word: reviewers are used to Word's versioning and change
tracking system, where they can suggest edits and the author can accept them
or not. My publisher also offered using .prn files but then the reviewers
wouldn't have done the full job ... (in the end I cancelled the contract for
other reasons, so no idea how well that would have worked out)

~~~
simula67
Can Google Docs replace MS Word in this workflow? It seems intuitive enough
and there can be real-time collaboration. This might allow authors to avoid
installing MS Windows and MS Word.

~~~
johannes1234321
I don't have experience with Google docs, but I don't think they produce
layouts which are good enough™ for book printing.

Google docs change tracking probably could be taught, but you are dealing with
people who have invested lots of time in their workflow and who are reluctant
to adapt it for a single author. Mind that those reviewers and editors switch
from book to book depending on frequency of author's feedback. The author is
just one among many ...

~~~
AnonymousPlanet
I beg your pardon? I have never heard of a decent publisher doing the
_publishing_ in MS Word, i.e. the process from formatting to printing. The
_editing_ most often gets done in Word. As far as I know, Word lacks the
formatting capabilities to do professional publishing. But maybe I just
misunderstood you, or things have thoroughly changed in the last couple of
years.

~~~
johannes1234321
My story was ~10 years ago. The publisher asked me to install a specific printer
driver (as Word's rendering depends on printer settings) and to use a specific
Word template (defining margins, formatting, ...); the final result would be
produced by printing to .prn files.

If you look at many contemporary books, typesetting is not an important topic
for many.

------
dredmorbius
I've been writing on Linux/Unix systems since the late 1980s, including now a
few book-length projects. Tools have varied over the years, including some uni
work in nroff, HTML, and more recently, LaTeX and Markdown, among other markup
languages.

LaTeX strikes me as the ultimate tool, and far less intimidating than most
people seem to think (Lamport's book is an excellent intro), though for most
purposes, Markdown is more than sufficient. In practice, I tend to use
Markdown and either include inline LaTeX as needed, or convert to LaTeX and
continue editing in that where finer-grained control is necessary.

I'd add pandoc to the toolkit, as well as GNU Make. With the two, I've got a
standard makefile that can output a wide range of formats (I refer to them as
"endpoints") ranging from ASCII text to standalone or snippets of HTML, PDFs,
PS, MS Word, OSX, Mediawiki, and others. Adding a new endpoint is a simple
matter of tweaking the makefile.
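
For anyone who wants to try the idea without make, here is a minimal sketch of the same "endpoints" approach as a plain shell loop. It is not the makefile described above. The source name `book.md` and the list of endpoints are assumptions for illustration, and by default it only prints the pandoc commands (a dry run) instead of running them.

```shell
#!/bin/sh
# Sketch only: book.md and the endpoint list are hypothetical placeholders.
# PANDOC defaults to a dry run ("echo pandoc") so the script just prints the
# commands; set PANDOC=pandoc in the environment to actually run them.
PANDOC=${PANDOC:-echo pandoc}
SRC=${SRC:-book.md}

build_endpoints() {
    base=${SRC%.md}
    # Each endpoint is a "pandoc-writer:file-extension" pair.
    for pair in html:html latex:tex docx:docx plain:txt mediawiki:wiki; do
        fmt=${pair%%:*}
        ext=${pair##*:}
        $PANDOC --standalone --to "$fmt" --output "$base.$ext" "$SRC"
    done
}

build_endpoints
```

Adding an endpoint here means adding one more pair to the list, which is roughly the "tweak the makefile" step. What make adds on top of this is rebuilding only the outputs whose source has changed.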

One piece of organisational advice: _Do NOT apply your chapter numbers to your
filenames._ Instead, allow your principal document outline, using an include
structure, to define the flow of the text. Depending on the project size and
complexity, I'll either directly include chapters within that level, or have a
top level of parts with chapters specified (as second-level includes) within
those.

As you decide you need to re-work flow, this becomes much easier to manage and
rearrange than if you'd pre-labelled the files themselves.
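
A rough sketch of what that can look like in plain shell, assuming a hypothetical `outline.txt` that lists one chapter file per line in reading order. The filenames and the `assemble` helper are made up for illustration; a LaTeX master file with `\include` lines plays the same role.

```shell
#!/bin/sh
# Sketch only: outline.txt and the chapter filenames are hypothetical.
# The outline lists one file per line, in reading order; blank lines and
# lines starting with '#' are ignored.
assemble() {
    outline=$1
    dir=$(dirname "$outline")
    while IFS= read -r name; do
        case $name in
            ''|'#'*) continue ;;
        esac
        cat "$dir/$name"
        echo            # blank line between chapters
    done < "$outline"
}

# Demo in a scratch directory: reordering the book means editing
# outline.txt, not renaming files.
work=$(mktemp -d)
printf '# Setup\n' > "$work/setup.md"
printf '# Introduction\n' > "$work/intro.md"
printf '%s\n' '# front matter first' intro.md setup.md > "$work/outline.txt"
assemble "$work/outline.txt"
```

Swapping two lines in the outline reorders the chapters, and none of the chapter files need to be touched or renamed.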

~~~
User23
TeX is necessary when you want to go to print and have a high level of
quality, especially if any math is involved. For screen display Markdown is
sufficient.

~~~
dredmorbius
In addition to HTML for online viewing, I tend to produce PDFs or ePub formats
for reading, either onscreen, on a tablet, or (very rarely) printed. I've
discovered that consistent pagination really matters to me for content
retention, an argument which still favours PDFs over many alternatives (though
Postscript and DJVU offer similar capabilities).

Markdown is _very_ nearly fully sufficient, and for almost any nontechnical
work with minimal art or layout, will suffice. It falls flat in some
interesting areas:

There's no underbar/underline markup.

There is no native colour markup.

There is no formula support -- not something typically encountered in most
texts, but when you need it, you need it.

There is no fine-grained placement control for callouts, boxes, figures,
images, etc. They simply appear where they happen to be dropped on a page.

(In several of these cases, you can revert to embedded HTML or styles, which
are fine _when rendering to HTML_ , but this won't be picked up by all Pandoc
endpoints.)

Mind, if a work consists of nothing more than text, bold, italic,
strikethrough, super/sub-script, lists, tables, sections, footnotes/endnotes,
and images whose placement is _not_ critical, _Markdown is entirely
sufficient._

But if you find you need more control, exporting to LaTeX and doing your final
editing there will buy you a great deal more control.

------
akandiah
I'm glad there's always LaTeX.

For any serious writing, Microsoft Word is one of the worst pieces of tools
out there. It starts showing its ugly side when you start using anything
remotely advanced.

~~~
dagw
_For any serious writing, Microsoft Word is one of the worst pieces of tools
out there_

Why? I'm no big fan of Word and doing any sort of layout in Word is painful,
but for actual writing I don't see a problem. It has good tools for outlines,
TOCs, reviews and comments, tracking changes, footnotes and citations (when
combined with EndNote) and just about anything else I've ever needed. Hell the
equation editor even lets you use LaTeX if you're into that sort of thing.
What makes Word so terrible in your eyes?

And yes I wrote my Masters thesis and most of my other university work in
LaTeX, so I know what I'm comparing it to.

~~~
akandiah
A couple of things I've noted over the years:

\- Fields, or anything that uses fields: They're completely broken. They only
seem to function in straightforward cases. E.g., it's very easy for
sequences to go completely out of whack.

\- File comparisons: this feature can only tolerate up to a certain number of
pages. It will crash and burn for large documents. I sometimes wonder why this
is even offered!

\- Formatting: No matter how carefully paragraph spacing etc. is controlled
there are always instances where things won't work as intended. A good example
is cover pages.

\- Hanging and freezing on large documents while Word figures out how to
render them.

\- Font rendering: it's never great. It never looks the same as what it would
on a printer (irrespective of the printer). The Mac version does a decent job,
but the Windows version, even with ClearType configured, is not great.

\- Lastly, my biggest issue: feature incompatibilities between the versions of
Word available on Mac and Windows. The Mac version (the latest O365 release)
doesn't have a lot of advanced features that are available on Windows such as
document signing and style breaks.

------
timClicks
Was hoping that this would be using the original UNIX typesetting tools like
troff. Also, surprised to see that asciidoc wasn't deployed over markdown.

~~~
svnpenn
i was recently turned on to asciidoc from seeing that it's supported at github

[https://github.com/github/markup](https://github.com/github/markup)

and ironically development seems to have stalled on markdown with the advent
of commonmark

[https://github.com/commonmark/CommonMark/issues/558](https://github.com/commonmark/CommonMark/issues/558)

[https://github.com/commonmark/CommonMark/issues/559](https://github.com/commonmark/CommonMark/issues/559)

[https://github.com/commonmark/CommonMark/issues/560](https://github.com/commonmark/CommonMark/issues/560)

~~~
nitemice
I don't understand how these links indicate that markdown development has
stalled.

All these issues were only created recently, by one person, and are actually
not in the right place for the type of discussion they're trying to prompt.
CommonMark have a separate [forum](http://talk.commonmark.org/) for
feature discussion, which seems quite active.

~~~
jabl
Looking at [https://spec.commonmark.org/](https://spec.commonmark.org/) , it
does seem it has stalled. They'll need to put in another gear if they're ever
going to reach 1.0 (if that's a goal..?).

------
numbers
I like that you mentioned `bat` and `ag`, two of many favorite CLI tools.

Others you might want to check out not necessarily for writing a book but
general CLI pleasantness:

\- fzf ([https://github.com/junegunn/fzf](https://github.com/junegunn/fzf))

\- autojump
([https://github.com/wting/autojump](https://github.com/wting/autojump))

\- jq ([https://stedolan.github.io/jq/](https://stedolan.github.io/jq/))

\- fd ([https://github.com/sharkdp/fd](https://github.com/sharkdp/fd))

~~~
ipozgaj
If you use and like `ag`, I suggest taking a look at ripgrep (`rg`). It seems
to be by far the fastest of the three (`ack`, `ag`, `rg`). And it has a pretty
interesting codebase (written in Rust).

~~~
Myrmornis
If you're working in a git repository then IMO the most appropriate search
tool is simply `git grep`. I don't think there's any reason to use ripgrep,
ag, ack etc in that situation. (Personally, if I'm working with text files,
then I'm nearly always in a git repo.)

~~~
burntsushi
(author of ripgrep here)

Well at least one reason is because ripgrep is faster. On simple literal
queries they'll have comparable speed, but beyond that, `git grep` is _a lot_
slower. Here's an example on a checkout of the Linux kernel:

    
    
        $ time rg '\w+_PM_RESUME' | wc -l
        8
        
        real    0.127
        user    0.689
        sys     0.589
        maxmem  19 MB
        faults  0
        
        $ time LC_ALL=C git grep -E '\w+_PM_RESUME' | wc -l
        8
        
        real    4.607
        user    28.059
        sys     0.442
        maxmem  63 MB
        faults  0
        
        $ time LC_ALL=en_US.UTF-8 git grep -E '\w+_PM_RESUME' | wc -l
        8
        
        real    21.651
        user    2:09.54
        sys     0.413
        maxmem  64 MB
        faults  0
    

ripgrep supports Unicode by default, so it's actually comparable to the
LC_ALL=en_US.UTF-8 variant.

There are other reasons. It is nice to use a single tool for searching in all
circumstances. ripgrep can fit that role. Maybe you don't know, but ripgrep
respects your .gitignore file.

~~~
Myrmornis
Thanks! I knew ripgrep was praised in particular for its performance but I
didn't know the difference was that large. The repo I usually work in has 8.7M
lines of code and I had been finding `git grep` performance very adequate (I
use it in combination with the Emacs helm library where it forms part of an
incremental search UI, and hence gets called multiple times in quick
succession in response to changing search input.) It looks like it will be fun
to try swapping in ripgrep as the helm search backend; I'll try it.

------
flocial
Org mode and pandoc work quite well for a similar workflow. The ability to
move around chapter trees in org mode is a godsend. It's crazy to see how far
the art of "word processing" deviated from WordStar days. MS Word's
proprietary doc binary didn't help either (people would mess up formatting and
lose entire documents). It's nice to see the focus come back to content and
streamlining production with reproducible formatting.

------
dcchambers
I love articles like this. I write quick daily notes on my computer in
markdown and back them up in a GitHub repo. (Using this fun little script:
[https://github.com/dcchambers/note-keeper](https://github.com/dcchambers/note-keeper)) It's worked really well
for me and helps me easily synchronize my notes between systems.

I love the elegance and simplicity of plain-text notes.

~~~
noir_lord
vscode has these two phenomenal plugins[1] that together convert vscode into a
true journal with basic check lists and the ability to add arbitrary markdown
notes to any particular entry.

It works wonderfully as a programmer journal since I generally have vscode
open anyway (for gitlens even when I'm working in intellij) the friction is
close to zero.

[1]
[https://marketplace.visualstudio.com/items?itemName=pajoma.v...](https://marketplace.visualstudio.com/items?itemName=pajoma.vscode-journal) and
[https://marketplace.visualstudio.com/items?itemName=Gruntfug...](https://marketplace.visualstudio.com/items?itemName=Gruntfuggly.vscode-journal-view)

------
contras1970
i wonder why he has the "wrapper over wc" which does just what wc does?

    
    
        #!/bin/sh
    
        total=0
    
        for FILE in `find . -type f -name "*.txt"`
    
        do
            wc -w $FILE
            words=`wc -w < $FILE | tr -d ' '`
            total=$(($total + $words))
        done
    
        printf "%'d" $total
    
        echo " words"
    

all this achieves is

    
    
        wc -w $(find . -type f -name "*.txt") | sed '$s/total/words/'
    

and frankly, i'm not sure the total->words substitution is worth the trouble.

then there's the inefficiency of running wc twice per file. while this is not
exactly bitcoin-level disaster, it rubs me the wrong way...

    
    
        wc -w $(find . -type f -name "*.txt") |
        awk -v t=0 '
          # skip the "total" line that wc adds, so it is not counted twice
          $2 != "total" { print; t += $1 }
          END { print t, "words"; }
        '
    

personally i'd just do this (in zsh):

    
    
        wc -w **/*.txt(.D)
    

the (.D) is two "glob qualifiers": the . (dot) limits the result to plain
files, the D turns GLOB_DOTS on for the pattern.

------
thangalin
Of possible interest is my open-source, Java-based desktop Markdown editor
with live preview and variable interpolation.

* [https://github.com/DaveJarvis/scrivenvar](https://github.com/DaveJarvis/scrivenvar)

* [https://github.com/DaveJarvis/scrivenvar/blob/master/USAGE.m...](https://github.com/DaveJarvis/scrivenvar/blob/master/USAGE.md)

The software provides a simple way to include variables in technical
documentation. It also integrates with an R engine for editing R Markdown
files, which can also use variables sourced from an external YAML file.
(Editing XML documents that have stylesheets is possible, too.)

My authoring workflow involves Scrivenvar, Markdown, pandoc, knitr, and
ConTeXt. As Markdown separates content from presentation, I prefer ConTeXt to
LaTeX for the same reason.

------
boazbarak
I am writing a book on introduction to theoretical computer science in
markdown and use pandoc to transform it into HTML, LaTeX (and from there to
PDF) and MS Word. (The latter format is rather buggy at the moment, but I am
including it because I've heard from visually impaired students that it is
often the easiest format to read as you can control the font size.)

I've now put my scripts on
[https://github.com/boazbk/tcs/tree/master/scripts](https://github.com/boazbk/tcs/tree/master/scripts)
in case anyone finds them useful. (This is not a "plug and play" package that
you can install and use, but people that are better programmers than me might
be able to adapt it and improve on it.)

------
nategri
Was hoping to see some sed tricks, and would have settled for some vim, but I
guess I need to stop being such a gatekeeping grump about stuff. Hell, maybe
I'd _like_ SublimeText if I tried it.

------
nils-m-holm
Looks like the author is just using some apps that happen to run on Unix.

I write all my books using vi (not vim!), troff and friends, make, and
ghostview (gv) for the layout. Plus a couple of shell/awk/sed scripts for
making the TOC, index, etc. I cannot imagine any better tools for the job. I
tried LaTeX, which only got me into trouble, and Lout, which was fun, but too
complex in the end. After 20-something books, the above turns out to be the sweet
spot.

------
emgee_1
Looks like everything he does can easily be accomplished using emacs orgmode and
pandoc

Ripgrep is maybe an alternative for ag

~~~
_emacsomancer_
Any sufficiently advanced text management system contains an ad-hoc,
informally-specified, bug-ridden, slow implementation of half of Org mode.

------
Myrmornis
> I could use a git repo to keep a backup of the book and ... ag (basically a
> faster grep)

IMO if you are using git then you should use `git grep` rather than
ag/ripgrep/ack/grep etc.

------
boomlinde
On the "txt" script I want to note that this does the job for files like those
in the example, whose names contain no whitespace:

    
    
      wc -w $(find . -type f -name "*.txt")
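
If filenames might contain whitespace, one possible variant (a sketch, not the article's script) lets `find` pass the names to `cat` itself so the shell never splits them. `count_words` is a made-up helper name, and the demo files are created in a scratch directory purely for illustration.

```shell
#!/bin/sh
# Sketch only: count_words is a hypothetical helper name.
# find hands the filenames straight to cat, so the shell never word-splits
# them; wc then sees one stream and prints a single number.
count_words() {
    find "${1:-.}" -type f -name '*.txt' -exec cat {} + | wc -w | tr -d ' '
}

# Demo in a scratch directory with a filename containing spaces.
work=$(mktemp -d)
printf 'one two three\n' > "$work/chapter one.txt"
printf 'four five\n' > "$work/notes.txt"
printf '%s words\n' "$(count_words "$work")"
rm -rf "$work"
```

The trade-off is that this only gives the grand total; the per-file counts from `wc -w` are lost because everything is concatenated first.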

------
Insanity
I will give bat a try. That looks nice :)

~~~
O1111OOO
> I will give bat a try. That looks nice :)

Two of the three comments already referenced _bat_. It's also what impressed
me the most. It looks great. Link below:

[https://github.com/sharkdp/bat](https://github.com/sharkdp/bat)

------
freedman1611
Misleading title, I thought he was writing his book on an ancient AT&T UNIX
mainframe. He's using Linux or MacOS, bfd. Why people throw around the word
Unix to describe Linux is beyond me. MacOS claims it's Unix too, they paid
some organization to get them a Unix cert, but we all know it's a
BSD/Mach/Darwin rewrite. Nobody is really using the original Unix.

~~~
coldtea
It's because most people don't confuse the etymology or history of a term with
its actual use and current meaning -- and aren't stuck up with BS pedantic
distinctions.

There's no "original Unix" (except the first Unix back in the day). There's
a lineage of operating systems.

Heck, even the people who actually created UNIX in the 70s and early 80s don't
have such stickups.

