
My eBook build process and some PDF, EPUB and MOBI tips - ejpastorino
http://patshaughnessy.net/2012/11/27/my-ebook-build-process-and-some-pdf-epub-and-mobi-tips
======
KeliNorth
Doing all of this yourself takes a significant amount of effort, which is
very commendable. I self-publish and work for a company that helps others
self-publish by editing, designing and converting their work, and it's always
very involved.

It's not as simple as just clicking convert either. Tables render differently
in EPUB and Kindle conversions if done wrong, and TOC creation can be a
veritable mess.

Mobi conversion tools may or may not fix issues that'd appear if the official
Kindle KDP tool is used, meaning a converted Mobi may end up looking entirely
different than the same file once uploaded to Amazon.

Graphs... I don't even want to think about dealing with those on Kindle.
Congratulations on doing all of this yourself without screaming. My first
couple of book conversions were learning experiences, and they were mostly
plain text (chapter fiction books) without varying fonts and diagrams.

So, if I had to sum up what I've learned into one idea that'd make the entire
process easier, it'd be this:

Write plain text or rich text. Don't format when writing. Keep your master
text as plain as possible. Then when you have to add stuff to it later, you
don't discover little surprises that throw everything off.

For example: Don't indent. Kindle auto-indents on conversion if indentation
hasn't been defined in the style. (The official Kindle conversion does this,
at least currently; who knows if it'll change in the future. A Mobi generator
generally doesn't.) So using TAB indents throughout a book instead of MS
Word's indent feature will cause a massive headache just when you think
you're done.

Since I work on Windows, the tools can be rather simple. Create a document in
MS Word. Save as a web page (filtered) to remove some of the junk Word
creates in HTML. Edit the HTML to remove a couple of other quirks, set tags
for chapters, remove Unicode characters that Kindle won't recognize (some are
recognized, some aren't), save, convert, and check what you missed. The less
special your document has to look, the easier it'll be.
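
The HTML cleanup step above could be partly automated. What follows is only a minimal sketch, not the commenter's actual process: the function names are made up for illustration, and because the thread doesn't say which characters Kindle rejects, the sketch only reports non-ASCII characters for manual review rather than deleting them.

```python
import re
from collections import Counter

# Matches TAB characters typed directly after an opening <p ...> tag.
# The \b keeps this from matching <pre>, where tabs may be intentional.
LEADING_TAB = re.compile(r"(<p\b[^>]*>)[ \t]*\t[ \t]*", re.IGNORECASE)


def strip_tab_indents(html: str) -> str:
    """Remove TAB indents at the start of paragraphs (hypothetical helper)."""
    return LEADING_TAB.sub(r"\1", html)


def non_ascii_report(html: str) -> Counter:
    """Count each non-ASCII character so it can be checked by hand.

    Assumption: which characters a given Kindle accepts is unknown here,
    so nothing is removed automatically.
    """
    return Counter(ch for ch in html if ord(ch) > 127)


if __name__ == "__main__":
    sample = '<p class="MsoNormal">\tIt was a dark night \u2014 very dark.</p>'
    cleaned = strip_tab_indents(sample)
    print(cleaned)
    for char, count in non_ascii_report(cleaned).items():
        print(f"U+{ord(char):04X} appears {count} time(s)")
```

Reporting instead of stripping keeps the decision with the author, which fits the "convert, and check what you missed" loop described above.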

Again, congratulations, welcome to the world of self-publishing. It can give a
nightmare of a headache, but once it's done, the joy of knowing you created
something on your own is magical.

~~~
pat_shaughnessy
Thank you :) Yes! "magical" is a great way to describe the experience...

As you suggest, I tried to keep things simple with the text, but in the end
since I use a lot of code samples and diagrams it was difficult to do that.
And yes, rendering graphs was also a challenge. I didn't get into those
details in the blog post, so let me know if anyone is interested in hearing
more about that.

------
ilamont
Great review. I was not aware of Bookshop, thanks for sharing your deep-dive
into using it and other tools.

I began writing ebooks over the summer. To date, I have written four
educational/reference titles around a theme -- "In 30 Minutes". The idea is to
let newbies quickly understand a mildly complex topic. The most recent title
is _The $10 Small Business Website In 30 Minutes_ (1).

Not only do these books contain lots of screenshots and detailed TOCs, I also
publish them in multiple formats -- .mobi, ePub, PDF, and PDF for paperback.

Unfortunately, I discovered that the most popular writing tools -- Microsoft
Word, Google Docs, and Apple Pages, are not up to the task of creating ebook
files across all of the platforms. Even if they can export to a certain
format, there are limitations that force additional production and conversion
steps. I also encountered the problem of forked masters, when I had to use
different tools for exporting/converting to special formats.

Currently, I am using Google Docs for composition and collaboration with my
copy editor. I then copy and paste the text into Scrivener (2), which is the
most powerful writing and publishing tool I have ever used. It exports to
.mobi, ePub, PDF, and print book PDF. Like TFA, I use Kindle Previewer to test
the .mobi files. The ePub files need some mild HTML cleanup, which I do in
Sigil (3).

1: <http://10dollarsmallbusinesswebsite.com/>

2: <http://www.literatureandlatte.com/scrivener.php>

3: <http://code.google.com/p/sigil/>

~~~
__mharrison__
I'd be interested to hear what cleanup the ePub files need after the Scrivener
export. Is it just tweaking styles, or does the ebook not render correctly on
some devices? I'm also curious about how the Scrivener mobi export compares to
kindlegen. Does it do KF8?

[EDIT] - From looking at the scrivener site, they require kindlegen for mobi
generation.

~~~
ilamont
There are a few issues that I encountered when I tested Scrivener ePub output:

- Chapter suffixes not being properly appended

- Extra spaces appearing after headings/subheadings/sub-sub-headings

In addition, I like to look at the HTML and CSS associated with images to make
sure they are not being reduced in size during the ePub compile process. This
was an issue I encountered with Word and Pages. So far it hasn't cropped up in
Scrivener, but I need to manually confirm this for peace of mind.
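
That manual image check could be scripted, since an ePub file is a ZIP archive of (X)HTML, CSS and image files. The sketch below is an illustration only: `list_images` and `ImgCollector` are hypothetical names, and it lists the sizing attributes on each `img` tag without judging whether an image was actually downscaled.

```python
import zipfile
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects (src, width, height, style) for every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags are routed here as well.
        if tag == "img":
            a = dict(attrs)
            self.images.append(
                (a.get("src"), a.get("width"), a.get("height"), a.get("style"))
            )


def list_images(epub_file):
    """Return (document, src, width, height, style) for each image reference.

    epub_file may be a path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = ImgCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                found.extend((name, *img) for img in parser.images)
    return found
```

Calling `list_images("book.epub")` (a placeholder file name) gives one row per image, which makes unexpected `width`, `height` or `style` values easy to spot. Sizing rules set in a separate CSS file would still need a manual look.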

Regarding Kindlegen: Scrivener incorporates Kindlegen into the compile process
for .mobi files. It's nearly seamless, and generates good output when I test
in Kindle Previewer and the Kindle app for iPad.

KF8: I wasn't aware of this issue until you pointed it out. It might explain a
spike of returns I experienced this month. See
http://www.literatureandlatte.com/forum/viewtopic.php?f=34&#... for more
information.

------
__mharrison__
Thanks for sharing this info.

As a Python developer and ebook writer I've gone a similar route. I'm
currently writing in emacs using reStructuredText as my base format and have
written code to generate [0] clean epubs which translate to mobi/kf8 pretty
well. Part of this is a CSS file I'm working on [1] to make common formatting
"just work" across the big ebook readers (old-kindles, newer kindles, ipads,
nook and kobo).

I've spent a little bit of time (and money) getting very nice pdf generation
working using sphinx and the memoir class for latex. It's not there yet as
I've been focusing on the ebooks.

Yes, it feels like you need to be a programmer to create ebooks right now. You
definitely need to feel comfortable editing html and css. I'm not completely
happy with rst as the base format, but I don't think there is another
lightweight markup language specifically targeted to authoring books. Even
sphinx which is supposed to be for documentation isn't really well suited
towards books. So there's little hacks here and there. Plus as I usually write
about Python related material, rst lets me "test" my books using doctest and I
can even templatize some stuff (I'm doing that in a book I'm currently working
on).

0 - <https://github.com/mattharrison/rst2epub2>

1 - <https://github.com/mattharrison/epub-css-starter-kit>
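
To illustrate the doctest point above: interactive examples embedded in a chapter's text can be executed and checked against their printed output. This is a minimal sketch using only the standard library, not the commenter's actual setup; `CHAPTER` and `check_chapter` are made-up names, and the string stands in for an rst file.

```python
import doctest

# A stand-in for the text of a reStructuredText chapter.
CHAPTER = """
Lists
=====

Appending grows a list in place:

>>> items = [1, 2]
>>> items.append(3)
>>> items
[1, 2, 3]
"""


def check_chapter(text, name="chapter"):
    """Run every >>> example found in text; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name=name, filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    result = check_chapter(CHAPTER)
    print(f"{result.failed} failed out of {result.attempted} examples")
```

For a file on disk, `python -m doctest chapter.rst` does much the same from the command line (the file name is a placeholder).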

------
dangoor
I've published some children's fiction this year, but I suspect I will publish
programming works again at some point.

I'll give another plug for Scrivener[1], which is really a great tool. All of
the editions that I have in the major online bookstores come straight out of
Scrivener (though I obviously used an image editor for the covers).

I also wanted to mention how awesome Leanpub[2] is. Write your files in
Markdown (they support code snippets well... definitely a service that
works well for software topics) and save them in Dropbox. Press a couple of
buttons in your browser and you've got PDF, mobi and epub. And you can sell
right away and keep 90% minus 50 cents. They make it easy to publish early in
the process and keep readers up to date as you complete the work.

One bonus that's not as obvious: Leanpub also makes distribution to a sample
audience easy. You can generate coupon codes trivially.

I'm planning to go straight to Leanpub with my next technical work.

[1]: <http://www.literatureandlatte.com/scrivener.php>

[2]: <http://leanpub.com/>

~~~
pat_shaughnessy
The only complaint I've heard about Leanpub is that they don't give the author
access to end user email addresses. I don't use email as a marketing tool much
(almost never) but I would guess this could be a show-stopper for many eBook
authors who are considering their solution.

~~~
peterarmstrong
(Leanpub cofounder here.)

We changed this very recently: readers can now choose a checkbox to share
their email with the author, and this checkbox is on the purchase form as well
as on the reader dashboard.

(And, being Canadian, I feel the need to apologize. So, we are sorry it took
us so long to get this feature right!)

------
kroger
Interesting. I agree that using something like Pages or Scrivener (I _love_
Scrivener) can help you to focus on the writing and I wish I could have used
them to write my book [1].

However, I'm curious to know how he handled code examples; did he just paste
them into Pages? One advantage of using something like Sphinx is being able to
include code examples from external files. This makes it easy to update and
test the examples. Another killer feature is the pyobject option: it allows
you to include only a Python function or class from a larger file. It'd be
nice to have this feature for other languages, though.
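
For readers unfamiliar with the idea, here is a rough sketch of what pulling a single function or class out of a larger file involves. It is not Sphinx's implementation: `extract_object` is a hypothetical helper built on the standard `ast` module (Python 3.8+), it returns the first definition with a matching name, and unlike Sphinx it does not handle dotted names.

```python
import ast


def extract_object(source: str, name: str) -> str:
    """Return the source text of the function or class called name."""
    tree = ast.parse(source)
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(tree):
        if isinstance(node, wanted) and node.name == name:
            # Slices the original text using the node's line/column span.
            return ast.get_source_segment(source, node)
    raise LookupError(f"no function or class named {name!r}")


if __name__ == "__main__":
    module = (
        "import math\n"
        "\n"
        "def area(r):\n"
        "    return math.pi * r * r\n"
        "\n"
        "def other():\n"
        "    pass\n"
    )
    print(extract_object(module, "area"))
```

Because the snippet is cut from the real file at build time, the book's listing stays in sync with code that can still be run and tested on its own.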

I blogged about using Sphinx to write my book here:
<http://pedrokroger.net/2012/10/using-sphinx-to-write-books/>

[1] <http://musicforgeeksandnerds.com>

~~~
pat_shaughnessy
FYI I used a Ruby library called Coderay to handle code highlighting. This
wasn't integrated with Pages, but instead with the Bookshop Ruby gem after I
moved the text into HTML files.

The code was inline in the HTML, but set apart using pre tags.

------
arocks
It would be interesting to compare this with another author's experience [1]
using Sphinx. He also notes that EPUB and MOBI outputs are more challenging
than PDF ones.

[1]: <http://pedrokroger.net/2012/10/using-sphinx-to-write-books/>

~~~
kroger
I'm the author of this article. The main problem I found with EPUB and MOBI is
how they are implemented in each reader. For instance, Kindle 2 doesn't format
tables correctly, while Kindle 3 does. iBooks has some undocumented bugs that
make you want to cry. If your book is simple (just text) there's no problem,
but if you have source code, tables, images, etc., you may run into problems.

~~~
arocks
Agreed. As I was looking into various command-line ebook production tools,
Sphinx and pandoc struck me as two very good solutions.

------
robomartin
Thanks for sharing. I recently launched into writing an ebook (educational,
kids). Even though I use Word for tons of stuff I became concerned that it
would insert all sorts of unwanted junk into the file. I looked around and
decided to start the process on Sigil. My reasoning was that I could easily
move the text into just about any other platform if I wanted to.

My first impulse was to simply write it all in a plain text editor and deal
with formatting and producing all the various file types later on. One file
per chapter, etc. However, with Sigil I can deal with images and TOC from the
very start, which might be an advantage.

It's interesting to read about how others have approached this. It sounds
like the toughest part of the job might very well be getting the various
formats to look the way they should.

~~~
__mharrison__
If you are doing ebook only (no pdf) and it has limited formatting it should
be pretty simple to get it to work on most devices.

Though I've yet to actually publish physical books (only have proofs), I'm not
willing to commit to using HTML as the base format. (Sounds like princexml
might help though if you decide you want to go the physical route later).

------
zio99
It seems there's a huge opening in the market to make an end-to-end publishing
solution. We're all taking this scrappy approach* to making our books work
cross-device. It's something I'd gladly pay for.

*I wrote about my pipeline here:
<http://startupframework.tumblr.com/post/36675629669/format-ebook>

~~~
neoveller
I just shot you an invite to the PenFM beta, because you hit it on the head.
www.pen.fm is taking on that end-to-end publishing challenge, and then some.
Having worked in epublishing for a few years now, I think I've finally come up
with a way to automate the hell out of epublishing--and the results show up.
At any point in writing in PenFM, you can click download and get your work
formatted perfectly in epub, mobi, or pdf. Formatting improvements are coming
rapidly, and mostly present already for mobi.

The biggest problem I've recognized with epublishing is that you have to get
the formatting down perfectly, and usually that requires a lot of work by hand
making sure your input HTML file is exactly as it should be. When you control
how content is input to a platform, it's much easier to automate perfectly
formatted rendering of that input HTML file, including TOC by inference.
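
As an illustration of what "TOC by inference" might mean, one simple approach is to read the heading tags out of the input HTML. This is a guess at the general idea, not a description of how PenFM works; `infer_toc` and `HeadingCollector` are hypothetical names, and only `h1` to `h3` are considered.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records (level, text) for each h1-h3 heading, in document order."""

    LEVELS = {"h1": 1, "h2": 2, "h3": 3}

    def __init__(self):
        super().__init__()
        self.toc = []
        self._level = None  # level of the heading currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._level = self.LEVELS[tag]
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and self.LEVELS.get(tag) == self._level:
            # Collapse whitespace so inline markup doesn't leave gaps.
            title = " ".join("".join(self._parts).split())
            self.toc.append((self._level, title))
            self._level = None


def infer_toc(html: str):
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.toc


if __name__ == "__main__":
    page = "<h1>Intro</h1><p>text</p><h2>Why <em>now</em></h2>"
    for level, title in infer_toc(page):
        print("  " * (level - 1) + title)
```

This only works when headings are marked up consistently, which is the point made above about controlling how content is entered.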

~~~
pseingatl
www.pen.fm yields an "Internal Server Error".

------
davidw
This article is a great example of why LiberWriter has been doing fairly well.
Most of us here understand it and think it's pretty cool. Now imagine your
mother or father, who have retired and decided to finally write that book
they've always thought of, trying to parse all of that. Gems? XML? Say what?

------
zrail
I hadn't heard of Bookshop before, that's pretty neat.

<shameless plug>

I put together a project named Docverter that uses pandoc and calibre to do a
bunch of this stuff. The conversions work pretty well, and it's all free.

<http://www.docverter.com>

</shameless plug>

------
dpapathanasiou
Doesn't Pages export to epub?

~~~
pat_shaughnessy
Yes, and it might be good enough for short, simple documents.

For me using a product like PrinceXML gave me the power of HTML/CSS to control
the eBook's appearance more precisely. Also, using a Ruby-based build process
let me take advantage of source control, ERB/Ruby code and other things. In
other words, it made the whole process easier to manage for producing a large
document.

~~~
brini
I'm interested in whether you looked into using a DocBook tool chain via
publican or asciidoc.

~~~
brini
The reason I asked was your comment about taking advantage of source control
and the implied separation of content from formatting.

The only real advantage that I can see to going with publican[1] or
asciidoc[2] is that they're free tools. The main disadvantage is that you'd
have to define formatting via XSL.

[1] <http://fedoraproject.org/wiki/DocBook>

[2] <http://www.methods.co.nz/asciidoc/>

~~~
pat_shaughnessy
XSL! Oh no! I'll have to steer clear of that :)

