
Docbook - brudgers
https://docbook.org/
======
cryptos
I tried DocBook a long time ago. Back then the tooling was so terrible that I
gave up. The XML syntax made sense to me, but it was not fun to write.

Today I'm using AsciiDoc with Asciidoctor. The AsciiDoc syntax is as pleasant
as Markdown, but much more powerful and more consistent. AsciiDoc can be
converted to DocBook if needed, but I've never needed that, because all I
need is PDF and HTML output.

~~~
tannhaeuser
DocBook was originally an SGML-based vocabulary (I believe DocBook 5 is XML
only, with only marginal changes from DocBook 4), so you can always use SGML
short references to create your own little custom wiki or casual math
syntax. Or you can use pandoc to write Markdown syntax and convert to DocBook
or PDF or whatever. I guess DocBook is mostly for whole books, but the one
conference paper I did using DocBook was a pleasant experience. Apart from
the PDF produced by LaTeX, which of course was fantastic-looking, the workflow
and results were much better than with LaTeX, publisher-specific LaTeX
templates, and in particular latex2html if you want to publish your paper on
your home page. Even when using LaTeX for an ACM-published paper (not a
math-heavy one, though), I ended up using pandoc to convert LaTeX to HTML
because the result of latex2html was unusable: it tried to match two-column
layouts exactly, screwing up semantic text order, bibliography formatting, etc.
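
For anyone who wants to try the Markdown-to-DocBook route, here is a minimal sketch of driving pandoc from Python. The file names are placeholders and the helper names are made up for illustration; the only pandoc options used are `-s`, `-f`, `-t` and `-o`, with `docbook5` as the output format.

```python
import shutil
import subprocess


def pandoc_command(source, target, to_format="docbook5"):
    """Build the pandoc argument list for converting a Markdown file."""
    return [
        "pandoc",
        "-s",               # standalone: emit a complete document, not a fragment
        "-f", "markdown",   # input format
        "-t", to_format,    # output format, DocBook 5 by default
        "-o", target,       # output file
        source,
    ]


def convert(source, target, to_format="docbook5"):
    """Run pandoc if it is installed; return True when a conversion ran."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not on PATH, so there is nothing to run
    subprocess.run(pandoc_command(source, target, to_format), check=True)
    return True


if __name__ == "__main__":
    # paper.md and paper.xml are placeholder file names.
    print(" ".join(pandoc_command("paper.md", "paper.xml")))
```

Passing `"html"` as `to_format` gives the HTML variant of the same command.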

------
runemadsen
When I was at O'Reilly Media, we worked on an HTML replacement to Docbook
called HTMLBook.

[https://oreillymedia.github.io/HTMLBook/](https://oreillymedia.github.io/HTMLBook/)

As far as I know, O'Reilly Media still uses this for all book production.

~~~
deanebarker
I don't think so anymore. I wrote a book for O'Reilly in 2016
([http://flyingsquirrelbook.com/](http://flyingsquirrelbook.com/)) and it was
done in AsciiDoc, which is a variant of Markdown (or perhaps vice-versa) with
some custom extensions and structures.

~~~
Sean1708
> which is a variant of Markdown (or perhaps vice-versa)

Neither. AsciiDoc has similar design goals to Markdown, but it's a completely
separate markup language. I like to explain AsciiDoc as a markup language that
sits somewhere between Markdown and LaTeX: it handles far more things than
Markdown (and is therefore more complex) but isn't nearly as complex as LaTeX
(and also doesn't handle nearly as many things).

------
jdub
_takes a long drag_ … now that's a name I haven't heard in a long time.

~~~
danso
I know meme comments are generally downvoted on HN, but this comment helped to
immediately confirm for me that "Docbook" referred to what I thought it did.

------
08-15
Does anyone else have the feeling that Docbook was killed by XML?

20 years ago, there was Docbook, an SGML application. There was also Linuxdoc
SGML, another, simpler SGML application for us common people. You had to write
the SGML in an ordinary text editor, but the markup isn't too obtrusive.
Stylesheets were written in DSSSL (essentially Scheme) and processed with
Jade. Output was either HTML or flow objects, which could be processed by TeX
to get nice postscript. There were also some shortcuts to produce Unix man
pages.

And then the whole world apparently decided that SGML is bad and XML is good,
and started over. Markup became annoying, DSSSL was replaced with XSLT, which
should not be looked at with unprotected eyes, instead of Jade we got Apache
Xalan(?), which needed a supercomputer to run (thank you, Java) and was
generally a pain to use (again, thank you, Java), and it produced either HTML
or XSL-FO, and there was no working processor for the latter, so there was no
longer a good way to produce PS or PDF.

It took a few years until xsltproc finally made XSLT usable (but still not
pretty to look at), and Apache FOP could format to PDF (when running on a
supercomputer, just like Xalan). But nobody wrote their documentation in
Docbook or Linuxdoc anymore. Because markdown and asciidoc and a bazillion
similar ad-hoc languages may not have been as theoretically clean, but at
least the tools worked, even on a cheap workstation.

~~~
peatmoss
Oh my yes. “XML all the things” was very much a hot trend in the early 00s.
I’m even guilty of getting caught up in it. I hand-coded a handful of docbook
docs for no good reason, and then even bemoaned not having XSLT tooling. In
retrospect DSSSL was the better formatting thing, and XML docbook was
horrendously complicated.

I’m prone to Stockholm syndrome relationships with technology, but XML Docbook
/ XSLT eventually popped my self delusion bubble.

------
mike-cardwell
I redesigned the exim.org site about 10 years ago (I know it's not very
pretty, but you should have seen the previous version). Anyway, the site is
entirely static, and they wanted to keep it that way, and they also had a
system for generating documentation at build time into a docbook file.

I wrote a perl script which converted their docbook into html by applying XSLT
to it. Worked pretty well. They just run a script to generate the new version
of the website each time they do a release.
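
To give a rough idea of what such a conversion step does, here is a small sketch in Python. It is not the Perl script described above and it is not XSLT: the Python standard library has no XSLT processor, so a hand-written element mapping stands in for the stylesheet. The mapping table is hypothetical and deliberately tiny.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "{http://docbook.org/ns/docbook}"

# Hypothetical, deliberately tiny mapping from DocBook elements to HTML tags.
# A real stylesheet covers hundreds of elements and their attributes.
TAG_MAP = {
    "article": "article",
    "section": "section",
    "title": "h2",
    "para": "p",
    "emphasis": "em",
    "literal": "code",
    "programlisting": "pre",
}


def to_html(node):
    """Recursively copy a DocBook element into an HTML element."""
    local_name = node.tag.replace(DOCBOOK_NS, "")
    html = ET.Element(TAG_MAP.get(local_name, "div"))  # unknown tags become <div>
    html.text = node.text
    for child in node:
        converted = to_html(child)
        converted.tail = child.tail  # keep the text that follows the child
        html.append(converted)
    return html


def docbook_to_html(source):
    """Convert a DocBook XML string into an HTML string."""
    root = ET.fromstring(source)
    return ET.tostring(to_html(root), encoding="unicode", method="html")


if __name__ == "__main__":
    sample = (
        '<article xmlns="http://docbook.org/ns/docbook">'
        "<title>Release notes</title>"
        "<para>Use <literal>--help</literal> for options.</para>"
        "</article>"
    )
    print(docbook_to_html(sample))
```

A static-site build would call something like `docbook_to_html` once per release and write the result to disk.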

------
bambax
Over 10 years ago I wrote a tool to transform Open Office files (with proper
styles applied) to Docbook using XSLT and Schematron. It was fun to do. Now
all these techs seem to have almost completely disappeared, which is kind of
sad. It was fun.

~~~
chrisseaton
Did XSL-FO ever get used anywhere serious? It seemed like an incredibly
powerful tool, but so complicated, and it never seemed to be fully
implemented.

~~~
ivoc
There's AntennaHouse and PrinceXML, but both are pretty pricey ($3-5K /
server), and both of these do now support CSS as well.

Apache FOP / Batik is okay for generating nice looking PDFs, but its SVG
rasterization is much worse than that of Chrome's PDF renderer (Skia).

As I didn't get the memo about XML being dead, I'm still using Docbook and
XSLT/XInclude to structure and filter content written in markdown/rst.

But I will be switching over from XSL-FO to CSS paged media for layout as soon
as Chrome supports bookmarks in PDF output, or either WeasyPrint or
wkhtmltopdf improve their CSS3 column support.
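
A minimal sketch of the XInclude part, for the curious: one chapter is pulled into two different books. The file names and their contents are made up, and the "files" live in a dictionary so the example is self-contained. It uses `xml.etree.ElementInclude` from the Python standard library rather than an XSLT toolchain.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DB = "http://docbook.org/ns/docbook"
XI = "http://www.w3.org/2001/XInclude"

# Hypothetical in-memory "files"; a real setup would read these from disk.
SOURCES = {
    "install.xml": (
        '<chapter xmlns="' + DB + '">'
        "<title>Installation</title><para>Shared text.</para>"
        "</chapter>"
    ),
    "admin-guide.xml": (
        '<book xmlns="' + DB + '" xmlns:xi="' + XI + '">'
        '<title>Admin Guide</title><xi:include href="install.xml"/>'
        "</book>"
    ),
    "user-guide.xml": (
        '<book xmlns="' + DB + '" xmlns:xi="' + XI + '">'
        '<title>User Guide</title><xi:include href="install.xml"/>'
        "</book>"
    ),
}


def load(href, parse, encoding=None):
    """Loader callback for ElementInclude: resolve href against SOURCES."""
    text = SOURCES[href]
    return ET.fromstring(text) if parse == "xml" else text


def assemble(name):
    """Parse one book and expand its xi:include elements in place."""
    root = ET.fromstring(SOURCES[name])
    ElementInclude.include(root, loader=load)
    return root


def chapter_titles(book):
    """Return the titles of all chapters in an assembled book."""
    return [c.findtext(f"{{{DB}}}title") for c in book.iter(f"{{{DB}}}chapter")]


if __name__ == "__main__":
    for name in ("admin-guide.xml", "user-guide.xml"):
        print(name, chapter_titles(assemble(name)))
```

Both books end up with the same chapter while the chapter text exists in only one place.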

~~~
jamespaden
Biased because I work there, but DocRaptor offers a SaaS API version of the
Prince library with a much lower usage-based monthly fee.

------
aasasd
I wonder now and then as to what's the format and tech to use for docs that
need to include same chapters in different pages, to be published in several
formats, etc. Afaik Docbook was used for this in the unixlands, but I guess
the key word is ‘was.’

Should I just go for TeX once I get the irresistible itch to do this sort of
stuff? What's the web-first solution that would still also support the
godsend EPUB and the pathetic PDF? I'm also up for sacrificing some of my
life on the altar of Lisp. Is there anything like Butterick's ‘Pollen’ with
any popularity?

~~~
rubidium
Don’t use TeX if you’re aiming for web formats. There are technically ways to
do it, but I’ve never found them effective.

There are a few high-cost options used by technical writers that are likely
overkill for what you need (MadCap Flare is one I relatively enjoy).

Sphinx is probably the best for what you’re looking for though:
[http://www.sphinx-doc.org](http://www.sphinx-doc.org)

~~~
aasasd
I thought TeX could be used to publish to pretty much anything that's in use
out there? HTML should certainly be easier than PDF.

~~~
mikekchar
TeX is a document typesetting language, not a file format. It's really, really
awesome for making documents that are typeset in a very specific way. LaTeX
gives you a set of macros on top of TeX that allows you to abstract some of
that stuff to a certain degree, but the underlying engine is for typesetting.

HTML does not have the fidelity of TeX. You just can't do the same things in
HTML that you can do in TeX. Just to make it clear, TeX is a real programming
language, though with a truly bizarre syntax and grammar (I still love it,
though ;-) ). Trying to execute TeX and output HTML is incredibly difficult to
do with any kind of quality. You could take LaTeX documents and then write a
completely different formatting system that output HTML, but at that point,
you would be better off using a system that is intended to output HTML: like
docbook :-)

~~~
aasasd
Hmmm. This convinces me that I should look into TeX and LaTeX soonish, as I
can't comprehend why anyone would want exact typesetting for electronic-first
documents in the age of varying screen sizes.

I dream of the day when PDF is as dead as Flash, for I have dozens of PDFs in
my reading pile—which I can't reasonably get to, because I don't have a 14"
tablet or a printer and can't spend days on end propped in front of the laptop
just to read through them. Meanwhile, practically all of those docs are two-
column walls of text alternating with rectangular pictures, nothing more
complicated than Markdown, and they won't be printed in a paper journal. But
the authors felt for some reason that they have to employ PDF.

I've been sermonizing for years that it'd be better to publish in HTML and, if
the author feels like it, build PDF from the same source—and now it turns out
that TeX is a one-track thing and I could just as well ‘print’ HTML to PDF! I
gotta see what's so special about TeX that HTML doesn't look serious enough to
the big boys.

~~~
mikekchar
TeX was written in the 70's ;-) The main reason for using TeX is if you want
to typeset a document that will be "photo-ready" for a print publication.
Quite frequently it is for typesetting scientific papers (which is why it is
popular with academics).

Like I said, TeX is a typesetting language. It is literally a programming
language that takes typesetting programs (including the data) and outputs an
exact photo-ready output. It's very easy to convert the photo-ready output to
a PDF, or a PNG, or GIF, SVG, etc. Because TeX is a programming language, you
can write macros (there are no functions in TeX). These macros work on input
you give it. In the old, old days, someone would create a set of macros for
typesetting documents for each of the major journals and publications you
might want to publish to. Eventually, a guy named Leslie Lamport wrote a whole
set of really good macros called LaTeX that would allow you to write documents
in a consistent way for a whole bunch of different types of publications. That
way you could minimally change the document and resubmit it to a different
publisher.

So why do people still use TeX? There are two main reasons. First, because
there is still a need for very high quality print output. TeX is special in
that it has _perfect_ algorithms for doing things like line breaking, kerning,
etc, etc, etc. The output of TeX is dramatically better than just about any
other piece of software. It's insanely better than any word processor will
output and there are probably only a handful of desktop publishing packages
that are even close. If you need absolutely perfect print output, then TeX
(and LaTeX) are a really good fit.

The other reason for using LaTeX (not so much TeX, because typesetting
documents in TeX is almost insane: I used to do it for fun, but I have never,
ever in my career met another person who has done this) is for typesetting
mathematics. The LaTeX macros for math are extremely convenient and the
pictorial output from TeX is perfect. This allows you to typeset your
mathematics in LaTeX, generate a picture and then paste that picture into an
HTML document, or Word document, or whatever.

There is one notable file format that is built on top of TeX that also is
meant for digital use: Texinfo. This is the original documentation format for
GNU. You can write hyperlink documents in Texinfo and it will generate TeX
output for high quality print documents, and HTML (as well as a few other
formats). It is _very_ limited in its features, though, and the HTML output is
so-so, IMHO.

There are a couple of other document formatting systems that output TeX as
well (and in fact, I'm pretty sure there is one that will allow you to build
some rudimentary TeX documents with docbook), but I'll let you look for those
if you are interested.

~~~
Crinus
> Texinfo

Note that Texinfo was originally meant to generate output for the info file
format (viewed by the GNU info viewer as well as GNU emacs and a few other
less used viewers) alongside TeX, which is probably where it got its name
(tex+info output). The info format is essentially a primitive hypertext format
with support for hierarchical nodes and keyword indices packaged in a single
file (essentially a primitive text-only form of CHM, though later versions of
info also support external image references). The HTML and other output
formats were added much later.

------
h91wka
I still find this format useful. Suppose your docstrings are scattered across
the code base or some sort of data model, and you need to assemble them into
a large document. XML-based formats will work reliably. pandoc does a decent
job of transforming DocBook into all kinds of outputs.
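
As a rough sketch of that assembly step: the snippet below gathers docstrings from a couple of functions and writes them into one DocBook 5 article. The two functions are hypothetical stand-ins for a real code base, and the one-section-per-function layout is just an assumption about how you might want to structure the result.

```python
import inspect
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"


# Two hypothetical functions standing in for a real code base.
def parse_config(path):
    """Read a configuration file and return its settings."""


def send_report(recipient):
    """Mail the latest report to one recipient."""


def docstrings_to_docbook(title, objects):
    """Assemble one DocBook 5 article with a section per documented object."""
    # Serialize DocBook as the default namespace (changes ElementTree's
    # global prefix registry).
    ET.register_namespace("", DB)
    article = ET.Element(f"{{{DB}}}article", {"version": "5.0"})
    ET.SubElement(article, f"{{{DB}}}title").text = title
    for obj in objects:
        section = ET.SubElement(article, f"{{{DB}}}section")
        ET.SubElement(section, f"{{{DB}}}title").text = obj.__name__
        para = ET.SubElement(section, f"{{{DB}}}para")
        para.text = inspect.getdoc(obj) or "Undocumented."
    return ET.tostring(article, encoding="unicode")


if __name__ == "__main__":
    print(docstrings_to_docbook("API notes", [parse_config, send_report]))
```

The resulting XML can then be handed to pandoc or any other DocBook toolchain.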

------
doctor_eval
We used DITA extensively for many years. DITA was largely awesome in terms of
authoring, we could and did produce HTML and PDF with one click, but the tools
were awful and expensive. Maybe that’s changed now.

My company is now moving to putting their docs in confluence, which I hate.

~~~
ainar-g
> My company is now moving to putting their docs in confluence, which I hate.

The only good thing I've found so far about Confluence is that you can kind-
of-sort-of escape its WYSIWYGness by pressing the small “<>” (“Edit Source”)
button at the top and just paste your XHTML, that you've written like a normal
human being, in there. But of course it only supports a subset of XHTML,
doesn't support anything CSS beyond very basic styling, and it makes it very
hard to work with tables of contents and other extensions.

~~~
doctor_eval
That’s a good idea. I found confluence’s UI to be almost unusable. I was
constantly fighting style changes, like spending 5 minutes trying to un-bold
something when I should have been getting my brain dumped.

------
HugoDaniel
Is this still used?

~~~
tkfu
It's not used directly in a lot of places. However, it's still used in a lot
of publishing toolchains as an intermediate format, usually compiled from
asciidoc via asciidoctor[1]. O'Reilly books, for example, are all written in
asciidoc and then run through a publishing toolchain that goes through
asciidoctor to docbook, to eventually get rendered into printable formats as
well as PDFs.

[1] [https://asciidoctor.org/](https://asciidoctor.org/)

~~~
thechao
Docbook is an “ok” filter to acquire competent tech-doc-folk who are capable
of both content improvements (prose), and asset delivery (books). It’s also a
well-known check-box for good programmer-typographers.

