
Wkhtmltopdf Considered Harmful - lobo_tuerto
https://blog.rebased.pl/2018/07/12/wkhtmltopdf-considered-harmful.html
======
jarvuschris
IMO puppeteer is the best way to go now, it's maintained by the Chrome team
and uses officially supported developer APIs:
[https://github.com/GoogleChrome/puppeteer](https://github.com/GoogleChrome/puppeteer)

I built a basic CLI wrapper to make it easy to drop it in in place of
wkhtmltopdf:
[https://www.npmjs.com/package/puppeteer-cli](https://www.npmjs.com/package/puppeteer-cli)

I've used wkhtmltopdf heavily for years and it is a disaster today. Font
rendering is inconsistent across systems, and its CSS/HTML support has been
frozen in time for many years.

The author's suggested alternative doesn't use a real browser rendering
engine, it uses its own CSS/HTML parsing and rendering implementations (!!!).
I don't trust it to keep up with standards and don't want content authors to
have to deal with yet another dialect of HTML/CSS with its own pile of quirks.
We already know Chrome's capabilities and quirks and I'm quite happy with
Chrome's print menu and the output of my backend PDF generator being the same
thing.

> The CSS layout engine is written in Python, designed for pagination, and
> meant to be easy to hack on.

Raise your hand if you want to hack on a CSS layout engine in Python while
generating your PDFs. I'll wait.

~~~
TobalJackson
Came here to mention puppeteer since I didn't see it in the author's
suggestions. I developed a huge script at my last job which processed PST
files into PDFs, and it relied heavily on wkhtmltopdf.

After discovering the issues surrounding font embedding and lack of
interoperability with Windows/Adobe Acrobat (when redacting) I was tearing my
hair out trying to figure out how to get it to work. In the end, I chalked it
up to being a limitation of how the program handled fonts, and looked for an
alternative.

After discovering puppeteer and dropping it in (replacing wkhtmltopdf),
basically all the problems I was having with fonts disappearing were solved.
Did not look back.

------
KiwiCoder
I will join the chorus - wkhtmltopdf has been working fine in production for
many months, reliably producing thousands of 10+ page PDFs with cover pages,
headers, footers, page numbering, and images. Sure, it takes a bit of fiddling
to get it working as you like the first time - as most non-trivial things do -
but I've no regrets and would recommend it.

~~~
astrodust
I have lots of regrets and most of them relate to wkhtmltopdf.

The thing makes ImageMagick look well architected and streamlined.

------
matz1
This wrong usage of 'Considered Harmful' is Considered Harmful. I've been
using wkhtmltopdf to generate an enormous number of PDFs and it has been fine.
I will try to replace it with headless Chrome for future projects though, just
because it's newer.

------
liquid_x
Bit of an exaggeration to use "Considered harmful" just because there's a new
(better?) alternative

------
tebruno99
Harmful? I don't see it hurting you in any way, and if you don't like the
open source project, don't use it. This is stupid.

------
brassattax
I have been using Wkhtmltopdf to build pdfs in my application for quite a
while. It has never caused any harm. Actually it has always "just worked" and
is a good fit for my app.

~~~
kkaske
I was going to say the same thing. I actually had to go back and check to see
if that is what I was using. I set it up so long ago and have never had to
mess with it that I forgot. Works great for me!

------
jessewmc
The conclusion is telling: They switched from Prawn to wkhtmltopdf, have all
these issues with it, but have found nothing better.

Wkhtmltopdf is far and away the best solution (at least in Ruby land) even if
it is often painful to work with.

~~~
pavel_lishin
Their conclusion is TLDR'd at the top, and it says that they have found
something better.

> TL;DR: replace it with weasyprint.

~~~
jessewmc
The last line of the article is "We'll be keeping an eye on weasyprint." I'm
pretty sure that means they don't think it's ready yet, and aren't using it
yet. It looks promising but they don't directly test it against any of their
complaints!

I, like many people in this thread, have used wkhtmltopdf for years and it
works pretty well.

------
icebraining
Curious they didn't mention our main problem: every single release changed
something in the output. Upgrading fixed a bug and brought another; every
minor version required hours of tests to verify multiple copies (single page,
multiple page, long pages, etc) of every report. We ended up just applying the
specific patches and building it ourselves.

The developers are hard-working and helpful (@ashkulz even commented on a
commit on my personal fork!), I just think the job is too big for the
resources they have. Too many edge cases.

------
nathan_f77
I used wkhtmltopdf to build the HTML/CSS templates [1] in FormAPI [2]. I agree
with all the points, but they haven't really been dealbreakers. It's been
working fine, at least for simple things like invoices. weasyprint does sound
a lot better though, so I'll probably switch to that.

Also I didn't realize that DocRaptor has an integration with Heroku. I should
really look into that.

[1]
[https://formapi.io/templates/tpl_mFAsZEmhCtG9FX5N/edit](https://formapi.io/templates/tpl_mFAsZEmhCtG9FX5N/edit)

[2] [https://formapi.io](https://formapi.io)

------
kirjavascript
Have been using puppeteer [1] at work as a replacement for wkhtmltopdf with
great success.

[1]
[https://github.com/GoogleChrome/puppeteer](https://github.com/GoogleChrome/puppeteer)

------
danenania
Funny, I just implemented wkhtmltopdf yesterday (via ruby's PDFKit) for
invoice generation. None of the issues listed are a big deal for our very
simple use case, but I did have to rebuild docker images a bunch of times in
order to figure out the right dependencies on debian. I couldn't find a
canonical list of these anywhere. The binaries installed through apt-get are
also outdated and didn't work for me, so I ended up using the
wkhtmltopdf-binary gem.

These are the os dependencies I ended up needing: libxrender1, libxext6, and
libfontconfig.
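
A rough preflight check for the three libraries named above might look like the sketch below. The mapping from library short names to Debian package names is an assumption for illustration, and `find_library` depends on tools such as `ldconfig` being present in the image, so treat a "missing" result as a hint rather than proof.

```python
from ctypes.util import find_library

# Short library names (as passed to find_library) mapped to the Debian
# packages listed in the comment above. The mapping itself is an assumption.
REQUIRED_LIBS = {
    "Xrender": "libxrender1",
    "Xext": "libxext6",
    "fontconfig": "libfontconfig",
}


def missing_packages(finder=find_library):
    """Return the package names whose shared library could not be located.

    `finder` is injectable so the check can be exercised without touching
    the real system; by default it is ctypes.util.find_library.
    """
    return [pkg for lib, pkg in REQUIRED_LIBS.items() if finder(lib) is None]
```

Calling `missing_packages()` during an image build or at application start gives a list that can be printed or turned into a failure before the first PDF is requested.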

So yeah - the library is fine, but installation ux/docs could use work.

------
taf2
I enjoy “mother f’ing Prawn man”. - district 9 references aside, it’s a great
library for generating PDFs in Ruby.

No mention in this of headless chrome - setup as a lambda service is also
pretty nice

------
gandhium
Looks like they just chose the wrong library. Wkhtmltopdf is intended (as the
name implies) to convert HTML to PDFs, like 'we rendered this nice page in a
browser using D3 and a fancy set of CSS, now we need to export it to PDF'.

And their grievance is about the ability to generate PDFs from a template with
predefined presets. Clearly, other libraries will suit that task better.

~~~
icebraining
I disagree; wkhtmltopdf has many options for exactly their use case (like the
headers and footers), not just for regular webpages.
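
For anyone who hasn't seen those options, here is a minimal sketch of assembling such a command line. `--header-html`, `--footer-center`, the `[page]` and `[topage]` substitutions, and the `cover` object come from wkhtmltopdf's usage text; the function name and the file names in the usage note are hypothetical.

```python
def build_wkhtmltopdf_args(body, output, header_html=None,
                           footer_center=None, cover=None):
    """Build an argv list for wkhtmltopdf with an optional cover and header/footer.

    Page options are placed after the page they apply to, following the
    documented form: [cover <file>] <page> [page options] <output>.
    """
    args = ["wkhtmltopdf"]
    if cover is not None:
        # A cover is its own object type and goes before the body page.
        args += ["cover", cover]
    args.append(body)
    if header_html is not None:
        # The header is rendered from a separate HTML file.
        args += ["--header-html", header_html]
    if footer_center is not None:
        # Plain-text footer; wkhtmltopdf substitutes [page] and [topage].
        args += ["--footer-center", footer_center]
    args.append(output)
    return args
```

Something like `build_wkhtmltopdf_args("report.html", "report.pdf", header_html="header.html", footer_center="[page] / [topage]", cover="cover.html")` yields a list that can be handed to `subprocess.run(..., check=True)`.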

~~~
inferiorhuman
If headers and footers are a significant issue something like XSL-FO (with
Apache fop) may be a better idea. But then you're trading one monster for
another.

------
type0
Prawn is nice, too bad it's incomplete though. I wish Weasyprint was there,
but you just can't guess when and how it will fail on you. At least Puppeteer
is predictable that way: it's good for invoices, short reports and other small
stuff, but it's not really suitable for anything bigger.

------
tonetheman
Headless chrome will create PDFs...

[https://developers.google.com/web/updates/2017/04/headless-c...](https://developers.google.com/web/updates/2017/04/headless-chrome#create_a_pdf_dom)

------
thsowers
I've used wkhtmltopdf for many years, and although it certainly has its
issues, it does work once you get it to a stable point. Is it as easy to get
there as it should be? No.

Recently, I've been investigating writing new projects in react-pdf.

------
rahimnathwani
I tried using Weasyprint ~9 months ago, and really wanted to like it. But lack
of features meant it couldn't render my (not that complex) page properly. So I
switched back to wkhtmltopdf.

------
coldtea
All of the reasons mentioned were BS when wkhtmltopdf was in active
development, as, even with all that, wkhtmltopdf (and phantom.js later) was
the only game in town.

Now we have Chrome headless etc as well.

------
smacktoward
_> PDF is an ancient format, as old as the Web. It was introduced in 1993_

Just because something is old doesn't mean it's bad. "Old" can also mean
"well-understood" and "widely implemented," both of which are true of PDF. I
can make a PDF and be reasonably confident that it can be read on any platform
under the sun, and that someone needing to transform it will have plenty of
tools available to do so. Neither of these things is true of XPS, or really
any other competing format save HTML, which isn't really comparable as it aims
to solve different problems than PDF does.

There are definitely things to not like about PDF, but its age isn't one of
them.

~~~
icebraining
I think that's exactly OP's point - PDF is old, therefore it should be widely
implemented, so "you’d think wkhtmltopdf would be easy to avoid."

It's not a dig against the format.

------
masukomi
I was recently looking for a good HTML to PDF solution and found wkhtmltopdf
and weasyprint results to be... poor at best. Turns out, you can do it on the
command line via a call to Google Chrome and the results are great. Details
here:
[https://weblog.masukomi.org/2018/05/25/html-to-pdf-on-the-co...](https://weblog.masukomi.org/2018/05/25/html-to-pdf-on-the-command-line/)
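
As a rough idea of what that command-line call looks like when scripted, here is a minimal sketch. The `--headless`, `--disable-gpu` and `--print-to-pdf` switches are Chrome's own; the binary name `google-chrome` is an assumption and differs between platforms and installs, and the function names are made up for illustration.

```python
import subprocess


def build_chrome_pdf_args(url, output, chrome_binary="google-chrome"):
    """Build an argv list that asks headless Chrome to print `url` to `output`.

    `chrome_binary` is an assumed default; on some systems the executable is
    called `chromium`, or needs a full path.
    """
    return [
        chrome_binary,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output}",
        url,
    ]


def render_pdf(url, output, chrome_binary="google-chrome"):
    """Run Chrome and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_chrome_pdf_args(url, output, chrome_binary), check=True)
```

Keeping the argument list in its own function makes it easy to log or inspect the exact command before anything is executed.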

------
lmm
Is pandoc a poor option? That would be my first thought for converting between
most document formats, and I thought it had HTML and PDF support.

~~~
jacobush
Pandoc uses Wkhtmltopdf.

~~~
buckminster
It can use any of pdflatex, xelatex, lualatex, pdfroff, wkhtmltopdf, prince, or
weasyprint.
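
Picking one of those engines is done with pandoc's `--pdf-engine` option. A minimal sketch of building that call follows; the engine list mirrors the one above, while the function name and the `weasyprint` default are arbitrary choices for illustration.

```python
# Engines named in the comment above; pandoc may support others as well.
PDF_ENGINES = (
    "pdflatex", "xelatex", "lualatex", "pdfroff",
    "wkhtmltopdf", "prince", "weasyprint",
)


def build_pandoc_args(source, output, engine="weasyprint"):
    """Build an argv list that converts `source` to a PDF using `engine`."""
    if engine not in PDF_ENGINES:
        raise ValueError(f"unknown PDF engine: {engine!r}")
    return ["pandoc", source, "-o", output, f"--pdf-engine={engine}"]
```

The resulting list can be passed to `subprocess.run(..., check=True)`; the chosen engine still has to be installed separately, since pandoc only shells out to it.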

------
baccredited
My mileage varied considerably: I've been using wkhtmltopdf to archive intranet
pages and it has worked great for years.

~~~
pavel_lishin
What's the benefit to archiving intranet pages as PDFs, instead of as .html
files? Are there a lot of embedded image assets?

~~~
baccredited
Mainly the single file output. And PDFs fit nicely into our (ancient) document
management tools. Also pretty confident they will look the same despite
browser changes down the road.

------
untitaker_
If one blog post for every piece of bad software out there was on the front
page of HN I would probably stop reading HN.

------
noir_lord
Hah compared to Jasper it's not harmful.

------
tinus_hn
[https://meyerweb.com/eric/comment/chech.html](https://meyerweb.com/eric/comment/chech.html)

Considered harmful is kind of a tired cliche.

~~~
slantyyz
Misleading too. When I saw the headline I thought it was related to some
critical security issues.

It ended up being "8 reasons why I dislike Wkhtmltopdf"

