Pdfmake – PDF printing in pure JavaScript (pdfmake.org)
185 points by tilt on May 30, 2015 | 47 comments

pdfmake is based on my PDFKit project (http://github.com/devongovett/pdfkit), which does the actual PDF generation under the hood. pdfmake is a layout engine on top of PDFKit that supports a nice declarative JSON document description format.

Has anyone here done a comparison of server-side PDF generators? PDFKit, ReportLab, etc?

Just looked at the sample on the playground section of the website. No hyphenation, no ligatures, and overall the output looks ugly (compared to TeX). What is the advantage over simply creating an appropriate LaTeX/ConTeXt document in the background and serving that?

For the same reason everything eventually gets rewritten in Javascript: sometimes you want to do it in a browser.

In my case, I have a Javascript app that stores critical information for offline use, and users want to be able to print that information nicely. So I generate PDFs in Javascript. This could make that much easier.

> For the same reason everything eventually gets rewritten in Javascript: sometimes you want to do it in a browser.



Deprecating TeX is long overdue - for example creating non-book layouts (i.e. magazines) is a real pain.

So: a TeX compiler that can be called from JS can only be an intermediate solution.

I think http://txtjs.com/ goes in the right direction.

Actually, with css-hyphens[ch], I think html/css has finally taken a leap towards usable text -- something that has been sorely lacking, mostly due to a kind of regression across browsers, where everyone seems to have given up on making html work for text without a ridiculous amount of css, and maybe even js/canvas/svg. This is related to the avoidance of user style-sheets, and of the idea that a bare-bones hypertext document could actually be useful. There's not really anything preventing browsers from presenting an un-styled html page the way A List Apart styles their articles -- absent any css-reset/css-styles.

I don't really think calling TeX from js is a very good idea, nor do I really think canvas is a good idea either.

I do wish Adobe had gained more traction with css regions[cr] -- although I also understand some of the arguments against them[ha].

I still think the idea is good: css for style, semantic html for ... semantics -- and "layout html" for layout. I think pairing semantic markup with css columns is just a bad idea -- and it gives the kind of "half-power" that initially led people to use tables for layout.

So Lie is right in that css regions aren't a great fit for html[ha] -- but I still think they're the best fit for html I've come across. I might be convinced that css regions are the kind of thing that fits well with js/poly-fills[pf] -- although I'm sceptical. We know fonts are Turing complete and have had security issues -- I don't see why we should need to implement core layout with js. At least I suppose "rogue" software vendors can choose to support css regions[cr] -- e.g. for an e-book reader based off of a subset of html+css, css regions might be a good fit, while avoiding the complexity of adding js support and trying to make that secure (including proof against denial of service etc).

Thanks for the link to txtjs, though -- I'll definitely have a closer look.

[ch] http://caniuse.com/css-hyphens

[ha] http://alistapart.com/blog/post/css-regions-considered-harmf...

[cr] http://caniuse.com/css-regions

[pf] http://webplatform.adobe.com/css-regions-polyfill/

Agreed on the LaTeX front. The best part is that it's native and cross-platform on the backend, but there also exist libraries to render client-side in JS, though I can't say I've used any of them.

>overall the output looks ugly (compared to TeX)

what doesn't!

>What is the advantage over simply creating an appropriate LaTeX/ConTeXt document in the background and serving that?

the obvious advantage is that you wouldn't have to learn LaTeX/ConTeXt, no?

Pdfmake could generate LaTeX, which can then be properly compiled to a pdf (potentially with texlive.js).

In Haskell this is possible with similar DSLs. Very useful to generate reports.

You do need to learn _a_ way to generate a document. I don't see how the JS method is easier than, say, using a ConTeXt Lua Document (CLD). I'd really like to compare samples.

pdfmake was not meant to be a LaTeX replacement.

If you want to print a book or a perfectly aligned article, do it in LaTeX.

If, however, you have a web application that prints invoices or other documents, pdfmake can be a good choice and an efficient solution.

People don't want to wait 2-3 seconds these days to get a simple document on their printer. With pdfmake lots of standard documents will print in less than 300 ms and... it scales: as you do everything on the client side, you don't have to care about server overhead, no matter how many concurrent users you have.

How does it compare to:

Mozilla's PDF.js https://mozilla.github.io/pdf.js/

Parallax's jsPDF https://parall.ax/products/jspdf

PDF.js is a Portable Document Format (PDF) viewer.

jsPDF prints using:

    var doc = new jsPDF();
    doc.text(35, 25, "Paranyan loves jsPDF");
    // imgData is assumed to be defined elsewhere (for example, a base64 data URL)
    doc.addImage(imgData, 'JPEG', 15, 40, 180, 160);

Pdfmake on the other hand:

    var dd = {
        content: [
            'First paragraph',
            'Another paragraph, this time a little bit longer to make sure, this line will be divided into at least two lines'
        ]
    };
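To make the difference concrete, here is a toy sketch of what a layout engine does with a declarative description like the one above: it wraps each paragraph and turns it into positioned draw calls of the imperative kind. This is not pdfmake's actual algorithm. The names `measure`, `wrapParagraph` and `layout`, the fixed character width, and the margin values are all assumptions made up for illustration; a real engine measures glyph widths from the font.

```javascript
// Toy layout pass: declarative { content: [...] } in, positioned text calls out.
// Not pdfmake's real algorithm; every name here is hypothetical.

// Assumption: every character has the same width. Real engines read
// glyph widths from the font file.
function measure(text, charWidth) {
  return text.length * charWidth;
}

// Greedy word wrap: keep adding words until the next one would overflow.
function wrapParagraph(paragraph, maxWidth, charWidth) {
  const lines = [];
  let current = '';
  for (const word of paragraph.split(/\s+/).filter(Boolean)) {
    const candidate = current ? current + ' ' + word : word;
    if (current && measure(candidate, charWidth) > maxWidth) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) lines.push(current);
  return lines;
}

// Walk the document definition and emit one draw call per wrapped line.
function layout(docDefinition, opts) {
  const calls = [];
  let y = opts.top;
  for (const paragraph of docDefinition.content) {
    for (const line of wrapParagraph(paragraph, opts.maxWidth, opts.charWidth)) {
      calls.push({ x: opts.left, y: y, text: line });
      y += opts.lineHeight;
    }
  }
  return calls;
}

const dd = {
  content: [
    'First paragraph',
    'Another paragraph, this time a little bit longer to make sure, this line will be divided into at least two lines'
  ]
};

// 300 units wide at 6 units per character gives at most 50 characters per line.
const calls = layout(dd, { left: 40, top: 40, maxWidth: 300, charWidth: 6, lineHeight: 14 });
```

Each entry in `calls` corresponds to one `doc.text(x, y, string)`-style call that you would otherwise write by hand with an imperative library.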

jsPDF doesn't trim each line like Pdfmake does. Pdfmake, however, supports new pages, which jsPDF doesn't. Would be cool if Pdfmake didn't trim the lines!

jsPDF has a method called `addPage()` which adds a new page
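With an imperative API the caller has to decide when to break the page. A minimal sketch of that, assuming a `doc` object that exposes `text(x, y, string)` and `addPage()`: the argument order follows the jsPDF snippet quoted earlier in the thread and may differ in the version you use. `drawLines`, `makeRecorder` and the margin values are hypothetical; the recorder only logs calls so the sketch runs without any PDF library.

```javascript
// Manual pagination over an imperative API. `doc` is duck-typed: anything
// with text(x, y, string) and addPage() works.
function drawLines(doc, lines, opts) {
  let y = opts.top;
  let pages = 1;
  for (const line of lines) {
    // Start a new page when the next baseline would fall below the bottom margin.
    if (y > opts.pageHeight - opts.bottom) {
      doc.addPage();
      pages += 1;
      y = opts.top;
    }
    doc.text(opts.left, y, line);
    y += opts.lineHeight;
  }
  return pages;
}

// Hypothetical stand-in that records calls instead of producing a PDF.
function makeRecorder() {
  const log = [];
  return {
    log: log,
    text: function (x, y, str) { log.push(['text', x, y, str]); },
    addPage: function () { log.push(['addPage']); }
  };
}

const recorder = makeRecorder();
const lines = Array.from({ length: 10 }, function (_, i) { return 'Line ' + (i + 1); });
const pageCount = drawLines(recorder, lines, {
  left: 20, top: 20, bottom: 20, pageHeight: 100, lineHeight: 15
});
```

With these numbers five lines fit per page, so the ten lines end up on two pages with a single `addPage()` call between them.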

Well, I have used PDF.js and it is great. I think it also has a print option.

There's also https://github.com/devongovett/pdfkit, which I've used to bulk-print ticket envelopes for an event. It does support multiple pages, as I had one PDF file per event occurrence (with 50-80 envelopes per).

pdfmake is actually built on top of pdfkit.

This looks really well done. Any chance that it could emulate the functionality of wkhtmltopdf in the browser (a.k.a. CSS styling, structure from HTML)?

I would say pretty low. wkhtmltopdf, like PhantomJS, uses WebKit to lay out HTML and then converts that layout to PDF. HTML layout, including CSS and JavaScript execution, is a heavy lift; this library only scratches the surface of what would be needed to cover wkhtmltopdf.

First you would need a pure js web browser, the closest thing currently being zombie, I think. If you didn't care about running on the server then you could take advantage of the browser the lib is running in to get the current computed layout from the active dom and transform it into the PDF layout. Still a big job, but the heavy lift is done by the browser. Telerik does this in their Kendo lib, where they render an active dom element to their abstracted vector graphics model, then have PDF, SVG and Canvas renderers for that: http://docs.telerik.com/kendo-ui/framework/drawing/drawing-d...
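One small piece of "transform the computed layout into the PDF layout" is the coordinate conversion. A sketch, assuming boxes shaped like the result of `element.getBoundingClientRect()`: CSS defines 1px as 1/96 inch and default PDF user space uses 1/72 inch units with the origin at the bottom-left. `rectToPdf` is a hypothetical helper, not part of Kendo or pdfmake, and some PDF libraries flip the y axis for you, in which case only the scaling applies.

```javascript
// CSS defines 1px as 1/96 inch; default PDF user space units are 1/72 inch.
const PX_TO_PT = 72 / 96;

// Convert a box measured in CSS pixels (top-left origin, y grows downward)
// into PDF points (bottom-left origin, y grows upward).
// `rect` has the shape returned by element.getBoundingClientRect().
// `pageHeightPt` is the page height in points (792 for US Letter).
function rectToPdf(rect, pageHeightPt) {
  const width = rect.width * PX_TO_PT;
  const height = rect.height * PX_TO_PT;
  const x = rect.left * PX_TO_PT;
  // PDF y is the bottom edge of the box, measured up from the page bottom.
  const y = pageHeightPt - rect.top * PX_TO_PT - height;
  return { x: x, y: y, width: width, height: height };
}

// A 192x48 px box placed one inch from the top-left corner of a Letter page.
const box = rectToPdf({ left: 96, top: 96, width: 192, height: 48 }, 792);
// box is { x: 72, y: 684, width: 144, height: 36 }
```

Text, fonts, borders and page breaks are where the real work is; this only covers where a box lands on the page.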

Right now I'm using PhantomJS to render an HTML page. It supports styling via CSS, multiple pages, etc etc. What advantages does this have?

PhantomJS is a full web browser that just happens to output PDFs. This is a library to create PDFs with its own layout model that shares nothing with HTML. Much simpler and lower level.

Is it really simpler? I have to learn a DSL to create PDFs with this, whereas I can just use HTML in the other case.

It's not like there is a direct translation from HTML to PDF. They don't really map 1:1. The PDF exported by a browser from an HTML page is an interpretation of that content. Converting from HTML is not a straightforward, reliable way to produce PDFs. Sometimes the process turns out beautiful, sometimes it turns out kind of shitty. Not to mention there are PDF features which are not supported by an HTML export (for example, forms).

A DSL designed for PDF creation is a much better way to approach this. It gives you way more control over what you're doing. Concerning pdfkit specifically, I think a JSON file is a gruesome way to design a PDF, but IMO it's still better than converting from HTML unless you're doing something very simple.

I sort of disagree. HTML as implemented in PhantomJS, wkhtmltopdf and all interactive browsers is incomplete for paged/print output. However, HTML with CSS paged media is pretty close to being a complete solution; unfortunately the only implementation currently is the commercial PrinceXML.

If someone (Apple or Google) would get wise and implement the full paged media spec then PhantomJS/wkhtmltopdf/your browser might be a viable option for full fidelity print output.

See: http://www.webkit.org/projects/printing/


Until then our apps will have to suffer with one layout language for viewing (HTML) and have another for print like this lib.

Simpler as in the complexity of the library, a web browser is orders of magnitude more complex than this lib.

Simpler as in ease of use? Well, that's subjective: some might find HTML easier than learning a new layout model, others might like something more focused on just PDF layout, like this.

You are using a school bus to transport one person.

But the bus is free and it works fine.

but it comes with schoolbus-type problems and you look ridiculous.

Our customers don't see it so it doesn't matter how we "look". We can re-use existing design assets and UI elements. I can embed SVG charts drawn by D3, or anything else I can otherwise do with "an entire school bus".

Side note: Under the hood, PhantomJS uses wkhtmltopdf. You can probably save yourself some overhead using the wkhtmltopdf binary directly.

No, I believe both use QtWebKit directly.

It used QtWebKit directly. Wkhtmltopdf uses a forked version of WebKit with several PDF-specific enhancements (better support for paging, fixes for a couple of bugs, etc).

Thanks for clarifying!

Nice. This is something I have wanted to find time to write and I'm glad somebody else already has.

I use pdfkit because I have an offline-capable JS app with fairly rich document printing needs. A friendly layout engine on top can save a lot of effort making nice designs.

I used PDFMake on a recent project and was very pleased.

One issue we ran into in browser was the huge size overhead from vfs_fonts - is there a preferred workaround to this issue (we had to serve it separately from the rest of the app), or any solution in the pipeline?

A little off topic: what would be nice is to have something like "print to pdf, aka save page as pdf" via a browser using pdfjs.

In Chrome you can press Ctrl+P and select "Save as a PDF".

"PDF" is a valid application-agnostic printer target on the Debian box I am using.

In OS X too. I believe Windows is the only OS that needs you to install a PDF printer.

On Windows, CutePDF uses Ghostscript for the virtual printer.

On iOS, Apple disabled the API for local virtual printers. There are various "html-to-pdf" apps, but these often perform a non-interactive fetch of the source web page, which uses different auth/cookies and can result in a PDF that differs from the page displayed by Safari.

Does this support svg images (e.g. d3.js charts)?

Commercial, but DocRaptor does, with some additional Javascript parsing on top of PrinceXML.

Do any of these PDF-making solutions properly support RTL languages like Arabic and Hebrew?

Kudos! PDF has a god-awful spec, with which I am intimate. I am impressed with what you have been able to do. I don't think most people realize that the spec for PDF is over 1,000 pages and was written by bipolar sadists.

Is this really longer than the combined specs for HTML and CSS?
