Pollen: the book is a program (racket-lang.org)
254 points by giornogiovanna on May 28, 2019 | 56 comments

MB has done an amazing job of creating an identity. Some of his content seems to stir up small controversies on this forum, and while I can't pick apart the minutiae, it's clear at a high level that he has really made a place for himself as a person in the digital world. His work is interesting and beautiful, and it is all related in a way that makes it clear that he has a theme or drive he is fulfilling.

From Pollen to Practical Typography to his fonts and back to the Racket lessons: while I don't know the publishing timeline, each seems to fit as a part of his path towards a goal, and he did the reader a service by publishing the meta-work. I see his Racket lessons and Pollen as side results of his goal of writing his books, which then became works in themselves.

Beautiful Racket is one of my favorite books in any genre.

I'm actually using pollen to create my online book[1] and it's been very good.

While I write my blog in Markdown, it's super nice to be able to mix real code in with your markup language. For example, if you want to create a special layout for a specific page, or if you want a table with subtly different properties than the rest, it's easy.

It's also very powerful to extend the markup itself. I added support for Tufte style sidenotes[2] which I use extensively throughout the book. This is the markup for the sidenotes:

    Lisp is a pretty nice◊sn{cult} language.

        Some may say it's the language to rule them all.

The way it uses X-expressions to represent text is really intuitive and easy to work with. I do think there's merit in doing this in Lisp instead of, say, Python, because the modeling maps so well to Lisp.

[1]: https://whycryptocurrencies.com/

[2]: https://www.jonashietala.se/blog/2019/03/04/pollen_sidenotes...
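For readers who have not seen X-expressions: the idea is that a document is a tree of tagged elements, and a markup command is just a function that returns such a tree. A rough sketch of that shape in Python (not Racket, and not Pollen's API; `render` and `sidenote` are names made up for this illustration):

```python
from html import escape


def render(xexpr):
    """Render a nested (tag, child, ...) tuple to an HTML string."""
    if isinstance(xexpr, str):
        # Text nodes are escaped so they cannot break the markup.
        return escape(xexpr, quote=False)
    tag, *children = xexpr
    inner = "".join(render(child) for child in children)
    return f"<{tag}>{inner}</{tag}>"


def sidenote(label, body):
    # Hypothetical tag function: ordinary code that returns a subtree.
    return ("span", ("sup", label), ("small", body))


paragraph = (
    "p",
    "Lisp is a pretty nice",
    sidenote("1", "Some may say it's the language to rule them all."),
    " language.",
)
html = render(paragraph)
```

In Racket the tree is a plain list and the tag function is a plain function, which is why the commenter finds the mapping so natural there.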


Your book looks lovely, very much in the style of practicaltypography.com. Perhaps I'll follow along, despite being, as you put it, a "cryptocurrency skeptic".

I'm not sure if fixing this would look better than leaving it as it is (I'm not a typographer), but the "jumping" of the main body during the transitions to (and from) the rare pages that don't have side-notes feels slightly jarring.

Thanks for the feedback!

I didn't realize it before, but flipping through them I think you're right. I'll make sure to adjust them for consistency.

Edit: Practical typography is obviously a big inspiration. I also use two of his fonts (Century Supra and Concourse).

Further off-topic (from the main thread), but your blog Atom feed[0] seems to be slightly broken, due to an unescaped, literal ampersand on line 687 (Tufte style sidenotes & marginnotes in Pollen). Presumably it's a Hakyll bug, as it should do the escaping for you. (Don't have time to dig further.)

[0] https://www.jonashietala.se/feed.xml

Thank you, it does seem to be a Hakyll bug which doesn't escape the '&' char properly.
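To illustrate why a single literal ampersand breaks the feed, here is a small Python sketch using only the standard library. The `entry` helper and the one-element "feed" are simplifications invented for the example; they are not how Hakyll builds its output.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

title = "Tufte style sidenotes & marginnotes in Pollen"


def entry(title_text):
    # Hypothetical, heavily simplified stand-in for one feed entry.
    return f"<entry><title>{title_text}</title></entry>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


broken = entry(title)         # raw "&" is not allowed in XML text
fixed = entry(escape(title))  # "&" becomes "&amp;"
```

A strict XML parser rejects `broken` outright, which is why feed readers can choke on the whole file rather than on the one entry.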

I read a few chapters. Well written!

Thanks for the kind words!

I’ve been using Pandoc + Markdown + LaTeX & friends for a personal project, and at least to me, it seems like a great combo. Prose can be prose, without markup interfering. Macros make it easy to separate style and build consistent structure, and there are oodles of tools in the TeX ecosystem for just about any typographic need you might have. Add tex.stackexchange and a build system (I use make) and it’s pretty easy to set up a great-looking book and an environment that allows you to focus on the content while having all of the bells and whistles (bibliography, index, etc.).

I do like the idea of having the ability to manipulate an AST of the book’s contents. Sometimes it’s difficult to do free-form text manipulation in LaTeX, which means solving the problem in other ways.

How's the support for tables? IIRC there wasn't a single method to write tables that rendered in both Word doc files and PDFs. If I wrote the text in Markdown and the table in HTML, it rendered fine in Word but not in PDFs, while the opposite was true for LaTeX tables. I didn't try the different Markdown flavours that natively support tables, like MultiMarkdown or GitHub's.

In that respect, would something like asciidoc work better as a common source file?

> Sometimes it’s difficult to do free-form text manipulation in Latex, which means solving the problem in other ways.

Pollen supports LaTeX as a backend, so that's an option to consider.

I saw that, and it would make me consider it. However, Pandoc supports LaTeX embedded in Markdown, which has been a nice combo.

Except if you also want an HTML output from the same source...

While this looks impressive, I think "programming a book" is still too tricky for authors.

Repeating syntax and formatting books is hard, but authors struggle the most with writer's block, creating stories, sourcing content for nonfiction, etc. Once much of this hard work is done, a lot of the good ones will partially hand off formatting to an editor/publisher. Of course they'll work closely with them, but whether that work is done in MS Word or in a programming language is secondary.

The majority of these docs are clearly aimed at programmers who are interested in MB's thought process and why the data model is clean, as opposed to what the life of an average author (or best-selling author) looks like, and how the tools will help them write better books.

Wishing the best for this project. Perhaps a version of it could be sold to publishers?

I wouldn't even necessarily say it's "too tricky", just that it's not something that they probably want to be doing (computer scientists aside). If they'd wanted to code, they'd be coders, not authors. I'd be perfectly capable of writing and typesetting a book, but would I want to? Hell nah. So I'd expect the same is true of authors.

No non-developer author or editor is going to look at this and think "Wow, that is amazing. This is the future of publishing."

And I wouldn't bet on unquestioning support from developer authors/editors either.

To be harsh, this is one of those projects where someone who doesn't understand how key parts of an industry work has decided that their "I made a command-line content build system in $interesting_language!" solution is superior to absolutely everything that already exists - based solely on their own opinion.

Professional peer review would not be kind about this.

As someone who does understand how key parts of an industry (i.e., publishing) work, authors broadly fall into one of two camps:

(1) Authors who let other people do all the "non-writing" work, whether via traditional publishing or, if they're self-publishing, paying contractors to do that work. This is both editorial work and, most important for this discussion, layout work. These folks aren't going to use Pollen, but a vanishingly small number of them are going to use anything but a standard word processor. If they get nerdy, they're getting nerdy with Scrivener.

(2) Authors who, for whatever reason, do in fact want to do the layout work themselves. (That group can be subdivided into "people capable of doing the layout" and "people who believe they are capable of doing the layout but really aren't.") They're going to use whatever the hell they feel like using, within whatever constraints are set by whoever they're working with.

In either case, whoever is doing the actual layout is doing it with actual layout tools: at most non-technical publishers this is going to be Adobe InDesign or QuarkXPress; at tech publishers it's at least as likely to be FrameMaker. (In my observations LaTeX rarely shows up unless the author is in the "do the layout myself" group.) In either case, Pollen is aimed at the people doing the layout.

And, since you like being harsh, there's no indication you put the least bit of effort into understanding what Pollen's author understands about publishing, typesetting, or anything else. There are valid criticisms to be made of Pollen, starting with the premise that books "on the web" are something that the majority of readers actually want (and the corollary assumptions like "ebook readers can't do as good a job," which is true only to the degree that current ebook readers/creation software can't do so), but "this guy has no idea what he's talking about, he just wanted to write a command line app in Racket" is not one of them.

I've been frustrated for years by the problem of producing paged media (i.e. stuff that is supposed to go on paper) through open source means. Yes, there's TeX and its ilk, which is great typesetting software, but hard to learn and hard to adapt (in my opinion). Then there are expensive commercial options.

Currently I think the best solution is a headless browser: I use chromium+puppet, plus polyfills like paged.js. Now I can finally have most of the CSS paged media features, including page references.

Then there is the topic of word hyphenation. Chrome technically supports it through CSS, but in practice that is harder to achieve, afaik due to problems with distributing the dictionaries. There are again polyfills. I also think there should be a way to train a neural net on existing hyphenations to hyphenate virtually anything else in a sensible fashion, including fantasy words.

Without hyphenation, text justification doesn't work properly. It's also a good way to save space in tabular layouts, even if just for display on-screen.

Have you actually tried LaTeX? Because I found it quite easy to learn and to adapt. I mean, yes, there are nasty edge cases when you want to achieve something which is not so standard (but often you should not really do that anyway). But you can very much also just avoid them, and then I think it's pretty simple and just works.

I don't really know any other solution, so I cannot really compare. But I would assume that for any other solution you would also have to learn how to handle it, and I'm not sure that would really be simpler or less effort than LaTeX.

I have used LaTeX, though I can't claim to be very proficient in it. At best it is extremely different to HTML+CSS+JS, which I know much better. At worst I did have trouble going beyond templates/packages other people created.

Subjectively I'd much rather generate reports/bills/etc. through HTML than LaTeX, though it is undoubtedly possible.

LaTeX is nice for some use-cases. It works quite well for academic papers, especially if the journal has a standard template to use. But it's probably not so nice in other cases. I’m not sure it would be so convenient to lay out a brochure or a book in LaTeX. (And I mean LaTeX, not TeX or ConTeXt.)

With books, you want lots of control over page size and headers and the styling of section breaks and so on. Often that leads you down the path of importing sixteen different packages to try to get the control you want.

The memoir document class, on the other hand, combines everything you’d need for a book into one place with a consistent interface and excellent documentation. If you want to do a book in LaTeX, I think memoir is what makes it easy.

Yeah, I’ve done a bunch of program-style booklets in LaTeX (including some with side-by-side translations) and it’s really a nice environment for that sort of thing. It’s a bit tricky figuring out all the moving parts but, once you have, it’s straightforward to just drop the text in, edit it a bit, and print out something that looks professional.

Sounds like you might also want to keep eyes on another of Butterick's projects, a document processor called Quad, which creates documents directly, i.e., without going the (La)TeX route.

Repository: https://github.com/mbutterick/quad

Documentation: https://docs.racket-lang.org/quad/

Besides other suggestions here, one of my favourites for publishing is Org-mode + LaTeX (https://orgmode.org/manual/Embedded-LaTeX.html#Embedded-LaTe...).

Org-mode can be thought of as a wrapper around LaTeX. Content in Org-mode is way more human-readable than writing entirely in LaTeX. It is great for prototyping, enabling you to get results quickly, and it also plays well with version control. The produced document can also contain a table of contents, cross-references, etc. There is also the possibility to generate an intermediate LaTeX file where you can fine-tune the typography and layout. To make the most of it, though, you will need to learn a bit of LaTeX.

Definitely worth a try if someone is looking for something simpler than totally committing to LaTeX, but like me would rather avoid HTML + CSS pain points.

I'd rather avoid LaTeX pain points...

Also CSS has gotten a whole lot better.

Is XSL-FO usable? Last time I looked at Apache FOP it lacked things that have been in CSS for decades (e.g. floats).

I write Markdown and convert it to pdf via LaTeX with pandoc [0]. You can also embed some LaTeX directly, like math formulas.

[0]: https://pandoc.org/
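As a sketch of what that conversion looks like when scripted, the following Python builds the pandoc command line. The file names are placeholders and `pandoc_pdf_command` is a helper invented here; `-o` and `--pdf-engine` are real pandoc options, and actually running the command assumes pandoc and a LaTeX distribution are installed.

```python
import shlex


def pandoc_pdf_command(source, output, pdf_engine="xelatex"):
    """Build the argv list for a Markdown -> PDF conversion via LaTeX."""
    return ["pandoc", source, "-o", output, f"--pdf-engine={pdf_engine}"]


cmd = pandoc_pdf_command("book.md", "book.pdf")
print(shlex.join(cmd))  # pandoc book.md -o book.pdf --pdf-engine=xelatex
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would perform the conversion; choosing `xelatex` as the engine is one way to get Unicode input handled.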

Similar here: I use DocBook and then convert to PDF with FOP, all via Red Hat's Publican. Recently I actually started writing in AsciiDoc, which I convert to DocBook and then to PDF and other formats. This is now pretty slick. Both AsciiDoc and DocBook also provide good localization tooling for gettext, which I need.

> Then there are expensive commercial options.

What applications are most common these days in the publishing industry? Adobe InDesign? QuarkXPress?

I was talking about PrinceXML mainly, when trying to automate reports etc. There are others.

> “Sounds a lot like LaTeX. Why not use that?” Also a good idea. LaTeX gets a lot of things right. But it’s also missing a lot — for instance, Unicode and web publishing.

Xe(La)TeX has been around for a while[0] and allows for Unicode.

[0] 2004 for Mac, and 2007 cross-platform.

It's missing a clear description of what those "unique features" are, both the ones that come from using Racket and the ones that matter to a visitor considering using it. In short, the site doesn't sell itself properly.

It's ironic that the Pollen documentation at this site is not available in book form, i.e. a PDF document. Or is it?

The documentation is written in Scribble, another Racket language, and Scribble does allow for PDF output. You can easily produce a PDF version if you have Racket installed. It probably wouldn't offer enough usability improvements over the web version to justify hosting a separate PDF version online.

More generally though, although you can make printed books with Pollen, they are not a core part of Pollen's value proposition. Pollen was created specifically so that books published on the web could be done well. The very first sentence of the documentation says "Pollen is a publishing system that helps authors make functional and beautiful digital books."

Alternatives in the Python world include Jupyter notebooks and Sphinx.

Both have tons of extensions for including all sorts of media and content you need, including musical charts, maps, formulas, plots, tables etc., and can be extended to include virtually everything you want.

The route from Sphinx/Jupyter to PDF/print leads either via LaTeX or via a headless browser. The latter option seems to have more potential, though. HTML+JS can now render just about anything, including math formulas, 3D graphics, and still shots from videos.

> The latter option seems to have more potential though.

How well does it handle page numbers and an associated table of contents?

Really well. CSS has lots of paged media features, which are however not widely supported. There is paged.js which works quite well as a polyfill.

CSS grids and Flexbox are very useful.

Also, thanks to JavaScript it is really easy to incorporate custom layout logic. There are even several ways to use Knuth-Plass text justification.

There are also libraries that perform linear constraint resolution; search for cassowary, kiwi, or gss (constraint CSS).
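To give a flavour of what "custom layout logic" means here, below is a much-simplified Python sketch of the total-fit idea behind Knuth-Plass: pick all line breaks at once so that the overall badness is minimal, instead of greedily filling each line. It is not the real algorithm (no stretchable glue, no hyphenation, no penalties); it measures width in characters and assumes no word is longer than the line.

```python
def break_lines(words, width):
    """Choose breaks minimising the sum of squared trailing space.

    The last line costs nothing. Assumes every word fits within width.
    """
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost to set words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > width:
                break
            cost = 0 if j == n - 1 else (width - length) ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                nxt[i] = j + 1
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


lines = break_lines("aaa bb cc ddddd".split(), 6)
# A greedy filler would produce "aaa bb" / "cc" / "ddddd";
# the optimiser prefers the more even "aaa" / "bb cc" / "ddddd".
```

The same dynamic-programming structure carries over to JavaScript, which is what makes it practical to run in a headless browser.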

Does anyone know of anything similar in a more popular language, like Python?

In particular, what I dislike about all other static generators is they come with their own templating language, which is usually limited, stupid and broken in various ways. I just want to mix Python and Markdown, but have the code replaced with its output, to make it easy to build more complicated structures.

Maybe I'll just write something myself...
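A minimal sketch of what such a tool could look like, assuming a made-up `{{ expression }}` syntax for inline Python. It uses `eval`, so it is only suitable for source files you trust, and it handles single expressions rather than whole code blocks.

```python
import re

# Hypothetical inline syntax: {{ python expression }}
INLINE = re.compile(r"\{\{(.+?)\}\}")


def expand(markdown, namespace=None):
    """Replace each {{ expr }} with str() of its evaluated result."""
    env = dict(namespace or {})

    def run(match):
        return str(eval(match.group(1).strip(), env))

    return INLINE.sub(run, markdown)


source = "There are {{ len(chapters) }} chapters: {{ ', '.join(chapters) }}."
result = expand(source, {"chapters": ["Intro", "Money", "Privacy"]})
print(result)  # There are 3 chapters: Intro, Money, Privacy.
```

The output is still plain Markdown, so it can be handed to any existing Markdown renderer afterwards.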

Interestingly, the author of Pollen attempted to write it in Python first, then switched to Racket. [1]

> “I wrote the initial version of Pollen in Python. I devised a simplified markup-notation language for the source files. This language was compiled into XML-ish data structures using ply (Python lex/yacc). These structures were parsed into trees using LXML. The trees were combined with templates made in Chameleon. These templates were rendered and previewed with the Bottle web server.

> “Did it work? Sort of. Source code went in; web pages came out. But it was also complicated and fragile. Moreover, though the automation was there, there wasn’t yet enough abstraction at the source layer. I started thinking about how I could add a source preprocessor.”

[1] https://docs.racket-lang.org/pollen/Backstory.html#%28part._...

I've created https://github.com/gpoore/codebraid for running Python in Pandoc Markdown. It's written in Python and focused on executing Python code, though it also supports Julia, R, Rust, Bash, and JavaScript. It operates on the Pandoc AST, so there are no Markdown preprocessor issues and there's no custom Markdown syntax.

I'm working pretty much exactly on that. It's not at all fit for consumption yet though. This is a file I'm using for testing: https://gitlab.com/mbarkhau/litprog/blob/master/lit_v3/11_ov...

There's bookdown in R which is great.


Perhaps pweave, which is like knitr for Python.


If you want to publish on the web, why not give Pollen a try (I'm assuming you haven't)? It isn't really a static generator, but the end result is similar.

It's a pretty neat system and worth learning before you head off to invent your own wheel.

I tried Pollen, but I just don't have room in my head for another language I'm only going to use for one purpose, and I felt the cool features were around the combination of text and code, not the Racket/Lisp part.

Not a fan of DSLs? I err on the other side of that coin and tend to produce them even when it may be overkill.

Pointers: paged.js for paged-media polyfills. Sphinx (you can customize the layout). Static page generators.

In general, you'll want javascript-assisted solutions working in the browser, and then some pre-processing in Python, depending on your needs.

I looked at the typography book a bit, but did not immediately pick up on what makes this superior to other digital books.

Terence Parr's 'bookish' is worth a look too, for those wanting to stick to markdown + embedded LaTeX


I got excited for a second thinking this was a reference to the book Pollen by Jeff Noon and some heretofore unknown embedded program. Anyways the Vurt series was way ahead of its time.

Am I the only one who clicked on this because I thought it was about the book Pollen by Jeff Noon? (Didn't look at the URL...)

Has anyone succeeded in installing Pollen on Windows Subsystem for Linux?

Does it support ReStructuredText?
