Writing HTML with Racket and X-Expressions (2019) (xy2.dev)
86 points by xy2_ on April 1, 2020 | 29 comments



RE first Python example:

> But this approach is a little limited. We end up manipulating strings around, instead of a proper data representation for HTML.

That applies even more to the template example. Template engines are gluing strings, and shouldn't be used for the same reason the first example shouldn't - there's a language mismatch. HTML is a tree. By treating it as a series of glued strings, it's easy to generate syntactically invalid HTML, which opens you up for XSS problems. It's the exact same problem that led to SQL injections in the past, and why people use parametrized queries now.

The X-Expressions solution, i.e. generating HTML from a tree structure, is the correct one.
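
To make that concrete, here is a minimal Python sketch (not the article's Racket code). The `(tag, attrs, children)` tuple shape, `render`, and `VOID_TAGS` are names I made up for illustration. The point is that escaping happens in exactly one place, at the leaves, so user input cannot change the shape of the tree. Tag and attribute names are assumed to be trusted here.

```python
from html import escape

# Hypothetical, deliberately incomplete list of void elements for this sketch.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

def render(node):
    """Render a string or a (tag, attrs, children) tuple to HTML text."""
    if isinstance(node, str):
        return escape(node)  # text leaves are always escaped
    tag, attrs, children = node
    attr_text = "".join(
        f' {name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    if tag in VOID_TAGS:
        return f"<{tag}{attr_text}>"  # void elements have no closing tag
    inner = "".join(render(child) for child in children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"

user_input = '<script>alert("xss")</script>'
page = ("ul", {"class": "comments"}, [("li", {}, [user_input])])
print(render(page))
```

The hostile string ends up as inert text inside the `<li>`, because it was only ever a leaf of the tree and never a piece of markup.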


> HTML is a tree

I'm not disagreeing in general, but this is reductio ad absurdum. The original formulation of HTML as an SGML vocabulary has very specific formal rules about the kind of escaping needed in a particular context. Not to mention empty/void elements and tag omission/inference. SGML, since its beginning, has entity references which do have types informing about how they can/must be expanded in a given context. The problem is that template "engines" (except SGML proper and very few HTML-aware ones) want to use ad-hoc "${...}" syntax and treat HTML/SGML as an unstructured text string.


So with entity references, I guess it's a DAG then, with extra idiosyncratic grammar. That doesn't change my main point: it's a structured language. Expressions in a structured language should be built up through an API that reflects that structure, and thus maintains the structural correctness of constructed expression at all times. Not by gluing arbitrary strings together.


Except that HTML/SGML is a format invented for authoring and delivering semistructured text. At the risk of sounding arrogant, the idea is to have domain experts (rather than programmers) create stand-alone documents that can then be rendered and type-checked, can be refined by web developers using text macros, page boilerplate, transformations/stylesheets, and script, and that can (but don't have to) be used for rendering dynamic content from markup or other sources such as services and/or databases, while remaining self-standing, autonomous, type-checkable documents in a larger workflow.


Not exactly shiny technology, but XQuery seems to have a nice syntax [1]:

  element html {
    element body {
      element div {
        attribute id {"main"},
        "foo bar!",
        1 to 15,
        element footer { "this is the footer" }
      }
    }
  }

  <html>
    <body>
       <div id="main">foo bar! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15<footer>this is the footer</footer>
       </div>
    </body>
  </html>
1: https://www.w3.org/TR/xquery-31/#id-computedConstructors


I wrote my book[0] in Racket and the experience was quite good. The idea of manipulating html as X-expressions is great and it allows you to easily implement markup for sidenotes for instance.

[0]: https://whycryptocurrencies.com/


Nice. Thanks!


I don’t get why templates are such a pain?

> We've lost a lot of expressiveness: what if we wanted to create a nested list again? We could make it a template macro, but we have to do extra work to call it inside itself, and suddenly we have an extra language to learn.

I haven’t used Jinja beyond some Ansible, but I don’t get the "second language" part, or why templates are limiting. Even React’s syntax really isn’t that different from a template, since you need to enclose JS expressions in curly braces, right?

My favorite HTML server language is Elixir, which has a Ruby-flavored template engine invoked with `<%= some_func(@val) %>`. It has only 4 “tags” [1]. Maybe because it’s a functional language and everything is an expression, the semantics are simpler, mainly limited to whether or not to return a value. Jinja is probably more difficult since Python control flow isn't an expression, so you add another layer (probably the same in Ruby, Java, etc.). Calling functions is as easy as ensuring they’re in the scope where the template is loaded (or just the current scope when using the ~E sigil). So in the first example, encapsulating a sub-list is just creating a function that returns a template. When compiled, it creates functions that return IO lists, so it’s pretty efficient too [2].

Given that, I prefer having HTML look like HTML.

1: https://hexdocs.pm/eex/EEx.html
2: https://edmz.org/personal/2016/09/06/phoenix_templates-_yup,...


> Even React’s syntax really isn’t that different from a template as you need to enclose JS expressions in curly braces right?

The difference is that they're real, actual JS expressions evaluated in the current scope. Anything you can express in JavaScript you can express in JSX without needing to learn anything new.


Ah, I keep forgetting Jinja expressions can't be real Python. That makes more sense. So really this article is relevant to only full template languages or non-functional languages. edit: Looks like [Selmer](https://github.com/yogthos/Selmer) is similar to EEx in the Clojure world.


Can someone explain the difference between an S-expression and an X-expression? Even after reading the blog post I'm not following; they seem like the same thing to me.


I started dabbling recently so I'm not an expert on the subject, but from my understanding, Racket represents XML data as an X-expression. Symbolic expressions (s-expressions), on the other hand, are the syntactic elements of the Lisp programming language. Both programs and data are represented as s-expressions: an s-expression may be either an atom or a list. So x-expressions are the s-expressions that describe XML.


Generating HTML is a common task. Over the years we have seen multiple approaches in the Racket community. Representing HTML as S-expressions is common. There are even two approaches in use: `xexprs` and `sxml`.

There are other solutions, such as representing HTML as structures (often called records in other languages). This is what I prefer - this allows a little static checking. (See `scribble/html` or `urlang/html`).

Others prefer to work with `templates`.

My point got lost: There is more than one "Racket way" of working with html.


X-expressions (as defined in [1]) are simply a subset of S-expressions.

[1] https://docs.racket-lang.org/teachpack/2htdpbatch-io.html?q=...
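
A rough way to see the "subset" relationship, sketched in Python rather than Racket. `Sym`, `is_sexpr`, `is_attr_list`, and `is_xexpr` are hypothetical names for this sketch, and the X-expression grammar is simplified (it ignores entities, CDATA, comments, and the like): every value accepted by `is_xexpr` is also accepted by `is_sexpr`, but not the other way round.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:
    """Stand-in for a Lisp symbol, to keep symbols distinct from strings."""
    name: str

def is_sexpr(x):
    # An s-expression is an atom or a list of s-expressions.
    if isinstance(x, (str, int, float, Sym)):
        return True
    return isinstance(x, list) and all(is_sexpr(e) for e in x)

def is_attr_list(x):
    # An attribute list is a list of [symbol, string] pairs.
    return isinstance(x, list) and all(
        isinstance(p, list) and len(p) == 2
        and isinstance(p[0], Sym) and isinstance(p[1], str)
        for p in x
    )

def is_xexpr(x):
    # Simplified: text, or [tag, optional attribute list, child x-expressions...].
    if isinstance(x, str):
        return True
    if not (isinstance(x, list) and x and isinstance(x[0], Sym)):
        return False
    rest = x[1:]
    if rest and is_attr_list(rest[0]):
        rest = rest[1:]
    return all(is_xexpr(child) for child in rest)

doc = [Sym("div"), [[Sym("id"), "main"]], "foo ", [Sym("b"), "bar"]]
arithmetic = [Sym("+"), 1, 2]
print(is_sexpr(doc), is_xexpr(doc))
print(is_sexpr(arithmetic), is_xexpr(arithmetic))
```

So `doc` is both an S-expression and an X-expression, while `arithmetic` is a perfectly good S-expression that does not describe any XML.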


The idea is right, but all the 'quoting and explicit list calls make this super awkward.

If you're going to do it, do it right... use macros.

Of course, then you'd end up with something that looks a lot like Haml (http://haml.info/) with parens everywhere.


The Hiccup notation in the Clojure world (eg Reagent) works well, and it doesn't need macros. You're still left with some quoting (for text inside elements and attributes), but in practice most of these come from non-constant values in code.

  (defn simple-component []
     [:div
      [:p "I am a component!"]
      [:p.someclass
       "I have " [:strong "bold"]
       [:span {:style {:color "red"}} " and red "] "text."]])


This is probably one of the major things that Clojure gets right (ignoring EAVT databases) that no one copies

You don't need a CSS preprocessor if you can manipulate your CSS using the standard data manipulators in your given language. This applies to many DSLs such as HTML, CSS, and SQL; Clojure has Hiccup, Garden, and HoneySQL respectively.

Here's a PHP SQL port I've been working on https://github.com/slifin/moonlight (keywords removed for ergonomics but it did lose power when I did that)

If a third party gives you a stringy language to work with, you gain power back by creating a data representation, operating on that, and then converting it back to a string at the execution point.
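
A minimal sketch of that idea in Python (this is not HoneySQL's or moonlight's actual API; the dict shape, `to_sql`, `ident`, and `ALLOWED_OPS` are invented for illustration). The query lives as plain data until the last moment, when it is compiled to SQL text plus a parameter list using `?` placeholders.

```python
ALLOWED_OPS = {"=", "<>", "<", ">", "<=", ">="}

def ident(name):
    # Crude identifier check for this sketch; real SQL quoting rules are richer.
    if not name.isidentifier():
        raise ValueError(f"bad identifier: {name!r}")
    return name

def to_sql(query):
    """Compile a small query dict into (sql_text, params)."""
    columns = ", ".join(ident(c) for c in query["select"])
    sql = f"SELECT {columns} FROM {ident(query['from'])}"
    params = []
    clauses = []
    for column, op, value in query.get("where", []):
        if op not in ALLOWED_OPS:
            raise ValueError(f"bad operator: {op!r}")
        clauses.append(f"{ident(column)} {op} ?")
        params.append(value)  # values never touch the SQL text
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

base = {"select": ["id", "name"], "from": "users", "where": [("active", "=", 1)]}
# Ordinary data manipulation: extend the query without editing any SQL string.
adults = {**base, "where": base["where"] + [("age", ">=", 18)]}
print(to_sql(adults))
# ('SELECT id, name FROM users WHERE active = ? AND age >= ?', [1, 18])
```

The resulting pair can be handed to any DB-API driver that uses `?` placeholders, such as `sqlite3`'s `cursor.execute(sql, params)`.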

In the case of PHP we took the Java route and put everything into objects (https://github.com/zendframework/zend-db). The amount of code required to do that is almost comical by comparison. Zend DB also suffers from a lot of bugs and missing features compared to HoneySQL, which is about 5 files of data-oriented code: https://github.com/jkk/honeysql

I should mention Zend DB is not an outlier here for bulky code; have a look at other PHP query builders, they're all impenetrable.


Not a fan. Given that in a template context the first atom will always be a symbol, all the : is just redundant. I despise visual noise.


The `:` is not visual noise; it indicates that it is a keyword. Without it, you're indicating that you're writing a symbol. Symbols are evaluated, so keywords are a better choice given that they always are what they say they are, whereas you could define `div` to mean `<p>`, point to a function, or any number of other things.

You could write a macro to use symbol names literally, so that `div` always means `<div>`, but then you'd have a macro where none was needed, and you'd have no obvious way of using symbols in a non-literal/evaluated way.

Fulcro[1] lies a little closer to what you're after. The above would be:

  (div 
    (p "I am a component!")
    (p :.someclass
       "I have " (strong "bold")
       (span {:style {:color "red"}} "and red") "text."))

This is because Fulcro defines the element tags as functions, which are called in order, rather than as a data structure that's interpreted (a la Hiccup).

The immediate and obvious benefit of Hiccup over the function hierarchy is that Hiccup can be trivially (and safely) serialized/deserialized.

1: https://github.com/fulcrologic/fulcro
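
A rough Python analogue of the serialization point, with JSON standing in for EDN and nested lists standing in for Hiccup vectors (the `component` value is just an illustration). Because the markup is plain data, it survives a round trip through a string without any custom encoder.

```python
import json

# Hiccup-like markup as plain nested lists and dicts.
component = [
    "div",
    ["p", "I am a component!"],
    ["p", {"class": "someclass"}, "I have ", ["strong", "bold"], " text."],
]

wire = json.dumps(component)   # send over the network, store, log, ...
restored = json.loads(wire)    # and get the same structure back
print(restored == component)
```

With the function-call style, what you hold after evaluation is whatever those functions returned, so there is no equally direct way to ship the unevaluated markup around.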


The advantage is that you can include normal Clojure expressions in it without any special syntax. Also lets you compose Hiccup data and forms without having two mental models of the data representation. Worthy tradeoffs IMO.


In Common Lisp land, there's spinneret, which does something like that: https://github.com/ruricolist/spinneret

I use it to generate my academic publications list, although I'm not 100% satisfied with the elegance of the resulting code: https://github.com/anadrome/cl-pubgen/blob/master/generate-p...


https://koyoweb.org/haml/index.html uses macros to add haml-inspired features on top of xexprs.


You can avoid parens and have some fun building up HTML with combinators.

  > numbers n = docTypeHtml $ do
  >     H.head $ do
  >         H.title "Natural numbers"
  >     body $ do
  >         p "A list of natural numbers:"
  >         ul $ forM_ [1 .. n] (li . toHtml)
https://jaspervdj.be/blaze/tutorial.html


For an even more minimalistic syntax, there is Pug. Here is an example similar to the one in the article:

https://pugjs.org/language/mixins.html

(I think there's a Python port too, but I use the JS version both in Node and in the browser.)


That's not bad!

HTML is one case where I think significant whitespace is a really obvious win.


I think the power of X-expressions would have been better demonstrated with a function that manipulates an HTML tree after creation, which is something you don't really want to do if your HTML tree is represented by a big string rather than an actual tree.
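
Something like the following, sketched in Python instead of Racket. The `(tag, attrs, children)` tuple shape and the names `map_elements` and `nofollow` are my own assumptions, not anything from the article. It walks an already-built tree and adds `rel="nofollow"` to every link, which would take a parser or fragile regexes if the page were one big string.

```python
def map_elements(node, fn):
    """Return a new tree with fn applied to every element, children first."""
    if isinstance(node, str):
        return node  # text leaves are left untouched
    tag, attrs, children = node
    new_children = [map_elements(child, fn) for child in children]
    return fn((tag, dict(attrs), new_children))

def nofollow(element):
    """Add rel="nofollow" to <a> elements; leave everything else alone."""
    tag, attrs, children = element
    if tag == "a":
        attrs = {**attrs, "rel": "nofollow"}
    return (tag, attrs, children)

page = ("p", {}, ["See ", ("a", {"href": "https://example.com"}, ["this"]), "."])
result = map_elements(page, nofollow)
print(result)
```

The original `page` is not modified, so the same tree can be fed through several independent transformations.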


This is JSX with parens.


Something similar (and less of a parentheses hell) is Scalatags by the amazingly prolific Scala programmer lihaoyi. The good thing about Scalatags is that the same code can be used both in the backend and in the frontend. (I don't use Scala on the backend anymore but Scala.js is really good!)


In one of my current projects, I'm using Scalatags on the server side to generate strings, and then using the same templates on the client to generate Preact.js vdom nodes to re-hydrate the server-side templates. Cross-platform, cross-backend templates are really a very nice thing to have!



