That people produce HTML with string templates is telling us something (utcc.utoronto.ca)
232 points by ingve 11 months ago | 387 comments



"One of my fundamental rules of system design is when people keep doing it wrong, the people are right and your system or idea is wrong."

excellent, and in hilarious contrast with the responses in this thread...


> "One of my fundamental rules of system design is when people keep doing it wrong, the people are right and your system or idea is wrong."

Digital desire paths? https://en.wikipedia.org/wiki/Desire_path


I prefer digital cow paths.


For the desire/cow/elephant fans out there, this Instagram account shares satisfying photos of urban desire paths (and the attempts to block them): https://www.instagram.com/olifantenpaadjes/



I learned this concept hiking with a ranger when I was young. I caught myself about to cut the corner of the trail. I jokingly shamed myself for the thought, and he said, "That's a sign we designed it wrong here." He went on to explain that they do watch hikers for how they got it wrong in that sense.


The assumption in that statement is that the trail being convenient for people is the only important consideration.

I was out hiking a couple of years ago on a very steep trail with lots of signs telling people to stay on the trail because they were trying to regrow the forest in the surrounding area to prevent landslides... and what did people do? They cut through it anyway. No wonder they had to shut down entire portions of the trail.

Sometimes things have to be done a certain way for other reasons. The most convenient technical solution is not always the right solution either.


The assumption? I think it is more subtle than that. Your example suggests that if there were two trails, say one for the hasty short-cutters and one for everyone else, the inevitable damage could have been minimized and a closure avoided.

But my comment wasn't about trail management; I'm recounting an anecdote from 25 years ago. The point was to check your assumptions against reality, and adjust accordingly.


I wish more trails I walked had "fast, hard, short" vs "slow, easy, long, pretty" route markers - and not necessarily in that combination! Sometimes a long trail is long because it's pretty, sometimes because it has low grade. I've walked with people on crutches and in wheelchairs, and warnings that "this trail has steps" have been invaluable (and really annoying when missed).


Also, shortcuts on switchbacks increase erosion of the hill. Too much of that and the trail will get closed as unsafe.


> The assumption in that statement is that the trail being convenient for people is the only important consideration.

Is that an assumption, or what the ranger said? No need to make up requirements that don't exist.


I might not totally understand the context of the trail you were on, but how does making the trail less convenient support re-growing the forest? It seems like the regrowth probably has little to no preference on where it occurs, so couldn't the trail designers still have made the trail with convenience as the chief goal?


I have seen this on trails.

Regrowth, itself, has no preference. But the people maintaining the mountain want regrowth to happen in such a way to mitigate erosion.


That sounds a lot like the OP's point to me. They didn't go around telling people to stop, they closed the trail, and people stopped.


It reminds me of theatre writing, bizarrely. A lot of playwrights I know think that the audience is wrong when they don’t come to see their plays, don’t interpret the meaning of the play as understood by the playwright, or don’t laugh at the jokes… but in my opinion the audience is always right, and playwrights are there to serve them, not the other way round


You have to be careful with this — the playwright should have a model of the “intended audience”. It is to this group he serves — not any random person off the street.

The playwright is not wrong because the play could not be understood by a man who doesn’t understand the language the play is written in. There is an expected background that allows the play to be more than a blast of noises designed to only interact with your basic senses (which would also assume the audience has those senses in the first place), and the audience can be wrong for not meeting that expectation.

Of course, the intended audience may have no relationship to the audience he will get — in this case the playwright is unreasonable, though the play may still be correct.

I would say both the audience and the playwright operate in a symbiosis; they serve one another, and they both have responsibilities in the matter.


Yes, exactly. I’ve seen the same discussion play out as “if you make a game and somebody plays it, and they don’t enjoy it, it’s your fault.” It seems as nonsensical as, “if you’re a chef and somebody doesn’t like your food, it’s your fault.” A midwesterner with a dislike for fish, who likes meat well done, could conceivably end up in a sushi restaurant.

The point of having a market of plays, games, and restaurants is that we can match producers and consumers with each other. People are going to watch movies they don’t like, eat food they hate, and watch plays that they think are boring. That doesn’t mean that we have to assign responsibility (or blame) to anybody for it! Not everybody has to like your play.


There's a large difference between a single one off not liking something and a majority not liking something.

Even with food tastes. Knowing the expected audience where you are at does count. And if you're in an area with enough people, even then you can do well with a limited portion of the population.


> There's a large difference between a single one off not liking something and a majority not liking something.

If an army of vegetarians walks into a Brazilian steakhouse and complains about the lack of food available, the story remains the same. The vegetarians are wrong for not meeting the expectations of the restaurant's intended audience. It does not matter if you have 1 vegetarian, or millions of them. The second they chose that restaurant, knowing its intended audience and their own restrictions/preferences, they were in the wrong.

If an army of meat-loving southerners, clamoring about their love for all things flesh, walked into the same restaurant the next day -- would you suddenly turn around and claim the restaurant is now correct? The audience has spoken!

And if it were 50/50? The restaurant is simultaneously right and wrong!

It would be an act of absurdity. The audience is no singular contiguous thing; it can be shifted and manipulated into all sorts of opinions -- the majority opinion is a temporary state.

It would be just as absurd to demand that the steakhouse be made hospitable and of similar quality to both the vegetarians and the omnivores -- it is in serving these subsets of the world's preference that the provider refines their production. To serve equally to all is to provide the lowest common denominator -- something to please none, just as it offends none.

It doesn't matter that the area is full of vegetarians; should the omnivores not be granted meat because 51% of their peers refuse it? Because 67% refuse? 99% refuse? Let the market dictate it nonviable, but do not reject simply because of majority rule.


Again, I was not referring to a one off group. Or person.

But if you have a restaurant and in the course of your first year, only one person likes your food, it's probably you. I mean if you want to go into hyperbole let's go there.


I wasn’t talking about a one-off group either. Let the vegetarians come daily; they are no more correct on day 301 than they were on day 1. They can make up 1% of the restaurant’s visitors, or 99%; they have not been made more correct.

You could argue that the restaurant is unreasonable to not service this audience — they’re leaving money on the table — but you cannot say that the restaurant is incorrect in trying to serve a particular cuisine for a particular audience.

To argue otherwise is to demand that no Chinese restaurant should exist in an American town, serving Chinese food appreciated by Chinese people, because the majority of the locality is American. If you want to argue what matters is the people who actually visit… then ignoring those incorrect visitors will eventually filter them out, leaving you with the audience actually intended (or rather the audience you deserve? Which hopefully matches your intent)


My argument is mostly that if you don't have enough of an audience to remain open, and there is generally enough of an audience in the area of the restaurant, then it's upon you, not the (potential) customers, to adapt.


Sure; I'm calling that unreasonable, but not incorrect. The market determines what is profitable, not what is good. Ultimately if you want something good to persist, you must also ensure it is profitable (or find ways around the market -- subsidies), but it is not the case that profitable things are inherently good, and it is not the case that things are inherently not good because they are not profitable.

So I say it is unreasonable to hold onto something good in the face of a lack of profitability, unwilling to change, but it does not say anything about whether it was produced well for the audience they intended to serve (it is simply the case that their intended audience either does not exist, or does not exist in sufficient numbers to be profitable -- or it was poorly produced for the intended audience).


I know many people who believe that the "American Chinese food" in some regions of the US is so bland and greasy not because the people making it don't know how to make good Chinese food, but because they're trying to sell Chinese food to a market of people who actively dislike everything that makes authentic ethnic Chinese cuisine distinctive, and that some watered-down tasteless glop (and I don't mean congee, lol) is the local maximum they've found for marketability in that environment.

(Of course, the global maximum — at least for someone who wants to continue to serve that particular market — would be to stop trying to sell these people Chinese food at all, if they're not going to like it. And instead, to learn to cook something where you and your target market can agree on how it should taste.)


> Of course, the global maximum — at least for someone who wants to continue to serve that particular market — would be to stop trying to sell these people Chinese food at all

It's entirely possible that those people like bland Chinese food.


They do like it more than they like Chinese food that has Chinese-food flavors in it, but even with it taken as far as it can be toward their tastes, they still don't like it as much as they like mediocre examples of other cuisines, let alone good examples of other cuisines. To go from a 3/5 to a 4/5 in the eyes of many of these markets, there's nowhere to go but to just start selling tacos or something. (Source: my Cantonese chef uncle-in-law who lives in the midwest.)


I spent most of my life in Arizona... There are some pretty garbage tacos out there... And Taco Bell, Del Taco, and Taco Dons aren't good. They're OK... Not good or authentic.

So even your counter example can have the same bias.


Even when most people don't like something, all that it means is that it's niche.

But when you do have a lot of those people trying it, just to come out disappointed, then you have a communication/publicity failure.


That's a great point about the intended audience but I wanted to mention that I was a bit distracted by how you gendered some of your language.

For example you talk about "The playright" and then refer to "the group he serves" which causes one to imagine the archetypal playwright as a man. Again, the person who doesn't understand the language is "a man who doesn’t understand."

I know this style of writing was the norm in the past but I found it quite jarring to see it today and honestly I had to read it again to catch your point. On the re-read I caught that only the "random person off the street" was a person and not a man.

Anyway, hope it's ok to call it out, don't mean to come across as unfriendly.


In the English language, the ungendered or unknown-gender form is identical to the masculine. And even in languages without this property, there is no issue referring to a hypothetical person as male.


> I would say both the audience and the playwright operate in a symbiosis; they serve one another, and they both have responsibilities in the matter.

You may enjoy reading R. G. Collingwood’s The Principles of Art. Here is a relevant excerpt:

“ Next, with regard to the arts of performance, where one man designs a work of art and another, or a group of others, executes it. Ruskin (who was not always wrong) insisted long ago that in the special case of architecture the best work demanded a genuine collaboration between designer and executants: not a relation in which the workmen simply carried out orders, but one in which they had a share in the work of designing. Ruskin did not succeed in his project of reviving English architecture, because he only saw his own idea dimly and could not think out its implications, which was better done afterwards by William Morris; but the idea he partly grasped is one application of the idea I shall try to state.

In these arts (I am especially thinking of music and drama) we must get rid, to put it briefly, of the stage-direction as developed by Mr. Bernard Shaw. When we see a play swathed and larded with these excrescences, we must rub our eyes and ask: ‘What is this? Is the author, by his own confession, so bad a writer that he cannot make his intention clear to his producer and cast without composing a commentary on his play that makes it look like an edition for use in schools? Or is it that producers and actors, when this queer old stuff was written, were such idiots that they could not put a play on unless they were told with this intolerable deal of verbiage exactly how to do it? The author’s evident anxiety to show what a sharp fellow he was makes the first alternative perhaps the more probable; but really there is no need for us to choose. Whether it was the author or the company that was chiefly to blame, we can see that such stuff (clever though the dialogue is, in its way) must have been written at a time when dramatic art in England was at its lowest ebb.’

I am only using Mr. Shaw as an example of a general tendency. The same tendency is to be seen at work in most plays of the later nineteenth century; and it is just as conspicuous in music. Compare any musical score of the late nineteenth century with any of the eighteenth (not, of course, a nineteenth-century edition), and see how it is sprinkled with expression-marks, as if the composer assumed either that he had expressed himself too obscurely for any executant to make sense of the music, or that the executants for whom he writes were half-witted. I do not say that every stage-direction in the book of a play, or every expression-mark in a musical score, is a mark of incompetence either in the author or in the performer. I dare say a certain number of them are necessary. But I do say that the attempt to make a text fool-proof by multiplying them indicates a distrust of his performers on the part of the author which must somehow be got rid of if these arts are to flourish again as they have flourished in the past. This cannot be done at a blow. It can only be done at all if we fix our eyes on the kind of result we want to achieve, and work deliberately towards it.

We must face the fact that every performer is of necessity a co-author, and develop its implications. We must have authors who are willing to admit their performers into their counsels: authors who will re-write in the theatre or concert-room as rehearsals proceed, keeping their text fluid while the producer and the actors, or conductor and orchestra, help to shape it for performance; authors who understand the business of performance so well that the text they finally produce is intelligible without stage-directions or expression-marks. We must have performers (including producers and conductors, but including also the humblest members of cast and orchestra) who take an intelligent and instructed interest in the problems of authorship, and are consequently deserving of their author’s confidence and entitled to have their say as partners in the collaboration. These two results can probably be best obtained by establishing a more or less permanent connexion between certain authors and certain groups of performers. In the theatre, a few partnerships of this kind are already in existence, and promise a future for the drama that must yield better work on both sides than was possible in the bad old days (not yet, unfortunately, at an end) when a play was hawked from manager to manager until at last, perhaps with a bribe of cash, it was accepted for performance. But the drama or music which these partnerships will produce must in certain ways be a new kind of art; and we must also, therefore, have audiences trained to accept and demand it; audiences which do not ask for the slick shop-finish of a ready-made article fed to them through a theatrical or orchestral machine, but are able to appreciate and enjoy the more vivid and sensitive quality of a performance in which the company or the orchestra are performing what they themselves have helped to compose. 
Such a performance will never be so amusing as the standard West-end play or the ordinary symphony concert to an after-dinner audience of the overfed rich. The audience to which it appeals must be one in search not of amusement, but of art.

This brings me to the third point at which reform is necessary: the relation between the artist, or rather the collaborative unit of artist and performers, and the audience. To deal first with the arts of performance, what is here required is that the audience should feel itself (and not only feel itself, but actually and effectively become) a partner in the work of artistic creation. In England at the present time this is recognized as a principle by Mr. Rupert Doone and his colleagues of the Group Theatre. But it is not enough merely to recognize it as a principle; and how to carry out the principle in detail is a difficult question. Mr. Doone assures his audience that they are participants and not mere spectators, and asks them to behave accordingly; but the audience are apt to be a little puzzled as to what they are expected to do. What is needed is to create small and more or less stable audiences, not like those which attend a repertory theatre or a series of subscription concerts (for it is one thing to dine frequently at a certain restaurant, and quite another to be welcomed in the kitchen), but more like that of a theatrical or musical club, where the audience are in the habit of attending not only performances but rehearsals, make friends with authors and performers, know about the aims and projects of the group to which they all alike belong, and feel themselves responsible, each in his degree, for its successes and failures. Obviously this can be done only if all parties entirely get rid of the idea that the art in question is a kind of amusement, and see it as a serious job, art proper.

With the arts of publication (notably painting and non-dramatic writing) the principle is the same, but the situation is more difficult. The promiscuous dissemination of books and paintings by the press and public exhibition creates a shapeless and anonymous audience whose collaborative function it is impossible to exploit. Out of this formless dust of humanity a painter or writer can, indeed, crystallize an audience of his own; but only when he has already made his mark. Consequently, it is no help to him just when he most needs its help, while his artistic powers are still immature. The specialist writer on learned subjects is in a happier position; he has from the first an audience of fellow specialists, whom he addresses, and from whom an echo reaches him; and only one who has written in this way for a narrow, specialized public can realize how that echo helps him with his work and gives him the confidence that comes from knowing what his public expects and thinks of him. But the non-specialist writer and the painter of pictures are to-day in a position where their public is as good as useless to them. The evils are obvious; such men are driven into a choice between commercialism and barren eccentricity. There are critics and reviewers, literary and artistic journals, which ought to be at work mitigating these evils and establishing contact between a writer or painter and the kind of audience he needs. But in practice they seldom seem to understand that this is, or should be, their function, and either they do nothing at all or they do more harm than good. The fact is becoming notorious; publishers are ceasing to be interested in the reviews their books get, and beginning to decide that they make no difference to the sales.

Unless this situation can be altered, there is a real likelihood that painting and non-dramatic literature, as forms of art, may cease to exist, their heritage being absorbed partly into various kinds of entertainment, advertisement, instruction, or propaganda, partly into other forms of art like drama and architecture, where the artist is in direct contact with his audience. Indeed, this has begun to happen already. The novel, once an important literary form, has all but disappeared, except as an amusement for the semi-literate. The easel-picture is still being painted, but only for exhibition purposes. It is not being sold. Those who can remember the interiors of the eighteen-nineties, with their densely picture-hung walls, realize that the painters of to-day are working to supply a market that no longer exists. They are not likely to go on doing it for long.”


Extremely weird to have either opinion.


Why do you think this?


Because people write plays for a variety of reasons, not just to "serve" the audience. We might write plays to intentionally provoke the audience, to make them angry, or sad, or otherwise feel some emotion they might otherwise feel for some surprising reason. We might write a play because we were seized by a "genius" and neither personal nor social goals adequately describe the reason. Or maybe we write plays merely as an interaction in some kind of entertainment "marketplace." In reality, these two different sorts of activities are interrelated in complex ways and the foremost goal of the artist (even the foremost goal of the viewer) might change from day to day or moment to moment.

Socrates talks about this, I think. A good doctor doesn't serve the patient. A good doctor serves the patient's health and this might actively piss the patient off. Sometimes an artist is a kind of social doctor (or they may aspire to be one).

Thinking of the complex social relation between playwright and audience as driven entirely by either the ego of the artist or the ego of the audience member flattens out the roles of both to such an absurd degree that discussion of the thing in question is impossible. Hence, doing so is "weird."

[EDIT]: Thinking about this has crystallized something which has been swimming around in my brain for a while. I think a fundamental way in which the current internet undermines human beings and produces alienation is that people fundamentally need to be met with a degree of resistance from people and serendipity from the world. When we seek out art we are, in a certain sense, seeking deliberately to be given something we don't want, explicitly. When we forage for novelty, we do not want to be served up something "curated" for us, but something which we could not have anticipated on the basis of our previous habits. Building marketplaces for every conceivable kind of human interaction undermines this basic need on the part of human beings. Recommendation engines and curation algorithms undermine this need. Even an object like ChatGPT, in a way, can't meet it. When I talk to a human I want to be, in some small way, and not always, genuinely surprised by what they say. It is difficult for a machine which is trained to predict the next token to do this (it is obviously not impossible because LLMs (and other algorithms) know much more than a person and can thus surprise us simply by conjuring up that with which we haven't yet made contact).


Good thing people don't use this reasoning for web security:

- Storing credentials in plaintext

- Not validating input

- SQL injection

- etc

All more convenient than doing it the right way.
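Taking the SQL-injection item as a concrete example, here is a minimal sketch (using Python's stdlib sqlite3 and a made-up one-row users table) of how the convenient version differs from the right way:

```python
import sqlite3

# A toy table with a single user, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

malicious = "' OR '1'='1"

# Convenient but unsafe: untrusted input spliced straight into the SQL string.
unsafe_query = f"SELECT * FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_query).fetchall()
print(len(unsafe_rows))  # 1 -- the injected OR clause matches every row

# The less convenient "right way": a parameterized query treats input as data.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)
).fetchall()
print(len(safe_rows))  # 0 -- nobody is literally named "' OR '1'='1"
```

The parameterized form is only a few characters longer, which is arguably why it succeeded where "escape your HTML by hand" has not.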


If we don't care about the UX, then it would be more "convenient" for the developer to just not write the program in the first place.

Using string templating makes the DX better without compromising UX, since users just see the rendered output. Implementing bad/nonexistent web security also makes the DX easier since there's simply fewer features to implement, but this obviously has negative consequences on UX when folks have their accounts/credentials easily stolen.


Using string templating for HTML is bad/nonexistent web security, so by your argument it does compromise UX.


By your argument, everyone using string templating for HTML has bad/nonexistent web security. I disagree.


Not everyone, just the people whose pages display untrusted inputs. Which is a huge fraction of the modern web...

(The rest just have brittle websites that might break when someone uses certain punctuation for the first time.)


Ah okay I see now you were referring to failure to sanitize inputs/outputs in the original comment. I don't know if this oversight occurs more often when using string templating, but I'm pretty sure this was already a problem long before string templating came into practice.


It's literally the reason why HTML templating is done by means other than string concatenation these days.


Isn't that why server side validation exists? What's wrong with letting the user enter whatever they want? It doesn't mean it has to be accepted.


Validation can force usernames to be a-z, but it doesn't work on freeform text. Forum comments should be able to state that the HTML open comment syntax is <!--


Not really. Lots of template engines escape and/or sanitize interpolated expressions, according to the context, by default.
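For what it's worth, the behavior such engines provide by default for text content can be sketched in a few lines of stdlib Python, with html.escape standing in for an engine's context-aware escaping (the comment string here is made up):

```python
from html import escape

comment = 'The HTML open comment syntax is <!-- and "quotes" work too'

# Naive string templating: the markup leaks straight into the page.
raw = f"<p>{comment}</p>"

# What an auto-escaping engine does under the hood for a text context:
safe = f"<p>{escape(comment)}</p>"
print(safe)
# <p>The HTML open comment syntax is &lt;!-- and &quot;quotes&quot; work too</p>
```

Real engines go further, choosing different escaping rules for attribute values, URLs, and script contexts, but the principle is the same: the interpolation point, not the caller, is responsible for escaping.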


Well that goes far beyond what I think of as "string templates", now you're parsing the string into HTML.


Not necessarily, it just means the right way should be made easier.


I'm sure everyone is very interested in hearing about how that should be done.


Simply because you don't have an answer doesn't mean you can't point out that something is wrong.


'Something is wrong' is different from 'this can be done better', though. The first one is unfalsifiable, the second can be potentially falsified by putting forward a new way to do it. The first one is mushy language with no clear goal, the second one is an actual constructive plan forward.


The primary design constraint for all of those is security, whereas for generating HTML it's some combination of simplicity/maintainability/performance.


There are also security implications for how HTML is constructed.


And which of those implications bear on string templating?

Of course there are examples of situations where this heuristic doesn't apply, but that doesn't mean it's a bad idea that we should totally disregard. This kind of thinking has plagued engineering fields for a long time; Don Norman talks about it in "The Design of Everyday Things". Engineering teams get mad when users don't use their products the "right way", when really they just won't admit to themselves that they've implemented bad design. Simpler, cleaner designs and use patterns tend to win as time goes on.


Untrusted input has to be escaped before injecting it into an HTML document, or else there is a script injection vulnerability when text from one user is executed as script in another user's browser. Good templating systems eliminate this possibility through parameter systems, but maybe those are still considered string templating systems?


I was going to jump in with a pithy DOET quote[1], but you got the essence quite correct: if the intended users of your system can't get it right, then you, the designer, got it wrong.

[1] Maybe something about "probably won awards" :-)


And there are plenty of things that were once convention that are wrong. Along with plenty of implementations of security that actually aren't good.

If security is a goal, there is a difference between doing anything and actually having a secure system. There is also such a thing as secure enough.


Right? I've never seen a thread like this; it's hilarious.

Hacker news thread talks about how to do HTML. Guy writes article refuting the thread.

But it happens backwards

TENET!


Can you explain? What I’m reading here is people discussing drawbacks of various non-string-based systems, which seems like an appropriate reaction to a guy telling them that maybe people use strings because the non-string-based stuff sucks. (Not being well-known or available in a widely-used language is a drawback in this context!)


And the author made the (annoying) point that I'm now ready to make as a bit of an old-timer here: when people keep recreating the "bad design pattern" over and over, you should probably get over it and roll with it.


> [Y]ou should probably get over it and roll with it.

I’m not sure that’s what he’s saying:

> If people want to displace string templating, figuring out what those current advantages are and how to duplicate them in alternatives seems likely to be important.

But that’s not an interesting objection—if you want to say it, we might as well use that to justify talking about it instead. What is interesting from my point of view is that I can’t see what it would actually mean to “roll with it”.

Stop trying to invent something better? Thanks but no thanks. (I’m just a sucker for potentially extremely neat things with a long history of mostly failing—structural editing, live programming, graphical programming... I doubt anybody can reform me at this point.)

Try to mitigate problems that result from this? If there ever was something that failed even harder and in even stupider ways than eliminating string templating, it’s web application firewalls and their ilk. At least I haven’t ever heard of them stopping a determined or even somewhat competent attacker.

Try to trick people by doing something that looks like string templating but is in fact syntactic? Worth exploring, but doesn’t really count as rolling with it, I think.

The only thing I can imagine here is tainted strings, and those do work, but like the previous option they are hardly seamless. Something else? What?


> I’m just a sucker for potentially extremely neat things with a long history of mostly failing—structural editing, live programming, graphical programming... I doubt anybody can reform me at this point.

There exists a cohort of people, so-called “harbingers of failure”, who inexplicably prefer and buy new products which turn out to be flops. I suspect I am one, too.

The topic of this strange kind of people seems to be discussed here quite a lot: https://hn.algolia.com/?q=harbingers+of+failure

You could probably be one of such people. I think you should document your preferences somewhere public, so that we know what else is likely to turn out to be a flop.

> tainted strings

At least in Perl’s implementation (one that is famous among me) it’s possible to untaint them accidentally by doing some innocuous operations which may not be directly related to their final purpose.
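For anyone unfamiliar with the idea, here is a rough Python sketch of taint tracking; the Tainted class, untaint, and render_paragraph are all hypothetical names, and this toy catches far less than Perl's -T mode does:

```python
from html import escape

class Tainted(str):
    """Hypothetical marker type for untrusted input, loosely modeled on taint mode."""

def untaint(s):
    # Escaping is the single blessed route from tainted input back to plain text.
    # str.replace inside escape() returns a plain str, dropping the marker type.
    return escape(str(s))

def render_paragraph(body):
    # A taint-aware sink: refuse to interpolate unescaped untrusted input.
    if isinstance(body, Tainted):
        raise TypeError("refusing to interpolate a tainted string into HTML")
    return f"<p>{body}</p>"

user_input = Tainted("<script>alert(1)</script>")
print(render_paragraph(untaint(user_input)))  # angle brackets safely escaped
# render_paragraph(user_input) would raise TypeError
```

The "accidental untainting" problem the parent mentions shows up here too: any ordinary string operation on a Tainted value (slicing, concatenation, join) silently returns a plain str, losing the marker.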


I think you still need to explain further. It is not clear why you think the author is being annoying. It sounds like you are agreeing with the author in your last sentence. I'm confused.


No matter what other damage it causes?


"Roll with it" probably means that we should expect people to use strings for templating, so we should design other downstream mechanisms to handle or prevent potential damage.


For years I have been fascinated with the idea of a template system that works at the DOM level, that’s a bit like what happens with React, but really parsing HTML at the DOM. I wrote some of the ideas up here

https://ontology2.com/the-book/html5-the-official-document-l...

One big problem is that systems like this are orders of magnitude slower than text-based template systems. Another one is a problem with namespaces. If you mash together two arbitrary documents they could have identical id attributes (forbidden) or identical classes (talk about wires getting crossed). You ought to be able to transclude an arbitrary HTML document into another one, but you’d need to rewrite the CSS to eliminate conflicts in some cases, which I think is possible but is rarely done.


One of my fundamental rules is that if a rule involves a simple binary, it's probably wrong.

Many of the responses in this thread are highlighting that the analysis here is useful but oversimplified: if people are "doing it wrong" it's an opportunity to reflect on our approach to UX/DX, but accepting populism for its own sake is throwing out the baby with the bathwater.

It's also been pointed out by numerous commenters that the central qualifier - that everyone uses string templates - isn't even true. The most popular front-end systems in modern stacks are structured html. It's in common use today.


Anyone here ever take analytical chem lab? This is essentially how you get graded. Just wondering if anyone else was subjected to punishment for being careful?


someone please tell this to game designers


This is lesson #1 in https://youtu.be/QHHg99hwQGY


Yet another incident of Greenspun’s Tenth Rule.

Lisp macros make adding HTML syntax easy. You won’t find anyone using string templates in that language because a handful of macros means you can just program like it’s just lisp.

Strings suck primarily because they don’t establish regularity. If you don’t understand everything fully and follow their patterns exactly, it’s easy to accidentally lose your pseudo-macro hygiene and output garbage.

JSX was a revelation simply because it was a “macro” (DSL) that ECMA had already designed an entire spec around (E4X) and thoroughly baked into the language. Like with Lisp, you could just use your normal coding patterns to interface.

A custom HTML macro baked into the syntax makes sense in a language where almost everyone using it is going to need HTML. It would make far less sense to dedicate all that syntax space in a more general purpose language.

And in JS, even with all that design time spent on E4X, you are still back to doing string interpolation the second you step away from that specific syntax (or you’re forced to express everything as HTML even if it’s not a good fit).

The world would be a better place if JS had been scheme and people had been forced to learn a lisp.


Another reason you won't find people using string templates to produce HTML in Lisp is that no one uses it for web development. This phenomenon where multiple language features conspire to prevent misuse is called defense in depth and is one of the great strengths of the language.


Except, you know, the site you are using right now, and a large number of sites that use Clojure/Clojurescript.


I could type a single open parenthesis and blow up this whole site.


this trolling would be funnier if you didn't obviously spend so much time posting here


Ironically, the dominant solution for dynamic HTML in Clojure apps, Hiccup, does not rely on macros as much as it relies on keyword and collection literals.


Historically, web development is among the most noteworthy uses of Lisp in business. Reddit and PG's work come to mind.


Reddit ran away from lisp as fast as it could once the original goal of using the language (to secure funding from YCombinator) had been met.


Reddit apparently ran away from Lisp primarily because all the servers ran on FreeBSD while development ran on Macs and because at the time they were forced to use different Common Lisp implementations (OpenMCL on Mac and CMUCL on FreeBSD) so they couldn't even test what they were deploying, essentially. Today with SBCL that wouldn't have been an issue.


Right, today there would be some other issue.


Don't be so sure. People who want to develop and sort-of-test with one stack, and then deploy on another, will find a way to do it in 2023.


I don't think that's an accurate representation.

They were merged with another company. Reddit guys knew both Lisp and Python. Other company coders only knew Python.

Since then, they've had massive issues scaling Python in general and their ORM dependence in particular.


I've been on reddit since the beginning when it was effectively a lisp-weenie site (clued into its existence by pg (discovered via his blub essay)).

It migrated to python very early in the game (via Aaron Swartz (rip)), and it's my understanding that it was that move that allowed them to scale it.


> Another reason you won't find people using string templates to produce HTML in Lisp is that no one uses it for web development

...how does this change conditional probability? If of those people who use Lisp for web development, nobody uses strings, it's unrelated to how many people use Lisp for web development.


> This phenomenon where multiple language features conspire to prevent ~~misuse~~ use is called defense in depth and is one of the great strengths of the language.

I kid, I kid. Lisp is great.


Reasonably sure this was already the parent's joke


Ah, slow this morning.


Not only do they use it for web development, but they manage to regularly update and upgrade their Lisp based web apps (as opposed to ignoring customer emails because their pile of PHP/Perl is too hairy to debug).


>The world would be a better place if JS had been scheme and people had been forced to learn a lisp.

None of the modern web would be around though because we’d still be waiting for a sufficiently advanced compiler.


It wouldn't be the same and that would be a GOOD thing.

Chez scheme is probably about as fast as JS JITs and with only a fraction of the time spent creating it. If you restrict continuations, you can get even better performance. On the flip side, new JS features like BigInt would have existed from the start (along with generators, rest/spread, typed arrays, let/const, etc). Features like threads that don't exist probably would exist.

On the better side, all the terrible things people complain about like hoisting, `with`, type coercion, weird prototypal inheritance, bad Java-based dates, etc simply wouldn't have happened because Scheme already specced out most of the relevant things.

HTML would have likely disappeared over time because innerHTML and string parsing would be radically less efficient than just using the macros.

We wouldn't have 10 different versions of JS because most of the new stuff either would have been baked into the first version or could be easily accomplished with macros. Major versions would be little things like adding optional type hints or

CSS wouldn't exist because you'd create sets of styles with lisp lists then pass them in. It would be a better version of CSS in JS, but done 25 years ago.

JSON wouldn't have been discovered because lists do all the things better. Likewise, there wouldn't be a need for the "lost decade" of XML development because those same scheme macros would do that job and transformer macros are far easier and better to write than XSLT.


> all the terrible things people complain about like [...] `with` [...]

I would be fairly surprised to hear someone complain about with-statements. My impression is that most folks don't even know it exists, and I'd be very shocked to see it actually being used in the wild.


Mark Miller (the ocap / promises / E guy) used `with` in the “eight magic lines” implementing realms (i.e. complete isolation) on top of vanilla JS[1]. Other than that, it’s probably effectively unused, but I suspect the mere possibility of it still makes implementors’ lives markedly worse.

[1] https://youtu.be/mSNxsn0pK74


> None of the modern web would be around though because we’d still be waiting for a sufficiently advanced compiler.

Huh? This doesn’t make any sense. I don’t think people have done a lot of Scheme JITs, but Scheme has some pretty damn impressive compilers—Chez[1] first and foremost. Certainly ones with better codegen than pre-V8 JavaScript ones. Scheme (the standard fragment) is less dynamic than JavaScript, not more (which has been used as an argument against that fragment by more OG-Lisp-inclined people).

(The one potential problem I can name is tail calls—IME LuaJIT is much, much worse at compiling Lua’s Scheme-like proper tail calls than it is at the same code expressed as a loop. But then the price for LuaJIT’s small size is that it’s a bit picky at which code it’s willing to compile well. Production JS engines probably handle that better, if at a cost of a couple of orders of magnitude more code.)

[1] https://www.scheme.com/


JS originally was a scheme, then the syntax got nerfed by managerial diktat and the rest is history. It also went a horrifically long time without a sufficiently advanced compiler. Some (who would immediately grin, duck, and run) would say it still lacks one.


JavaScript has very similar semantics to scheme, and is just as hard to compile. V8 works well due to incredible engineering effort that draws upon scheme compiler research.

The biggest language difference I can think of is the guarantees about numeric types. JS can easily compile to native float or integer operations, when it's hard to do that in scheme.

What other scheme features do you have in mind that make it harder to compile? ( Maybe ignoring call/cc)


I'm still bitter that the E4X was deprecated and removed from Firefox, instead of waiting it out to get wider adoption from the browser ecosystem.


Me also. I was there at the birth of E4X as a fresh out of college kid writing test cases for the language. There are some warts in the language but it is still much easier to use E4X than the DOM.


The E4X spec was just bad. There were too many corner cases with very unintuitive behavior or just plain spec bugs. I wish it was E4H focusing on needs of HTML with no xml namespace nonsense. It could have a chance then.


> You won’t find anyone using string templates in that language because a handful of macros means you can just program like it’s just lisp.

HTML templating is even popular in Lisps. See djula and selmer.


How popular is djula in the ecosystem?

For Clojure (re selmer), hiccup is a way more popular way of doing HTML (probably even the de-facto standard), and it's not doing string templates.


I am pretty sure the OP thinks about static html pages, no JS required. So even with JS being Lisp, we'd still have all those string interpolation.


It's why JSX is such a god-send in React (and very rarely do people use string templates to produce HTML in React). String templates are much easier to read and understand than nested function calls. If a language incorporates a safe readable HTML-generation feature akin to JSX, people will be much more enthusiastic about it.

More generally, syntactic sugar matters a lot. E.g. Python's list comprehensions are merely syntactic sugar for filter + map, but they make functional code so much more readable.


Clojure's great for this too, I've never seen a Clojure programmer generating HTML with string templates and that's not because Clojure programmers are more disciplined, it's because Clojure's syntax is flexible/simple enough to make generating HTML in a structured way the natural solution.

https://github.com/weavejester/hiccup


In the context of this article, it's worth pointing out that hiccup doesn't escape strings automatically. Rum does, and I've been using it as a drop-in replacement:

    user> (require '[rum.core :as rum]
                   '[hiccup.core :as hiccup])
    nil
    user> (rum/render-static-markup [:p "<script>alert('you have been pwned')</script>"])
    "<p>&lt;script&gt;alert(&#x27;you have been pwned&#x27;)&lt;/script&gt;</p>"
    user> (hiccup/html [:p "<script>alert('you have been pwned')</script>"])
    "<p><script>alert('you have been pwned')</script></p>"
(Note that Rum is also a React wrapper, but you don't have to use that part of it; you can simply use it for static rendering of HTML.)

https://github.com/tonsky/rum


I think it's more that the syntax is fairly similar to HTML already. HTML is basically deeply nested lists with some attributes attached to the nodes, and lisp code is deeply nested lists with keywords replacing attributes.

That is to say Clojure/lisp programmers are likely more familiar working with deeply nested lists of data.

You could write a very similar library in Python if you wanted using lists, I'm just doubtful anyone would use it because it's not "Pythonic".
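For illustration, a minimal sketch of what such a Python library could look like (all names here are hypothetical; this mirrors hiccup only in spirit, and unlike hiccup it escapes text nodes automatically):

```python
import html

def render(node):
    """Render a hiccup-style nested list into an HTML string.

    A node is either plain text (escaped) or a list of the form
    [tag, optional-attrs-dict, *children].
    """
    if not isinstance(node, list):
        return html.escape(str(node))
    tag, rest = node[0], node[1:]
    attrs = ""
    if rest and isinstance(rest[0], dict):
        attrs = "".join(f' {k}="{html.escape(str(v))}"'
                        for k, v in rest[0].items())
        rest = rest[1:]
    children = "".join(render(child) for child in rest)
    return f"<{tag}{attrs}>{children}</{tag}>"

print(render(["p", {"class": "note"}, "Tom & Jerry"]))
# → <p class="note">Tom &amp; Jerry</p>
```

Twenty lines, and it’s already structurally incapable of producing mismatched tags or unescaped text—which is the article’s whole point. Whether anyone would reach for it instead of an f-string is another question.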


> Python's list comprehensions are merely syntactic sugar for filter + map, but they make functional code so much more readable.

More readable than filter/map/reduce _in Python_, because Guido dislikes them and wanted them out of the language (and succeeded in moving reduce to functools). Python's excuse for lambdas makes this so much worse.

Compared to functional languages, though? I'll take filter/map/fold/reduce from Haskell or Clojure versus list comprehensions any day.


What makes me upset about JSX is how it's build-time-dependent on whichever specific library you've configured it to translate to (and can't be used otherwise)

Imagine if JSX produced a standard data structure that could then be

- Rendered by React or another framework

- Serialized to a string

- Deeply inspected/compared

etc. And framework integration only happened when you make the actual framework call, not globally at build-time


It certainly could be done for many use cases, it would probably be ~a superset of the various transformers’ ASTs. But that could add a fair bit of overhead to runtimes, particularly for transforms which deviate significantly from React’s tree structure (eg Solid’s underlying dom-expressions).


You can write custom jsx function and let typescript call your function instead of react's.

Your custom jsx function can return data in reusable format, that is passed to different functions for difference use case.

ts-liveview is using this approach to use jsx for both server-side rendering and compact over-the-wire updates


What I'd like is for it to have been a standard language feature, that can be used with or without a framework and without any build involvement


> Python's list comprehensions are merely syntactic sugar for filter + map, but they make functional code so much more readable.

Compared to what? Languages with actual filter & map generics seem far more legible to me than list comprehensions...


It's a matter of comfort.

I was doing imperative programming for years, and more functional now.

You've got for loops, list comprehensions, newer dict comprehensions, map/filter, generators, etc. in Python.

Python programmers tend to prefer for loops and comprehensions. List comprehensions, especially, dict comprehensions being much newer.

Living mostly in the pre-Java 8 Java world, I used for loops as much as the next guy, like the rest of the Java world. But I really liked list comprehensions. And then Java got streams and map and filter, etc., and wow, I loved it. Even most Java programmers started embracing the new "functional" paradigms.

Now that I've been both (and its more powerful cousins in Clojure), I'd say map/filter are "more readable". It depends on the implementation too: try merging two dictionaries in Python, ugh [1] [2]. Awkward any way you slice it. Watch out what version of Python 3.x you have! Watch the order of arguments, and some variations modify in-place! You're also never sure if your result will be a list or a dict.

Try map'ing a dict in Python: don't, it's not worth it [3]. These things are actually pretty easy, even in Java, let alone Clojure.

For the record, Clojure has map/filter and list comprehensions (called for comprehensions, and they're even more powerful than Python because they can terminate early [0]).

So, yes, in Python, map/filter can often be a pain. They were afterthoughts, and less readable than the alternatives. For languages with powerful map/filter and friends, these functional equivalents are much more flexible and readable. And it appears that most people that are familiar with both prefer the functional variety.

[0]: https://clojuredocs.org/clojure.core/for

[1]: https://stackoverflow.com/questions/38987/how-do-i-merge-two...

[2]: https://gist.github.com/SZanlongo/bc4baa90d3795db7c6ed7e8d41...

[3]: https://stackoverflow.com/questions/23862406/filter-items-in...
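To make the merge gripe concrete, here’s the version-sensitive landscape in Python (dict unpacking needs 3.5+, the `|` operator needs 3.9+, and `update()` mutates in place):

```python
a = {"x": 1, "y": 2}
b = {"y": 20, "z": 30}

merged_unpack = {**a, **b}   # 3.5+: right side wins on key conflicts
# merged_pipe = a | b        # 3.9+ only; same semantics as above

in_place = dict(a)           # copy first, because update() mutates
in_place.update(b)

# "Mapping" a dict: a dict comprehension is the least-awkward route.
doubled = {k: v * 2 for k, v in b.items()}
```

So the operations exist, but argument order matters, the spelling depends on your interpreter version, and nothing composes into a pipeline the way `merge`/`map` do in Clojure.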


Exactly. You made my comment unnecessary, but if I'm not wrong, JSX proved this wrong 10 years ago:

> No one has structured HTML creation that's as easy as string templates.

Well, maybe it's about the "easy" part, maybe the author considers JSX string templates. I assume the former, because the latter would be absurd.

The advantages of so-called frontend "frameworks" such as React et al is 50% this point. Also a reason for Lit's subjectively slow adoption.


> E.g. Python's list comprehensions are merely syntactic sugar for filter + map, but they make functional code so much more readable.

Can't say I agree. Maybe it's because I worked with simple functional programming primitives like filter and map before I ever really worked with Python, but I find Python list comprehensions weirder and harder to read than things like Clojure's thread macros, Ruby blocks, or even just chaining functional method calls together using the normal dot syntax in Java or Scala.

They do look better than the equivalents shown in the official Python documentation¹:

  squares = list(map(lambda x: x**2, range(10)))
but that is exceptionally hideous and not what I'd consider a normal way to write functional code for dealing with collections or streams.

Seems like a Python workaround for a Python problem to me. ¯\_(ツ)_/¯

--

1: https://docs.python.org/3/tutorial/datastructures.html


Yeah, I think having a few more methods on some types would be nicer than having global functions like `map`

for example, imagine if your example could be written as:

range(10).map(lambda x: x*2).list()

but as a list comprehension it's slightly shorter, but autocomplete isn't as good:

[x*2 for x in range(10)]

Also some historical context: Python's list comprehensions are based on Haskell's
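For reference, the desugaring both camps keep alluding to, side by side:

```python
# A comprehension and its map() equivalent produce the same list.
squares_comp = [x ** 2 for x in range(10)]
squares_map = list(map(lambda x: x ** 2, range(10)))

# A filtering comprehension and its filter() equivalent, likewise.
evens_comp = [x for x in range(10) if x % 2 == 0]
evens_filter = list(filter(lambda x: x % 2 == 0, range(10)))
```

The comprehension wins mostly because Python lambdas are so heavyweight; in a language with lightweight anonymous functions the right column stops looking bad.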


For whatever reason, having just the global functions feels more natural to me in Lisps, but weirder where function names precede the opening parentheses.

I do also find the comprehensions more readable when one is actually using all of their components, i.e., when the function in the map behind the sugar would not be the identity function. But occasionally, when all the author really wanted to do is filter, you'll see stuff like

  [x for x in ... if ...]
and that always strikes me as a bit weird and unfortunate.

> Also some historical context, Python's list comprehension are based on Haskell's

I didn't know that! I do kinda like Python's preference for English keywords over arrows here, even though the Haskell syntax is more concise or whatever.


I guess preferences vary but to me [x**2 for x in range(10)] is a lot easier to read at a glance than range(10).map(x -> x**2).


Haskell always seemed like it was meant for the really mathematical subset of compsci that loves notation and that's why some of those who really love compsci love it.


Incidentally, Haskell has list comprehensions so you could write

  [x^2 | x <- [1..10]]
which looks about as legible as the Python version.


Because people are taught to think in strings. And programming languages coddle them with tools like concatenation and string formatting. And because we let people think they can do useful things with strings as a result.

But what people actually need are grammars.

The exact same reason why parsing HTML with a regex unleashes Zalgo is why generating HTML with string templates is bad. Because both treat HTML as a string, not a grammatically restricted language.


> [P]eople are taught to think in strings[, b]ut what people actually need are grammars.

I don’t actually disagree with you for the most part, but I feel that an important caveat has gone unacknowledged.

Grammar formalisms have the same weakness compared to dealing with raw strings as sound static type systems do compared with dynamic typing: there are small, mostly isolated islands of feasibility in a sea of intractable (often undecidable) generality, and if your problem doesn’t fit inside those borders things start to get nasty (cf how even GCC’s handwritten rec-descent parser didn’t get its lexer hack interactions correct in all cases[1]).

I still agree that we spend criminally little time on syntax. Starting with the simplest cases: with how much time is spent in school on “order of operations” you’d think we could take a moment to draw[2] a damn syntax tree! But nooo. There are in fact working mathematicians who don’t know what that is. (On the other hand, there are mathematicians who can explain that, in a sense, the core of Gödel’s incompleteness is not being able to reason about arithmetic—it’s being able to reason about CONS[3], which arithmetic happens to be able to very awkwardly do.)

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67784

[2] https://mlochbaum.github.io/BQN/tutorial/expression.html

[3] https://dx.doi.org/10.1215/00294527-2008-028


I feel like nobody ever ends up having this discussion about JSON.

Generating JSON data using string interpolation or templating is clearly wildly insane, right? You don’t do it.

Maybe for some config file generation scenarios you might just run a template JSON file through a token substitution or env var interpolation or something. But you’d feel bad about it, because it’s so easy to NOT do it that way. And even then you’re not interpolating in JSON fragments like ‘“age”: 25’ - you’d have the decency to only interpolate in values like ‘25’.

In the node ecosystem it’s so easy to switch from a .json file to a .js file, too, if you want to build the json dynamically.

For some reason people feel more willing to attempt it with YAML. And then regret it when they realize how significant indenting has screwed them.

And then with HTML people just give up and go ‘yup, it’s all text, even the angle brackets’


> Generating JSON data using string interpolation or templating is clearly wildly insane, right? You don’t do it.

I'm sorry to report that I've seen a lot of JSON generated by string concatenation and templating, in different projects.

Often using 'printf' or 'echo' in various languages. Sometimes using whatever's used for HTML string templating if the JSON is embedded in HTML or served as a resource similar to HTML.

Yes, its horrible and breaks if fed variable values that have characters like quotation marks in. People do it anyway.

Even in languages that have perfectly good data structures and JSON libraries.

I've seen a fair amount of parsing values out of JSON using regexes too, assuming specific formatting of the supplied JSON.
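A tiny Python illustration of exactly how the naive version bites (the value here is made up, but any embedded quote triggers it):

```python
import json

name = 'Bobby "Tables"'           # a value containing a quotation mark

broken = '{"name": "%s"}' % name  # naive templating: not valid JSON
good = json.dumps({"name": name}) # serializer escapes the quotes for you

json.loads(good)                  # round-trips fine
try:
    json.loads(broken)
except json.JSONDecodeError:
    pass                          # the templated version doesn't even parse
```

And this is the *good* failure mode—a parse error. The quieter one is a value like `", "admin": true` that parses fine and means something else entirely.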


Well yeah! Of course people do that because JSON is just text!

It's less common, because JSON is simpler, so the tradeoff point for using a grammar is lower, but it still makes sense in things like shell scripts, and other cases where the equivalent of `print(obj)` (or `eval(totally_not_rce)`, but let's pretend that's not available anyway) doesn't happen to produce (or consume) valid JSON by coincidence.

  # Using:
  grep -oP '(?<="bar": ")[^"]+' foo.json
  # and
  printf("{\"count\": %i, \"type\": \"%s\"}\n",nfoo,tfoo);
is a general-purpose solution that can be adapted to pretty much any text-based format just by looking at examples, without having to cross-reference with a external specification (that the thing you're feeding input to or pulling output from may not even correctly implement anyway), so obviously people do that!

See also various discussions under the heading "Worse is Better". Whether it's the right thing or the wrong thing, it very clearly is a thing.


Oh.


But unlike with HTML I don’t think anyone’s going to jump in here and say ‘well yeah! Of course people do that because JSON is just text!’

I think everyone agrees that those approaches with JSON are bad.


"Nobody ever generates JSON using string interpolation." Oh boy, you just took me back to my first ever PHP project. Feast your eyes: https://github.com/oxguy3/phpStageManager/blob/master/events...

This file generates a feed of events (rehearsals for my high school play) to be rendered by the FullCalendar JS plugin. FullCalendar required a particular data schema that didn't match the format of my MySQL table, which meant I couldn't just json_encode() the MySQL results. I guess I just didn't conceptualize that I could create a new object that matched the FullCalendar format, and then call json_encode(). So, I generated JSON with strings.

Honestly it's a toss-up whether the JSON generation is the worst thing about this file. It looks like I also made a separate database query for every single row to get the username, because I apparently didn't know how to do joins. Could probably spend an hour listing some of the other little nuggets of awful in there. But hey, it got the job done! :)


Imperfect is better than incomplete.

All is fair in love, war, and programming.


Awesome. Yes, put this out there! We've all started somewhere or just had to get something shoveled out.


I guess you have never authored a Helm template, which does one worse and templates YAML...


Number one reason to avoid Helm indeed.

Taking the opportunity to ask for more sane alternatives. What do the crowd here use for manifest templating?


CUE(lang)


I've used AWK to output (flat!) JSON in ETL pipelines, because it's blazingly fast and "zero" deps.

But I'd never try to implement my own parser or output deeply nested JSON.


A useful tool for transforming JSON to and from a format that is more amenable to simple text tooling is gron: https://github.com/tomnomnom/gron

Takes all the tree and hierarchy management away, makes it so ordering doesn’t matter.

If I’m generating JSON from batch scripts it’s my preferred tool (easier than fighting jq for many tasks)


So you have heard about StringBorg: https://edolstra.github.io/pubs/injections-gpce07-final.pdf

It knows when to do escaping and how. It also can detect, though dynamically, when fragments have been combined into an illegal sequence which would be rejected by the full grammar. It cannot, however, guarantee that the result will parse, only that it has not detected that it would fail.


Was that a question? In that case, now I have :) The authors include both of the Eelcos I’ve ever heard of no less!

The ugly quasiquoting seems unfortunate (I’ve a half-serious suspicion the reason Template Haskell never got popular is that it looks so bad), and the GLR sledgehammer precludes ever having a lightweight implementation, but otherwise it seems like an interesting entry in the extensible languages story.


If Twitter and TikTok are any indication, schools do not spend enough time on order of operations

5 - 2(3 - 1) - 5 = ?


This ironically brings us back to the point of the original article: if spending that much time teaching people to do it right didn’t help, spending even more time doing that in the same way is hardly going to.

Also, respectfully, it doesn’t matter. Not having learned maths in English, I don’t know the mnemonic, I don’t care to know it, and I find even the concept of it completely asinine. (For eighteenth-century mathematicians, addition and subtraction bound tighter than multiplication and division, and they could calculate perfectly fine.) You can look up the precedence table if you need to—as long as you understand the idea of precedence (and not order of operations, for goodness’ sake). You won’t then be able to calculate fluently, but fluency is a different problem with a tedious and time-consuming solution, and given the time crunch I’d rather talk about some actual Maths as She Is Spoke instead.


It's the more ambiguous ones I see that get people to argue about how PEDMAS is interpreted. Your example is unambiguously -4. But consider:

6 / 2 (1 + 2) = ?

I approach the problem the same as I would 6 / 2(x + y). When the multiplication sign is missing, 2(x + y) is a single term. The implicit multiplication is part of the parenthesis and reduces the problem to 6 / 6. People who argue that you have to strictly use PEDMAS left-to-right will divide 6 / 2 first and get 9.

Neither way is wrong as long as you can explain the process but everyone wants to argue and have there be a single answer.
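A side note from the programming side: languages resolve this ambiguity by grammar fiat. In Python, for instance, `/` and `*` share precedence and associate left to right, and implicit multiplication isn't even parseable as multiplication:

```python
# / and * share precedence and associate left to right,
# so this is unambiguously (6 / 2) * (1 + 2).
result = 6 / 2 * (1 + 2)
# → 9.0

# "2(1 + 2)" isn't implicit multiplication at all: it parses as
# calling the integer 2 like a function, which fails at runtime.
try:
    eval("6 / 2(1 + 2)")
except TypeError:
    pass  # 'int' object is not callable
```

Which is arguably the compiler-writer's version of "not all questions are well formed": if the notation is ambiguous to humans, refuse to accept it.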


I’ll stick my chin out, and claim that nobody who knows anything about math would ever write the expression like that. Either use parentheses or a long, horizontal division bar making it obvious what groups with what. As given, it looks like some small-minded school teacher (I’m not saying all teachers are small-minded, just a few of them!) who has taught a set of rules, with little regard for actual practice outside the classroom, and then tests to exactly those rules.


I'm sure you can see how having a single correct answer for a math problem is useful.


When calculating by hand, it’s useful to have multiplication × (not dot, if you value your sanity at all) binding as tight as division / and multiplication-by-juxtaposition binding tighter than that. On the other hand, this only comes handy when your intermediate results are so large that one or two levels of (unambiguous) fractions still aren’t enough, and if at all possible you shouldn’t be communicating results that unwieldy. If you really need to, don’t confuse your readers and use some bloody parens. (You’ll probably need more than one kind.)


I see it as more important to realize that not all questions are well formed enough to have a right solution.

A test question like "6 / 2 (1 + 2) = ?" is not asking for the mathematical meaning of those symbols; it is asking "Guess what I was thinking when I wrote this".

(Unless it is a programming class and you are learning how the compiler reads your code)


If Facebook is any indication, schools have been failing to teach order of operations for decades.

My older relatives are the ones I see repost these inane order-of-operation tests, getting the answer consistently wrong.


Are schools failing to teach it, or do some lessons fail to stick through the decades depending on how much math you’re doing? Schools have no control on what your older relatives have been up to.


Today’s Facebook grumpy posters would be the same kids doing “New Math” in the 1960s: https://en.wikipedia.org/wiki/New_Math - just something to consider


My school taught PEMDAS but failed to teach that multiplication and division are equal in priority, nor did they teach that when there is ambiguity working left to right takes priority.


In math notation, division is indicated using fractions, which removes the ambiguity. The idea of equal priority of multiplication and division is a programming thing.


Or that

   ÷ x 
is equivalent to

     1
   * —
     x
The thing schools don’t do a great job of is explaining when you transition from doing ‘arithmetic’ to doing ‘algebra’. Many of the symbols you use in arithmetic continue to be used in algebraic notation, but what they mean changes subtly. Arithmetic is a procedural activity - performing a series of operations to get to an answer. Algebra is a declarative activity - making truthful statements about the world.

For example in arithmetic

   x + y
means ‘add y to x’. But in algebra it means ‘the sum of x and y’. In arithmetic ‘=’ means ‘gives the result:’; in algebra it means ‘is equal to’.

The failure of teaching to explain that you’re moving on to use those symbols to do something fundamentally different is, I think, one of the things that leaves some kids behind and dooms them to always annoy their relatives in Facebook comment threads about operator precedence.


HTML is text. Using something other than strings to process text is unnatural, so even systems that care about correct syntax and correct escaping tend to go from syntax tree to actual HTML eagerly.

Moreover, important parts of HTML processing would be significantly more brittle and complicated and less powerful with objects: "escape some completely arbitrary text to valid PCDATA or a CDATA section, whatever is shorter" is strictly more general, robust and principled than "render a Street Address to a fragment that isn't supposed to contain markup".


HTML isn’t arbitrary text.

HTML is a grammatically restricted subset of text.

I can take arbitrary text and embed it in HTML by escaping characters within it. That produces a grammatical fragment of HTML that represents the arbitrary text, but it is not the text.


Exactly. HTML has structure. It is not the same as flat text, although you can flatten and edit it as such.

As an example sentence, take the following:

"The French equivalent for the English "Good Evening!" is "Bonsoir!", whereas Italians might say "Buonasera!" to one another for similar effect."

There are four languages in that sentence, two of which are English. You may need three editors to deal with them, or you can flatten the sentence and simply edit everything assuming you knew all three.


I guess this is true of all formal languages, then. Since we sit in front of text editors most of the time, the fundamental truth of all languages is that they are strings.

This is not even true of natural language, which has a vocal representation that is at least as important as the written representation.

Though I agree that representing HTML as objects is a poor substitute.


It's a bit deeper: since HTML is defined as text markup, text is the truth of HTML documents and the standard of their users, while any sort of object representation of HTML documents is someone's idiosyncratic and possibly limited implementation, necessarily specialized and necessarily harder to use.


On the contrary - the entire purpose of HTML is to construct a specific object model in a web browser. HTML is a serialization format for expressing DOM structure. It’s not ‘idiosyncratic’, it’s the way the language is defined.


JSON is text, but most code that works with JSON sticks with the "syntax tree" (i.e. some native object representation) and only handles decoding/encoding the actual JSON format at communication boundaries.

The problem with HTML is that its syntax trees are relatively unpleasant to use.


> The exact same reason why parsing HTML with a regex unleashes Zalgo

For anyone who hasn't seen it yet, top answer from https://stackoverflow.com/questions/1732348/regex-match-open...



And more pedantry than you can handle on whether or not that is correct can be found, of course, at https://news.ycombinator.com/item?id=27094085


> people think they can do useful things with strings as a result

Besides not "being proper" or whatever your argument boils down to, people (arguably) are doing useful things by just manipulating strings.

I'd argue most of the web is probably built with just strings and duct tape holding all the pieces together.


Doing useful things that are riddled with bugs and security holes and fail to handle people whose name is O’Reilly.

It would be better if programming languages discouraged people away from those mistakes and prodded them towards the pit of success by making string concatenation harder and providing better tools for constructing grammatically sound structures.


People with wrong names should be encouraged to change them to something correct.


Or we can use type systems to automatically escape. No more XSS injections _and_ still as easy as string concatenation.

Python's MarkupSafe (used in jinja) and go's html/template are good examples.
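The type-driven idea can be sketched in a few lines of stdlib Python - a hypothetical toy, not MarkupSafe's actual implementation: a str subclass marks already-safe markup, and interpolation escapes anything that isn't marked.

```python
import html

class Markup(str):
    """Marks a string as already-safe HTML (toy version of the MarkupSafe idea)."""
    def __mod__(self, value):
        # Escape plain values on interpolation; trust values already marked safe
        if not isinstance(value, Markup):
            value = html.escape(str(value))
        return Markup(str.__mod__(self, value))

template = Markup('<a href="#">%s</a>')
link = template % '<script>alert(1)</script>'
# '<a href="#">&lt;script&gt;alert(1)&lt;/script&gt;</a>'
```

Because safety travels with the type, interpolating user data can never silently produce markup, while composing two Markup values stays untouched.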


That would be in the category of ‘better tools for making grammatical structures’.


Hi, we use a grammar - but it looks exactly like HTML strings. If you were naive, you'd think our system treated the strings as naked strings. But we get compile-time errors if our HTML fragments are ill-formed. Fields we pass in get automatically escaped. Best of all worlds, I think. Well, other than the fact ours isn't open source. I mean, I can imagine some improvements to the system we use, but they're all incremental improvements.


In Lisp the templating systems are S-expression based instead of string interpolation based, which at least models the tree structure of HTML documents.


I mean, in Lisp everything is S-expression based. That this models the tree structure of HTML documents is convenient and not entirely coincidental (SGML, plus the deep truth that trees are fundamental to structure, and S-expressions being the language in which The Word was spoken and all)

But if we were talking about outputting CSV data you wouldn’t be able to say

In lisp the templating systems are S-expression based instead of string interpolation based, which at least models the tabular structure of CSV documents

Because hierarchies don’t model tables especially well.

So the suitability of lisp for outputting HTML feels slightly coincidental.


What is your suggestion besides string templates? Writing out the html syntax tree? No thanks!


A generic solution for handling structures, built into the languages themselves, not just some hidden lib that mostly nobody knows. I mean most modern languages come with some XML parsers, and often they also come with some more or less useful XML generator. Add them as a first-class citizen, pimp them up, allow them to barf out all kinds of tree-like structures which are similar enough, and shove it in people's faces to encourage them to use it.

I mean it basically worked with JSON too.
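Python's stdlib ElementTree is roughly this already - build the document as objects and let serialization worry about escaping (a sketch; note ElementTree targets XML/XHTML rather than HTML5):

```python
import xml.etree.ElementTree as ET

# Build the document as objects; the serializer handles escaping
link = ET.Element("a", href="https://example.com/?q=1&lang=en")
link.text = "Ben & Jerry's"

out = ET.tostring(link, encoding="unicode")
# "&" is escaped in both the attribute value and the text content
```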


S-expressions!


Many frameworks do exactly that, including React and https://github.com/vanjs-org/van


I think Mithril was the first one to pioneer this.


Nah, it’s waaaaay older than that. It’s been done from the beginning of machine-produced HTML.

The probably slightly newer aspect is producing an intermediate representation that is then serialised to HTML, though I think that’s still going to be back in the ’90s. But the oldest examples I know of (while I was yet a small child) used functions and methods to produce serialised HTML strings directly, which was more efficient (at least in the languages in question) and also allowed you to mingle it with string templating.

Perl’s CGI.pm let this example be written no later than 1997 (no idea when it was actually written, can’t be bothered searching harder for older than CGI.pm 2.32): https://github.com/Perl/perl5/blob/54310121b442974721115f936...

For stuff that worked on the frontend, it’s still way older, though it tended more to XML-based stuff like XSLT (… which still works in browsers now, e.g. https://chrismorgan.info/blog/tags/meta/feed.xml is an Atom feed but the <?xml-stylesheet?> processing instruction is basically a pointer to the file for the browser to use to convert it to HTML which it then renders). But there were definitely things in this vein even on the frontend in active use more than five years before Mithril, though I can’t be specific as my memory is fuzzy as I wasn’t paying much attention to it all back then.


Lol. I saw PHP libraries doing this 15 years ago.

Literally cringing as I read the readme. We decided over a decade ago that writing HTML with code is ridiculous but somehow it comes up again and again.

A designer shouldn't need to code JavaScript to edit your design.


Pioneer this in the JS/client-side rendering world of course. Mithril is about 10 years old, so in the same ballpark.

Web applications aren't just HTML though, that's why code might be a more appropriate format.

You can argue that designers need better tools to edit structured markup in other formats, but that doesn't entail that HTML should be the default format. For instance, something like repl.it for mithril or similar that immediately renders the output so you can see the results would be useful.


Something like Pug[1] is really nice to avoid all the xml cluttering.

[1] https://pugjs.org/


As a mediocre developer at best, I love PUG but I'm a bit afraid about the lack of updates.



This amounts to saying ‘but I want to be able to generate invalid HTML!’


No, this amounts to saying "I don't care enough about accidentally generating invalid HTML; I want to use a faster, more convenient method instead"


Interpolation should have been added to the HTML specification over a decade ago. It's one of the many things the standards body has gotten wrong.

This is something that literally every single framework, front-end and back-end has had to deal with in some way since the 90s. From chucking ugly <?php tags to more elegant solutions like curly braces.

Having one standard in the spec would standardize something that is currently done a million different ways.



But string interpolation is a small part of it. Think about formatting a shopping cart: you need interpolation, yes. But also a "for" loop (for iterating over item list), and "if" statements (to show "cart is empty" or item notes), and functions, like money pretty-printing, would be nice too. And don't forget the nested function-like blocks so one can embed standard design elements...

This gets complex fast, I cannot imagine having something like this as a slow-moving spec.


I agree with you. I think Svelte and the syntax it uses to do everything you're describing, from iterating to interpolation should be part of the HTML spec.


It's a standards producer/consumer disconnect. XSLT on XHTML documents was always the Right Way(tm), but XML got such a backlash it never stood a chance once HTML5 landed.

Also, XSLT is horrible.


XSLT is just XML with pattern matching and no side effects. Lisp-ish in a way

That said, XML is like violence: if it’s not solving your problems, you need to use more.


XSLT feels like the fact that it's expressed in XML syntax is gratuitous meta-tomfoolery. It's like rather than thinking about the problem they grabbed the closest parser that happened to be lying around, minding its own business. Mind you, to be fair, XML parsers are extremely convenient, and if you're going to make it part of a standard it makes sense to use something else you already know is going to be in that standard. I don't know if there was ever any user-facing tooling which made that a particularly good decision, but it certainly makes the language painful to read.


I think they also screwed up by calling it a ‘stylesheet language’.

The idea of producing visual presentation by running XSL-T and then XSL-FO was the kind of thing that makes you think writing TeX macros might be easier.

Hilariously over engineered for the problem users actually wanted to solve (making data driven web pages look pretty).


This is a very good point, and I say that as someone who willingly chooses to use XSLT (I used to use it to generate my website before that got too unwieldy, and still use it for some things).

It's a very natural proclivity of a language designer designing a template language for language X to want to find a way to articulate that template language in that same language X also. I think most engineers can't help but love ideas like that. It's probably the same reason everyone who creates a programming language wants to make a self-hosted compiler for it.

In this case though, XML being a rather verbose and arguably limited semantic markup language for textual documents, it's an extraordinarily unergonomic choice for templating itself (in notable stark contrast to SXML combined with Lisp code, which is an example of homoiconicity between the structure being templated and the templating language works very well).


XSLT was derived from DSSSL used for SGML, and DSSSL is a Scheme-based language. I think XSLT would be nicer as Scheme, but I think they were trying to have a single XML parser. Or they wanted to be able to generate XSL-FO and other XML languages with XSLT by mixing namespaces.


On a related note, I do wonder if that XML backlash would ever have been so bad if we'd had the `</>` closing tags from SGML. Sure, it's no `)`, but then again, what is?


> Lisp-ish in a way

And Lisp S-expressions are a fine way to model mark up and transformations on markup.


> Also, XSLT is horrible.

XQuery can do everything XSLT can but with different syntax. XQuery 3 can also handle JSON. It's a clean way to generate well-formed XHTML.


How would that work if html is to remain dumb?


"we let people think they can do useful things in strings". okay? But my counter point is that people have been doing useful things with strings for a very very long time and it kinda works.


In the age of LLMs everything is text


Author doesn’t have real answers as to why people have been using strings for so long, but (as author almost speculates) it’s possibly just because the people who have been programmatically generating HTML the longest had only string functions to work with in the beginning, and only relatively recently have there been robust, reliable, sufficiently-flexible and sufficiently-capable libraries for generating structured documents any other way.

That’s not to mention years of having to cook up hacks to deal with inconsistent browser implementations that violated the document structure you’d be trying to create.

In the early 2000s I was working on web projects. We built such data structures, we did XML/XSLT, we were very careful to make sure everything was well-formed… and we still ended up using string templates somewhere. The tools just didn’t always exist to do everything the way we wanted, so we had to work with what we had. It hurt every time we resorted to it, because we knew we’d have to clean it up someday. But sometimes you just have to do what gets you home in time for dinner.


It's telling us something, for sure; but the suggestion that HTML (or an alternative) must address use-by-string-interpolation is questionable, no matter how subtly it's made.

I believe people use string interpolation to construct HTML (or just about any other language, e.g. SQL, JSON, even human languages) because they see it as a nail: the hammer of string manipulation is almost universally acquired in the very first lessons of any programming language, second only to arithmetic.

People construct HTML with strings because the languages and environments they use don't have mainstream, suitable (in terms of accessibility and efficiency) constructs for building HTML.

This problem doesn't exist with React, Elm, and friends that have first-class constructs for building HTML.


That is why I like Hiccup/Clojure so much: https://github.com/weavejester/hiccup It is very natural to produce something resembling a document in pure Clojure data structures and then just convert it to valid HTML. I think Reagent has some nice hiccup extensions, like writing the class or id with a . or # notation right in the keyword describing the tag. So there probably still is some space to improve the ergonomics and probably the performance. Concatenating strings still wins performance-wise by a lot.
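The Hiccup shape translates to any language with literal lists and maps; a toy Python sketch of the same idea (hypothetical, not Hiccup's real feature set):

```python
import html

def render(node):
    """Hiccup-style rendering: ["tag", {attrs}, children...] -> HTML string."""
    if isinstance(node, str):
        return html.escape(node)  # plain strings become escaped text
    tag, *rest = node
    attrs = {}
    if rest and isinstance(rest[0], dict):
        attrs, rest = rest[0], rest[1:]
    attr_str = "".join(f' {k}="{html.escape(str(v))}"' for k, v in attrs.items())
    return f"<{tag}{attr_str}>" + "".join(map(render, rest)) + f"</{tag}>"

page = render(["div", {"id": "main"}, ["p", "Tom & Jerry"]])
# '<div id="main"><p>Tom &amp; Jerry</p></div>'
```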


Although one way to generate HTML is to construct a structure with objects and then walk it to produce text, an alternative approach is possible whereby our HTML-in-Lisp syntax is a macro language that compiles to code that directly generates HTML without the intermediate object. This could be quite optimized to coalesce the text where interpolation isn't happening, and do constant folding in general.

E.g. using Common Lisp as an example, this:

  (html output-stream
        ((a href "https://example.com") "foo"))

could translate to the code:

  (write-string "<a href=\"https://example.com\">foo</a>" output-stream)
which just dumps a string literal to the stream.

One issue in Common Lisp html generators is that if you want to use the backquote syntax for it, you're steered toward an implementation that just lets the backquote do its job of constructing the list, and then walk the list. The reason being that backquote expands in an implementation-defined way. If backquote expands to a macro syntax like Scheme quasiquote, you can suppress its evaluation and then walk it yourself to give it your own meaning:

  (let ((url "https://example.com"))
    (html output-stream
          `((a href ,url) "foo")))
Here, html could intercept the quasiquote syntax, walk it itself and spit out code like:

  (let ((url "https://example.com"))
    (html output-stream
          (write-string "<a href=\"" output-stream)
          (html-write-attr url output-stream)
          (write-string "\">foo</a>" output-stream)))
Historically, an example of a Common Lisp HTML formatter which walks a constructed nested list object is Tim Bradshaw's htout library. An example of an efficient generator of write-string calls is CL-WHO.

CL-WHO doesn't use backquoting for interpolation. The code template uses keywords for indicating expressions that are HTML tags. Other expressions are implicitly Lisp to be executed. Inside evaluated lisp, the htm macro switches back to HTML templating. E.g.:

  (with-html-output (*http-stream*)
    (:h4 "Look at the character entities generated by this example")
     (loop for i from 0
           for string in '("Fête" "Sørensen" "naïve" "Hühner" "Straße")
           do (htm
               (:p :style (conc "background-color:" (case (mod i 3)
                                                      ((0) "red")
                                                      ((1) "orange")
                                                      ((2) "blue")))
                (htm (esc string))))))
which, according to the documentation, generates code similar to:

  (let ((*http-stream* *http-stream*))
    (progn
      nil
      (write-string
       "<h4>Look at the character entities generated by this example</h4>"
       *http-stream*)
      (loop for i from 0 for string in '("Fête" "Sørensen" "naïve" "Hühner" "Straße")
            do (progn
                 (write-string "<p style='" *http-stream*)
                 (princ (conc "background-color:"
                              (case (mod i 3)
                                ((0) "red")
                                ((1) "orange")
                                ((2) "blue")))
                        *http-stream*)
                 (write-string "'>" *http-stream*)
                 (progn (write-string (escape-string string) *http-stream*))
                 (write-string "</p>" *http-stream*)))))


Ooo that’s slick


The LISP people were right to make trees essentially a first class data structure.


Lists, technically, but nestable lists are indeed basically trees.


The problem with lists as tree is that there is no universal way to distinguish a:{ b c:d } from a:{ b:(c d) } which is why we need proper maps as first class citizen for "the next lisp"


Clojure does have maps as a first class citizen (besides sets, vectors and lists). Then there is clojure.walk and other namespaces suitable for tree manipulation. https://clojure.org/api/cheatsheet

Using trees of these collections is quite customary in Clojure - on the front-end you might keep the application state in a single atom, like re-frame does and update various branches of it using events/ effects and listening on changes to those branches using subscriptions. This approach work for us at orgpad.com quite well.


Yet what we really need is nested MAPS as first class data structure:

a{ nested:tree with:{lots-of:data and:more}}



Not being a frontend developer, I'm not sure why everyone in the comments (and the post) is so convinced this is a bad thing. Maybe I missed it, but the only reason listed so far as to why this is bad is some hand-wavy "might be insecure", which would also be true of hand-written HTML or HTML generated by other means.

As an outsider this strikes me as a purely stylistic argument, like people who argue bubble sort is the worst sort and should never be used. If it's sufficient to the task, who cares?


I like to read and sometimes participate in these types of arguments, but do people actually hope to convince one another? We're all just flexing our viewpoint and knowledge, as far as I can tell.

If you told, for example, suckless that their page builder code[0] should not use string wrangling they'd certainly laugh and ignore the advice.

[0] https://git.suckless.org/sites/file/build-page.c.html#l23


As somebody doing this work for over 20 years I agree with you. You have to understand that HTML is itself a string serialization as is XML, markdown, YAML, and JSON. HTML is not the end state or the goal. The end state is a DOM object in memory and the goal is something visual and/or auditory rendered onto a screen. That said, HTML immediately achieves obsolescence when a browser accepts any string format with which to parse into a DOM instance.


I don't really think that's quite correct.

To someone reading (or authoring) a document, the end state/goal is having that string of text visible and formatted well enough. They don't care about DOM objects.

The main reason we use HTML is not because people enjoy DOM-traversal or parsing or abstract syntax trees. It's because a little markup in your strings can make them format nicely and make it easy to link to and embed other stuff like images, video, and audio.

String templating/interpolation is the goal.


What some people claim to want is irrelevant to how the technology executes. We may want little strings, HTML, or whatever. The end state is a DOM object in memory irrespective of what some developers enjoy. If you want a different end state your choices at the moment are either fully abandon web technologies or use WASM.


It's not handwavy. If you use un-escaped user-provided strings in your HTML, it is almost certainly catastrophically insecure. One user can submit text that includes password-stealing code and when it's displayed to another user it can steal their passwords, private information, etc.

Hand-written HTML is static and thus doesn't use user-provided strings, while HTML generated by other means automatically escapes any strings given to it.


Security is far from the only reason, but it's a good starting point to understand the downsides of string templates so we can start by looking at it purely through the lens of security:

From a security perspective, when it comes to HTML output you're concerned about injection. A HTML opening/closing tag are code - they're read by the browser and have technical meaning for the renderer. The text in between those tags is content: you generally want to render that as is. So these two parts of the string have different purposes, they're contextually different.

Injection happens when a malicious actors gets data into your content that the browser will think (& interpret) as code. The best way to avoid this on the server side is to be aware of whether you're outputting content or code at any given point in your html document.

That's possible with string templates: it's called output escaping and you just wrap any variable printing in a function that escapes special characters to avoid them being interpreted as HTML code - this is simple as there are only 5: gt, lt, ampersand, single- and double-quotes.
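In Python the stdlib already provides that escaping function; a minimal illustration:

```python
import html

user_input = '<img src=x onerror=alert(1)>'

unsafe = f"<p>{user_input}</p>"              # injected tag reaches the browser
safe = f"<p>{html.escape(user_input)}</p>"   # rendered as literal text

# html.escape handles exactly the five characters: & < > ' "
assert html.escape('<>&\'"') == "&lt;&gt;&amp;&#x27;&quot;"
```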

Every modern string templating system does this by default and it's ok. But it's only a start, and is pretty limited in how far it can go.

There are actually three types of injection for HTML: injecting actual HTML is just one. There's also JS or CSS injection (CSP is starting to allow mitigation of some of this but no-one uses it, mainly as it's discouraged by Google & others) and attribute injection.

Output escaping only deals with the first type of HTML injection. It can be extended to try and deal with the other two, but it's very error prone if it doesn't have knowledge of whether it's printing a variable inside a script tag, inside an HTML attribute, or just in a normal tag content. It's very difficult for string templates to get that context: structured generation gets all that context for free.
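The context problem is easy to see in code. A hedged sketch of three different escapers (the `esc_script` trick of JSON-encoding and then neutralising `<` is a common convention, not the only correct approach):

```python
import html
import json

def esc_text(s):
    """Escape for ordinary element content."""
    return html.escape(s)

def esc_attr(s):
    """Escape for a quoted attribute value (quotes must be neutralised)."""
    return html.escape(s, quote=True)

def esc_script(value):
    """Embed data inside a <script> block: JSON-encode, then make sure a
    literal "</script>" in the data can't terminate the tag early."""
    return json.dumps(value).replace("<", "\\u003c")
```

Applying `esc_text` inside a script tag would corrupt the JavaScript, and applying `esc_script` in element content would leave stray quotes; the template engine has to know which context it is printing into.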

Once you have all that extra context for security, it's also useful for a bunch of other stuff (debugging/analytics/dynamic server side formatting/etc.)


I don't think, in 20 years with HTML, that I've used string interpolation. Why would you when there's a web standard (XSLT) that gives you declarative templates? In my more recent work, I've used LINQ to XML "templates" for this. It's functional vs declarative. The approach works with JSON or XML input. And seeing how Microsoft isn't invested in XSLT, it'll be my approach moving forward.

https://learn.microsoft.com/en-us/dotnet/standard/linq/funct...


XSLT has a steep learning curve and it’s extremely verbose. But my main problem with it is that (as far as I know) XHTML isn’t being updated alongside the current HTML spec (with semantic elements).

Related: I’ll admit that I used to use XSLT for my static site generators [1] and now I’ve switched to string interpolation (with conservative escaping as the default) [2]. Edit: In my case, the goal was to simplify and reduce dependencies.

[1] https://github.com/jaredkrinke/flog/blob/main/post.xsl

[2] https://github.com/jaredkrinke/literal-html


There has been a short moment in the history of the web when XML + XSLT looked like to become the way to go. It was at the end of the 90s, when we were looking at a way to develop web sites for both people at home with modems or slow fiber and for people with 2G phones (that is even slower modems and very small screens). We had HTML and WAP and products to apply XSLT to XML and deliver the same site to both audiences. Then 3G came with larger screens, faster CPUs, better browsers and in a few years we were using HTML for everybody with responsive layouts, media query, etc to the rescue. This is probably the first time I heard about XSLT in the last 10 years.


I don't think, in 30 years with HTML, that I've come across a library or technique that I prefer over string templates. They're so easy. And especially since ES6 has `format ${my_variable}` interpolation, it's even easier!


With ES6 there is actually even more:

    tagFunction`string text ${expression} string text`
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Very cool use of XSLT is rendering an RSS feed as html if viewed from a browser. So if someone clicks on your feed icon, they don't get an .XML download but a webpage instead:

https://pitsidianak.is/blog/feed.xml

NoScript on firefox is blocking it for me, but there's no javascript involved, only XSLT, HTML and CSS.


Don't the people who read your code get angry that you used XSLT?


Confused at first and then astounded perhaps that there's a standard declarative approach. But never angry. And they get paid to read my code, so if they have any such emotion, they know that they best keep it to themselves.

Also, non-programmers dig XSLT specifically because it's not "code".


Yes they do!

I manage a team of reporting analysts who have to maintain data feeds from our clients to brokers. They are from business and accounting backgrounds and use XSLT for it all. They don't need to worry about build steps or any of that junk, we just store the transforms in a database and they use IntelliJ to incrementally build out the files to the brokers specs, and we can display the transforms in a UI to other staff so they can know whats going on whenever there are questions.

Hundreds of reports, couldn't manage it all without XSLT.


to each his own, I guess.


It definitely feels "unnatural" that others don't get angry.


I think Martin Fowler's on record... somewhere... saying that his sites are that sort of thing. XML, XSLT, and a Makefile.


I am a heavy user of Go's html/template library. The reason is that it allows me to not write Javascript.

With vanilla JS you have to translate to JSON, then build elements in the front end.

The solution to building elements in JS is to use a framework, but compiling Javascript just never sat right with me. It's like ok, now I have an incomprehensible bundle of minified JS and I probably need to run a JS backend if I don't want to go insane.

Alternatively, you can skip all of that and just template.

I'm excited for htmx but haven't used it yet. Simplicity wins. LAMP stack was great for its simplicity. Go monoliths are a modern solution.


You're conflating two things, dynamically creating HTML is great and lets you avoid JS in a lot of cases. It's whether you should treat your HTML document as the tree of nodes that it is in your programming language or treat it like a string.

https://github.com/Knio/dominate is a Python lib that implements this principle.


I’ve been using <template> to avoid the whole “create DOM elements in JS” thing, and it’s been working well so far.

I wrote a little wrapper class to help with it:

https://github.com/retrohacker/template


I use that, is there a better pattern than doing a query selector to put the values in the correct places?


I have been using https://j2html.com/ for some projects and after some initial doubt and discomfort I have to say it is a really nice way to do it.

All the IDE tools just work: refactoring, auto-completion, usage search, debugging.

Also I get really nice exceptions and can express stuff about templates in their types.

When combined with Classes = Templates, static functions = macros and using implementation inheritance (the one permissible case haha) it covers all the usual corners of nice templating engines.

Sure the syntax and indentation is a little wonky, but for me it pays off in so many more convenient ways that it is easily worth the tradeoff.

Is this the best implementation of the concept? Probably not, but it is good.


I really like using j2html as well. The number of times I've accidentally passed in a Map with a wrongly typed key-name has happened too often. With a statically typed template, that's caught at development time, not run-time.


There is no issue with producing HTML with string templates.


> There is no issue with producing HTML with string templates.

There is no issue, until you forget to use escaping (or use the wrong one) for one variable, and someone uses that hole to inject arbitrary HTML and/or JS into your page. As long as all your escaping of interpolated variables is perfect, producing HTML with string templates is fine.


That's just a bad system, not inherent to templating systems in general. Django (python) got it right: All variables that go into a template are escaped by default, you have to go out of your way to tell it not to do that.

String formatting on the other hand, yeah, no good way like that in a language not designed for it.

Not sure which you and GP meant by "string templates".


Unless the template is aware of the semantics of the html being output, it can’t always know how to escape. E.g. the escaping rules are different for a css variable embedded in an inline style compared to using it in a javascript context.

That is what made JSX so neat.


and modern templating systems do! https://pkg.go.dev/html/template

> This package understands HTML, CSS, JavaScript, and URIs.

No JSX needed.


Failure to properly escape HTML and SQL used to be the most common security issues people found (and perhaps bugs).


How is this problem solved using most of the libraries people have mentioned in this discussion that don't use strings?


You'll generally have two functions:

  addFragment : (String, IntermediateHtmlAST) -> IntermediateHtmlAST

  renderHtml : IntermediateHtmlAST -> String
There is a sanitation pass that occurs either in the final conversion of the intermediate data structure to an HTML string (renderHtml), or immediately on the function call (addFragment).

This is similar to how database query libraries let you build up a SQL query via an intermediate data structure and then convert that to a prepared SQL statement (most common) or do data sanitization on the input fragment (less ideal).
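A minimal Python sketch of that shape, using the names from the signatures above (the intermediate "AST" here is just a list of tagged fragments, with the escaping done in the render pass):

```python
import html

def add_fragment(text, ast):
    """Append untrusted caller-supplied text to the intermediate structure."""
    return ast + [("text", text)]

def render_html(ast):
    """The sanitization pass: escape text nodes while serializing to a string."""
    return "".join(v if kind == "raw" else html.escape(v) for kind, v in ast)

# "raw" nodes come from the library's own markup, never from callers.
ast = [("raw", "<p>")]
ast = add_fragment('<script>alert(1)</script>', ast)
ast = ast + [("raw", "</p>")]
print(render_html(ast))  # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Because callers can only ever add `text` nodes, there is no code path where their input reaches the output unescaped.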


Why don't you just look up one of those libraries? Most of them have some sort of description of how they work.


I use string templates for small hidden services and I have never once run into a problem. So yeah, really no issue. Anyone complaining otherwise is being really picky about subtleties in particular contexts. At large they completely work!


To be fair, I did this with SOAP/XML for one project. 99.999% of the time the messages are exactly the same except for the payload, and nobody cares about the other 0.001%.

If it's parametrizable why bother with completely dynamic generation? That's just more code to maintain.


Code can be verified, i.e. linted, tested, type checked, factored, formatted, and sometimes sorted. More maintainable than a blob of text on larger projects.


Low and smooth learning curve, ease of maintenance for every skill level, nearly the same in every language and framework.


That makes a lot of compelling arguments indeed.

The main drawback I see is how large a door it can leave open on the security side. Not that it can't be done with reasonable security checks, but it's far easier to inadvertently shoot oneself in the foot.


The best way to represent HTML is with HTML itself, not an HTML-like system that varies in subtle ways between languages and libraries that happens to be missing the one feature you need.

I'm a large fan of Embedded DSLs (Domain-Specific Languages) for these types of problems, as they allow using normal HTML syntax directly in the rest of your code. Combined with macros (for compile-time parsing/analysis) and interpolation (for safely templating values), DSLs can be safer, more performant, and overall more maintainable than other approaches. For languages like HTML and SQL that are standard and well-defined, this is undoubtedly a better approach in my mind.

I've been developing my own approach to Embedded DSLs as my thesis research with my own programming language Rhovas [0], which addresses concerns with syntax restrictions (e.g. string quotes), semantic analysis (AST/IR transformations), and most importantly tooling. Happy to answer any questions about work in this area.

[0]: https://blog.willbanders.dev/articles/introducing-syntax-mac...


I guess we have to confront whether we think that HTML fundamentally is a string that happens to contain tokens that could be interpreted as a data structure, or whether we see it as a data structure that has a string representation.

I think it just happens to be that the string representation is one of the most familiar and accessible formats to many people.

I don't subscribe to the idea that HTML somehow fundamentally is a string. However, even if we see it as a data structure, to many languages, data structures are not at hand, while objects with parochial APIs are. So you end up with a flurry of different libraries with their own way of doing things, instead of "this is just data, I'll use my language's generic data manipulation tools to deal with this".


I'd say it's definitely a data structure, but one with a canonical string representation. It may not be fundamentally a string, but that representation is critical to how we convey its meaning, to such an extent that it's used across languages and browsers as the method of transport.

JSON is conceptually simpler and there are still a lot of quirks between libraries for that already (e.g. null/undefined, integer/decimal representation, and large numbers). XML has more going on to start with, and then you get all the different libraries inventing their own abstractions as you said, and it picks up a lot of pitfalls. FWIW, I've messed around a lot with configuration languages and it's definitely hard to get right, so I understand how these differences accumulate.


I think this is actually a strong argument against HTML templating, if you saw someone templating JSON files with Jinja you would think them mad but we're somehow okay with it when it's HTML.


This is one of the reasons I was always a bit annoyed with WHATWG splintering from the W3C. If we had bitten the bullet 20 years ago and stuck with XHTML, then I think we would be significantly closer to people using structured formats for web documents by default. HTML being loosely structured was a conscious choice, not a mistake.

I'm also surprised that TypeScript finally managed to sway people. I would bet a large number of people here won't even remember the whole ES4/ES5 debacle. There was a time when types (and much more) were being added to ECMAScript proper through the standards process, before Yahoo and others killed it.


The languages usually don't have great support for doing it any other way. So people often do it the same way they would've done it in the mid-1990s (as strings, only in Perl CGI scripts).

Racket (and most other Lisps) have built-in syntax transformation features that will let you do it as your own macros, or even inlining with different parsers, without some distinct preprocessor kludge.

Here's a macro example: https://www.neilvandyke.org/racket/html-template/

Here's doing it with data: https://www.neilvandyke.org/racket/html-writing/

You could also make a Racket reader so that you could have inline HTML in its customary angle-bracket syntax.
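The "doing it with data" approach in those Racket libraries carries over to any language with literal data structures. A rough Python analogue of the SXML-style idea, where an element is just a nested tuple and escaping happens once in the renderer:

```python
import html

def render(node):
    """Serialize a (tag, attrs, *children) tuple tree to HTML; strings are text nodes."""
    if isinstance(node, str):
        return html.escape(node)
    tag, attrs, *children = node
    attr_str = "".join(f' {k}="{html.escape(v)}"' for k, v in attrs.items())
    return f"<{tag}{attr_str}>" + "".join(render(c) for c in children) + f"</{tag}>"

page = ("ul", {},
        ("li", {"class": "item"}, "one & two"),
        ("li", {}, "<b>not bold</b>"))
print(render(page))
# <ul><li class="item">one &amp; two</li><li>&lt;b&gt;not bold&lt;/b&gt;</li></ul>
```

Since the tree is plain data, you can build, filter, and transform it with the language's ordinary list/dict tools instead of a parochial builder API.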


If you have a lisp program that generates HTML, now your 'HTML person' can no longer easily work on it with familiar tools.


With Lisps, you can also make it work with external HTML files (or HTML template files), for that person. In my case, I'm usually more productive doing HTML in my main programming language code, which is why I wrote those particular libraries.

