Two Approaches to Decoupling (htmx.org)
93 points by telotortium on May 2, 2023 | 57 comments



Another approach (which I'd love to see others pick up on) is treating the client as a dumb proxy to the server. This is the philosophy I'm embracing, and it's wonderful!

If you buy into the philosophy that the UI is a deterministic function of server state and ephemeral view state (ui = f(state, viewstate)), then the core question is the medium.

With the browser, embracing HTML is the only sane choice for a business. As such, the function at hand needs to convert those two state objects into a blob of HTML. This is the classic HTML templating problem.

However, if you treat the state(s) as streams, then you'll end up where I am: building a differential HTML engine. I'm calling this first (big) version RxHTML ( https://book.adama-platform.com/rxhtml/ref.html ), and it uses my modified JSON delta format ( https://book.adama-platform.com/reference/deltas.html ).
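Roughly, the shape is something like this (a generic TypeScript sketch of the idea, not RxHTML's actual syntax; the names and types are made up):

    // ui = f(state, viewstate): re-render on every delta, then apply only
    // the difference to the live DOM. Illustration only.
    type ServerState = { items: string[] };
    type ViewState = { filter: string };

    function render(state: ServerState, view: ViewState): string {
      const visible = state.items.filter(i => i.includes(view.filter));
      return `<ul>${visible.map(i => `<li>${i}</li>`).join("")}</ul>`;
    }

    // Called whenever a server delta arrives or the local view state changes.
    function update(root: HTMLElement, state: ServerState, view: ViewState): void {
      const next = render(state, view);
      if (root.innerHTML !== next) {
        root.innerHTML = next; // a real differential engine patches the DOM surgically
      }
    }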

The spiritual design question is: what is the minimal set of things HTML needs for reactive data binding?

It's pretty awesome to be able to debug a user experience by having 4 views up at once, and then see a real-time update on all 4 views at once.

The tech at this time is beta quality, but I'm working through this as I build a real SaaS for everyday people. I'm having a bunch of fun, especially as Tailwind dramatically simplifies the styling.


This sounds similar to what the Phoenix team has done with LiveView. Is it a flavor of the same idea, or are there other concepts I am missing? Can this also be rendered/hydrated by the server?


Yes, this was my first thought as well. https://www.phoenixframework.org


Firstly, I wish you luck. It's an ambitious undertaking to get something better adopted by the industry and a healthy dose of luck can only help.

But, having spent the better part of an hour trying to understand your proposal, I have to say that it isn't something that solves a proportionally sized problem.

Modified HTML with added tags for conditional processing has been done to death. I've done it. At least five former colleagues of mine have done it. At the end of it you'll come to the same conclusion that we, and everyone else, did and abandon it.

Secondly, this seems like a very overengineered solution, and it's not clear what the problem is. Your project will end up trying to be all things to all people because no one can tell at a glance what problem it should be used for.

If you're targeting a problem at a higher level then there's very little need to invent new low-level components; the existing ones should mostly be fine.

To be honest, if the browser allowed s-expressions and specified the evaluation of them, most current frameworks (and JSON itself) would be made redundant.


Thanks, I need all the luck I can get. Currently, I'm building a couple of products with my stack, and it's going well.

For conditional processing that you and your colleagues did, was it within the browser? I'm doing this as a simple step: I turn a giant "HTML forest" into lean JavaScript and DOM bindings. It's working well for me, so I wonder what the gap is?

Personally, my stack is under-engineered, since I'm combining the roles of web server, database, messaging stack, queues, caches, load balancer, and workflow into one cohesive platform. My dry-run video ( https://www.youtube.com/watch?v=bWYgChA_aYA&feature=youtu.be ) goes into more detail.


> For conditional processing that you and your colleagues did, was it within the browser?

Yes - it's the most common way to do it (the script language that comes with htmx runs in the browser too).

> It's working well for me, so I wonder what the gap is?

It usually turns out that it's less work simply using JS in the browser than remembering the new language. I'm not saying that your effort would have the same result, but it's the result I've seen in the past for this sort of thing.

> Personally, my stack is under-engineered, since I'm combining the roles of web server, database, messaging stack, queues, caches, load balancer, and workflow into one cohesive platform.

That's what I meant by over-engineered - too much is included in the project, much of which won't be used by most intended users. I accept that maybe "over-engineered" is the wrong word.

Tell me, what's the problem statement you had that prompted this development effort? What problem were you trying to solve when you started this?


Originally, it had to do with board games and running them in a web/cloud environment.

Using WebSockets made building the game easier, but now there is exceptionally complex state within the process (await/async). Generally speaking, the cloud is an unforgiving place to hold state within a process.

I can speak deeply about streams at scale ( https://dl.acm.org/doi/10.1145/3477132.3483572 ).

Adama emerged as a tool to help define a "transactional document schema" which I could use to emit deltas to a logger (like Kafka), and once I added a notion of "privacy policies with code"... the potential exploded.

I believe I'm onto something with this platform, so I've decided to start hiring to build digital products that fit the thesis.

I would say I'm over-engineering in that I'm making the service available to others rather than leveraging it in a portfolio of products, but I'd rather keep it open than have secrets.

(It's a good thing I'm retired because this is a large endeavor)


This approach is very old indeed, pioneered by the first video terminals (as opposed to teletypes).

It assumes always-online, synchronous operation.

How do you implement an auto-completion list in your model? Do you allow it to regenerate on something other than each keystroke?


Auto-completion would update the view state which is gossiped back to the server, and the server can regenerate the list.


Where does P2P/distributed architecture come into your earlier comment? Seems like traditional server<>client communication.

Maybe it's just an unfortunate wording, but "gossip" has a specific meaning in P2P/distributed systems and I can't see how it fits in with what you've written so far.


No P2P; I'm using "gossip" to mean background synchronization.


Your philosophy is nothing new. It’s called REST. Yes, REST must be hypermedia-driven and if your server and client “speak” the same media type, your client doesn’t have to keep state. Add HTTP caching to the mix and the result is beautiful. Of course, it’s easier to concatenate strings and write stores on the client. Or is it?


That's the old world, and the world I want has limited JavaScript yet fully interactive products.


REST with JQuery ;-)


jQuery doesn't solve reactive data, and it's like building an interactive product in assembly.


The argument against GraphQL in this article (and the article it links to) is hand-wavy. It just says GraphQL is not secure without going into why, and the example is exposing a field on a record that you shouldn't - which can happen in REST too! Especially if you are implementing something like JSON API includes[0]. After building frontends for 8 years, GraphQL seems like the most scalable solution if you have a large dynamic frontend app. The approach of a coupled server (whether it's htmx or Remix or Rails for that matter) works as well, but as the article mentions, it makes you build another API for any other client.

0: https://jsonapi.org/format/#fetching-includes


> The biggest issue that we see is security, as we outline this in The API Churn/Security Trade-off[1] essay.

> Apparently facebook uses a whitelist to deal with the security issues introduced by GraphQL, but many developers who are using GraphQL appear to not understand the security threats involved with it.

I link to a longer essay on the inherent issues w/ increasing the client-side expressiveness of an API. It boils down to the fact that you give that power to anyone who can fire up a web console, in contrast with server-side expressiveness.

[1] - https://intercoolerjs.org/2016/02/17/api-churn-vs-security.h...


I agree with the article that security is a topic a GraphQL implementation needs to grasp and explicitly address, but I'm with you in that it is not a showstopper: good strategies already exist, and a JSON API could also fall victim to this.

From personal experience, the biggest challenge a GraphQL implementation faces is tighter/direct coupling with the backend data schema, which tends to either become friction in schema evolution or attract so much engineering effort that it is essentially also sustaining development of a hypermedia API.


It is bizarre to see people rediscovering the same stuff that we were doing during the early days of the dynamic web and the web application (probably circa 2000-2003 ish?) with partial rendering via API endpoints.

What is old is new again and all that.


Correct.

I am a web developer from the late 90s and I am trying to resurrect hypermedia as a viable option for rich web applications.

https://htmx.org

https://hypermedia.systems


I believe the author also had their start in the '90s/turn of the century.

I read this post as education for the younger webdevs who may not know these things.


One advantage of the "Another Solution" brought up in the article is that the code you write to create JSON for your web pages can be in large part reused for your mobile API. Certainly mobile screens are shaped differently, but not entirely. The trick is to abstract JSON-rendering logic in a way that lets you reshuffle bits and pieces for your mobile.
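As a rough sketch of that abstraction (hypothetical names and types, TypeScript):

    // One set of serializer pieces, recombined per client.
    type Account = { id: string; balance: number; owner: { name: string } };

    const accountJson = (a: Account) => ({ id: a.id, balance: a.balance });
    const ownerJson = (a: Account) => ({ name: a.owner.name });

    // The web page shows everything on one screen; mobile splits it across two.
    const webPayload = (a: Account) => ({ ...accountJson(a), owner: ownerJson(a) });
    const mobileAccountPayload = accountJson;
    const mobileOwnerPayload = ownerJson;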

Disclaimer: I wrote the linked article for that solution.


This article is so hard to parse. What exactly are the requirements of their desired API solution, both functional and non-functional, and how exactly does each approach solve them?

Instead, it’s a meandering journey to their already chosen solution.

I can’t tell what their objection to GraphQL is and why hypermedia APIs are less of a security concern.

And their example with JSON APIs and close coupling doesn’t address the fact that adding a field doesn’t break backwards compatibility. And even if you do need to spin up a new API, any competent architecture will separate rendering an API from the business logic that generates its data, so standing up a new API variation won’t be a huge amount of work.


> I can’t tell what their objection to GraphQL is and why hypermedia APIs are less of a security concern.

GraphQL increases client-side expressiveness of an API, but it also puts that power in the hands of potentially hostile users who, for example, could open up a console and start crafting dangerous queries (showing data that shouldn't be allowed, creating expensive queries that DoS your system, etc.)

Since with hypermedia APIs the UI is produced server side, you can, when producing the HTML, have full access to something like SQL. The server side is a trusted computing environment.

Facebook whitelists their GraphQL queries to avoid security issues, but many developers don't even realize there is a security issue to begin with.

> And their example with JSON APIs and close coupling doesn’t address the fact that adding a field doesn’t break backwards compatibility

In my example, I remove functionality.


Anyone can watch the traffic go back and forth and see all the hypermedia URLs that are present in the response, just like they could hypothetically see the GraphQL console.

However, the smart way to implement GraphQL is to map queries to a RESTful URL (that’s a simple tool to build), so that the full graph isn’t exposed to end users.


Again, Facebook whitelists every query.

Hypermedia URLs are typically specialized and don't expose general query functionality, whereas, as its expressiveness increases, GraphQL approaches the expressiveness of SQL. That's the whole point: you don't want to have to make back end changes to support new front end features.

The more powerful you make your GraphQL-based API, the more dangerous the security implications are. The less powerful you make it, the more it fails to address the fundamental problem of requiring back end changes to support front end needs.

It boils down to the fact that you simply can't increase the expressiveness of an untrusted computing environment like the browser without increasing security risk, whereas on the server side you can hand full SQL-style query access to your developers.


Whitelisting doesn’t limit expressiveness at all. It’s part of their build system so it’s a very easy issue to address (especially if you are okay with a manual mapping file).
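As a rough illustration of that mapping-file idea (hypothetical names, plain TypeScript, not Facebook's actual implementation): queries are registered at build time, and at runtime clients may only reference one of them.

    import { createHash } from "node:crypto";

    function sha256(text: string): string {
      return createHash("sha256").update(text).digest("hex");
    }

    // Built at compile time from the queries the application actually ships.
    const ACCOUNT_QUERY = "query Account($id: ID!) { account(id: $id) { balance } }";
    const whitelist = new Map<string, string>([[sha256(ACCOUNT_QUERY), ACCOUNT_QUERY]]);

    // At runtime the client sends only the id; arbitrary ad-hoc queries are rejected.
    function resolveQuery(queryId: string): string {
      const query = whitelist.get(queryId);
      if (!query) throw new Error("Query not in whitelist");
      return query;
    }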


>It would be cumbersome and error-prone to try to download this HTML, parse it and try to extract information from it

This is the crux of the problem. It's also the reason XML fell out of favor and led to a massive and still incomplete reinvention of the wheel in the form of JSON Schema, etc.

Imagine a client side programming language for which a valid html snippet is simply the serialization of an object. And vice versa. No parsing required.

The point is that much of the web has been shaped by the adoption of javascript as the exclusive browser side programming language.


Yeah, it's interesting how XML/XHTML was both data and presentation; we completely gave that up for a much inferior JSON format.

Most of this was due to the overuse of XML, but it seems we are heading in that direction with either JSON or YAML everywhere these days.

The usual complaint is that XML is bloated, but it really depends on how you write it:

{ "Name":"John", "age":20}

Vs

<name="John" age="20" />

and XML gives you both types and presentation with DTD and XSLT


I often see variants like this in XML:

<item> <Name>John</Name> <Age>20</Age> </item>

or <item_John age="20"/> where John is parsed manually out of the item object,

or the full-blown insanity: <person> <item> <key>name</key> <value>John</value> </item> <item> <key>age</key> <value>20</value> </item> </person>


> <name="John" age="20" />

As far as I know, that is not well-formed XML; the tag name is missing. You'd have to write it like this:

<person name="John" age="20" />


There are more peculiarities with XML, e.g., the ambiguity of attributes vs nested elements (you could also write <person><name>John</name><age>20</age></person> to express the same thing), but these don't feel insurmountable. You could pin down a subset based on extra rules that would simplify working with XML/XHTML and obviate the need to invent a parallel universe in the form of JSON.

You mention the data/presentation mix and that is not necessarily an advantage - separation of duties is usually a good idea, unless you can properly channel that "efficiency". But that too seems not insurmountable: the real, visual presentation on screen is anyway delegated to CSS. The (X)HTML structure is still in a sense supposed to capture the logical structure of the document/object that is being transmitted. What would be required is a way for the client-side language to distill the data-centric logic from the occasionally inflationary use of things like <div> elements (which seems another hack trying to fix poor design anyway).

In any case, no point crying over spilled milk. The winner-takes-all nature of the web means it is doomed to be building on the hacks adopted by the dominant entities to solve their needs. A historical accident that keeps giving.

Who knows, maybe as the browser is enabled with WASM and whatnot, there will be a chance in the future to have a more rational approach to APIs, REST, Hypermedia, Linked Data and that magical universe.


Am I stupid? Is the crux of the article that HTML/hypermedia has built-in representations for API items such as links to other resources (i.e. as in the examples of the article), but JSON (for example) doesn't? Therefore, using this 'standard' schema decouples your application components because one (e.g. the frontend) doesn't need to understand a particular, custom schema (re)implemented in non-hypermedia?


That's roughly correct.

What's interesting, from the perspective of this article, is that at the application level, a hypermedia API is much more tightly bound to your particular application (you are returning specific UI to the client that would be a pain for a third party to consume) but that, despite that tighter coupling, the hypermedia API handles changes better.


The hypermedia API option really doesn’t add much flexibility, when consumed by client code (as opposed to a human, or possibly by ChatGPT). It allows changing the URLs, but it doesn’t allow changing the structure or naming or semantics of the API. Arguably the only benefit it affords is being able to transparently move linked resources (such as subresources) between hosts, and that’s usually a case of YAGNI.


The hypermedia response is not intended to be a bag of data that the client parses and transforms into a view.

The HTML response is the view. This is how the basic web works. The browser (as a client) knows how to render HTML. HTML also provides affordances (links and forms) to transition state.

Htmx behaves similarly to the basic web. Its client adds additional affordances to HTML and allows state transitions within the page.

People may disagree with this approach but to say it is not flexible seems odd when browsers are the most flexible hypermedia client in existence.
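A rough sketch of what that looks like on the server (hypothetical endpoint and data shape; the hx-* attributes are real htmx attributes):

    // The response *is* the view: if the server omits the button,
    // that capability simply isn't offered to this user.
    function accountFragment(acc: { id: string; balance: number; canTransfer: boolean }): string {
      const transfer = acc.canTransfer
        ? `<button hx-post="/accounts/${acc.id}/transfers"
                   hx-target="#account-${acc.id}"
                   hx-swap="outerHTML">Transfer</button>`
        : "";
      return `<div id="account-${acc.id}">
                <p>Balance: ${acc.balance}</p>
                ${transfer}
              </div>`;
    }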


Most of the discussions about approaches are about shifting the locus of "common language" at a particular point between the producer and the consumer. The common language is basically a codified description of the data that is highly (ideally always) predictable, so that the producer and consumer both speak that language at all times at the boundary of information exchange between the 2 systems.

Something like RPC nudges towards the producer (or backend); graphql moves it to the consumer (or frontend).

A different way is to not look at this as a line of information flow between 2 points, by introducing a 3rd party, an impartial standards body. Maybe it's just a semantic or procedural difference, but when the locus is defined at the meta-architecture level there's no more room for discussion about exactly where you place the locus. Pay the vendor before you ship, or pay the courier when you receive? Neither: introduce an escrow.


Interesting idea! However, it will be very expensive to store all of this on a CDN.

The problem is users will interact with the majority of (GET) APIs through a CDN, and as developers we should reduce unnecessary data sitting in the CDN because it costs quite a lot of money to serve.

For example, if a CDN is returning the JSON data {name: “Bart”}, we’re only paying for 50 bytes or so of data transfer. BUT if we instead convert this to HTML and serve it, we would be storing 10x (or more) the data on the CDN to serve that data.

I don’t see a strong argument for paying 10x more to serve data as HTML.


Is a JSON response smaller here? With HTMX I can return the text `Bart` to be swapped into the innerHTML of the appropriate element. With JSON, the actual text is longer, and I need `Content-Type: application/json` and possibly other headers. My initial page load also requires less bandwidth by not loading React. What do you think? Maybe I should just test it.


I get what you're saying, but you have to follow that logically to the end. 10x or more of what? What does 50 bytes cost over the wire, and what is 10x that? Is this really the thing we are worrying about? Is the best argument that HATEOAS costs slightly more fractions of a penny than a JSON API?


hypermedia apis are not good data apis, as I say in the article:

https://htmx.org/essays/two-approaches-to-decoupling/#but-th...

> Many people would object that, sure, this hypermedia API may be flexible for our web application, but it makes for a terrible general purpose API.

> This is quite true. This hypermedia API is tuned for a specific web application. It would be cumbersome and error-prone to try to download this HTML, parse it and try to extract information from it. This hypermedia API only makes sense as part of a larger hypermedia system, being consumed by a proper hypermedia client.


You might not need a CDN


Two APIs: a write interface and a read model interface. Read models are easy to maintain and there is no coupling. The UI asks for a bunch of data and we don’t care what that model looks like. We use events to update and cache read models.

We do care about commands that add and change state. Those are simple APIs that have strict property requirements.
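A loose sketch of that split (hypothetical types and names; persistence details elided):

    // Write side: strict, validated commands that change state.
    type CloseAccount = { accountId: string; reason: string };

    async function handleCloseAccount(cmd: CloseAccount): Promise<void> {
      if (!cmd.accountId || !cmd.reason) throw new Error("invalid command");
      // append an event here; projections update the read models asynchronously
    }

    // Read side: the UI asks for a view and takes whatever shape the projection has.
    async function getAccountView(accountId: string): Promise<Record<string, unknown>> {
      // in practice, served from a cache/projection kept up to date by events
      return { accountId, balance: 0, recentTransfers: [] };
    }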

Also GraphQL is a horrible invention for quick and dirty work with no regard for long term maintenance. You’d be begging for an eventual big ball of mud.


I am probably misunderstanding the article, but as far as I can tell, the two ways discussed are "JSON API like almost everybody does" and "do server-side rendering and call it the API".

Not to say it's wrong to do server-side rendering of the frontend, but I got the impression that the article wanted to make a distinction that I'm not seeing.


server side rendering (that is, returning HTML) is an API, it's just that it's a hypermedia API (poor hypermedia, can't get no respect) rather than a data API, intended to be consumed by a hypermedia client (e.g. a web browser) and revealing/affording further API navigation via hypermedia controls (e.g. links)

hypermedia APIs have not proven to be good data APIs, but they have proven to be excellent at surviving change within a larger hypermedia system (like the web)

in this article I'm trying to show why, despite the promise of decoupling that comes with a generic JSON Data API, it is hypermedia APIs (with tight server/front end coupling) that survive change better, due to the uniform interface of REST


Thanks so much for this article. I have something that I want to try out HTMX on and this really clarified some things for me.

I particularly liked the bit about the way that the JSON approach, decoupled in theory, often ends up being tightly coupled in practice, such that you keep having to change both at once. And many places are structured such that you need two people or even two teams to make a single change. That doesn't sound decoupled at all!

I'm excited to see how it is to give up on that in-practice-imaginary decoupling and try for HATEOAS at the fraction-of-a-page level that HTMX looks to enable.


> and revealing/affording further API navigation via hypermedia controls (e.g. links)

I have a hard time understanding the utility of this. The article gives an example (removing the ability to transfer), but why would a webpage need those hypermedia controls in the response if they are already encoded in the API of the business logic? For example: the business logic tells the presentation layer, "if X field is true, disable the transfers button".


The presentation layer needs to interpret the business layer information in your example. This couples the two together if, for example, the form of the business data changes.

This is in contrast to hypermedia where the client (a browser) simply sees the new hypermedia and renders it. In this case, the client is decoupled from the particulars of the business logic.

This is due to the uniform interface of REST. See https://htmx.org/essays/hateoas/
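A small contrast as a sketch (hypothetical element ID and URL):

    // JSON: the client must know what the flag means and keep that
    // interpretation in sync with the server.
    function applyAccountJson(account: { canTransfer: boolean }): void {
      document.querySelector("#transfer-btn")!
        .toggleAttribute("hidden", !account.canTransfer);
    }

    // Hypermedia: the client just renders whatever HTML arrives; if the server
    // stops emitting the transfer button, it disappears with no client change.
    async function applyAccountHtml(target: HTMLElement): Promise<void> {
      target.innerHTML = await (await fetch("/account")).text();
    }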


It's an API for a specific hypermedia client, i.e., a web browser. It's not obvious that it's the best hypermedia API for a different client, for example, a JavaScript application that happens to be running in a web browser.


well, if you are building a hypermedia client, good luck to you

https://www.oreilly.com/library/view/restful-web-clients/978...

it's not a trivial task though


Well . . . any time someone offers me two options . . . I know there's at least one lie involved. How many dichotomies are not false dichotomies, after all?

> The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components.

Well, no. What distinguishes REST is that it uses HTTP verbs on resources.

And then it talks about Fielding mostly-not-implemented stuff and . . . no. Just no.

Before we've even gotten to the end of the para about coupling, it's clearly a bunch of crap.

Which is too bad, because personally I think REST is garbage, but CRUD garbage meant to replace SOAP "what is that thing?" garbage.


do you have a paper that I can cite when discussing REST, since you understand the idea better than the man who defined the term REST?


I could be misunderstanding of course, but IMO this article is basically misleading or incorrect. If everything else is the same, a JSON API is strictly more flexible than the corresponding HTML API because JSON is a strictly more flexible serialization format than HTML. You have a database on one side of a network and clients on the other side. If you bind your data to a specific client on the database side of the network, your system is less flexible.

N.B. I am not making any claims about whether any particular HTTP API should serve JSON or HTML, I’m just saying that JSON is a strictly more flexible option.


Yes and no.

What I'm reading [1] is that they're advocating for side-by-side decoupling.

In other words, humans and computers need different interfaces. Rather than try and build one, which is bad for both [2], build two: one for the humans (however you prefer) and one for the computers (REST API).

Perhaps I'm reading it this way because I'm prejudiced (I've written our systems this way), and it seems to be working well.

[1] This whole article is both well formatted and seems to make a good post, but equally leaves you not quite sure what it's trying to say. Perhaps it makes more sense in context, rather than as a stand-alone article.

[2] The computer-client team has different needs to the human-client team. One wants a very stable API, the other wants regular changes. One is content driven, the other process driven. Ultimately they want different things, so making two interfaces makes both teams happier, with less conflict between them for you to umpire.


True, the JSON API itself is more flexible, in the sense of being multi-purpose and more easily consumable by different clients. But how easy is it to change each API? I think the article uses "flexibility" to mean our flexibility to change the API, not the flexibility of the data.


I think this makes sense and my initial comment may have been incorrect. I guess the idea is something like:

- a general-purpose API is insufficient for any particular application; each application will require a special-purpose API alongside the general API

- assuming that you are going to have an application-specific API, it makes sense that the application-specific API should serve data specifically-tailored to that application (in this case hypermedia)

That said, it seems like the most interesting discussion would be about the tradeoffs between the two-JSON-API solution that they reference and the one-JSON-API + one-HTML-API solution that they suggest.


"Well, let’s say that we wish to remove the ability to transfer money from our bank to other banks as well as the ability to close accounts.[...] You can see that in this response, links for those two actions have been removed from the HTML."

This makes me think about server functionality that is not exposed in the web app but is still there, with potential security implications. It's an instance of the more generic case of client functionality not being in sync with server functionality, of which client-only form validation is another example.



