HTML Attributes vs. DOM Properties (jakearchibald.com)
397 points by thunderbong 7 months ago | 152 comments



Good article. There is also a nice middle ground with the data-attributes:

```

<div id="myDiv" data-payload="something"></div>

```

These data-attributes are then automatically available to JavaScript as a read+write property:

```

document.getElementById('myDiv').dataset.payload == "something"

```

However, one has to be mindful that HTML attributes use -kebab-case- while JavaScript uses camelCase. Thus

```

<div id="example-div" data-my-cool-data-attribute="fun"></div>

```

becomes

```

document.getElementById('example-div').dataset.myCoolDataAttribute == "fun"

```


I really dislike automatic conversion between cases. There does not seem to be any established convention of how to handle uppercase acronyms across all different ecosystems that use camelCase, so

``` document.getElementById('example-div').dataset.myID == "1" ```

becomes:

``` <div id="example-div" data-my-i-d="1"></div> ```
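
For anyone who wants to check that mapping without a browser, here is a minimal sketch of the property-to-attribute direction as I read it in the HTML spec. `datasetKeyToAttributeName` is a made-up helper name, not a DOM API, and a real browser throws a `DOMException` where this sketch throws a plain `SyntaxError`.

```javascript
// Hypothetical helper: mirrors the spec's rule for writing to element.dataset.
function datasetKeyToAttributeName(key) {
  // A hyphen followed by a lowercase ASCII letter is rejected, because the
  // resulting attribute could not be converted back to the same key.
  if (/-[a-z]/.test(key)) {
    throw new SyntaxError(`Invalid dataset key: ${key}`);
  }
  // Each uppercase ASCII letter becomes "-" plus its lowercase form.
  return 'data-' + key.replace(/[A-Z]/g, (ch) => '-' + ch.toLowerCase());
}

console.log(datasetKeyToAttributeName('myID')); // data-my-i-d
console.log(datasetKeyToAttributeName('myId')); // data-my-id
console.log(datasetKeyToAttributeName('myCoolDataAttribute')); // data-my-cool-data-attribute
```

So the rule is purely per-letter: it has no notion of acronyms, which is why `myID` picks up two hyphens.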


Personal pet peeve but IMO ‘ID’ is a bad example because it is an abbreviation, not an acronym. I generally recommend ‘Id’ but I’ll admit, it’s like a tabs/spaces debate.

‘UI’ may be a better example :)


Whoah, I never thought about that. You're totally right, "ID" should really be "Id."


Or…Identity Document.


Although, I admit "ID" has kinda become a word in its own right, in my usage. An attribute ID is not, in my mind, its "identity" (and certainly not any kind of document), but a way of referring to it…and hopefully locally unique.


Interesting. In my mind an id is an identity. I think of ids as ‘what distinguishes an object uniquely from other objects’.


just treat acronyms as a word, e.g. `myId` instead of `myID`. That's been the standard practice in JavaScript as long as I've been using it at least.


That is the standard practice in Java. But JavaScript has XMLHttpRequest, innerHTML, etc

There's not really a good standard at all, but I think the _emerging_ practice is captured well here

> When using acronyms, use Pascal case or camel case for acronyms more than two characters long. For example, use HtmlButton or htmlButton . However, you should capitalize acronyms that consist of only two characters, such as System.IO instead of System.Io . Do not use abbreviations in identifiers or parameter names.

https://learn.microsoft.com/en-us/previous-versions/dotnet/n...


It is for id, but not for acronyms. Eg `innerHTML`. https://w3ctag.github.io/design-principles/#casing-rules


And that isn't even getting to XMLHttpRequest.


Haha, yeah

- Isn't limited to XML

- Isn't limited to HTTP

- The returned object is also the response

Beautifully named.


And the first letter in the XML acronym shouldn't even be X.

Obviously the JS API should be called ExtensibleMarkupLanguageHyperTextTransferProtocolRequest


just as well-thought-through as other parts of the web ecosystem, starting at the scripting language designed in a mere week.


I'm having to touch Spring occasionally nowadays and wow it is a hodgepodge. It really is the backend equivalent of jQuery.


Can you provide some example why it’s a hodgepodge?


React for example has a solid mathematical foundation and that's why it provides really good, clean abstractions. Spring is just a bunch of annotations and if you haven't done it for half of your life it really isn't clear why I have to write an empty! class that has 6 annotations on it, just to get the plumbing right for configuration. Smells like what I'm doing there should just be a function instead but maybe that's also partly Java being Java.

Yes it's very mature etc but at the end of the day it's just a collection of stuff that evolved over time. It's a framework written by developers, not mathematicians; a BeanPropertyRowMapper is likely something that happened by accident, whereas React's algebraic effects are not.

Maybe it's not a fair comparison because react is super slim and spring (boot) is huge but when using it you have to work your way through endless baeldung tutorials to figure out which annotations to put onto your methods with no clear design. I also found asp.net easier to work with.

Reason it reminds me of jQuery is because it's everywhere. A coworker of mine just wrote some websocket code that should be really great and reusable but it uses spring annotations extensively so you can never use his code without spring. But it doesn't really matter because if he wants to reuse the code in another java project, spring will likely be there. And that's a level of hodgepodge that jQuery achieved back in the day too.

Guess what I want to say is that not all of web is "badly thought out" like the above poster insinuated.


You should read the Java EE standards and the Spring documentation, not Baeldung. There is design there. Finding the name of something is difficult in both, but React’s search space is far smaller because Spring is larger by orders of magnitude, so searching it will obviously be easier.


FYI, on HN, you can get code blocks by indenting with two or more spaces:

  like this


Thanks. This is really good to know! Unfortunately I can no longer edit my original post, but I will use your tip next time.


It isn't hard to correctly convert from camel case: you just treat any sequence of one-letter words as a single word. However, without a standard for camel case acronym capitalization, when you convert back you won't necessarily get an exact match.

I presume, the ability to get the exact camelCase capitalization you want is why the html spec does the conversion this way: https://html.spec.whatwg.org/multipage/dom.html#dom-dataset-...


I avoid all of the kebab/camel conversions by using the methods:

Element.hasAttribute('data-thing')

Element.getAttribute('data-thing')

Element.setAttribute('data-thing', '...')

Much more clear what I'm doing without the awkward translating when I inevitably have to inspect/debug in the DOM.


MDN article Using data attributes describing HTML syntax, JavaScript access and CSS access, and a caveat about accessibility: https://developer.mozilla.org/en-US/docs/Learn/HTML/Howto/Us...

Also https://developer.mozilla.org/en-US/docs/Web/HTML/Global_att..., which says:

> The data-* global attributes form a class of attributes called custom data attributes, that allow proprietary information to be exchanged between the HTML and its DOM representation by scripts.


(Belated reply re URL parsing bug)

Per https://news.ycombinator.com/formatdoc, you can put your URL in angle brackets to override HN's URL parser:

<https://developer.mozilla.org/en-US/docs/Web/HTML/Global_att...>

The problem with binding asterisks to URLs is that sometimes the asterisk denotes the opening or (more likely) closing of italicized text, like this:

The URL for Hacker News is https://news.ycombinator.com/

This happens relatively often - here are a few recent examples:

https://news.ycombinator.com/item?id=40391982

https://news.ycombinator.com/item?id=40324105

https://news.ycombinator.com/item?id=40317677

Arguably we could deal with this by binding the asterisk to the URL unless it's closing an italicized bit, but this wouldn't resolve the ambiguity completely—for example it wouldn't let you put <https://developer.mozilla.org/en-US/docs/Web/HTML/Global_att...> inside italics—so it's probably better not to chase the corner cases too hard. The angle bracket notation works, but only if people know about it of course!


These are also accessible to a degree from css


Cool! What is the syntax for that?


Attributes can be used in selectors like this `div[data-my-data="stuff"]`.


CSS is very powerful here because you can take advantage of the other attribute selectors to find partial matches:

div[data-my-data^="my-prefix"] (selects by prefix)

div[data-my-data$="my-suffix"] (selects by suffix)

div[data-my-data*="my-substring"] (selects by substring)

Quite a few more of these:

https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_s...
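
If it helps to see what those operators compare, here is a rough sketch of the string tests behind them. `attributeMatches` is a hypothetical helper for illustration only; the real matching happens inside the browser's selector engine, and this ignores the case-insensitivity flag and the `~=` and `|=` operators.

```javascript
// Hypothetical model of [attr=val], [attr^=val], [attr$=val] and [attr*=val].
// `actual` is the attribute's value, or null when the attribute is absent.
function attributeMatches(actual, operator, expected) {
  if (actual === null) return false; // no attribute, no match
  if (operator === '=') return actual === expected; // exact match
  // For the substring operators, an empty comparison string matches nothing.
  if (expected === '') return false;
  switch (operator) {
    case '^=': return actual.startsWith(expected); // prefix
    case '$=': return actual.endsWith(expected); // suffix
    case '*=': return actual.includes(expected); // substring
    default: throw new Error(`Unsupported operator: ${operator}`);
  }
}

console.log(attributeMatches('my-prefix-123', '^=', 'my-prefix')); // true
console.log(attributeMatches('123-my-suffix', '$=', 'my-suffix')); // true
console.log(attributeMatches('a-my-substring-b', '*=', 'my-substring')); // true
console.log(attributeMatches('stuff', '*=', '')); // false
```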


attr() can also be used to some degree https://developer.mozilla.org/en-US/docs/Web/CSS/attr


The attr() function is actually much more powerful in the spec (perhaps one of my favorite parts of unimplemented CSS). According to the spec you can supply the type of the function and use it as a value (not just in content, as is the case now):

    <div data-background="lime">Red (but could be lime)</div>

    div {
      background: red;
    }

    div[data-background] {
      background: attr(data-background color, red);
    }

So according to the spec, you should be able to control the sizes, color, and even animation timings with (non-style) attributes.

In reality though, whenever I think I need this advanced attr() function, I usually just solve the issue using Custom Properties on the `style` attribute:

   <div style="--background: lime">


It’s another interesting way to communicate across JavaScript contexts.

By contexts here I’m not specifically talking about different Iframes — then it would be beholden to the same origin restrictions and probably have to use postmessage - but I mean in the context of a browser extension, where you create a parallel JavaScript execution context that has access to the same DOM

These are also called isolated worlds in the language of the dev tools protocol


What is `data--my-cool-attribute` supposed to become?


The letter after - is uppercased, and the - is removed. So, `div.dataset.MyCoolAttribute`.


Won't that clash in this case, with the version that has double hyphens?


I'm talking about your example with double hyphens. It's the initial double hyphen that causes the first letter to be uppercase.


> I'm talking about your example with double hyphens. It's the initial double hyphen that causes the first letter to be uppercase.

I was asking about the initial example, and made a mistake in the question I asked.

To clarify, according to the standards, what do the following transformations result in?

    data-my--cool-data
    data-my-cool-data


el.dataset['my-CoolData'] and el.dataset.myCoolData respectively
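
Those results fall out of a single rule, which can be sketched like this. `attributeNameToDatasetKey` is a hypothetical name for illustration, not a DOM API, and the sketch assumes the attribute name is already lowercase, as it is after HTML parsing.

```javascript
// Hypothetical helper: mirrors the spec's attribute-name-to-dataset-key rule.
function attributeNameToDatasetKey(attrName) {
  if (!attrName.startsWith('data-')) return undefined; // not a data attribute
  // Drop the "data-" prefix, then for every hyphen that is followed by a
  // lowercase ASCII letter, remove the hyphen and uppercase that letter.
  // A hyphen followed by anything else (such as another hyphen) is kept.
  return attrName
    .slice('data-'.length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(attributeNameToDatasetKey('data-my-cool-data')); // myCoolData
console.log(attributeNameToDatasetKey('data-my--cool-data')); // my-CoolData
console.log(attributeNameToDatasetKey('data--my-cool-attribute')); // MyCoolAttribute
console.log(attributeNameToDatasetKey('data-my-i-d')); // myID
```

The `my-CoolData` key contains a hyphen, which is why it needs bracket access rather than dot access.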


And all of the things in this article represent complexity and behaviours that don't add anything to anyone's lives.

This behaviour is all downside and no upside.

The spec should have simply defined a dom object and a javascript object to be one and the same, with attributes and properties being the same thing.


Most modern UI frameworks wouldn’t work at all if this limitation “simply” existed. Being able to store non-string values on DOM nodes is an upside.

Acting this flippant isn’t helpful.


To my non-webdev ears this sounds like "having only pegs and square holes isn't a problem because otherwise how would you try to fit one in the other?"

Could it be that this whole paradigm (which by the way has been devised for a different and much simpler use case, and abused forever since) isn't sustainable anymore?


I think the article missed an important “why” here that is confusing people here: JavaScript objects and the DOM are fundamentally different, and JavaScript is providing an API via the DOM for reading and writing HTML, which is essentially an XML document. We shouldn’t expect the objects in JavaScript to perfectly translate to their HTML counterparts or vice versa.

If anything, it’s odd that they added these conveniences like reflection that make it more magical than it should be.

(Edited for clarity)


> We shouldn’t expect the objects in JavaScript to perfectly translate to their DOM counterparts or vice versa.

Since JavaScript was invented specifically to manipulate the DOM, I'm not sure that that follows.


That's definitely one of the main original use cases, but it would have been a bad decision to base the entire language around the limitations of HTML, and I doubt JavaScript would be the most popular programming language today if they had. You can read the original ECMAScript 1 standard here, which is titled: "A general purpose, cross-platform programming language"[1].

https://ecma-international.org/wp-content/uploads/ECMA-262_1...


I think a lot of people dislike that JavaScript is a general purpose programming language, rather than being a scripting engine for browsers.


A lot of people dislike any tool that actually has widespread usage. No one complains about tools that have been abandoned.


"There are only two kinds of languages: the ones people complain about and the ones nobody uses." - Bjarne Stroustrup


I wonder why the web technologies are not fully integrated like the Smalltalk and Self graphical environments... Is it impossible or did it just end up that way and it's too late to change it now?


The DOM API is not designed as a JavaScript API (at least originally). It was designed to be language independent and to be more natural for a language like Java rather than JavaScript.

The reason for NodeList instead of Array is the same

For example the Event interface definition[0] contains parts like

  const unsigned short NONE = 0;

[0] https://dom.spec.whatwg.org/#interface-event


That's used as a constant for eventPhase https://dom.spec.whatwg.org/#ref-for-dom-event-none%E2%91%A2

In modern DOM APIs, it would be a string enum.


I did not mean to disparage the DOM API, I intended to show that the DOM API was not (initially) designed as a Javascript API and was instead expressed in a language independent (but Java inspired) IDL.

It would not surprise me if modern additions or revisions to the spec were more JS inspired.


Mostly the latter. The web wasn’t designed to become an application platform.


What do you mean by not sustainable? What has been more sustainable than web dev and web tech lol? Also, being flexible isn't trying to fit a square peg in a hole, it's the opposite actually.


I agree, except with the “simply” part. This is not simple at all, unless you also demand that every property can only ever be a string. Otherwise, you need to think of conversions (eg from and to numbers), and what about properties that can be set to objects? Or properties that don’t correspond to attributes at all?

I agree that any sort of convention/rule here that’d be 100% fixed would’ve been better, but it wouldn’t be simple.


think the other way round... allow DOM attribute values to be of any javascript type...

Sure, they maybe can't be serialized, but that isn't an issue.


That doesn't make any sense. DOM attributes are serialized as a fundamental part of their expression.


So is JavaScript. Granted, it would be more straightforward if it were a Lisp.


In what way is that not an issue? What happens when you use .outerHTML to get a string representation of the element? Not to mention that, unavoidably, the entire page HTML arrives to the browser as a string.


Same is true of real javascript objects vs JSON. For example, you can't put Date() in JSON. Nor objects with a prototype.


I do want to point out for those who don't know, on objects with a prototype (such as a class) you can control how JSON.stringify works by adding a special method `toJSON`[0] which will control how that object is serialized.

[0]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
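
A small sketch of both points, runnable in Node or a browser console. `Point` is a made-up class for illustration.

```javascript
// Point is a hypothetical class used only to demonstrate toJSON.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  // JSON.stringify calls toJSON() when it exists and serializes the result.
  toJSON() {
    return { type: 'Point', coords: [this.x, this.y] };
  }
}

// Date has a built-in toJSON that returns an ISO 8601 string.
const text = JSON.stringify({ p: new Point(1, 2), when: new Date(0) });
console.log(text);
// {"p":{"type":"Point","coords":[1,2]},"when":"1970-01-01T00:00:00.000Z"}

// Parsing does not restore the prototype or the Date: you get plain data back.
const parsed = JSON.parse(text);
console.log(parsed.p instanceof Point); // false
console.log(typeof parsed.when); // string
```

So `toJSON` controls the way out, but getting the original types back is still up to you (for example with a reviver function passed to `JSON.parse`).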


Right… and that’s an issue people have to work around every day. Non-string values in HTML would similarly be an issue everyone would have to work around.


This means you couldn't have non-string values on DOM interfaces, and that includes methods, so most of the DOM's functionality goes out the window.


Good point, but HTML already has a workaround by using strings:

<div onClick="func()">...</div>

Now, I'm not a JS developer, so my opinion shouldn't count for much, but that looks like a much less usable standard to me.


Yeah, that works, but it requires func() to be on the global scope. It doesn't scale as a solution.


Does that matter now that we've got (possibly declarative) shadow roots as the units of DOM encapsulation?


Yes, even more so. You end up with encapsulated DOM without encapsulated script.


Making attributes and properties symmetric to one another would be difficult for the existing DOM API: methods like `addEventListener` are (prototype) properties on individual DOM elements, but would be nonsensical string attributes. Making your idea concrete:

    // our document is like:
    // <foo bar="asdf">
    let foo = document.querySelector('foo'); 
    console.log('bar', foo.bar); // cool, prints asdf
    foo.bar = 'nice'; // <foo> element is now <foo bar="nice">
    foo.addEventListener('some-event', e => console.log(e)); // wait, what?

So DOM element methods would instead be free functions like:

    HTMLElement.addEventListener(foo, 'some-event', e => console.log(e));

It's a matter of taste, but I for one appreciate having object method calls in JavaScript.


I don’t know why they would be different? It seems cool, but the DOM elements have a very specific interface, where ‘setAttribute’ is required to modify state. If you just hang some property off a JavaScript object - that you created to send messages to the DOM using that interface - then why would you expect it to get routed there? Again, that would be a much better API/experience, some challenges but cleaner.

Maybe I’ve just been writing backend code for too long. I see an interface like this and I figure there’s some weird shit going on under the hood, and some kind developer from the past gave me the best and safest wrapper she could.


It's all "state"...


> all of the things in this article represent complexity and behaviours that don't add anything to anyone's lives.

A cynic might point out it added a lot of makework to the lives of its creators. And that this is why we can't have nice things like proper specs instead of 'living' ones.


What helps me make sense of it is that attributes in the narrowest sense are DOM nodes (Attr nodes). They can have properties (such as `name`, `value`, `ownerElement`, `prefix`), and collectively (as a NamedNodeMap, not an array) they can be accessed as the value of the `attributes` property of the element that owns them.

So in a sense the name of getAttribute(name) is misleading: it's probably better renamed getAttributeValue(name), because it doesn't really return an attribute (in the sense of an Attr node), but the value of the `value` property of the attribute node (owned by the element on which it is called) whose name is `name`.

See:

https://developer.mozilla.org/en-US/docs/Web/API/Element/get...

https://developer.mozilla.org/en-US/docs/Web/API/Element/get...

https://developer.mozilla.org/en-US/docs/Web/API/Attr


That's all true, but is there anything practical you can do with an attribute node, and el.attributes, that you can't do with setAttribute/getAttribute/getAttributeNames/hasAttribute?

You can't even move attribute nodes between elements. You can't even reorder attribute nodes within an element.


Not that I know of! It's just a way to help myself keep the distinction straight mentally. If at the back of my head I always think of attributes as DOM nodes in their own right, I'm much less tempted to infer from a DOM element having a particular attribute to it possessing a particular property, and vice versa.


Work with attributes whose name doesn't match the XML Name production.

That can be a real concern in some cases (it came up recently in the context of "how do I move all the attributes on node A to node B, given that node A's attributes can be anything that can be created by the HTML parser?" c.f. https://software.hixie.ch/utilities/js/live-dom-viewer/?save... )


Hah, good insight! I didn't realise you could move attributes either. I tried, but seems (unlike nodes) you need to remove them before adding them.


Look at the DOM Attr interface [1].

You can list attributes, get their name, namespace, value.

[1] https://developer.mozilla.org/en-US/docs/Web/API/Attr


All of which you can do with the other methods.


Sure


why do you care what order the attributes are in that you feel not being able to reorder them is a negative?


I don't. I was just trying to think of anything an iterable might offer over the other methods.



> You'll see this syntax on my blog because it's what Prettier does, and I really like Prettier.


One would think and hope that they would fix this mishandling when MDN, WHATWG and the W3C all complain online about Prettier.


I understand the argument against updating the <dialog open> property but it's useful to be able to target it with CSS


That's why I said it should also have a pseudo class. Similar to :checked on inputs.


That’s my case across all the others too: it's so much easier to debug by checking the DOM in an instant rather than going through the debugger process.


Perhaps the most important difference is persistence, and isn’t covered by the article.

Attributes are persistent for the duration that the same root object of the given DOM tree remains available for access, such as not leaving the page.

Object properties only remain available until the given object is garbage collected.


The object exists as long as the DOM node (the thing with the attributes) exists, so the lifecycle is the same.

The DOM node could be cloned, or go through serialisation and deserialisation, in which case it's now a completely different DOM node (although with the same attributes), so it won't have the non-default properties of the other node.


No. That is incorrect and makes assumptions not present.

The DOM is a language agnostic tree model in memory. The DOM, or any node therein, is accessed via the DOM’s API. Accessing a single node in JavaScript generates a node object. That node object is an artifact in JavaScript language representing the DOM node at the moment of access.

A DOM node is a living mutable thing, but the JavaScript object representing that node is not. It’s a static object in JavaScript language. That is also why a node list is not an array.

The DOM is not an artifact of JavaScript. I can understand how this is confusing if you have never operated without a framework, but otherwise it’s really straightforward.

This is one of the reasons I abandoned my JavaScript career. Nobody knows what the DOM is, which is their only compile target, and yet everyone wants to be an expert.


What I said in my previous comment is observably true. Try making a demo where it isn't.

> A DOM node is a living mutable thing, but the JavaScript object representing that node is not.

The JavaScript object is mutable. The first example in the article shows this.

> That is also why a node list is not an array.

Modern APIs on the web return platform arrays (eg JavaScript arrays). https://webidl.spec.whatwg.org/#js-sequence - here's where the WebIDL spec specifies how to convert a sequence to a JavaScript array.

I'm fully aware of NodeList. There's a reason the spec calls them "old-style" https://dom.spec.whatwg.org/#old-style-collections

> I can understand how this is confusing if you have never operated without a framework, but otherwise it’s really straightforward

Sighhhhhh. I've been a web developer for over 20 years, and spent a decade on the Chrome team working on web platform features. Most of my career has been on the low-level parts of the platform.

Could it be possible that people are disagreeing with you, not because they're stupid, but because you're in the wrong? Please try to be open minded. Try creating some demos that test your opinions.


The JavaScript object will be garbage collected as seen fit by the corresponding JavaScript run time irrespective of whether the associated node remains attached to its document.

That is challenging to observe when using event listeners, as opposed to assigning events directly to handlers, because listeners interfere with garbage collection. Furthermore, this is almost impossible to observe when abstracted by large frameworks.

As for a demo I am currently out of town without a computer, but you can use this project to perform experimental qualifiers: https://github.com/prettydiff/share-file-systems

That project forms an OS like GUI in the browser without listeners and eliminates most non-locally scoped DOM node references in the event handlers. This allows for localized event handlers with localized DOM node references that are always ready for garbage collection. This also allows state restoration of greater than 10,000 DOM nodes without a large increase in load time and without increased memory consumption. So instead of normal page load with state restoration of about 80ms blowing up to 10,000 nodes could take up to 300+ms. If you want to achieve extreme performance, in any language, you must absolutely understand your compile target. The compile target of the browser is the DOM.

You could also try it on my website which has far less functionality but allows rapid experimentation of the state management. http://prettydiff.com


> The JavaScript object will be garbage collected as seen fit by the corresponding JavaScript run time irrespective of whether the associated node remains attached to its document.

Nope. In practice, the DOM node has a backpointer to the JS object that it's wrapped by, which keeps it alive. This is known as a "DOM wrapper" and exists in every browser engine. Here's Chrome's internal documentation about this feature, showing the property surviving after a GC cycle: https://chromium.googlesource.com/chromium/src/+/master/thir...

The term "expando properties" was invented at Mozilla (or maybe even Netscape?) for exactly this reason; they needed a name for extra properties added to a JS DOM wrapper that would be kept alive as long as the DOM is. It's been part of Gecko for close to 20 years at this point.

You are wrong.


I am not sure you understood the article you linked to, as it's completely outside this conversation. Here is the key bit:

As a result, we have multiple DOM wrapper storages in one isolate. The mapping of the main world is written in ScriptWrappable. If ScriptWrappable::main_world_wrapper_ has a non-empty value, it is a DOM wrapper of the C++ DOM object of the main world. The mapping of other worlds are written in DOMDataStore.

It’s not that an instance object of a DOM node in JavaScript must back port to that node in the DOM tree, but that nodes in the C++DOM architecture have wrappers to other abstractions and such wrappers always reflect the same abstractions in context to their means of access.

That is why to update the DOM an API is provided to JavaScript. Otherwise assigning values to object properties would be sufficient and faster, but nowhere in any version of the DOM specification is this specified or allowed. The DOM is intended to have a memory model separate from the memory model of a JS instance such that access to a given document may occur by unrelated technologies without interference and independent from a given JS instance.

For example SVG can be accessed in JavaScript using the DOM API which returns a JS object reflecting that DOM node. SVG has its own animation scheme unrelated to JavaScript conventions. Since SVG is an XML library all definitions are stored as child nodes as either attributes or child elements. Assigning new animation definitions as JavaScript properties does not back port to given SVG instance in the document, and yet a separate unrelated runtime can modify that animation by updating the SVGs child nodes. If there is a sufficient DOM wrapper abstraction for this case from the lower level DOM memory manager the changes to the SVG animation by that separate unrelated application should be updated in the JavaScript object instance.

So, you are assuming the article says something it doesn’t.


Weird comment. There's no substantial difference in behavior here (aside from the existing distinction between attributes and properties).


The DOM has nothing to do with JavaScript. I explained this in further detail to a peer comment.


You did not describe any substantial difference in behavior in the sibling comments. You claim that the DOM nodes and the js nodes representing the DOM nodes have different lifetimes. What difference in behavior can I as a web developer experience because of this?


The DOM is not JavaScript. It doesn't make the original comment any less weird. (The reply with further detail that you referred to is even weirder.)


How does that difference manifest?


Fascinating article. I had always perceived DOM elements in JavaScript as proxies. They let you do things to that DOM element, but it’s still just a javascript object that you can assign to and do whatever, in addition to using the functions for interacting with the actual element.


Nice to see this well explained with examples, it can for sure benefit many people, especially that there are several details that are sometimes overlooked.


DOM properties also known as IDL attributes


Yeah, I deliberately avoided that terminology because few people outside of spec authors call them IDL attributes, and calling both things attributes is very confusing.


Heh, I guess I’m a stickler for the textbook. Always have been hahaha :)


Nice article, makes me wonder what a better representation would look like. It would be interesting if Html could have non string attributes that reference some local values from a scoped script tag or something.

I found the take on attributes for default configuration to be an odd take. Think about "class" attribute for example. If it only ever showed the default, toggling classes would look really odd in the dom inspector / stringified html, either not showing an active class or showing an inactive one.


I didn't say it was for default configuration, just configuration. The class attribute works within this system. I, as the owner of the document, configure the class attribute. I can change its configuration over time, but only I do that.

An alternative system would be where the browser adds and removes classes of its own, in response to things like hover/focus state. Thankfully it doesn't, and pseudo-classes are used instead.


My bad, that makes sense.


It doesn't help that Google too is confused - and confusing.

Search for 'html properties' gives results mostly about attributes.


The gist of it seems to be that attributes sometimes behave like properties.

Setting 3 attributes on an html element:

    <input id=good x=morning value=hn>

Then logging 3 properties with the same names:

    <script>
        e = document.querySelector('input');
        console.log(e.id); 
        console.log(e.x);
        console.log(e.value);
    </script>

Will output

    good
    undefined
    hn

I'm not sure why. The article says it's because "Element has an id getter & setter that 'reflects' the id attribute". Which sounds logical. But MDN does not mention a getter function for "value" for example:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/in...

It even calls "value" a property AND an attribute: "...access the respective HTMLInputElement object's value property. The value attribute is always..."

And HTMLInputElement lists "value" under "properties":

https://developer.mozilla.org/en-US/docs/Web/API/HTMLInputEl...

So maybe it's not as easy. Maybe <element some=thing> sometimes sets a property and sometimes sets an attribute.


There are two different objects here. In HTML, you have an <input> element with a "value" attribute [0]. An <input> element also has an internal value, which is initialized using the "value" attribute.

In JavaScript, you have an HTMLInputElement with a "value" property [1]. Getting the "value" property of an HTMLInputElement reads from its <input> element's internal value, and setting the "value" property of an HTMLInputElement writes to its <input> element's internal value. (The "value" attribute remains unchanged, since it is only used for initialization.) The DOM object is just modeling the actual element.

In general, the property on the DOM object will not exist unless it is specifically documented to, such as "id" and "value".

[0] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/in...

[1] https://developer.mozilla.org/en-US/docs/Web/API/HTMLInputEl...
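
To make the attribute / internal value / property split described above concrete, here is a minimal toy model in plain JavaScript. It is not how any browser implements it: `ToyInput` and its fields are hypothetical names, and it ignores details such as how the real control tracks whether the user has edited it. The `defaultValue` accessor follows the article's statement that `defaultValue`, not `value`, reflects the attribute.

```javascript
// Toy model only: a plain class standing in for an <input> and its DOM object.
// ToyInput and its fields are hypothetical names, not browser APIs.
class ToyInput {
  #attributes = new Map(); // what the markup (or setAttribute) stores
  #internalValue = "";     // the control's internal value

  constructor(markupAttributes = {}) {
    for (const [name, value] of Object.entries(markupAttributes)) {
      this.#attributes.set(name, String(value));
    }
    // The "value" attribute is only used to initialize the internal value.
    this.#internalValue = this.#attributes.get("value") ?? "";
  }

  getAttribute(name) {
    return this.#attributes.has(name) ? this.#attributes.get(name) : null;
  }

  // The "value" property reads and writes the internal value, not the attribute.
  get value() { return this.#internalValue; }
  set value(v) { this.#internalValue = String(v); }

  // The "defaultValue" property is the one that reflects the "value" attribute.
  get defaultValue() { return this.#attributes.get("value") ?? ""; }
  set defaultValue(v) { this.#attributes.set("value", String(v)); }
}

const input = new ToyInput({ id: "good", x: "morning", value: "hn" });
input.value = "typed by user";

console.log(input.value);                 // "typed by user"
console.log(input.getAttribute("value")); // "hn"
console.log(input.defaultValue);          // "hn"
console.log(input.x);                     // undefined
```

The last line mirrors the `x` case from the question: the attribute is stored, but no property was ever defined for it, so reading `input.x` yields `undefined`.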


It is indeed a matter of getters and setters being defined on the DOM element for well-known attributes. This is why it doesn't work for 'x': there is no getter and setter defined for an 'x' property on the DOM element that would read and write the 'x' HTML attribute. (You can try defining custom getters and setters with defineProperty that call getAttribute and setAttribute if you want to experiment. I haven't tried it, but I expect it to work, and I think that's how predefined properties work.)

For 'value', MDN says:

> it can be altered or retrieved at any time using JavaScript to access the respective HTMLInputElement object's value property

While it doesn't say it is implemented using a setter and a getter (I suspect because it doesn't need to go into details that are confusing if you don't know this mechanism), I think this is what it implies. I believe this can be found in the spec of the JavaScript implementation of the DOM.

So it is as "easy" as it looks (note my phrasing totally allows you to find it looks difficult): for well-known HTML attributes, there are usually getters and setters for the corresponding DOM property. Usually, it's the same string, sometimes there's a conversion, for instance with booleans (for the open and the hidden attributes).
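
The defineProperty experiment suggested above can be sketched without a browser. This is only an illustration of the mechanism, not a claim about how engines implement built-in properties: `makeToyElement` and `reflect` are hypothetical helpers, and the "element" is a plain object that offers nothing but `getAttribute` and `setAttribute`.

```javascript
// Stand-in for a DOM element: only getAttribute/setAttribute, backed by a Map.
// makeToyElement and reflect are hypothetical helpers, not browser APIs.
function makeToyElement(initialAttributes = {}) {
  const attributes = new Map(Object.entries(initialAttributes));
  return {
    getAttribute: (name) => (attributes.has(name) ? attributes.get(name) : null),
    setAttribute: (name, value) => { attributes.set(name, String(value)); },
  };
}

// Define a property whose getter and setter forward to the attribute.
function reflect(element, name) {
  Object.defineProperty(element, name, {
    get() { return this.getAttribute(name) ?? ""; },
    set(value) { this.setAttribute(name, value); },
    enumerable: true,
    configurable: true,
  });
}

const el = makeToyElement({ id: "good", x: "morning" });
console.log(el.x);                 // undefined: no property is defined for "x" yet

reflect(el, "x");
console.log(el.x);                 // "morning": the getter reads the attribute

el.x = "evening";
console.log(el.getAttribute("x")); // "evening": the setter wrote the attribute
```

Since `reflect` relies only on `getAttribute` and `setAttribute`, pointing it at a real element in a browser console is the experiment the comment proposes.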


So is "id" stored as a property or as an attribute?


An attribute. From https://jakearchibald.com/2024/attributes-vs-properties/#ref...

> When a property reflects an attribute, the attribute is the source of the data. When you set the property, it's updating the attribute. When you read from the property, it's reading the attribute.

value is different. From https://jakearchibald.com/2024/attributes-vs-properties/#val...

> the value property does not reflect the value attribute. Instead, the defaultValue property reflects the value attribute.

Follow the link for the full explanation.


It might depend on the implementation, and I haven't dug into actual implementations, but you pretty much need an attribute mapping (from names to values) for many things, and that's how I would store it.

There might be additional optimizations or storage features for certain attributes. For example, for id, since you mention this attribute specifically, you pretty much want document.getElementById() to be fast. You probably don't want it to traverse the whole DOM tree each time it is called, so there's likely an additional per-document mapping from ids to DOM elements stored somewhere, which has to be updated each time the id attribute changes (though the spec [1] does not appear to say anything about the speed characteristics of getElementById; in practice I personally assume it's quasi-instantaneous when using it).

For other attributes, more generally, you need to store the attribute value, not the property value (which you can cache, though), because the property value can be coerced. For instance, hidden="hidden" or hidden="HIDDEN" both lead to .hidden == true. But you need the exact value for getAttribute or for HTML serialization.

[1] https://dom.spec.whatwg.org/#ref-for-dom-nonelementparentnod...
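
The point about storing the exact attribute string while the property is coerced can be sketched as follows. This is a toy model under stated assumptions: `ToyElement` is a hypothetical stand-in, and it covers only the simple boolean case where the presence of the attribute means true.

```javascript
// Toy model: the attribute map keeps the exact string; the property is derived.
// ToyElement is a hypothetical stand-in, not a browser API.
class ToyElement {
  #attributes = new Map();

  setAttribute(name, value) { this.#attributes.set(name.toLowerCase(), String(value)); }
  getAttribute(name) { return this.#attributes.get(name.toLowerCase()) ?? null; }
  hasAttribute(name) { return this.#attributes.has(name.toLowerCase()); }
  removeAttribute(name) { this.#attributes.delete(name.toLowerCase()); }

  // Boolean reflection: presence of the attribute means true, whatever its value.
  get hidden() { return this.hasAttribute("hidden"); }
  set hidden(flag) {
    if (flag) this.setAttribute("hidden", "");
    else this.removeAttribute("hidden");
  }
}

const a = new ToyElement();
a.setAttribute("hidden", "hidden");
const b = new ToyElement();
b.setAttribute("hidden", "HIDDEN");

console.log(a.hidden, b.hidden);       // true true
console.log(b.getAttribute("hidden")); // "HIDDEN": exact string kept for serialization

b.hidden = false;
console.log(b.hasAttribute("hidden")); // false: the setter removed the attribute
```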


Yes.


value is a property and an attribute. Your x is only an attribute.


Why would you ever want to write data on an element as a property instead of an attribute? Yikes.


When you don’t want it coerced into a string?


For primitive types, serde isn't hard. It could even be part of a simple spec. Numbers, Dates, String[], Number[], Date[]. Edit: Or it could just be JSON.

For complex types it is. But why and when do you need to store complex types in the DOM? Isn't that always a bad idea?

I'm a seasoned web dev, but haven't been working on complex frontend apps, because I believe complexity and frontend web don't go together. So I may very well miss some use-cases.


For custom elements you may want to store data or methods that modify the element’s behavior.


Methods as in callbacks? So, code? Do you then store a closure or such? Or a pointer to one (i.e. the function/name)? And how or what data would modify behaviour?

As said: I'm unfamiliar with these concepts as I actively try to avoid them :)


When you want to pass an object to a web component without going through JSON or some other stringification procedure.


How do you read it in the web component afterwards? Passing it via property sounds quite useful.


Web components are implemented using custom elements which are just plain JS classes that extend the HTMLElement class, so you would access the property the same way you would access a property in a normal JS class.
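
A minimal sketch of that, with hypothetical names (`UserCard`, `user`, `describe`). The fallback base class is an assumption added only so the snippet also runs outside a browser, where `HTMLElement` and `customElements` do not exist.

```javascript
// Outside a browser there is no HTMLElement, so fall back to an empty class.
// This fallback is an assumption made only so the sketch runs in Node too.
const BaseElement = globalThis.HTMLElement ?? class {};

// UserCard and its "user" property are hypothetical names.
class UserCard extends BaseElement {
  #user = null;

  // A plain class accessor: the object is stored as-is, never stringified.
  get user() { return this.#user; }
  set user(value) { this.#user = value; }

  describe() {
    return this.#user
      ? `${this.#user.name} (${this.#user.tags.join(", ")})`
      : "no user";
  }
}

// Registration is only possible where the custom elements registry exists.
if (globalThis.customElements) {
  customElements.define("user-card", UserCard);
}

// In a browser, create the element through the DOM; elsewhere, instantiate directly.
const card = globalThis.document
  ? document.createElement("user-card")
  : new UserCard();

const payload = { name: "Ada", tags: ["admin", "editor"] };
card.user = payload;

console.log(card.user === payload); // true: the same object reference, no JSON round trip
console.log(card.describe());       // "Ada (admin, editor)"
```

The component reads `this.#user` like any other class field, which is what makes passing objects, functions, or streams through a property straightforward compared with a string attribute.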


Thanks, understood. Thinking about it, that makes it quite simple and intuitive.


The author of the thread is arguing against properties.


Yes, I know. I wasn't making a prescriptive comment about whether or not developers should use properties, I was just making a descriptive comment about how properties are accessed in a web component to answer the specific question politelemon asked.


Event listeners? Image data?


element.srcObject is very useful for <video> elements when the srcObject is a direct stream feed from the camera. I don’t even know how I would go about passing that as a string attribute.

https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaEl...


Your sentence can be understood in two ways, I'll answer the question "Why use domElement.attrname instead of the corresponding domElement.setAttribute('attrname') and domElement.getAttribute('attrname')?".

It can look cleaner (matter of taste). It looks like a native JS assignment, and it's shorter.

For open and hidden, it's way more intuitive and convenient to set and get booleans than testing the existence of the corresponding HTML attribute (Still not a fan of this thing, years after having learned this).

(but maybe you meant "Why use domElement.randomprop instead of something like domElement.dataset.randomprop"?)


I believe it's quite useful for framework developers, who then don't need to build an abstraction layer on top of the existing, well-designed DOM layer to manage state transitions, event handling, and the like. You can't do much if you are constrained to string values for encoding and decoding internal state or data on the DOM.


When there is no need to appear in the HTML


Great article. Maybe a hot take but I think it's a good thing that dialog and details modify the DOM. It allows for UI state to be serialized and stored in a native way. I think input.value should work the same way too (although it's probably too late for that). Consider how some browsers will remember what you input into forms when you hit the back button: I think representing this as a cached version of the modified DOM makes a lot more sense than whatever magic browsers currently use to remember it. Many CSS pseudoclasses like :checked feel like a pre-attribute-selector bandaid.


> Consider how some browsers will remember what you input into forms when you hit the back button: I think representing this as a cached version of the modified DOM makes a lot more sense than whatever magic browsers currently use to remember it.

That would not work either:

> A control's value is its internal state. As such, it might not match the user's current input.

> For instance, if a user enters the word "three" into a numeric field that expects digits, the user's input would be the string "three" but the control's value would remain unchanged. Or, if a user enters the email address " awesome@example.com" (with leading whitespace) into an email field, the user's input would be the string " awesome@example.com" but the browser's UI for email fields might translate that into a value of "awesome@example.com" (without the leading whitespace).

From the standard: https://html.spec.whatwg.org/multipage/form-control-infrastr...


Yeah, I get your point. On the other hand, I prefer the serialisation of the light DOM to reflect only changes I've made as the author.

The serialisation isn't expected to represent page state. Imagine how that would work with <canvas>, or even things like event listeners.


Tend to agree ... with the added reflection that to a stylesheet, the DOM is the only worldview. Attributes - and the currently limited set of pseudo classes - afford its agency.

If pseudos were extended to include all property state, DOM serialised or not, then we'd be onto something.


Interesting article. I had built the right intuitions empirically (except for frameworks, which I don't practice much) but it's nice to read it.

> The funny thing is, React popularised using className instead of class in what looks like an attribute. But, even though you're using the property name rather than the attribute name, React will set the class attribute under the hood.

Yeah, I never understood why they did this. className in their XML-like syntax is ugly, and it doesn't seem like they needed to do this. They could have gone with class. They are parsing this with their custom JSX parser, and these things are not JS identifiers. IIRC you need to use {curly braces} to put JS expressions (objects) there.


There's basically a JavaScript object behind each DOM node, and it has extra stuff in it.


so always use getProperty instead of getAttribute to avoid surprises?


Nope, just el.propertyName. el.getProperty isn't a thing.


This is a good read for JS developers.

TL;DR: they’re not the same thing. Only sometimes they match (e.g. `id`) but often they don’t.

Something fun: JSX does not use attributes, even if they look like it. That’s why they have unusual naming/formatting (`htmlFor` and `className`)


Reason #231 why web components are something of a nightmare.


This has nothing to do with web components. This is the behavior of built in HTML elements:

https://developer.mozilla.org/en-US/docs/Glossary/IDL#conten...

If anything, web components make this behavior easier to control. They represent the ground truth rather than providing a leaky abstraction. The article demonstrates this in several places.


Well, it does have something to do with them: this complexity comes up front and center when writing Web Components (or rather, custom elements), where you often want to pass around JS values seamlessly, as with React props.

Since custom elements are still in the regular DOM you have to deal with this soon enough that writing a custom element is likely the time you'd learn about this.

And then you have to write many lines to deal with the impedance mismatch while you wouldn't in any component framework, including ensuring that both are synced in whatever form of serialization you come up with -- which ends up being an even greater mess usually... not to mention you also pay the de/serialization costs if you want to keep a real sync between them.

As you can see I have written more `attributeChangedCallback`s and property proxies than I'd like (which is zero).


Thank you for writing this, it was frustrating reading their reply stating "this has nothing to do with web components" as if I was just making things up for fun.


well explained, good read.


[flagged]


I think the average frontend developer doesn't care anymore because they only have to care about relevant things, and this ain't it.


It might be more useful to ask it the other way around.

When and where would the average js developer learn this?


Average js developer thinks that react invented className and htmlFor


The DOM is bad and knowing browser quirks probably isn’t the thing I’d gatekeep a junior engineer about


LOL, I feel like the author mixes up the two things even in the title. There are HTML attributes, there is the DOM API (which also exists in other languages, where it mostly refers to XML rather than HTML), and there are JavaScript object properties. And because the DOM API uses JavaScript objects, you can access properties. But only attributes are serialized and deserialized. And some frameworks blur this, so you get and set both as a "convenience".


I know they're two different things. I wouldn't bother writing an article comparing two things that are the same, even though it wouldn't take long.


Apparently I worded my first sentence wrong and you didn't bother to read longer than that. Apologies.

I just want to reiterate: DOM objects do have properties, because they are JavaScript objects. You make it sound like they have properties because this is something the DOM API set. And unfortunately this is only the case for some special HTML attributes, not for all of them.


I think you missed a couple of bits in the article:

> the above only works because Element has an id getter & setter that 'reflects' the id attribute … It didn't work in the example at the start of the article, because foo isn't a spec-defined attribute, so there isn't a spec-defined foo property that reflects it.

This is where the distinction is made between merely setting a property on a JavaScript object, and cases where you're actually calling an HTML-spec'd getter that has side effects.

The whole "reflection" section of the article is dedicated to how these HTML-spec'd getters and setters change the behaviour from the basic JavaScript property operations shown at the start of the article, and how it differs from property to property.

"LOL", I suppose.



