

Ambiguous ampersands - riledhel
http://mathiasbynens.be/notes/ambiguous-ampersands

======
evmar
The reason it's confusing is that the spec is attempting to document the
confused logic that browsers implement (almost certainly for backwards
compatibility with bugs in older browsers). As an HTML author you shouldn't
waste the brainspace on learning these rules, but rather just encode
ampersands correctly.

~~~
makmanalp
It's not even backwards compatibility though. Incorporating common mistakes
into the spec is just plain silly (such as accepting &amp without the
trailing semicolon).

I don't understand why HTML maintains its relaxed attitude. Back in the day,
it was understandable since humans wrote HTML pages one by one but nowadays
it's all written once and then generated through templates. Thus, we can
afford to have a strict standard that just breaks when it sees nonconforming
markup. If you don't want your customers to see the "page broken" sign, fix
your damn website! Losing money and customers is a very good incentive to
conform to a standard.

I support stricter markup because it makes parsing and rendering way simpler.
As it is, browsers have "quirks mode" to guess at what the author really meant
when they wrote their markup, and no two browsers guess the same. Then there
are specialized "HTML parsers" like BeautifulSoup that deal with the bad
markup. Away with all that!

~~~
barrkel
You go ahead and make a browser that breaks like that, and I'll go ahead and
spread the word to everyone I know how crap it is, and generally do my best to
wipe it from the surface of the earth - on the principle that it thinks the
page encoding is more important than delivering value to the user.

~~~
barrkel
I wish the people who downvoted me would tell me what software they work on,
so I can avoid it...

The kind of personality dysfunction required to deprioritize the authority and
autonomy of the user in favour of machine concerns is enough that I don't want
them involved in software I use. Such user-hostile malice is common in open
source (due to the intrinsic incentives involved); IMO it's one of the primary
reasons Linux fails on the desktop. Everybody is always eager to please the
machine, forever creating new "consistent" abstractions, but not so happy to
compromise even if such compromise lets the user win.

It's a serious problem. These dysfunctional people are far too numerous.

~~~
cjfont
I think the downvotes may have been less about the point you're making and
more about its delivery. While I generally agree with you, the rationale
behind stricter markup is ultimately to push web developers to write cleaner
code, which makes parsing easier and pages load faster, so in the end it does
come back to what the user wants. In reality, not all web developers are so
careful.

~~~
barrkel
The medium is the message; if I didn't deliver it in the way I did, much of
the vehemence of what I said would be lost.

C'est la vie, though, I don't really mind.

------
mike-cardwell
Stupid handling of ampersands is exactly why I prevent viewtext.org from
displaying articles from <https://grepular.com/>.

Viewtext.org rewrites all links to a redirector on their own site, but doesn't
decode the HTML entities in the href first. So if I have this link on my site:

    <a href="http://example.com/?1=2&amp;x=y">link</a>

Clicking that takes you directly to:

    http://example.com/?1=2&x=y

However, clicking the munged viewtext version of the link sends you to:

    http://example.com/?1=2&amp;x=y

That is a _different_ URL! Can everyone _please_ learn how to encode and
decode HTML entities and URIs properly?
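
For what it's worth, here is a rough Python sketch of the order of operations
I mean. It is not viewtext's actual code; the redirector address, the `url`
parameter name and all three function names are made up for illustration. The
idea is: decode the entities in the attribute value to get the real URL,
percent-encode that URL as a query value, and only then escape the result
again for the HTML attribute. The "buggy" variant skips the first step, which
is how `&amp;` ends up in the destination, where a server will typically see a
parameter named `amp;x` rather than `x`.

```python
import html
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical redirector endpoint, for illustration only.
REDIRECTOR = "http://redirector.example/go?url="


def rewrite_href(raw_href: str) -> str:
    """Wrap a link in the redirector.

    raw_href is the attribute value exactly as it appears in the HTML source.
    """
    target = html.unescape(raw_href)                # &amp; becomes &
    wrapped = REDIRECTOR + quote(target, safe="")   # percent-encode as a query value
    return html.escape(wrapped, quote=True)         # re-escape for the attribute


def rewrite_href_buggy(raw_href: str) -> str:
    """Same as above, but without decoding the entities first."""
    wrapped = REDIRECTOR + quote(raw_href, safe="")
    return html.escape(wrapped, quote=True)


def recover_target(href_attr: str) -> str:
    """What the redirector ends up sending the visitor to."""
    url = html.unescape(href_attr)                  # what the browser does with the attribute
    return parse_qs(urlsplit(url).query)["url"][0]  # what the redirector reads back


raw = "http://example.com/?1=2&amp;x=y"
print(recover_target(rewrite_href(raw)))        # http://example.com/?1=2&x=y
print(recover_target(rewrite_href_buggy(raw)))  # http://example.com/?1=2&amp;x=y
```

Each layer (HTML attribute, query string) gets decoded exactly once on the way
in and encoded exactly once on the way out, and the two never get mixed up.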

