The Django people have always been quite up-front about their goal of providing a full-stack framework with all basic components included and tested rather than an all-singing, all-dancing, pluggable and composable architecture. I can't say what they've come up with is to my personal tastes but it's a perfectly reasonable design choice. "What real world problems would such a change solve" is a similarly reasonable and pragmatic question. "Tough. Django produces XHTML" makes them sound like bigger jerks than they probably are!
It's mainly about Django's Admin and form generation - in most other places Django is markup agnostic.
"Switching back to HTML4 is driven by the same kind of purity-beats-practicality, fashion-conscious silliness that made us all switch to XHTML in the first place."
"But to worry about being able to instantly switch to the doctype-du-jour -- or rather last years' doctype-du-jour -- as well as having HTML validity - that's not being a perfectionist, it's called OCD, and I'm drawing the line there."
Whereas a doctype that matches the tags you actually use (XHTML or HTML4) will work fine without the http:// prefix. As most people don't type that prefix when entering your URL, this can have a drastic effect on your page views.
I've not heard about it before and Google's not giving me anything.
It should be trivial to set up a test case if you want. Set up a virtual host that sends an empty Content-Type header and a page with an HTML5 doctype, then open it in IE6 without putting http:// before the URL. You should see it try to download the page.
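For anyone who wants to try that without configuring a virtual host, here is a rough sketch using Python's standard library. It is only an approximation of the setup described: it omits the Content-Type header entirely rather than sending an empty one, and the page body, `NoContentTypeHandler` and `make_server` are names made up for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# A tiny page with the short HTML5 doctype (contents are illustrative only).
PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head><title>sniff test</title></head>"
    b"<body><p>Hello</p></body></html>\n"
)


class NoContentTypeHandler(BaseHTTPRequestHandler):
    """Answers every GET with PAGE and deliberately sends no Content-Type."""

    def do_GET(self):
        self.send_response(200)
        # Assumption: leaving the header out is close enough to "empty".
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(port=8000):
    """Build (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), NoContentTypeHandler)
```

Calling `make_server().serve_forever()` starts it on port 8000; then type `127.0.0.1:8000` into the browser under test, without the http:// prefix, and see whether it renders or offers a download.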
Browsers sniff the doctype to change rendering mode, but that's after they've already decided that it's HTML that they're going to display.
IE6 & 7 do content type sniffing, where they ignore what the server tells them it is and try to figure it out themselves, by looking at various bits of info including the start of the file.
This could lead to HTML being downloaded if it happened to look like a RAR file to IE, but generally it has the opposite effect: something sent as plain text that happens to include tags gets rendered as HTML.
If anything, shortening the doctype should make something look more like HTML since it works without any doctype at all and the shorter the doctype the more room for HTML tags.
On the other hand content-sniffing is always going to give you unexpected results. Is there anything about the particular file you had that makes it atypical or likely to look like a binary file if you only consider the start of it?
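To make the sniffing argument concrete, here is a toy sketch of the idea. This is not IE's actual algorithm: `toy_sniff`, the 256-byte window, and the short list of tag names are all assumptions chosen for illustration. It only shows how looking at the start of a file can override the declared type in either direction.

```python
import re

# Start-of-file signature of a RAR archive.
RAR_MAGIC = b"Rar!\x1a\x07"

# A few tag names that make a chunk of bytes "look like" HTML (illustrative).
HTML_HINT = re.compile(
    rb"<(!doctype|html|head|body|title|script|table|br|p|a|b)\b",
    re.IGNORECASE,
)


def toy_sniff(declared_type, body, window=256):
    """Guess a content type from the first `window` bytes of `body`.

    The declared type is only used as a fallback, which is the whole
    problem with sniffing.
    """
    head = body[:window]
    if head.startswith(RAR_MAGIC):
        return "application/x-rar-compressed"
    if HTML_HINT.search(head):
        return "text/html"
    return declared_type or "application/octet-stream"
```

With this toy, `toy_sniff("text/plain", b"see <b>this</b>")` comes back as `text/html`, which is the "plain text rendered as HTML" case. Anything pushed past the window is never examined, so a shorter doctype leaves more of the window for real tags.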
It's not like any browser is going to drop the HTML4 doctype, so until there is widespread support for HTML5 elements (footer, article etc.) there is no real benefit other than PR.
I don't spend all day writing HTML, so it's not at the forefront of my mind which tags are self-closing and which ones aren't. Is it "<br>" or "<br/>"? And really, why would I care? And even more really, how much of that behaviour can I be bothered putting into automatically-produced output, rather than just simply putting the "/" everywhere?
If I cheat and serve up XHTML-like tags under a text/html MIME type, then nothing and nobody cares (as far as I can tell) except for the W3C validator.
If you write XHTML, and don't follow these two non-XML rules (and indeed the other 14), then it'll probably break when you give it to an HTML parser or browser:
C.2. Empty Elements: Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.
C.3. Element Minimization and Empty Element Content: Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).
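For generated output, those two rules are small enough to put in a helper. The following is a minimal sketch, not how Django or any particular library does it: `render_element` and its `xhtml` flag are hypothetical, and `EMPTY_ELEMENTS` is the set of elements declared EMPTY in HTML 4.01.

```python
from html import escape

# Elements whose content model is EMPTY in HTML 4.01.
EMPTY_ELEMENTS = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})


def render_element(tag, attrs=None, content="", xhtml=True):
    """Serialize one element, following C.2 and C.3 when xhtml=True."""
    attr_text = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in (attrs or {}).items()
    )
    if tag in EMPTY_ELEMENTS:
        if content:
            raise ValueError(f"<{tag}> cannot have content")
        # C.2: minimized form with a space before the slash in XHTML;
        # plain HTML just leaves the tag open.
        return f"<{tag}{attr_text} />" if xhtml else f"<{tag}{attr_text}>"
    # C.3: never minimize a non-EMPTY element, even when it has no content.
    return f"<{tag}{attr_text}>{escape(content, quote=False)}</{tag}>"
```

So `render_element("br")` gives `<br />`, `render_element("br", xhtml=False)` gives `<br>`, and an empty paragraph always comes out as `<p></p>`, never `<p />`. The person writing templates never has to remember which tags are which.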
I guess I don't really understand this; best practice in HTML is to close tags and quote attributes just as in XHTML. The only difference is XHTML requires it.
As for self-closing tags, the only one I use, at all, is <img>, and I'm perfectly OK with having to handle that (largely because I don't; there are already libraries which will generate my HTML for me).
(Mislav Marohnić's discussion of his standardista plugin and a link to it on GitHub.)
<input type="text" name="email" />
<input type="text" name="email">
* keep making tags lowercase
* keep quoting attribute values
* keep closing tags after I open them (and properly nesting tags)
* keep using the closing slash for standalone tags
* keep calling it text/html (mime type)
* keep using UTF-8 encoded Unicode by default
* keep not using xml namespaces
* keep forgetting to use the DOCTYPE declaration
* and start carefully using some of the neat new tags in html5
easy to remember and triggers standards mode
"A number of problems resulting from the use of the text/html MIME type in conjunction with XHTML content are discussed. It is suggested that XHTML delivered as text/html is broken and XHTML delivered as text/xml is risky, so authors intending their work for public consumption should stick to HTML 4.01, and authors who wish to use XHTML should deliver their markup as application/xhtml+xml."
Note the date (Sep 2002) and the fact that the author is now in charge of the HTML5 spec.
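One way people act on that advice is to send application/xhtml+xml only to clients whose Accept header asks for it, and fall back to text/html otherwise. The function below is a sketch of that idea only: `choose_content_type` is a hypothetical helper, and its Accept parsing is deliberately simplistic (it ignores wildcards and does not rank types by q-value).

```python
def choose_content_type(accept_header):
    """Return application/xhtml+xml only if the client explicitly accepts it."""
    for part in accept_header.split(","):
        media, _, params = part.strip().partition(";")
        if media.strip().lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # default when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"
```

Bear in mind that the text/html fallback is exactly the case the quoted article warns about, so the markup still has to survive an HTML parser.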
Some people argue that forcing web developers to follow rigid XML rules, plus an appendix full of hints on how to mangle your XML so that actual browsers wouldn't choke on it (http://www.w3.org/TR/xhtml1/guidelines.html), somehow got people in the right mindset to do semantic, accessible HTML. The prevailing view is that this was mostly cargo cult behaviour and that neither those following nor those recommending this course of action really understood the supposed benefits or drawbacks.
if we were outputting HTML4, it might cause *genuine* problems - e.g. if you need XHTML in order to mix in MathML or SVG
.. to do which you must serve as application/xhtml+xml.
That said, I agree with his points. And personally I wish the browsers would relax the restrictions on when they will agree to render SVG. I see no good reason why they refuse to do SVG in text/html mode, considering the exceptions they make for seemingly everything else.
This probably breaks major assumptions inside the browser engines. Rendering most image types is a one-and-done type deal, just setting up a few calls to libjpeg or libpng, but SVG actually has to be XML-parsed and rendered element by element on page load. To do it in text/html, you'd need to hook together the HTML parser with the XML parser with the image generator. That's a lot of work to break the rules!
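As a small illustration of the "XML-parsed, element by element" point, here is a sketch that walks an SVG document with Python's standard XML parser. It says nothing about how any browser engine is actually built; the sample `SVG` string and `list_elements` are invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed SVG document (illustrative only).
SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40"/>'
    '<rect x="10" y="10" width="20" height="20"/>'
    "</svg>"
)


def list_elements(svg_text):
    """Parse strictly as XML and return element names in document order."""
    root = ET.fromstring(svg_text)  # raises ET.ParseError if not well-formed
    # ElementTree prefixes tags with "{namespace}"; strip that part off.
    return [element.tag.split("}", 1)[-1] for element in root.iter()]
```

A renderer has to visit each of those elements in turn, and a single unclosed tag makes the XML parser reject the whole document. That is very different from the forgiving way text/html gets parsed.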