"Tough. Django produces XHTML." (groups.google.com)
56 points by andybak on Feb 2, 2010 | 37 comments

The fundamental problem here is the need to output in multiple markup languages. [...] However, Django simply does not work at the level of abstraction that would allow multiple output languages.

The Django people have always been quite up-front about their goal of providing a full-stack framework with all basic components included and tested rather than an all-singing, all-dancing, pluggable and composable architecture. I can't say what they've come up with is to my personal tastes but it's a perfectly reasonable design choice. "What real world problems would such a change solve" is a similarly reasonable and pragmatic question. "Tough. Django produces XHTML" makes them sound like bigger jerks than they probably are!

I think that one of Django's nicest benefits is that even though it isn't all singing/all dancing, you can make it do just about any song or any dance with some extra code. I'm not saying that they don't have valid reasons to do otherwise, but I think it would be more in line with the "django philosophy" if you could make the HTML generation extensible.

It is, and there are fixes out there for the HTML/XHTML thing, but this is mostly an argument about defaults and built-ins.

Some context:

It's mainly about Django's Admin and form generation - in most other places Django is markup agnostic.

Some choice quotes:

"Switching back to HTML4 is driven by the same kind of purity-beats-practicality, fashion-conscious silliness that made us all switch to XHTML in the first place."

"But to worry about being able to instantly switch to the doctype-du-jour -- or rather last years' doctype-du-jour -- as well as having HTML validity - that's not being a perfectionist, it's called OCD, and I'm drawing the line there."

XHTML is last year's doctype-du-jour, too.

He does go on to mention that 'HTML5 will save us' - at least from quibbling over <br> vs <br />...

Unfortunately he thinks that he can't switch to HTML5 today and get all the benefits he requires. Do it, there's no reason not to.



Actually, there is a reason: in IE6 it breaks depending on context. If you have an incorrect content header and don't prefix http:// on your URL, then having an HTML5 doctype makes your page try to "download" instead of display.

Whereas having a doctype that matches the tags you actually use (XHTML or HTML4) will work fine without the http:// prefix. As most people don't use that prefix when typing your URL, this can have a drastic effect on your page views.

Have you got a link describing this issue?

I've not heard about it before and Google's not giving me anything.

Afraid not, as I only recently came across it and fixed it by adjusting the content headers. It's a quick fix, but the issue can be detrimental, as a lot of web servers will present incorrect content headers and developers will be completely unaware, since browsers will obey doctypes over content headers (I think). In this particular case IE6 will obey the content header, since it doesn't understand the HTML5 doctype, and will try to "download" the content instead.

It should be trivial to set up a test case if you want. Set up a virtual host with an empty content-type HTML header and an HTML5 doctype, and open it in IE6 without putting http:// before the URL. You should see it try to download the page.
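For anyone who wants to try that test case without configuring a virtual host, here is a rough Python sketch using only the standard library. It assumes "empty content-type header" means the header is simply not sent; the handler name, the page body and the `fetch_once` helper are made up for illustration, and whether IE6 actually offers a download is exactly what remains to be verified.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A tiny page with the HTML5 doctype (contents are illustrative only).
PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head><title>test</title></head>"
    b"<body><p>hello</p></body></html>\n"
)


class NoContentTypeHandler(BaseHTTPRequestHandler):
    """Serves PAGE but deliberately omits the Content-Type header."""

    def do_GET(self):
        self.send_response(200)
        # Assumption: "empty content-type" is modelled as "no header at all".
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        # Keep the console quiet during the check.
        pass


def fetch_once():
    """Start the server on a free local port, fetch the page once, shut down."""
    server = HTTPServer(("127.0.0.1", 0), NoContentTypeHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", "/")
        resp = conn.getresponse()
        body = resp.read()
        content_type = resp.getheader("Content-Type")
        conn.close()
        return content_type, body
    finally:
        server.shutdown()
        server.server_close()
```

`fetch_once()` only confirms that the response really goes out without a Content-Type. To reproduce the reported behaviour you would keep the server running on a fixed port (calling `serve_forever()` directly) and open the address in IE6 without typing http:// first.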

Not that I'm doubting you, but it doesn't really make sense to me.

Browsers sniff the doctype to change rendering mode, but that's after they've already decided that it's HTML that they're going to display.

IE6 & 7 do content type sniffing, where they ignore what the server tells them it is and try and figure it out themselves, by looking at various bits of info including the start of the file.

This could lead to HTML being downloaded if it happened to look like a RAR file to IE, but generally it has the opposite effect: something sent as plain text that happens to include tags gets rendered as HTML.

If anything, shortening the doctype should make something look more like HTML since it works without any doctype at all and the shorter the doctype the more room for HTML tags.

On the other hand content-sniffing is always going to give you unexpected results. Is there anything about the particular file you had that makes it atypical or likely to look like a binary file if you only consider the start of it?

There's nothing preventing Joe Djangodev from switching the doctypes right now.

Not much, but why would you switch to HTML5? There is limited support for actual HTML5 elements, and if you're not using them, then you're not using HTML5, just HTML4 loose.

It's not like any browser is going to drop the HTML4 doctype, so until there is widespread support for HTML5 elements (footer, article, etc.) there is no real benefit other than PR.

Because then I can mix <br> and <br /> without some smart alec telling me my document doesn't validate.

Serious question: what is wrong with XHTML? I've been using the XHTML 1.1 Strict DTD for the last 3 years straight, always creating fully validating code. I found XHTML infinitely easier to appease than HTML4. Perhaps not as important, but I've also found that I can create nearly identical XHTML in different Web frameworks (depending on the task), which is important to some of my clients who don't have a lot of resources to upgrade the plethora of legacy systems they operate. I don't mean that as a brag "I'm so great", to the contrary, I've just found XHTML to be much more consistent and far easier to construct via robot than HTML4, meaning I don't have to be "so great" to use it.

Yes but - it's still true (for me at least), that XHTML is far easier to construct automatically. Mostly due to the tag-closing behaviour.

I don't spend all day writing HTML, so it's not at the forefront of my mind which tags are self-closing and which ones aren't. Is it "<br>" or "<br/>"? And really, why would I care? And even more really, how much of that behaviour can I be bothered putting into automatically-produced output, rather than just simply putting the "/" everywhere?

If I cheat and serve up XHTML-like tags under a text/html mimetype, then nothing and nobody cares (as far as I can tell) except for the W3 validator.

You can't just put the / everywhere, and you still have to differentiate between non-empty and empty elements (e.g. div and br).

If you write XHTML, and don't follow these two non-XML rules (and indeed the other 14), then it'll probably break when you give it to an HTML parser or browser:


C.2 Empty Elements: Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.


C.3. Element Minimization and Empty Element Content: Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).
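As a rough illustration of what those two guidelines ask of a generator, here is a small Python sketch. The `render` helper and the `EMPTY_ELEMENTS` set are hypothetical names; the set is meant to list the elements declared EMPTY in XHTML 1.0 Strict, and the sketch covers only C.2 and C.3, not the other guidelines.

```python
from html import escape

# Elements whose content model is EMPTY in XHTML 1.0 Strict (assumed list).
EMPTY_ELEMENTS = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param",
}


def render(tag, attrs=None, content=""):
    """Render one element as a string, following guidelines C.2 and C.3."""
    attr_text = "".join(
        f' {name}="{escape(value, quote=True)}"'
        for name, value in (attrs or {}).items()
    )
    if tag in EMPTY_ELEMENTS:
        if content:
            raise ValueError(f"<{tag}> cannot have content")
        # C.2: minimized form, with a space before the trailing "/>".
        return f"<{tag}{attr_text} />"
    # C.3: a non-EMPTY element always gets an explicit end tag,
    # even when it happens to have no content.
    return f"<{tag}{attr_text}>{content}</{tag}>"


print(render("br"))                                        # <br />
print(render("img", {"src": "karen.jpg", "alt": "Karen"}))
print(render("p"))                                         # <p></p>
```

The point is that the generator still needs the table of empty elements, which is why "just put the / everywhere" doesn't work.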

Mostly due to the tag-closing behaviour.

I guess I don't really understand this; best practice in HTML is to close tags and quote attributes just as in XHTML. The only difference is XHTML requires it.

As for self-closing tags, the only one I use, at all, is <img>, and I'm perfectly OK with having to handle that (largely because I don't; there are already libraries which will generate my HTML for me).

If you're using a serializer to write your document, it doesn't make any difference whether you serialize to XML or HTML. If you worry about where to put a '/', you should most likely be using a serializer instead of whatever you are doing.
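A minimal sketch of that idea, using Python's standard `xml.etree.ElementTree` purely as an example serializer (nothing here is Django-specific, and the tiny tree is made up for illustration). The same tree is written out once as XML and once as HTML, and the serializer decides where any "/" goes:

```python
import xml.etree.ElementTree as ET

# Build one small tree: a div containing an empty br and an empty p.
div = ET.Element("div")
ET.SubElement(div, "br")
ET.SubElement(div, "p")

# Same tree, two serializations.
as_xml = ET.tostring(div, encoding="unicode", method="xml")
as_html = ET.tostring(div, encoding="unicode", method="html")

print(as_xml)   # <div><br /><p /></div>
print(as_html)  # <div><br><p></p></div>
```

Note that the plain XML output minimizes the empty paragraph to `<p />`, which is the form guideline C.3 warns against when the result is sent to an HTML parser.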

If Django's XHTML-style form tags bother you (they bother me) you can always use my django-html template library, which fixes the problem in a kind-of-but-not-too-hacky way.


And if anyone happens along and wants something similar for Rails: http://mislav.uniqpath.com/rails/cargo-culting-xhtml-conside...

(Mislav Marohnić's discussion of his standardista plugin and a link to it on Github.)

Why do they bother you? It's literally a matter of

    <input type="text" name="email" />

    <input type="text" name="email">

Why do you prefer the latter? (Genuinely curious, not being rude)

I'm a perfectionist.

My plan has always been:

* keep making tags lowercase

* keep quoting attribute values

* keep closing tags after I open them (and properly nesting tags)

* keep using the closing slash for standalone tags

* keep calling it text/html (mime type)

* keep using UTF-8 encoded Unicode by default

* keep not using xml namespaces

* keep forgetting to use the DOCTYPE declaration

* and start carefully using some of the neat new tags in html5

Forgetting to use a DOCTYPE is a bad idea, since it throws most browsers into "quirks" mode, meaning they won't render things according to the published standards. That makes debugging problems a whole lot harder, since you first have to work out whether the bug relates to quirks mode vs. standards mode, and quirks mode problems are far less widely documented.

Given my html as described above, what would you recommend to use for the DOCTYPE line?

<!DOCTYPE html>

easy to remember and triggers standards mode

Looks redundant, but easy. Thanks.

I missed the memo. Can anyone point me to an article explaining exactly why XHTML has suddenly become the whipping boy du jour? I'm well aware that HTML5 is the next hot thing and all, but I don't understand why anyone would favor HTML4?


"A number of problems resulting from the use of the text/html MIME type in conjunction with XHTML content are discussed. It is suggested that XHTML delivered as text/html is broken and XHTML delivered as text/xml is risky, so authors intending their work for public consumption should stick to HTML 4.01, and authors who wish to use XHTML should deliver their markup as application/xhtml+xml."

Note the date (Sep 2002) and the fact that the author is now in charge of the HTML5 spec.

Some people argue that forcing web developers to follow rigid XML rules, plus an appendix full of hints on how to mangle your XML so that actual browsers wouldn't choke on it (http://www.w3.org/TR/xhtml1/guidelines.html), somehow got people in the right mindset to do semantic, accessible HTML. The prevailing view is that this was mostly cargo cult behaviour and that neither those following nor those recommending this course of action really understood the supposed benefits or drawbacks.

I wrote up some thoughts a while back:


You only have problems if you try to switch to application/xhtml+xml, which is easy to solve: don't do that

if we were outputting HTML4, it might cause *genuine* problems - e.g. if you need XHTML in order to mix in MathML or SVG

.. to do which you must serve as application/xhtml+xml.

That said, I agree with his points. And personally I wish the browsers would relax the restrictions on when they will agree to render SVG. I see no good reason why they refuse to do svg in text/html mode, considering the exceptions they make for seemingly everything else.

"I see no good reason why they refuse to do svg in text/html mode"

This probably breaks major assumptions inside the browser engines. Rendering most image types is a one-and-done type deal, just setting up a few calls to libjpeg or libpng, but SVG actually has to be XML-parsed and rendered element by element on page load. To do it in text/html, you'd need to hook together the html parser with the XML parser with the image generator. That's a lot of work to break the rules!

Well, good point, but I can't imagine it would be that hard, and they'd have to do it anyway for HTML5, which does indeed support svg inside text/html.

