

HTML 5 Parsing - twampss
http://ejohn.org/blog/html-5-parsing/

======
makecheck
I always felt that the <!DOCTYPE html> simplification, while nice, is still a
little weird because it immediately precedes an <html> anyway. I recall a blog
years ago suggesting that HTML 5 should simply have said <html version=5>, and
I think it would have been nice to see something that sane. :)

~~~
mbrubeck
I'm sure the spec authors would love to make that simplification too, but one
of their goals is compatibility with deployed browsers. <!DOCTYPE html> works
to trigger standards mode in all currently-popular browsers.

~~~
ars
But shouldn't they _also_ add a version=5 in the <html> tag?

What are they going to do when we get html 6?

Is there any way to suggest it, or is the spec basically done?

~~~
rimantas
No, omitting the version is intentional (and the version number did not mean
much in the past). The HTML5 spec defines a parsing algorithm that should be
used for all HTML versions and will never require browsers to do anything
different for different versions of HTML. If a future browser is capable of
parsing HTML8, it will be able to parse HTML5 just fine.

~~~
randallsquared
I think the problem is that a 2009 browser will not be able to notice that the
HTML8 it's seeing is potentially beyond what it can do. With versions, a check
for versions above 5 could produce a message that says that some parts might
not display properly.
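
A rough sketch of the check described above, in Python. Note that the
`version` attribute on <html> is only the hypothetical one proposed in this
thread, not something in the HTML5 spec, and `MAX_SUPPORTED_VERSION` and
`version_warning` are made-up names for illustration.

```python
import re

# Assumption: the highest HTML version this imaginary 2009 browser knows about.
MAX_SUPPORTED_VERSION = 5


def version_warning(markup):
    """Return a warning string if the page declares a newer version, else None."""
    # Look for the hypothetical version attribute on the <html> tag,
    # with or without quotes around the number.
    match = re.search(
        r"<html[^>]*\bversion\s*=\s*[\"']?(\d+)", markup, re.IGNORECASE
    )
    if match is None:
        return None  # no version declared, nothing to compare against
    version = int(match.group(1))
    if version > MAX_SUPPORTED_VERSION:
        return (
            f"This page declares HTML {version}; "
            "some parts might not display properly."
        )
    return None
```

This only warns on the declared number; it says nothing about whether the page
actually uses anything the browser cannot handle.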

~~~
blasdel
Why, just because some Zeldman-humping git noticed they could put a bigger
number there?

The browser could display an infobar if it _actually encounters something it
doesn't understand_.
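
For what it's worth, that alternative is easy to sketch with Python's standard
`html.parser` module. `KNOWN_TAGS` here is a deliberately tiny, made-up list
standing in for whatever a real browser supports, and `unknown_tags` is a
hypothetical helper, not anything from an actual browser.

```python
from html.parser import HTMLParser

# Assumption: a toy list of the elements this imaginary browser understands.
KNOWN_TAGS = {"html", "head", "title", "body", "p", "a", "div", "span"}


class UnknownTagFinder(HTMLParser):
    """Collects the names of start tags that are not in KNOWN_TAGS."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag not in KNOWN_TAGS and tag not in self.unknown:
            self.unknown.append(tag)


def unknown_tags(markup):
    """Return unrecognized tag names in order of first appearance."""
    finder = UnknownTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.unknown
```

A browser could show the infobar only when `unknown_tags(page)` comes back
non-empty, so no version number is needed at all.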

------
jdagostino
Does anyone know the reason for maintaining both the Java and C++ versions?

Why didn't they just fork the C++ conversion of the Java source and be done
with it?

~~~
ars
Be done with it? Quite the opposite, that means you constantly need to
maintain two versions.

By using an automated conversion, you only need to maintain one of them.

