Hacker News new | past | comments | ask | show | jobs | submit login

I don't think you fully appreciated my comment. I've now looked at 100+ documents from that list. Not a single one has had actual content related to the web standard.

I was finally able to find one, by looking elsewhere: https://www.w3.org/TR/css-grid-1/. You include 8 copies of the css-grid-1 standard in your count. So of the small fraction of documents that are actually web standards, you're overcounting by an order of magnitude. In other words, I expect that the actual count here is off by 2 orders of magnitude, that the real size of the "relevant" web standard is 1-2 million words, and that the rest is just bad measurement.

> Out of curiosity, is it your intention to also look for flaws in my approach to word-counting the non-web specs I compared against?

No, I think pointing out a 2-3 order of magnitude mistake in your methodology speaks for itself.

> They felt this necessary to document, so I included it. The same is true of other specs I compared against, such as POSIX.

The POSIX spec includes examples and docs, yes. But so do the actual web specs (see again the css-grid spec). What the POSIX spec doesn't include is a parallel version of the docs meant entirely for POSIX users, wholly irrelevant to people building a POSIX shell. Again, you're including a document analyzing which PDF readers to use for accessibility testing in a word count of web standards.

For an even more egregious example, https://www.w3.org/TR/2013/CR-xpath-datamodel-30-20130108/ is one of eighty versions of the XPath data model spec that you count, and XPath isn't even officially supported in browsers.

I think I was extremely generous with my margins and went to lengths to be selective with my inclusion criteria. I didn't even catalogue everything under those criteria, and I omitted huge swaths of web standards on the basis that (1) it was more forgiving to W3C and (2) they would be difficult to compare on the same terms. At most you've given a credible suggestion that the count might be off by an order of magnitude, but even if it were, it changes the conclusions very little. I explained all of that and more in my methodology document, and I stand by it. If you want to take the pains to come up with an objective measure yourself and provide a similar level of justification, I'm prepared to defer to your results, but not when all you have is anecdotes from vaguely scanning through my dataset looking for problems to cherry-pick.

No, I've given credible reasons for two orders of magnitude:

1. The majority of the documents you are including are not reasonably considered web standards

2. Of those that are, you are counting each one 5-50 times.

That's two orders of magnitude.
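To illustrate the duplicate-counting point concretely: assuming the dataset is a list of W3C /TR/ URLs, dated snapshot URLs of the same spec can be collapsed to a single shortname before counting. This is a minimal sketch, not anyone's actual methodology; the URL patterns are the standard W3C dated (/TR/YYYY/STATUS-shortname-YYYYMMDD/) and latest-version (/TR/shortname/) forms.

```python
import re
from collections import Counter

def shortname(url: str) -> str:
    """Collapse a W3C TR URL to its spec shortname.

    Dated snapshots look like /TR/2013/CR-xpath-datamodel-30-20130108/;
    the latest-version form is /TR/css-grid-1/.
    """
    path = url.rstrip("/").split("/TR/")[-1]
    # Dated snapshot: YYYY/STATUS-shortname-YYYYMMDD
    m = re.fullmatch(r"\d{4}/[A-Z]+-(.+)-\d{8}", path)
    return m.group(1) if m else path

# Hypothetical dataset: three URLs, but only two distinct specs.
urls = [
    "https://www.w3.org/TR/css-grid-1/",
    "https://www.w3.org/TR/2013/CR-xpath-datamodel-30-20130108/",
    "https://www.w3.org/TR/2014/REC-xpath-datamodel-30-20140408/",
]
counts = Counter(shortname(u) for u in urls)
# counts: {"xpath-datamodel-30": 2, "css-grid-1": 1}
```

If each spec really appears 5-50 times in the raw list, deduplicating this way is the difference between the two totals being argued about.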

All your analysis has proven is that it's (ironically) difficult to machine-parse the W3C data, and that you did so in a way that justified your preconceptions.
