I was finally able to find one by looking elsewhere: https://www.w3.org/TR/css-grid-1/. You include 8 copies of the css-grid-1 standard in your count. So even within the small fraction of documents that are actually web standards, you're overcounting by an order of magnitude. In other words, I expect that the total here is off by 2 orders of magnitude, that the real size of the "relevant" web standards is 1-2 million words, and that the rest is just bad measurement.
> Out of curiosity, is it your intention to also look for flaws in my approach to word-counting the non-web specs I compared against?
No, I think pointing out a mistake of 2-3 orders of magnitude in your methodology speaks for itself.
> They felt this necessary to document, so I included it. The same is true of other specs I compared against, such as POSIX.
The POSIX spec includes examples and docs, yes. But so do the actual web specs (see again the CSS grid spec). What the POSIX spec doesn't include is a parallel set of docs written entirely for POSIX users, which would be wholly irrelevant to someone building a POSIX shell. Again, your analysis of web standards counts a document about which PDF readers to use when testing the accessibility of a PDF you're writing.
For an even more egregious example, https://www.w3.org/TR/2013/CR-xpath-datamodel-30-20130108/ is one of eighty versions of the XPath data model spec that you count, and XPath isn't even an officially supported browser thing.
1. The majority of the documents you are including cannot reasonably be considered web standards.
2. Of those that can, you are counting each one 5-50 times.
That's two orders of magnitude.
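For what it's worth, collapsing the duplicate versions is not hard. Below is a rough sketch, not your actual pipeline: it assumes dated W3C URLs follow the `/TR/<year>/<STATUS>-<shortname>-<YYYYMMDD>/` pattern seen in the xpath link above, and that undated URLs look like `/TR/<shortname>/`. The function names `shortname` and `group_by_spec` are made up for illustration, and the css-grid-1 dated URL in the usage example is only there to show the shape of the input.

```python
import re
from collections import defaultdict

# Assumed shape of a dated snapshot URL:
#   /TR/<year>/<STATUS>-<shortname>-<YYYYMMDD>/
DATED_TR = re.compile(r"/TR/(\d{4})/([A-Za-z]+)-(.+)-(\d{8})/?$")
# Assumed shape of an undated "latest version" URL: /TR/<shortname>/
LATEST_TR = re.compile(r"/TR/([^/]+)/?$")


def shortname(url):
    """Return the spec shortname for a TR URL, or None if it matches neither shape."""
    m = DATED_TR.search(url)
    if m:
        return m.group(3)
    m = LATEST_TR.search(url)
    return m.group(1) if m else None


def group_by_spec(urls):
    """Group URLs by shortname so each spec can be counted once."""
    groups = defaultdict(list)
    for url in urls:
        name = shortname(url)
        if name is not None:
            groups[name].append(url)
    return dict(groups)


if __name__ == "__main__":
    urls = [
        "https://www.w3.org/TR/css-grid-1/",
        "https://www.w3.org/TR/2017/CR-css-grid-1-20170209/",  # illustrative input
        "https://www.w3.org/TR/2013/CR-xpath-datamodel-30-20130108/",
    ]
    for name, versions in group_by_spec(urls).items():
        print(name, len(versions))
```

Counting words for one URL per group, rather than for every URL, is the whole fix for point 2. It does nothing about point 1, which needs a human to decide what counts as a web standard in the first place.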
All your analysis has proven is that it's (ironically) difficult to machine-parse the W3C data, and that you parsed it in a way that justifies your preconceptions.