
The Robustness Principle Reconsidered (2011) - throw0101a
https://queue.acm.org/detail.cfm?id=1999945
======
mcguire
" _This also clarifies what to do when passing on a packet—implementations
should not clear field X, even though that is the most "conservative" thing to
do, because that would break the case of a version 1 implementation forwarding
a packet between two version 2 implementations. In this case the Robustness
Principle must include a corollary: implementations should silently ignore and
pass on anything that they don't understand. In other words, there are two
definitions of "conservative" that are in direct conflict._"

Clearing the field is not 'the most "conservative" thing to do'.
"Conservative" in the context of network protocols means not fooling with
things you don't understand and don't need to touch. Yet this is a very
common misunderstanding, on the part of both specifiers and implementers.
Likewise,

" _Now let 's suppose that our mythical standard has another field Y that is
intended for future use—that is, in a protocol extension. There are many ways
to describe such fields, but common examples are to label them "reserved" or
"must be zero." The former doesn't say what value a compliant implementation
should use to initialize reserved fields, whereas the latter does, but it is
usually assumed that zero is a good initializer. Applying the Robustness
Principle makes it easy to see that when version 3 of the protocol is released
using field Y there will be no problem, since all older implementations will
be sending zero in that field._"

"Must be zero" should probably be stricken from the protocol lexicon. If a
field is unused, an implementation should not be looking at it at all.
Instead, add wording such that all fields are zeroed before individual values
are assigned.
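That suggestion can be sketched in a few lines. The header layout, field names, and offsets below are entirely hypothetical, not taken from any real protocol:

```python
# Hypothetical 6-byte fixed-layout header: version (1 byte),
# flags (1 byte), field X (2 bytes), reserved field Y (2 bytes).
HEADER_LEN = 6

def build_packet(version: int, flags: int, x: int) -> bytearray:
    # Zero the whole header first, then assign only the fields we
    # understand; the reserved bytes end up zero without the code
    # ever naming or inspecting them.
    pkt = bytearray(HEADER_LEN)
    pkt[0] = version
    pkt[1] = flags
    pkt[2:4] = x.to_bytes(2, "big")
    return pkt

def forward(pkt: bytes) -> bytes:
    # A forwarder that is "conservative" in this sense: it passes on
    # everything, including bytes it doesn't understand, rather than
    # clearing them.
    return bytes(pkt)
```

Note that a version-3 sender can later put a nonzero value in field Y and the forwarder above will carry it through untouched.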

" _The problem occurs in how far to go. It 's probably reasonable not to
verify that the "must be zero" fields that you don't have to interpret are
actually zero—that is, you can treat them as undefined. As a real-world
example, the SMTP specification says that implementations must allow lines of
up to 998 characters, but many implementations allow arbitrary lengths;
accepting longer lines is probably OK (and in fact longer lines often occur
because a lot of software transmits paragraphs as single lines)._"

But this is exactly the sort of issue on which the robustness principle
falls down. A server passing along a message with lines longer than 998
characters to an implementation that doesn't handle them is a buffer overflow
waiting to happen. It's considered a reasonable exception only because it
hasn't caused problems, not because it cannot. On the other hand,
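Avoiding that overflow just means enforcing the limit explicitly at the boundary instead of trusting the sender. A minimal sketch, where 998 is the SMTP line limit from the quoted passage:

```python
import io

MAX_LINE = 998  # SMTP text-line limit, excluding the trailing CRLF

def read_bounded_line(stream, limit: int = MAX_LINE) -> bytes:
    # Read one LF-terminated line, refusing input beyond `limit`
    # bytes instead of silently writing past a fixed-size buffer.
    buf = bytearray()
    while True:
        b = stream.read(1)
        if not b or b == b"\n":
            return bytes(buf.rstrip(b"\r"))
        if len(buf) >= limit:
            raise ValueError("line exceeds %d characters" % limit)
        buf += b
```

A receiver built this way can still choose to be liberal (raise the limit), but it makes the decision deliberately rather than by omission.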

" _Many Web browsers have generally been willing to accept improper HTML
(notably, many browsers accept pages that lack closing tags). This can lead to
rendering ambiguities (just where does that closing tag belong, anyhow?), but
is so common that the improper form has become a de facto standard—which makes
building any nontrivial Web page a nightmare. This has been referred to as
"specification rot."._"

Would it have been possible for the HTTP/HTML environment to take off in the
1990s if browsers _had not_ been very liberal in what they accepted? All of
those Geocities pages were probably misformatted. HTTP servers, by definition,
don't care what they're serving, so any errors would only appear in the
browser, and probably not the browser of the page's author.

(By the way, back when HTML was an SGML standard, SGML supported and
encouraged omitting unnecessary closing tags. HTML took advantage of this
flexibility.)

