
A number of errors in this article make me wary:

1. The "request" line in HTTP is not a header - it is the request, which can have associated headers. The headers are all “about” the request. The request itself is not a header, and does not follow the header syntax. (The historical reason for this is that the request line was defined in HTTP 0.9, which did not have headers.)
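To make the distinction concrete, here is a minimal Python sketch that pulls the request line apart from the headers that follow it. The helper name `split_request_head` and the sample request are made up for illustration, and it deliberately ignores edge cases such as obsolete header folding or repeated header names.

```python
def split_request_head(raw: bytes):
    """Split an HTTP/1.x request head into (request_line, headers).

    Hypothetical helper for illustration only; not a complete parser.
    """
    # Everything before the first blank line is the request head.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    lines = head.split("\r\n")

    # The first line is the request itself: "METHOD target VERSION".
    # It has no "Name: value" shape, so it is not parsed as a header.
    request_line = lines[0]

    # Every following line is a header describing that request.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


raw = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Accept-Encoding: gzip\r\n"
    b"\r\n"
)
request_line, headers = split_request_head(raw)
print(request_line)
print(headers)
```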

2. ISO-8859-1 is not “a crappy Windows character set”. It is an international standard, and it is specifically different from what Microsoft was using at the time (code page 437 was standard for MS-DOS in the US). Later, Windows switched to code page 1252, which is a copy of ISO-8859-1 except for some extra glyphs in the byte range that the ISO standard reserves for control characters.
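For illustration, a small Python sketch of where the two encodings disagree. It assumes Python's built-in `iso-8859-1` and `cp1252` codecs are a fair stand-in for the two character sets; the sample bytes are arbitrary.

```python
# Bytes 0x80-0x9F are the range where the two encodings differ.
sample = bytes([0x80, 0x93, 0x94, 0xE9])

as_latin1 = sample.decode("iso-8859-1")  # three C1 control characters, then e-acute
as_cp1252 = sample.decode("cp1252")      # euro sign, curly quotes, then e-acute

print([hex(ord(ch)) for ch in as_latin1])
print([hex(ord(ch)) for ch in as_cp1252])

# From 0xA0 upwards the two encodings agree byte for byte.
upper_half = bytes(range(0xA0, 0x100))
same_upper_half = upper_half.decode("iso-8859-1") == upper_half.decode("cp1252")
```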

Thanks for the clarification about the request line, I'll edit the article to point that out!

I mostly referred to it as a "crappy Windows character set" because A) it has a limited set of characters, mostly Western European, and B) it's pretty much only used by Windows these days. While the term "crappy Windows character set" is perhaps not entirely accurate, it is a short, tongue-in-cheek summary of ISO-8859-1.

Unicode also has a limited set of characters, mostly those that the Unicode Consortium has agreed on including in the standard.

That's splitting hairs - UTF-8 allows for over a million code points, enough to cover pretty much every written language, and then some (including swathes of emoji characters). ISO-8859-1 has 256 code points, barely enough to cover Europe and America.
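A rough Python sketch of the size gap, for what it's worth. Note that `sys.maxunicode + 1` is the size of the whole Unicode code space (including the surrogate range, which UTF-8 cannot encode), so treat the number as an upper bound; the helper `encodable` and the sample strings are made up for illustration.

```python
import sys

unicode_code_points = sys.maxunicode + 1  # size of the Unicode code space
latin1_code_points = 256                  # one byte per character


def encodable(text: str, encoding: str) -> bool:
    """Hypothetical helper: can `text` be represented in `encoding`?"""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = ["café", "Ελληνικά", "日本語", "🙂"]
# Maps each sample to (fits in ISO-8859-1, fits in UTF-8).
results = {s: (encodable(s, "iso-8859-1"), encodable(s, "utf-8")) for s in samples}

print(unicode_code_points, latin1_code_points)
for text, (in_latin1, in_utf8) in results.items():
    print(text, in_latin1, in_utf8)
```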

> Thanks for the clarification about the request line, I'll edit the article to point that out!

(Apparently you weren’t thankful enough to upvote. EDIT: never mind, I must have been mistaken.)

A more accurate description of ISO-8859-1 would be “a crappy 8-bit character set mostly only still relevant for Windows which uses its own embraced and extended version, CP1252.”

I'm afraid you're mistaken, I dutifully upvoted you right after I commented.

I've changed the wording to be slightly less ambiguous. Thanks again :)

I saw your comment and still saw only 1 point on my post; I guess I must have received a downvote too during that time. Oh well, sorry for being huffy.

For compatibility reasons, browsers don't actually use ISO-8859-1; they interpret that label as Windows-1252 instead (this de facto requirement has now been codified in the WHATWG Encoding Standard: <http://encoding.spec.whatwg.org/>).
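Roughly, the label handling described here could be sketched in Python like this. The label set is a partial, hand-copied subset of the labels the Encoding Standard maps to windows-1252, and `browser_like_decode` is a hypothetical helper, not anything a browser exposes.

```python
# Partial subset of the labels that the WHATWG Encoding Standard maps to
# windows-1252; the real table is longer.
WINDOWS_1252_LABELS = {
    "iso-8859-1", "iso8859-1", "latin1", "l1",
    "ascii", "us-ascii", "cp1252", "windows-1252",
}


def browser_like_decode(body: bytes, label: str) -> str:
    """Hypothetical helper: pick the codec roughly the way a browser would."""
    normalized = label.strip().lower()
    codec = "cp1252" if normalized in WINDOWS_1252_LABELS else normalized
    # Caveat: Python's cp1252 codec rejects the five bytes that Windows-1252
    # leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D), whereas the Encoding
    # Standard maps them to control characters.
    return body.decode(codec)


# A page labelled ISO-8859-1 that actually contains Windows-1252 curly quotes.
text = browser_like_decode(b"\x93quoted\x94", "ISO-8859-1")
print(text)
```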

To quibble further, the request line typically won't have a "host" section. It's almost always a URI path/stem, and an HTTP/1.1 client sends an additional Host header. The request line must also have the protocol and version, e.g. HTTP/1.0.

To quibble further still: the request line may have the protocol and version if the client is HTTP/1.0 or newer. HTTP/1.0 servers must "recognize the format of the Request-Line for HTTP/0.9 and HTTP/1.0 requests" (RFC 1945).
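As a sketch of what "recognizing both formats" might look like in Python: the function name `parse_request_line` is hypothetical, and reporting a missing version as "HTTP/0.9" is a choice made here for illustration, not something the RFC mandates.

```python
def parse_request_line(line: str):
    """Hypothetical helper: return (method, target, version).

    An HTTP/0.9 "simple request" is just "GET <uri>" with no version token,
    so the version is reported as "HTTP/0.9" in that case.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) == 2:
        method, target = parts
        if method != "GET":
            raise ValueError("HTTP/0.9 simple requests only define GET")
        return method, target, "HTTP/0.9"
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        return parts[0], parts[1], parts[2]
    raise ValueError(f"malformed request line: {line!r}")


old_style = parse_request_line("GET /index.html\r\n")
new_style = parse_request_line("GET /index.html HTTP/1.0\r\n")
print(old_style)
print(new_style)
```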

Although no one will give a fuck if you don't handle HTTP/0.9.

Indeed. The claim "Deflate sucks compared to Gzip" jumped out at me. A more thorough discussion here would be helpful, something along the lines of "While deflate would be the superior choice (though narrowly), it has historically been poorly implemented in servers and user-agents and should therefore be avoided for compatibility".

It jumped out at me as well... because I'm under the impression that there is little difference between the two and they both use the same compression algorithm.

Gzip format uses the deflate algorithm and adds a header and footer. The only advantage over raw deflate is that it includes a CRC, the uncompressed size, and optionally the original file name, none of which are necessary for HTTP. I guess there is also the advantage that already-gzipped files can be served as-is to clients that send Accept-Encoding: gzip.

The difference between the two is that Gzip uses CRC32 while Deflate (as HTTP defines it, i.e. the zlib format) uses Adler32, which is slightly faster to compute. The problem, though, is that many browsers and servers (incorrectly) send or expect deflate without the zlib header, so "deflate" interoperability is a trainwreck.
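The three framings can be compared with Python's standard `zlib` module, as a rough sketch. The `wbits` values select the container (-15 raw deflate, 15 zlib, 31 gzip); the helpers `deflate` and `tolerant_inflate` and the sample data are made up for illustration, and the fallback is only one plausible way a client might cope with servers that send raw deflate.

```python
import struct
import zlib

data = b"hello, hello, hello, world! " * 20


def deflate(payload: bytes, wbits: int) -> bytes:
    # wbits picks the container: -15 raw deflate, 15 zlib, 31 gzip.
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(payload) + compressor.flush()


raw_stream = deflate(data, -15)   # bare deflate data, no header or checksum
zlib_stream = deflate(data, 15)   # what HTTP "deflate" is supposed to be
gzip_stream = deflate(data, 31)   # what HTTP "gzip" is

# zlib ends with a big-endian Adler-32 of the uncompressed data.
zlib_trailer_ok = zlib_stream[-4:] == struct.pack(">I", zlib.adler32(data))
# gzip ends with a little-endian CRC-32 followed by the uncompressed size.
gzip_trailer_ok = gzip_stream[-8:] == struct.pack("<II", zlib.crc32(data), len(data))


def tolerant_inflate(body: bytes) -> bytes:
    """Hypothetical client-side workaround for ambiguous "deflate" bodies."""
    try:
        return zlib.decompress(body)        # correct: zlib-wrapped
    except zlib.error:
        return zlib.decompress(body, -15)   # fallback: raw deflate


print(len(raw_stream), len(zlib_stream), len(gzip_stream))
print(zlib_trailer_ok, gzip_trailer_ok)
```

A strict client that only calls `zlib.decompress(body)` fails on the raw stream, which is the interoperability problem in a nutshell.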
