A more charitable explanation: HN's 80-character title limit makes it hard to fit all of the important information into the title.
News publications no doubt do sometimes intentionally bury the lede and leave important information out of the original title, but I don’t think HN users are all optimizing just to hit the frontpage.
What kind of overhead does gzip have? I'd be interested to know how much text you could fit into 80 bytes of compressed output. Some mapping of (2-byte?) unicode characters to 3ish lowercase letters could be effective. Is there any standard scheme like that?
This is an interesting question. I haven't researched the overheads.
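For what it's worth, gzip's framing alone costs 18 bytes: a 10-byte header plus an 8-byte CRC32/length trailer, before any compressed payload. A quick sketch (the sample title below is made up for illustration):

```python
import gzip

# A made-up, roughly 80-character title, purely for illustration.
title = b"Study finds intermittent fasting reduces tumor growth in mice by 40 percent"

gz = gzip.compress(title)

# Even compressing empty input yields ~20 bytes: 18 bytes of framing
# plus a couple of bytes for the empty deflate stream.
framing_floor = len(gzip.compress(b""))

print(f"title: {len(title)} bytes, gzipped: {len(gz)} bytes, "
      f"empty input: {framing_floor} bytes")
```

On inputs this short, deflate has little redundancy to exploit, so the fixed framing dominates and the "compressed" title typically comes out longer than the original. Raw deflate (zlib with negative `wbits`) drops the framing, but even then 80 characters of English rarely compress well.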
Separately, I feel like you could lossily drop certain "filler" words and add them back later based on grammatical rules. For example, you don't need to say "in mice", you could just say "mice", the meaning is obvious, and "in" could be fixed in post-processing at the client end.
You could also quite possibly eliminate all vowels and still reconstruct everything accurately.
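The vowel-stripping idea fits in a couple of lines, though it's worth noting the reconstruction is ambiguous, since distinct words can collide after disemvoweling (a toy sketch, not a real scheme):

```python
VOWELS = set("aeiouAEIOU")

def disemvowel(text: str) -> str:
    # Lossy "compression": drop every vowel, keep everything else.
    return "".join(c for c in text if c not in VOWELS)

# Reconstruction is not unique: different words map to the same output,
# so you cannot always recover the original text without extra context.
assert disemvowel("mice") == disemvowel("mace") == "mc"
```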
> For example, you don't need to say "in mice", you could just say "mice", the meaning is obvious, and "in" could be fixed in post-processing at the client end.
> You could also quite possibly eliminate all vowels and still reconstruct everything accurately.
My guess is that 10 minutes after that is rolled out, someone will have found a collision that decompresses to some kind of dirty joke.