This story comes up from time to time and I hate it. Knuth was asked to illustrate literate programming style, and he did so. The request was not "show us how to find the most frequently occurring words in a file." McIlroy's response would be fine for the latter and utterly useless for the former (and actual) request.
I was curious how fast the shell pipeline is versus something more purpose-written.
This short bit of (golfed/terse) Perl seems to be 2-3x faster for the inputs I tested, though it's also about 2x more "code", despite the golfing and skipping best-practice strict/warnings. Both versions also waste some time sorting the whole list instead of using a partial heap sort or similar.
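#!/usr/bin/perl
# First argument: how many words to print (default 1); the rest: input files.
$l=shift(@ARGV)||1;
while (<>) {
    # Count whitespace-separated words, case-insensitively.
    for (split) {
        $c{lc($_)}++;
    }
}
# Sort every distinct word by descending count, then stop after $l lines.
for (sort { $c{$b} <=> $c{$a} } (keys %c)) {
    print "$c{$_} $_\n";
    last if ++$i == $l;
}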
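For what it's worth, the full sort can be skipped when only the top few entries are wanted. Below is a minimal sketch, not a tuned implementation: an awk filter wrapped in a shell function that keeps a small descending buffer of the best N entries by insertion. That is a simpler stand-in for a real heap and is only sensible for small N. The function name top_n_counts and the "count word" input format are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: keep only the N largest "count word" lines without sorting them all.
# stdin: lines of "count word"; $1: how many entries to keep (default 1).
top_n_counts() {
    awk -v n="${1:-1}" '
    BEGIN { n = n + 0 }
    {
        c = $1 + 0
        # Find where this entry belongs in the descending buffer cnt[1..size].
        pos = size + 1
        while (pos > 1 && c > cnt[pos - 1]) pos--
        if (pos > n) next              # not among the N largest seen so far
        if (size < n) size++
        # Shift smaller entries down; the last one falls off when full.
        for (i = size; i > pos; i--) {
            cnt[i] = cnt[i - 1]
            word[i] = word[i - 1]
        }
        cnt[pos] = c
        word[pos] = $2
    }
    END { for (i = 1; i <= size; i++) print cnt[i], word[i] }'
}
```

Each input line costs at most N comparisons and shifts, so the work grows with the number of distinct words times N rather than with a full sort of all of them.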
That would depend upon the shell that you used. GNU Bash in particular is not known for speed, although this is just a question of pipes.
I remember that cdrecord has explicit advice on raising the pipe buffer size to avoid an underrun (which renders a CD unusable). Perhaps pipe tuning might also affect performance here.
I don't know if dash will increase performance in this particular case.
$ rpm -qi dash | tail -3
DASH is a POSIX-compliant implementation of /bin/sh that aims to be as small as
possible. It does this without sacrificing speed where possible. In fact, it is
significantly faster than bash (the GNU Bourne-Again SHell) for most tasks.
Looking at the shell script, I think it will always be somewhat slow as it's sorting twice. One pass to sort the words so that "uniq -c" can work correctly, then the reverse numeric sort.
I don't think the shell used matters, as most of the CPU time is spent in the sorting.
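To make the two sorts concrete, here is a rough sketch of the pipeline shape being discussed, next to a variant that counts in an awk associative array so that only the list of distinct words gets sorted, and only once. The function names are made up for illustration, and no timing claims are implied.

```shell
#!/bin/sh
# Sketch: two ways to list the N most frequent words read from stdin.
# $1: how many words to print (default 1).

# Variant 1: the classic shape. It sorts twice: once so that identical
# words end up adjacent for `uniq -c`, and once more by count.
top_words_sort() {
    tr -cs 'A-Za-z' '\n' | tr 'A-Z' 'a-z' |
        sort | uniq -c | sort -rn | head -n "${1:-1}"
}

# Variant 2: count in an awk associative array, so only the (usually much
# smaller) list of distinct words is sorted, and only once.
top_words_awk() {
    tr -cs 'A-Za-z' '\n' | tr 'A-Z' 'a-z' |
        awk 'NF { count[$0]++ } END { for (w in count) print count[w], w }' |
        sort -rn | head -n "${1:-1}"
}
```

Either one would be called along the lines of `top_words_awk 10 < book.txt`, where book.txt is a placeholder input file. Note that `uniq -c` pads its counts with leading spaces, so the two variants differ slightly in output formatting.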
(I pondered 'do { local $/; <ARGV> }' to do a full slurp as well, but I don't think it helps with clarity and I'm not actually golfing here even though the resulting code happens to be shorter)
Also, if you have List::UtilsBy handy (which I basically always do),
rev_nsort_by { $c{$_} } keys %c
could be a nicer way to express the sort, although probably as fast because of the block expr versus anon-sub-via-& prototype part.
(I have not executed any of this code, I'm just musing here, so no warranty express or implied ;)
Every time I read in an article (or more typically in a headline) how some guy X "eviscerated" / "destroyed" some other guy Y and their argument, 9 times out of 10 it just means some guy X had a different opinion.
Very occasionally their opinion may be backed by an actual counterargument, but this is quite rare, and even then the counterarguments tend to be of varying quality.
Rarely is there an actual "evisceration" at play (a term I would have otherwise reserved for someone pointing out basic logical flaws in an argument, which cause it to collapse as completely invalid).
I've seen this kind of language used as clickbait so many times by now that whenever I hear "X destroys Y", my knee-jerk reaction is this: despite the controversy that usually surrounds the kind of opinions that attract this kind of response, for Y to merit such a vacuous attack in the first place means they're probably right.
> Every time I read in an article (or more typically in a headline) how some guy X "eviscerated" / "destroyed" some other guy Y and their argument, 9 times out of 10 it just means some guy X had a different opinion.
Neither was said.
> And then he calmly and clearly eviscerated the very foundation of Knuth’s program.
I've heard this story before, but I can't shake the impression that a lot of it is just luck that the problem is something that lends itself very well to a shell pipeline.
One thing that I'd love to hear more people talk about is how to apply shell-style composition to settings where it's not so obvious, or how to create new programs that lend themselves well to this programming style.
These days it's not much more gendered than the word "guys", which is now frequently used to refer to mixed-sex groups. And anyway, Doug McIlroy is male, so what exactly is your complaint?
The piece would be better without the closing section entirely. McIlroy demonstrated a different, not competing and not in any sense "courageous" or "daring", approach to program design.
The unfortunate closing statement is the single reason I didn't include this in our shell programming onboarding documentation as a motivating example of how shell fluency can help solve ad hoc questions.
• Hillel Wayne, "Donald Knuth Was Framed": https://buttondown.email/hillelwayne/archive/donald-knuth-wa...
• me, various comments, including https://news.ycombinator.com/item?id=22407313 in the comments on that post, linking to older/other comments.