Hacker News
The Yoda of Silicon Valley (nytimes.com)
976 points by sohkamyung 29 days ago | 331 comments



My favorite Knuth quote: "Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don't have time for such study."

source: https://www-cs-faculty.stanford.edu/~knuth/email.html


I like how it's a nice articulation of a preference that doesn't denigrate people who have the opposite preference.


There are actually two of those in that sentence: the one about email, and the one about doing the long, deep study.


I feel completely the opposite.

Not that email or any of the million other apps don't produce constant interruptions - but in terms of what I want my role to be and what I find pleasure in.

A world of high stimulation and constant social interaction? A million things shouting at me and I have to sort through them and grab the most important one and hop from thing to thing fast? That's what I like.

Focusing on one topic for hours, days, even weeks to really understand every facet of it? I think it'd kill me. I know people theoretically need these big blocks of uninterrupted time to get work done, but I'd much rather have a bunch of meetings where we can figure out the path forward.

I have mad respect for him... but I never could have been an academic.

I like to take in all those digested kernels of knowledge, I like to digest as many of them as I can; I know I lose the details, but integrating them all into something bigger has always been my dream; it's just how I live my life.


It's only after you've developed a deep understanding that any efforts at integration would be successful. If you just like orchestrating, that's fine but I've met too many managers/product managers who think up lofty ideas on how to integrate various pieces without a deeper understanding of the underlying technology.


Yup, I've been around a lot of them. They're the ones who love to sling words around: prime examples of tech bros who debate not on the basis of firm knowledge but as a cover-up for their inadequacies.


One of the signs that you're actually good at operating at this level is you spend more time listening than talking. You need people who actually know more than you - if you are always arguing and trying to prove you actually do know best, you're definitely not looking for that.


I have seen the other side of this where the developers trying to implement business requirements in code are making bad assumptions and end up in confrontations with middle management because their technically "pure" vision for how something should work doesn't solve the business goal.


Yeah. Seen this happen too. That usually points to a lack of (good) senior leadership in the dev team (senior developers not product/managers).


I like both.

The problem I find with surface level knowledge is you make a lot of mistakes. When I have a surface level knowledge of a piece of tech and make a decision based upon that, then talk to someone with deep knowledge it's pretty common for them to say "why didn't you just do XYZ, it's easier/more robust".

Ultimately you can produce OK or even sometimes good solutions to problems but not great ones, unless you have direct access to people that know more than you.

This becomes even more pronounced when different pieces of tech can be combined with each other.


I simply do make a lot of mistakes in most domains of my life.

I'm ADHD (diagnosed, etc.), and medication did help, but didn't solve it. For much of my life I fought against this natural tendency, but for the most part I essentially just kept telling myself "pay more attention!" It didn't work.

Compensation is important. Detecting the mistakes I make, and avoiding situations where the cost of them is very high. One part of that is accepting things I can never do, like be an academic :).

In the engineering world there are lots of means of compensating. One you mentioned: knowing people who know more than you. Another would be automated testing; at almost every workplace I've had, I have put a lot of effort into expanding it, and generally into detecting errors wherever they might occur. I'm always asking: if there's a typo somewhere, will we know, or will it go to production? Might I create a PR I think I've tested but haven't? The odds that I've done this are higher than for my peers, so I need to protect myself from that.

If I'm at a workplace that values accuracy above all else - that views being able to focus on the code and get the best (not just "good enough") solution to every problem as the main trait I'm evaluated on - I will not succeed. I'm never going to be the guru of anything. This kind of sucks, because a lot of places view being really good at those things as a prerequisite before you can move to another area (i.e., management) where you don't use those skills.

I'm a "good not great" or "move fast and break things" kind of person. My usual working style is to produce prototypes very fast; then whoever I'm making it for can see where I didn't quite read the design thoroughly and missed something, before it becomes a huge disaster.

I need the feedback of fast release cycles. I can detect errors fast, take action fast, fix them fast. But "take your time and make sure it's perfect before you throw the switch"? That's just not my personality. I'm high volume (when not artificially constrained) but error prone.

I'm smart enough to know it, smart enough to insist that my code gets reviewed, smart enough to say we really have to test for these cases, things like that. But it's absolutely something I have to compensate a lot for.


I don’t have ADHD but I can relate to feeling that deep knowledge is inaccessible to me. My issue has always been around some kind of internal optimization function that is always asking “how will this help me”?

A solution that’s worked well for me is to only study deeply when I have a concrete need.

For example, when I felt my work suffering because I didn’t deeply understand z-index, it was easy to sell my brain on reading the complete z-index spec.

One of my hobbies is music, and I’m just starting to feel the limitations imposed by not understanding music theory deeply. That means it’s a good time for me to study music theory.

Some people feel compelled to learn things. I feel compelled to do things, and if I can’t do something as well as I want to for lack of knowledge, then I know I’m ready to learn things.


Motivation is a super important factor for me. The gap between what I can accomplish when I care about something versus when I feel it is an obligation or requirement is huge. Thankfully the latter, combined with being smart enough to cover mistakes, has been good enough to keep me from failing things or getting fired, but it's a definite challenge.

Confuses the hell out of employers, too. "You did a great job on X, you really improved things, but when we put you on Y, you totally halfassed it" - because X was my idea, and Y was forced on me. I respect the (social) contract that you are my employer and I should do what I'm told; I know at least mentally I should try to excel on any task given me, but it's just so hard to invest in those kind of things.

Anyone can get demotivated; it's just that the gap for me is bigger than people are used to. I'm not blaming my employers here, to be clear - thinking that way does me no good anyway(+). While the vast majority of employers do a terrible job of recognizing what motivates their employees, I'm going to be way worse at this than the average employee, and it's up to me to compensate.

Another strategy: I learned early on ANY social context is motivating for me. So I try to schedule regular checkins, even if the org doesn't mandate them. Even a few minutes of a manager explaining the next project will yield much better work than getting a ticket thrown over the wall. Constantly engaging with others and feeling like they are personally depending on me is one of the most motivating things out there - so I have to keep reminding my bosses that YES, I do want to be in all those meetings.

I will say on the subject of learning - I mostly agree with you. I learn all the time in a huge variety of fields, and when I need to learn new tech to accomplish a task I relish it. But if I try to make myself learn something because it would push my career ahead? Force myself to devote some number of hours to a project in a new language, something like that? It's like pulling teeth. I can't get into it.

(+: In general, if the world discriminates against you, is difficult for you, etc - whether or not it's truly worse for you or a perception you have, it doesn't matter to you - what matters is how you react to it. Sitting at home crying how about how it's not fair will never benefit you.)


Thanks, as someone else with ADHD this is almost entirely spot-on with regards to myself, although I'm perhaps overly anal about the code I work on as a defence mechanism - which sometimes leads to focusing on the minutiae and missing other issues :)

> I need the feedback of fast release cycles. I can detect errors fast, take action fast, fix them fast. But "take your time and make sure it's perfect before you throw the switch"? That's just not my personality. I'm high volume (when not artificially constrained) but error prone.

So true, and it's made past jobs very hard work, most especially before I was diagnosed at age 31 and had the benefit of medication and being able to formulate better coping strategies :)


Me too... in fact, just last week I started imposing a schedule on myself - MWF: on top of things; TTh (plus weekend if available): on bottom of things. Hence here I am on HN on Monday!


Out of curiosity, what is your profession?


Software Engineer who dreams of being in some kind of management some day.


My thing is: those meetings are great, but they're pointless until we know what we're talking about...


That's an amazing quote.

Reminds me of a hermit-like wizard in his tower, in solitary, uninterrupted study of his craft.


I once used two weeks of vacation to learn category theory - that was a really fun and rich time. It really relaxed me, but it also brought me more knowledge than one year of shallowly engaging with the topic.


I've heard many times on HN people bring up category theory and how beneficial it has been to learn it, so I've recently ordered a textbook on it.

Can I ask how has it benefited your thinking?

I'm curious to know not only how it benefits an understanding of software development, programming languages and functional programming, but also has it helped you understand the world at large in new and novel ways?


If category theory is your first exposure to proofs and logic it can benefit you a lot, although number theory might be a more fun and more concrete way to get the same result. If you know a lot about more than one field of math, category theory, as a great big generalization over everything, can also help.

I should warn you that there is a social bias towards saying that you found a lot of insight in studying something, and a bias against saying that studying it was a waste of time, irrespective of whether or not the thing is actually helpful. Aristotle could stare at a rock and end up with some insight, so anyone who stares at a rock and admits it hasn't helped them has to publicly admit they are not as smart as Aristotle. They say this is how geology got started. ;)


Aristotle might have been able to stare at a rock and say something clever, but most of his ideas about science were wrong, starting from the very definition of science, which he believed should be based on unprovable first principles (like mathematical axioms).

He rejected atomic theory, believed the Earth was the center of the universe and the four fundamental elements were air, fire, water, and earth, and thought heavy objects fell faster and moving objects would tend to come to a stop if nothing interacted with them.

These were observations that Aristotle and his audience accepted as "obvious". It took more careful and patient observers to discern the less obvious truth.


It's an iterative process. Anthropocentrism is powerful in terms of what it makes "obvious": that the universe spins about the Earth; that atoms are made up of objects; that we are the pinnacle of local evolution; that reality is really real.


It's this vogueish thing among a certain set of programmers to go around saying they learned category theory. I studied category theory as a math graduate student and frankly, if your goal is simply to become a better programmer, there are much more direct and productive ways to achieve that goal.


I can certainly appreciate ways that there are more direct and productive ways to become a better programmer.

My goal is to become a better thinker in general. Specifically with regards to seeing how one type of thing relates to another type of thing.

I'm actually more interested in better understanding and reasoning about the ecology of software development than becoming a better programmer per se but only because on some level I think that's actually necessary in order to achieve the writing of better software.

To put that into English... we don't write software in a vacuum. We write it in a team, in a company, in a world full of other programmers and there is a rich set of relationships among all of those things that is exerting pressure on the final shape the software takes.

Category Theory seems like fun and useful tangent that might help me understand the world better and be mildly useful in an abstract way.


Not qualified to speak on this myself, but I asked a similar question to a friend and he said something like:

Part of good software development is knowing where and when to create abstractions. Category theory has a collection of interesting and very general abstractions - if you can spot these patterns in your software development then they will be good candidates for abstraction/generalisation


> I'm curious to know not only how it benefits an understanding of software development, programming languages and functional programming

It did and does that too (not much, since I am still a learner): it helps me write APIs that behave well; usually, that means an API whose outward-facing parts compose well. (The opposite would be an API where you configure its internal state until it does what you want - the more complex the API, the harder it gets to maintain a fitting mental model.) To be honest, it should also help me grasp the theoretical motivation behind many FP concepts like monads, but I am not yet deep enough; I would not say I have reaped a practical benefit in my Haskell writing. Reading (and implementing) "Category Theory for Programmers", which is a great blog, should take me there though.

> but also has it helped you understand the world at large in new and novel ways?

On a local scale, it was fascinating to see how you get logic and algebras by putting a little bit of additional order onto some abstract concept. Those formal logic rules I learned? Turns out they are not only funny ways to play with symbols, but also funny ways to play with arbitrary finite sets. It was also nice to see how, in category theory, you can "fix" nice properties by creating new structures where these nice properties are objects. It is all pretty hand-wavy, since I am mostly going by my intuition unless I am pressed to prove something formally. This thread might help you: https://www.reddit.com/r/haskell/comments/16i322/whats_the_i.... Globally, I am a Christian, so I see everything and everyone in relation to God anyway. Category theory might at best have helped me to identify God as the initial and terminal object in the category `Human` ;)

To give a short answer: I think CT is another tool in the box to gain a more connected view of the world, the main ingredients being age and observation of processes.


Now imagine having a year of this kind of intensive study of category theory (or similar), that is, graduate-level classes in math.


I envy those. I regret not having done my undergrad in maths! I have to say, though, that the examples taught in math-department category theory are often not that important or interesting to programmers, and the same is true of the abstractions being studied.


I did actually do graduate studies in math. Early on I went to Paris for 2 semesters where I tried to follow a course on category theory, in preparation to algebraic geometry. Even though I did lots of abstract algebra before that, the intensity was too much and I could not keep pace.

Interestingly, the Chinese students in that class did fairly well, even though they could not follow the actual lectures (given in French) and were studying exclusively from the lecture notes (given in English).


Or the scholar/monk archetype.


Stanford CS has at least a couple of people who seem to be really decent humans as well as great computer scientists. Besides Don Knuth, John Ousterhout is another person who comes to mind. Both of them have a humility combined with humor that I find quite admirable.


Really? So everyone else in CS isn't even a "decent human being" by default for you?


They were simply commenting that there were a couple people they especially admired, who they think are really decent (i.e., better than average at being decent). It is neither a necessary nor a natural reading to think that means they believe everyone else is below average.


My favorite Knuth story, attributed to Alan Kay (if you're around, would love confirmation):

When I was at Stanford with the AI project [in the late 1960s] one of the things we used to do every Thanksgiving is have a computer programming contest with people on research projects in the Bay area. The prize I think was a turkey.

[John] McCarthy used to make up the problems. The one year that Knuth entered this, he won both the fastest time getting the program running and he also won the fastest execution of the algorithm. He did it on the worst system with remote batch called the Wilbur system. And he basically beat the shit out of everyone.

And they asked him, "How could you possibly do this?" And he answered, "When I learned to program, you were lucky if you got five minutes with the machine a day. If you wanted to get the program going, it just had to be written right. So people just learned to program like it was carving stone. You sort of have to sidle up to it. That's how I learned to program."

[0] http://www.softpanorama.org/People/Knuth/index.shtml


I’ve posted this comment on HN before, but this is my best Knuth story:

In the 70s I had a co-worker, perhaps the best programmer in the department, who had gone to school with Knuth. He told me that one day in college, Knuth was using one of the available keypunch machines to punch his program onto cards. My friend was ready to punch his own program, so he stood nearby waiting for Knuth to finish. Knuth, working on a big program, offered to keypunch my friend's program before finishing his own, because my friend's program was shorter and Knuth could keypunch quite fast.

While watching over Knuth's shoulder, my friend noticed Knuth speeding up and slowing down at irregular intervals. Later he asked him about that and Knuth replied that he was fixing the bugs in my friend's Fortran as he punched it out.


> He did it on the worst system with remote batch called the Wilbur system.

I think you mean WYLBUR.

I had the "opportunity" to work with WYLBUR once in 1993, and I remember it to this day. "the worst system with remote batch" dramatically understates how bad it was. Hearing this raises Knuth even higher in my estimation.

https://en.wikipedia.org/wiki/ORVYL_and_WYLBUR


> He did it on the worst system with remote batch called the Wilbur system.

How funny, because Wylbur was a large improvement on the default at the time, MVS's TSO (Time-Sharing Option). Wylbur was miles easier and faster than TSO.

So, not actually 'the worst system with remote batch' because that trophy would go to TSO.


There's another interesting story about Knuth that's worth repeating here.

http://www.leancrew.com/all-this/2011/12/more-shell-less-egg...

The tl;dr is that Knuth wrote an elaborate implementation of a program to solve a particular problem, and Doug McIlroy replaced it entirely with a six-step shell pipeline. (Knuth's program was written using his literate programming tools, could be typeset in TeX, and involved some precise work with data structures and algorithms.)

I love this story as an example both of Knuth's genius and perspective, but also as a way to show what his level of dedication can achieve. It's an amazing intellectual accomplishment.

I also love this story as a demonstration of what those of us without that skill and dedication can achieve using the advancements built on the work of Knuth and others.


That was quite unfair criticism, and even Doug McIlroy knew it (as he admitted later). The background is this:

- Bentley, the author of the column, invited Knuth to demonstrate literate programming using a program of his choice.

- Knuth insisted that to be fair, Bentley ought to specify the program to be written; else someone might object that Knuth chose a program that would be good for literate programming.

- Bentley chose (what we'd now call) the term frequency problem (list the top k most frequent words in a text file), and accordingly Knuth wrote a system program for solving just this one particular task. (Did everything from opening the input file to formatting the output, etc.)

- Doug McIlroy was asked to “review” this program, the way works of literature are reviewed. He happens to be the inventor of Unix pipes. Towards the end of his review, along with many other points (e.g. Knuth didn't include diagrams in cases where most of us would appreciate them, something I still struggle with when reading TeX and other programs), he used the opportunity to demonstrate his own invention, a shell pipeline using now-standard Unix tools (tr, sort, uniq, sed).

There are a few things wrong with this criticism:

- The main thing is that DEK wrote the program he was asked to write, so pointing out that he shouldn't have written that program is a criticism of the one who chose the program (Bentley mentioned this when printing McIlroy's review).

- At the time, Unix wasn't even widely available outside Bell Labs and a few places; it definitely wasn't available to Knuth or most of the column's readers.

- Knuth's program, fine-tuned for the task, is more efficient than the shell pipeline.

- Even if you use a shell pipeline, someone has to write the "standard" programs that go into it (the "tr", "sort", "uniq" and "sed" above), and literate programming can be used there. In fact, Knuth did exactly that a few years later, rewriting Unix's "wc" (IIRC) and comparing the resulting program with Sun Unix's wc; his LP version had, among other things, better error handling. (He's explained it by saying that in conventional programming, if you have a small function and 90% of it is error-checking, it looks like the function is "about" error-checking, so there's a psychological resistance to doing too much of that, while with LP you move the error handling to a separate section entirely about error-checking, and then you tend to do a better job. BTW, TeX's error handling is phenomenally good IMO; the opposite of the situation with LaTeX.)

All that said, there is some valid criticism that Knuth prefers to write monolithic programs, but that works for him. He seems not to consider it a problem that to change something you have to understand more-or-less the entire program; he seems to prefer doing that anyway (he reads other people's code a lot: in Peter Seibel's Coders at Work, he was the only person interviewed who read others' programs regularly).


re: At the time, Unix wasn't even widely available outside Bell Labs and a few places; it definitely wasn't available to Knuth or most of the column's readers.

At that time, 1986, Unix was widely available - there were more than 100k Unix installations around the world by 1984. AT&T Unix Sys V and UCB 4.3bsd were available. Knuth was friendly with McIlroy, who was head of the Bell Labs computer science research group that begat Unix. Sun Microsystems was formed in 1982 - their boxes ran Unix, and Sun was a startup spun off from Stanford.


Hmm interesting; I remember checking this for the time when TeX was written (1977, because of questions about building on top of troff) -- what I remember finding is that Unix wasn't widely available at colleges then. Perhaps things changed in the next 9 years. As far as I know, Knuth's access to computers at the time was still through Stanford's AI Lab (that's why the first version of TeX in 1977-1978 was written in SAIL; see also the comment for 14 Mar 1978 in http://texdoc.net/texmf-dist/doc/generic/knuth/errata/errorl...). Do you know if Unix was installed on Stanford lab computers by 1986? What was the distribution of these 100k Unix installations (academia/industry)?


>Sun Microsystems was formed in 1982 - their boxes ran Unix, and Sun was a startup spun off from Stanford.

Right. Even their name was derived from Stanford University Network:

https://en.wikipedia.org/wiki/Sun_Microsystems#History


OK, Unix was probably available to Knuth, but the task given to Knuth was not to promote already-written programs! Had he done so, it would have been claimed that he failed to do what was requested of him.

Even today, if you got exactly the same task, with the goal of making the most efficient solution while caring about the limitations of the hardware available to you, and of producing a self-contained program (e.g. because your algorithm should run on hundreds of billions of words of input), you'd probably still end up producing something closer to what Knuth did than to what McIlroy did.

Which doesn't mean that it's not brilliant. But it's also not obvious, i.e. not something a "normal user" would "know":

- even if you knew that "the tr command translates characters to characters," did you know that you could (and must) write

   tr -cs A-Za-z '
   '

to perform the first of the six steps? What the -c does? What the -s does? That you could, and even had to, embed a literal newline in the command line? I bet a lot of Unix users of today would still not know that one.

- did you know what the fifth step, "sort -rn", was supposed to do? Would you know that you're sorting "numerically" (-n) in reverse (-r), and that it would "work"?

- "sed ${1}q" how many people even today would know that one?

And after all that, the first of the two sorts needs to sort a file that is as big as the original input! If you have hundreds of gigabytes of input, you'd need at least that much space again just to sort it. McIlroy's approach is a good one for a one-off program or not-too-big input processing, and only if you know that you can use these commands as he used them. But it's still not "a program" in the same sense that Knuth's program is.
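For readers following the flag-by-flag discussion above, here is a sketch of the whole six-step pipeline, reassembled from the commands quoted in this thread. Two assumptions of mine: I use tr's '\n' escape instead of the embedded literal newline, and I hard-code k=2 in place of the script's ${1} parameter; the sample sentence is invented for illustration.

```shell
k=2   # number of top words to print; stands in for ${1} in the original script
printf 'The quick fox and the lazy fox saw the dog\n' |
  tr -cs 'A-Za-z' '\n' |  # -c complements the set, -s squeezes: each run of non-letters becomes one newline
  tr 'A-Z' 'a-z' |        # fold upper case to lower case
  sort |                  # bring identical words together
  uniq -c |               # count each run of identical lines
  sort -rn |              # -n numeric, -r reverse: most frequent first
  sed "${k}q"             # print k lines, then quit
```

On this sample it prints the two most frequent words, "the" (count 3) and "fox" (count 2); words tied below the cutoff would come out in whatever order sort's tie-breaking happens to give.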

Knuth's algorithm would, unsurprisingly, handle huge inputs orders of magnitude more efficiently. That is what McIlroy was aware of and intentionally hand-waved away in his "critique." Read the original text:

https://www.cs.tufts.edu/~nr/cs257/archive/don-knuth/pearls-...

But the major point is still: Knuth's task was not "use the existing programs" (or libraries) but "write a program" that does what was asked to be done. The fair comparison would then include the source of all the sort, uniq, tr etc. programs which McIlroy used.

And once that is done, McIlroy's code would still be less readable, less efficient, and worse overall.

Which, on the other hand, also doesn't mean that for some purposes "worse" isn't "better": https://yosefk.com/blog/what-worse-is-better-vs-the-right-th... But for some purposes only "better" works and "worse" simply doesn't, e.g. when the scale of the problem is big enough. And Knuth teaches us how to solve such harder problems, and presents the complete solution, versus doing tricks (just call that library/executable whose implementation and real limitations I'm going to avoid explaining to you).

And given the misunderstanding of the difference between showing how something is implemented (most efficiently) and the "just use pre-written tool X" approach, I understand even more why Knuth uses assembly in his "The Art of Computer Programming" books.


> But it's also not obvious, i.e. not something a "normal user" would "know":

> - even if you knew that "tr command translates... "sed ${1}q" how many people even today would know that one?

Are you suggesting it's ever been more likely for people to understand how to manage a trie structure in Pascal than use Unix command line tools? Or look flags up in the manpages?

Personally speaking, I'm comfortable doing both, but can't imagine many scenarios where I'd rather have ten pages of a custom data structure than six lines of shell. (And they all involve either high volumes of data or situations where I can't easily get to the process-level tools.)

> The fair comparison would then include the source of all the sort, uniq, tr etc. programs which McIlroy used.

If you're including the code that provides the surface abstractions, where do you draw that line? If the code for sort, uniq, etc. is fair game, why not the code for the shell that provides the pipelining, the underlying OS, the file system, the firmware on the disk? After all, who's to say that the programs in the pipeline don't just run one after another with temporary files written to disk, rather than in parallel? (Which I've seen happen in reality.)

The same is true for the other side, of course. The 'fair comparison' could easily require Knuth's solution to include the source for weave/tangle, TeX/Metafont/CMR, the OS, etc.

> And once that is being done, McIlroy's code would still be both less readable, less efficient and worse overall.

What definition of 'worse' are you using?

* I expect sort/uniq/tr/sed to be more well tested and understood than a bespoke program.

* If there are issues with the program, it'll be easier to find skills/time to maintain a shell pipeline than custom trie-balancing code written in Pascal. (Sitting aside a prose description of the same.)

* The shell pipeline runs on a machine that can be found in a retail store today, rather than requiring an elaborate download/build process.

* It's possible that the custom solution runs faster, but not obvious without testing. (None of which is worthwhile outside of demonstrated need.)

Point being: it's very easy to find a definition of 'worse' that applies more to the custom solution than to the pipeline.


> The shell pipeline runs on a machine that can be found in a retail store today, rather than requiring an elaborate download/build process.

That argument points to the fact that your "view" of the whole topic changes the assumed definition of the problem that Knuth was given to solve. Read the original text once again: he was supposed to illustrate how "literate programming" could be used while writing a program which solves a given problem. It was definitely not "write an example of calling existing pre-written programs".

And, of course, it was all in 1986 - definitely not "targeting a machine which can be found in a retail store in 2018."

McIlroy behaved as if the goal had been different than it was.


> McIlroy already behaved as if the goal had been different than it was.

How would you feel about McIlroy's solution if it was semantically exactly the same, but written in a literate approach? (Essentially a version of 'weave/tangle', but for shell scripts.)
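For what it's worth, a "tangle for shell scripts" doesn't need much machinery. Here is a minimal, hypothetical sketch in Python; the four-space chunk convention and the `tangle` name are invented for the example (Knuth's real WEB supports named, reorderable chunks and typesets the prose via weave):

```python
def tangle(literate_text):
    """Extract code from a literate document.

    Toy convention for this sketch: lines indented four spaces are
    code chunks, everything else is prose. A real weave/tangle would
    also typeset the prose and allow named, reordered chunks.
    """
    chunks = []
    for line in literate_text.splitlines():
        if line.startswith("    "):
            chunks.append(line[4:])
    return "\n".join(chunks)

doc = """\
First, squeeze every run of non-letters into one newline, so the
input becomes one word per line, then fold case:

    tr -cs A-Za-z '\\n' |
    tr A-Z a-z |

Then count the words, rank them, and keep the first $1:

    sort | uniq -c | sort -rn | sed ${1}q
"""

print(tangle(doc))
```

Running `tangle` on the document prints just the pipeline, ready to pipe into `sh`; the prose stays behind for the "weave" side.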


How would you feel if somebody presented "six clicks in Excel and SQL Server" that eventually produce the same result? The starting goal was simply not "show how to use and combine external programs," even if combining them takes some skill. It's exactly the same kind of missed fulfillment of the given starting task.


The reason I asked about a literate programming version of the shell script is that it speaks directly to Knuth's original stated goal: "I’ll try to prove the merits of literate programming by finding the best possible solution to whatever problem you pose"

In the context of that requirement, it's the use of literate programming that's more of a concern than the specific implementation. (Which is why I asked about a literate version of the shell pipeline.)

Earlier in the thread, you also mention this concern around data volumes:

> If you have hundreds of gigabytes of input, you'd have to have at least that much more just to sort it. McIlroy's approach is a good one for one-off program or not too big input processing,

There, your concern is not justified by the stated requirements of the problem: "I did impose an efficiency constraint: a user should be able to find the 100 most frequent words in a twenty-page technical paper"

I do think McIlroy failed to solve the problem of demonstrating the value of literate programming, but I'm not sympathetic to arguments that he should've used more complex algorithms or relied on less library code. This is particularly the case when the additional complexity is only relevant in cases that don't fall into the given requirements.

(A literate program that uses SQL server or Excel might be an interesting read....)


> The reason I asked about a literate programming version of the shell script is that it speaks directly to Knuth's original stated goal: "I’ll try to prove the merits of literate programming by finding the best possible solution to whatever problem you pose" In the context of that requirement, it's the use of literate programming that's more of a concern than the specific implementation.

And McIlroy's "solution" is provably not the "best possible solution" if you are interested in the algorithms, algorithmic complexity, the resources used, you know, all the topics studied by people doing computer science. All these topics are still relevant today.

That is the domain that was of interest to both Bentley and Knuth, and McIlroy "sabotaged" the whole exercise by presenting effectively only a list of calls to stand-alone programs which he hadn't developed himself and which he avoided presenting. Even without looking at them, just by analyzing the best possible implementations of these programs, every student of computer science can prove that McIlroy's solution is worse.

If you carefully read the original text (and if you understand the topics of computer science), you can recognize that McIlroy was aware of the algorithmic superiority of Knuth's solution.

Knuth definitely and provably won.
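For concreteness, the complexity gap being argued about can be sketched in a few lines of Python. This is illustrative only: Knuth's actual program used a custom hash trie in Pascal, and the pipeline's `sort` is an external merge sort, not a library call.

```python
import heapq
from collections import Counter

def top_k_sort(words, k):
    # Analogous to `sort | uniq -c | sort -rn`: after counting,
    # comparison-sort all m distinct words -- and the pipeline's
    # first `sort` is O(n log n) over all n input words.
    counts = Counter(words)
    return sorted(counts.items(), key=lambda kv: -kv[1])[:k]

def top_k_select(words, k):
    # Closer in spirit to Knuth's single-pass counting (he used a
    # custom hash trie, not Python's hash table): O(n) to count,
    # then O(m log k) to select the top k of m distinct words.
    counts = Counter(words)
    return heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])

sample = "the quick the lazy the dog dog".split()
print(top_k_sort(sample, 2))    # [('the', 3), ('dog', 2)]
print(top_k_select(sample, 2))  # [('the', 3), ('dog', 2)]
```

Both give the same answer; the difference only shows up asymptotically, which is why the "twenty-page technical paper" constraint matters so much to this debate.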


While I have been watching this subthread with increasing dread, I feel I should point out it was not a competition or contest to be "won" -- Knuth wrote an interesting program, and McIlroy wrote an interesting review of it.


> it was not a competition or contest to be "won"

Sure, it wasn't a competition. The fact remains: McIlroy criticized Knuth's presentation of complete algorithms which effectively solved the specified problem, by presenting just a sequence of calls to implementations that provably must use worse algorithms. In an ACM column whose topic was... algorithms, edited by Jon Bentley.

So if you consider that two sides presented their arguments regarding the algorithms related to the specific problem, we can still say that Knuth "won" that "dispute."


> presenting just a sequence of calls ... that must implement provably worse algorithms.

You've never really established why this matters given that the goal of the challenge was to present the value of literate programming.

The goal wasn't an optimal algorithm or minimal resource consumption - the goal was to demonstrate the value of literate programming on a small data processing problem.

This is a very different problem than writing an optimal or highly scalable algorithm.


> The goal wasn't an optimal algorithm or minimal resource consumption - the goal was to demonstrate the value of literate programming on a small data processing problem.

No. You obviously still haven't read the column and the article that explained what the goal actually was. The actual goal was to demonstrate Knuth's WEB system, which was explicitly made only for Pascal in 1986. That was what Bentley asked Knuth to do (quoting from the article "Programming pearls: literate programming", Communications of the ACM, Volume 29, Issue 5, May 1986, pages 364-369):

"for the first time, somebody was proud enough of a substantial piece of code to publish it for public viewing, in a way that is inviting to read. I was so fascinated that I wrote Knuth a letter, asking whether he had any spare programs handy that I might publish as a “Programming Pearl.” But that was too easy for Knuth. He responded, “Why should you let me choose the program? My claim is that programming is an artistic endeavor and that the WEB system gives me the best way to write beautiful programs. Therefore I should be able to meet a stiffer test: I should be able to write a superliterate program that will be noticeably better than an ordinary one, whatever the topic. So how about this: You tell me what sort of program you want me to write, and I’ll try to prove the merits of literate programming by finding the best possible solution to whatever problem you pose--at least the best by current standards.”"

So, in the context of demonstrating Knuth's WEB on a substantial piece of code, the only modification of the request was that Knuth wasn't allowed to use a program he had already written: he had to write a new one! (So in effectively all its premises, the starting goal was exactly the opposite of what McIlroy then showed!)

So the goal was to write and present a wholly new program in Knuth's WEB which, under the standards of evaluation widely accepted by computer scientists, would be "the best solution." Which is exactly about the optimality of the algorithms, resource use, etc.

If you still don't fully appreciate the context of Knuth's program, do search for all the other columns and computer science books written by both Bentley and Knuth -- the topic of both was never "how to use existing programs" but how to develop and use the best algorithms.


Thanks for the response.

> You obviously still haven't read the column and the article that explained what the goal actually was.

> do search for all the other columns and computer science books written by both Bentley and Knuth

Since this is public, I'll conclude by noting here that I had indeed read both of the articles, and a bunch of other text by both Bentley and Knuth besides. (In fact, the Programming Pearls books are particularly high on my list of recommendations...)


Then what I can conclude is that you've read the articles but ignored their content; otherwise you would not present here a false interpretation of what Knuth's specific task in these specific articles explicitly was.


Well put.

I think it's possible and desirable to simultaneously respect both approaches for their respective merits. Maybe ironically for this profession, the choice isn't either/or.


Absolutely agree - Unix was ubiquitous in academia by 1986. I myself learned C on a VAX running BSD 4.2 in 1986.


I never saw that as a criticism.

Just an interesting counterpoint.


To his great credit, Knuth included McIlroy's critique, in full and without response, when he published Literate Programming. Knuth also conceded in an earlier chapter that literate programming was probably a crazy idea and that promoting it was perhaps an act of madness! It's delightful that we live in a world where there's room both for industrial-strength Fabergé eggs and hack-it-together pipelines. :)


McIlroy's critique is both astute on its own and yet perhaps not relevant in the context of what Knuth had been asked to do, namely present an example of literate programming. That's not to say that McIlroy's script couldn't have a literate form, only that it wouldn't serve as a good exemplar.

It all serves as a good reminder of how Knuth's work should be used & viewed: It shouldn't be taken as insight into how to manage and develop for software projects. It's more useful as teaching tools for how to think about solving problems. In my mind it's the difference between "pure" science and practical engineering, though I'm not sure that's a perfect analogy.

To try another analogy, it's as if Knuth was asked to make a custom set of clothes, and McIlroy's critique amounts to saying "Not everyone can afford custom-made clothes, and the Gap produces perfectly serviceable clothes that are far more practical in most situations." It's correct, sure, but rather beside the point in the context of the original request.


> perhaps not relevant in the context of what Knuth had been asked to do, namely present an example of literate programming.

Agreed... if that was the goal, then McIlroy's example was terrible (in that it didn't use literate programming at all).


That's a good one. It's one of the classic examples of the Unix philosophy and how pipelines can be powerful, although of course not applicable in all cases, and they have their issues too - which often becomes an argument on HN (about text streams vs. object pipes a la PowerShell, etc.).

I had seen it a while ago and blogged about it here, with solutions (approx.) in Python and shell:

The Bentley-Knuth problem and solutions:

https://jugad2.blogspot.com/2012/07/the-bentley-knuth-proble...

Also, previous HN thread about it here:

Revisiting Knuth and McIlroy's word count programs (2011) (franklinchen.com)

https://news.ycombinator.com/item?id=15264997


There is also the book "Exercises in Programming Style" that uses the term frequency problem to illustrate 33 different programming paradigms (including a shell solution).

https://www.amazon.com/Exercises-Programming-Style-Cristina-...


Cool. Will check that out.


I love that story, although I think the takeaway needs to be slightly updated. It's quite possible that a sorted word count was easiest to do with bash and unix utilities in 1992. Now, it's easier and more comprehensible in your favorite scripting language (which I'm assuming isn't bash).

The real lesson is to program using the most powerful tools at your disposal.


That's a good lesson, but we shouldn't stop there. Knuth is a computer scientist, but also an artist: writer, musician, typographer, and creative programmer. His insights into programming-as-literature have infused an often-soulless industry with something soulful. He began with the popular notion that programming should be a creative, playful act, and tried to elevate mere playfulness into the realm of fine art. An incredible detour in an amazing career, and an enriching lesson of a different kind.


And a theologian; he has written works on divinity, etc.


> It's quite possible that a sorted word count was easiest to do with bash and unix utilities in 1992. Now, it's easier and more comprehensible in your favorite scripting language

It'd be interesting to see how true this is in reality. I'm not at all convinced that most scripting languages can do this quite as concisely as bash and the Unix userland. (But, being honest, this problem seems very well aligned to that tooling's strengths.)


It is easy to do in your favorite scripting language. But is unlikely to be as short. Comprehensibility is in the eye of the beholder.

Here is a Perl solution for comparison.

    use strict;
    use warnings;

    my $limit = shift @ARGV;

    my %count;
    for my $line (<>) {
        while ($line =~ /(\w+)/g) {
            $count{lc $1}++;
        }
    }

    my @words = sort {$count{$b} <=> $count{$a} or $a cmp $b} keys %count;
    $limit = @words if @words < $limit;
    print "$_\t$count{$_}\n" for @words[0..($limit-1)];


Just for reference: a Perl 6 version:

    sub MAIN($limit = Inf) {
        my %bag is Bag = words.map(&lc);
        say "{.key}\t{.value}" for %bag.sort( {
          $^b.value cmp $^a.value || $^a.key cmp $^b.key
        } )[^$limit]
    }
And for what it's worth, the `words` function is lazy, so it won't read all of the words into memory first.


I would have written the sort differently

    %bag.sort({ -.value, .key })


Python is relatively short, batteries included and all that:

  import re
  import sys
  import collections
  
  words = re.findall('[a-z]+', sys.stdin.read().lower())
  c = collections.Counter(words)
  for word, count in c.most_common(int(sys.argv[1])):
      print(f'{count:>7} {word}')
Three times the character count of the Unix pipe version, but IMHO a lot more readable (and generalisable).


You don't want to read all input into memory first.


Fair enough.

  import re
  import sys
  import collections
  
  c = collections.Counter()
  for line in sys.stdin:
      words = re.findall('[a-z]+', line.lower())
      c.update(words)
  for word, count in c.most_common(int(sys.argv[1])):
      print(f'{count:>7} {word}')


Well, what is easy or comprehensible isn't necessarily concise. Bash is great for code golf but I know few people who prefer to code in it.


And for completeness, a Ruby implementation:

  puts STDIN.read
    .scan(/[A-Za-z]+/)
    .group_by(&:downcase).transform_values(&:size)
    .sort_by(&:last).reverse
    .take(ARGV[0].to_i)
    .map{ |l| l.join("\t") }.join("\n")
Wordier than the bash script, but still pretty short and fairly trivial. Anyone could make this in five minutes.


There is no bash (or any scripting) involved; the complete command from the article is:

  tr -cs A-Za-z '\n' |
  tr A-Z a-z |
  sort |
  uniq -c |
  sort -rn |
  sed ${1}q
The real benefit is that each is a single command that is easy to test in isolation and it's multi-process. That's not possible in most scripting languages.
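That stage-by-stage testability can be approximated in a scripting language by composing small functions, one per pipeline stage. A rough Python sketch (single-process, so it loses the free parallelism; the function names are invented for the example):

```python
import re
from collections import Counter

# Each function mirrors one stage of the pipeline, so each can be
# tested in isolation, much as each command can be run alone.

def words_only(lines):                 # tr -cs A-Za-z '\n'
    for line in lines:
        yield from re.findall('[A-Za-z]+', line)

def lowercase(words):                  # tr A-Z a-z
    return (w.lower() for w in words)

def count(words):                      # sort | uniq -c
    return Counter(words)

def top(counts, k):                    # sort -rn | sed ${k}q
    return counts.most_common(k)

lines = ["The dog; the DOG!", "A lazy dog."]
print(top(count(lowercase(words_only(lines))), 2))  # [('dog', 3), ('the', 2)]
```

The generators stream line by line, so this also avoids reading the whole input into memory, though `Counter` still holds every distinct word, just like `sort` holds every word.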


> There is no bash (or any scripting) involved

What provides the pipe functionality?


McIlroy literally calls it a script in his review (notice the ${1}).

Nothing prevents you from unit testing in scripting languages. Multi-process, sure, but most people aren't looking for that.


> McIlroy literally calls it a script in his review (notice the ${1}).

Ok, there's a single shell substitution, if it was fixed would you still call it a script? Technically the result of that is itself a sed script "3q", but if you count either of those then there isn't a lot of wiggle room between script and command, the arguments to tr are by far the most complex "script" involved.

> Nothing prevents you from unit testing in scripting languages.

That is a world away from what I'm talking about. Each line of that command can be executed on the CLI in isolation, you'd be replicating a lot more in nearly any scripting language, except maybe perl and awk.

> Multi-process, sure, but most people aren't looking for that.

Neither am I generally, still quite nice when you get it for free though.


You're really splitting hairs. You can execute a Python command in a REPL. There's little material difference between scripts and commands for our purposes. And scripting languages provide facilities to test functions in isolation.


> He has fashioned a sort of industrial-strength Fabergé egg—intricate, wonderfully worked, refined beyond all ordinary desires, a museum piece from the start.

Curiously, I independently came up with this metaphor in my review of The MMIX Supplement to The Art of Computer Programming [1]:

"You can marvel at its intricacies like one marvels at a Fabergé egg."

[1] https://www.amazon.com/gp/customer-reviews/R25NHN9UP7PLJI/


Yes, I love this story too. It really demonstrates so much. And oh, the hackles it raises. :)


> And oh, the hackles it raises. :)

The main reason I like the story is that it's good fodder for discussion about priorities in software development. After all, counting words isn't exactly an interesting problem in 2018, but deciding the effective use of libraries and programmer time is still very interesting.


Probably my favourite is along similar lines: read “The Summer Of 1960 (Time Spent with don knuth)” for a great story of Knuth as “this tall college kid” (consider the fact that Knuth first encountered a computer at college), from “Stories about the B5000 And People Who Were There”: http://ed-thelen.org/comp-hist/B5000-AlgolRWaychoff.html#7 (PDF scan: https://www.computerhistory.org/collections/catalog/10272464... page 7)

And about the story you shared, Knuth has elaborated in an interview (with typical humility, he plays it down): http://www.informit.com/articles/article.aspx?p=1193856


Most people who learned to program on batch-oriented systems developed those habits. When you had to wait hours or even overnight for your compile to run, you spent a lot of time at your desk checking the code for syntax errors, and running the logic in your head.


Knuth has said that the story as told above is apocryphal. I quote from here (http://www.informit.com/articles/article.aspx?p=1193856):

> Donald: The story you heard is typical of legends that are based on only a small kernel of truth. Here’s what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth’s ALGOL W system (the predecessor of Pascal). My program didn’t work the first time, but fortunately I could use Ed Satterthwaite’s excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn’t get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn’t a fair contest.

> As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."


I once had to write a program for a class but didn't have access to a compiler. I wrote it all in Notepad and double-checked everything more thoroughly than I ever would have done otherwise. When I got to the computer lab, the compiler found one or two typos and the program ran perfectly the first time. The experience changed how I write code. It's worth trying at least once.


When I was a kid, I learned Pascal from a book. We didn't even have a PC at home (it was the 80s) but my mother was a college professor and had access to the mainframe there. I wrote out a whole program to run Conway's Game of Life on a bunch of notebook paper, and one day when I was off from school I came in and typed the whole thing in at a terminal. I ran the command to compile it, and it spit out hundreds of syntax errors. Then it was time to go home.

It was not an auspicious start to my career in computer science.


That's how I learned C. I bought a copy of K&R, but didn't have access to a Unix system with a C compiler for about a month (when my new job started). I read the book and wrote out all the exercises in a notebook.

When it came time to try my code, I spent a while correcting some basic misconceptions that I had developed but that hadn't been corrected by an actual compiler.


> It was not an auspicious start to my career in computer science.

I'd argue the opposite!

Learning how to "run" code in your head, cover edge cases and invariants, is a great skill to practice. When I was at a hospital after a difficult operation, I used to write down pages and pages of C64 programs in my notebook, too :-)

Even if I never typed them all out later, I like to think this made me a better programmer. It trains your brain in a way that a REPL doesn't.


I sometimes worry that I've become REPL-dependent.


I used to sit in my grade 12 English class working out polygon fill-and-shade routines on graph paper instead of whatever nonsense Shakespeare we were going through at the time.

Worked out really well for me - I'm employed well as a programmer, and I got to come to Shakespeare basically fresh when I was old enough to actually appreciate the work.


How old is that? I'm over 50 and I still can't get into Shakespeare. Watching a performance is OK, though not something I'd choose to do on my own, but reading the plays? Can't get into it at all.


Have you tried an annotated one? Shakespeare is full of, essentially, in-jokes and memes from the 1500s. There is a ton of depth in his writing that you'll totally miss if you don't know all of that context.


Also, lots of bawdy humor that gets glossed over by staid and respectable modern productions. For example: the "bite your thumb" gesture that gets used in a few plays is pretty much the equivalent of somebody from today giving a double middle finger while sticking their tongue out.


Try watching a movie version - I really like Branagh's Hamlet and I never expected to (long though)


Try the "Sonetts" instead of the plays!


Just program.


I originally learned to program in 6502 machine code. Typically I would write code in spiral notebooks while watching the late late late show and hex-keypad it in afterwards. Made it a little easier to edit things that way...


To me, this is one way to show evidence of mastery of an engineering discipline: The ability to do it once and have it come out right. If you asked a modern software engineer who's used to fast modern tools, they'd tell you this was impossible. Everyone's pretty much settled into this kind of workflow:

Write -> Compile -> Fix Compile Errors -> Compile -> Fix Compile Errors -> Compile -> Fix Compile Errors -> Compile -> Run -> Fix Runtime Bugs -> Compile -> Run -> Fix Runtime Bugs -> Compile -> Run -> Fix Runtime Bugs -> Compile -> Run -> Fix Runtime Bugs -> Compile -> Run -> Fix Runtime Bugs -> Git Push and Pray

With maybe a few "Write Unit Tests" steps intermixed. Totally different mentality. Is it really a faster or more productive way to work?


Yes. A real feedback loop is better than a mental one in your head. “Turn off your targeting computer and use the force, Luke” only works if you hone a lot of mental power into it (like Knuth), but it would have also discouraged a lot more people without that ability from programming at all.

In the past you had to be Hawkeye to program, now most of us just put on our Ironman suit and get on with it.


Nevertheless, the more complete and accurate your picture of what you are attempting to achieve, and of where you are (at every level of abstraction), the fewer iterations it will take to get done, or the fewer bugs will remain after a given elapsed time.

It's like situational awareness: https://en.wikipedia.org/wiki/Situation_awareness


The number one thing I try and build up for my teams is a mental picture of the entire system. It is amazing, to me, how much resistance I get to the idea.


The difference is other people.

Yes, if you're writing your -own- code, learning to envision what the system must do in your head first is very, very valuable.

When you're on a team, however, you -have- to compartmentalize. You have to create abstractions, black boxes, functionalities that you -could- dive into, but which you will accept only understanding at a contract (interface) level. The skill of envisioning the system is still useful, of course, but there will be black boxes.

The problem that causes, of course, is that every abstraction is leaky. You didn't know that calling this method also fired off this event, and so you -also- triggered that event. Or whatever. Hence, bugs and iteration. You also have to deal with -bad- interfaces, a bajillion parameters, or a parameter object that isn't intuitively obvious how to initialize, and you start having to iterate.


The problem is not just, or mainly, leaky abstractions - you can't make a system out of a collection of opaque abstractions. An important part of Knuth's genius is seeing deeply and clearly how they have to interact. Leaky abstractions are just one sort of problem that arises when this doesn't happen.


If we want to go there, modern tooling allows you to forget some of the details you used to care about and focus more on other things, the tooling is rarely comprehensive and there will always be more to consider until the entire programming process is automated.


As modern tooling chips away at the accidental difficulties of software development, systems-level situational awareness becomes relatively more central to the process.


Sort of. I've seen these productivity-enhancing tools used by people who have absolutely no clue what those tools are doing on their behalf produce some nightmares that sort of work sometimes. To get geekier with the analogy, a lot of people are like spiderman (the Tom Holland spiderman) who need to learn to use the powers of their suit before they're given access to the more powerful, more dangerous stuff.


Put an untrained person in a bulldozer and there will be a lot of damage.


>In the past you had to be Hawkeye to program, now most of us just put on our Ironman suit and get on with it.

This is an amazing metaphor, thank you for sharing!


It is much more impressive that Knuth came up with and wrote TeX than that it was bug-free or written all in his head.

Computer time is no longer at a premium and bugs can be fixed.

Useful, creative, and clear solutions usually trump perfectly correct ones, except in special cases such as when life is on the line or you're writing programs to fly to the moon where getting it right the first time is of utmost importance.


"Useful, creative, and clear solutions usually trump perfectly correct ones"

Actually I've found that the most useful solutions are those that are creative and clear.

But yeah, it totally depends what one is doing. Usually if the algorithm is mission critical and the component is mission critical, you can weasel out enough time to make it right (so you don't have to return to it ever again, and you get a reputation as a guy whose code just works).


In some domains it is, for example you won't create fun arcade game without iterating 1000s of times on the basics (control scheme, physics, etc). Often it isn't.

I learnt programming writing small games, so quick iteration is my first instinct, and I can see myself sometimes jumping to write code too quickly.


If Knuth worked on a large code base (e.g. Windows), I'm sure he'd have the exact same workflow.


Do you really encounter compiler errors that frequently? The compiler stopped barking at me for the most part after my first year or two.


During big refactorings, I admit I sometimes find myself relying on compiler errors as a crutch to find (for example) which API layer I haven’t added the parameter to yet.

I’ll also rely heavily on my IDE’s function signature completion, to remind me whether some method I’m calling takes an int or an unsigned int, rather than have it memorized like I used to.

This might be why a lot of people (including me) hate whiteboard-coding interviews: we’ve gotten so spoiled by our tools that we can’t code without them!


> During big refactorings, I admit I sometimes find myself relying on compiler errors as a crutch to find (for example) which API layer I haven’t added the parameter to yet.

That's not a 'crutch' - it's literally what those compiler errors are for! The alternative would be for the compiler to do something non-sensical, which would error out at runtime.

And when it comes to white-board coding, you should arguably be using pseudo-code anyway - your goal is then not to come up with something that will run, but to convince your interviewer that the code is 'morally' correct and that any subsequent fixes are well within your skill level.


My biggest problem with whiteboard coding is writing text in straight lines. You never realize how much of a liability being left handed is until someone asks you to do whiteboard coding (ya, many lefties train for this, but some don’t).

Computers have been a godsend for my penmanship. On the other hand, I guess I relied on them too much as a kid.


I thought lefties learn to read and write backwards and then mirror the final result before submission.


No, we definitely do it forwards; fountain pens and cursive used to be a serious problem, as is having your hand over what you've just written, so I write "overhand".

(To see this, do a 45 degree '/' with your right-hand pen, then leave the tip in the middle and make your left hand into a mirror image in the plane of the '/')


My dad was left-handed. He'd use ball-point pens that didn't smear, and when he was taking notes he'd flip the notebook so that the binding was on the right.


Pens are the worst, they konk out so quickly! iPad pros are really useful here. I wish my hands were transparent, occlusion is also a real problem (it is hard to write a straight line of text if you can't see what has already been written).


This might be why a lot of people (including me) hate whiteboard-coding interviews: we’ve gotten so spoiled by our tools that we can’t code without them!

Isn't that like someone saying they've gotten so spoiled by training wheels, they can't ride a bike without them? It's not like I'm one to talk 100% of the time. I think I couldn't get my company's project to build, without a few weeks of reading and understanding a lot more of the build system. However, I've also coded on an 8 bit machine by flipping 8 switches and pressing a commit button for each byte.

It's a worthwhile exercise to do some coding with nothing but a text editor and a debugger once in a while. That isn't going 100% to the bare metal, but it's a level that's very worthwhile for working on basic skills. Entire programming education books have been based on this idea.


The difference is that training wheels are intended to be a temporary assistance until you've learned to operate without them, while development tools are meant to be a productivity boost. If you're working with a toolchain and aren't leaning on it, you're not working to your full potential.

Whether being able to work without the chain is also important is an independent issue. (I happen to prefer vim+scripting languages over IDEs+compiled. But I recognize it as a personal preference, and not a question of moral superiority.)


The difference is that training wheels are intended to be a temporary assistance until you've learned to operate without them, while development tools are meant to be a productivity boost.

Right, but pro cyclists don't say they couldn't do it without piece of equipment X. X is just a performance boost. Some coders say they couldn't practically do it at all without X. There would be a big difference in fitness between a commuter being unable to make a certain trip without a motor assist, and a rider who could do the same trip without the motor assist. I think most would look askance at a "pro" in the 1st category.

If you're working with a toolchain and aren't leaning on it, you're not working to your full potential.

Yes, but you need to be wary that you're using the toolchain for its intended purpose. The toolchain is supposed to be saving you typing and lookup time. It's not supposed to be substituting for your actual understanding of the code. The former is a good thing, and you should be good at using the tool for that. The latter is a bad thing, and you shouldn't be doing that. By working out with nothing but an editor sometimes, you can work out in a way that guards against that.

No football player plays actual games running through tires, but the exercise is apparently helpful.

But I recognize it as a personal preference, and not a question of moral superiority.

It's not moral superiority. It's using tools as intended and not substituting for understanding.


It isn't that simple.

A lot of what compilers catch are careless errors like mistyped variable names and missing semi-colons. This is not fundamental understanding of the code. This is a level of care that it is perfectly fine to let your toolchain handle for you.

A more interesting case is type checking. One of the reasons to use static type checking is that you are free to change the type of a variable or argument to a function, and then let your compiler tell you what needs to be fixed. Each individual fix is straightforward and easy - you understand it. But thanks to the toolchain you don't have to find them.
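In Python terms, that workflow might look like this (the function names are made up, and it assumes an external checker such as mypy, which isn't shown running):

```python
# Suppose we change load_user's parameter from an int ID to a str ID.
# The runtime doesn't complain immediately, but a static checker walks
# every call site for us and reports the one still passing an int.

def load_user(user_id: str) -> dict:  # was: user_id: int
    return {"id": user_id}

def handle_request() -> dict:
    # A checker would flag this call: the argument is still an int.
    # At runtime Python happily accepts it, which is exactly the point:
    # without the toolchain, nothing finds this for you.
    return load_user(42)
```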

Choosing to work with dynamic typing and exercise vigilance is a reasonable choice. (In fact this is what I choose.) Choosing to work with static typing and let the toolchain help you in ways that it was designed to help is a reasonable choice. But choosing to work with static typing and refusing to take advantage of how it lets a toolchain help you is not a choice that makes sense to me.

That you should know how to operate without the tools is one thing - I strongly support it. But maintaining your practice in doing so is quite another.


Choosing to work with dynamic typing and exercise vigilance is a reasonable choice. (In fact this is what I choose.)

The biggest chunk of my professional work is still in Smalltalk.

But choosing to work with static typing and refusing to take advantage of how it lets a toolchain help you is not a choice that makes sense to me.

Yet despite being a Smalltalker for years, I'm still an advocate of static type annotation. It enables more refactoring than you could do without it, and it lets you find out sooner about incorrectly written code.

That you should know how to operate without the tools is one thing - I strongly support it. But maintaining your practice in doing so is quite another.

The "maintaining your practice" I'm advocating in this thread is merely: "Code without an IDE once in awhile to make sure you know how to operate without the tools." You don't have to fire drill every day. There is some benefit to doing it once in awhile, however.

I'm not advocating not using an IDE. I'm merely advocating for knowing exactly what it is the IDE is doing for you and doing to you. It's just wise practice for any professional power tool.


Yes, but you need to be wary that you're using the toolchain for its intended purpose. The toolchain is supposed to be saving you typing and lookup time. It's not supposed to be substituting for your actual understanding of the code.

The problem is, I don't think anyone completely understands the code when they write it. 90% of the time, you're writing code with a library written by someone else and you have an abstract understanding of what it does. That's the whole point of code: to abstract away as much detail as possible, so you don't have to know what the machine-code equivalent of X CPU's add instruction is when coding a website, for example. You just type + and the compiler/interpreter does everything for you.


The problem is, I don't think anyone completely understands the code when they write it. 90% of the time, you're writing code with a library written by someone else and you have an abstract understanding of what it does.

The trick is this: Do you actually have that good abstract working understanding, or have you only convinced yourself? This is the difference between sloppily convincing yourself you understand a word salad, or being able to coherently teach a concept. It's even a further step to be able to understand the specification of something in enough detail to be able to implement it and to see potential pitfalls. (This is the difference between real science and cargo cult science: Predictive Power.)

You just type + and the compiler/interpreter does everything for you.

There's a world of difference between just typing "+" or "/" because you've seen it and just going on token frequency/pattern matching, and really understanding the concept.

    X = Y + Z
Is often going to be quite different in control flow consequences from

    X = Y / Z
If Z happens to be zero.
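A minimal Python illustration of that difference (the function names are made up):

```python
def add(y, z):
    return y + z  # never raises for ordinary numbers

def div(y, z):
    try:
        return y / z
    except ZeroDivisionError:
        return None  # division takes a control-flow path addition never does

result_add = add(6, 0)  # 6: nothing special about a zero operand
result_div = div(6, 0)  # None: the exception path ran
```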

If one is a careful programmer who has done substantive work, one should know there's a world of difference between a specification of a program that sounds good on the surface and a really good specification. After many years, one will have encountered many specifications which had to be re-thought one or more times to be practically implemented.

No, you shouldn't have to always rewire your program in hardware from NAND gates you make yourself from silicon in the bucket of sand under your desk. (I've actually quit a job because the manager was going overboard with that attitude.) But you should be able to peek under the next level of abstraction, and have enough working knowledge to become wary and know when you should be peeking. Either you can do this, or you do not have that level of skill/knowledge. Simple as that.

(Addendum: If you think to yourself that you can't do that, there are two common reactions. Either you make excuses to yourself, denigrate the skill, and don't bother, or you roll up your sleeves and learn it for yourself. You choose.)


There's a world of difference between just typing "+" or "/" because you've seen it and just going on token frequency/pattern matching, and really understanding the concept.

    X = Y + Z
Is often going to be quite different in control flow consequences from X = Y / Z If Z happens to be zero.

Obviously, but that is more of a concept in understanding mathematics than writing and understanding code. For example, do you need to know that Y / 0 returns a custom exception inherited from several parent classes under the parent Exception? Are you really thinking about all that when you code? Or is it mostly irrelevant, and you just need to know that an error occurs and that you need to be mindful of it (regardless of whether the error comes in the form of an exception, error code, hardware interrupt, etc.)?
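In Python, for instance, the exception really is nested several levels deep, and the point stands: you can handle it without ever caring about the leaf class:

```python
# The hierarchy is ZeroDivisionError -> ArithmeticError -> Exception
# -> BaseException:
print(ZeroDivisionError.__mro__)

# A caller who only needs "an error occurs here" can catch a parent
# class and stay ignorant of the exact leaf type:
try:
    1 / 0
except ArithmeticError:
    handled = True
```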

APIs are written specifically to avoid needing to peek under the code, and you should only really need to if they are poorly documented. Even then, you seem to think only the next level of abstraction is warranted and not the others (i.e. IL/x86 assembly or machine code). That next level of abstraction can yield important lessons in code optimization, because what you write in a higher-level language can be implemented in multiple ways at a lower one (sometimes to the detriment of performance).

Either way, most coding is a black box exercise. While looking under the hood is useful and informative at times, nobody has the brain capacity nor time to absorb it all and apply it. Which is why the smart people built upon other smart people to put a model in place that can be applied without knowing what machine instructions your computer spits out after compilation, or without knowing the implementation details of how you get a list of a specific type or how it adds/removes/copies/etc. Same as science. You don't need to do an experiment every time to know the motion of the planets, measuring their positions in the sky each time to calculate the orbits and deriving the equations through calculus. You simply skip to the step of Newtonian mechanics and kinematic equations.

Imagine if everyone had to learn how an engine worked in order to operate a vehicle. Hardly anyone would be able to drive.


Obviously, but that is more of a concept in understanding mathematics than writing and understanding code. For example, do you need to know that Y / 0 returns a custom exception inherited from several parent classes under the parent Exception? Are you really thinking about all that when you code?

Indeed. In a conversation like this, the mention of divide by zero is just supposed to evoke all of that for an audience of programmers.

APIs are written specifically to avoid needing to peek under the code

Any product is designed to be simply used. The difference between an end user and a pro is that the pro can sometimes go a little further and sometimes needs to because they can push the product harder.

Either way, most coding is a black box exercise.

As is most professional activity of any kind. Most of any job is kind of routine. That's why they call it "routine." What distinguishes the consummate pro is the ability to go beyond when needed.

While looking under the hood is useful and informative at times, nobody has the brain capacity nor time to absorb it all and apply it. Which is why the smart people built upon other smart people

Being a smart person means taking into account context and getting the best cost/benefit. Nobody who is "smart" would advocate knowing absolutely everything about everything, all the time. That's clearly a straw man. (Perhaps you are pressing things in a certain direction?) It's also clearly not the position I'm advocating for. Likewise, nobody who's smart would simply advocate for ignorance. Not even the smartest people are infallible. Smart people are simply prepared for when things go wrong.

Imagine if everyone had to learn how an engine worked in order to operate a vehicle. Hardly anyone would be able to drive.

Funny you should mention this, but I was about to bring up that analogy, then decided to leave it off. I guess you're indicating it should be brought up. A typical driver doesn't need to know much about their engine. However, a professional driver of one of several different types is very well served by some knowledge of engines. Such knowledge isn't needed all the time, but when it is needed, the potential costs of not knowing can be quite high. You could lose a race, lose money, or lose a life.


The problem is, what you define as a pro is completely vague. There's no adequate definition of one; it's simply presumed that a pro "knows what to do in X situation in X domain," which is almost no different from "knowing what to do in every situation that resides in X domain." There's literally no difference; it's that vague.

Professional drivers really don't need to know much outside of the behavior of what they experience while driving. This is why there are things like pit crews and staff that support the driver: so the driver has to think about driving, not about what fuel-to-air mixture is adequate to prevent piston knock.

This is the problem: nobody adequately defines what a professional is, and it can only be seen when "someone knows what to do," which implies having broad, wide-scoping knowledge about a topic. Again, that is supposed to be something that isn't required, and it's the whole point of having models in the first place.

It's not that having in-depth knowledge is bad; it's just that it's often not required, and it doesn't make you any less of a professional not to "know it at the right time."

And with regards to Y / 0, a programmer is mostly thinking about how to catch the exception properly for a given task. He does not care that the exception is nested X levels deep in the class hierarchy. He does not even need the mathematical understanding of Y / 0, just that it's an error state. He does not care how the error code is generated, just that it exists. He does not care how the list adds/removes items and what the worst case runtime is, he just cares about its use. Because the point is, you're not supposed to care about implementation details. Abstraction is king and you can go a long way without knowing a lot.


The problem is, what you define as a pro is completely vague.

I think it's pretty clear from this thread. In the context of programming, it's someone who can do what a "mere user" can do, but who can push the thing to extreme limits, modify it to do something completely new, or fix it if it's somehow broken or has holes in its design.

Professional drivers really don't need to know much outside of the behavior of what they experience while driving.

In the case of racecar drivers: They need an intuitive understanding of what their engine is capable of, and how far they can push it based on what it's been doing. They need an understanding of driving physics. They need a good feeling of driving physics, and they need to understand how the wear they've put into their tires could affect performance. There's a lot to know about competition rules and regulations. They need to know enough about the physics of car aerodynamics, and how it affects their grip in different situations. Not knowing such things has literally gotten professional drivers killed. And professional encompasses more than just racecar drivers. Truck drivers working in extreme conditions have to know quite a bit, it turns out. Truck drivers who are pointedly ignorant about their engines can end up costing someone a lot of money in repairs. "Professional driver" actually encompasses a number of careers. The point is, professionals have to know a lot that goes much, much deeper than just being an "end user."

And with regards to Y / 0, a programmer is mostly thinking about how to catch the exception properly for a given task.

Not necessarily. This would vary a lot depending on the language and the particular task.

He does not care how the list adds/removes items and what the worst case runtime is, he just cares about its use.

Not necessarily. This would vary a lot depending on the language and the particular task.

Because the point is, you're not supposed to care about implementation details. Abstraction is king and you can go a long way without knowing a lot.

The king is just another fallible mortal. All abstractions leak. I'll grant that there are professionals who can mostly get away with basically just being an end user. 95+% of the time, everything will be routine and copacetic. It's that last few percent where things can get really dicey and expensive. (Also the basis of a lot of the money for "consultants.") If someone wants to run a shop where someone isn't prepared with the know-how to deal with that, I guess it's their business. That's not what I'd consider a very high level of "professional."


I think it's pretty clear from this thread. In the context of programming, it's someone who can do what a "mere user" can do, but who can push the thing to extreme limits, modify it to do something completely new, or fix it if it's somehow broken or has holes in its design.

It's not clear at all, because you can simply move the goalposts. Someone with 20 years of programming client-side apps struggles with programming a website and can't figure things out in X situation without help. Does that mean they are automatically not a professional?

In the case of racecar drivers: They need an intuitive understanding of what their engine is capable of, and how far they can push it based on what it's been doing. They need an understanding of driving physics. They need a good feeling of driving physics, and they need to understand how the wear they've put into their tires could affect performance...

Again, another arbitrary definition. I bet you can go to most professional drivers and they won't understand the physics at all; they have an understanding from experience and from consulting the experts, but I would bet that 90% of drivers aren't going to pull out calculus or kinematic equations to analyze a race track. They either hire someone to do that or use a computer (most likely) to simulate the race. And even then, the simulations are inaccurate due to the driver's emotional state. My guess is they have, at most, a surface level understanding of driving physics.

Not necessarily. This would vary a lot depending on the language and the particular task.

So does everything, but we know that Y/0 is a programmed failure state based on the library being used. So the solution is either to catch the exception (and by catch, I mean control in a broader sense, since you can condition the input), use a different library, or write your own library. I would hope, in a business setting and in most settings, you would choose to catch the exception or control for it in some way.

It's that last few percent where things can get really dicey and expensive. (Also the basis of a lot of the money for "consultants.") If someone wants to run a shop where someone isn't prepared with the know-how to deal with that, I guess it's their business. That's not what I'd consider a very high level of "professional."

So you base your decision on the few percent of people that can do X vs. the 95% of people that can't, but can solve all other problems without knowing the details? Seems like a very irrational outlook considering the following:

a) Some of these top 5% of people may not even exist

b) If they do exist, they are most likely consulting

c) They are being paid more than most businesses can afford

d) They work for companies you probably don't work at

e) They are doing research work and publishing papers you probably never read

f) They probably can't solve X problem outside their expertise without assistance

But the point is that even if abstractions leak, they rarely do. And that's the whole point of engineering and technological advancement in general, is so that you don't need to know the details.


It's not clear at all, because you can simply move the goalposts. Someone with 20 years of programming client-side apps struggles with programming a website and can't figure things out in X situation without help. Does that mean they are automatically not a professional?

People are experts in different things, but general skills can be applied. My wife notes that there's an effectiveness and mindfulness constant to be applied to "years of experience." There are people in her field who have 20 years of experience, who know less about the regulations and subtle aspects than she has learned in 2.

I bet you can go to most professional drivers and they won't understand the physics at all

Note that I wrote intuitive understanding. It would be highly inaccurate to say they "don't understand it at all." I would question the general understanding of someone who would say that.

My guess is they have, at most, a surface level understanding of driving physics.

Something that someone has practiced over many years in a competitive environment isn't just "surface." This is why educated people should have at least two areas in which they've delved deeply, so they have a firsthand knowledge of what "deeply" means for knowledge.

I would bet that 90% of drivers aren't going to pull out calculus or kinematic equations to analyze a race track.

That's a ridiculous suggestion. Projecting that position on someone is either grasping at straws to make a straw man, or some other form of bias. If a driver knows enough to intuit there might be a way he can improve his line, such that he can seek out another expert's help, then I'd say he could well be a "consummate pro." It's the curiosity, awareness, and drive to peek under the surface which is the difference.

So you base your decision on the few percent of people that can do X vs. the 95% of people that can't, but can solve all other problems without knowing the details?

A more concise way of putting it is: "Are you smart and informed enough to know what you don't know? Is that sufficient to keep you out of trouble?" The Pareto principle often rears its ugly head in reality. That last few percent can really, really cost you.

If they do exist, they are most likely consulting

I was an example.

They are being paid more than most businesses can afford

There's an old saying for this: "A fool and his money are soon parted."

They work for companies you probably don't work at

Again, I was once such a consultant. Also, there are coworkers at my current job who are curious, energetic, and smart enough to have such a position, but who don't want one right at the moment.

They are doing research work and publishing papers you probably never read

Nah. Just a modest level of basic curiosity is enough to get you there.

They probably can't solve X problem outside their expertise without assistance

Which is fine, if they're smart enough to know what they don't know, so that they can gracefully navigate their situation.

But the point is that even if abstractions leak, they rarely do.

Boats leak. Could be rarely. Could be a lot. Both can be true of the same boat. It depends on how hard you're pushing that equipment. People can and do make money driving a boat no harder than a dilettante hobbyist. People can and do make money using technology at about that level too. In either case, I just hope everyone knows what they don't know, so no one gets in over their head and drowns.

And that's the whole point of engineering and technological advancement in general, is so that you don't need to know the details.

The point is to get stuff done and to save money while making money. Knowledge is power, but ignorance helps someone else's margins. "You pays your money, and you makes your choice."


Setting aside that this post is full of contradictions (for example, claiming that years of experience require mindfulness and effectiveness, then saying that someone who has practiced over many years doesn't have just a surface-level understanding), you're also straw-manning me by misrepresenting what I said about the driver having a surface-level understanding of vehicle physics, which you agreed with when you called it absurd for the driver to use equations to analyze a race track. It's pretty clear, to any physicist, that if you don't know how to model kinematic movement, you have an intuitive or surface-level understanding. Intuition is probably the worst thing to champion, as it's not measurable and often unreliable. For example, it was intuitive that the sun revolved around the Earth and that large objects fell faster than small ones.

The whole point of life is to do things you don't know how to do, because otherwise you never grow. It seems like you're saying the opposite: that people should know what they don't know and never approach it. The only way you get stuff done and save money is if you rein in the details and make things easier. Again, it's the reason people aren't writing their own language from scratch and are instead using an existing language and framework.


It's like a lumberjack so used to cutting down trees with a chainsaw that they'd have trouble to put down a decent tree with an axe. Powerful tools do certain subtasks for you; if you expect to use them all the time (and that's a reasonable expectation in your domain), then it makes perfect sense that you'd forget those subtasks.

It's like manual memory allocation and proper deallocation - been there, done that, but after 10+ years of working with GC languages, I would definitely have some memory leak bugs if I suddenly had to do that again. It is a basic skill, but just as many other basic skills, it's one that you can ignore in most domains.


It's like a lumberjack so used to cutting down trees with a chainsaw that they'd have trouble to put down a decent tree with an axe.

Bad analogy. It's more like a "lumberjack" who thinks all it takes is pressing the controls on a chainsaw. There's some more, very important things to know, and not knowing them can cost time and money or even get someone badly hurt. The chainsaw has to be maintained, with possibly severe consequences if it isn't. You need to know how to get the tree to fall where it's supposed to. You have to know how to cut so the weight of the tree doesn't clamp the chain.

It's a bad analogy, because the important issue is whether someone is letting the tool substitute for understanding. I guess some rube might think their chainsaw is so powerful, they don't have to worry about how they cut down the tree. It's more like sailors who think they can just lean on GPS instead of having skills. Those are the guys who collide their ships and get people killed. It's more like pilots who weren't great student pilots, and they make critical mistakes and program the autopilot into the side of a mountain or do the wrong thing when the plane is stalling or always depend on the auto-landing system and don't really know how to do a manual landing and wreck the plane. (Those are all things that really happened.)

Just because some big fraction of time being a "professional" is just being an end user doesn't mean there isn't something more beyond that which is very important. I guess it just has to do with what level of "professional" you aspire to be.

It's like manual memory allocation and proper deallocation - been there, done that, but after 10+ years of working with GC languages, I would definitely have some memory leak bugs if I suddenly had to do that again. It is a basic skill, but just as many other basic skills, it's one that you can ignore in most domains.

If you're doing something hard enough, you still have to think about stuff like that in a GC environment. I know, because I worked for a Smalltalk vendor. You can even have memory leaks in a GC environment. There's a lot of stuff you can ignore -- most of the time -- but can come and bite you real hard if you're not prepared for it.
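For example, here's a sketch of a classic GC-proof "leak" in Python (the names are made up): the collector runs, but a long-lived registry keeps every object reachable, so nothing is ever reclaimed:

```python
import gc

class Listener:
    registry = []  # long-lived, class-level: every instance stays reachable

    def __init__(self, name):
        self.name = name
        Listener.registry.append(self)  # registered, but never unregistered

# Callers drop their references immediately after construction...
for i in range(1000):
    Listener(f"listener-{i}")

gc.collect()
# ...but the collector can't help: all 1000 objects are still reachable
# through Listener.registry, so they're all still alive.
leaked = len(Listener.registry)
```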


> During big refactorings, I admit I sometimes find myself relying on compiler errors as a crutch to find (for example) which API layer I haven’t added the parameter to yet.

Sure, I do the same. I wouldn't call it a 'crutch' though; it's just using the tools you have available to you.


> Do you really encounter compiler errors that frequently?

He skipped the "ctrl+enter" part of the workflow.


I know you probably exaggerated, but isn't causing so many feedback loops still the sign of a bad programmer?

I wouldn't trust someone to deliver quality software who produces multiple syntax and runtime errors while coding, especially when modern IDEs fix typos on the fly. What about all the runtime errors he didn't test for?

Is that flow really that common?

Sure, I cause errors all the time, but my workflow includes a compile-time error maybe once in two tries, and an obviously fixable runtime error maybe once in ten. Sure, there are countless edge-case errors I don't even know about, but those aren't the kind catchable with this workflow, and I wonder whether someone with that workflow, who misses the obvious, misses even more edge cases.

Also, haven't a lot of tech companies been requiring (mostly) correct whiteboard coding for quite some time, exactly because of this?


The ability to do it once and have it come out right.

One coworker of mine was the son of one of the engineers of the XC-142 tiltwing aircraft. He started a project to make a functional scale model, and this was before Arduino, so we decided to use a Gumstix Linux board. (Because of its generous number of GPIO outputs.) I wrote a bit-banging implementation of the flight surface control mixing in C. It "just worked." No errors. It just ran the 1st time. It was even flown on a simpler aircraft.

This isn't my usual way, however. Usually, I'm quite iterative. If you're going to write a program that works the 1st time, then it helps if the control flow is relatively simple, there isn't a lot of complexity that can come about with interaction with state, and it does just one thing.


There are individuals out there who possess sufficient cognitive capabilities such that they do not require much in the way of cognitive load mitigation tooling even when working on very complex and technical tasks.

These individuals are uncommon.


I had a similar experience! When I first learned programming, it was through a book I borrowed from a teacher, who would only let me access her computer once a week. So I basically wrote all the programs I wanted to write with pen and paper, and then typed them in and ran them once a week. It definitely gave me a much more thorough understanding of many algorithms, as I basically had to simulate execution on paper.


I got into this type of thinking/habit by working for a couple startups and programming live on prod...


Knuth with the original "git gud".


Taking the opportunity to post the video of Knuth mishearing the question "What makes a good teacher?" as "What makes a good T-shirt?" at my uni.

https://youtu.be/74BfHoE66rc?t=42m17s


What a wonderful and genuine human being! It helps explain why his books are so incredibly humane while remaining deep on a very broad topic. Thanks for sharing.


I wish the guy would have let him answer as to what the CompSci equivalent of that T-Shirt is.


A few years ago, while doing a bit of a family history research project (Knuth was my father's PhD advisor at Stanford), I stumbled across a rather awesome t-shirt design. May I humbly submit "Knuth is my homeboy" as my entry to the CompSci t-shirt contest. (not my own design, to be clear, I just came across it on the internet)

https://geekz.co.uk/shop/store/show/knuth-tshirt.html

And here is a shot of the recursive shirt in action: https://laughingsquid.com/jacob-appelbaum-donald-knuth-demon...


Right? Because of that interruption we never get to have Don Knuth approved compsci T-shirts. Think before you act!


Is there any way we could ask him? I don't wear Compsci T-shirts but I would wear a Don Knuth-designed one.


The word he used was "a t-shirt that would be an _analog_."


That is so cool! He really liked the question!


I lived in Sweden for a while, Swedes are very proficient in English. Yet, they almost always pronounce ch as sh. They even have TV ads that make fun of this.


Reminds me of this ad: https://www.youtube.com/watch?v=yR0lWICH3rY

"Mayday! ... We're sinking! We're sinking!"

"What are you sinking about?"


It took me the longest time before I understood that ad, since the sheep speaks fluent "American".

While on pronunciation topics, how does one pronounce "Knuth"? Is the k silent, like in "know"?

Edit: According to Wikipedia it is kə-NOOTH.


That's funny. This is the best thing I've ever found on HN. Thank you.


Thanks for sharing this. Definitely made my day :D


I want that T-shirt now!


Reminds me a bit of Professor Neutron (Bob Newhart) in the Big Bang Theory!


that's my favorite gag from the Firefly tv show.

Simon: Are you Alliance?

Early: Am I a lion?

Simon: What?

Early: I don't think of myself as a lion. You might as well though, I have a mighty roar.

Simon: I said "Alliance."

Early: Oh, I thought...

Simon: No, I was...

Early: That's weird.


As mentioned in the article he is also a devoted Christian and one of the smartest I know of.

Some of his writings on the subject:

https://www-cs-faculty.stanford.edu/~knuth/things.html

https://www-cs-faculty.stanford.edu/~knuth/316.html


My favorite Knuth quote:

One of the first times I was ever asked about the title of my books was in 1966, during the last previous ACM national meeting held in Southern California. This was before any of the books were published, and I recall having lunch with a friend at the convention hotel. He knew how conceited I was, already at that time, so he asked if I was going to call my books "An Introduction to Don Knuth." I replied that, on the contrary, I was naming the books after him. His name: Art Evans. (The Art of Computer Programming, in person.)


"I was sitting in Steve's office when Lynn Takahashi, Steve's assistant, announced Knuth's arrival. Steve bounced out of his chair, bounded over to the door and extended a welcoming hand.

"It's a pleasure to meet you, Professor Knuth," Steve said. "I've read all of your books."

"You're full of shit," Knuth responded.

http://www.folklore.org/StoryView.py?project=Macintosh&story...


No way; I worked for him from 1978-1986, and never heard him say a naughty word. Also, I tagged along when he went to Apple to meet Jobs, who wanted to show him a late prototype of the Mac, and don’t recall hearing any such thing at the time (and it would have shocked me).


I've also heard that story with several different people in the place of Jobs (mid 90's it was almost always Bill Gates, for example), making me believe it to be apocryphal.


Also, it seems unlike Jobs to be reading algorithm books or to claim so - he was not interested in writing software. Jobs would involve himself in the user-facing functions of software.

Whereas I recall Gates claimed to have read them, or at least some of the volumes, and found them mind-expanding. Gates wrote a fair amount of software himself early on and later would still challenge people on very technical points.


Perhaps everyone misheard when he said "you're full of t-shirts."


Posting the reference to this for those who may have missed it: https://news.ycombinator.com/item?id=18699166


My guess is that this is a paraphrase by Tom Zito using his own speech patterns (and the general style of the Apple group). Since you were there you might remember Don's actual words, or we could always ask him.


Well the website is called "folklore", so I think your skepticism is well placed. ;)


Is there a way to dispute the story on folklore? drfuch's comment should be appended there.


The story is 100% veritably folklore. I know this since I've been hearing the story for at least 20 years, starring a good half dozen different characters. The fact that it is almost certainly not true (and Knuth has denied it) seems quite irrelevant.


FWIW Knuth himself doesn't think this happened nor does he sound like the kind of person (to me) who would try to cut someone down like that: https://m.youtube.com/watch?v=zJOS0sV2a24&t=26m


More likely Knuth made an earnest comment about being impressed about reading all his books.

That comment was then misinterpreted as sarcasm, which is what the folklore author remembered. Later having forgotten the words, and only remembering the mistaken interpretation's overall feel, it was paraphrased, poorly.


I like this response because it doesn't put the blame on anyone.


Never attribute to malice that which can be explained by incompetence.


Or entropy. (In this case, social entropy?)


Knuth was asked if this was true by R Munroe (xkcd) on a talk on campus at Google, and Knuth denied it.



As they say, "never let the truth spoil a good story."


..that the Don was the one who really revived Apple by inventing the iMac and the iPhone? Yes, yes, we know all that.


It is very unlikely that Steve read all of the TAOCP volumes that had been published at the time.


When Java was new, James Gosling and I gave a talk about it to a group that included Don. When we gave the talk I was in the process of leaving Sun for a startup. A couple of years later Don mentioned to me that the talk was the first time we met and, at the time, he couldn't figure out if I was really smart or really stupid for leaving Sun right after Java shipped. In 2010 we both agreed that in the long term, it was the right choice for me, but neither he nor I could have predicted that at the time.


“I am worried that algorithms are getting too prominent in the world,” he added. “It started out that computer scientists were worried nobody was listening to us. Now I’m worried that too many people are listening.”

cheers!


Ok, so this journalist gets to spend a WHOLE day with Knuth and all we get is some bits of history and quotes from a couple of Google guys. Massively lost opportunity to provide a window into this legend's daily life, thoughts and habits...


To be fair though, Peter Norvig is a lot more than just a "Google guy". I'd happily read an article that's nothing but Norvig quotes.

I thought that the article was pretty decent, considering it is aimed at an audience of people who aren't computer scientists and who have never heard of Donald Knuth.


> At age 19, Dr. Knuth published his first technical paper, “The Potrzebie System of Weights and Measures,” in Mad magazine.

I'd say she did something positive with the opportunity.


I have a copy of the issue, so when I went to see one of his Christmas Tree Lectures a few years ago I took it with me to get signed after the talk.

Instead of just signing it, he opened to his article and corrected a typo in the Potrzebie article--I think it amounted to two numbers being transposed somewhere around the millionths place (My copy is packed away from a recent move so it's not too convenient to verify).

What was funny was he had a poem he used to remember the order of the numbers after the decimal. He couldn't quite remember it, so he ssh'd into his home machine (or University, dunno), fired up Emacs, and opened a file containing the poem. Then he corrected the typo.

And that's how Donald Knuth "signed" my copy.


+1 I was behind you in this line and can confirm this story :-) He was surprised someone brought this and IIRC asked you how you got hold of this MAD magazine issue. I remember him not wanting to sign without fixing the typo, and also ssh-ing to find the correct numbers after he wasn't sure he remembered them.


Wow, that's wild. It makes me happy someone else remembers it. I kinda figured it was just me as I think I attended the lecture alone that year--pretty sure it was 2015.

FTR, the answer to how I got it: I made an off-hand comment to my wife about the magazine and she tracked it down for the next gift-giving occasion.


My favorite book from Knuth isn't computer-related at all.

3:16 Bible Texts Illuminated

https://smile.amazon.com/3-16-Bible-Texts-Illuminated/dp/089...


The book where he discusses these is also wonderful: "Things a computer scientist rarely talks about".

https://www.amazon.com/Things-Computer-Scientist-Rarely-Lect...


Wonderful! Thanks for sharing this.


Donald Knuth shaved the ultimate yak. He wanted to write a book with mathematics in it, so he implemented what has for decades been the industry-standard mathematical typesetting system: TeX.

That's remarkable. I just wish he'd finish the book.


The details are even hairier.

He wrote TeX82 (modern TeX) in WEB, a literate programming language he invented that could be tangled into Pascal source code or woven into literate documentation in ... TeX itself.

In fact he wrote WEB itself in WEB, and hand-compiled the output of the tangle program, which he then ran on the tangle source to get the same output.

He also wrote the program METAFONT to produce the typefaces, which he used to design the Computer Modern font.

And then he used that to write instruction books for TeX and METAFONT, as well as new editions of The Art of Computer Programming and others.

TeX is still in regular use in the academic community (ported via Web2C) over 35 years later, despite the changes in technology in that time (with hacks to deal with modern fonts, outputting PDFs instead of DVI, including images and the like).

He taught the yak to shave itself, and it's still clean shaven 35 years later.


"In regular use" is an understatement. Virtually all manuscripts published on arXiv.org, which comprise a large portion of all contemporary physics research output, are typeset in LaTeX.


Knuth is just ONE section away from writing about my (current) topic of study: Constraint Programming.

His 4th volume of TAOCP is so big that it's beginning to take him decades to write individual sections. Backtracking search + 3-SAT solvers are really interesting (and very closely related to Constraint Programming), but I'm deeply curious to know what his take on Constraint Programming will be.

It's really odd to be "waiting" for the next chapter of TAOCP. It's a serious body of work that has spanned decades of research and writing... I really hope he finishes the next section soon.

Maybe I'll go and buy the fascicle for 3SAT and read it while waiting.


In his recent Dancing Links talk (a couple of weeks ago), he mentioned a generalization of exact cover that also covers constraint programming. You may want to look into that (XCC) in the meantime, in Knuth's fascicle on Dancing Links (https://cs.stanford.edu/~knuth/fasc5c.ps.gz).

I know the feeling of waiting for new chapters of TAOCP; I'm waiting for Chapter 8 on recursion because his idea of it is so different from everyone else's.


What makes his idea different from everyone else's? I know he covers coroutines, being something not really available in most languages. That said, I thought it was an accepted term and I didn't think that was his invention.


Oh sorry, to be clear, by "his idea of it" I meant "the way he thinks about it [recursion]". What's different is that when most people learn programming, they pick up the idea that iterative structures (loops etc) are simple, and recursion is something more "advanced", harder to understand. Knuth thinks that recursion is the simpler thing (what comes to humans/children naturally), and the mark of a better programmer is being able to translate recursion into iterative structures.

This seems to tie in with his attitude towards programming in general: while in the zeitgeist the idea is that a good program builds better abstractions until the structure of the program reflects human thought, Knuth seems to think that a good programmer is one who starts with a human-level description of the solution and carefully refines it until it translates into a good set of instructions (in machine code). (There's no contradiction, but the point of view seems different from the usual.)

He briefly mentions something related in his Structured Programming with Go To Statements (1974) https://pic.plover.com/knuth-GOTO.pdf#page=21

> I have always felt that the transformation from recursion to iteration is one of the most fundamental concepts of computer science, and that a student should learn it at about the time he is studying data structures. This topic is the subject of Chapter 8 in my multivolume work; but it's only by accident that recursion wasn't Chapter 3, since it conceptually belongs very early in the table of contents.


I think some of this is likely to be a product of the times. The vast majority of early programming languages were not friendly to recursion. It wasn't at all uncommon for functions not to be reentrant, and to rely on self manipulation to set the return stack, if I recall.

Even today, for many "systems level" programs, you have to take extreme care around recursion, since you are likely to blow the stack limits. (That is, you can't just use recursion freely, but instead have to prove the maximum depth that you will trigger it.)

That is to say, I don't see what makes this different than how most others view recursion. If you can afford to do it, then you should do so. If you can't, then you should be ready for a lot of his other tricks. (Threaded trees being my personal favorite, at the moment. DFS with no stacks? Crazy talk! :) )


Definitely a product of his times (like everyone). Let me put it this way: to my knowledge, Knuth is the only programmer from his times, still writing, programming, and writing about programming. So you get a 1960s perspective from someone who's seen (and programmed in) 2018; that's not very usual.

(BTW I'm not sure that Knuth would wholeheartedly agree with “If you can afford recursion, then you should use recursion”. Perhaps he'd consider the program half-written. That's in some sense the crux of what I think is different.)

Let me put it another way: I thought I knew how to generate all permutations of a set of numbers. When I saw questions about it on Stack Overflow or Quora, I answered. When I opened Volume 4A, it disposed of what I knew in the first half-page, and proceeded to discuss the problem for over 50 more pages. I thought I knew about backtracking, having written backtracking programs hundreds of times (always implemented using recursion). Knuth's section on backtracking obliterated everything I knew (and with no recursion in sight). So I'm looking forward to when he gets around to writing about recursion, to see what I can (un)learn about something so fundamental.


I spent meaningful amounts of time this summer with TAOCP. The unusual ways of thinking, the diligence, and building solutions from the ground up were astounding. There are ways of thinking in there that, I think, can be ported into the modern world which would revolutionize the systems they touch.


His DLX algorithm makes heavy use of recursion. As do many of his CWEB programs. In one, he comments it as "textbook" stuff.

Your point on him being one of the main programmers from his time still programming is fair.


Ah but see, his DLX program is not written using recursion! Look at his CWEB program dance.w from his website (or if you don't have cweave handy, download the program from here: https://github.com/shreevatsa/knuth-literate-programs/blob/m... and click around on the hyperlinks) — all the backtracking happens inside “main”; there are only two subroutines “cover” and “uncover”, and neither of them is recursive.

Here is what he says in the (very early) draft of fasc16a (only 2.5 pages written):

> We've already seen hundreds of examples of recursion in previous chapters. Early on, we encountered sequences of numbers that were defined by relating each member to earlier members after getting started; these “recurrence relations” were sometimes simple formulas like […], sometimes more involved like […], and sometimes considerably more elaborate and complex. We also found that basic data structures for trees are inherently recursive, because all but the simplest trees are composed of subtrees. We discussed numerous instances where problems of sorting or searching or optimization could be solved effectively by a “divide and conquer” approach, with which a large problem is reduced to smaller problems of the same kind.

> Yet we didn't dwell on the recursive nature of those concepts. We transformed them into equivalent notions that could be dealt with directly and nonrecursively. Because “at bottom” a computer operates by means of the physical properties of electrons, which are not recursive.

> For example, in Algorithm 2.3.1T we traversed the nodes of a binary tree by using an auxiliary stack A, not by using the recursive definition of symmetric order to define that algorithm in terms of itself.

Don't want to quote the whole thing (though there isn't much), but from the sample I feel I'm going to learn from a new perspective (new to me, anyway).
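The kind of transformation he's describing there (Algorithm 2.3.1T replaces the recursive definition of symmetric order with an auxiliary stack) can be sketched like this; Python is used here purely for illustration:

```python
# Recursive inorder ("symmetric order") traversal of a binary tree,
# and the same traversal rewritten with an explicit auxiliary stack --
# a sketch of the recursion-to-iteration transformation discussed above.

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder_recursive(node):
    if node is None:
        return []
    return inorder_recursive(node.left) + [node.value] + inorder_recursive(node.right)

def inorder_iterative(node):
    result, stack = [], []          # 'stack' plays the role of auxiliary stack A
    while stack or node is not None:
        while node is not None:     # descend left, saving ancestors on the stack
            stack.append(node)
            node = node.left
        node = stack.pop()          # visit the deepest unvisited node
        result.append(node.value)
        node = node.right           # then traverse its right subtree
    return result

# Example tree:    2
#                 / \
#                1   3
tree = Node(2, Node(1), Node(3))
assert inorder_recursive(tree) == inorder_iterative(tree) == [1, 2, 3]
```

The iterative version makes the hidden cost of the recursion (the stack of pending ancestors) explicit, which is presumably part of what Knuth means by dealing with such concepts "directly and nonrecursively".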


This is near causing dissonance for me. I'm certain I had seen a version of his where he did use recursion. Rereading, you are definitely correct that he doesn't.

I know he has program in one of his books where he uses recursion to print the contents of a tree. Maybe I was just confusing that. (Don't have my books handy.)

That said, this is definitely going to make me study those programs more. Hilarious how off I was in my memory.

(The use of goto statements like he does in the main routine cracks me up, btw. Curious to know if these made a measured difference. With Knuth, I would assume so.)


It's completely bizarre that it is taught in this way. Knuth is absolutely right. Recursion is easy but the problem is computers don't actually work that way. I don't understand how anyone who has programmed for a while can think recursion is the cool or tricky thing to do.

Even worse is that recursion is taught using stupid examples like the fibonacci numbers rather than something that actually needs it like quicksort.


I agree: as much as I love the Fibonacci numbers, it's better to teach recursion with something like (say) implementing directory traversal (printing all files under a current directory, like ls -R) -- some problem both where the recursive nature of the problem itself is obvious, and where it would be more awkward to implement without recursion. I'm looking forward to seeing what problems he thinks really merit recursion.
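A minimal sketch of that exercise (in Python; `list_tree` is a hypothetical helper, not from any particular textbook). The recursive call lines up exactly with the recursive structure of the directory tree, which is what makes it a good teaching example:

```python
import os

def list_tree(path):
    """Return every path under 'path', ls -R style.
    The recursive call mirrors the recursive structure of the tree:
    a directory is a list of entries, some of which are directories."""
    entries = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        entries.append(full)
        if os.path.isdir(full):
            entries.extend(list_tree(full))  # recurse into subdirectory
    return entries
```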


> What makes his idea different from everyone else's

Depth and coverage.

Case in point: Knuth's coverage of Quicksort doesn't stop at the Big-O. It calculates the precise average running time; he literally counts the assembly-language instructions. You might say "But arrays can arrive in any order", and yeah, Knuth averages the runtime across all possible initial orderings of the array. It's a beautiful work of math.

There's almost no room for questions once Knuth finishes a subject. It's so exhaustively covered that I can't even imagine having questions on any subject after he's done writing about it.

The few questions I do have are the "Difficulty 40+" questions that Knuth puts at the end of each section, which would serve as excellent questions for a master's or Ph.D. thesis, i.e., there's no answer to those questions yet.

Anyway, while most textbooks hand-wave the idea that "Quicksort is faster than Mergesort, even though both are O(n log n)", Knuth is pretty simple: he says, look at these two precise numbers I've calculated (and proved). Quicksort is faster (or at least, the assembly-language version of Quicksort he wrote for his imaginary computer is faster).

Basically, Knuth got "to the bottom of it"; there's no hand-waving involved in the explanation. He just tackles the hard questions head-on and solves them more precisely than everyone else.

----------

Some of the older chapters are going out of date, which is natural for a work so complete. There is so much material out there that Knuth has to decide whether to revise older material (which he does occasionally) or to write about subjects that don't have a major unifying work yet (e.g., Constraint Programming has some textbooks here and there, but most of its ideas live only in academic journals and haven't really been unified into a single work yet, IMO; even the textbooks on the subject are mostly introductions plus references to papers).

And sometimes, Knuth gets lucky. Knuth's Volume 3, "Sorting and Searching", includes a HUGE table of sorting methods applied to tape drives.

But today's RAM is much faster at sequential access than at random access. So Knuth's calculations on tape-drive sorts actually apply to modern DDR4 RAM better than any other treatment of the subject. No one else calculated the cost of "internal movements" (RAM on older computers, but cache on modern systems) + "external movements" (tape drives in TAOCP Volume 3, but really DDR4 RAM on today's machines).

So yeah, Volume 3 is out of date and spends a huge amount of time on external sorting methods. But... it just so happens that if you know a bit about the sequential nature of modern DDR4 RAM, it is actually super-relevant to today's computers.
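That exact-counting style can be imitated in miniature. A sketch (in Python, using a simplified first-element-pivot textbook quicksort, not Knuth's MIX program): count the comparisons for every one of the n! input orderings and average them, instead of quoting O(n log n).

```python
from itertools import permutations
from math import factorial

def quicksort_comparisons(a):
    """Number of element comparisons a simple first-element-pivot
    quicksort makes on the list 'a' (distinct elements assumed)."""
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    # len(rest) comparisons against the pivot, plus both recursive calls
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

# Average over all n! distinct input orders -- an exact number, not a bound.
n = 5
total = sum(quicksort_comparisons(list(p)) for p in permutations(range(n)))
average = total / factorial(n)  # works out to exactly 7.4 for n = 5
```

For this variant the exact average matches the classical closed form 2(n+1)H_n - 4n, which is the kind of answer Knuth derives where other books stop at the asymptotic class.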


That doesn't sound like it is different, so much as just more complete. His analysis of permutations in order to analyze runtimes is, as you say, truly a wonder.

This doesn't feel different from everyone else's idea, though. Unless you just mean in the general complexity analysis, where most of us are basically taught to stop at the highest order terms. It is interesting, because Knuth seems credited with pushing Big-O analysis, but he never stopped at the highest term when analyzing an algorithm. Far as I can tell, he never has. He has done this to compare some algorithms, but even then he has always been empirical for actually comparing things. Treating the Big-O as a hypothesis to test, if anything.


Knuth has enormous patience for someone so productive.

My son used to refer to Knuth as "that amazing harpsichord player" and didn't care about the computer science "stuff" I tried to tell him was really important. And he had plenty of time to sit down with a kid and go through the score of Fantasia Apocalyptica and laugh together at the musical jokes in the score.

OTOH I either have to consciously decide not to stress about all the things I want to get done or else...stress and push hard. Yet the things I feel driven to accomplish are quite minor compared to Knuth's to do list!


I had the pleasure to attend his 80th birthday conference (http://knuth80.elfbrink.se) in Piteå, Sweden where he also premiered his piece Fantasia Apocalyptica for church organ.

We attended the conference as students but still got to see the conference, listen to the music and be at the birthday dinner free of charge. When I talked to Knuth I was amazed by his humility. Every presentation was about how grand he was, but he still was humble and friendly.

They say you shouldn't meet your heroes, but they're obviously not talking about Knuth!


Once met Yoda himself on the 59th street platform of the Metra next to the University of Chicago. He needed to get to the airport within 45 minutes. I told him he needed to call a cab. Now I get to tell people I once helped him optimize an algorithm. ;-P


“He’s a maximum in the community. If you had an optimization function that was in some way a combination of warmth and depth, Don would be it.”

I love that quote.


The Electronic Coach video of Knuth mentioned in the article:

https://youtu.be/dhh8Ao4yweQ

He was ready for Moneyball and sports data science decades before everyone else.


>“Here at Google, sometimes we just throw stuff together,” Dr. Norvig said, during a meeting of the Google Trips team, in Mountain View, Calif. “But other times, if you’re serving billions of users, it’s important to do that efficiently. A 10-per-cent improvement in efficiency can work out to billions of dollars, and in order to get that last level of efficiency, you have to understand what’s going on all the way down.”

I'm interested in this low-level systems optimization. What types of jobs hire for this work? What areas in graduate school should one focus on? Is this type of skill set still employable?


It's not necessarily low level optimization. You get better optimization results if you're willing to change things at all levels of the software stack (including user expectations); so it's really about understanding what's going on all the way down and all the way up.

As with all things, the way to get experience and knowledge is to try things. Take a look at some software that you use that does something slow, and try to make it faster; it's easier if you have the source available. Lots of applications these days are slow to start, and that's easy to time, and probably easy to bucket things into things that don't need to be done or things that can be done later. If you interact with any long data processing loops, those are often good choices for optimization as well.

This skill set is most employable for companies that have scale; when you have ten servers, being able to turn off one of them because of efficiency gains doesn't help that much; but it's nice to turn off ten percent of your 10,000 node cluster. It also leads to nice claims on resumes "made efficiency gain in X, leading to Y% cost savings on servers", and since you're likely to spend a bunch of time on a single issue, you'll be able to have a good discussion when it comes to 'discuss some of your interesting projects' in interviews.

The only downside is you'll be super frustrated about everyone else's software being slow. You'll likely gravitate towards classic games because they don't have load times; I hope you like pixel art. :D
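To make the "easy to time" part concrete, here's a minimal sketch using Python's stdlib profiler (`slow_startup` and `load_config` are hypothetical stand-ins for real startup code):

```python
import cProfile
import io
import pstats

def load_config():
    # Hypothetical stand-in for an expensive startup step.
    return sum(i * i for i in range(200_000))

def slow_startup():
    load_config()
    return "ready"

# Profile the startup path and report the top functions by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
slow_startup()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()  # shows where the time actually went
```

From a report like this you can do exactly the bucketing described above: which steps are unnecessary, and which can be deferred until after startup.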


There's a position like that in my company, open to new hires (only bachelor's needed). The goal is to optimize algorithms and data analysis models that run on our HPC cluster. It basically boils down to changing compiler settings and seeing what runs the fastest.


Can you post a link (if there are no anonymity concerns)? Sounds like more fun than my current job.


Amongst other things, know your hardware.

For example, you could read the C++ spec forwards and backwards for a year and know your data structures' big-O notation just as well, but they will never tell you that inserting into the middle of a vector is faster than inserting into the middle of a linked list in a great many common cases.

Sure, ideally you'd measure, but you need to have some idea of what options to even compare with your measurements.


I may be missing something here, but your example seems to be a textbook case for big-O analysis. Writing in a vector is O(1), while linked list is O(n), isn’t it?


It's about inserting in the middle of a vector, which requires to shift every element after that one (so, O(n)), and for a linked list it can be done by allocating a new element, which, if you have the pointer to the place where you want to insert, is O(1) - and really, just a few operations.

However, in practice, the O(n) operation will be faster - since doing a hundred reads and a hundred writes to shift a hundred numbers within a single cache line will be much faster than reading a single byte to an out-of-cache memory in the linked list case; the processor can do a lot while waiting for any new data from RAM.
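A sketch of that comparison, in Python rather than C++ (Python's list is a contiguous array like std::vector; the `Node` class is a hypothetical stand-in for std::list; interpreter overhead exaggerates the gap here, but the cache-locality argument is the same one that applies in C++):

```python
import time

class Node:
    """Minimal singly linked list node, standing in for a std::list node."""
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def linked_insert_middle(head, length, value):
    """Walk to the middle (pointer chasing), then do the O(1) splice."""
    node = head
    for _ in range(length // 2 - 1):
        node = node.next
    node.next = Node(value, node.next)

n = 100_000
arr = list(range(n))
head = None
for v in reversed(range(n)):          # build the linked list 0..n-1
    head = Node(v, head)

t0 = time.perf_counter()
arr.insert(n // 2, -1)                # O(n) shift of a contiguous array
t1 = time.perf_counter()
linked_insert_middle(head, n, -1)     # O(n) traversal + O(1) splice
t2 = time.perf_counter()
# On typical hardware the contiguous shift wins despite the "worse" big-O
# for the insert itself, because the traversal is one cache miss per node.
```

Both operations leave -1 at position n/2; only the time to get it there differs.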


Writing into a vector is O(n), writing into a LL is O(1), assuming you have an iterator to the location to write to. If you don't, then it becomes O(n), and writing into a vector can be O(1) if you're writing to the end.

C++ stl, LL appending is always O(1) and writing into a vector is O(n). There are plenty of caveats to this and the documentation should be referenced.

He's right btw that insert into a vector is generally faster in the C++ stl than insert into a LL. I don't recall off the top of my head why that is, just remember it's the case.


Data locality is a big reason. Sure, it's O(N) writes, but those writes are neatly lined up in such a way that the likelihood of a cache miss is low. As opposed to a linked list, where just traversing to the middle may entail multiple round trips to main memory.


Yeah -- O(1) and O(n) are definitely simplifications and never tell the whole story.


It's not strictly related to C++ and/or STL. This is how computers work nowadays.


Well, my comments were specifically based on C++ and the STL. There are so many implementations that one has to define the discussion.


Working in any of the big cloud services (AWS, Azure, GCP) will involve this sort of work. Some services probably have more interesting scaling problems than others. (e.g. I bet AWS DynamoDB is more interesting than AWS CloudTrail).


SRE (Site Reliability Engineer) is one such track.


Hmpf, considering that the author spent a whole day with Knuth there is disappointingly little interviewing going on. Other people (e.g. Norvig) are quoted more extensively.


Given Knuth's age and current physical state, he probably didn't say much over the course of a day.


He still looks pretty healthy and mentally sharp.


True Jeff Dean story time (this was directly related to me): when Jeff spoke at Stanford, it was so crowded, Knuth had to sit on the floor (I think somebody eventually noticed and offered him a seat).


My uncle is a researcher and has one of those famous checks from Knuth for pointing out an error in the book. Needless to say, he never intends to cash it, and treats it like a little trophy.


That picture with Knuth and TeX is really beautiful to me.


Funny, The NeXT Book by Bruce Webster referred to Knuth as the "Obi Wan of Computer Science".

I guess we can split the difference and agree on Jedi. :-)


My NeXT Cube was my introduction to TeX, and one for which I will be forever grateful.

I just wish that there was a machine running OPENSTEP 5.0 (as I think of Mac OS X) which I wanted to buy --- I prefer pen computers (the highwater mark of my computing experience was an NCR-3125 running PenPoint paired w/ a NeXT Cube w/ a Wacom ArtZ graphics tablet).


Ah but who is the Han Solo of comp Sci?


Linus Torvalds


He's definitely Chewbacca.


I like these two quotes of Knuth where he lets us know how hard he worked.

--

From this small interview [1]:

"When I'm working on a research problem I generally begin by filling dozens of sheets of scratch paper with partial calculations. When I eventually get to a point where I can think about the problem while swimming, then I'm often ready to solve it."

--

From this other interview [2]:

"So I went to Case, and the Dean of Case says to us, says, it’s a all men’s school, says, “Men, look at, look to the person on your left, and the person on your right. One of you isn’t going to be here next year; one of you is going to fail.” So I get to Case, and again I’m studying all the time, working really hard on my classes, and so for that I had to be kind of a machine.

I, the calculus book that I had, in high school we — in high school, as I said, our math program wasn’t much, and I had never heard of calculus until I got to college. But the calculus book that we had was great, and in the back of the book there were supplementary problems that weren’t, you know, that weren’t assigned by the teacher. The teacher would assign, so this was a famous calculus text by a man named George Thomas, and I mention it especially because it was one of the first books published by Addison-Wesley, and I loved this calculus book so much that later I chose Addison-Wesley to be the publisher of my own book.

But Thomas’s Calculus would have the text, then would have problems, and our teacher would assign, say, the even numbered problems, or something like that. I would also do the odd numbered problems. In the back of Thomas’s book he had supplementary problems, the teacher didn’t assign the supplementary problems; I worked the supplementary problems. I was, you know, I was scared I wouldn’t learn calculus, so I worked hard on it, and it turned out that of course it took me longer to solve all these problems than the kids who were only working on what was assigned, at first. But after a year, I could do all of those problems in the same time as my classmates were doing the assigned problems, and after that I could just coast in mathematics, because I’d learned how to solve problems. So it was good that I was scared, in a way that I, you know, that made me start strong, and then I could coast afterwards, rather than always climbing and being on a lower part of the learning curve."

[1] http://authenticinquirymaths.blogspot.pt/2015/11/maths-in-sc....

[2] Transcript from here: https://github.com/kragen/knuth-interview-2006


I was also scared in the first year of college next to all these accomplished programmers and math wizards and it definitely helps!


Knuth says in the Preface to Volume 1 that "A shorter version of Volumes 1 through 5 [chapters 1-10] is planned, intended specifically to serve as a more general reference and/or text for undergraduate computer courses; its contents will be a subset of the material in these books, with the more specialized information omitted."

I realize he can't shorten what he hasn't written, but I wish he had been able to write this shorter version years ago, and then work on filling in the details in the now-ever-expanding sequence of in-depth volumes.


For some reason on this particular article I found it amusing to realize several times that a given concept they were talking about was something I'm familiar with, but it had been translated first from "nerd language" into "layman." Pretty common practice for a general audience, so I'm not sure why it stood out this time. Examples: bytecode, assembly language, frameworks, machine learning. Counterexample: Libraries, where they used the actual word, denying me the silly little pleasure of going "Oh you mean libraries."


I actually thought this article did a really good job (certainly better than most) at explaining complex CS concepts.

More

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: