(Cue in the "but why bother, my app is IO bound anyway" counterarguments. That happens, but people too often forget you can design software to avoid or minimize the time spent waiting on IO. And I don't mean just "use async" - I mean design its whole structure and UX alike to manage waits better.)
 - Or HFT, or browser engine development, few other performance-mindful areas of software.
I have encountered far too many people who interpret that to mean, "thou shall not even even consider performance until a user, PM or executive complains about it."
Here's the quote in context:
> There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.…
( https://pic.plover.com/knuth-GOTO.pdf , http://www.kohala.com/start/papers.others/knuth.dec74.html For fun, see also the thread around https://twitter.com/pervognsen/status/1252736617510350848)
And from the same paper, an explicit statement of his attitude:
> The improvement in speed from Example 2 to Example 2a [which introduces a couple of goto statements] is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by pennywise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, …
And of course most people don't know the full quote and they don't care about what Knuth really meant at the time.
Quick quips don't get to trump awful engineering. Just say call Knuth a boomer and point to the awful aspects of actual code. No disrespect to Knuth, just dismiss him as easily as people use him to dismiss real problems.
Except that line was written in a book (Volume 1: Art of Computer Programming) that was entirely written from the ground up in Assembly language.
Its been a while since I read the quote in context. But IIRC: it was the question about saving 1 instruction between a for-loop that counts up vs a for-loop that counts down.
"Premature optimization is the root of all evil" if you're deciding to save 1-instruction to count from "last-number to 0", taking advantage of the jnz instruction common in assembly languages, rather than "0 to last-number". (that is: for(int i=0; i<size; i++) vs for(int i=size-1; i>=0; i--). The latter is slightly more efficient).
Especially because "last-number to 0" is a bit more difficult to think about and prove correct in a number of cases.
I recall it being from his response to the debates over GOTO, and some googling seems to agree.
Not that that takes away from your overall point.
In support of your overall point, though, having just said "[w]e should forget about small efficiencies, about 97% of the time", the next paragraph opens: "Yet we should not pass up our opportunities in that critical 3%."
If I don't think about performance and other critical things before committing to a design I know that in the end I will have to rewrite everything or deliver awful software. Being lazy and perfectionist those are two things I really want to avoid.
Yet we should not pass up our opportunities in that critical 3%.
"support larger screens and refresh rates"
It is possible to get office97 (or for that matter 2003, which is one of the last non sucky versions) and run it on a modern machine. It does basically everything instantly, including starting. So I don't really think resolution is the problem.
PS, I've had multiple monitors since the late 80's too, in various forms, in the late 1990's driving multiple large CRTs at high resolution from secondary PCI graphics cards, until they started coming with multiple ports (thanks matrox!) for reasonable prices.
I'd imagine perhaps this is how product teams are assessed - is the component just-about fast enough, and does it have lots of features. So long as MS Office is the most feature-rich software package, enterprise will buy nothing else, slow or not.
Is Teams better than Zoom? No, but my last employer ditched Zoom because Teams was already included in their enterprise license package and they didn't want to pay twice for the same functionality.
The tech stack at use in the Windows Terminal project is new code bolted onto old code, and no one on the existing team knows how that old code works. No one understands what it's doing. No one knows when the things that old code needed to do was still needed.
It took someone like Casey who knew gamedev to know instinctually that all of that stuff was junk and you could rewrite it in a weekend. The Microsoft devs, if they wanted to dive into the issue, would be forced to Chesteron's Fence every single line of code. It WOULD have taken them years.
We've always recommended that programmers know the code one and possibly two layers below them. This recommendation failed here, it failed during the GTA loading times scandal. It has failed millions of times and the ramifications of that failing is causing chaos of performance issues.
I'm come to realize that much of the problems that we have gotten ourselves into is based on what I call feedback bandwidth. If you are an expert, as Casey is, you have infinite bandwidth, and you are only limited by your ability to experiment. If your ability to change is a couple seconds, you will be able to create projects that are fundamentally impossible without that feedback.
If you need to discuss something with someone else, that bandwidth drops like a stone. If you need a team of experts, all IMing each-other 1 on 1, you might as well give up. 2 week Agile sprints are much better than months to years long waterfall, but we still have so much to learn. If you only know if the sprint is a success after everyone comes together, you are doomed. The people iterating dozens of times every hour will eat your shorts.
I'm not saying that only a single developer should work on entire projects. But what I am saying is that when you have a Quarterback and Wide Receiver that are on the same page, talking at the same abstraction level, sometimes all it takes is one turn, one bit of information, to know exactly what the other is about to do. They can react together.
Simple is not easy. Matching essential complexity might very well be impossible. Communication will never be perfect. But you have to give it a shot.
Then in college, I studied ECE and worked in a physics lab where everything needed to run fast enough to read out ADCs as quickly as allowed by the datasheet.
Then I moved to defense doing FPGA and SW work (and I moonlighted in another group consulting internally on verifcation for ASICs). Again, everything was tightly controlled. On a PCI-e transfer, we were allowed 5 us of maximum overhead. The rest of the time could only be used for streaming data to and from a device. So if you needed to do computation with the data, you needed to do it in flight and every data structure had to be perfectly optimized. Weirdly, once you have data structures that are optimized for your hardware, the algorithms kind of just fall into place. After that, I worked on sensor fusion and video applications for about a year where our data rates for a single card were measured in TB/s. Needless to say, efficiency was the name of the game.
After that, I moved to HFT. And weirdly, outside of the critical tick-to-trade path or microwave stuff, this industry has a lot less care around tight latencies and has crazy low data rates compared to what I'm used to working with.
So when I go look at software and stuff is slow, I'm just suffering because I know all of this can be done faster and more efficiently (I once shaved 99.5% of the run time off of a person's code with better data packing to align to cache lines, better addressing to minimize page thrashing, and automated loop unrolling into threads all in about 1 day of work). Software developers seriously need to learn to optimize proactively... or just write less shitty code to begin with.
while that's true, in this particular case with Casey's impl, it's not an art. The one thing that drastically improved performance, was caching. Literally the simplest, most obvious thing to do when you have performance problems.
The meta-point is that corporate developers have been through the hiring machine and are supposed to know these things.
Stories like this imply that in fact they don't.
Also it's not that simple. One place I worked at (scientific computation-kind of place), we'd prototype everything in Python, and production would be carefully rewritten C++. Standards were very high for prod, and we struggled to hire "good enough", modern C++, endless debates about "ooh template meta-programming or struct or bare metal pointers" kind of stuff.
3 times out of 4, the Python prototype was faster than the subsequent C++. It was because it had to be, the prototype was ran and re-ran and tweaked many times in the course of development. C++ was written once, deployed, and churned daily, without anyone caring for its speed.
If you're doing generic symbol shuffling with a bit of math, Python is fast-ish to code and horribly slow to run. You can easily waste a lot of performance - and possibly cash - trying to use it for production.
Whether or not you'll save budget by writing your own optimised fast libs in some other lang is a different issue, and very application dependent.
“Any object around which an S.E.P. is applied will cease to be noticed, because any problems one may have understanding it (and therefore accepting its existence) become Somebody Else's Problem.”
And you’re right; all refterm really does is move the glyph cache out to the GPU rather than copying pixels from a glyph cache in main memory every frame.