
The 'premature optimization is evil' myth (2010) - jerf
http://joeduffyblog.com/2010/09/06/the-premature-optimization-is-evil-myth/
======
jasode
The _longer_ Knuth quote is _“We should forget about small efficiencies, say
about 97% of the time; premature optimization is the root of all evil”_

In the inevitable telephone game[1] of meme transfer, as quotes are shortened
into ever-smaller soundbites, the _"small efficiencies"_ part is left out.

To me, "small efficiencies" meant trying to "optimize" your old C code from...

      x = x + 1;

to...

      x++;

to...

      ++x;

... because you read or saw that a C compiler created tiny differences in the
assembly code for post- vs pre-increment operators, resulting in a 0.00001%
runtime difference.

Knuth isn't talking about being ignorant or careless with choosing bubble sort
O(n^2) vs quicksort O(log(n)). Or _not_ placing an index on a lookup key of a
1 terabyte table (that's a 1-hour full table scan vs millisecond b-tree
lookup). Those are not "small efficiencies".

If one leaves out the "small efficiencies" conditional, regurgitating
"premature optimization" is a cop-out for not thinking.

[1][https://en.wikipedia.org/wiki/Chinese_whispers](https://en.wikipedia.org/wiki/Chinese_whispers)

~~~
hn9780470248775
> quicksort O(log(n)).

Quicksort is O(n log n) average case and O(n^2) worst-case.

~~~
jasode
Yes, I saw that error after the edit window closed so I couldn't fix the typo.
There has to be an extra "n" because qs has to touch _every_ element at least
once so a baseline complexity of O(n) is unavoidable.

Hopefully, it didn't detract from the point that Knuth was talking about
premature _micro-_ optimizations and not _design /architecture/algorithm_
optimization. Some inexperienced people are repeating "premature optimization"
to try and win internet arguments instead of using it as nuanced advice to
avoid wasting time.

~~~
cossatot
>to try and win internet arguments

is pretty much the antithesis of

>to avoid wasting time.

------
ska
Pretty verbose way of (yet again) reiterating that Knuth was essentially
correct, but that many people misunderstand or misapply what he was saying.
Joe says as much again in the conclusion.

To me that makes the "myth" part of the title more than a little click-baity,
which is unfortunate.

Knuth is right: premature optimization is a bad idea, full stop. That doesn't
mean that there aren't performance related activities you should be
undertaking at various stages of implementation, that either aren't
optimization, or aren't premature, or both.

~~~
biot

> Knuth is right: premature optimization is a bad idea, full stop.

A bit of a "no true Scotsman" though, isn't it? Any optimization that is a
good idea to do now is "not truly premature", whereas everything else is
actually premature.

~~~
gutnor
But that's basically every rule or guideline in the software world.

You should do X always, unless it does not make sense.

How do you know the difference? With enough experience or enough ignorance.
Knowing when you are in one category or another for a specific topic is the
tricky bit.

~~~
biot
I'm going to have to agree with ska's other comment[0] and say that it's
knowing the difference between good design and optimization. Being able to
design a performant system means choosing designs which are inherently fast.
Squeezing the last few percent out of bubble sort makes no sense when you
should have gone with, say, insertion sort in the first place.

Once you have the right algorithms, data structures, and system architecture
in place and working, it's going to be fast enough and you can choose to spend
time optimizing only where absolutely necessary. Even then, you should default
to getting order of magnitude better performance via a better design rather
than tweaking inefficiencies.

[0]
[https://news.ycombinator.com/item?id=11284817](https://news.ycombinator.com/item?id=11284817)

------
mwfunk
He is refuting a version of "premature optimization is the root of all evil"
that I have never heard in practice:

"Mostly this quip is used to defend sloppy decision-making, or to justify the
indefinite deferral of decision-making."

I have never heard it used in this context. Sometimes I've heard it used as a
gentle way to suggest to someone that they are going off in the weeds and need
to refocus on what they should be focused on, but usually I've just heard it
used as it was originally intended by Knuth.

Optimization often involves making code less clear, more brittle, or with a
more pasta-like organization. Frequently optimization requires writing code
that, looked at out of context, doesn't make sense or might even look wrong.

When these sorts of optimizations need to be made, they should be made only as
needed (and documented). They shouldn't be made without knowing whether a
particular code path is even a bottleneck in the first place, and they
shouldn't be made if speeding up that particular bottleneck wouldn't make the
software better in any tangible way. That's all the phrase means.

~~~
adrusi
In a lot of circles, especially where web developers are involved, you'll get
called out for premature optimization for spending any mental energy worrying
about memory usage or bandwidth. The idea is that computers are fast, so we
can just do whatever we want and worry about it if it becomes a problem. The
result is that it becomes a problem, then gets patched up to meet whatever
bare-minimum performance standards the company has (or the deadline arrives
and it's released unoptimized), and we end up with the absurdly heavy and
resource-greedy software we see today.

~~~
Swizec
It's a cost optimization. How much engineer time does it take to shave 0.2
seconds off of an action that's got a 0.3s animated transition anyway? How
much engineering does it take to care about the memory footprint of a website
users are going to close in 5 minutes anyway?

Most of the time, the answer is "Too much, not worth it". Some of the time the
answer is "Let's do it". Knowing which situation you are in is key.

Ideally, I should write code for readability and maintainability and let the
compiler and runtime worry about optimizations.

~~~
yongjik
No opinion about the rest of the argument, but a 300ms animated transition is
looooooong. It will be noticed by basically everybody and annoy a good number
of them.

The only good reason I can think of is that you're somehow stuck with 300ms+
delay anyway, so you provide an animation so that the users don't think "WTF?
I just clicked on it and why is nothing happening?" But if you can shave off
0.2 seconds then you can probably get rid of the animation altogether!

~~~
Swizec
You'd be surprised by how many people think those 300ms animated transitions
are a good thing.

I think they're terrible.

You'd also be surprised by how many people will completely misunderstand your
UI and get confused by things popping around magically, if the transitions are
too fast or nonexistent. You have to build a UI/UX that your typical user will
enjoy and be able to use, not a UI/UX that your nerdy friends are going to
love. (unless they're your target users)

~~~
mrob
The only good UI animations I've seen were in Metacity (window manager for
Gnome 2). It would move windows instantly, but also provide a transparent
trail showing the path they would have taken if they had been animated
traditionally. It let you continue working without delay if you knew what you
were doing while still helping beginners.

------
jerf
I posted this article less for the negative "countering the myth" that the
comments here seem to be responding to, and more for the positive description
of how exactly you write code in a thoughtful manner while not overdoing it
into "performance uber alles".

I tend to think of it more as not painting myself into a corner than
necessarily getting it perfect the first time. It's amazing what some thought,
maybe a day in the profiler per couple of months of dev work to catch out the
big mistakes (and as near as I can see, nobody ever gets quite good enough to
be able to never make such mistakes), and some basic double-checking (like
"are any of my queries doing table scans?") can do for performance, long
before you pull out the "big guns".

~~~
markbnj
If it were positioned as an article about writing thoughtful code then I doubt
the comments would be as focused on the claim in the headline. Knuth's point
was that even thoughtful programmers could get caught up in pursuing
performance in areas where it ultimately didn't matter, and even more
critically, that even thoughtful programmers could be guilty of discarding a
clear, comprehensible piece of code in favor of something terser, and less
accessible due to a perceived performance benefit.

~~~
jerf
"If it were positioned as an article about writing thoughtful code then I
doubt the comments would be as focused on the claim in the headline."

Sorry, I did not mean to delegitimize those points. I understand where they
are coming from.

------
ninjakeyboard
I think the "premature optimization is evil" heuristic exists not to avoid
doing efficient things but to avoid prioritizing optimization over design.
Yes, you want linear or logarithmic runtime complexity and NEVER quadratic,
but you won't use mutable data structures in Scala until you know there is a
space-complexity issue, for instance. Then, and only then, do you optimize to
reduce memory usage, even though it hurts your design quality.

I think the title is a bit misleading because it's a good heuristic and you
agree with that too.

------
WalterBright
Reminds me of:

1\. novice - follows rules because he is told to

2\. master - follows rules because he understands them

3\. guru - transcends the rules because he understands that rules are over-simplifications of reality

~~~
aidenn0
Except this is railing against a bastardized version of a rule. Leaving out
the "small efficiencies" allows the rule to be applied in contexts where it
clearly was not intended.

------
linkregister
I can't agree more with Joe Duffy's viewpoint.

In case you're interested in a graphical representation [1] of some common
latency costs, someone at UC Berkeley put together an interactive chart with
the original Numbers Every Programmer Should Know from Jeff Dean's (Google)
large scale systems presentation.

[1]
[http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html](http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html)

------
danra
It's good to invest time in making decisions and coding them when it actually
ends up making a positive difference to your work. Otherwise, by definition,
it is premature optimization (of performance, design, or otherwise).

For instance, you often figure out you don't need a piece of code only after
you've written and tested it, or after your thought process about the design
has evolved. When you delete that code, it doesn't help anyone that a couple
of hours ago you invested five minutes in picking the "right" data
structure for the implementation. The right data structure for unstable code
is the one which lets you work with it and takes up the least of your time. As
your code becomes more stable, it could then make sense to invest time in
picking and coding a better data structure; it's less efficient to do so
prematurely.

------
kazinator
> _First and foremost, you really ought to understand what order of magnitude
> matters for each line of code you write._

And that is: the amount of time that will ever be spent in it across all
deployments and execution instances, versus how long it takes to develop,
taking into account the cost of that CPU time and development time.

You could spend, say, $100 of development time such that the total CPU time
saved over the entire installed base of the code over its lifetime is worth
$5.

Secondly, even if the saving is greater than $100, that means nothing if it's
not recouped! That is to say, suppose you spend $100 to optimize something,
and the entire user base saves $200 worth of CPU time over the next 25 years,
when the last installation of the program is shelved. Only, oops, the users
never paid a single penny more for the improvement. Moreover, suppose the
improvement was only marginal and in some relatively obscure function, so that
it didn't help to sell more of the program to more users. So in the end,
you're just out $100.

> _Mostly this quip is used to defend sloppy decision-making, or to justify the
> indefinite deferral of decision-making._

Here is the thing. A program optimized for performance is "bad" because it's
hard to change its organization later (for instance, when it needs to be
optimized further). It's harder to debug, too.

We consciously avoid optimizing code in order to have the code in a state that
is easier to work with.

But we must ensure that we _actually_ achieve this. In effect, we should be
actively optimizing for good program organization, rather than just focusing
on a negative: not optimizing for performance.

Otherwise, we end up optimizing for something other than performance, and other than good program organization.

A really bad approach is, for example, "optimizing for the minimum amount of
time I ever have to spend learning effective use of my programming language,
libraries, and existing frameworks in my project".

You get code that isn't performance optimized, avoiding the "root of all
evil", but it's garbage in other ways.

------
rubber_duck
I think his example using LINQ vs loops is not realistic - if you're using
arrays like he is, who's going to use LINQ with that? The only reason I would
specify a concrete type like that is if I cared about performance - otherwise
you'd just specify IEnumerable/IList/IReadOnlyList or whatever and then use
LINQ because it's cleaner. Use an abstract interface when you don't care about
performance at all - and IMO over 80% of code is like that: initialization
code, edge cases, stuff that gets touched in less than 0.01% of execution
time, where spending the time to optimize is simply not worth it.

He starts the article by judging laziness - after spending a lot of time on
stuff that ended up being irrelevant in retrospect, I wish I had been lazier
about this stuff.

------
mchahn
> First and foremost, you really ought to understand what order of magnitude
> matters for each line of code you write.

Isn't that exactly what the phrase means? Understanding where it is important
and where it isn't? At least that is what I always thought.

------
golergka
TL;DR: be careful with the word "premature". Knuth's quote is still correct.

------
rjurney
There are things you can do to scale well that you tend to learn only through
long experience of error, and that don't take a lot of time up front. You
don't spend much time on them, and these efforts bear fruit later. Then there
are things that do take a lot of time and are unwarranted. You have to avoid
these.

Knowing the difference is key, and this is why senior engineers should be in
charge of making architectural and design choices up front, and on an ongoing
basis. Of course, most businesses can't attract such people, as scalability is
not common knowledge outside major internet cities :(

------
amai
Given that performance is not as huge an issue as it used to be, I believe
that nowadays premature flexibilization is really the root of all evil:
[http://product.hubspot.com/blog/bid/7271/premature-flexibilization-is-the-root-of-whatever-evil-is-left](http://product.hubspot.com/blog/bid/7271/premature-flexibilization-is-the-root-of-whatever-evil-is-left)

------
shifter
Designing for big-O performance is a good thing to do while writing code.
Optimizations beyond that are typically an anti-pattern.

~~~
pjscott
As the author emphasizes, that depends on the speed requirements of your
software. There _are_ places where nanoseconds matter, just as there are
places where tens or hundreds of milliseconds don't.

------
colordrops
A lot of confusion could be saved by reframing the discussion. Instead, talk
about whether the performance characteristics of a particular choice are
understood or not. If not, then don't optimize until it is either understood
to be a problem through measurement or some other form of discovery.

------
pklausler
Rewritten: Keeping performance in mind when considering design alternatives is
never premature.

------
j45
In the case of MVP & Prototype development and maybe even the long run:

Clever architecture will always beat clever coding.

In the early stages, premature optimization can invite too much clever coding
and architecture.

There's no shortage of time spent building and optimizing a stack that largely
introduces overhead when the goal is to quickly iterate and solve a problem. I
guess this perspective also keeps in mind that you should likely throw away
the first version of whatever you build, because it uncovers _how_ the
architecture should be, and where, if anywhere, the clever coding and
optimization should go.

It's not to say optimization isn't worth thinking about. It's just not worth
obsessing over from a scale perspective, and experienced developers develop
clever architectural approaches and habits that buy their designs breathing
room as they grow.

The fundamental issue here is that every piece of software is meant to break
at a certain capacity, just like hardware. As the author very eloquently
mentioned, some areas you may come back to revisit and develop often, while
others you may never touch again, and those merit a different type of design
thought.

The mentors I have worked with have balanced the thought of being kind to your
future developer self in the present, and that can mean neither under- nor
over-engineering a solution.

Quite often the architectural design needs to be proven and verified before
building a lot around it. Spending more time on the schema and architecture to
ensure this is where I've found massive gains: optimization gets baked into
the bread with little development overhead other than planning and thinking a
bit more.

Quite often if I want to dive in to build a throw away prototype, I'll stop
myself and think of a plan. When I'm hesitant to build without a plan, I often
let myself prototype lightly to aid development of a plan.

Developing for the simplest common denominator in the early stages, so that as
many people as possible can participate in the learning and direction of the
solution, is extremely critical as well. When problems reach the 10-100
million row level, there will be a lot more to figure out than just
optimizing.

Quite often technologists get caught up in optimizing technical design and
code rather than users, their problems, or solving them. Maybe users need to
be the focus for technical developers, and technical understanding the focus
for non-technical developers who trivialize technical matters.

------
api
"Premature optimization is the root of all evil" is like "don't ever roll your
own crypto." It's "talking down" advice intended for programmers considered
less knowledgeable than the advisor.

Personally I think "talking down" advice is harmful and goes very much against
the pro-learning pro-self-education mindset of our industry. People either
ignore it, in which case it accomplishes nothing, or they obey it and it stops
people from learning or trying new things. It's also subject to a lot of
misinterpretation. The "premature optimization" quote is often misinterpreted
in practice to mean "never optimize or think about performance at all."

A better version of the premature optimization quote is:

"Don't sacrifice correctness, capability, good design, versatility, or
maintainability to optimization until you already have something that works
and you know what you need to optimize."

Another nuance on optimization is: "optimize through better algorithms before
you micro-optimize." Micro-optimization means tweaking out a for() loop or
implementing something with SSE, while picking a better algorithm means
picking something with O(N) over something with O(N^2). Picking a better
algorithm is often something you do "prematurely" during the design phase,
while micro-optimization is best left until the end.

A better version of the crypto quote is:

"Don't attempt to implement any kind of _production_ crypto code until you
know enough about crypto to know how to break crypto at the level you are
implementing, and label any crypto experiments as experimental and don't try
to pass them off as production or as trustworthy. Also make sure you are up on
the state of the art and can name e.g. the last few major attacks against a
major crypto implementation and can describe how they work."

If you can't meet those criteria then no, you should not be implementing
_production_ crypto (though you are free to play around). But that advice also
tells you what paths you need to go down if you want to learn enough to
attempt crypto and how to recognize when you might know enough to attempt
crypto. Can you explain exactly how BEAST, CRIME, POODLE, and DROWN work? Can
you tell me why crypto must be authenticated and why you should encrypt-then-
MAC instead of MAC-then-encrypt? If so, then maybe you're ready to swim in
that pool. Otherwise, learn.

~~~
aidenn0
The original premature optimization quote is not at all talking down. Roughly
3% of my code really is the fraction that benefits from micro-optimizations,
and the quote is explicitly about "small efficiencies." It is useful advice
for a novice and does not become less true as one gains in art.

"Don't optimize" would be the talking-down version.

"Don't optimize prematurely" is naturally tautological. "It is wrong to do X
prematurely" is true regardless of X; if it isn't wrong to do X at this point
in time, then doing X now isn't premature. It's closer to the pop-culture
version of the advice, and like any tautological advice it can always be
wielded against someone. It's worse than "talking down," which can at least
reduce the mental load on a novice; it's not useful at any stage, as it
provides no advice on when optimization is premature and when it isn't.

The pithy version of Knuth's quote might be "Don't microoptimize until you can
tell the difference between the 97% of code that doesn't need it and the 3% of
code that does" which is in line with pretty much the entirety of your
comment.

------
lugus35
The important thing is to use the right algorithm for the right task.

------
Tharkun
When in doubt, use your head.

------
falsedan

> I am personally used to writing code where 100 CPU cycles matters.

Not me, bucko. I'm used to writing code that, if it needs to go fast, I buy
more CPU time and run it in parallel.

~~~
kuschku
And that’s when you discover that (a) electricity isn’t unlimited, (b)
resources aren’t unlimited, (c) money isn’t unlimited, and (d) maybe you
should just save for the sake of efficiency.

~~~
vinceguidry
a) and b) can be so cheap relative to the total cost of the application and /
or the value that application produces that they might as well be unlimited.

Essentially, if you are running into electricity / resource constraints on,
say, an e-commerce website, then unless your design choices were absolutely
hideous, you are having a Very Good Problem.

Many programmers can spend their entire careers on building and maintaining
such apps. The infrastructure costs are outclassed by their salary by several
orders of magnitude. A decent website costs _millions_ to develop in total and
hundreds monthly to host. It can bring in several million in revenue every
year.

All this and it's still a sideshow to the main business. A site I maintain
does $3 million in business every year, whereas our retail partners do 7.

~~~
kuschku
Electricity prices aren’t globally the same; in some regions of Europe they’re
over $0.40/kWh.

And renting servers from AWS can end up being more expensive than paying
another dev and using dedicated systems.

