
Fast software is a discipline, not a purpose - frostmatthew
https://lemire.me/blog/2017/11/16/fast-software-is-a-discipline-not-a-purpose/
======
jackjeff
“Avoid multiple passes over the data when one would do.”

Totally disagree. Unless performance is actually an issue (i.e., I'm dealing
with a non-trivial number of elements), I would rather use functional
programming approaches to get the data into shape (think map/filter/reduce).
These approaches typically pass over the data multiple times, and often make
multiple copies, but they produce readable, less error-prone code. Doing the
performant thing means writing a for loop (or several of them, unless you fuse
them into one), and you automatically end up with a large custom loop body
performing multiple transformations on the data. That's wasted mental effort
when you write it, and again every time you happen to read it.
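
To make the trade-off concrete, a minimal sketch (Kotlin, with a made-up Order
type): the functional version makes three passes and two intermediate copies;
the loop makes one pass and none.

    data class Order(val isPaid: Boolean, val amount: Double)

    // Three passes and two intermediate lists, but each step is obvious.
    fun totalFunctional(orders: List<Order>): Double =
        orders.filter { it.isPaid }
              .map { it.amount }
              .sum()

    // One pass and no copies, but a custom loop body to read and maintain.
    fun totalImperative(orders: List<Order>): Double {
        var total = 0.0
        for (order in orders) {
            if (order.isPaid) total += order.amount
        }
        return total
    }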

Most collections/arrays I process have a trivial number of elements (<100),
and you get virtually no benefit from writing optimized code at that scale.

I have also written a lot of performant code, and the lesson is always:
benchmark, benchmark, benchmark. The results do not always fit your mental
model of what should be fastest. In particular, avoiding cache misses is far
more important than you might think.

~~~
AstralStorm
Readable and less error prone? As opposed to, say, provably correct? Recursive
functional code is a pain to prove correct and to show it won't blow up the
stack. Multiple passes (when interspersed with other accesses) can blow up the
cache.

Even calling through a lambda is a cost compilers cannot easily optimize away
unless you help them. (Even in C++.)

You do get a benefit from optimizing all the "trivial data size" code when
every function is written in such a silly way: a difference of 10% across the
whole application, as opposed to in a single function.

Benchmarking is often even harder to do right than writing good tests. (In
fact, it is tied to it.) Instead of benchmarking, be a real computer scientist
and prove bounds on runtime and memory allocations.

~~~
tigershark
How is it difficult to prove whether a recursive call is tail-recursive or
not? If it's tail-recursive and your language supports tail-call optimisation,
then you've proven that there will be no stack blow-up.
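
Some languages will even check the claim for you. A small sketch in Kotlin,
whose `tailrec` modifier makes the compiler rewrite the recursion as a loop
(and warn you if the recursive call isn't actually in tail position):

    // The recursive call is the last thing the function does, so the
    // compiler turns it into a loop: the stack cannot grow.
    tailrec fun gcd(a: Int, b: Int): Int =
        if (b == 0) a else gcd(b, a % b)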

~~~
AstralStorm
For simple calls, it isn't. For something more involved? (Dependent functions,
corecursion, partially stateful code.) Very hard, which is why compilers fail
to optimize it in general.

------
ssijak
"When people train, they usually don’t try to actually run faster or lift
heavier weights. As a relatively healthy computer science professor, how fast
I run or how much I can lift is of no practical relevance. "

This is really strange. If you don't try to lift heavier weights you will not
make progress. It is called progressive overload. I don't want to go into the
details of training, but if you do not track and improve your training (with
whatever goal you have in mind; there are different kinds of progression) you
are the same as the programmer who writes bad code, knows it is bad, but does
not care.

~~~
zimpenfish
> If you don't try to lift heavier weights you will not make progress.

That depends on what you're aiming for - if you're aiming for tone rather than
bulk, you'd go for more reps of the same weight rather than stepping up the
weight with the same reps, wouldn't you?

(Although I guess, technically, that does count as "heavier" since you're
still moving more weight in your sets.)

~~~
tomtomau
The general consensus is that there's not really such a thing as 'tone' -
usually that's just a matter of low body-fat percentage, which shows your
muscles more prominently.

What you're getting at, though, is balancing hypertrophy and strength goals:
someone may wish to increase their reps to, say, 8-20, which studies have
shown increases muscle size but is poorer for progressing in strength (where
3-8 reps is more effective).

------
blux
Regarding the point "Don't use floating-point operations when integers will
do": I agree that using integer arithmetic can result in cleaner,
easier-to-reason-about code where applicable. It then follows that, because of
these traits, the software will be faster.

But integer arithmetic is not necessarily faster than floating-point
arithmetic in general; see for example:
[https://youtu.be/3K2LmnaLLF8?t=31m10s](https://youtu.be/3K2LmnaLLF8?t=31m10s).
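
The cleanliness half of the claim is easy to illustrate with money (a
throwaway Kotlin sketch):

    fun main() {
        // Integer cents: exact arithmetic, trivially comparable.
        val priceCents = 1999L
        val taxCents = priceCents * 8 / 100  // 159, deterministic truncation
        println(taxCents)

        // Doubles: the classic rounding surprise.
        println(0.1 + 0.2 == 0.3)  // prints false
    }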

~~~
gopalv
> But integer arithmetic is not necessarily faster than floating point
> arithmetic in general;

GPUs were the first place where I felt floats got a much better pathway than
any other data type - graphical pixel manipulations can also get away with
much higher arithmetic error, because it's all going to get rounded down to a
pixel eventually.

On the other hand, the only place where I've really worked heavily with fixed
point was a graphical interface toolkit (the MIDP-on-ARM stack I was working
with would've had to use softfp for floats, so fixed point was hugely
superior).
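
For anyone who never touched it: fixed point is just integer arithmetic with
an implied binary point. A minimal Q16.16 sketch (in Kotlin for legibility; on
that hardware it would have been C or assembly):

    const val FRAC_BITS = 16  // Q16.16: implied binary point after bit 16

    fun toFixed(x: Int): Int = x shl FRAC_BITS

    // Widen to Long so the 32x32-bit product doesn't overflow.
    fun fixedMul(a: Int, b: Int): Int =
        ((a.toLong() * b.toLong()) shr FRAC_BITS).toInt()

    fun main() {
        val twoAndAHalf = toFixed(5) / 2            // 2.5 -> 163840
        println(fixedMul(twoAndAHalf, toFixed(2)))  // 327680 == toFixed(5)
    }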

~~~
AstralStorm
Come on, all reasonable ARM chips have NEON and VFPv3 (including Cortex-M);
about the only place I saw something that didn't was an old internet router -
hardware completely unsuited to maths.

Softfp does not preclude the use of floating point math at all. It means you
pay an extra cost when passing floats as arguments. (Which means you should
rather use a hardfp math library if possible.)

~~~
gopalv
> Come on, all reasonable ARM chips have NEON and VFPv3 (including Cortex-M);
> about the only place I saw something that didn't was an old internet router

ARM MIDP UI code in 2003-2004 - a very interesting chip (the ARM926EJ-S [1])
... not that I ever invented anything new to do all this; all I did was learn
and repeat tricks from 1980s arcade machines.

[1] -
[https://en.wikipedia.org/wiki/Jazelle#BXJ:_Branch_to_Java](https://en.wikipedia.org/wiki/Jazelle#BXJ:_Branch_to_Java)

------
AstralStorm
There is indeed a point where being fit and clean turns into an obsession.
There is such a thing as premature optimization.

Commonly, though, the mistakes that cost performance are caused in part by bad
design, typically arrived at either by not caring or by never replacing a
quick-and-dirty prototype with something reasonable. Or by "brute force" bug
fixing in concurrent code.

If I got paid for every unnecessary or replaceable synchronized statement in
Java, I'd be rich. Likewise for every fat data copy made through a queue.
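
The replaceable kind often looks something like this (a made-up Kotlin/JVM
sketch, not from any real codebase):

    import java.util.concurrent.atomic.AtomicLong

    // Before: every increment takes the monitor.
    class SyncCounter {
        private var count = 0L
        fun increment(): Long = synchronized(this) { ++count }
    }

    // After: a lock-free atomic, much cheaper under contention.
    class AtomicCounter {
        private val count = AtomicLong()
        fun increment(): Long = count.incrementAndGet()
    }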

~~~
tluyben2
> There is indeed a point where being fit and clean turns into an obsession.
> There is such a thing as premature optimization.

There is also such a thing as no optimization (at any point), and I see that
far more often in the wild. I think more devs should run their app on a cheap
Android phone or a cheap entry-level consumer laptop to see what happens when
it's not running on an octa-core i7 with 32 GB of RAM. I recently got an
entry-level machine from a friend to play around with; it has 4 GB and a flash
drive, and it is eerie how slow the thing is. This is what is sold at Walmart
etc. at the entry price level, so it's fair to say that if you are a B2C
software dev, this is your audience.

For fun, I tried to do some development on it. Atom and VS Code both started
snappily, but after devving on them a bit they rapidly became unusable (CPU
constantly at 100%, and characters appearing seconds after you typed them);
emacs was OK; vim did well. You cannot run Electron on that, and yet that's
what many people deliver. My suspicion is that most people who use it don't
optimize (if you are fast-software minded, you wouldn't use it in the first
place - not because it's inherently bad, but because there are too many moving
parts you cannot influence). Please try it on these machines to know what you
are doing to your users :)

------
delta1
1. Make it work

2. Make it right

3. Make it fast

 _" Programmers waste enormous amounts of time thinking about, or worrying
about, the speed of noncritical parts of their programs, and these attempts at
efficiency actually have a strong negative impact when debugging and
maintenance are considered. We should forget about small efficiencies, say
about 97% of the time: premature optimization is the root of all evil. Yet we
should not pass up our opportunities in that critical 3%."_ \- Donald Knuth

~~~
hasenj
This quote is misunderstood and abused all the time.

The point of the statement is to not micro-optimize this and that corner of
the program without having any data to guide you on what you should optimize
and how.

It's not a rallying call to forgo all concerns about application performance
and efficiency.

The mis-application of this quote is the root of all evil in modern software
IMO.

It's why a chat program takes 500 MB of RAM and why many programs take 10
seconds to load even though there's no technical reason they cannot start up
instantly.

You should make it right and fast from the very start. You should not make
something that "works" but is full of bugs and slow as hell. Like, that makes
no sense at all. If it's full of bugs then it doesn't work. If it's slow as
hell then it doesn't work.

1. Make it work correctly and efficiently (to a reasonable degree)

2. Make it even faster

3. Make it better

~~~
zimpenfish
> 1. Make it work correctly and efficiently (to a reasonable degree)

Unfortunately, this took your team 6 months, and the people who launched their
app 4 months ago (having opted for "make it work") now have 90% of your
market, and you've all just been made redundant, because customers do not care
whether your app is "more correct" and "more efficient" when they just plain
couldn't use it.

There's a reason "worse is better" applies to software.

~~~
hasenj
There's no evidence that any of this is true.

First to market is not often the winner. Facebook came very late to the social
media scene, but it dominated.

Google came very late to the search engine scene too. People don't even
remember this, but there was a time when there were a ton of search engines,
and anyone at the time would probably have thought the search engine market
was saturated and there was no space for a new product.

All the evidence actually points to better products dominating the market even
if they come late.

If your competitors released their app 4 months ahead of you, but it was full
of bugs and always hanging, and then you release a product that actually works
and performs well, people will see your product as a breath of fresh air.

Another example is the Chrome browser, which came at a time when Firefox and
IE were competing fiercely for market share, and Chrome completely dominated
them on the simple basis that it was _really fast_.

For most products it doesn't even cost 6 months to make it fast. If you have
that as a goal from the very start, there will never be a stage of "omg it's
really slow let's try to make it a bit faster". It will always just be fast.

~~~
zimpenfish
> Chrome completely dominated them on the simple basis that it was _really
> fast_.

It isn't that simple - a lot of Chrome's success came because they "made it
work" in ways that Firefox (horribly wasteful of CPU, memory, and battery
life) and IE (a shambles in every department) didn't. It was also helped in
large part by an enormous web monopoly pushing it and favouring it for their
properties.

But you're definitely right that it came after them.

------
twic
> But then, I dress cleanly every single day even if I stay at home. And you
> should too.

Nope.

It's good to know how to produce fast software, but it's also good to know
when to do so. Lemire sounds like he has some psychological issues to work
out.

------
majikandy
I’ve spent the last 20 years not caring about performance. Not once has
performance ever been a real issue on any project I’ve been on. Imagine how
much time I could have wasted prematurely optimising.

~~~
speedplane
You've never written any code to run in parallel? Or used a data structure or
algorithm with O(log N) time instead of O(N) time?
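
Even the everyday version of that choice counts; a trivial Kotlin sketch:

    // O(n): scan every element.
    fun containsLinear(xs: List<Int>, key: Int): Boolean =
        xs.any { it == key }

    // O(log n): binary search, provided the list is kept sorted.
    fun containsBinary(sorted: List<Int>, key: Int): Boolean =
        sorted.binarySearch(key) >= 0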

~~~
majikandy
I probably have, without particularly prioritising it. Ok, perhaps I should
have said "overly caring about performance". Of course I have used built in
functions in languages that take advantage of this when searching or sorting -
but no, I can't remember a time when I had to get into the nitty gritty of low
level performance enhancement, unless it was in a coding challenge of course.

------
mbrock
For most people, "seeing inefficient code" is basically hallucinating. There's
real truth to the mantra about premature optimization. I've seen people worry
so many times about the performance of something, thinking that some code
_looks slow_, when in actual reality it's a complete non-issue.

Software development is all about compromises, just like any kind of real
world engineering. Building a stage for a weekend festival is different from
building a bridge over a river. We're always on a budget, so if I can work
much faster and safer and lose some marginal efficiency, I'll do it—unless
marginal efficiency is what I'm competing on, which it usually isn't.

------
a_imho
_There is a reason we don’t tend to hire people who show up to work in dirty
t-shirts. It is not that we particularly care about dirty t-shirts, it is that
we want people who care about their work._

I don't quite agree that the cleanliness of clothing is a good proxy for one's
craftsmanship or personality, even where t-shirts are an acceptable garment at
a particular workplace.

------
agnivade
It may be different in the US. But in some Asian countries, most people care
about the money at the end of the day.

So yeah, they don't care. But does it matter? People are happy as long as they
get paid at the end of the day. They don't care about the quality of the code
they write.

Not saying it's a good thing or a bad thing. It's just the way it is.

------
foreigner
I think of myself as a software craftsman. Just like a master carpenter, the
things I make should be not only functional but beautiful as well. Of course
we should "care" about our work.

However, there are other metrics besides performance that are worth caring
about, e.g. readability and maintainability.

~~~
mercer
Meeting deadlines, getting paid, whether anyone will ever care about the code
again, etc.

I also try to be a 'craftsman', and I try to pick work where that's an option,
but I find that in practice, more often than not, I have to care about various
'other metrics' to the detriment of craftsmanship.

Although I suppose being skilled at managing these various metrics could in
itself be considered craftsmanship. And perhaps that mindset helps a bit with
not getting utterly depressed at some of the shit I'm forced to deliver...

------
michaelg7x
Code bloat could easily be sorted out by forcing people to read the generated
assembly of their creations.

~~~
reikonomusha
This is easy and convenient in Common Lisp with the DISASSEMBLE built-in
function.

------
nnq
Blah. You get a lot of _performance "for free"_ by getting the overall
_architecture of a system right_. By starting with optimizations, you usually
hit the "limit of your brainpower" very soon, before you know enough to
architect things right, so you _mess up the architecture_ (think "shit, we
ended up with this location-dependent time-conversion logic in the backend
instead of on the clients, so now it's un-cacheable, but it's too late to
refactor across multiple teams... shit").

Just _avoid building un-optimizable systems_ (nowadays that's harder than ever
... all the tools and services lay traps for us), and then you can safely
leave the optimizations for later.

If you can't avoid building an un-optimizable system, all your clever early
optimizations are worthless anyway. If you can, they can safely be postponed.

Now, _what I'd really want to read is an article or book about avoiding the
traps that lead people to get stuck with un-optimizable systems!_ (Though I
imagine it's an unsolvable problem in general, given that natural evolution
didn't solve it either: at some point your body accumulates enough
de-optimizations that it basically stops working or being fixable/optimizable,
so you die, and some of your memes and genes carry on in "rewritten forks",
a.k.a. other newly born people.)

