
Efficiency is fundamentally at odds with elegance (2013) - henning
https://yosefk.com/blog/efficiency-is-fundamentally-at-odds-with-elegance.html
======
phkahler
Symbolic representation vs. floating point as an elegance trade-off? The
suggestion of maintaining non-numeric representations falls flat very quickly
in a number of cases:

Roots of a 5th-degree polynomial. There is no general closed-form solution
that could be carried through other computations.

Integrals. There is no general method for symbolic integration; many
integrands have no elementary antiderivative at all. A physics simulation
cannot maintain closed-form solutions, and it would not be elegant even if it
could.

Representations of geometry - as in CAD programs. The intersection of 2 curved
surfaces is usually higher order than the surfaces in question and can balloon
very quickly.

Anything with sampled data - why even bother trying.

Sometimes numeric representations are necessary, not just some kind of trade.
Unless one thinks the non-existence of some fantasy ideal representation is a
kind of tragedy in need of fixing. And that seems to be his point at the end -
you'll get frustrated very quickly looking for an elegant solution that may
not even exist.
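
(To make the quintic point concrete: a rough sketch of the numeric fallback,
using a made-up polynomial and plain Newton iteration.)

    #include <cmath>
    #include <cstdio>

    // p(x) = x^5 - x - 1 has no closed-form root in radicals (Abel-Ruffini),
    // so the real root has to be approximated numerically.
    double p(double x)  { return std::pow(x, 5) - x - 1.0; }
    double dp(double x) { return 5.0 * std::pow(x, 4) - 1.0; }

    int main() {
        double x = 1.0;                         // initial guess
        for (int i = 0; i < 50; ++i) {
            double step = p(x) / dp(x);
            x -= step;
            if (std::fabs(step) < 1e-15) break;
        }
        std::printf("root ~= %.15f\n", x);      // ~1.1673039782614187
        return 0;
    }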

~~~
phaedrus
I recently started playing an Android game called Euclidea. It teaches the
basics of compass and ruler construction (geometry) and asks you to complete
various tasks such as bisecting an angle or finding a circle equally spaced
between four points, etc. The goal is to do so with a certain minimal number
of moves.

Having never taken a geometry class, but being an experienced programmer, I
was struck by the difference in the approaches I gravitate to versus the
geometer's approach the game seems to be trying to teach me.

For example, I might want to approach a point by successive approximation, but
that doesn't count because the game is looking for an exact solution - to say
nothing of the points-penalty incurred by using constructions of dozens of
moves where the game knows it can be solved by just 3 or 4.

Another example: to reach something I might start by defining a "coordinate
system", perhaps a strange triangular coordinate system defined by osculating
circles. In this case I _do_ reach the exact solution, eventually, but again
fail to get the most points for golfing the number of moves. This is because I
was thinking in layers of abstraction and drawing all the possible circles
first, and then saying "oh, here's the point(s) I needed" rather than just
getting from A to B.

I think it would be no stretch to associate the elegant/minimal geometric
constructions of this game with the symbolic approach the author of the
original post is advocating. The Euclidea game really opened my eyes to how
ingrained numeric successive approximation is in my approach to problem
solving, and how it contrasts with the symbolic, geometric approach. I don't
know if I'd change how I approach problem solving or programming, but I think
it's helpful to be aware both that you're doing it and that it's not the only
approach.

~~~
logfromblammo
Exact answers are nice, but sometimes you need to know whether two ellipses
_do or don't_ intersect, as fast as possible, and then do the more
computationally intensive math only for the interesting cases.

For instance, if the distance between centers is greater than the sum of the
major radiuses, the ellipses do not have a real area of intersection, and
imaginary areas are not useful for the problem at hand, so you stop
calculating and move on to the next pair. If the distance between centers is
smaller than the larger minor radius, there is a real area of intersection,
and you can put it on the list to calculate it later.

In seven steps, you can determine the desired answer to the limit of floating
point precision faster than the _exact_ symbolic answer determined by actually
solving the quartic as a "single step", which is actually encapsulating quite
a lot of multiplications, and at least six distinct conditional cases.

But on the other hand, the rapid approximation is the work of one afternoon,
and can be debugged one step at a time, while the exact answer is a doctoral-
level thesis.

And if you're drawing your ellipses on the surface of an ellipsoid, like the
WGS84 reference ellipsoid, the value of approximations becomes even greater,
because the exact answers will come from an even higher-order polynomial
equation.

In this, the "don't care" part of the problem space is always used to make the
shape of the solution as simple as possible. So when you always care about
exact answers, discovered elegantly, you're actually discarding the means to
make the solution more elegant in a way you might not have realized.

~~~
gugagore
What is the iterative method for determining the area of intersection?

~~~
logfromblammo
I don't remember the whole thing, and I couldn't take it with me, but what I
can recall is this:

It is much more important to determine whether there is a real intersection at
all, than it is to determine how big the area of intersection is.

Calculate distance between centers.

Add both major radiuses.

If distance >= sum, skip this pair. Intersection area is entirely imaginary.

If distance < max( minor radiuses ), the pair definitely has a real
intersection. Put it on the list for later and check the next pair. Upper
bound, min( ellipse areas ); lower bound zero.

Find the maximum extents of each ellipse, when projected onto the segment
between centers. If the sum is greater than the distance, there is a real
intersection. Put it on the list and go on to the next pair.
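
Roughly, in code (my reconstruction of just those first checks, with a made-up
Ellipse struct; the real thing was more involved):

    #include <algorithm>
    #include <cmath>

    // Hypothetical ellipse: center (cx, cy), semi-major a, semi-minor b.
    struct Ellipse { double cx, cy, a, b; };

    enum class Overlap { None, Definite, Unknown };

    // Cheap tests only; anything still Unknown goes on to the costlier
    // projection / intersection-point machinery described above.
    Overlap quickClassify(const Ellipse& e1, const Ellipse& e2) {
        double dist = std::hypot(e2.cx - e1.cx, e2.cy - e1.cy);

        // Distance >= sum of major radii: no real intersection at all.
        if (dist >= e1.a + e2.a)
            return Overlap::None;

        // One center lies within the other ellipse's minor radius, so it
        // is inside that ellipse: guaranteed real area of intersection.
        if (dist < std::max(e1.b, e2.b))
            return Overlap::Definite;

        return Overlap::Unknown;
    }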

Then there was a bit of a race against time through increasingly esoteric
tests that relied heavily on the calculations from previous steps. The goal
was to produce a list of ellipses, sorted by decreasing real area of
intersection greater than zero with a single reference ellipse. So if there
was only one on the list, you didn't even need to calculate area. That was the
answer, and you're done.

At some point, the reference ellipse was transformed into a circle at the
origin, with the center of the comparison ellipse on the x axis. Some tests
only used one quadrant of an ellipse at a time. Each test refined the bounds
on the fraction of the reference ellipse that could be in the intersection, so
again, if those limits were enough to sort the list, the area calculation was
skipped.

If there were still ambiguous cases, then finally you calculate the
points of intersection. For zero or one, the area is exactly the area of the
smaller ellipse. For four, first approximation is the area of the
quadrilateral, as minimum. For three, the area of the triangle, as minimum.
Two was the worst case, because then you needed to pick out a third point for
the triangle, which was probably one of the four where a major axis
intersected an ellipse. If you needed a maximum area, it was the smallest
circle enclosing the points. Each test either increased the lower bound, or
decreased the upper bound, until there was no more overlap in the sorted list.

After all that, if the list was still not clearly ordered, you finally
calculated and added chord areas to get exact results. You couldn't really get
here with real-world test data, so now you had to invent test data
specifically to make it this far, to be sure that if any real-world data set
ever does execute this branch, it ends up ordered correctly. But I don't think
we ever got funding to bulletproof it. Customer was fine with the possibility
that a tiny fraction of inputs might end up ordered incorrectly, so long as
the wrong answer came out quickly.

Clearly, if all the areas had to be calculated exactly, it would have been
more efficient to just do that from the start. But that wasn't the problem.

------
mattnewport
It's not my experience that efficiency is "fundamentally" at odds with
elegance. Elegance can be a hard concept to pin down and has a somewhat
subjective component but I often find situations where code can be made both
more efficient and more elegant. There are certainly cases where I can't see a
way to make the code more efficient without sacrificing elegance however. They
don't seem to be fundamentally at odds but neither are they always aligned.

I disagree with several of the specific examples given in the article but to
pick one, the author says "C++ rather obviously does pay with programmer
efficiency for runtime efficiency, without an option to opt out of the deal.
Every allocation can leak, every reference can be dangling, every buffer can
overflow, etc. etc. etc.". Most of the ways that modern C++ addresses these
problems inherited from C (unique_ptr, containers, ranges) are more elegant to
my taste, more convenient to use, and generally no less efficient.
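
To make that concrete, a trivial sketch (my own, not from the article) of the
modern forms:

    #include <cstdio>
    #include <memory>
    #include <vector>

    struct Widget { int id; };

    int main() {
        // Ownership is explicit; there is no matching delete to forget.
        auto w = std::make_unique<Widget>();
        w->id = 42;

        // The container owns its storage; no manual malloc/realloc/free,
        // no separately tracked length to get out of sync.
        std::vector<int> xs = {3, 1, 4, 1, 5};

        // Range-for can't run off the end the way an index loop can.
        int sum = 0;
        for (int x : xs) sum += x;

        std::printf("widget %d, sum %d\n", w->id, sum);
        return 0;
    }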

I find garbage collection very inelegant and the approach to memory management
taken by Rust and even C++ in most cases more elegant but I'm not clear if
that's something the author was intending to include in his above statement.
This seems somewhat subjective however. I know there are arguments some people
might make for garbage collection being elegant and even efficient in
comparison to manual memory management.

Where elegance and efficiency seem to have some natural alignment is in the
realm of simplicity and doing only what is necessary. Code can often be made
both more efficient and more elegant in my experience by distilling out the
essence of the problem and removing the unnecessary.

~~~
ben509
The underlying need is to traverse and update data structures, so language
designers went with universally mutable objects with arbitrary references.
That makes the representation of an individual object quite simple, and it
does centralize the complexity of memory management in garbage collection, so
it's a good engineering decision. But real GC requires extensive tuning
(different "generations", stop-the-world pauses, etc) and that's simply not an
elegant solution.

My approach was to revisit that underlying need, and I settled on a data model
that simply prohibits cyclic data structures, so memory management doesn't
have to be more complex than reference counting. Now, I'm still going to have
to implement zippers and paths and such to handle traversal, but I think that
will still be far cleaner. I have some notes here if you're curious:
[https://tenet-lang.org/types.html](https://tenet-lang.org/types.html)
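
In C++ terms, the general idea (just an illustration, not Tenet itself) is
that once data can only point "down", plain reference counting reclaims
everything and no tracing collector is needed:

    #include <memory>
    #include <string>
    #include <vector>

    // Acyclic by construction: a node only references its children, never
    // anything "above" it, so no cycle can hold a count above zero.
    struct Node {
        std::string value;
        std::vector<std::shared_ptr<Node>> children;
    };

    int main() {
        auto leaf = std::make_shared<Node>();
        leaf->value = "shared leaf";

        // A DAG is fine: two parents can share a child (its count is 2).
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->children.push_back(leaf);
        b->children.push_back(leaf);

        auto root = std::make_shared<Node>();
        root->children = {a, b};

        return 0;   // counts drop to zero in order; everything is freed
    }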

~~~
naasking
> My approach was to revisit that underlying need, and I settled on a data
> model that simply prohibits cyclic data structures, so memory management
> doesn't have to be more complex than reference counting.

Not sure I'd call reference counting elegant or efficient either. It has a lot
of problems, like spurious updates which kill performance.

If you've already ruled out cycles, then you've ruled out a large class of
expressions in your programming language. Perhaps you should revisit region-
based memory management to see if it's sufficient, as that will give you
optimal space and time use.
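
For reference, the core of a region/arena is just bump allocation plus freeing
everything at once; a toy sketch, not tied to any particular language design:

    #include <cstddef>
    #include <vector>

    // A toy bump-pointer region: an allocation is a pointer increment, and
    // the whole region is released in one shot when it is destroyed.
    class Region {
        std::vector<std::byte> buf;
        std::size_t used = 0;
    public:
        explicit Region(std::size_t bytes) : buf(bytes) {}

        void* alloc(std::size_t n,
                    std::size_t align = alignof(std::max_align_t)) {
            std::size_t p = (used + align - 1) / align * align;
            if (p + n > buf.size()) return nullptr;   // toy: no growth
            used = p + n;
            return buf.data() + p;
        }
        // No per-object free: ~Region() drops everything at once.
    };

    int main() {
        Region r(1 << 16);
        int* xs = static_cast<int*>(r.alloc(100 * sizeof(int)));
        for (int i = 0; xs && i < 100; ++i) xs[i] = i;
        return 0;   // region and all its allocations vanish here
    }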

------
rbranson
I don’t think the author really addresses the examples presented.

The example of the FAST decision tree is based on a code generator, an
abstraction which is presumably more elegant than the generated code which is
littered with goto statements.

And what about std::sort? Say what you want about C++ in general, but it is
definitely superior to a rudimentary qsort in C in terms of elegance and at
least equivalent in terms of runtime efficiency.

There are plenty of examples in either direction. It would seem that this is
yet another case of “it depends.”

~~~
diegoperini
Noob question: What does std::sort do (aside from sorting, of course)? Which
feature of it were you highlighting?

~~~
mattnewport
Compared to C style qsort what std::sort gives you are type safety,
convenience (types that already define a suitable operator< will use it
automatically without you having to explicitly create a comparison function),
correctness/convenience (you don't have to manually specify the number of
elements to sort, it's deduced from the range) and efficiency (the comparison
can be inlined by the compiler much more easily than in a C style qsort which
typically makes a meaningful performance difference for sorting).
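
Side by side, a minimal sketch:

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    // C style: element size and count are spelled out by hand, and the
    // comparator is an opaque function pointer the compiler can't inline.
    int cmp_int(const void* a, const void* b) {
        int x = *static_cast<const int*>(a);
        int y = *static_cast<const int*>(b);
        return (x > y) - (x < y);
    }

    int main() {
        std::vector<int> v = {5, 3, 1, 4, 2};

        std::qsort(v.data(), v.size(), sizeof(int), cmp_int);

        // C++ style: the range carries its own length, operator< is used
        // automatically, and the comparison can be inlined.
        std::sort(v.begin(), v.end());
        return 0;
    }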

~~~
jstimpfle
TBH the efficiency thing is irrelevant. I once compared std::sort and qsort on
large integer arrays (which is where you can expect the greatest speedup) and
the speedup was less than 2x on my machine. Furthermore the cases where
sorting could ever become a noticeable bottleneck are pretty rare.

Meanwhile each std::sort instantiation costs a few hundred bytes of machine
code and increases the project's compilation times.

When performance really matters, you tune your sort algorithm and
implementation to the data - you use a bucket sort, for example. Much larger
speedups than 2x are to be had this way; std::sort cannot do that.
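
For example, if the keys are known to be small non-negative integers, a
counting sort (one flavor of bucket sort) is a few lines and does no
comparisons at all (illustrative sketch):

    #include <cstddef>
    #include <vector>

    // Counting sort for values known to lie in [0, maxValue]:
    // O(n + maxValue) work instead of O(n log n) comparisons.
    void countingSort(std::vector<int>& v, int maxValue) {
        std::vector<std::size_t> counts(maxValue + 1, 0);
        for (int x : v) ++counts[x];

        std::size_t out = 0;
        for (int value = 0; value <= maxValue; ++value)
            for (std::size_t k = 0; k < counts[value]; ++k)
                v[out++] = value;
    }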

I like the type safety of std::sort vs qsort, though, even though I don't find
pursuing "type safety" a good idea in general.

~~~
mattnewport
I don't consider a 2x speedup "irrelevant". It's also not the case that large
integer arrays are where I would expect the greatest speedup. The greatest
speedup would be on arrays that fit in L1 cache where function call overhead
is going to be the most significant factor rather than cache misses.

Compile times and code bloat with templates are genuine issues but they are
being improved both with improved compilers and linkers and with C++ standard
changes (modules should be a big help for compile times).

Type safety is in my opinion a good example of elegance and efficiency being
generally aligned. Typically improving type safety makes code more elegant,
catches bugs and give opportunities for better efficiency in my experience.

~~~
jstimpfle
Again, it was an artificial benchmark. I sorted millions of integers. 2x is
not measured in a real-life program; it's the absolute upper limit of what you
can ever expect. Most programs don't spend a noticeable amount of time in
sorting at all. Program performance is usually dominated by other things, like
I/O. Now what is 50% of "not a noticeable amount of time"? Right: it's
irrelevant.

But 2x is a hard number, so people tend to think it's important and forget
about all the disadvantages which are hard to measure but have far more
ramifications.

And for the rare cases where a sort really matters, you should use an
implementation that is tuned to the data. That will bring you larger speed-
ups. I could have said "10x" to sound impressive, and it wouldn't be wrong in
most cases, but it's simply not possible to make a blanket statement. It
depends. It could be more than 10x.

Even better, you can often construct the data so it falls out sorted. Speed-
up: INFx. Machine code and compile time: Zero.

It's the same story for types in general. They lead to wrong, bloated designs
and boilerplate code when you overdo it, which prevents the important
optimizations that could simplify program structure enormously and kill many
more bugs along the way.

> The greatest speedup would be on arrays that fit in L1 cache where function
> call overhead is going to be the most significant factor rather than cache
> misses.

The smaller you make the data, the less time it takes to sort it with whatever
implementation, the less important the implementation is.

Furthermore, shouldn't we expect most sorting applications to be pretty cache-
friendly? E.g. Quicksort scans the array sequentially, log n times.

~~~
Jtsummers
> Most programs don't spend a noticeable amount time in sorting at all.

I think a more important point is that most programs that depend on sorted
data will use a data structure which keeps itself sorted rather than trying to
sort lists/arrays of such large sizes (especially when exceeding a certain
threshold of data records). Consider: if you're getting millions of records
and need them in some order (potentially different from how they arrive), it
makes more sense (especially with the IO delays) to spend the extra time
inserting into a sorted structure while waiting for the next piece of data to
arrive. This interleaves the two activities rather than waiting for all IO to
finish and then sorting a massive list (which forces them to be serialized).
The concurrent version can benefit from the inherently parallel nature of
dealing with IO-bound and CPU-bound activities.
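
Sketch of the interleaved version (the IO source here is a stand-in, not real
code):

    #include <cstddef>
    #include <set>

    // Stand-in for a blocking IO source; returns false when the stream ends.
    static bool nextRecord(int& out) {
        static const int stream[] = {42, 7, 19, 3, 25};
        static std::size_t i = 0;
        if (i == sizeof(stream) / sizeof(stream[0])) return false;
        out = stream[i++];
        return true;
    }

    int main() {
        // Records are kept ordered as they arrive, so the CPU work of
        // ordering overlaps with waiting on IO instead of following it.
        std::multiset<int> sorted;
        int rec;
        while (nextRecord(rec))
            sorted.insert(rec);      // O(log n) per arrival
        return 0;
    }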

~~~
jstimpfle
I'm not positive that most sorts are "online" with I/O that reads the data to
be sorted. It's not what I wanted to imply anyway.

The kind of I/O I meant is just generally time consuming I/O (large amounts of
data, too many small requests, too much synchronization) that most
applications are doing. For other, non-I/O intensive applications, most of
them likely have bottlenecks other than sorting.

In any case I wouldn't want to tie my data to an incremental sorting structure
(like a search tree instead of a flat array) before I've ensured that this
online aspect was critical. If it is not, choosing a more complicated
incremental structure is probably premature optimization which constrains the
project structure, and thus is likely to negatively affect performance and
maintainability in the end.

------
edjroot
I recently watched a documentary where Dijkstra says the opposite:

[https://youtu.be/RCCigccBzIU?t=1125](https://youtu.be/RCCigccBzIU?t=1125)

"One of the things I discovered as early as the 1960s is that mathematical
elegance is not a question of aesthetics, of taste or fashion, but something
you can translate into a technical notion.

"The Concise Oxford Dictionary gives as one of the meanings of 'elegant'
'ingeniously simple and effective'.

"In practice a program is manageable if you make a truly elegant program,
firstly because it is shorter than most alternatives and consists of discrete
parts, each of which you can replace by an alternative implementation without
influencing the rest of the program, but also, curiously, the most elegant
programs are often the most efficient."

------
bwoj
I've seen this tradeoff firsthand with dsp video algorithms. The naive code
just implements the algorithm straight away. The performant version has to
ensure that the inner loop all fits in cache while running. It also does
tricks like prefetching data into cache so the code doesn't stall on a data
load. These sorts of tricks really impact the readability of the code.
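
The prefetch trick looks roughly like this (GCC/Clang builtin; the lookahead
distance of 8 is a made-up tuning knob):

    #include <cstddef>

    // Sum with software prefetch: the readable version is just the loop
    // body; the __builtin_prefetch call exists only to hide memory latency.
    long long sumWithPrefetch(const int* data, std::size_t n) {
        const std::size_t kAhead = 8;   // workload-specific tuning parameter
        long long total = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (i + kAhead < n)
                __builtin_prefetch(&data[i + kAhead]);  // GCC/Clang extension
            total += data[i];
        }
        return total;
    }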

------
yarg
I wouldn't say elegance here - there is something fundamentally elegant about
the utilisation of 8-bit integers in scenarios where they can be used.

It is, on the other hand, fundamentally at odds with accuracy - which is
already well and widely known (at the very least implicitly) and is the reason
we so readily accept approximations in computing.

------
cosmolev
Those are apples and oranges. Efficiency can be measured, elegance is
subjective.

------
jonahrd
I've been writing firmware to run a DC motor controller where this
efficiency/elegance tradeoff is very apparent. I can write "slower" code in a
very elegant way, such as reading the throttle input. But when it comes to
implementing the high-frequency control loop with all its ADC readings and
filtering, I have to trade much of my elegance for raw efficiency.
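
A typical instance of that trade, as a generic sketch rather than the actual
firmware: the readable floating-point filter versus the fixed-point one you
end up with in the fast loop:

    #include <cstdint>

    // Readable version: exponential moving average in floating point.
    float emaFloat(float prev, float sample, float alpha) {
        return prev + alpha * (sample - prev);
    }

    // Fast-loop version: the same filter with alpha = 1/2^k, done with
    // integer math and a shift so it stays cheap at ISR rates on a small MCU.
    int32_t emaFixed(int32_t prev, int32_t sample, unsigned k) {
        return prev + ((sample - prev) >> k);
    }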

------
js8
There is a lot of truth to that, although I also see a third trade-off, which
is robustness to change (not so much in the code but rather the inputs).

I think in some cases you can have both efficiency and elegance, but then the
solution will be either so abstract or so specialized that it won't really be
practical, and a small change in problem conditions will cause it not to work.
(An example from the article - a matrix solver that only handles
well-conditioned systems, so the floating point errors never accumulate.)

Similarly, you can have robustness to change and elegance without efficiency
(mathematically beautiful code), and robustness to change and efficiency
without elegance (lot of copy pasting the code to deal with special cases).

In any case, I think we are generally trying to present elegant code to humans
(using more abstract programming languages) and then use compilers to convert
the elegance into efficiency. But it requires that somebody takes the pains to
actually deal with all the possible cases (and write a working compiler).

------
blablabla123
I find this discussion tiring; I have actually seen people vividly defend
both positions: that the most efficient code is also very elegant, and the
opposite, as here.

It depends so much on the situation, skill/experience of the person/team
building the stuff and what focus one has.

Sometimes an elegant solution opens completely new perspectives on performance
optimizations.

------
yters
Symbolic manipulation is undecidable, hence programmers prefer numeric
approaches. But, symbolic approaches are clearly superior if the programmer
does them a priori. I can convert a completely intractable brute-force
combinatorics problem into a very tractable algorithm with some human symbolic
preprocessing (i.e. math).
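
A toy illustration of the effect (not from the article): counting the pairs
that sum to an even number can be brute-forced in O(n^2), or reduced on paper
to a closed form that runs in constant time:

    #include <cstdio>

    // Brute force: try every pair. Fine for small n, hopeless for large n.
    long long evenSumPairsBrute(long long n) {
        long long count = 0;
        for (long long i = 1; i <= n; ++i)
            for (long long j = i + 1; j <= n; ++j)
                if ((i + j) % 2 == 0) ++count;
        return count;
    }

    // The same count after doing the combinatorics by hand: a pair sums to
    // an even number iff both are even or both odd, so it's C(e,2) + C(o,2).
    long long evenSumPairsClosed(long long n) {
        long long evens = n / 2, odds = n - evens;
        return evens * (evens - 1) / 2 + odds * (odds - 1) / 2;
    }

    int main() {
        std::printf("%lld %lld\n",
                    evenSumPairsBrute(1000), evenSumPairsClosed(1000));
        return 0;
    }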

------
dmichulke
> Efficiency is fundamentally at odds with elegance

Strangely, though, their absence coincides in roughly 80% of all programs, so
"being at odds with" is meant as

(Efficiency -> NOT Elegance) AND (Elegance -> NOT Efficiency)

but not as

Elegance XOR Efficiency

~~~
ptero
What you are saying is that there are a lot of programs that are neither
elegant nor efficient. Few would dispute this, but I think this case is still
covered by the original formulation.

For example I do not see "being drunk is fundamentally at odds with performing
surgery" being invalidated by a bunch of folks writing programs in offices and
fitting in neither bin. Caveat: I am not a native English speaker.

------
sebringj
Here's an example that made sense to me.

elegant (readable / compact):

    function isPalindrome(str) {
      return str == str.split('').reverse().join('');
    }

efficient (~50x faster than above):

    function isPalindrome(str) {
      var len = Math.floor(str.length / 2);
      for (var i = 0; i < len; i++)
        if (str[i] !== str[str.length - i - 1]) return false;
      return true;
    }

I would probably say elegance and efficiency are unaligned rather than at
odds, IMO: they may or may not overlap, since they don't have the same goals.

~~~
kazinator
Does just str.reverse() not work, by the way?

~~~
sebringj
I didn't see that on MDN, but it would be cool. When I try 'abc'.reverse() I
get an exception saying it is not a function.

------
gameswithgo
I would love to see the author revisit this today. What would they think of
Rust, or the current state of C#, for instance?

Would F# or Nim be a much faster Python with less ugly C FFI?

------
szemet
Or maybe it's just
[https://en.wikipedia.org/wiki/Berkson%27s_paradox](https://en.wikipedia.org/wiki/Berkson%27s_paradox)

------
hardlianotion
I must say, I disagree with the very first paragraph of this article. You need
numerical recipes for any real-world problem whether you have full precision or
not.

------
m4r35n357
Bad first example. Automatic differentiation in arbitrary precision is as
elegant as symbolic, and as efficient as "floating point".
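
For anyone unfamiliar, forward-mode AD with dual numbers is only a few lines;
a minimal double-precision sketch (arbitrary precision would just swap in a
bignum type):

    #include <cstdio>

    // Dual number: value and derivative carried together. Arithmetic on
    // duals yields exact derivatives of the same code path, with no
    // symbolic expression swell and no finite-difference error.
    struct Dual { double v, d; };

    Dual operator+(Dual a, Dual b) { return {a.v + b.v, a.d + b.d}; }
    Dual operator*(Dual a, Dual b) {
        return {a.v * b.v, a.d * b.v + a.v * b.d};
    }

    // f(x) = x^3 + 2x, written once, differentiated automatically.
    Dual f(Dual x) { return x * x * x + Dual{2, 0} * x; }

    int main() {
        Dual x{3.0, 1.0};                    // seed dx/dx = 1
        Dual y = f(x);
        std::printf("f(3) = %g, f'(3) = %g\n", y.v, y.d);   // 33 and 29
        return 0;
    }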

------
socketnaut
This just seems like survivorship bias. An algorithm / tool / technology tends
not to remain in use if something that is both more elegant and more efficient
is available. Therefore, if you survey the well known options you'll see a
negative relationship between elegance and efficiency. It doesn't imply a
general relationship nor does it preclude the discovery of an approach that is
both more elegant and more efficient than anything available today.

------
someguy1234567
Simplicity is usually efficient and elegant.

