Efficiency is fundamentally at odds with elegance (2013) (yosefk.com)
90 points by henning on Dec 18, 2018 | 43 comments



Symbolic representation vs. floating point as an elegance trade-off? The suggestion of maintaining non-numeric representations falls flat very quickly in a number of cases:

Roots of a fifth-degree polynomial. There is no closed-form solution that could be carried through other computations.

Integrals. There is no general method for symbolic integration. A physics simulation cannot maintain closed form solutions and it would not be elegant even if you could.

Representations of geometry - as in CAD programs. The intersection of 2 curved surfaces is usually higher order than the surfaces in question and can balloon very quickly.

Anything with sampled data - why even bother trying.

Sometimes numeric representations are necessary, not just some kind of trade. Unless one thinks the non-existence of some fantasy ideal representation is a kind of tragedy in need of fixing. And that seems to be his point at the end - you'll get frustrated very quickly looking for an elegant solution that may not even exist.
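To make the quintic example concrete, here is a small SymPy sketch (my own illustration, not something from the article): ask for the roots symbolically and you get CRootOf placeholders rather than a radical formula, and using them downstream means going numeric in the end anyway.

  import sympy as sp

  x = sp.symbols('x')
  roots = sp.solve(x**5 - x + 1, x)
  print(roots)             # [CRootOf(x**5 - x + 1, 0), ...]: no formula in radicals
  print(roots[0].evalf())  # ~ -1.1673, the one real root, evaluated numerically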


I recently started playing an Android game called Euclidea. It teaches the basics of compass and ruler construction (geometry) and asks you to complete various tasks such as bisecting an angle or finding a circle equally spaced between four points, etc. The goal is to do so with a certain minimal number of moves.

Having never taken a geometry class, but being an experienced programmer, I was struck by the difference in the approaches I gravitate to versus the geometer's approach the game seems to be trying to teach me.

For example, I might want to approach a point by successive approximation, but that doesn't count because the game is looking for an exact solution - to say nothing of the points-penalty incurred by using constructions of dozens of moves where the game knows it can be solved by just 3 or 4.

Another example: to reach something I might start by defining a "coordinate system", perhaps a strange triangular coordinate system defined by osculating circles. In this case I do reach the exact solution, eventually, but again fail to get the most points for golfing the number of moves. This is because I was thinking in layers of abstraction and drawing all the possible circles first, and then saying "oh, here's the point(s) I needed" rather than just getting from A to B.

I think it would be no stretch to associate the elegant/minimal geometric constructions of this game with the symbolic approach the author of the original post is advocating. The Euclidea game really opened my eyes to how ingrained numeric successive approximation is in my approach to problem solving, and how it contrasts with the symbolic, geometric approach. I don't know if I'd change how I approach problem solving or programming, but I think it's helpful to be aware both that you're doing it and that it's not the only approach.


Your comment reminds me of frustrations I ran into when going the opposite direction in college. I majored in mathematics (focusing on pure math), where the goal is typically to produce a solution which is as elegant and readable as possible. In the couple of programming classes I took for my minor, I tended to follow this same approach. If I could write a sexy three-line recursive statement that solved the problem in O(n^2) rather than some convoluted nest of loops and tests that did it in O(n log n), I wanted to go with the first one! It was really hard to make the mental shift from trying to produce something elegant and concise to something fast and memory efficient.


It's funny; lately I've been diving into a lot of math I haven't used in a while (via the Coursera "Robotics Specialization"). Since I don't particularly trust myself to do huge amounts of symbolic manipulation/integration/etc. on paper, I've picked up Maxima as a CAS to try to keep track of it all. More than once, I've gotten some set of equations into a form that I'm ready to use to do numerical iteration/gradient descent/whatever, and Maxima just kind of... pops out a closed form solution for the problem I'm trying to solve.

My basic approach to straddle the symbolic/numeric divide is to keep things symbolic until the last minute and then substitute in the boundary conditions/constants/whatever at the end and see what pops out. If Maxima pops out a closed form solution, I'll analyze that geometrically to make sure it makes sense and isn't just a quirk of the specific boundary conditions I've chosen, and if it doesn't really have a closed form solution, I'll use the simplified equations and run them through a solver of whatever kind (ODE, gradient descent, etc).
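For the curious, here is roughly what that "symbolic until the last minute" workflow looks like, sketched with SymPy standing in for Maxima (the projectile example and the numbers are mine, purely for illustration):

  import sympy as sp

  t = sp.symbols('t')
  g, v0, h0 = sp.symbols('g v0 h0', positive=True)

  # Work symbolically: when does a projectile launched upward hit the ground?
  h = h0 + v0*t - sp.Rational(1, 2)*g*t**2
  t_hit = sp.solve(sp.Eq(h, 0), t)     # closed forms in g, v0, h0

  # Substitute the boundary conditions/constants only at the very end.
  numeric = [sol.subs({g: 9.81, v0: 5, h0: 10}).evalf() for sol in t_hit]
  print(t_hit)
  print(numeric)   # one root is negative (unphysical), the other is ~2.03 s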


Since you're into geometric constructions... I've been learning CAD with SolveSpace, which is GPL and has a constraint solver. This is very interesting code to dig into. When you apply constraints to a drawing (midpoint, point on line, parallel, perpendicular, etc.), it creates an internal algebraic expression for that constraint. You end up with a system of equations (internal only) that is constantly enforced as you manipulate a sketch. It's a hybrid approach where the constraints are represented as equations but the solutions are all computed numerically. I think you'd appreciate it.


Exact answers are nice, but sometimes you need to know whether two ellipses do or don't intersect, as fast as possible, and then do the more computationally intensive math only for the interesting cases.

For instance, if the distance between centers is greater than the sum of the major radii, the ellipses do not have a real area of intersection, and imaginary areas are not useful for the problem at hand, so you stop calculating and move on to the next pair. If the distance between centers is smaller than the larger minor radius, there is a real area of intersection, and you can put it on the list to calculate it later.

In seven steps, you can determine the desired answer, to the limit of floating point precision, faster than the exact symbolic answer obtained by actually solving the quartic as a "single step", a step which really encapsulates quite a lot of multiplications and at least six distinct conditional cases.

But on the other hand, the rapid approximation is the work of one afternoon, and can be debugged one step at a time, while the exact answer is a doctoral-level thesis.

And if you're drawing your ellipses on the surface of an ellipsoid, like the WGS84 reference ellipsoid, the value of approximations becomes even greater, because the exact answers will come from an even higher-order polynomial equation.

In this, the "don't care" part of the problem space is always used to make the shape of the solution as simple as possible. So when you always care about exact answers, discovered elegantly, you're actually discarding the means to make the solution more elegant in ways you might not have realized.


What is the iterative method for determining the area of intersection?


I don't remember the whole thing, and I couldn't take it with me, but what I can recall is this:

It is much more important to determine whether there is a real intersection at all, than it is to determine how big the area of intersection is.

Calculate distance between centers.

Add both major radii.

If distance >= sum, skip this pair. Intersection area is entirely imaginary.

If distance < max(minor radii), the pair definitely has a real intersection. Put it on the list for later and check the next pair. Upper bound: min(ellipse areas); lower bound: zero.

Find the maximum extents of each ellipse, when projected onto the segment between centers. If the sum is greater than the distance, there is a real intersection. Put it on the list and go on to the next pair.
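(My own rough reconstruction of those first two rejection tests in code, assuming each ellipse is given as center plus major/minor radii; the later, more esoteric tests are omitted:)

  import math

  def quick_classify(e1, e2):
      """Return 'disjoint', 'intersecting', or 'unknown' (needs the later tests)."""
      (x1, y1, a1, b1), (x2, y2, a2, b2) = e1, e2   # center x, center y, major, minor
      d = math.hypot(x2 - x1, y2 - y1)

      # Bounding circles of the major radii don't overlap: no real intersection area.
      if d >= a1 + a2:
          return 'disjoint'

      # One center lies inside the other ellipse's inscribed (minor-radius) circle:
      # definitely a real intersection; bounds are zero and min(ellipse areas).
      if d < max(b1, b2):
          return 'intersecting'

      return 'unknown'   # fall through to the projection test and the rest

  print(quick_classify((0, 0, 5, 2), (20, 0, 5, 3)))   # disjoint
  print(quick_classify((0, 0, 5, 2), (1, 0, 5, 3)))    # intersecting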

Then there was a bit of a race against time through increasingly esoteric tests that relied heavily on the calculations from previous steps. The goal was to produce a list of ellipses, sorted by decreasing real area of intersection greater than zero with a single reference ellipse. So if there was only one on the list, you didn't even need to calculate area. That was the answer, and you're done.

At some point, the reference ellipse was transformed into a circle at the origin, with the center of the comparison ellipse on the x axis. Some tests only used one quadrant of an ellipse at a time. Each test refined the bounds on the fraction of the reference ellipse that could be in the intersection, so again, if those limits were enough to sort the list, the area calculation was skipped.

If there were still enough ambiguous tests, then finally you calculate the points of intersection. For zero or one, the area is exactly the area of the smaller ellipse. For four, first approximation is the area of the quadrilateral, as minimum. For three, the area of the triangle, as minimum. Two was the worst case, because then you needed to pick out a third point for the triangle, which was probably one of the four where a major axis intersected an ellipse. If you needed a maximum area, it was the smallest circle enclosing the points. Each test either increased the lower bound, or decreased the upper bound, until there was no more overlap in the sorted list.

After all that, if the list was still not clearly ordered, you finally calculated and added chord areas to get exact results. You couldn't really get here with real-world test data, so now you had to invent test data specifically to make it this far, to be sure that if any real-world data set ever does execute this branch, it ends up ordered correctly. But I don't think we ever got funding to bulletproof it. Customer was fine with the possibility that a tiny fraction of inputs might end up ordered incorrectly, so long as the wrong answer came out quickly.

Clearly, if all the areas had to be calculated exactly, it would have been more efficient to just do that from the start. But that wasn't the problem.


There is an elegant universal notation for integers, called "variable-width integers" or "varints" or "bigints" or similar. It is reviled amongst many, and doesn't see much use.

There is an elegant universal notation for rationals, called "quote notation" or "finite continued fractions" or "p-adic rationals". It is largely unknown and sees next-to-no use.

There is an elegant universal notation for computable reals, called "computability". We use this all the time, but almost never for computing reals.

Floating-point numbers are fussy and require careful management in order to avoid buggy algorithms. In contrast, there exists an algorithm (although it's an open problem to express it in a quick way!) for taking the fifth root of any generalized continued fraction or any Turing-computable real, in a way that allows the result to be used in further computation. Somebody should tackle the Gaussian distributions next!
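Python's built-in integers and its fractions module are a handy way to feel the difference between the first two notations and floating point (my illustration, not from the comment above):

  from fractions import Fraction

  print(2**200)                                                # bigints just work
  print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True: exact rationals
  print(0.1 + 0.2 == 0.3)                                      # False: the fussy one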


Nitpick: there is no closed-form solution for the roots of a fifth-degree polynomial in terms of only additions, subtractions, multiplications, divisions and root extractions. This is what Galois showed. However, the roots can be expressed in closed form using trigonometric functions [1]. There is a summary here of the many closed-form solutions that exist [2].

[1] https://math.stackexchange.com/questions/1537069/the-trigono...

[2] https://math.stackexchange.com/questions/1555743/how-do-you-...


There's room for both, though. While some functions can't be integrated, others can. You can save yourself a lot of effort and get a more correct answer by taking a symbolic definite integral and then computing a result, compared with numerical integration.
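For instance (a toy SymPy example of my own):

  import sympy as sp

  x = sp.symbols('x')
  exact = sp.integrate(sp.exp(-x**2), (x, -sp.oo, sp.oo))
  print(exact)      # sqrt(pi), an exact symbolic answer

  # The purely numerical route only ever approximates the same value.
  print(sp.Integral(sp.exp(-x**2), (x, -sp.oo, sp.oo)).evalf())   # ~1.7724538509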


It's not my experience that efficiency is "fundamentally" at odds with elegance. Elegance can be a hard concept to pin down and has a somewhat subjective component but I often find situations where code can be made both more efficient and more elegant. There are certainly cases where I can't see a way to make the code more efficient without sacrificing elegance however. They don't seem to be fundamentally at odds but neither are they always aligned.

I disagree with several of the specific examples given in the article but to pick one, the author says "C++ rather obviously does pay with programmer efficiency for runtime efficiency, without an option to opt out of the deal. Every allocation can leak, every reference can be dangling, every buffer can overflow, etc. etc. etc.". Most of the ways that modern C++ addresses these problems inherited from C such as unique_ptr, containers and ranges are more elegant to my taste, more convenient to use and generally no less efficient.

I find garbage collection very inelegant and the approach to memory management taken by Rust and even C++ in most cases more elegant but I'm not clear if that's something the author was intending to include in his above statement. This seems somewhat subjective however. I know there are arguments some people might make for garbage collection being elegant and even efficient in comparison to manual memory management.

Where elegance and efficiency seem to have some natural alignment is in the realm of simplicity and doing only what is necessary. Code can often be made both more efficient and more elegant in my experience by distilling out the essence of the problem and removing the unnecessary.


The underlying need is to traverse and update data structures, so language designers went with universally mutable objects with arbitrary references. That makes the representation of an individual object quite simple, and it does centralize the complexity of memory management in garbage collection, so it's a good engineering decision. But real GC requires extensive tuning (different "generations", stop-the-world pauses, etc) and that's simply not an elegant solution.

My approach was to revisit that underlying need, and I settled on a data model that simply prohibits cyclic data structures, so memory management doesn't have to be more complex than reference counting. Now, I'm still going to have to implement zippers and paths and such to handle traversal, but I think that will still be far cleaner. I have some notes here if you're curious: https://tenet-lang.org/types.html
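(As a concrete aside, CPython is a nice demonstration of why cycles are exactly what pushes you beyond plain reference counting; this little illustration is mine and has nothing to do with Tenet itself:)

  import gc

  class Node:
      def __init__(self):
          self.other = None

  gc.disable()                # lean on reference counting alone
  a, b = Node(), Node()
  a.other, b.other = b, a     # a <-> b cycle
  del a, b                    # refcounts never reach zero, so nothing is freed

  print(gc.collect())         # > 0: the cycle detector reclaims what refcounting couldn't
  gc.enable()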


> My approach was to revisit that underlying need, and I settled on a data model that simply prohibits cyclic data structures, so memory management doesn't have to be more complex than reference counting.

Not sure I'd call reference counting elegant or efficient either. It has a lot of problems, like spurious updates which kill performance.

If you've already ruled out cycles, then you've ruled out a large class of expressions in your programming language. Perhaps you should revisit region-based memory management to see if it's sufficient, as that will give you optimal space and time use.


I don’t think the author really addresses the examples presented.

The example of the FAST decision tree is based on a code generator, an abstraction which is presumably more elegant than the generated code which is littered with goto statements.

And what about std::sort? Say what you want about C++ in general, but it is definitely superior to a rudimentary qsort in C in terms of elegance and at least equivalent in terms of runtime efficiency.

There are plenty of examples in either direction. It would seem that this is yet another case of “it depends.”


Noob question: What does std::sort do (aside from sorting, of course)? Which feature of it were you highlighting?


Compared to a C-style qsort, std::sort gives you type safety, convenience (types that already define a suitable operator< will use it automatically, without you having to write an explicit comparison function), correctness/convenience (you don't have to manually specify the number of elements to sort; it's deduced from the range) and efficiency (the comparison can be inlined by the compiler much more easily than in a C-style qsort, which typically makes a meaningful performance difference for sorting).


TBH the efficiency thing is irrelevant. I once compared std::sort and qsort on large integer arrays (which is where you can expect the greatest speedup) and the speedup was less than 2x on my machine. Furthermore the cases where sorting could ever become a noticeable bottleneck are pretty rare.

Meanwhile each std::sort instantiation costs a few hundred bytes of machine code and increases the project's compilation times.

When performance really matters, you tune your sort algorithm and implementation to the data; you use a bucket sort, for example. Much larger speedups than 2x are to be had this way. std::sort cannot do that.
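(A toy example of what I mean by tuning the sort to the data, in Python for brevity: a counting sort, the simplest member of the bucket-sort family, for data known to be small bounded integers.)

  def counting_sort(values, max_value):
      # O(n + k): exploits the fact that values are integers in [0, max_value].
      counts = [0] * (max_value + 1)
      for v in values:
          counts[v] += 1
      out = []
      for v, c in enumerate(counts):
          out.extend([v] * c)
      return out

  data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
  assert counting_sort(data, 9) == sorted(data)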

I like the type safety of std::sort vs qsort, though, even though I don't find pursuing "type safety" a good idea in general.


I don't consider a 2x speedup "irrelevant". It's also not the case that large integer arrays are where I would expect the greatest speedup. The greatest speedup would be on arrays that fit in L1 cache where function call overhead is going to be the most significant factor rather than cache misses.

Compile times and code bloat with templates are genuine issues but they are being improved both with improved compilers and linkers and with C++ standard changes (modules should be a big help for compile times).

Type safety is in my opinion a good example of elegance and efficiency being generally aligned. Typically, improving type safety makes code more elegant, catches bugs and gives opportunities for better efficiency in my experience.


Again, it was an artificial benchmark. I sorted millions of integers. 2x is not what you measure in a real-life program; it's the absolute upper limit of what you can ever expect. Most programs don't spend a noticeable amount of time in sorting at all. Program performance is usually dominated by other things, like I/O. Now what is 50% of "not a noticeable amount of time"? Right, it's irrelevant.

But 2x is a hard number, so people tend to think it's important and forget about all the disadvantages which are hard to measure but have far more ramifications.

And for the rare cases where a sort really matters, you should use an implementation that is tuned to the data. That will bring you larger speed-ups. I could have said "10x" to sound impressive, and it wouldn't be wrong in most cases, but it's simply not possible to make a blanket statement. It depends. It could be more than 10x.

Even better, you can often construct the data so it falls out sorted. Speed-up: INFx. Machine code and compile time: Zero.

It's the same story for types in general. They lead to wrong and bloated design and boilerplate code when you overdo it, which prevents the important optimizations that could simplify program structure enormously and kill many more bugs in the process.

> The greatest speedup would be on arrays that fit in L1 cache where function call overhead is going to be the most significant factor rather than cache misses.

The smaller you make the data, the less time it takes to sort it with whatever implementation, the less important the implementation is.

Furthermore, shouldn't we expect most sorting applications to be pretty cache-friendly? E.g. Quicksort scans the array sequentially, log n times.


> Most programs don't spend a noticeable amount time in sorting at all.

I think a more important point is that most programs that depend on sorted data will use a data structure which keeps itself sorted, rather than trying to sort lists/arrays of such large numbers of elements (especially past a certain threshold of data records). If you're receiving millions of records and need them in some order (potentially different from how they arrive), it makes more sense, especially with the IO delays, to spend the extra time inserting into a sorted structure while waiting for the next piece of data to arrive. Interleaving the two activities, rather than waiting for all the IO to finish and then sorting a massive list (which forces them to be serialized), lets the concurrent version benefit from the inherently parallel nature of IO-bound and CPU-bound activities.
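Roughly like this (a sketch; read_records is a placeholder for whatever IO-bound source is feeding you, and the records are assumed to be comparable):

  import bisect

  def collect_sorted(read_records):
      ordered = []
      for record in read_records():       # each iteration may block on IO
          bisect.insort(ordered, record)  # binary search + insert, done while
                                          # we'd otherwise sit idle waiting
      return ordered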


I'm not positive that most sorts are "online" with I/O that reads the data to be sorted. It's not what I wanted to imply anyway.

The kind of I/O I meant is just generally time consuming I/O (large amounts of data, too many small requests, too much synchronization) that most applications are doing. For other, non-I/O intensive applications, most of them likely have bottlenecks other than sorting.

In any case I wouldn't want to tie my data to an incremental sorting structure (like a search tree instead of a flat array) before I've ensured that this online aspect was critical. If it is not critical, choosing a more complicated incremental structure is probably premature optimization which constrains the project structure, and thus is likely to negatively affect performance and maintainability in the end.


> It's the same story for types in general. They lead to wrong and bloated design and boilerplate code when you overdo it.

I wouldn't recommend overdoing anything, by definition. My experience however is that when done correctly type safety usually makes code both more elegant and more efficient. Like anything in programming it is often done poorly however and people sometimes seem to confuse types with OOP concepts in C++. A lot of people seem to worry about certain programming practices because they fear other people abusing them. Those discussions are of little interest to me. I'm interested in how I can best write good code, not worrying about what other people are doing.


std::sort and other templates make you pay in terms of header complexity (no circular includes for example) and compilation time.

For the most part it’s a good trade off, particularly std::sort and some of the other <algorithms>. But I’ve yet to work on a long lived low latency/high performance C++ project where some wizard coworker has not tanked the code base for purely theoretical gains that turn out to not meaningfully change the assembly or change it by an instruction or two.

Godbolt is a revelation when these situations arise, but good luck even then convincing someone who just spent 4 days writing a dispatch or whatever that it was all a waste.

I guess what I’m saying is that when it comes to template-oriented performance programming, std::sort is the exception, not the rule.

Or maybe the good templates are small and modular so their footprint is 1/10th that of a bad template such that we just don’t notice them as much.


I recently watched a documentary where Dijkstra says the opposite:

https://youtu.be/RCCigccBzIU?t=1125

"One of the things I discovered as early as the 1960s is that mathematical elegance is not a question of aesthetics, of taste or fashion, but something you can translate into a technical notion.

"The Concise Oxford Dictionary gives as one of the meanings of 'elegant' 'ingeniously simple and effective'.

"In practice a program is manageable if you make a truly elegant program, firstly because it is shorter than most alternatives and consists of discrete parts, each of which you can replace by an alternative implementation without influencing the rest of the program, but also, curiously, the most elegant programs are often the most efficient."


I've seen this tradeoff firsthand with DSP video algorithms. The naive code just implements the algorithm straight away. The performant version has to ensure that the inner loop all fits in cache while running. It also does tricks like prefetching data into cache so the code doesn't stall on a data load. These sorts of tricks really impact the readability of the code.


I wouldn't say elegance here - there is something fundamentally elegant about the utilisation of 8-bit integers in scenarios where they can be used.

It is, on the other hand, fundamentally at odds with accuracy, which is already well and widely known (at the very least implicitly) and is the reason we so readily accept approximations in computing.


Those are apples and oranges. Efficiency can be measured, elegance is subjective.


I've been writing firmware to run a DC motor controller where this efficiency/elegance tradeoff is very apparent. I can write "slower" code in a very elegant way, such as reading the throttle input. But when it comes to implementing the high-frequency control loop with all its ADC readings and filtering, I have to trade much of my elegance for raw efficiency.


There is a lot of truth to that, although I also see a third trade-off, which is robustness to change (not so much in the code but rather the inputs).

I think you can in some cases have efficiency and elegance, but then the solution will be either so abstract or so specialized that it won't really be practical, and a small change in problem conditions will cause it not to work. (An example from the article: a matrix solver that only handles well-conditioned systems, so the floating point errors never accumulate.)

Similarly, you can have robustness to change and elegance without efficiency (mathematically beautiful code), and robustness to change and efficiency without elegance (lots of copy-pasting code to deal with special cases).

In any case, I think we are generally trying to present elegant code to humans (using more abstract programming languages) and then use compilers to convert the elegance into efficiency. But it requires that somebody takes the pains to actually deal with all the possible cases (and write a working compiler).


I find this discussion tiring; I have actually seen people vigorously defending both positions: that the most efficient code is also very elegant, and the opposite, as here.

It depends so much on the situation, skill/experience of the person/team building the stuff and what focus one has.

Sometimes an elegant solution opens completely new perspectives on performance optimizations.


Symbolic manipulation is undecidable in general, hence programmers prefer numeric approaches. But symbolic approaches are clearly superior if the programmer does them a priori. I can convert a completely intractable brute-force combinatorics problem into a very tractable algorithm with some human symbolic preprocessing (i.e. math).
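A deliberately tiny example of the kind of preprocessing I mean (mine):

  def pairs_bruteforce(n):
      # O(n^2) enumeration of all unordered pairs
      return sum(1 for i in range(n) for j in range(i + 1, n))

  def pairs_formula(n):
      # the same count derived symbolically: n choose 2
      return n * (n - 1) // 2

  assert pairs_bruteforce(500) == pairs_formula(500)
  print(pairs_formula(10**9))   # instant; the brute force would never finish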


> Efficiency is fundamentally at odds with elegance

Strangely, though, both are absent in roughly 80% of all programs, so "being at odds with" is meant as

(Efficiency -> NOT Elegance) AND (Elegance -> NOT Efficiency)

but not as

Elegance XOR Efficiency


What you are saying is that there are a lot of programs that are neither elegant nor efficient. Few would dispute this, but I think this case is still covered by the original formulation.

For example I do not see "being drunk is fundamentally at odds with performing surgery" being invalidated by a bunch of folks writing programs in offices and fitting in neither bin. Caveat: I am not a native English speaker.


Here's an example that made sense to me:

elegant (readable / compact)

function isPalindrome(str) { return str == str.split('').reverse().join(''); }

efficient (~50x faster than above)

  function isPalindrome(str) {
    var len = Math.floor(str.length / 2);
    for (var i = 0; i < len; i++)
      if (str[i] !== str[str.length - i - 1]) return false;
    return true;
  }

I would probably say elegance and efficiency are unaligned rather than at odds, IMO: they may or may not overlap, because they don't have the same goals.


Does just str.reverse() not work, by the way?


I didn't see that on MDN, but it would be cool. 'abc'.reverse() throws an exception saying it is not a function.


I would love to see the author revisit this today. What would they think of Rust, or the current state of C#, for instance?

Would F# or Nim be a much faster Python with less ugly C FFI?



I must say, I disagree with the very first paragraph of this article. You need numerical recipes for any real-world problem, whether you have full precision or not.


Bad first example. Automatic differentiation in arbitrary precision is as elegant as symbolic, and as efficient as "floating point".
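A minimal forward-mode sketch with dual numbers (mine, not from the parent comment; swap float for Fraction or an arbitrary-precision type to get the arbitrary-precision part):

  # Forward-mode autodiff via dual numbers: exact derivative rules applied
  # mechanically, with no symbolic expression swell and no finite differences.
  class Dual:
      def __init__(self, value, deriv=0.0):
          self.value, self.deriv = value, deriv

      def __add__(self, other):
          other = other if isinstance(other, Dual) else Dual(other)
          return Dual(self.value + other.value, self.deriv + other.deriv)

      __radd__ = __add__

      def __mul__(self, other):
          other = other if isinstance(other, Dual) else Dual(other)
          return Dual(self.value * other.value,
                      self.deriv * other.value + self.value * other.deriv)

      __rmul__ = __mul__

  def f(x):
      return 3 * x * x + 2 * x + 1    # f'(x) = 6x + 2

  x = Dual(4.0, 1.0)                  # seed dx/dx = 1
  y = f(x)
  print(y.value, y.deriv)             # 57.0 26.0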


This just seems like survivorship bias. An algorithm / tool / technology tends not to remain in use if something that is both more elegant and more efficient is available. Therefore, if you survey the well known options you'll see a negative relationship between elegance and efficiency. It doesn't imply a general relationship nor does it preclude the discovery of an approach that is both more elegant and more efficient than anything available today.


Simplicity is usually efficient and elegant.





