
I don't understand all the hate for this post; one of the referenced articles is actually very good and makes a quite compelling case (maybe it should have been submitted instead):

http://kjellkod.wordpress.com/2012/02/25/why-you-should-neve...

It shows that linked lists are slower than vectors in real-world-like scenarios, even in cases where the asymptotic complexity of the linked list is lower. It seems that modern CPU architectures have changed so much that our theoretical models diverge further and further from reality; I think this is pretty interesting.
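To make the scenario concrete, here is a minimal sketch of the kind of benchmark the article runs: insert values into a sorted sequence, doing a linear search for the insertion point in both containers. The sizes, function names, and timing details are my own illustration, not taken from the post.

    // Sketch: insert N random values in sorted order, once with std::vector
    // and once with std::list. The linear search dominates either way; the
    // vector tends to win in practice because its traversal is cache-friendly.
    #include <chrono>
    #include <cstdio>
    #include <list>
    #include <random>
    #include <vector>

    template <typename Seq>
    double insert_sorted_ms(int n) {
        std::mt19937 rng(42);
        std::uniform_int_distribution<int> dist(0, 1000000);
        Seq seq;
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < n; ++i) {
            int value = dist(rng);
            auto it = seq.begin();
            while (it != seq.end() && *it < value) ++it;  // linear search
            seq.insert(it, value);  // O(n) shift for vector, O(1) splice-in for list
        }
        auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(stop - start).count();
    }

    int main() {
        for (int n : {1000, 10000, 50000}) {
            std::printf("n=%d  vector: %.1f ms  list: %.1f ms\n", n,
                        insert_sorted_ms<std::vector<int>>(n),
                        insert_sorted_ms<std::list<int>>(n));
        }
    }

The article's point is that the vector comes out ahead anyway: every insertion has to walk the sequence regardless, and walking a list is roughly one cache miss per node.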




That's a much better, much more careful article. For example:

> Just in case my Mea Culpa and Introduction did not cover it. Let me be clear: This article is not disqualifying the linked-list for other types of usages, such as when it contains types that have pointers to other types, large and expensive-to-copy types (and yes, POD can also be too large).

The OP displays no such nuanced understanding.


I do not think our theoretical models diverge from reality; the models describe how performance relates to the size of the input, and that remains valid even if you have a very fast computer. What modern architectures have done is change the constant factors a bit; that raises the problem size at which the asymptotics take over, but it does not eliminate it.

Sure, there are cases where asymptotic analysis can be dismissed. Matrix multiplication comes to mind: the fastest algorithms have impractically large constant factors. Usually, though, asymptotic analysis does matter, because it is often the case that input sizes will grow unexpectedly.


This has nothing to do with simply having a "fast computer"; everything would be fine if we just used 486s clocked at 10 GHz, but we don't. The changes in CPU architecture really do make the theoretical model diverge from reality, because the model presupposes a lot of things: that every instruction takes the same time to execute (which also implies that instruction execution times are independent), that one instruction is executed at a time, and so on. If those assumptions do not hold, you cannot assume the factor is really constant. And even if it were, you cannot dismiss the difference indefinitely on that ground when, on datasets of the size commonly encountered in practice, algorithms with larger asymptotic complexity start outperforming ones with smaller complexity. For matrix multiplication this has been known for years; for linked lists it is a relatively new development that came with larger and faster CPU caches, etc.


Except that the theoretical model is not affected by any of the things you mentioned. The existence of caches, branch prediction, instruction reordering, parallel execution, etc. does not change as problem sizes change. If your inputs become large enough, you will reach the upper limit of your CPU's ability to speed up execution with those features. In the limit, the asymptotics still matter, and experience has shown that the limit is not at all far-fetched in most cases.

What makes matrix multiplication an exceptional case is that the constant factor on the best known algorithm is so large that we do not know how to build a computer with enough memory to store a problem large enough to overcome that constant. That is not the case with this analysis of linked lists; all one can say is that the data sets chosen in that particular article (possibly representative of the most common data sets) are not big enough. One can certainly store a large enough data set to overcome the advantages arrays have, and so the only real question is, "Is it possible that the inputs will be so large?"

Maybe the answer is truly, "No, that is unlikely." I am skeptical, though, as there are not many cases where such statements can be made. Even software written for embedded systems that target specific products is likely to be repurposed for new systems with different inputs. Even researchers, who write software for the particular datasets sitting on their hard drives, often re-use their code in future work. There are "border" cases, like integer multiplication, but typically libraries will just select the best algorithm for a given input size (e.g. for multiplication, you'll probably only see FFT methods applied above a particular threshold). Perhaps linked lists are now a "border" case, but all that would mean is that we need to use abstract "sequence" operations that dynamically choose particular implementations to use as their sizes change.
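For the "abstract sequence operations that dynamically choose an implementation" idea, a rough sketch of what that could look like follows. The class, the threshold, and the switch-over policy are all invented for illustration; real code would have to benchmark to find a crossover point, if one exists at all.

    // Hypothetical hybrid sequence: contiguous storage below a size
    // threshold, linked storage above it. Illustrative only.
    #include <algorithm>
    #include <cstddef>
    #include <list>
    #include <variant>
    #include <vector>

    class HybridSeq {
    public:
        void insert_sorted(int value) {
            if (auto* v = std::get_if<std::vector<int>>(&storage_)) {
                v->insert(std::lower_bound(v->begin(), v->end(), value), value);
                if (v->size() > kThreshold) {
                    // Past the (hypothetical) crossover point, move to a list.
                    storage_ = std::list<int>(v->begin(), v->end());
                }
            } else {
                auto& l = std::get<std::list<int>>(storage_);
                auto it = l.begin();
                while (it != l.end() && *it < value) ++it;
                l.insert(it, value);
            }
        }

    private:
        static constexpr std::size_t kThreshold = 1u << 20;  // not tuned at all
        std::variant<std::vector<int>, std::list<int>> storage_;  // starts as a vector
    };

    int main() {
        HybridSeq s;
        for (int i = 0; i < 1000; ++i) s.insert_sorted(i % 100);
    }

Whether the switch-over ever pays off is exactly the empirical question the article raises.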


I don't think he means big-O notation as a theoretical model... Processors are not what they were when they were single-core machines. The whole of computing, and of thinking about problem solving, has changed since 2005 because of multicore processor architectures. Projects like Parallella make this obvious. Programmers in general don't appreciate how much this turns things that weren't a big deal 10 years ago into a huge deal now. And threading is not even the tip of it; threading is trivial compared to real parallelism. Our model of computation no longer matches the hardware, and the gap will keep widening until everybody starts thinking parallel-first. That's the most convincing argument the OP makes about why linked lists are bad: in parallel computing, they really are hard/bad.


http://en.wikipedia.org/wiki/Asymptotically_optimal_algorith...

Sometimes vague or implicit assumptions can make it unclear whether an algorithm is asymptotically optimal. For example, a lower bound theorem might assume a particular abstract machine model, as in the case of comparison sorts, or a particular organization of memory. By violating these assumptions, a new algorithm could potentially asymptotically outperform the lower bound and the "asymptotically optimal" algorithms.

See also:

http://en.wikipedia.org/wiki/Abstract_machine

http://en.wikipedia.org/wiki/Random-access_stored-program_ma...

http://en.wikipedia.org/wiki/Cache-oblivious_model

etc.

As far as I know, all asymptotic analysis has to be done using some abstract machine model.


Sure, but how do modern architectures not fit into the RASP machine model? You can view the cache contents as being part of the state (rather than the memory); you can similarly view instructions as being part of sequences, so that a single instruction can mean different things depending on the state from which it is fetched. Other modern features can be similarly approached (with the exception of CPUs whose behavior depends on environmental factors like temperature, but that is an edge case).

Really, if you doubt that the RASP model is appropriate for modern architectures, you can test it (a typical exercise in an algorithms course) -- see if, as the input size grows, the timing follows the asymptotic analysis. That is basically what the article you linked to does, and the results are not all that surprising -- where things are linear time in theory, they are linear time in practice; where they are quadratic time in theory, they are quadratic time in practice. It is worth pointing out that in all but the last example, the list and vector operations had the same complexity (because of the linear search), so it was really a comparison between constant factors.
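That exercise is simple enough to sketch: time a workload at doubling input sizes and look at the ratio between consecutive timings (roughly 2x per doubling for a linear-time operation, roughly 4x for a quadratic one). The workload and sizes below are my own illustration, not taken from the article.

    // Empirical scaling check: if the abstract-machine analysis holds,
    // a linear-time workload should roughly double in time when n doubles.
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <numeric>
    #include <vector>

    double time_linear_sum(std::size_t n) {
        std::vector<long> v(n, 1);
        auto t0 = std::chrono::steady_clock::now();
        volatile long sum = std::accumulate(v.begin(), v.end(), 0L);
        (void)sum;  // keep the result live so the loop is not optimized away
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(t1 - t0).count();
    }

    int main() {
        double prev = 0.0;
        for (std::size_t n = 1u << 20; n <= (1u << 25); n <<= 1) {
            double t = time_linear_sum(n);
            std::printf("n=%zu  %.4f s  ratio=%.2f\n", n, t,
                        prev > 0.0 ? t / prev : 0.0);
            prev = t;  // expect ratios near 2 if the model's prediction holds
        }
    }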


Asymptotically, sure, you're right. But constant factors are often important in practice, and simple cost models (e.g. ones that don't model cache locality) no longer give a decent estimate of the constant-factor differences in performance between algorithms.

I think the issue here is that, in the past, with shallower cache hierarchies, models that assumed a constant cost per memory access would maybe be off by a smallish factor (I don't know, maybe 50%).

Now, however, memory access is frequently the limiting factor for an algorithm, and the average memory-access latency can easily vary by an order of magnitude between algorithms (i.e. cache-smart versus cache-dumb).
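As a rough illustration of how large that gap can be (the sizes and numbers here are mine, not from the parent): summing the same array sequentially versus in a shuffled order does the same amount of "work" in a simple cost model, but the scattered version is typically many times slower once the data is much larger than the last-level cache.

    // Same data, same number of additions; only the access pattern differs.
    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <numeric>
    #include <random>
    #include <vector>

    int main() {
        const std::size_t n = 1u << 24;  // ~16M ints, well past a typical L3 cache
        std::vector<int> data(n, 1);
        std::vector<std::size_t> order(n);
        std::iota(order.begin(), order.end(), std::size_t{0});

        auto run = [&](const char* label) {
            auto t0 = std::chrono::steady_clock::now();
            long long sum = 0;
            for (std::size_t i : order) sum += data[i];
            auto t1 = std::chrono::steady_clock::now();
            std::printf("%-10s sum=%lld  %.1f ms\n", label, sum,
                        std::chrono::duration<double, std::milli>(t1 - t0).count());
        };

        run("sequential");  // indices 0,1,2,... -> the hardware prefetcher keeps up
        std::shuffle(order.begin(), order.end(), std::mt19937(42));
        run("random");      // same additions, but nearly every access misses cache
    }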



