Hacker News new | past | comments | ask | show | jobs | submit login
How can List be faster than native arrays? (2018) (vindum.io)
47 points by lioeters 73 days ago | hide | past | web | favorite | 27 comments

It seems notable that while map is mentioned there is no performance measurement shown for map. To me The proof of the pudding would be in map performance.

In my spare time I have been building various versions of the nbody benchmark to try out different fundamental data methods, arrays, objects, immutable or not etc. Every alternative version is, of course, slower than the original mutating object version. In principle however, many forms should be optimisable by a 'sufficiently smart JIT' to be at least as fast.

It would be nice at least for a JavaScript engine to be able to do in-place maps on arrays defined from literals or single creation points like map/filter etc. , and have only one copy.

For instance.

       let x = [1,5,9];
       let q = x.map(a=>a+1)
The JIT would have to determine that

. x held the only reference to the array.

. x.map is the last reference to x

. the array in x was not modified between creation and last reference.

. the used version of a=>a+1 returns the same type as it is passed

To place the result of the map into the same memory as the original array and assign that to q.

Of course in this ultra simple example an optimizer could just figure out that q is [2,6,10] and never used anyway so throw it away. but in real world code you can frequently have arrays like this that are only ever used as a source for a map, especially if you do q.map(something).map(somethingelse).map(anotherthing).

I found the benchmark mentioned in the article - here's the part comparing map:


Apparently, List is ~3x faster than native array.map, with the difference increasing with the number of elements.

I see that random access and iterator are two operations where native array is fastest.

Functional language compilers are sometimes capable to eliminate intermediate lists, and possibly reuse the original list storage. IIRC the optimization is called deforestation.

"faster than native arrays" as in: native arrays crippled by the requirement of immutability.

Appending to a "native array" is amortized constant time. Secondly almost all operations on a "native array" will be faster because there is no overhead in indexing, there is cache locality etc etc. Requiring immutability will almost always reduce performance (yes, I'm aware there a couple operations like concatenation that can potentially run in O(lg n) rather than O(n) when using a tree representation for arrays).

It all comes down to how you're using them. Good tries can perform dramatically better than std::vector-style vectors, given the same development effort, on many kinds of system.

That really strains credulity. What do you mean be dramatically better? Under what read and write access pattern? Do you have specific examples where your claim holds?

If your workload would normally involve regularly copying the vector, no matter what, then it is an easy win. Other factors which can make it worthwhile are access patterns where you have many similar or related vectors which you access at random (especially if you can make some effort to deduplicate those vectors).

> There are a few operations where List is slower than native arrays. One of these is random accessing.

What about sequential access? Surely array gets a benefit from data locality, where CPU caches pull in surrounding values automatically?

It will depend on the allocation method. For example, you could use a pooled allocator that allocates a contiguous array of list items in a block, then use those in order as required. That would help with cached locality. IIRC, the Borland C++ STL implementation did something like this.

The deque collection was this efficient. Allocated blocks based on pagesize (4k). A truly beautiful data structure and one of the few good reasons to use stl.

Only the libc implementation of std::deque allocates 4KiB blocks and lives upto it's performance and utility. The MSVC implementation is terrible (8 bytes) and even the GCC one isn't that efficient (512 bytes).

Even though boost has a customizable deque block size, it's performance is still way behind Clang's libc.

Wait why wouldn't std::deque allocate blocks as some multiple of the sizeof the thing it holds?

> Wait why wouldn't std::deque allocate blocks as some multiple of the sizeof the thing it holds?

It does in the semicontiguous 4k blocks. And at least one msvc was that smart as i used it.

What is so magical about 4k?

If you laid down all the list nodes in a contiguous array in the correct order, sequential iteration would still be slower than an array scan.

As far as I know this is an RRB vector.

Here's the original paper: https://infoscience.epfl.ch/record/213452/files/rrbvector.pd...

A rust implementation: https://docs.rs/im/13.0.0/im/

C++ implementation: https://github.com/arximboldi/immer

The rust implemenation does a cool trick where the data is only copied when it is shared. If you are the sole owner of the data structure it will simply mutate it in place. So you don't lose any performance. But once you share it, it becomes an immutable functional data structure. See https://docs.rs/im/13.0.0/im/#in-place-mutation

It can't. Try benchmark sequential access on array to see sub-nanosecond per item. That's fast.

From the article:

> Even though List is very fast across the board no data-structure can be the fastest at everything.

This feels like a case in which everything is sufficiently well-implemented that the No Free Lunch Theorem [1] starts to play a role.

[1] https://en.wikipedia.org/wiki/No_free_lunch_theorem

That's pretty disingenuous. The native arrays aren't designed for immutability at all. Of course an immutable data structure would be faster than copying the entire array every single operation.

Clickbait title. Fastest doing what?

Yeah. Array has well known poor performance on insertion. It's a strawman.

Arrays are faster than linked list on insertions too, stop spreading erroneous beliefs.

Why are you comparing the performance of a linked list instead of a relaxed radix balanced tree as the article talks about?

Last time I checked it was slower than linked list for an LRU I wrote.


I'd recommend people read this discussion of linked list vs array performance: https://rust-unofficial.github.io/too-many-lists/#an-obligat...

TLDR; linked lists' theoretical performance advantages are often negated by years of hardware optimization for dealing with arrays.

The original article is about lists implemented as tree-like data structures, not linked lists (which are much slower for most operations).

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact