
How can List be faster than native arrays? (2018) - lioeters
http://vindum.io/blog/how-can-list-be-faster-than-native-arrays/
======
Lerc
It seems notable that while map is mentioned, there is no performance
measurement shown for map. To me, the proof of the pudding would be in map
performance.

In my spare time I have been building various versions of the nbody benchmark
to try out different fundamental data methods: arrays, objects, immutable or
not, etc. Every alternative version is, of course, slower than the original
mutating-object version. In principle, however, many forms should be
optimisable by a 'sufficiently smart JIT' to be at least as fast.

It would be nice, at least, for a JavaScript engine to be able to do in-place
maps on arrays defined from literals or single creation points (like
map/filter etc.), keeping only one copy.

For instance:

    {
       let x = [1,5,9];
       let q = x.map(a=>a+1);
    }

The JIT would have to determine that:

- x held the only reference to the array,

- x.map is the last reference to x,

- the array in x was not modified between creation and last reference,

- the used version of a=>a+1 returns the same type as it is passed,

in order to place the result of the map into the same memory as the original
array and assign that to q.

Of course, in this ultra-simple example an optimizer could just figure out
that q is [2,6,10] and is never used anyway, so throw it away. But in
real-world code you can frequently have arrays like this that are only ever
used as a source for a map, especially if you do
q.map(something).map(somethingelse).map(anotherthing).
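As a hand-written sketch (plain JavaScript, not anything an engine actually does today) of what such a fused, in-place map chain could look like after the checks above pass:

```javascript
// Hand-fused equivalent of arr.map(f).map(g): one pass that writes
// back into the same backing array instead of allocating an
// intermediate per map. Only safe if arr has a single live reference
// and each callback returns the same type it was passed.
function mapInPlace(arr, ...fns) {
  for (let i = 0; i < arr.length; i++) {
    let v = arr[i];
    for (const f of fns) v = f(v);
    arr[i] = v;
  }
  return arr;
}

const q = mapInPlace([1, 5, 9], a => a + 1, a => a * 2);
// q is [4, 12, 20]: (1+1)*2, (5+1)*2, (9+1)*2
```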

~~~
lioeters
I found the benchmark mentioned in the article - here's the part comparing
map:

[https://funkia.github.io/list/benchmarks#map](https://funkia.github.io/list/benchmarks#map)

Apparently, List is ~3x faster than native array.map, with the difference
increasing with the number of elements.

I see that random access and iterator are two operations where native array is
fastest.

------
stabbles
"faster than native arrays" as in: native arrays crippled by the requirement
of immutability.

Appending to a "native array" is amortized constant time. Secondly, almost all
operations on a "native array" will be faster because there is no overhead in
indexing, there is cache locality, etc. Requiring immutability will almost
always reduce performance (yes, I'm aware there are a couple of operations,
like concatenation, that can potentially run in O(lg n) rather than O(n) when
using a tree representation for arrays).
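A quick sketch of the gap being described, using nothing beyond plain JavaScript arrays: naive immutable append copies the whole array on every operation, which is exactly the cost that tree-based persistent vectors are built to avoid:

```javascript
// Naive "pure" append: a full O(n) copy per append, so building n
// elements costs O(n^2) total. Mutable push is amortized O(1).
function immutableAppend(arr, x) {
  return [...arr, x]; // copies every existing element
}

let naive = [];
for (let i = 0; i < 1000; i++) naive = immutableAppend(naive, i);

const mutable = [];
for (let i = 0; i < 1000; i++) mutable.push(i); // amortized O(1)
```

Persistent tree structures like RRB vectors replace the full copy with copying only a logarithmic path of tree nodes, trading some constant-factor overhead for asymptotically cheap "pure" updates.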

~~~
microcolonel
It all comes down to how you're using them. Good tries can perform
dramatically better than std::vector-style vectors, given the same development
effort, on many kinds of system.

~~~
gjfytfh4256
That really strains credulity. What do you mean by dramatically better? Under
what read and write access patterns? Do you have specific examples where your
claim holds?

~~~
microcolonel
If your workload would normally involve regularly copying the vector, no
matter what, then it is an easy win. Other factors which can make it
worthwhile are access patterns where you have many similar or related vectors
which you access at random (especially if you can make some effort to
deduplicate those vectors).
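As a toy illustration of the structural sharing that makes many similar vectors cheap to hold: a two-level chunked "trie" (standing in for a real trie of greater depth; all names here are hypothetical) in which an update copies only the path to the changed leaf, leaving every other chunk shared between versions:

```javascript
// Split the array into fixed-size chunks; the root is an array of
// chunk references.
const CHUNK = 4;

function fromArray(arr) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += CHUNK) {
    chunks.push(arr.slice(i, i + CHUNK));
  }
  return chunks;
}

// Persistent update: copy the root and the one affected chunk; every
// other chunk is shared (same object) between old and new versions.
function set(trie, index, value) {
  const c = Math.floor(index / CHUNK);
  const newChunk = trie[c].slice();
  newChunk[index % CHUNK] = value;
  const newTrie = trie.slice();
  newTrie[c] = newChunk;
  return newTrie;
}

const v1 = fromArray([0, 1, 2, 3, 4, 5, 6, 7]);
const v2 = set(v1, 5, 99);
// v1 and v2 share chunk 0 by reference; only chunk 1 was copied.
```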

------
Myrmornis
> There are a few operations where List is slower than native arrays. One of
> these is random accessing.

What about sequential access? Surely array gets a benefit from data locality,
where CPU caches pull in surrounding values automatically?

~~~
msclrhd
It will depend on the allocation method. For example, you could use a pooled
allocator that allocates a contiguous array of list items in a block, then use
those in order as required. That would help with cached locality. IIRC, the
Borland C++ STL implementation did something like this.

~~~
tfigment
The deque collection did this efficiently. It allocated blocks based on page
size (4k). A truly beautiful data structure and one of the few good reasons to
use the STL.

~~~
rwbt
Only the libc++ implementation of std::deque allocates 4 KiB blocks and lives
up to its performance and utility. The MSVC implementation is terrible (8
bytes) and even the GCC one isn't that efficient (512 bytes).

Even though Boost has a customizable deque block size, its performance is
still way behind Clang's libc++.

~~~
Something1234
Wait why wouldn't std::deque allocate blocks as some multiple of the sizeof
the thing it holds?

~~~
tfigment
> Wait why wouldn't std::deque allocate blocks as some multiple of the sizeof
> the thing it holds?

It does, in the semi-contiguous 4k blocks. And at least one MSVC version was
that smart, as I used it.

~~~
mycall
What is so magical about 4k?

------
AlexanderDhoore
As far as I know this is an RRB vector.

Here's the original paper:
[https://infoscience.epfl.ch/record/213452/files/rrbvector.pd...](https://infoscience.epfl.ch/record/213452/files/rrbvector.pdf)

A rust implementation:
[https://docs.rs/im/13.0.0/im/](https://docs.rs/im/13.0.0/im/)

C++ implementation:
[https://github.com/arximboldi/immer](https://github.com/arximboldi/immer)

The Rust implementation does a cool trick where the data is only copied when
it is shared. If you are the sole owner of the data structure, it will simply
mutate it in place, so you don't lose any performance. But once you share it,
it becomes an immutable functional data structure. See
[https://docs.rs/im/13.0.0/im/#in-place-mutation](https://docs.rs/im/13.0.0/im/#in-place-mutation)

------
unrealhoang
It can't. Try benchmarking sequential access on an array to see sub-nanosecond
times per item. That's fast.
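A rough sketch of such a micro-benchmark (timings vary by machine and engine, so the printed figure is only indicative):

```javascript
// Sum 10 million doubles held contiguously in a typed array and
// report the average time per element. Sequential access over a
// contiguous buffer is the best case for hardware prefetching.
const n = 10_000_000;
const data = new Float64Array(n).map((_, i) => i);

const t0 = performance.now();
let sum = 0;
for (let i = 0; i < n; i++) sum += data[i];
const elapsedMs = performance.now() - t0;

console.log(`${((elapsedMs * 1e6) / n).toFixed(2)} ns per element`);
```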

~~~
gjstein
From the article:

> Even though List is very fast across the board no data-structure can be the
> fastest at everything.

This feels like a case in which everything is sufficiently well-implemented
that the No Free Lunch Theorem [1] starts to play a role.

[1]
[https://en.wikipedia.org/wiki/No_free_lunch_theorem](https://en.wikipedia.org/wiki/No_free_lunch_theorem)

------
kccqzy
That's pretty disingenuous. The native arrays aren't designed for immutability
at all. Of course an immutable data structure would be faster than copying the
entire array on every single operation.

------
keymone
Clickbait title. Fastest _doing what_?

~~~
ww520
Yeah. Array has well-known poor performance on insertion. It's a strawman.

~~~
The_rationalist
Arrays are faster than linked lists on insertions too; stop spreading
erroneous beliefs.

~~~
fgonzag
Why are you comparing the performance of a linked list instead of the relaxed
radix balanced tree the article talks about?

------
danite
I'd recommend people read this discussion of linked list vs array performance:
[https://rust-unofficial.github.io/too-many-lists/#an-obligatory-public-service-announcement](https://rust-unofficial.github.io/too-many-lists/#an-obligatory-public-service-announcement)

TL;DR: linked lists' theoretical performance advantages are often negated by
years of hardware optimization for dealing with arrays.
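A minimal sketch of the two layouts being contrasted; both compute the same sum, and the hardware effects the PSA describes come from the memory layout, not the algorithm:

```javascript
// The array stores its values contiguously (cache- and
// prefetcher-friendly); each linked-list node is a separately
// allocated object reached by pointer chasing.
const values = Array.from({ length: 100000 }, (_, i) => i);

// Build a singly linked list holding the same values.
let head = null;
for (let i = values.length - 1; i >= 0; i--) {
  head = { value: values[i], next: head };
}

let arraySum = 0;
for (const v of values) arraySum += v;

let listSum = 0;
for (let node = head; node !== null; node = node.next) {
  listSum += node.value;
}
// arraySum === listSum; only the traversal cost differs in practice.
```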

~~~
tomp
The original article is about lists implemented as tree-like data structures,
not linked lists (which are much slower for most operations).

