
When You Should Use Lists in Haskell (Mostly Not) - luu
http://www.imn.htwk-leipzig.de/~waldmann/etc/untutorial/list-or-not-list/
======
gamegoblin
Something the article does not touch on is the huge cache performance penalty
you pay with linked lists. With a linked list, all of the nodes are scattered
around your heap. So whenever you iterate a linked list, you are accessing
random memory locations. Random access is not really cacheable.

Compared to a contiguous memory structure such as a C-style array (and its
object oriented derivatives: vector, ArrayList, etc), iterating a linked list
is frequently 3-10x slower.

This is because the CPU will load contiguous chunks of memory into cache. On
x86, the cache block size is 64 bytes. So for an array of 4 byte ints, you are
loading 16 ints into cache at a time. With a linked list, you will likely get
a cache miss on most nodes and have to access RAM. Access time for L1 cache is
~0.5 nanoseconds; access time for RAM is ~100 nanoseconds.
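
To make the arithmetic above concrete, here is a back-of-the-envelope sketch in Haskell. It only encodes the numbers quoted in this comment (64-byte lines, ~0.5 ns for L1, ~100 ns for RAM); the function names are made up for illustration, and the model assumes a cold cache, no hardware prefetching, and a miss on every list node, so it gives a pessimistic upper bound rather than a measurement.

```haskell
-- Figures taken from the comment above; everything else is an assumption.
cacheLineBytes :: Int
cacheLineBytes = 64

l1Ns, ramNs :: Double
l1Ns  = 0.5
ramNs = 100

-- How many elements of a given size fit in one cache line.
elemsPerLine :: Int -> Int
elemsPerLine elemBytes = cacheLineBytes `div` elemBytes

-- Array traversal: one RAM access per cache line, L1 hits for the rest.
arrayCostNs :: Int -> Int -> Double
arrayCostNs elemBytes n =
  fromIntegral misses * ramNs + fromIntegral (n - misses) * l1Ns
  where
    perLine = elemsPerLine elemBytes
    misses  = (n + perLine - 1) `div` perLine  -- ceiling division

-- Linked-list traversal, worst case: every node is a cache miss.
listCostNs :: Int -> Double
listCostNs n = fromIntegral n * ramNs
```

Under these assumptions `elemsPerLine 4` is 16, matching the "16 ints" figure. Real traversals usually land below the worst case because some nodes do end up adjacent in memory.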

~~~
rbehrends
> Compared to a contiguous memory structure such as a C-style array (and its
> object oriented derivatives: vector, ArrayList, etc), iterating a linked
> list is frequently 3-10x slower.

Careful. I've actually done some recent benchmarking (in OCaml, not Haskell),
and the results don't necessarily bear that out, especially if you need
dynamic arrays rather than static arrays.

A particular use case is the creation of dynamically sized temporary data. A
bump allocator will put list nodes in a contiguous section of memory, but does
not require resizing. A dynamic array may be resized several times, requiring
additional copying and allocations (and possibly expensive major heap
allocations).
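
As a rough illustration of the resizing cost, the sketch below counts how many element copies a dynamic array performs while growing to hold `n` items. The growth policy (start at capacity 1, double when full) is an assumption picked for the example; real runtimes use different starting sizes and growth factors.

```haskell
-- Total element copies when appending n items to an array that starts
-- at capacity 1 and doubles each time it fills up (assumed policy).
-- Every capacity smaller than n is outgrown once, and outgrowing a
-- capacity c copies c elements into the new buffer.
resizeCopies :: Int -> Int
resizeCopies n = sum (takeWhile (< n) (iterate (* 2) 1))

-- Number of resizes under the same policy.
resizeCount :: Int -> Int
resizeCount n = length (takeWhile (< n) (iterate (* 2) 1))
```

For 1000 items this model gives 10 resizes and 1023 copied elements, so the copying is linear overall, but it is extra work that a bump-allocated list never does.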

As a result, the relative performance difference not only may not be as
dramatic, but can actually swing in the favor of linked lists.

Note that I'm not saying that performance degradation due to poor memory
locality of linked lists won't happen, just that there are plenty of use cases
where this is not a problem.

~~~
wcummings
You can allocate more space than you might need to avoid resizing. It's not
uncommon for Java programmers to pass a size when initializing empty array
lists, knowing they will be filled.

~~~
rbehrends
I am well aware of that, but you often don't have that information.

------
prestonbriggs
Grrr. None of this (article or most of the comments) has much to do with
Haskell. It applies to all languages, right? You implement an abstraction
using an appropriate data structure, accounting for the frequency of the various
operations you plan to perform. Often a list will be the wrong data structure -
no big surprise.

But if you'd like to implement a stack, perhaps with a lot of elements, then a
singly linked list is a reasonable choice. But you need to think about the
operations you need. If you're just pushing, popping, looking at the top
element, checking for empty, then great!

If you need to check the length, then you'll need to keep a counter, otherwise
you'll pay O(n) time. If you need to make a copy for read-only purposes,
great, constant time. If you need to make a copy for destructive use, then
you'll have to pay O(n) time. This is all Mom-n-apple pie.

~~~
xelxebar
You're absolutely right, but I think the article was mostly griping about
Haskell list syntax and how it can lure less experienced programmers into
using lists where they're inappropriate.

------
nilved
The first-class status of linked lists in Haskell really bothers me. In Idris
there's no special syntax for lists; lists of `a` are denoted `List a` and
`[]` is simply syntactic sugar for the `Nil` constructor of that module. This
means `List` and other structures like vectors are on the same level.

~~~
teh
I do agree, but I also remember that Haskell is 27 years old. Newer
Haskell-inspired languages like PureScript don't have a built-in list any more.

There's a lot of old stuff in Haskell, e.g. String which is a list of char. We
have a number of new preludes (base, foundation, protolude, ..) that improve
the situation a lot, so I'm not sure we really need a "python 3" moment.

We could definitely be more aggressive in pointing out that you need to use a
new prelude though.

~~~
nilkn
Any recommendations on which new prelude to use? I'm fairly competent with
Haskell but have never looked into these and am feeling decision paralysis --
too many choices.

~~~
nilved
I've tried a bunch of alternative preludes and my experience is that it makes
it very hard to integrate with code that uses the standard prelude. Foundation
seems to have the highest chances of success right now, ClassyPrelude seems to
be the most well-used, and Protolude seems to be more like a framework to
build your own prelude.

~~~
massysett
I've never understood the hard-to-integrate argument; I can still do `import
qualified Prelude as P`.

~~~
nilved
Right, but you need to do explicit conversions between `Foundation.String` and
`Prelude.String` at your app's boundaries (for example).

------
kevinclancy
I think the author has made a false assumption about how lists are typically
used in Haskell. Being algebraic datatypes, lists in Haskell are built on a
foundation of universal algebra. The operations that you perform on them are
inherently inductive: process the head and recurse on the tail. The most
common tasks in functional programming, such as program analysis, fit into
this paradigm nicely. So including "(Mostly Not)" in the title is misguided;
the designers of Haskell and other functional languages knew what they were
doing when they decided to make lists prominent and accessible in their
languages.

~~~
wrs
His point is Foldable is the universal thing -- applying an inductive
operation. List, on the other hand, is a particular choice of Foldable data
structure which is seldom the appropriate choice. The language has now
separated those two concepts, but the examples and tutorials still present
them as tied together.
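
A small sketch of that separation: the functions below are written against `Foldable` rather than against lists, so the same code works on a list, a `Data.Sequence`, or the values of a `Data.Map`. The function names are illustrative, and the example assumes the `containers` package is available (it normally ships with GHC).

```haskell
import qualified Data.Map as Map
import qualified Data.Sequence as Seq

-- Sum the elements of any Foldable container, not just a list.
total :: (Foldable t, Num a) => t a -> a
total = foldr (+) 0

-- Length of the longest string held in any Foldable container.
longest :: Foldable t => t String -> Int
longest = foldr (max . length) 0

fromAList, fromASeq, fromAMap :: Integer
fromAList = total [1, 2, 3]
fromASeq  = total (Seq.fromList [1, 2, 3])
fromAMap  = total (Map.fromList [("a", 1), ("b", 2)])  -- folds over the values
```

The inductive operation stays the same in all three cases; only the choice of container changes.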

~~~
theoh
Essentially, the desirable and real elegance of recursion is the thing that
got fused with the slightly simplistic notion of a linked list in the minds of
the language designers, then?

~~~
chongli
Haskell lists are not linked lists though; they're streams. In a typical
strict language any list you create goes straight on the heap. Haskell, being
a lazy language, is only going to evaluate what actually needs to be
evaluated. If you compose a whole bunch of functions on lists, GHC can often
fuse them, avoiding the allocation of all those intermediate lists.

Haskell does also have a Stream data type which is similar but features a
different set of tradeoffs.
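
A tiny example of the lazy, stream-like behaviour: the pipeline below is defined over an infinite list, yet only as much of it is evaluated as `take` demands. It illustrates laziness only; whether the intermediate lists are actually fused away depends on the compiler and optimisation level, which this sketch does not show.

```haskell
-- The source list [1 ..] is infinite; evaluation stops as soon as
-- five even squares have been produced.
firstEvenSquares :: [Integer]
firstEvenSquares = take 5 (filter even (map (\x -> x * x) [1 ..]))
-- [4,16,36,64,100]
```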

~~~
theoh
They have a nested structure, is all I meant by "linked list". I know Haskell
can deal with infinite streams (e.g. by generating the values with
corecursion) but the basic "singly-linked" character remains. They aren't
doubly-linked, easy to traverse in both directions: they are recursion-
centric, as the article stated.

The notorious/ingenious idea of zippers exists to facilitate sane navigation
and (effectively) mutation of data structures. It deals with precisely the
issue of the pointers in a functional data structure pointing in inconvenient
directions...
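
For readers who haven't met zippers, here is a minimal list zipper sketched in Haskell. The names are illustrative rather than taken from a library. The elements to the left of the focus are stored in reverse, so moving in either direction and "updating" the focused element are all O(1).

```haskell
-- Elements left of the focus (nearest first), the focus itself,
-- and the elements to the right.
data Zipper a = Zipper [a] a [a]

-- Focus on the first element; an empty list has nothing to focus on.
zipperFromList :: [a] -> Maybe (Zipper a)
zipperFromList []       = Nothing
zipperFromList (x : xs) = Just (Zipper [] x xs)

-- Move the focus one step; Nothing when already at that end.
left, right :: Zipper a -> Maybe (Zipper a)
left  (Zipper (l : ls) x rs) = Just (Zipper ls l (x : rs))
left  _                      = Nothing
right (Zipper ls x (r : rs)) = Just (Zipper (x : ls) r rs)
right _                      = Nothing

-- Replace the focused element without touching the rest.
modify :: (a -> a) -> Zipper a -> Zipper a
modify f (Zipper ls x rs) = Zipper ls (f x) rs

-- Rebuild the plain list from the zipper.
zipperToList :: Zipper a -> [a]
zipperToList (Zipper ls x rs) = reverse ls ++ x : rs
```

For example, `fmap (zipperToList . modify (* 10)) (zipperFromList [1, 2, 3] >>= right)` evaluates to `Just [1,20,3]`.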

------
sulam
My favorite data structure has a list and a map under the hood. Lists are used
precisely for the ordering, and you constantly index into the list, but only
with direct pointers. I dunno how you'd implement it in Haskell, but lists are
actually a really good building block -- although you often can replace them
with arrays if you're clever enough.

------
HugoDaniel
Insertion and removal from the start is *very* fast, one of the fastest data
structures for insertion in Haskell, while keeping laziness and persistence.
That makes them appropriate for a bunch of common use cases in functional
programming: recursion, logging, time traveling, etc.

The author somehow forgot to mention that.

------
harpocrates
Haskell lists should be thought of like (pure immutable) iterators. They have
the same cache-misses and possibly-infinite properties.

One could argue that the mistake Haskell made was to call them "lists" instead
of "streams", and to make it so easy to make list literals.

------
mixedCase
>NET::ERR_CERT_WEAK_SIGNATURE_ALGORITHM

Site needs to upgrade its certs.

~~~
paulddraper
Huh? What's being loaded over HTTPS?

~~~
kasbah
mixedCase is likely using HTTPS Everywhere, which redirects this page to the
https version.

[https://www.eff.org/https-everywhere](https://www.eff.org/https-everywhere)

~~~
mixedCase
This is correct. On HTTP-only sites I'm not redirected but this site does seem
to have HTTPS, just not properly configured for modern browsers.

~~~
paulddraper
They've got telnet and email servers running too, if you want to check
those out.

