
Understanding Clojure's Persistent Vectors, Pt. 3 - JBiserkov
http://hypirion.com/musings/understanding-persistent-vector-pt-3
======
ScottBurson
At the beginning of part 1, the author says that this data structure provides
" _practically_ O(1) runtime" (emphasis added) for the important operations of
lookups, insertion, updating, etc. At the end it is explained how it's
"O(log32 n)", which of course is actually O(log n).

And then here in part 3, where he keeps saying " _effectively_ constant time"
(emphasis in original this time), what he means is O(log n) time.

The Clojure people really need to stop doing this. Big-O notation is intended
to describe _how the time required for an operation grows_ as the number of
items being handled gets large. It is precisely _not_ intended to communicate
the absolute time the operation takes in small cases. They are confusing
people when they try to communicate both pieces of information at the same
time. What they should say is something like "lookup takes O(log n) time, and
for seqs of size up to N items, it takes M times as long as ArrayList.get".

And then, of course, to describe O(log n) time as "effectively constant" is
just lying.

(Second and fourth paragraphs added in edit.)

~~~
TheLoneWolfling
I have to disagree with you here.

Let me put it this way: There are theoretical limits to the upper bound of
problem size. The number of bits storable in the (observable) universe is
only, what, about 10^120? [1]

log32(10^120) is only around 80. So if you have a constant-time algorithm that
runs around 80x slower than a logarithmic algorithm for small cases, the
constant-time algorithm will _never_ be faster. Period. And I've seen
constant-time algorithms that run at >80x the cost of theoretically
asymptotically slower algorithms. I mean: a single cache miss can cause that
level of slowdown.

Or, to put it another way, log32(n) <= 80 for all achievable n. So,
realistically, you might as well consider it effectively O(80), i.e. O(1). You
have something similar, albeit more extreme, for inverse Ackermann time.

[1]
[http://physics.aps.org/story/v9/st27](http://physics.aps.org/story/v9/st27)
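
A quick sanity check of that arithmetic in Clojure itself; log32 here is just
change of base, not a core function:

    (defn log32 [n] (/ (Math/log n) (Math/log 32)))

    (log32 1e120) ;; => ~79.7, the bound quoted above
    (log32 1e9)   ;; => ~6.0, a billion elements is only ~6 levels deep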

~~~
ScottBurson
That argument applies to any O(log n) algorithm. The base of the logarithm
doesn't matter.

~~~
TheLoneWolfling
Well, yes and no.

Yes, in that the argument applies.

No, in that the bound is larger. It's more reasonable to say one algorithm is
80x faster than another of the same complexity than to say one algorithm is
roughly 400x faster than another of the same complexity.
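
For comparison, the same universe-scale bound in base 2 (which is where that
~400 comes from), checked the same way:

    (/ (Math/log 1e120) (Math/log 2))  ;; => ~399, versus ~80 for base 32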

------
tolmasky
I'm curious if anyone who has read (or is familiar with the data structures
described in) Purely Functional Data Structures[0] can weigh in on how this
persistent vector compares to, say, the real-time deque, which is described as
having _worst case_ O(1) runtime for cons/head/tail/snoc/last/init (along with
plenty of other structures that have similar amortized time guarantees, etc.).
Do these structures not have practical performance characteristics, or is
there another reason an O(log n) solution was chosen? Like the other poster
mentioned here, it's a bit upsetting, having read something that rigorously
analyzes worst-case behavior, to have an article hand-wave away log32 as
"effectively constant time". Especially because amortization is even trickier
to consider with persistent data structures, since you have to deal with the
fact that many operations may take place on your worst-case node (several
pushes on your edge-case base vector, etc.). Perhaps these tries actually
account for that very elegantly, but I certainly don't know enough to intuit
it from what I've seen here.

0. [http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504](http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504)

Edit: PDF version:
[http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf](http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf)

~~~
Kutta
The key point here is that Clojure-style vectors have logarithmic random
reads/writes, while Okasaki's aforementioned deques do the same in linear
time. Clojure uses the persistent vector by default, so it's essential that
people can port their usual imperative vector algorithms and expect to get
acceptable performance. Hence the focus on random access.

As to amortization, operations on Clojure-style vectors have pretty much
_zero_ variance (modulo the size of the vector). There is no resizing, no
reordering, no rotation or whatever, it's just the exact same algorithm every
time.
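
For concreteness, these are the operations in question on Clojure's built-in
vector; the complexity notes are the thread's claims, not measurements:

    (def v (vec (range 1000000)))

    (nth v 123456)        ;; random read: O(log32 n) hops down the trie
    (assoc v 123456 :new) ;; random write: new vector, shares nearly all nodes
    (conj v :pushed)      ;; push on the end, same near-constant cost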

------
lisper
I always thought "persistent" meant "stored in non-volatile storage" but this
author seems to be using it as a synonym for "immutable" or "functional" which
seems really weird to me. Is this a Clojure-ism?

~~~
augustl
In the context of data structures, persistent has a very specific meaning. So
it's not really a Clojure-ism.

[https://en.wikipedia.org/wiki/Persistent_data_structure](https://en.wikipedia.org/wiki/Persistent_data_structure)

It's not a synonym for immutable or functional. A persistent data structure is
one that shares structure with other data structures.

~~~
ajanuary
> A persistent data structure is one that shares structure with other data
> structures.

No, a persistent data structure is one where, when you apply an operation that
would ordinarily modify it, you instead get back a new data structure with the
operation applied, and the old data structure appears the same.

I could make an array persistent by copying the entire array each time. No
structural sharing at all.
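
A minimal sketch of that naive scheme in Clojure, using a boxed object array;
persistent-set is a made-up helper, not anything from core:

    (defn persistent-set [arr i x]
      (doto (aclone arr)   ;; copy the entire array...
        (aset i x)))       ;; ...and mutate only the copy

    (def a1 (object-array [10 20 30]))
    (def a2 (persistent-set a1 1 99))
    (vec a1) ;; => [10 20 30], the old version is untouched
    (vec a2) ;; => [10 99 30]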

Of course, you want structural sharing because it makes things a heck of a lot
more efficient, but it's not a part of the definition of a persistent data
structure.

~~~
lisper
Then I'm still confused. Is "persistent data structure" a synonym for "purely
functional data structure" in the sense of Okasaki96?

[http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf](http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf)

~~~
ajanuary
Persistence is defined in terms of the interface to an abstract data type.
When I do a write operation on a data structure, I get a new version back, and
all read operations on the old version still return the same values.

    
    
        version1 = make_record(x = 10)
        v1_x = get_x(version1)                 ; 10
        version2 = update(version1, x = 20)
        v2_x = get_x(version2)                 ; 20
        get_x(version1) == v1_x                ; true
    

The update operation is free to mutate the original data structure however it
likes, so long as from the outside the read operations look the same.

This freedom to mutate is often used to make efficient persistent data
structures by re-using the same memory but tagging different parts of it with
a version.

    
    
        version1 = make_record(x = 10)
    
                   +--------+--------+      +-----+   +--------+---------+--------+
        version1 ->| tag: 1 | data: -+----->| x: -+-->| tag: 1 | val: 10 | next: -+---|
                   +--------+--------+      +-----+   +--------+---------+--------+
    
    
        version2 = update(version1, x = 20)
            
                   +--------+--------+      +-----+   +--------+---------+--------+   +--------+---------+--------+
        version1 ->| tag: 1 | data: -+--+-->| x: -+-->| tag: 1 | val: 10 | next: -+-->| tag: 2 | val: 20 | next: -+---|
                   +--------+--------+  |   +-----+   +--------+---------+--------+   +--------+---------+--------+
                   +--------+--------+  |
        version2 ->| tag: 2 | data: -+--+
                   +--------+--------+
    

When version2 is created, it really just appends to the same data structure as
version1, but tags the cell as a new version. When you do get_x(version1) it
follows down the list, finds the cell with tag 1, and returns its value. When
you do get_x(version2) it follows down the list until the cell with tag 2 and
returns its value.

From the outside, get_x(version1) still returns the same value before and
after the update, but the data structure it refers to has been mutated
underneath it.
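
Here's a hedged Clojure sketch of that fat-node trick, with the shared mutable
cell list modelled as an atom; make-record, update-x and get-x are made-up
names mirroring the pseudocode above:

    (defn make-record [x]
      {:tag 1 :cells (atom [{:tag 1 :val x}])})   ;; shared, mutable history

    (defn update-x [version x]
      (let [cells   (:cells version)
            new-tag (inc (:tag (peek @cells)))]
        (swap! cells conj {:tag new-tag :val x})  ;; mutates the shared list
        {:tag new-tag :cells cells}))             ;; new handle, same storage

    (defn get-x [version]
      ;; newest cell whose tag is visible to this version
      (->> @(:cells version)
           (filter #(<= (:tag %) (:tag version)))
           (last)
           (:val)))

    (def version1 (make-record 10))
    (def version2 (update-x version1 20))
    (get-x version1) ;; => 10, even though its storage was mutated
    (get-x version2) ;; => 20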

There are also ways to use structural sharing in persistent data structures
where you don't do any mutation of the old versions. For examples of this, see
the linked article.

Functional data structures forbid mutation. This means they must be
persistent: update operations have to return new data structures, and since
the old data structures can't be mutated, all read operations on them by
definition still return the same values.
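
In Clojure that purely functional flavour is just the default; nothing here
mutates:

    (def version1 {:x 10})
    (def version2 (assoc version1 :x 20))
    (:x version1) ;; => 10, the old version is untouched by construction
    (:x version2) ;; => 20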

Functional data structures end up forming a subset of fully persistent data
structures. They're a useful category for a lot of the same reasons that pure
functional languages are touted as useful. For instance, you never have to
worry about locks when you have concurrent access. With the mutating example
above, if one thread does update(version1, x = 20) and another does
update(version1, x = 30), I need to coordinate which one gets tag 2 and which
one gets tag 3.

Persistent data structures are the more general concept, functional data
structures are a specialized subset.

