
Turn O(n^2) reverse into O(n) - ddinh
https://github.com/nominolo/HTTP/commit/b9bd0a08fa09c6403f91422e3b23f08d339612eb
======
drostie
Explanation:

bufOps is a dictionary which holds a bunch of functions accessed with the
getters on it. For the sake of this comment, we can concretize and use
(buf_empty bufOps) as [] and (buf_append bufOps) as ++.

This code then essentially performs:

    
    
        foldr (flip (++)) [] xs
    

Which, if you look up the definition of foldr, is:

    
    
        ((([] ++ xN) ++ ... ) ++ x2) ++ x1
    

And a definition of ++ is of course:

    
    
        [] ++ ys = ys
        (x:xs) ++ ys = x : (xs ++ ys)
    

This means that for lists of this sort a ++ b runs in time O(length a),
because it has to descend down the leftmost list to find the empty list --
only once it finds [] can it "work its way backwards" to append elements from
a onto b.

If each of the x1, x2, ... xN has m elements, then we do 0 + m + 2m + ... +
(N-1)m = m * N * (N-1) / 2 operations: each ++ does work proportional to the
length of its left argument, which grows by m with every append, and we do N
appends in total; it's O(N^2).

The new algorithm, `concat (reverse xs)`, works because `xs` is just a list,
which can be reversed by traversing it once in O(N) time; the reversed chunks
can then be concatenated in O(N * m) time.
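
To make the difference concrete, here is a small sketch with plain lists
standing in for the buffer type (the same concretization as above); `slowJoin`
mirrors the old line and `fastJoin` the new one:

```haskell
-- Plain lists standing in for the buffer type, as concretized above:
-- (buf_empty bufOps) ~ [] and (buf_append bufOps) ~ (++).
slowJoin, fastJoin :: [[a]] -> [a]
slowJoin = foldr (flip (++)) []   -- the old line: left-nested (++), O(N^2 * m)
fastJoin = concat . reverse       -- the new line: O(N) reverse + O(N * m) concat

main :: IO ()
main = do
  let chunks = ["one", "two", "three"]
  print (slowJoin chunks)                     -- "threetwoone"
  print (slowJoin chunks == fastJoin chunks)  -- True
```

Both produce the same (reversed) join; only the shape of the nested appends,
and therefore the running time, differs.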

------
zvrba
So the original line of code has apparently been present since the very first
import by Sigbjørn Finne in
[https://github.com/nominolo/HTTP/blob/c4765e822eb92196fec955...](https://github.com/nominolo/HTTP/blob/c4765e822eb92196fec95560f8f640018d18737f/Network/HTTP/Base.hs)
(check line 443).

If a Haskell expert (the author of hdirect -- an IDL compiler and interface to
Win32 COM, sadly defunct now) makes this kind of mistake, how are mere mortals
supposed to reason about algorithmic efficiency?

~~~
velis_vel
I don't think this was an 'oh my god we're trying to track down why cabal
update is so slow and we just can't find it' issue like you make it out to be.

edit:
[http://www.reddit.com/r/haskell/comments/1sh67u/the_reason_w...](http://www.reddit.com/r/haskell/comments/1sh67u/the_reason_why_cabal_update_takes_so_long/cdxnr5l)
This comment by the patch author indicates that actually tracking it down was
a fairly straightforward profiling job.

~~~
zvrba
I didn't try to imply that it was difficult to find the performance problem.
It's an observation on something else: Haskell is often mentioned for its
equational reasoning. I find it interesting that correct equational reasoning
in this case led to a correct program, but one with abysmal performance.

It'd be interesting to think how one could encode performance characteristics
into equations.

~~~
Peaker
I think having simple and direct denotational semantics necessarily means
having relatively complex, indirect operational semantics (e.g: Haskell).
Having a simple and direct operational semantics necessarily means having
complex and indirect denotational semantics (e.g: C).

C makes performance easy and correctness hard. Haskell makes correctness easy
and performance hard.

The trick is, in most programs, you only need performance for a tiny subset of
the program. You need correctness throughout the whole program.

~~~
zvrba
Why do you think that there must be a tradeoff between simplicity of
denotational and operational semantics?

~~~
Peaker
Because operational semantics (of contemporary computers, at least) are very
different from a simple denotational semantics.

We have to bridge that gap:

1) Either use a compiler that hides away the operational details

2) Or use a language that directly maps to the operational semantics, but then
is necessarily far away from the denotational ones

------
jberryman
Here is the context

[http://www.reddit.com/r/haskell/comments/1sh67u/the_reason_w...](http://www.reddit.com/r/haskell/comments/1sh67u/the_reason_why_cabal_update_takes_so_long/)

------
djulius
I guess nobody ever got upvoted for using StringBuffer/StringBuilder instead
of String concatenation in Java.

~~~
m0shen
In JDK 7 '+' concatenation actually uses StringBuilder under the hood, so
probably nobody cares.

I can't tell if you're being facetious.

~~~
djulius
Indeed I was.

It's a good sanity check against hype.

------
thinkpad20
I'd definitely like to see an explanation of what's going on, both at a high
level and the specifics of the code. As a Haskell beginner-intermediate, I
don't really know what most of those functions are doing (much less the
context of what that function's purpose is), but I feel I could probably
understand an explanation if it were given.

~~~
MBCook
I think I know enough Haskell to explain.

Old version:

    
    
        foldr (flip (buf_append bufOps)) (buf_empty bufOps) strs
    

foldr is a combination map/reduce in one. It takes a value (the accumulator),
runs a function on the rightmost value of a list called strs (that's why it's
'foldr'), and combines the two together into a new accumulator value.

Let's say we have a list of numbers, the addition operator, and an accumulator
of 0.

    
    
        foldr op acc list
        foldr + 0 [1, 2, 3, 4]
        a = 0, list = [1, 2, 3, 4]
        a = 0 + 4, list = [1, 2, 3]
        a = 4 + 3, list = [1, 2]
        a = 7 + 2, list = [1]
        a = 9 + 1, list = []
        Result is 10
    

One thing to note is that this walks the list 'backward', and in Haskell it's
a singly linked list. That means that each time it has to _walk the full
list_. I believe this is why it's O(n^2).

It looks like the accumulator starts with (buf_empty bufOps), which from
context I assume is an empty buffer structure. The operator is (flip
(buf_append bufOps)), which looks like it appends values onto the buffer when
called. Each time it goes through this process, it appends one element.

Because they're using foldr and the list is walked 'backward', this has the
natural side effect that the order of the items in strs is reversed as it's
added to bufOps.

New version:

    
    
        buf_concat bufOps $ reverse strs
    

The $ is a way of controlling precedence in Haskell; you can replace it with
parentheses. So we get:

    
    
        buf_concat bufOps ( reverse strs )
    

So reverse just reverses the order of a list, getting things in the order the
code wants them. Then it calls buf_concat with the buffer operations and the
items it wants joined. So instead of taking one existing list and appending N
new elements to the end one at a time, it concatenates all N chunks in a
single pass.

It seems like a relatively simple change. I wonder if using foldl or foldl' to
avoid having to rewalk the list every time would have performance similar to
the new line.

I find the new line simpler and cleaner either way, but was the choice of a
_right_-fold the cause of the performance problem?
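
For what it's worth, a quick sketch (plain lists standing in for the bufOps
buffer, so the flipped append is just flip (++)) suggests a strict left fold
would indeed match the new line, because it builds right-nested appends:

```haskell
import Data.List (foldl')

-- Plain lists in place of the bufOps buffer, so the flipped append is flip (++).
viaFoldl, viaConcat :: [[a]] -> [a]
viaFoldl  = foldl' (flip (++)) []   -- builds x3 ++ (x2 ++ (x1 ++ [])): right-nested
viaConcat = concat . reverse        -- what the patch does, modulo bufOps

main :: IO ()
main = print (viaFoldl ["a", "bb", "ccc"] == viaConcat ["a", "bb", "ccc"])  -- True
```

With right-nested (++) each chunk is walked only once, so the total work is
linear in the output length.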

~~~
quchen
> One thing to note is that this walks the list 'backward'

No, foldr walks through the list forward, and exactly once. Your example is
evaluated as

    
    
        foldr (+) 0 [1, 2, 3, 4]
        = 1 + foldr (+) 0 [2,3,4]
        = 1 + (2 + foldr (+) 0 [3,4])
        = 1 + (2 + (3 + foldr (+) 0 [4]))
        = 1 + (2 + (3 + (4 + foldr (+) 0 [])))
        = 1 + (2 + (3 + (4 + 0)))
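
A cheap way to convince yourself that foldr starts from the front: with a
sufficiently lazy operator it can produce output from an infinite list, which
would be impossible if it had to reach the end first. A minimal sketch:

```haskell
main :: IO ()
main = do
  -- foldr hands the head to the operator first, so with a lazy operator
  -- it can produce output from an infinite list:
  print (take 3 (foldr (:) [] [1 ..]))  -- [1,2,3]
  -- with a strict operator like (+), the nested additions collapse at the end:
  print (foldr (+) 0 [1, 2, 3, 4])      -- 10
```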

~~~
e12e
Is there an (easy) way to see what/how Haskell evaluates such code (i.e., step
through)? Something like EXPLAIN from SQL or DISASSEMBLE from Lisp?

~~~
jerf
GHCi has a debugger. You can step into an expression and watch as Haskell
evaluates it, in full lazy order.

Even if you understand lazy evaluation in theory, I recommend taking some
non-trivial (but not too large) expression and :step'ing through the whole
evaluation sometime, just to train your intuition.

------
tibbon
My Haskell skills still are growing. Can someone explain?

~~~
seliopou
The original line was something like this:

    
    
      var acc = "";
      for (var i = 0; i < strings.length; i++) {
          acc += strings[i];
      }
    

The new version is something like this:

    
    
      strings.join("")
    

The second version can pre-allocate a string and copy characters into that
string under the hood. The first version has to allocate a new string and
recopy the characters on every iteration.

~~~
dmak
How is the original (the for loop that you mentioned) O(n^2)? Isn't that O(n)?
Isn't acc += strings[i] an O(1) operation?

~~~
MBCook
But strings.length is O(n), and you have to run it n times.

~~~
comex
No, it is not.

~~~
jefffoster
With the naive representation of a string as a list of characters [Char], it
is. The only way to get to the end of the list is to recursively take the tail
of the string (see the source at
[http://hackage.haskell.org/package/base-4.6.0.1/docs/src/GHC...](http://hackage.haskell.org/package/base-4.6.0.1/docs/src/GHC-List.html#length)).

There are alternative representations with different trade-offs, such as
Data.Text.
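
For the curious, the recursive structure behind that linked definition looks
essentially like this (a simplified sketch; the real GHC.List version uses an
accumulator, but the O(n) walk is the same):

```haskell
-- A simplified sketch of list length: the only way to find the end of a
-- singly linked list is to walk every cell, hence O(n).
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

main :: IO ()
main = print (len "hello")  -- 5
```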

~~~
comex
Heh... but I was referring to the example seliopou gave, which appears to be
in JavaScript, and strings is [String] rather than [Char]. I think the analogy
might cause more confusion than clarification.

------
seliopou
I for one prefer the more "algebraic" implementation. Why worry about
performance when you get theorems for free?!

Monads.

~~~
oakwhiz
Why not upgrade the compiler to detect the algebraic implementation and
optimize it to the buffer implementation?

~~~
tikhonj
It actually does this in a bunch of cases; see Haskell's stream fusion[1] and
rewrite rules in general[2] for examples.

However, there aren't any rewrite rules for list-based code, at least with the
standard library. Rewrite rules are generally used for libraries designed
explicitly with performance in mind like Vector and Bytestring; the standard
String type and the Prelude don't fall into this category.

[1]: [http://stackoverflow.com/questions/578063/what-is-haskells-s...](http://stackoverflow.com/questions/578063/what-is-haskells-stream-fusion)

[2]: [http://www.haskell.org/ghc/docs/7.0.1/html/users_guide/rewri...](http://www.haskell.org/ghc/docs/7.0.1/html/users_guide/rewrite-rules.html)

~~~
lmm
Obvious follow-up question: why does cabal not use more performance-oriented
libraries? Does it have to avoid depending on anything because it's the
dependency manager?

------
anonymouscowar1
Do people try and optimize Haskell programs? This is a part of Haskell that
terrifies me.

~~~
seliopou
Look. I love Haskell. I usually formulate solutions to programming problems in
my head in Haskell before I write code in whatever language I have to deal
with. Constructing programs using combinators? Great. Assuming referential
transparency everywhere? Great. It's a great language.

But there comes a point in time where you have to project that stuff all away
and get down to the brass tacks of what's happening in your system, leaving
behind the Platonic heaven you've constructed for yourself. As the patch
proves without a doubt, this is very possible to do in Haskell. But sadly, the
general ethos of the Haskell community does not bend in that direction. I am a
lover of the language, but at the same time a vocal critic of the community in
that respect.

~~~
tikhonj
Given the amount of effort and mindshare that libraries like Vector,
Bytestring and Repa get in the Haskell community, I really don't see where you
got your impression. Not everyone cares about performance, sure, but _plenty_
of people do, leading to some heavily optimized libraries.

In this day and age, having everyone worry about performance is patently
absurd. Most programs don't need to be heavily optimized, so prioritizing
programmer efficiency and correctness just makes sense. And Haskell _does_
have the libraries, language features and people to optimize the parts that
need optimization!

------
thomasahle
Doesn't hlint catch stuff like this?

~~~
gwern
How would hlint catch this? The offending line is

    
    
        foldr (flip (buf_append bufOps)) (buf_empty bufOps) strs
    

hlint doesn't know what 'buf_append bufOps' is, or 'buf_empty bufOps' or
'strs'.

If you saw 'foldr (flip (u v)) (x y) z', would you immediately go 'ah, n^2!
better throw a warning'? If not, why do you expect hlint to warn?

------
taybin
Yep, this can also bite you in Erlang.

