
Python Patterns: An Optimization Anecdote - llambda
http://www.python.org/doc/essays/list2str.html
======
Xion
Bear in mind this is extremely outdated article and some of the advice should
be taken with very big grain of salt, as things have changed since 2002. The
most significant improvement of Python (in this context) is the introduction
of generators, which seem like an almost obvious choice for a problem such as
this. The solutions that immediately came to my mind, for example, were this:

    
    
        import itertools
        def f2_1(list):
            return str(itertools.imap(chr, list))
    

and this:

    
    
        def f2_2(list):
            return str(chr(c) for c in list)
    

They are both about twice as fast (in 2.7.2 CPython) as the suggested array
solution (f2_1 performs slightly better for smaller lists), and their
readability is rather significantly better.

~~~
todddeluca
To avoid confusing newbies, str() does not concatenate a list (or generator,
etc.) of strings into a single string. You probably meant to call ''.join()
instead.

    
    
      >>> import itertools
      >>> def f2_1(list):
      ...         return str(itertools.imap(chr, list))
      ... 
      >>> f2_1(range(10))
      '<itertools.imap object at 0x10048e850>'
      >>> def f2_2(list):
      ...         return str(chr(c) for c in list)
      ... 
      >>> f2_2(range(10))
      '<generator object <genexpr> at 0x10046b780>'

------
epidemian
I'm a total newbie at Python, but wouldn't

    
    
      ''.join(map(chr, list))
    

avoid the quadratic time of doing "string += something" (and also be a
prettier solution IMO)?

(In Python 2.x doing ''.join(chr(n) for n in list) might be a better solution
to avoid the intermediate list created by map)

~~~
sgk284
The article is from 2002, so it's entire line of reasoning is quite outdated.

------
casenelson
Just a note that there was good discussion about this before [1].

Remember that this essay is quite old (2002) and Python has changed
significantly. [2]

[1] <http://news.ycombinator.com/item?id=1859417>

[2]
[http://keithdevens.com/weblog/archive/2002/Apr/10/AnOptimiza...](http://keithdevens.com/weblog/archive/2002/Apr/10/AnOptimizationAnecdote)

~~~
machrider
Thanks for the links. The most interesting thing in the previous HN thread was
learning that this article is a) really old (probably outdated), and b)
written by Guido!

------
njharman
It's troubling to read something about Python optimization/performance without
mention that this is atrocious in a loop

string = string + s

That is such a code smell I cringed every time I saw it.

Should append to list and join at the end (once). Which, I bet (I, or you,
don't know anything until we actually profile), dumping the += is significant
reason why last two examples where _much_ faster.

~~~
chubot
I still follow that rule, perhaps somewhat anachronistically. This has been
long optimized in CPython so it's not quadratic:

$ python -V

Python 2.5.2

$ python -m timeit -s 'v=["a"]' 'v.append("a")'

1000000 loops, best of 3: 0.199 usec per loop

$ python -m timeit -s 's="a"' 's+="a"'

10000000 loops, best of 3: 0.127 usec per loop

The string append is actually FASTER now. Tried with strings of length 5 and
it is still faster.

------
todddeluca
Here are performance numbers for a variety of approaches using a list of
length 256. You can see that having a local reference to the global chr
function (as discussed in the article) can give a slight speed boost. Using
join() in conjunction with map() performs fairly well.

    
    
      python -m timeit -s 'l = range(256);' '"".join([chr(i) for i in l])'
      # 10000 loops, best of 3: 80.5 usec per loop
      python -m timeit -s 'l = range(256); lchr = chr' '"".join([lchr(i) for i in l])'
      # 10000 loops, best of 3: 75 usec per loop
      python -m timeit -s 'import itertools; l = range(256); lchr = chr' '"".join(itertools.imap(chr, l))'
      # 10000 loops, best of 3: 52.8 usec per loop
      python -m timeit -s 'l = range(256); lchr = chr' '"".join(map(chr, l))'
      # 10000 loops, best of 3: 48.9 usec per loop
      python -m timeit -s 'l = range(256); lchr = chr' '"".join(lchr(i) for i in l)'
      # 10000 loops, best of 3: 87.2 usec per loop
      python -m timeit -s 'l = range(256); lchr = chr' '
      s = ""
      for i in l:
        s += lchr(i)
      '
      # 10000 loops, best of 3: 64.2 usec per loop

------
beaumartinez
This[1] was my bible when I first wondered how best to build strings in
Python.

[1] <http://www.skymind.com/~ocrow/python_string/>

------
kitanata
", ".join([0,1,2,3,4,5,6,7,8,9]) works well. Why not that?

