
Brilliant or insane code? - StavrosK
http://www.stavros.io/posts/brilliant-or-insane-code/
======
hartror
This is in the zip documentation as the way of solving this problem. Sort of
surprised the author didn't look up the documentation before writing what is
otherwise a very good post.

    
    
        The left-to-right evaluation order of the iterables is guaranteed. This makes
        possible an idiom for clustering a data series into n-length groups using
        zip(*[iter(s)]*n).
    

[http://docs.python.org/2/library/functions.html#zip](http://docs.python.org/2/library/functions.html#zip)
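
For reference, here's the documented idiom in action (a minimal sketch, assuming
Python 2's list-returning zip):

    
    
        >>> s = [1, 2, 3, 4, 5, 6]
        >>> zip(*[iter(s)] * 3)
        [(1, 2, 3), (4, 5, 6)]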

~~~
impressive
Recently I was downvoted 2-3 times on S.O. for an answer that was claimed to
be non-idiomatic. So, I cleaned it up, but it really irked me. What I had
written was totally fine. It shouldn't have hurt anyone's eyes. It was direct.
It was in-your-face. It was not magic.

Reading this post brought back that feeling.

If people don't understand a completely valid and terse way of coding
something, sometimes instead of bothering to understand it, they will bash it.
Sometimes, this is a totally valid way to vent frustration, and then they
learn something new, and all is good. But sometimes, it just gets left as
"this is wrong" and then someone else thinks it is wrong, and so on. _That_ is
wrong, and tech leads or architects that enforce such crap will bug the living
shit out of good developers and lose them.

I appreciate clarity. But, terse one-liners can be just as clear if not
clearer than code that unnecessarily adds more methods/functions/names/local
vars and claims to be "more testable", etc.

You shouldn't have to sacrifice the ability to be terse and clear at the same
time. Testing is no excuse for code bloat. You can likely write a test that
executes the behavior without having to atomize it. Assess the amount of
production and test code you are writing. How much more code are you actually
having to write in order to test, both in the tests themselves and in the code
which you are having to test?

It happened here with Python and it happens in many languages. It even happens
with laws and regulations in government. If someone gets the same thing done
just as ethically but without the bureaucracy, just appreciate it as another
perhaps better way of doing something. Don't bash it publicly because you
don't understand it.

~~~
chipsy
I think the mentality can be extended to most code-reading. Don't get it by
skimming? Must be crap code. I only got away from that when I changed my
litmus test towards whether I could write on top of the codebase successfully,
not how it looked. Today my only real point of judgment about the look of code
is whether it's written in a style that increases average error rate.

w/r to Python in particular, it has a history of ending up with idioms that
are "tricky" and not particularly more or less terse than other techniques,
but are able to exploit the standard library functions to get a faster-running
result.

This is, of course, at odds with the motto of "there should be only one
(obvious) way to do it," so every experienced Python programmer has to
internalize a small dictionary of idiomatic one-liners for these exceptional
cases. (Fortunately, it's not that big. I can only think of three or four off
the top of my head.)

~~~
specialist
_style that increases average error rate_

Errors per LOC is allegedly constant, across all languages.

I also consider the cost of change when designing things.

I once created an HL7 wrapper that was a marvelous thing of beauty. Fluent
API, clever use of the type system. But no one could maintain it, including
me. It had too much magic. So I scrapped it, went with a dumber
implementation.

~~~
aidenn0
> Errors per LOC is allegedly constant, across all languages.

If this is true, then a more verbose style will have a higher error rate.
(more LOC to do the same task -> more errors)

~~~
adventureloop
> Errors per LOC is allegedly constant, across all languages.

This implies that more expressive languages (more expressions per line) are
less prone to errors.

Your conclusion would be correct if he had stated that errors per expression
were constant between languages.

~~~
chii
Don't confuse correlation with causation. It may not be that more lines are the
cause, but that the way of thinking some languages force you into is error
prone, and it just so happens that such a language also needs more LoC.

------
scotty79

        i = iter(array)
        return zip(i, i, i)
    
    

There you go. All but the necessary magic gone, with just one line more.

~~~
jasallen
It is clearer, but uses a fixed number of arguments, so may be unsuitable for
some applications.

~~~
aidos
The current function has the same restriction. In their case it's in the form
of " * 3 ".

~~~
garethadams
That `3` can be trivially replaced with a variable in a way that `(i, i, i)`
cannot. But still, the refactoring of the iterator into a separate variable is
the key step which reinforces that the same object is being passed 3 times.
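
Something like this (a rough sketch combining both points; the function name is
just for illustration) keeps the named iterator while making the group size a
parameter:

    
    
        def chunks(seq, n):
            i = iter(seq)          # one iterator object...
            return zip(*[i] * n)   # ...passed to zip n times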

------
Jach
I'm glad Clojure has top-level support for this operation... it's quite
flexible too, and the presence of partition-all makes it explicit what you
should expect if the sequence doesn't evenly partition.

    
    
        user=> (partition 3 [1 2 3 4 5 6])
        ((1 2 3) (4 5 6))
        user=> (partition 3 [1 2 3 4 5 6 7])
        ((1 2 3) (4 5 6))
        user=> (partition-all 3 [1 2 3 4 5 6 7])
        ((1 2 3) (4 5 6) (7))
        user=> (partition 3 3 (repeat 0) [1 2 3 4 5 6 7])
        ((1 2 3) (4 5 6) (7 0 0))

~~~
phpnode
FAO: iamgopal - your account has been dead for over a year for no discernible
reason, only people with showdead on can see your posts.

------
d0mine
From Itertools Recipes [6]:

    
    
      from itertools import zip_longest
    
      def grouper(iterable, n, fillvalue=None):
          "Collect data into fixed-length chunks or blocks"
          # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
          args = [iter(iterable)] * n
          return zip_longest(*args, fillvalue=fillvalue)
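    
      # Illustration (not part of the recipe): the fill value pads the last chunk
      # instead of dropping it, e.g.
      #   list(grouper('ABCDEFG', 3, 'x'))  ->  [('A','B','C'), ('D','E','F'), ('G','x','x')]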
    

\- What is the most “pythonic” way to iterate over a list in chunks? [1]

\- Idiomatic way to take groups of n items from a list in Python? [2]

\- Python “Every Other Element” Idiom [3]

\- Iterate an iterator by chunks (of n) in Python? [4]

\- How do you split a list into evenly sized chunks in Python? [5]

[1]: [http://stackoverflow.com/questions/434287/what-is-the-
most-p...](http://stackoverflow.com/questions/434287/what-is-the-most-
pythonic-way-to-iterate-over-a-list-in-chunks)

[2]: [http://stackoverflow.com/questions/2461484/idiomatic-way-
to-...](http://stackoverflow.com/questions/2461484/idiomatic-way-to-take-
groups-of-n-items-from-a-list-in-python)

[3]: [http://stackoverflow.com/questions/2631189/python-every-
othe...](http://stackoverflow.com/questions/2631189/python-every-other-
element-idiom)

[4]: [http://stackoverflow.com/questions/8991506/iterate-an-
iterat...](http://stackoverflow.com/questions/8991506/iterate-an-iterator-by-
chunks-of-n-in-python)

[5]: [http://stackoverflow.com/questions/312443/how-do-you-
split-a...](http://stackoverflow.com/questions/312443/how-do-you-split-a-list-
into-evenly-sized-chunks-in-python)

[6]: [http://docs.python.org/3/library/itertools.html#itertools-
re...](http://docs.python.org/3/library/itertools.html#itertools-recipes)

~~~
iooi
Figured I would add my own [7]. I know the example prompt might seem a bit
specific, but it can come in really handy sometimes (for example, iterating
through GET parameters). Always a fan of idiomatic and one-liner Python.

I should mention that I ended up using the fourth version (seemingly the
slowest) but it is actually the fastest depending on your input -- as the
length of the elements gets larger, the fourth method tends to vastly
outperform the others.

[7] [http://stackoverflow.com/questions/16685545/elegantly-
iterat...](http://stackoverflow.com/questions/16685545/elegantly-iterating-
over-string-replacement)

------
bjourne
I think the code is pretty ok, except for the stupid name, docstring and that
it is a method and not a free function.

    
    
        def chunks(seq, n):
            "groups the elements of the seq into a list of n-sized chunks."
            return zip(*[iter(seq)]*n)

~~~
inglesp
The original docstring is a complete red herring... yours is much better, but
I'd go one further and include an example.
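
Something along these lines, say (just a sketch; the doctest output assumes
Python 2, where zip returns a list):

    
    
        def chunks(seq, n):
            """Group the elements of seq into n-sized tuples.
            >>> chunks([1, 2, 3, 4, 5, 6], 3)
            [(1, 2, 3), (4, 5, 6)]
            """
            return zip(*[iter(seq)] * n)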

------
mjburgess
It does not rely on an implementation detail; that is how iterators work. He's
just supplied the same iterator to a function which consumes iterators...
that's exactly the expected behaviour.
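
In other words, roughly (a toy illustration of consuming one shared iterator):

    
    
        >>> i = iter([1, 2, 3, 4, 5, 6])
        >>> next(i), next(i), next(i)
        (1, 2, 3)
        >>> next(i), next(i), next(i)
        (4, 5, 6)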

~~~
tiziano88
The fact that zip evaluates its arguments in order is an implementation
detail. It could evaluate them in reverse order, in which case this code would
not behave as expected.

~~~
StavrosK
That's what I meant, thanks for clarifying. It does seem that the zip() docs
make a guarantee of left-to-right evaluation, though, so my apprehension
proved unfounded.

------
raphael_kimmig
You can also do

    
    
        zip(arr[::3], arr[1::3], arr[2::3])
    

which is nearly as fast but doesn't work with iterators. If you want to use
iterators you could also do

    
    
        zip(islice(arr, 0, None, 3), islice(arr, 1, None, 3), islice(arr, 2, None, 3))
    

which is a tad slower.

------
s_q_b
It wouldn't have occurred to me to do this a different way. Isn't this a very
basic use of zip()?

------
chilldream
The part that annoys me is that the docstring mentions "dictionaries" when I
see no `dict`s.

~~~
StavrosK
Yeah, looks like a remnant that was never updated.

------
signed0
It's not brilliant.

This accomplishes the same thing without being hard to understand:

    
    
        from itertools import islice
    
        iterator = iter(array)
        while True:
            chunk = list(islice(iterator, 3))
            if not chunk:  # islice just returns an empty list once the iterator is exhausted
                break
            yield chunk

~~~
StavrosK
Sure, there are many ways to do it, but I think the author was going for speed
here.

~~~
signed0
I'd be surprised if my way is slower. Any time you unpack into a function such
as zip() python has to create an intermediary list to store all the results
before calling the function.

~~~
kroger
You could always use izip:

[http://docs.python.org/2/library/itertools.html#itertools.iz...](http://docs.python.org/2/library/itertools.html#itertools.izip)

~~~
signed0
I was a bit wrong about that. For some reason I imagined that there would be
quite a few args being passed to zip, when in fact there are just the three
iterators. It does create a temporary list, but it's so small it's negligible.
Using izip wouldn't really change anything.

------
auvrw
like dict(zip( or dict(getmembers(asdf)).keys(), it's idiomatic code. it
wouldn't have occurred to me the first time i had to write such a function,
but now that i've taken a few moments to read the article i find it clearer
than the list comprehension version (because the constant only appears once)
and nicer than the numpy version in that it doesn't require an extra
dependency.

may save a few keystrokes some rainy day. good post.

------
joshcorbin
Even more interesting to me:

    
    
        In [3]: ar = [1, 2, 3, 2, 4, 6, 3, 5, 7, 3, 5, 8]
    
        In [4]: %timeit zip(*[iter(ar)]*3)
        100000 loops, best of 3: 2.02 us per loop
    
        In [5]: %timeit zip(ar[0::3], ar[1::3], ar[2::3])
        1000000 loops, best of 3: 1.37 us per loop
    
        In [6]: %timeit zip(*(iter(ar),)*3)
        1000000 loops, best of 3: 1.34 us per loop
    

From which I conclude:

\- zipping slices is even more efficient, and arguably easier to grok

\- but you get about the same runtime by multiplying a singleton tuple rather
than a list

However, if you want to generalize the chunk size, multiplication seems to win
out over slicing (with tuples still being more efficient than lists):

    
    
        In [7]: chunk1 = lambda n, it: zip(*[iter(it)]*n)
    
        In [8]: chunk2 = lambda n, it: zip(*(iter(it),)*n)
    
        In [9]: chunk3 = lambda n, seq: zip(*(seq[i::n] for i in xrange(n)))
    
        In [10]: %timeit chunk1(3, ar)
        100000 loops, best of 3: 2.32 us per loop
    
        In [11]: %timeit chunk2(3, ar)
        1000000 loops, best of 3: 1.83 us per loop
    
        In [12]: %timeit chunk3(3, ar)
        100000 loops, best of 3: 3.55 us per loop

------
wcarss
Maybe I'm misreading it, but to me the far more heinous crime than using
difficult code (setting aside that it's in the docs) is that the docstring is
a total lie.

    
    
        """Parses an array of xyz points and returns a array of point dictionaries."""
    
        'Only, it doesn’t really. It takes an iterable of points...and returns an iterable of 3-tuples of groupped points'
    

The wrongness of it would cause me to double-take, because even if I were
familiar with this usage, it isn't what the comment suggests is happening. A
docstring more like:

    
    
        """Parses an iterable of values [x,y,z,x,y,z...] and returns an iterable of 3-tuples: [(x,y,z),(x,y,z)...]"""
    

Would be a lot more clear by simple virtue of truth, even if it didn't explain
the code step by step.
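
Putting it together, something like this (the function name here is made up;
only the docstring matters):

    
    
        def parse_points(values):
            """Parses an iterable of values [x,y,z,x,y,z...] and returns an iterable of 3-tuples: [(x,y,z),(x,y,z)...]"""
            return zip(*[iter(values)] * 3)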

------
davidp
I made a few more interesting (to me) measurements. As always, you have to
measure your performance with your actual input data to see what's "best".

Test 1: Boring, small array of integers

    
    
        In [28]: arr = range(0, 300)
    
        In [29]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
        10000 loops, best of 3: 27.2 us per loop
    
        In [30]: %timeit numpy.reshape(arr, (-1, 3))
        10000 loops, best of 3: 45.2 us per loop
    
        In [31]: %timeit zip(*([iter(arr)]*3))
        100000 loops, best of 3: 6.25 us per loop
    

This roughly matches the article's timing ratios, so far so good.

Test 2: Use numpy's random number generation to get a small array of floats

    
    
        In [32]: arr = numpy.random.ranf(300)
    
        In [33]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
        10000 loops, best of 3: 54 us per loop
    
        In [34]: %timeit numpy.reshape(arr, (-1, 3))
        1000000 loops, best of 3: 1.06 us per loop
    
        In [35]: %timeit zip(*([iter(arr)]*3))
        10000 loops, best of 3: 39.7 us per loop
    

numpy is two orders of magnitude faster here; it's evidently using a highly
optimized internal codepath for random sequence generation, which I'd guess is
a common thing to do in numeric analysis. I assume it's using a generator, so
there's no actual array being created, blowing up the CPU cache lines etc.

Test 3: Verify that analysis by interfering with numpy

    
    
        In [36]: arr = [x for x in numpy.random.ranf(300)]
    
        In [37]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
        10000 loops, best of 3: 26.2 us per loop
    
        In [38]: %timeit numpy.reshape(arr, (-1, 3))
        10000 loops, best of 3: 48.5 us per loop
    
        In [39]: %timeit zip(*([iter(arr)]*3))
        100000 loops, best of 3: 6.55 us per loop
    

Yep.

Test 4: Larger data set, no interference

    
    
        In [40]: arr = numpy.random.ranf(3000000)
    
        In [41]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
        1 loops, best of 3: 624 ms per loop
    
        In [42]: %timeit numpy.reshape(arr, (-1, 3))
        1000000 loops, best of 3: 1.06 us per loop
    
        In [43]: %timeit zip(*([iter(arr)]*3))
        1 loops, best of 3: 335 ms per loop
    

The numpy time doesn't change _at all_ from test 2 despite the larger size,
but the others suffer. Again, I suspect numpy is being intelligent here; my
guess is that it doesn't actually apply the function and generate the real
output, it just wraps the random generator in another one.

Test 5: Larger data set, interfering with numpy

    
    
        In [44]: arr = [x for x in numpy.random.ranf(3000000)]
    
        In [45]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
        1 loops, best of 3: 321 ms per loop
    
        In [46]: %timeit numpy.reshape(arr, (-1, 3))
        1 loops, best of 3: 354 ms per loop
    
        In [47]: %timeit zip(*([iter(arr)]*3))
        10 loops, best of 3: 83.6 ms per loop
    

There we go; we're back to roughly the original timing ratios.

So, surprise! You always have to measure. Measure, measure, measure. My bias is
to write code first for legibility and modifiability, and then optimize hot
spots if needed (and add comments, please, when you do so).

Without doing deeper analysis, I'd say one moral of the Python story is that it
shows the potential power of generators. But in real-world data sets this
isn't always ideal -- is it faster to load up the whole data set in memory and
blast through it, or load it from disk on demand with a generator? In _really_
high performance scenarios, is it faster to preprocess the data to fit into
the CPU's cache lines? You can't tell without measuring, and you have to
measure in the environment you're deploying to, since the answer may be
different on a machine with 1GB RAM vs. one with 128GB RAM, or 32KB L1 cache
vs. 8KB.
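
As a rough sketch of the generator angle (made-up example: a file with one
number per line, streamed into (x, y, z) triples without ever building the full
list in memory):

    
    
        from itertools import izip
    
        def stream_triples(path):
            with open(path) as f:
                values = (float(line) for line in f)  # lazy: reads on demand
                for triple in izip(*[values] * 3):    # same idiom, fully lazy
                    yield triple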

~~~
loarake
The numpy example becomes fast when you use numpy arrays. Try %timeit
numpy.array(arr); numpy.reshape(arr, (-1, 3));

and then just %timeit numpy.array(arr); you'll see that the reshape takes no
time at all. Type conversion from a Python list to a numpy array is what kills
the performance.

~~~
pudquick
A point of clarification here - numpy's reshape operation stays fast as long
as the array is a numpy array.

Which is exactly what the parent comment was all about - the author figured
that the reason numpy was significantly faster was because it was accessing /
working with the data in a different fashion.

So, in order to test that theory, he converted the numpy.array into a normal
python array _before_ he proceeded to do any timed operations with zip vs.
numpy.reshape, etc.

This is a more realistic playing field if you're considering data that was
created outside of the numpy environment. At _some_ point, if you're going to
work with numpy.reshape, it will need to be type converted / "imported" into
numpy data types.

For the purposes of this test, it's much more "fair" to include both the time
numpy spent on splitting the array as well as that conversion time. The
reshape process in numpy had essentially O(1) time with native data types,
indicating that it had done some behind-the-scenes work that allowed for such
speed. The parent example is much more realistic in capturing the time of that
behind-the-scenes work by forcing each method to start from the _exact_ same
data objects.

~~~
loarake
My reply was in response to the statement "numpy is two orders of magnitude
faster here; it's evidently using a highly optimized internal codepath for
random sequence generation", which is false: it's not because of highly
optimized internal codepaths for random sequence generation, it's because the
code produced a numpy array (and so didn't have to do type conversion). But I
agree that when using numpy to produce a timing comparison, it would be fair
to start with a numpy array, or to show the time involved in the creation of
the array.

~~~
davidp
Thanks for your comments. I hope it didn't sound like I was negatively
comparing numpy's array/sequence operations to anything. I know very little
about numpy, and I assume that "real" numpy solutions don't look anything like
what's being discussed here. I only included those measurements since the
article's author did.

To clarify my points a bit, the optimizations I alluded to (in "highly
optimized internal codepath") were meant to include things like using a
generator, i.e. at no point is there an actual array of input random numbers.
The fact that in numpy the 300-element "array" and the 3,000,000-element
"array" had identical timings suggests exactly that; I disagree that it's an
issue of internal representation, unless the concept of a numpy array subsumes
the concept of a generator, in which case I think we're all saying the same
thing.

That kind of optimization is only possible in this case because by the
definition of randomness nobody could know what the values were until they
were enumerated, so it's 100% transparent to use a generator. That's not how
real-world data works, hence my forced-native-array measurement and pudquick's
reply.

------
hknozcan
We all love short and fast. But this is definitely an interesting approach.
I'd love to see similar approaches to problems if you guys can point me to
some.

~~~
StavrosK
I would too; I love this sort of thing. How about:

    
    
        >>> some_boolean = False
        >>> ["Thing 1", "Thing 2"][some_boolean]
        "Thing 1"

~~~
icebraining
Yeah, but unlike the hack in the article, the more readable version of that
code,

    
    
      "Thing 1" if some_boolean else "Thing 2"
    

is also almost twice as fast (775 vs 1340 ns, on my machine).

~~~
StavrosK
I was more posting it for the "cute hack" value, rather than speed. Speed was
significant in the post because the author needed their library to be fast.

------
lmm
Insane, because it relies on the zip implementation detail. If you cared about
a measly factor of 4 in performance you wouldn't be using python anyway.

~~~
techdragon
Any developer can care about performance, and should never be mocked for
achieving above 200% of the performance of an alternative implementation...
let alone 400%.

~~~
lmm
If that 400% performance improvement was achieved by sacrificing code
readability, on a project that has deliberately chosen to prioritize
readability over performance, then yes they should be mocked. Or at least told
not to do it.

------
frugalmail
My vote: Code is unreadable crap written to be discarded later. Spend some
time typing it out. You're writing for the x # of people (including the
author) that will have to read it countless times in the future. It would be
better if they didn't have to think about the "cleverness" when they come
across it. Unless, of course, you're participating in an obfuscated code
contest.

------
sivanmz
The pairwise recipe function in itertools demonstrates just that (using tee
instead of multiplying the iterator):

[http://docs.python.org/2/library/itertools.html](http://docs.python.org/2/library/itertools.html)

    
    
      from itertools import izip, tee
    
      def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)
        return izip(a, b)
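
For example (output shown as a list; illustrative only):

    
    
      >>> list(pairwise([1, 2, 3, 4]))
      [(1, 2), (2, 3), (3, 4)]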

------
Luyt
I wouldn't call it insane, nor brilliant. I use this function to split a
sequence into pairs (or triplets, or quadruplets, etc.):

    
    
      import itertools
    
      def paired(t, size=2, default=None):
        it = iter(t)
        return itertools.izip_longest(*[it]*size, fillvalue=default)
    

I use it in a formatter which outputs alphabetized data in columns, where the
order should run down the columns instead of across the rows.
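
For example (illustrative):

    
    
      >>> list(paired('abcdefg', size=3, default='-'))
      [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', '-', '-')]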

------
rzimmerman
If it's actually faster, the speed actually matters, and you wrap it with a
well commented explanation and descriptive name, I'd consider it reasonable.
Otherwise, just write out what you're doing. It's definitely clever, but fewer
lines of code is not an optimization.

------
rouzh
My question is, at what point does it become incumbent on the language to
provide a more meaningful idiom for this kind of expression? I think the point
at which constructs like this leak into your official documentation is a
pretty good line to start thinking about it. :)

------
runn1ng
>"there should be one — and preferably only one — obvious way to do it"

Oh. Well. OK.

------
BasDirks
An aside: the animated expansion of the code blocks makes the site feel twitchy.

~~~
StavrosK
How do you mean?

------
rational_indian
Brilliant and insane?

~~~
StavrosK
Hah, yep, I hadn't considered it can be both.

