
How (Not?) to Use Python’s List Comprehensions - luu
http://moderndescartes.com/essays/python_list_comprehensions
======
eutectic
I think this looks a lot nicer with Haskell-style do notation:

    
    
        import Control.Monad (guard)
        import Data.Ratio ((%))
    
        results :: [[Int]]
        results = 
          do th <- u 
             o  <- u
             r  <- u
             guard $ 2*th^2 + o^2 + r^2 == 1000
             s  <- u
             z  <- u
             guard $ th^2 * z^2 * (s - z) + s^2 == 7225
             g  <- u
             d  <- u
             xi <- u
             guard $ (g - 1)^2 + d^2 + xi^2 - xi == 600
             (et, i) <- [(xi - 7, xi - 11), (xi - 11, xi - 7), 
                         (xi + 7, xi + 11), (xi + 11, xi + 7)]
             guard $ 0 < et && et <= 26 && 0 < i && i <= 26
             (a, mu) <- [(4, 1), (2, 2), (1, 4)]
             pi <- u
             guard $ a * (a + pi) == 4 * g
             b  <- u
             n  <- u
             guard $ b^3 + z^3 + n^3 + o^9 == 1997
             k  <- [3, 4]
             e  <- u
             guard $ (a % k)^2 + (b % n)^2 + (d % xi)^2 + 
                     (e % pi)^2 + (et % mu)^2 + (i % s)^2 == 6
             return [a, b, g, d, e, z, et, th, i, k, th, mu, n, xi, o, pi, r, s]
    
          where u = [1..26] 
    
        main :: IO ()
        main = print results

~~~
pmontra
Thanks for that code. I don't know anything about Haskell so I attempt a UX
test:

results is an array of Int, but do the double [[ ]] brackets mean something
special? I hope they do, otherwise why planting doubts in people's mind.

<\- is assignment or pattern matching, probably the latter.

the $ after guard is... what? Maybe a noop like Python's :

Maybe

    
    
       guard $ expression
       something
    

is equivalent to

    
    
       if expression
         something
    

If this is the case, it's much less readable because people are left guessing.
The lack of indentation doesn't help either.

u is probably the variable that's iterated over in where u = [1..26]

Not very readable to put that after every use of u. This is easier to read

    
    
        where u = [1..26] do
    

which is many languages is

    
    
        for u in [1..26] do
    

I can imagine that where is more general and is used in other contexts.

Don't take these as criticisms of Haskell, only as a UX test. Python's list
comprehensions have never been very readable to me. This Haskell loop is
somewhat better.

I'm sure there are good reason for designing Haskell's syntax like that.
Programming languages belong to lineages and tend to perpetrate the syntax of
their parents. It makes them easier to adopt to developers well versed in
their parent languages but that could also make them difficult to grasp for
developers of other language families. Maybe luring imperative developers into
the functional world is not one of the goals of Haskell and Haskell has
achieved its goals as it is.

Still I believe that the first functional language that looks like an
imperative language will move many developers from OOP to functional. Elixir
attempted that, consciously or not, but it's not enough. The barrier there are
the OTP patterns, which are not easy to use properly. There are many different
choices for writing a process (what in OOP would be an object with its own
CPU) and its hard to chose the proper one. GenServer, Agent, Task? Nobody is
asking these kind of questions in Ruby or Python
[https://groups.google.com/forum/#!topic/elixir-lang-
talk/DCT...](https://groups.google.com/forum/#!topic/elixir-lang-
talk/DCTXyGV791w)

There is also the code to add them to the supervision tree (the cool systemd-
inside-the-language thing of Erlang and Elixir). I can't help thinking that
Elixir has to be streamlined.

~~~
gh02t
> <\- is assignment or pattern matching, probably the latter.

Neither really, though I guess it's kind of like assignment, but it's usually
read "draw from". Do-notation is sugar for working with monadic values, which
are kind of like containers that hold a value. It's a bit hard to explain, but
basically it means to pull the enclosed value out of the container on the
right hand side and reference it according to the name on the left hand side.

> the $ after guard is... what?

$ is function application. It basically means to evaluate the right hand side
before passing the result to the left hand side. It's used to avoid
parentheses, `f $ g h` is like `f (g h)` (Haskell doesn't use parenthesis for
function evaluation).

Haskell syntax is lovely when you're used to it, but it's also subtle. IMO it
can be easy for people who aren't familiar to get the general idea of what's
going on but the specifics are often not what you think.

------
sametmax
There are very few reasons to nest comprehensions. When the need to go lazy
arise and you end up with a long code, use a generator. "yield" is much more
readable.

~~~
brilee
Author here. If I may, could I ask you to write the equivalent code as you
have suggested? I can see one way to implement with generators but it's not
particularly concise. The benefit of the list comprehension approach is that
its mechanics are laid bare and there's almost nothing to debug.

The normal benefit of generators is that they don't store the intermediate
result in memory, but that isn't particularly necessary in this case

~~~
sametmax
I actually tried, and my generator version ended up less readable than yours.
I was not expecting this, but for such an algo the way you write it gives this
impression the data flow from one end to the other, which reads better than
splitting it in several functions including nested blocks.

The only drawbacks I can see:

\- if you want to debug that with PDB, you are out of luck.

\- anybody not wanting to spend the time understanding what it does will freak
out.

Unrelated, but you can do:

def get_ranges(i): return product(*(range(1, 27) for _ in range(i)))

To replaced stuff like:

    
    
        for th in range(1, 27)
        for o in range(1, 27)
        for r in range(1, 27)
    

With:

    
    
        for th, o, r in get_ranges(3)
    

But again, no sure it makes things better. Shorter yes, but the benefit of
your version is that each step is very clear.

I still have no idea what's this code is for though :)

------
ThePhysicist
Python also supports dict and set comprehensions btw:

    
    
        { a : 2*a for a in range(1, 10)} #produces {1:2, 2:4, ...}
    
        { 2*a for a in range(1, 10)} #produces set([2, 4, ...])

~~~
xamuel
Wish more people knew about dict comprehensions, it's maddening the
contortions people go to because they don't know about it.

------
tehwalrus
Lots of people saying "nobody does this is real life". I have.

My PhD code (which was too slow and RAM-hungry, having been written in
Python...) was a monster, and was a lot less of a monster for order-2 and
order-3 nested list/dict comps.

It started out as a 200 line proof of concept (which worked perfectly on an
ideal of the system I was studying), and it evolved into a 6,000 line behemoth
(~2,000 lines for OpenGL rendering error cases) which would have been 10,000
lines without nested list and dict comps.

By the way, never use networkx on anything you expect to have a large number
of vertices or edges. It uses a lot of RAM (esp in undirected graphs where you
have backrefs for each edge.) Wrap a C++ graph lib like boost::graphs or
something, and code against that interface from the beginning.

~~~
gspetr
I have not used networkx on anything serious, but this article lead me to
believe that it is to be avoided, since the library missed obvious Python
optimizations:

[https://mathieularose.com/choosing-the-right-data-
structures...](https://mathieularose.com/choosing-the-right-data-structures/)

------
ianamartin
Well, at least the title was correct.

This is definitely not a good way to use list comprehensions.

I appreciate the OP's effort to break the language. Well done.

But no one does this in practical reality.

------
a3n
From the article:

    
    
      result = [x 
      for sublist in nested_list
          for x in sublist
              if x % 3 == 0
      ]
    

I rarely nest comprehensions, but I will occasionally have more than one
condition in the filter at the end, or some condition in the selection at the
front.

Except for the simplest comprehensions, I tend to write mine similar to above,
with indents to highlight what part of the comprehension is what.

The utility of a comprehension is not so much concise notation. Instead it's
that a comprehension is an expression (that has an associated value), rather
than the for loop which is a statement.

Similar to the trinary operator: its advantage is not merely that it replaces
an if statement visually, it's that it's a directly usable expression (that
can be composed in a (preferably short) chain of trinaries).

This highlights the four things that fizzbuzz requires:

    
    
      print '\n'.join([
          'fizzbuzz' if x%15 == 0 else
          'fizz' if x%3 == 0 else
          'buzz' if x%5 == 0 else
          str(x) for x in xrange(1, 101)])
    

Gratuitous demo filter conditions:

    
    
      print '\n'.join([
          'fizzbuzz' if x%15 == 0 else
          'fizz' if x%3 == 0 else
          'buzz' if x%5 == 0 else
          str(x) for x in xrange(1, 101)
            if x%2 == 0 or x%2 != 0])
    

If nested comprehensions make things harder to understand, I'll sometimes move
an inner comprehension out:

    
    
      subrange = {x for x in range(1, 101)}
      print '\n'.join([
          'fizzbuzz' if x%15 == 0 else
          'fizz' if x%3 == 0 else
          'buzz' if x%5 == 0 else
          str(x) for x in subrange
              if x%2 == 0 or x%2 != 0])
    

EDIT: OCD forces me to rewrite:

    
    
      subrange = {x for x in range(1, 101)}
      print '\n'.join([
          'fizzbuzz' if x%15 == 0 else
          'fizz' if x%3 == 0 else
          'buzz' if x%5 == 0 else
          str(x)
              for x in subrange
                  if x%2 == 0 or x%2 != 0])
    

FURTHER EDIT: argh!

    
    
      subrange = (x for x in range(1, 101))
    

Original was a set comprehension. That it worked "in order" was a happy
accident. Further edit is a generator.

~~~
gspetr
FYI: if you use list comprehension within a function or method that
immediately consumes the resulting list, you can usually omit it altogether:

    
    
        subrange = {x for x in range(1, 101)}
    
        print '\n'.join(
    
                        'fizzbuzz' if x % 15 == 0 else
    
                        'fizz' if x % 3 == 0 else
    
                        'buzz' if x % 5 == 0 else
    
                        str(x)
    
                        for x in subrange
    
                        if x % 2 == 0 or x % 2 != 0)
    

Notice how the '[' and ']' are missing in the code above, yet it is still both
correct python and correct fizzbuzz solution and runs faster because the code
does less work.

~~~
a3n
Huh. Never seen that.

But without the [] it appears to produce a generator expression (which is
immediately consumed) rather than a list (which is immediately consumed).

    
    
      In [9]: print ' '.join(str(x) for x in range(3))
      0 1 2
    
      In [10]: type(str(x) for x in range(3))
      Out[10]: generator
    
      In [11]: type([str(x) for x in range(3)])
      Out[11]: list
    

Same effect between the two, but different and interesting things. Thanks.

EDIT: Also:

    
    
      In [3]: %timeit ' '.join(str(x) for x in range(3))
      100000 loops, best of 3: 3.11 µs per loop
    
      In [4]: %timeit ' '.join([str(x) for x in range(3)])
      1000000 loops, best of 3: 1.47 µs per loop
    

The list version, with the [], is about twice as fast as the generator version
(without the brackets). I suspect because the generator version has to "go
back to the well" for each value, but I don't really know. And this is one
small test case, maybe it's different for larger sequences. Testing is
everything. :)

~~~
gspetr
I'll refer you to the original source where I picked up this trick:
[http://chimera.labs.oreilly.com/books/1230000000393/ch01.htm...](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#generatorargs)

"Transforming and Reducing Data at the Same Time"

I highly recommend David Beazley's excellent book, it made me a much better
Python developer, helping me produce both idiomatic and efficient solutions
over the last year. That book is programming poetry.

~~~
a3n
Thanks for that. Looks good so far.

------
Pxtl
Considering Python's goal of legibility, that syntax is hideous.

Python goes to extensive lengths to work around its mediocre lambdas. I'll
take a language with good lambda syntax and higher-order-functions like map
and filter (or select and where in C#).

~~~
RodgerTheGreat
Guido van Rossum has vocally[1] expressed a distaste for functional
programming idioms. His loss, I think.

List comprehensions have always struck me as a very ugly mechanism which is
convenient in simple instances and which composes together poorly. This kind
of special-casing is all over the place in Python's design. Consider, for
example, all the unintuitive syntactic and semantic contortions introduced to
support interval comparisons[2]. The overall result is a language which is
_apparently_ simple but in practice surprisingly complex.

[1]
[http://www.artima.com/weblogs/viewpost.jsp?thread=98196](http://www.artima.com/weblogs/viewpost.jsp?thread=98196)

[2] [https://docs.python.org/2/reference/expressions.html#not-
in](https://docs.python.org/2/reference/expressions.html#not-in)

~~~
pekk
If your ideal is Lisp, it's not surprising if you find every other language
lacking.

~~~
Pxtl
Lisp represents the ultimate sacrifice of legibility, type-safety, and
familiarity in exchange for expressiveness. But it's reasonable to say "I like
other languages than Lisp for other reasons, but I wish they had Lisp feature
X in a form that fits the philosophy of the language".

I mean, Python could offer an alternate sexpr syntax that makes it a lisp, and
that would provide nice higher-order-function support for map and filter and
foldr, but that wouldn't be a "pythonic" solution to the problem.

I want a "pythonic" answer for map and filter and reduce. List Comprehensions
are _supposed_ to provide that, but imho they're a failed attempt. They're
ugly and restrictive.

~~~
zardeh
That's because there's an argument to be made here that the blog poster here
is misusing the comprehensions syntax.

While what he does works, it can just as easily be written as

    
    
        list(itertools.chain(*[[x for x in sublist if x%3 == 0] for sublist in nested_list]))
    

Which uses a bit of magic to massage the end result, but it is much clearer
for the inner portions, and much similar to your 'lispy' style.
(`itertools.chain( _` is equivalent to a 1-level flatten)

You can do the same thing in the bigger case, and you end up nesting your
structures such that you get something like

    
    
        results = [[[[[[[[[[[[[[[(a, b, g, d, e, z, et, th, i, k, th, mu, n, xi, o, pi, r, s) 
        for th in range(1, 27) if (2*(th**2) + o**2 + r**2 == 1000) and ((th**2 * z**2 * (s - z) + s**2) == 7225)]
            for o in range(1, 27) if (b**3 + z**3 + n**3 + o**9 == 1997)]
                for r in range(1, 27)]
                    for z in range(1, 27)]
                        for n in range(1, 27) if F(a, k)**2 + F(b, n)**2 + F(d, xi)**2 + F(e, pi)**2 + F(et, mu)**2 + F(i, s)**2 == 6]
                           for b in range(1, 27)]
                               for e in range(1, 27)]
                                   for s in range(1, 27)]
                                        for (a, mu) in ((4, 1), (2, 2), (1, 4)) if a*(a+pi) == 4 * g]
                                            for pi in range(1, 27)]
                                                for g in range(1, 27) if (g-1)**2 + d**2 + xi**2 - xi == 600]
                                                    for d in range(1, 27)]
                                                        for (et, i) in ((xi-7, xi-11), (xi-11,xi-7), (xi+7, xi+11), (xi+11, xi+7)) if 0 < et <= 26 and 0 < i <= 26]
                                                            for xi in range(1, 27)]
                                                               for k in (3,4)]
    

Which admittedly isn't _better*, but you can build up over time (ie. you can
start with [f(k) for k in range(3, 4)], and then provide f(k) in a closure,
and repeat, to build this up over time.

------
lispm
Nested list comprehensions in Common Lisp:

[https://gist.github.com/lispm/7c009676fe6b86cff851bc8a1ba82d...](https://gist.github.com/lispm/7c009676fe6b86cff851bc8a1ba82d28)

    
    
        CL-USER 20 > [x (y <- '((0 1) (2 3) (4 5) (6 7) (8 9)))
                        (x <- y)
                        (zerop (mod x 3))]
        (0 3 6 9)

------
groks
You might enjoy Peter Norvig teaching you this technique which he uses to
solve the Zebra Puzzle in lesson 2 of CS212:

[https://classroom.udacity.com/courses/cs212/lessons/48632874...](https://classroom.udacity.com/courses/cs212/lessons/48632874/concepts/482769470923#)

------
gspetr
From the article:

    
    
        result = [x for sublist in nested_list for x in sublist if x % 3 == 0]
    

This explanation still confuses people initially. I suggest this one:

    
    
        result = [a for b in c for a in b if a % 3 == 0]
    

Here it's more obvious that the outer loop arguments are to be reversed in the
latter part of the nested comprehension.

However I would advise against any further nesting than this, you will not
want to debug that if anything goes wrong with one of the inputs here,
especially if you also do any conditional logic here in the comprehension.

------
IanCal
I'm a little confused by

    
    
        for A in range(1, 27):
            for B in range(1, 27):
                if A == 2 * B:
    

Why iterate over anything here?

    
    
        for A in range(1, 27):
            if A % 2 == 0:
                B = A / 2
    

Or

    
    
        for A in range(2, 27, 2):
            B = A / 2
    

Is there a full version of the naive nested loops? I'd like to see just how
readable the final version is with applying more rules like this.

------
martinflack
If I were to start it in Common Lisp, I'd probably do something like this.
Next steps might be to make normal variable declarations automatic by scanning
guard expressions (which would prevent unwittingly mentioning a variable too
early); and depending on the intended usage, maybe permit infix math via the
infix-math package.

    
    
        (defpackage solver (:use :cl :iterate) (:shadow pi))
        (in-package :solver)
    
        (defmacro defsolver (name default constraints result)
          (flet ((normal-var (var)      `(iter (for ,var ,@(rest default))))
                 (custom-var (var spec) `(iter (for ,var in ,spec)))
                 (guard (expr)          `(when ,expr))
                 (solved ()             `(return-from ,name ,result)))
            (macrolet ((yield (form) `(append ,form (list (constraint (rest cs))))))
              (labels ((constraint (cs &aux (c (first cs)))
                         (cond ((null cs)           nil)
        		           ((eq c :solved)      (solved))
        		           ((symbolp c)         (yield (normal-var c)))
        		           ((eq (first c) 'for) (yield (custom-var (second c) (third c))))
        		           (t                   (yield (guard c))))))
                `(defun ,name () ,(constraint `(,@constraints :solved)))))))
    
        (defsolver result
          (:default from 1 to 26)
          (th
           o
           r
           (= (+ (* 2 (expt th 2)) (expt o 2) (expt r 2)) 1000)
           s
           z
           (= (+ (* (expt th 2) (expt z 2) (- s z)) (expt s 2)) 7225)
           g
           d
           xi
           (= (+ (expt (1- g) 2) (expt d 2) (expt xi 2) (- xi)) 600)
           (for (et i) `((,(- xi 7) ,(- xi 11)) (,(- xi 11) ,(- xi 7))
                         (,(+ xi 7) ,(+ xi 11)) (,(+ xi 11) ,(+ xi 7))))
           (and (< 0 et) (<= et 26) (< 0 i) (<= i 26))
           (for (a mu) '((4 1) (2 2) (1 4)))
           pi
           (= (* 4 g) (* a (+ a pi)))
           b
           n
           (= (+ (expt b 3) (expt z 3) (expt n 3) (expt o 9)) 1997)
           (for k '(3 4))
           e
           (= (+ (expt (/ a k) 2) (expt (/ b n) 2) (expt (/ d xi) 2)
        	 (expt (/ e pi) 2) (expt (/ et mu) 2) (expt (/ i s) 2)) 6))
          (list a b g d e z et th i k th mu n xi o pi r s))
    
        SOLVER> (time (result))
        Evaluation took:
          0.010 seconds of real time
          0.012000 seconds of total run time (0.012000 user, 0.000000 system)
          120.00% CPU
          23,160,723 processor cycles
          851,968 bytes consed
        
        (4 9 19 12 15 3 1 20 5 4 20 1 9 12 2 15 14 5)
    

Edit: formatting, simplification

The b, n, k, e definitions are reordered better as in @eutectic's haskell
example.

~~~
kazinator
I don't see why yield has to be a macrolet rather than a labels.

~~~
martinflack
You're right, it could, if you prefer. In the same labels form with an extra
parameter cs, which the yield's all pass in; or in a labels/flet form inside
constraint.

    
    
        (defmacro defsolver (name default constraints result)
          (flet ((normal-var (var)      `(iter (for ,var ,@(rest default))))
                 (custom-var (var spec) `(iter (for ,var in ,spec)))
                 (guard (expr)          `(when ,expr))
                 (solved ()             `(return-from ,name ,result)))
            (labels ((constraint (cs &aux (c (first cs)))
                       (flet ((yield (form) (append form (list (constraint (rest cs))))))
                         (cond ((null cs)           nil)
                               ((eq c :solved)      (solved))
                               ((symbolp c)         (yield (normal-var c)))
                               ((eq (first c) 'for) (yield (custom-var (second c) (third c))))
                               (t                   (yield (guard c)))))))
              `(defun ,name () ,(constraint `(,@constraints :solved))))))

~~~
kazinator
All your functions can be in a single labels block now. It's okay for a non-
recursive function to be a labels. flet is really for those situations in
which labels cannot be used, like defining an inner foo which shadows _and
calls_ (i.e wraps) an outer foo.

------
ant6n
Instead of

    
    
        for a in range(1,27) for b in range(1,27) if a == 2*b ...
    

you could try

    
    
        for b in range(1,27) for a in [2*b] if a < 27 ...

~~~
sirclueless
This is true for the toy example. In the actual problem, the loop guards are
quadratic equations or even higher order, and his first method generalizes
better (and for small ranges is hardly slower).

------
viebel
What about making an interactive version of theaeticle with klipse?

------
gvd
More like "why not to use Python"

------
wayn3
better title: Why software engineers clearly need no math skills whatsoever -
solving a 12x18 matrix in as little as 15 loops

ever heard of gauss algorithm?

~~~
junke
Those are non-linear equations.

~~~
quantumtremor
These are all polynomial equations, so we can likely solve them by computing a
Grobner basis over the set and using the Elimination Theorem.

~~~
brianwawok
I don't know what any of those are but I know how to write 15 nested for loops
and let it run for a few minutes. Hacking code seems fine if that's what
someone knows...

~~~
tomrod
That's the failure of maths education. Plenty of names for formal things and
ideas, yet mostly inaccessible to the lay person.

~~~
brianwawok
So I know 6 programming languages, maybe 600 libraries, devops, dozens of
tools.

So I could go back to math class and learn some more formal proofs. What part
of my programming knowledge am I going to drop to learn that? Will it actually
help my career?

I have found little to no use for advanced mathematics in 10 years of
programming. I am not saying it had no use, but I have not needed it.

Most academic types majorly overestimate the usefulness of math in
programming. I can code circled around people who rock out at math.

~~~
tomrod
I don't mean to demean your programming experience or toolkit decisions in
anyway. Egg on my face--it appears that's how it was received! Let me try
again.

> I have found little to no use for advanced mathematics in 10 years of
> programming.

I reckon you use or stumble upon the algorithms often, perhaps all the time.
Be it with Boolean logic, or network design, or optimal stopping, or fuzzy
matching, or statistics in logging, or what have you. These principles show up
often. My point wasn't you can't or shouldn't be able to code circles around
people who rock out at math, but that maths education often fails to emphasize
the logical methods of figuring things out from first principles over the
already solved proofs, methods, or equations.

> Most academic types majorly overestimate the usefulness of math in
> programming.

To some degree, I agree here. If there are well developed libraries and mature
paradigms needed (RDBMS, web dev, etc.) then there is little need for
mathematical forte. Nothing wrong with that!

------
Animats
Must..write..in..functional..style..to..be..l33t....

From the article: _" Don't do this at work, adults!"_

