Python’s lambda is broken

enum · on April 9, 2009

Lambda is fine. The loops are mutating the variable i, so all the lambda share the same i. Apparently, list comprehensions also use mutation--I suppose that's fine.

scott_s · on April 9, 2009

Indeed, Python is using lexical scoping. This behavior is well known.

jrockway · on April 9, 2009

While true, that is not really the issue. The issue is which lexical scope the loop iterator is defined in. In python, it is apparently like this:

    i = 0
    increment i
    <do loop body with i>

It could have been:

    i = 0
    increment 1
      create new environment
      bind i here
      <do loop body>

In the second case, each loop body would have its own "i". In the first case, they all share the same "i".

shiro · on April 9, 2009

Same in Common Lisp.

    cl-user(1): (defvar fns (loop for i from 0 below 10 collect (lambda (n) (+ i n))))
    fns
    cl-user(2): (funcall (elt fns 3) 4)
    14

Personally I don't like this behavior (I've been bitten before in more subtle cases), but you can't blame lambda. It is consistent if one understands the same i is shared by all lambdas and loop mutates the i.

dangoldin · on April 9, 2009

Well it makes sense given the language - I just found it interesting that Haskell acts differently.

jrockway · on April 9, 2009

Haskell doesn't act differently. If you translate the Haskell to Lisp:

    (funcall (elt (mapcar (lambda (x) (lambda (y) (+ x y))) '(1 2 3)) 0) 2)
     ==> 3

Then the Lisp works the same as the Haskell.

The difference is that "loop" mutates the loop variable, whereas mapping makes a copy. In Haskell, there is no way to mutate a variable (that's not even what variables are), so that's the only way you can express the concept. In Lisp, you can either copy or mutate, and you get a different result depending on which you choose.

Closures always work this way. Otherwise, you couldn't write things like:

    (defun make-incrementers ()
        (let ((i 0))
            (cons (lambda () (+ i 1))
                  (lambda () (- i 1)))))

trjordan · on April 9, 2009

Similarly, if you use map instead, you get the right answer:

    >>> map(lambda x: lambda i: x + i, range(0,10))[3](4)
    7

gamache · on April 9, 2009

Perl 5.10.0 gives three different sets of answers.

Map works as the original author expected:

  @fs = map {sub {$_ + shift}} (0..9);
  print $fs[$_](3), " " for (0..9);
  >>> 3 4 5 6 7 8 9 10 11 12

A Perlish for loop, or rather a statement with a "for" modifier:

  push @fs, sub {$_ + shift} for (0..9);
  print $fs[$_](3), " " for (0..9);
  >>> 3 4 5 6 7 8 9 10 11 12

The same loop, unpacked into a more universal syntax:

  for (0..9) {push @fs, sub {$_ + shift}}
  print $fs[$_](3), " " for (0..9);
  >>> 3 4 5 6 7 8 9 10 11 12

Here's where things start getting awesome. Let's name our loop variable $i, rather than using the Perlish default $_.

  for $i (0..9) {push @fs, sub {$i + shift}}
  print $fs[$_](3), " " for (0..9);
  >>> 3 3 3 3 3 3 3 3 3 3

And now let's write a C-style for loop.

  for ($i=0; $i<10; $i++) {push @fs, sub {$i + shift}}
  print $fs[$_](3), " " for (0..9);
  >>> 13 13 13 13 13 13 13 13 13 13

I am not sure why the last two versions work differently; I'd halfway expect the bottom version, but the "for $i (0..9)" loop leaves me scratching my head.

jrockway · on April 9, 2009

I am not sure why the last two versions work differently; I'd halfway expect the bottom version, but the "for $i (0..9)" loop leaves me scratching my head.

$_ is dynamically scoped, so you are not capturing anything in your lambda. The reason they appear to work is because you are binding $_ dynamically to the right thing when you are printing the list.

Example:

    my @fs = map {sub {$_ + shift}} 0..9;
    $_ = 42; 
    say $fs[0]->(1); # 43

Oops.

Perl 5.10 lets you lexical-ize $_, but you didn't do this. Try again and let us know how it goes ;)

gamache · on April 9, 2009

I understand the operational difference between using $_ and $i; it's the last two examples which confuse me, because one set of closures ends up using the very last value for $i (as I would expect for an $i scoped outside the loop), and the other uses the very first (as I would not expect at all).

jrockway · on April 9, 2009

None of your examples create closures. You can only close over lexicals, but all the variables you create are global. (Your loops involving $i also affect the global $i. If you want it lexically scoped, you have to say so.)

Your first two examples work because you are adding $_ and the first arg to the function. The for loop you use to print $fs[$_] binds $_, and your anonymous function uses that value. The result is something that appears to work. If you change the loop variable to something else, though, $_ will be unbound, and you will be adding to an undefined value. ("use warnings" will whine about this.)

If you then create functions that add $i and the argument, your loop stops working, because the loop binds $_ instead of $i.

Basically, you need some "my" keywords in there before your example becomes meaningful in any way. "use strict" and "use warnings" are your friend. ("use strict" will whine about the 'for $i (...)' construct that should be 'for my $i (...)', and "use warnings" will whine about $i being undefined when you try to evaluate one of the created functions.)

One last thing, if you were experimenting in Devel::REPL (re.pl), you would have noticed this problem much sooner. strict and warnings are on by default, which would have produced errors and warnings for all of your examples. It also dumps the result of evaluation with Data::Dump::Streamer, so you can actually see what's being closed over. Examples:

    > my @fs = map { sub { $_ + $_[0] } } 1..2
    $ARRAY1 = [
                sub {
                  package Devel::REPL::Plugin::Packages::DefaultScratchpad;
                  use warnings;
                  use strict 'refs';
                  $_ + $_[0];
                },
                'V: $ARRAY1->[0]'
              ];
     $ARRAY1->[1] = $ARRAY1->[0];

DDS shows you the code, and it is even smart enough to know that the two functions you produced are identical.

If we do it "right":

    > my @fs = map { my $i = $_; sub { $i + $_[0] } } 1..2
    my ($i,$i_eclipse_1);
    $i = 1;
    $i_eclipse_1 = 2;
    $ARRAY1 = [
                sub {
                  package Devel::REPL::Plugin::Packages::DefaultScratchpad;
                  use warnings;
                  use strict 'refs';
                  $i + $_[0];
                },
                sub {
                  package Devel::REPL::Plugin::Packages::DefaultScratchpad;
                  use warnings;
                  use strict 'refs';
                  $i_eclipse_1 + $_[0];
                }
              ];

You can see that the two functions are different, and that they don't share the same $i.

berntb · on April 10, 2009

That was as good as perlmonks.org! You should almost have put a warning at the top, that a reader might want to think about the problem, before reading your answer. :-)

I'm convinced and will try Devel::REPL, instead of "traditional" one-liners.

draegtun · on April 10, 2009

And not surprisingly this topic has already been discussed on Perlmonks ;-)

Here's one variation..... http://www.perlmonks.org/?node_id=719475

DavidSJ · on April 9, 2009

I think you mean:

    (defun make-incrementers ()
        (let ((i 0))
            (cons (lambda () (incf i))
                  (lambda () (decf i)))))

jrockway · on April 10, 2009

Yes, I did.

dangoldin · on April 9, 2009

Ah nice. Thanks for the explanation.

jrockway · on April 9, 2009

The deeper issue here is "mutation is hard to reason about". Well, yeah.

miloshh · on April 9, 2009

Well, I would disagree - the problem is that there is no apparent mutation in the given example. There is hidden mutation, in that the list comprehension mutates its variable. This is probably done for efficiency, but I would claim it's a bad design decision.

dreish · on April 9, 2009

Since that's Clojure's motto, let's see how it does:

  user=> (def fs (vec (for [i (range 10)] #(+ i %))))
  #'user/fs
  user=> ((fs 3) 4)
  7

Looks good.

yummyfajitas · on April 9, 2009

Yes, Python closes over references to a mutable variable, and list comprehensions work the same way. It doesn't close over the value.

I'd make the suggestion that closures are not very "Pythonic". Use objects instead:

    class add_n_func:
        def __init__(self, n):
            self.n = n
        def __call__(self, x):
            return x+self.n

    return [add_n_func(n) for n in range(10)]

pavelludiq · on April 9, 2009

How is a 5 line class with 2 underscore methods more pythonic than a nested lambda?

yummyfajitas · on April 9, 2009

While python supports some basic functional programming, it's primarily a procedural/object oriented language. And it shows.

As the article demonstrates, using nested lambda's doesn't work the way you expect for closures. Using an object does.

edit: update. I found the presentation where guido talks about how he dislikes all the functional stuff, including lambda.

http://74.125.93.104/search?q=cache:2QztD1KncJgJ:www.python....

tptacek · on April 9, 2009

Does it make you sad that this is the same answer you'd have given if the language in question was Java?

Just rubbing it in.

ii · on April 9, 2009

I would suggest that the most "Pythonic" way would be to use functools.partial and operator.add from the standard library.

As I already wrote it would look "fs = [partial(add, i) for i in range(10)]" and it works as expected.

ii · on April 9, 2009

Workarounds:

  fs = [(lambda n, i=i: i+n) for i in range(10)]

or

  from operator import add
  from functools import partial
  fs = [partial(add, i) for i in range(10)]

pavelludiq · on April 9, 2009

heres what i did:

  >>> [i(4) for i in [(lambda n: i + n) for i in range(10)]]

and i got this:

  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "<stdin>", line 1, in <lambda>
  TypeError: unsupported operand type(s) for +: 'function' and 'int'

Apparently the outer and inter "i"s got confused. when i change the outer 'i' to 'f':

  [f(4) for f in [(lambda n: i + n) for i in range(10)]]

i get this:

  [13, 13, 13, 13, 13, 13, 13, 13, 13, 13]

With generator expressions it worked as expected:

  >>> [i(4) for i in ((lambda n: i + n) for i in range(10))]
  [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

Meanwhile:

  [i(4) for i in map(lambda i: lambda n:i+n, range(10))]

Also works as expected.

This obviously has nothing to do with the lambda in python and everything to do with the way loops work.

Python has a few quirks of this kind, this one i didn't know about. I didn't know that list comprehensions and generator expressions differ in any other way than the thing they return. I should have known better, they are very different things.

edit: looks like this also works:

  [i(4) for i in [(lambda i: lambda n:i+n)(i) for i in range(10)]]

Its all about scope I guess.

abdulhaq · on April 9, 2009

This issue (list comprehensions vs generator expressions) becomes clearer after reading PEP 289 (http://www.python.org/dev/peps/pep-0289/) which includes:

List comprehensions also "leak" their loop variable into the surrounding scope. This will also change in Python 3.0, so that the semantic definition of a list comprehension in Python 3.0 will be equivalent to list(<generator expression>). Python 2.4 and beyond should issue a deprecation warning if a list comprehension's loop variable has the same name as a variable used in the immediately surrounding scope.

and

After exploring many possibilities, a consensus emerged that binding issues were hard to understand and that users should be strongly encouraged to use generator expressions inside functions that consume their arguments immediately. For more complex applications, full generator definitions are always superior in terms of being obvious about scope, lifetime, and binding.

pavelludiq · on April 9, 2009

Thanks for the clarification. I didn't know that until i started messing with them just now for my post.

Heres a a clearer example(you could deduce this from my previous post): >>> x=((lambda n: i + n) for i in range(10)) >>> y=((lambda i: lambda n: i + n)(i) for i in range(10)) >>> [f for f in x]==[i for i in y] True

this is with generators.

with list comprehensions it becomes: >>> x=[(lambda n: i + n) for i in range(10)] >>> y=[(lambda i: lambda n: i + n)(i) for i in range(10)] >>> [f for f in x]==[i for i in y] False

alecco · on April 10, 2009

This is rubbish. From comments right there on that site, Python and many other languages "leak" the iterated result, and combined with closures this doesn't work. The right way to do it is there:

  >>> fs = [(lambda n, j=i: j + n) for i in range(10)]
  >>> fs[3](4)
  7

Anyway, Python 3.0 got rid of lambda. It complicates things and always there's a way to do it more pythonic.

Note for lisp knee-jerkers: I don't have anything against anonymous functions. In Python, 3rd party code is harder to understand if it has lambdas all over the place. At least for me.

HN: This is yet another example of front page trash.

EDIT:

Seems this teacher guy is not even recognizing his mistake. Now it's a "design error."

  While I could accept the explanation about how closures capture the location
  and not the value, this really is a design error. The loop variable ough to be
  local to the loop. But I suppose I get what I deserve for using a dynamic language.

From http://www.python.org/doc/2.5.2/ref/naming.html, it is either a module, function, class definition, script (file|command|eval), or the expression fed to input() (rare.)

  4.1 Naming and binding

  Names refer to objects. Names are introduced by name binding operations.
  Each occurrence of a name in the program text refers to the binding of that
  name established in the innermost function block containing the use.

  A block is a piece of Python program text that is executed as a unit.
  The following are blocks: a module, a function body, and a class definition. 
  A script file [...] is a code block. A script command [...] is a code
  block. The file read by the built-in function execfile() is a code block.
  The string argument passed to the built-in function eval() and to the
  exec statement is a code block. The expression read and evaluated by the
  built-in function input() is a code block.

d0mine · on April 9, 2009

Ordinary Python function have the same behaviour due to Python has lexical scopes but dynamic namespaces.

   >>> def add(n):
   ...     return i + n
   ...
   >>> fs = [add for i in range(10)]
   >>> fs[3](4)
   13

This example works due to list-comprehension leaks `i` binding.

http://stackoverflow.com/questions/139819/why-results-of-map...

senko · on April 9, 2009

The difference in number of comments here (17 at the time of writing this) and on the article itself (0) shows a nice example of (lack of) usability.

I'm guessing a number of people thought about commenting on the post, then found out they had to log in, and just gave up there (at least that was my reaction). The author missed out on some interesting feedback he might've got otherwise.

[apologies for being slightly offtopic for the discussion topic, but it's related to the article]

glomek · on April 9, 2009

Seems fine to me. If you want the captured variable to be different each time, you need to create a new environment each time. How about this?

    >>> fs = [(lambda x: (lambda n: x + n))(i) for i in range(10)]
    >>> [f(4) for f in fs]
    [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
    >>>

vineetk · on April 9, 2009

    fs = map(lambda i: lambda n: i + n, range(10))

I actually don't think this is all that difficult to explain to newcomers. It may not be the expected behavior the first time they run into it, but it's an instructive example about python scopes and name spaces.

juanpablo · on April 9, 2009

Same thing happens with closures in JavaScript

lbolognini · on April 9, 2009

And your site doesn't feel that good either...