
It's worth explaining why mutable defaults are bad.

The problem with mutable defaults is that they are evaluated only once, when the function is defined. Each time the function is called you'll be using the same mutable object that was created at definition time.
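A minimal sketch of the gotcha (function and parameter names are mine):

```python
def append_item(item, target=[]):  # the [] is created once, at def time
    target.append(item)
    return target

print(append_item(1))  # [1]
print(append_item(2))  # [1, 2] -- the same list object is reused across calls
```

The second call sees the mutation made by the first, because both calls share the single list created when `def` was executed.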




I came here looking for an explanation for this, so thanks.

To the curious - SO has an explanation of why Python was designed like this, which I found interesting: http://stackoverflow.com/questions/1132941/least-astonishmen...


I don't buy this explanation.

Actually, this is not a design flaw, and it is not because of internals, or performance. It comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.

Why in Common Lisp defaults behave the way one would expect, then? Functions are also first class, but defaults are evaluated at every call.


I don't know CL, but in python function definitions can be executed multiple times... at the top level this happens at module import, so it ends up being only once. But in nested definitions, e.g.

   def foo():
      def bar():
         ...
      return bar
   foo() == foo() # false
Two different function objects are created. If, in the above example, bar took a param thelist=[], each call to foo would produce a bar function with a different list instance for thelist. The default values can be read as expressions passed to the function object constructor, rather than a bit of code to be evaluated on each function run.

I don't know how CL works in this regard, nor do I know which is better or worse. I think the explanation linked did a terrible job conflating first class functions with execution and runtime models. Some of the answers below it explain better tho. :)


The fact that a definition can be evaluated many times is not really relevant here. What is important is how one specifies a language's semantics -- for instance, Common Lisp: The Language, 2nd edition (I don't own the ANSI standard) says:

When the function represented by the lambda expression is applied to arguments, the arguments and parameters are processed in order from left to right. (...) If optional parameters are specified, then each one is processed as follows. If any unprocessed arguments remain, then the parameter variable var is bound to the next remaining arguments, just as for required parameter. If no arguments remain, however, then the initform part of the parameter specifier is evaluated, and the parameter variable is bound to the resulting value (...).

CLTL2 specifies that the form representing the default value of an optional parameter shall be evaluated every time the parameter is not provided.


> It's worth explaining why mutable defaults are bad

They can also be good. Here's an example from the Reddit discussion, showing how a mutable default can be used to very neatly and cleanly add memoization to a function:

   def fib(n, m={}):
      if n not in m:
         m[n] = 1 if n < 2 else fib(n-1) + fib(n-2)
      return m[n]


That's an unpythonic hack. It is possible to have an explicit static variable using function attributes; another way is to use a proper memoization decorator which factors this out.
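A sketch of both alternatives mentioned: a hand-rolled memoization decorator (the name `memoize` and its shape are my assumption, matching the `@memoize` used below), and a function attribute acting as an explicit static variable:

```python
import functools

def memoize(fn):
    cache = {}                      # private state, kept out of the signature
    @functools.wraps(fn)
    def wrapper(n):
        if n not in cache:
            cache[n] = fn(n)
        return cache[n]
    return wrapper

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Function-attribute alternative: the static variable is explicit and named
def fib2(n):
    if n not in fib2.cache:
        fib2.cache[n] = n if n < 2 else fib2(n - 1) + fib2(n - 2)
    return fib2.cache[n]
fib2.cache = {}
```

Either way the cache is no longer a parameter a caller could accidentally supply.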


Cleverness like this makes you feel warm and fuzzy inside right up to the point where someone decides to actually pass that second argument to your function.


Well in Python 3 you can warn them:

    def fib(n, m:"donotusethisparameter"={}):


Is that clean and neat, or weird and inscrutable? Will it be clear to /anyone/ reading that code what it's doing?


I'm a Python newbie, and that code was immediately clear and obvious to me when I read it.

I won't say it would be clear to anyone who reads it, because we live in a world where people who claim to be programmers can't do fizz buzz.


I'm a Python veteran, and, if you do it like this, I'll shoot you.

Add a @memoize decorator and do it there, you need to always be as obvious as possible. Compare:

    @memoize
    def fibonacci(n): pass
    
    def fibonacci(n, memory=[]): pass
You don't even need documentation for the first example.


Starting from Python 3.2 there is a builtin decorator: http://docs.python.org/dev/library/functools.html#functools....
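For reference, the builtin linked there is functools.lru_cache; usage looks like:

```python
import functools

@functools.lru_cache(maxsize=None)  # unbounded cache, available since Python 3.2
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075, computed in linear time
```

`maxsize=None` disables eviction entirely; a bounded `maxsize` gives an LRU policy.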


That's fantastic, thanks for the link.


I agree with your scepticism. This is misusing a language feature; a better approach is using a decorator.


I blogged about this, and other uses for mutable default arguments (with tongue somewhat in cheek) a few weeks ago: http://inglesp.github.com/2012/03/24/mutable-default-argumen...


This also explains when mutable defaults are not a problem: when they are not mutated in the function's body. There's nothing wrong with this:

  def f(seq=[]):
    for x in seq:
      # do something


Yes but no. In two months, when the function has grown, the next coder may not notice the issue and start mutating the default in the function body. Then you have a hidden killer bug.

Pass all code under pylint scrutiny, comply with its complaints or adjust its rules, and do it early. That is the recommendation I wish all devs could read.


That seems like decidedly unexpected behaviour and makes default params far less useful.


It's a mistake. Default values for optional parameters don't act that way in any other language I know of, including Common Lisp, in which functions are also first-class.


It's most certainly not a mistake; Python 3 would probably have fixed it, if it were. It is an (admittedly, strange) side effect of the way 'def' works.


But 'def' doesn't have to work that way. Consider, in CL:

  > (defvar *fn*
      (let ((x 3))
        (lambda (&optional (y (list nil x)))
          (push 7 (car y))   ; modifies the list
          y)))
  *FN*
  > (funcall *fn*)
  ((7) 3)
  > (funcall *fn*)
  ((7) 3)
From this example you can see two things. First, the binding of 'x' is closed over when the lambda expression is evaluated. And second, the expression that provides the default value of 'y' is evaluated every time the function is called.

There's no fundamental reason it couldn't have worked that way in Python. (I understand that changing the language so it worked that way now would likely break some code.)

EDIT: fixed formatting.
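One way to get something close to the CL behavior in Python today is a decorator that deep-copies the defaults on every call. Note the hedge: this copies the default value rather than re-evaluating the default expression, so it is only an approximation; all names here are mine:

```python
import copy
import functools
import inspect

def fresh_defaults(fn):
    """Give each call a deep copy of every unfilled default value."""
    sig = inspect.signature(fn)
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            if name not in bound.arguments and param.default is not param.empty:
                bound.arguments[name] = copy.deepcopy(param.default)
        return fn(*bound.args, **bound.kwargs)
    return wrapper

@fresh_defaults
def f(y=[None]):
    y.append(7)   # mutates its argument, like the CL example pushes onto the list
    return y

print(f())  # [None, 7]
print(f())  # [None, 7] -- a fresh copy each call, no shared state
```

With a plain `def f(y=[None])` the second call would have returned `[None, 7, 7]`.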


My lisp is a bit rusty, but it looks like what you're doing there is returning a function which gets redefined every time you reuse the function.

The equivalent Python would be something like this:

    def function():
        x = 3
        def internal(x, foo=[]):
            foo.append([7])
            foo.append(x)
            return foo
        return internal(x)
        
    print function()
    print function()
Which does what you would expect:

    [[7], 3]
    [[7], 3]


No. In my example the function is created only once, and called twice.


What's that lambda thingo in the middle then? Pretty sure that's another function, redefined every time your function is called.


Yes, the lambda creates the function. Note that defvar does not. So there is still only one function being defined here.


Ok, I see now.

You've still got that &optional argument though. I don't see a huge amount of difference from a semantic point of view between that and the Python version (ie. if x == None: ...).


No, the lambda expression creates the function, which is returned as the value of the let block; defvar just binds the function to a name so we can use it multiple times.


This translates into

    fn = lambda y=[None]: y.append(7); return y
if you accept the ; to separate statements, since a lambda in Python is syntactically only allowed to contain a single expression.

(The introduction of the variable x into the example is not important for the behavior of default arguments, however, it is important for a separate issue. I've stripped it out here.)



Surely not. The implementation already has to check that the number of provided arguments is valid. The decision of whether to evaluate the default expression can be part of that.

The code that evaluates the default expression doesn't need to be in a separate function, either, so the argument that calling that function is too expensive also doesn't hold water.

I just tried a test in SBCL:

  (defun foo1 (x) x)
  (defun test1 (n) (dotimes (i n) (foo1 (cons nil nil))))
  (time (test1 100000000))
  => 4.4 sec, or 44ns / iteration
  (defun foo2 (&optional (x (cons nil nil))) x)
  (defun test2 (n) (dotimes (i n) (foo2)))
  (time (test2 100000000))
  => 4.1 sec, or 41ns / iteration
The version with the optional parameter is actually slightly faster, which completely blows a hole in the performance argument.

Look, no language is perfect -- not even Common Lisp :-) I think users are better served when design flaws in a language are acknowledged without defensiveness than when bogus justifications are offered.


You're assuming that the function-calling overhead is the same in python as in CL. I don't think that's the case, and it definitely wasn't at the start.

I don't agree that this is a design flaw. As I recall it bit me once as a beginner, and never again in over a decade of using python, and as a lisp hacker you know you don't design a language for beginners. :-)


IMO mutable default arguments should be forbidden just as mutable keys are not accepted in dictionaries. All of the examples which claim to have a use-case for mutable default values can be rewritten with more explicit (thus more pythonic) constructs.


The thing is, None is always a possible value for a parameter so it's actually more robust if functions are written to expect None.

If you say "f(x=[])" (assuming that worked without the actual side effects it has), someone could still say "f(None)" instead of "f()", causing the function to die. Since a robust program isn't able to avoid checking for None, it might as well set defaults there too.

There is another case where this is important; you might want the equivalent of "f(x=expensive_function_to_calculate_useful_default())", and you don't want that function called unless it needs to be. Only the x=None approach allows this to be deferred.
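The deferred-evaluation point sketched out (the `expensive_default` name and its body are hypothetical stand-ins):

```python
def expensive_default():
    # stand-in for a genuinely costly computation
    print("computing default...")
    return list(range(5))

def f(x=None):
    if x is None:
        x = expensive_default()   # only evaluated when no argument was given
    return x

print(f([1, 2]))  # [1, 2] -- the expensive default is never computed
print(f())        # computes the default, returns [0, 1, 2, 3, 4]
```

A `def f(x=expensive_default())` signature would instead pay the cost once at import time, whether or not the default is ever used.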


I disagree. If you want your programs to be robust like that, you now have to check for every case where someone might pass in something stupid (dict instead of int, maybe?). Much better to catch errors further up the chain and keep your low level code simple (ie. pass me something other than an iterable and I blow up).

In the expensive case, I'd just calculate it once and store it somewhere (possibly as a lookup dictionary if there are multiple inputs) and access that from within the function.


Well that's true, a function is generally written as if it's been given what it wants. I wouldn't check for other types either.

But None is a result that can happen in situations that would otherwise return exactly the expected type. If "nothingness" can be meaningful (especially in a function that accepts an empty list as a parameter, say), it's nicer if the code just deals with None itself instead of requiring checks for None in all the callers.


It's like whitespace. Everybody has this reaction at first, then they get over it.


Ahem, we repeatedly have intricate issues that are related to this kind of behavior, and I have yet to see an annoying whitespace bug.

Not to say I think python should be changed on this point. It shouldn't, there are code checkers that warn you on the gotcha, let's use them.


Actually, if you understand the way Python is evaluated (dig in, the core is pretty transparent), it's the only behavior that makes sense in this case. It's also documented as such[1], so it's quite expected. Default parameters are still just as useful for constants, such as:

    def f(x=0, y="foo", z=3.14159):
This, however, is a perfectly Pythonic idiom:

    def f(L=None):
        if L is None:
            L = []
[1]: http://docs.python.org/reference/compound_stmts.html#functio...


While it seems logical when you understand what's going on, from a practical point of view I can't see how this would ever be useful. The tradeoff appears to be that the functions are first class objects. I'm not sure what the benefit here is though. Does having them as first class objects allow some useful idioms? (I'm a ruby dev but I'm genuinely curious to know what this allows you to do)


There are a few use cases for default variables on effbot's site: http://effbot.org/zone/default-values.htm

Basically, sometimes you do want to reuse the mutable between function calls, and in those cases it can save a fair bit of code passing it in repeatedly.


Good coverage. I use it quite often for a cache dictionary; it makes for a much simpler API and less code overall than creating a new class for it. Demo snippet from effbot's site:

  def calculate(a, b, c, memo={}):
    try:
      value = memo[a, b, c] # return already calculated value
    except KeyError:
      value = heavy_calculation(a, b, c)
      memo[a, b, c] = value # update the memo dictionary
    return value


This seems a bit leaky. You're exposing the caching mechanism in the method signature (yeah, ok, in practice it's unlikely to be a problem).


The other option is to create your own cache object and pass that in (and around, if it's a recursive function). Of course, in Python pretty much every cache object follows the dictionary interface anyway, so it doesn't really matter. One of the benefits of duck typing :)
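The explicit-cache variant, with the dict passed around recursively (names are mine); any dict-like object would do, which is the duck-typing point:

```python
def fib(n, cache):
    # cache is any mapping supporting "in" and item get/set
    if n not in cache:
        cache[n] = n if n < 2 else fib(n - 1, cache) + fib(n - 2, cache)
    return cache[n]

print(fib(10, {}))  # 55
```

The cost is threading `cache` through every call; the benefit is that the caching policy is entirely under the caller's control.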


Functions (methods) are first class objects also in Ruby: methods are instances of the class Method, while Procs are a lightweight alternative. Default arguments in Ruby are not mutable in any kind of function object, be it a lambda proc, a regular proc, or a method.

You can use a mutable default argument as an ersatz static variable, e.g. for memoization.


> Default arguments in Ruby are not mutable in any kind of function object

Maybe I misunderstood you, but they are perfectly mutable:

    class A
      attr_accessor :a
      def initialize
        @a = []
      end

      def b x = a
         x << 1
      end
    end
     
    obj = A.new
    # => #<A:0x007fd92b2e9850 @a=[]> 
    obj.b
    # => [1] 
    obj.b
    # => [1, 1] 
    obj.a
    # => [1, 1] 
If you mean "inline default arguments are not mutable", that's not true either. What is true is that the default argument is evaluated when the function is called, not when it is defined:

    a = 0
    # => 0 
    x = lambda {|y = (a + 1)| y }
    # => #<Proc:0x007fd92b1dae50@(irb):33 (lambda)> 
    x[]
    # => 1 
    a = 5
    # => 5 
    x[]
    # => 6


And this is expected behavior. What is unexpected is when this behavior is afforded to new list or dictionary arguments in Python, just as it would be unexpected to say

    ruby> class A; end
     => nil 
    ruby> def foo(bar = A.new); return bar; end
     => nil 
    ruby> foo
     => #<A:0x00000101985060> 
    ruby> foo
     => #<A:0x0000010197a6b0>
and get back the same object each time `foo` is called in Ruby.

I've even seen a major Python library with this bug (I'm sorry, I don't recall which off-hand). It's really surprising behavior for new Python devs.


Good catch, I hadn't thought of the case where the default argument is expressed in terms of another variable.


It's not designed to be useful, it just is. Functions are objects, yes, and the def statement brings them into being and assigns them to the given name in the current scope:

    >>> def f(x):
    ...     return x + 5
    >>> type(f)
    <type 'function'>
    >>> dis.dis(f.func_code)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_CONST               1 (5)
                  6 BINARY_ADD          
                  7 RETURN_VALUE       
    >>> g = f
    >>> g(10)
    15
    >>> g is f
    True
They're just variables in the current scope. If you're quite clever, your brain is already figuring out that this has some interesting implications, which some libraries use:

    >>> import socket
    >>> socket.gethostbyname('www.google.com')
    '74.125.71.103'
    >>> socket.gethostbyname = lambda i: '10.0.0.1'
    >>> socket.gethostbyname('www.google.com')
    '10.0.0.1'
I will not pass judgement on monkey patching like this, just pointing out it's doable. I know for a fact Ruby can as well.

Functions just being variables has useful properties when you're doing something like fancy switch/case type things (the readability of this is questionable, but it's cool to look at it, like a Duff's device):

    >>> i = 1
    >>> { str: func1, unicode: func2, int: func3 }.get(type(i), func1)(i)
    in func3
Also consider something like this, which is how decorators work (and they're incredibly useful), sort of like a closure:

    >>> def maker(i):
    ...   def ret(x):
    ...     return x + i
    ...   return ret

    >>> f, g = maker(10), maker(100)
    >>> f(5), g(10)
    (15, 110)
I'd be surprised if Ruby couldn't do everything I just did.


> It's not designed to be useful

By useful he means mucking up your program in totally unexpected ways.

He's being nice about it being a silly decision to have it behave that way. The entire post reads more like a list of unexpected things that will bite you in the ass.


Thank you for showing me the 'dis' module. Dis is going to be fun to play with!


Or just embrace the tao of the tuple:

    def f(L=()):
        ...


I am but an egg, but isn't this the same but shorter?

def f(L=None): L = L or []


It might work the same depending on how you use it, but it's not the same. In that instance L will become a blank list if it is None, zero, an empty string, or any other falsy value. There are many cases where this wouldn't affect anything, but there can also be instances where it will cause you to replace L with a blank list when you really wanted to keep L's value. I think it's always better to be explicit and test for the value(s) you expect.
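The difference, side by side (function names are mine):

```python
def f_or(L=None):
    L = L or []                  # replaces ANY falsy value: None, 0, "", (), []
    return L

def f_is(L=None):
    L = [] if L is None else L   # replaces only None
    return L

print(f_or(0))  # [] -- the 0 was silently discarded
print(f_is(0))  # 0  -- preserved
```

With `or`, a caller who legitimately passes 0 or "" gets it swapped for an empty list; the `is None` test touches only the sentinel.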


Ah.

These are the little assumptions that keep blowing off my feet. Thanks.


It is these cases that brought the ternary operator to Python:

  def f(x=None): x if x is not None else []


I'm a little confused about this as an assignment to a variable. Is that because of the 'return' omission referred to below? If all function f did was return x, I could see how this works with a 'return' prepended.

But if this expression is a line in a larger function, and is intended to reset the value of argument x if no other value is passed in for it, does this really act as an assignment to x? Because I sort of read this expression as evaluating to some value -- the passed value for x, or a [] -- but does this assign that value to the argument x? Or must it be x = [expression]?


  def f(x=None):
        x = x if x is not None else []
        return x
Now it will assign that value back to x. Otherwise it would just evaluate the expression.


Thanks for the additional clarity.


Yes, but why? Actually, it's no, because Python culture aims to use one way to do things, the least surprising one. In this case it's:

  def f(x=None):
      if x is None: x = []


Because that "if" is a statement whereas the ternary expression is, well, an expression. There are places expressions can be used that statements can't (eg lambdas) and that x or [] won't work (eg when x is False).

That said, people still seem to favor your form as the more Pythonic way. Personally, I think that's just because the ternary expression is relatively new.


you forgot 'return'


Yeah, I guess my typing went into "lambda mode" since it was a one liner.


I originally had "don't do that!" in my comment with your exact code, and edited it out for brevity because I've only seen a couple people do it (and they understood the ramifications, which others have told you). If you're interested in brevity, this is as terse as it gets:

    L = [] if L is None else L



