    from functools import wraps

    def defaults(**default_kwargs):  # Decorator factory
        def decorator(func):  # Decorator function
            @wraps(func)
            def wrapper(*args, **kwargs):  # Actual wrapper
                # Iterate over all default keyword arguments, and for any
                # not specified in the passed-in kwargs, call the factory
                # to build the default (only when it's actually needed)
                for name in default_kwargs:
                    if name not in kwargs:
                        kwargs[name] = default_kwargs[name]()
                # Call the actual function
                return func(*args, **kwargs)
            return wrapper
        return decorator

    # Note that the default argument value should be a callable,
    # to avoid the pitfall with mutable arguments
    @defaults(bar=lambda: [1, 2, 3])
    def foo(bar):
        print(bar)

    foo()  # prints [1, 2, 3]
Typing tools won't know what the above decorator does without some fancy PEP 612 annotations, and I'm not sure PEP 612 can cover that level of automation. I think the intent is clear to everyone if you just do the construction in the method body, for the (IMO, should-be) seldom case where this is needed.
Eh. I don't use typing tools for most things because (and I know this isn't a popular opinion even though I believe it's the correct one) they rarely save you from real-world problems. They are mostly useful for auto-complete in your IDE, so you have to weigh the time you spend annotating your code to help your IDE auto-complete function signatures against the time the auto-complete saves you.
If you use it in conjunction with mypy it definitely saves you from a lot of real-world issues. In addition, it is also an amazing way of documenting an important element of your code. And annotations can be used by modules (like pydantic) to get even more use. Improvements to annotations have been the biggest improvement in Python since 3.6, imho.
What kind of bugs have you encountered in practice that were caught by mypy but not at runtime, or by manual or automated testing? I am genuinely curious, because in my experience the only time I have seen this is when you either transpose the order of arguments to a function (detected the first time you run it) or pass None when an actual value was necessary (detected the first time you run it).
While "weird" when first encountered, once you wrap your head around why this happens, then this is a feature, not a bug.
Both in terms of enhancing one's understanding of python's memory model, and in terms of it being an incredibly useful feature, e.g. for introducing static/persistent variables to a function (without the need for global declarations)
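E.g., a minimal sketch of the static-variable trick:

    def counter(_count=[0]):  # the same list object lives across calls
        _count[0] += 1
        return _count[0]

    counter()  # 1
    counter()  # 2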
In either case, it's pretty easy to obtain the other case, and the question is what the best default is. My impression is that "evaluate every time" is both a lot less surprising and more commonly what you want, so it would be the better default.
Right. They were saying they much prefer the current way over the proposed way.
It makes sense if you realize that the function is defined once, and for the function to be defined, its signature needs to be defined with it. Default arguments are part of the signature, not part of the function body.
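A quick demonstration of that definition-time evaluation:

    import time

    def stamp(t=time.time()):  # time.time() runs once, when the def executes
        return t

    stamp() == stamp()  # True: every call sees the same timestamp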
It would make sense to pass a callable without calling it, like
def myfun(person = Person): ...
but if the call occurs in the function signature, it doesn't really make sense that the call would somehow happen again and again. You're not defining a function in the signature; you're specifying a value right there.
The current PEP proposal is to enable the "evaluate at function call time" ability with new syntax, which I think makes the most sense semantically and also doesn't break existing code.
> if the call occurs in the function signature, it doesn't really make sense that the call would somehow happen again and again. You're not defining a function in the signature; you're specifying a value right there.
There are things besides functions and values! They're called "expressions" or "code blocks", and in an impure language like Python they can produce a different value each time you evaluate them. In my second example, "Person()" is neither a value nor a function -- it's an expression that would be evaluated each time the function is run. This isn't that weird; in `while condition: body`, `body` is a block that's evaluated multiple times. It would need to capture variables from the local environment though, which makes it more complicated to implement.
> The current PEP proposal is to enable the "evaluate at function call time" ability with new syntax, which I think makes the most sense semantically and also doesn't break existing code.
Oh yeah I wasn't suggesting breaking existing code. Just that if I was designing a language from scratch, it would probably avoid the Python foot-gun and have default arguments evaluated each time.
I won't bite on the feature/bug argument, but it's enough of a foot-gun that I have tracked down no fewer than 3 real-world bugs relating to it in large Python code bases. Usually when people use an empty list literal as a default, and shit gets real when it retains values on the next call.
When performing Python interviews I usually try to have that behavior come up in the coding question - not to quiz for trivia so much as to get a feeling for adding on battle-scar bonus points.
In practice, for any non-plain-old-data defaults I just have it default to None and handle it in logic.
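I.e., roughly this shape (sketch):

    def process(items=None):
        if items is None:
            items = []  # fresh list per call, no shared state
        items.append('processed')
        return items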
Totally understand why it may be useful for introducing a persistent store to a function, but as a matter of taste I like to do that in an explicit decorator if needed.
> Both in terms of enhancing one's understanding of python's memory model...
In much the same way as having your company bricked by a buffer-overflow ransomware attack does in the C world? I'm not a big fan of elucidation by gotcha.
> [...and] for introducing static/persistent variables to a function (without the need for global declarations)
Except when you are at risk of someone calling with their own variable(s) that do not have the static-singleton semantics that the default would have.
This seems to work fine, and brings attention to "I have an object that supports mutation, and I'm actually going to mutate it," which is an unusual enough case to be worth calling out.
This seems pretty iffy for introducing static/persistent variables to a function. I mean, it can work, but it's semantically very confusing. Parameters are part of a function's interface. A global variable would be much better.
I mean, it should at least be clear, because it's not like that in most languages. And there could be two forms of syntax for default arguments: singletons or each-call.
Sounds like a good idea. Not really a fan of the suggested syntax, though:
def fn(s, n=>len(s)):
"=>" just kind of bugs me, for some reason. (I love me some bikeshedding.)
Of the alternatives they offer, ":=" kind of looks nicest to me:
def fn(s, n:=len(s)):
Though it doesn't really line up with the semantics of the existing walrus operator. Also, as the PEP mentions, it could look confusing when mixed with type annotations.
Semantically, the "?=" alternative might be best, since it kind of makes it clear it's a default, and doesn't conflict with anything else in the language:
def fn(s, n?=len(s)):
Kind of reads like "present? else = ...".
edit: At this point I'd shift from "might be best" to "almost certainly is best". I'm now a zealous "?=" advocate.
I agree. I think Python needs some solution to the `def foo(mylist=[])` mistake, which I see devs make all the time. `foo=>func(a)` is awful. There's too much precedent that `x = f => {}` defines a lambda with f as a parameter and assigns it to x. This flips it on its head. `@` would be challenging for the parser. `:=` also does not match the walrus semantics.
`x ?= default` is perfect IMHO. There's already prior art (Makefiles) that this means exactly "present? else = ..." with lazy evaluation of RHS expression.
It also would allow eliminating the "if x is SENTINEL" pattern, where None is a valid non-default argument.
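For reference, the sentinel pattern in question looks roughly like this:

    _SENTINEL = object()

    def f(x=_SENTINEL):
        if x is _SENTINEL:  # caller passed nothing at all; None is a real value here
            x = []
        return x

    f()      # []
    f(None)  # None is accepted as a genuine argument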
Having mylist stick around is actually useful for making stateful functions. For example, I've often used
    def foo(..., cache={}):
        ...
to implement cached functions, or functions which need an internal cache for whatever reason. Later versions of Python provide an LRU cache decorator (functools.lru_cache), but that doesn't help with the internal cache and can be inefficient when you don't want your cache to forget any results.
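As a concrete sketch of that pattern:

    def fib(n, _cache={}):  # one dict, shared across every call
        if n not in _cache:
            _cache[n] = n if n < 2 else fib(n - 1) + fib(n - 2)
        return _cache[n]

    fib(300)  # fast, and the results persist for later calls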
Yeah, the more I look at it, the more I'm a fan of "?=". Kind of seems like the obvious choice at this point.
The only potential downside is that standard, function definition-time default arguments also imply "present? else = ...". But I think of all the options available, this one makes the most sense.
Yeah, it's probably not bikeshedding when it concerns a very popular programming language's syntax. I just find myself prone to bikeshedding and pedantic naming indecisiveness in general.
Even though the PEP doesn't explicitly mention mutable default arguments, I'm pretty sure it's intended for this use case, as well. It should address this issue.
> Function parameters can have default values which are calculated during function definition and saved. This proposal introduces a new form of argument default, defined by an expression to be evaluated at function call time.
Maybe you could make a function decorator that looked for a special type of default value indicating how to make the default value each time. Similar to `attr.Factory` [1] for `attr.dataclass` annotations.
> Maybe you could make a function decorator that looked for a special type of default value
If you pass the functions that supply the late-bound defaults as arguments to the decorator, it's pretty doable; here's what I just threw together.
    from inspect import signature, Parameter

    def with_deferred_defaults(*args, **kwargs):
        # Positional args and kwargs are zero-argument callables
        # that build the default values at call time
        def decorate(fun):
            parameters = signature(fun).parameters
            positional_parameters = [parameter for parameter in parameters.values()
                                     if parameter.kind in [Parameter.POSITIONAL_ONLY,
                                                           Parameter.POSITIONAL_OR_KEYWORD]]
            defaults = list(args)
            kwdefaults = kwargs
            def wrapper(*args, **kwargs):
                args = list(args)
                # Fill in missing positional arguments from their factories
                if len(args) < len(defaults):
                    args = args + [arg() for arg in defaults[len(args):]]
                # Only invoke a keyword factory if the caller omitted that keyword
                new_kwargs = {k: v() for k, v in kwdefaults.items() if k not in kwargs}
                new_kwargs.update(kwargs)
                for i, p in enumerate(positional_parameters):
                    if i >= len(args):
                        if p.name in new_kwargs:
                            args.extend([None] * (1 + i - len(args)))
                            args[i] = new_kwargs[p.name]
                            del new_kwargs[p.name]
                        elif p.default is Parameter.empty:
                            raise TypeError('Missing argument: {}'.format(p.name))
                    else:
                        if p.name in new_kwargs:
                            del new_kwargs[p.name]
                return fun(*args, **new_kwargs)
            return wrapper
        return decorate
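A quick usage sketch of the above (the function name is just illustrative):

    @with_deferred_defaults(items=list)
    def append_one(items):
        items.append(1)
        return items

    append_one()  # [1]
    append_one()  # [1] again -- a fresh list is built on every call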
The problem with that compared to late-bound parameters is that you have to pass it a callable for non-default uses as well. It's nice sometimes to have a deferred (callable-supplied) default but pass immediate values for non-default.
Yeah, this is the correct answer and it works with functions or objects.
This is commonly used to implement logging callbacks e.g.
    from collections.abc import Callable

    def log_fn(message: str = ''):
        print(f'This is your log: {message}')

    def run_read_file(file_to_read: str, log_handler: Callable = log_fn):
        # Read eagerly so the file handle is closed before logging starts
        with open(file_to_read) as f:
            contents = f.readlines()
        for line in contents:
            log_handler(message=line)
Functions are in fact objects as well, so there might be some way to do this sort of behavior with decorators.
I would actually contend that `if x is None: x = Bar()` is a superior pattern in some/many cases. I used to use `x or Bar()` or `x if x is not None else Bar()` quite a lot, due to their compactness. However, I learned that code coverage sometimes fails to ensure proper coverage of these ternary assignments. If your default argument exists just because you need to initialize a mutable object, the ternary works fine. However, if the presence or absence of a parameter is an important part of your state space, you probably want explicit coverage.
This isn't always safe to do. It can break duck typing, because not all falsy values are interchangeable in all situations. For example, consider how numpy methods often have an optional `out` parameter indicating where to write the output. So suppose you are writing a value producing method that takes an `out: Optional[List[int]] = None` to put the produced values into and you handle the `None` case with `out = out or []`. That's a bug! When the user tries to pass in a list to put things into, and that list happens to be empty, you will instead put outputs into a new list.
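A minimal repro of that failure mode:

    def produce(out=None):
        out = out or []  # bug: a caller-supplied empty list is falsy
        out.append(42)
        return out

    mine = []
    produce(mine)
    print(mine)  # [] -- the output went into a brand-new list instead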
Anyways, my point is that I can write `if X is None: X = bla` and be confident it will be correct in almost all situations. For `X = X or Y` there are some nasty corner cases where the distinction between `[]` and `""` and `{}` is somehow important, or the distinction between `is` and `==` is important, or different falsy values have different annotations attached to them, or or or....
All of this is true. It's also weird to me that the evaluation of a boolean expression can return a value that is neither True nor False (yes, I know the value evaluates to true; that's why the trick works).
Which is why I just do `if param is None:... ` and not worry about the few extra lines of code and move on.
The trick does have the nice property of delaying evaluation of the default parameter, though. The only other way to do that would involve writing lambdas, which is even more code. The sibling comment suggesting a package looks like even more code to write as a user. I think any really elegant fix would have to be done in the language itself.
It's not only not a trick, it's how the language is designed with the intention of doing it like this.
To me it's like complaining that (++a) is a "trick" that lets you use secret temp values that are hard to understand. I mean.. _I guess_, but it's C so it seems like the problem is that you don't like C.
"I hate having to have a block of `if bar is None` at the top of my functions."
Why? This is like someone saying "I don't like declaring my variables in C at the top of the function".
It is a good and readable solution to a somewhat rare problem. Passing mutable objects as default function parameters is a common code smell in Python.
And no, adding a special syntax to an already syntax-heavy language to address this is not good.
For your example I would do foo(Bar()) when I needed to run it without an existing object. Or make foo a method on the Bar class and do Bar().foo()
In general I just try my best to avoid immutable, or dynamic defaults, but sometimes they're necessary and I make that same if block at the top of the function. If I catch myself copy/pasting the same setup to multiple functions, then it might be time for a class or a decorator.
Python is my go-to language, but there are a lot of these little hang-ups you only learn about once you've pulled your hair out. I don't know if this is technically one of them, though; function definitions are evaluated at interpretation time, so I think the result here is pretty straightforward, even if it's a hang-up for some people.
I agree that it's unfortunate that this is the default behaviour. It's just so unexpected when first encountered.
That said, it is not always desirable that a new object is created each time the function is called without the arg. One obvious example is when the default is an immutable object that's either large or expensive to construct.
The creation of /, iirc, was initially meant strictly as a way of internally enforcing constraints when interfacing with C libraries. It just so happened that someone dug it out and went "what is this?" and then core devs agreed that it should be surfaced and officially supported.
Personally, I find * useful and / fugly - because positional parameters are very opaque when reading code. They also don't give a chance to the IDE to guess what each parameter is meant for.
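For anyone unfamiliar with the markers being discussed, a quick sketch (names made up):

    def connect(host, port, /, *, timeout=10):
        ...

    connect('db.local', 5432, timeout=3)  # OK
    connect(host='db.local', port=5432)   # TypeError: host and port are positional-only
    connect('db.local', 5432, 3)          # TypeError: timeout is keyword-only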
I'm not a big fan of them either. For teaching beginners, it's often "look at the help for this function, but ignore these bits for now", which is quite jarring.
They are useful, but I think sometimes the advanced features run into the basic interfaces a bit, making it weirdly tricky to teach beginners without being confused.
Slightly off-topic, but everyone writing modern Python should be familiar with Pydantic and similar libraries that use type hints for validation and parsing.
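For example, a minimal Pydantic sketch (the model and fields are made up):

    from pydantic import BaseModel

    class User(BaseModel):
        id: int
        name: str = 'anonymous'

    User(id='42')   # the str '42' is parsed/coerced to the int 42
    User(name='x')  # raises ValidationError: id is required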
We're using Pydantic for Robusta (https://github.com/robusta-dev/robusta) and absolutely love it. You get the best of traditional Python (rapid prototyping and no boilerplate) while still being able to scale your codebase and keep it maintainable. Robusta is the first large project I've written in Python where I'm not encountering type errors at runtime left and right.
We are using FastAPI a bit and it has been a decent experience.
One thing to note about it is tiangolo is pretty content to maintain singular control over his projects which means the project only really advances when he has the time or interest to work on that specific project.
> One thing to note about it is tiangolo is pretty content to maintain singular control over his projects which means the project only really advances when he has the time or interest to work on that specific project.
Yes, he recently responded to some criticism of this basically saying that he has no real intention to increase the bus factor or permit more community control to his projects, which is obviously his right, but is also something that people should consider when deciding whether or not to go with them.
Not everyone is using Pydantic (nor do I see the reason to) or even type hints, for that matter.
Type hints are great when combined with a good IDE such as PyCharm, but are otherwise useless for Python. Sometimes they are even misleading. Often they don't cover all the call options.
Complicated function calls should be made with keywords anyway. Leave positional only parameters to obvious functions such as pow() or sqrt().
The speed hit should be minimal. You are using Python, where dot access is a much bigger performance eater than keyword-arg function calls.
I’ve certainly avoided a nasty bug or two with Python type hints without an IDE. Not sure why you’re so hostile to the idea, the OP didn’t say everyone must use them, just that it’s helpful to be familiar with them.
I am seemingly hostile to the idea because it is turning Python into something it never should have been.
Python does not have the scale to make typing worthwhile, but people are already making it a required Python skill.
If this feature is optional, then I should not be required to know it, right?
If you want static typing, please use another language that makes use of them to actually speed up the code. Let's keep the docstrings for documentation.
You aren't required to know it, but I don't think it's fair for you to say that typing shouldn't be added just because you don't want to use it. JavaScript has been made better in the form of TypeScript, and I'd love to use a truly typed Python as well.
100% this. I like Python but I'm working on a large inherited code base and type hints on parameters would have made this much easier. Stronger typing would be a good thing imo.
When I mentioned scaling I had in mind other issues with a large codebase - not actual performance at runtime.
For example, before type hints I found it difficult (even with PyCharm and other IDEs) to safely refactor code. I would end up missing specific usages of an old member name and discover it at runtime when the code raised an exception. In the best case scenario I would find it when running tests. In the worst case, in production.
Type hints make this a moot point, at least in real world usage. With type hints the IDE is smart enough to find all references when refactoring.
I've never understood what I'm supposed to do with things like this or mypy when--as is usually the case--I'm making heavy use of other libraries that don't have type annotations. The examples and tutorials I've seen always show the code importing standard-library modules and nothing else.
As a counterpoint, I cannot stand Pydantic. Type hinting in function class parameters is super useful, but kludges like Pydantic that abuse Python to turn those into runtime checks? Never again.
I am currently finishing up a moderately sized project in FastAPI, and for my next project I will strongly be recommending using Starlette alone.
Pydantic is great for parsing input, whether from APIs or from users. You often want these runtime checks. But I would argue that the benefits diminish when you are not talking about cross-system interfaces.
I would love to live in a world where articles like this included flashcards that you could instantly add to your Python Anki deck. Instead of what I do: leave the tab open forever or bookmark it (and never see it again).
As someone who suffers from the same problem, something like SuperMemo may be a close enough solution. I've never used it myself, but I know it has features such as "incremental reading" where you add pages to the database and mark pieces of it to read a little at a time.
Check out https://withorbit.com - an ongoing research project (and set of tools on github) to embed SRS into any essay/post. If more people adopted it, you might have what you want.
This is completely right. On top of this, most modern IDEs can show you implicit parameter names in the function call if you want. It's a feature you can toggle on or off. Don't force people to consume things how you want because you don't use local variables or have a modern IDE.
However, allowing them to call them positionally doesn't actually change the capability of using it like an "options" dict. It makes exactly the same API minus flexibility for the caller.
I always strongly encourage Python programmers to start flipping through Effective Python to see more like this (I believe it includes this tip as well). Great book covering parts of the language and stdlib you may not remember or may not have heard of.
Yep, using positional-only parameters makes calling the API tougher when all you've got is a `**kwargs` dictionary. But I think that for well designed APIs you'll be able to avoid the scenario of "having" to pass down arguments this way.
I have an article in the early stages about what I call the "keyword arguments blender" and how to avoid it; I think it's a common anti-pattern in lots of Python code (including my own, guilty!)
In a dynamic language it's already hard enough to determine what type a given variable is at runtime once the codebase reaches a certain size. When you're passing arbitrary parameters in the form of kwargs, that just adds another layer of dynamism on top that can make debugging even more difficult. It definitely has its place, but it's generally a good idea to use positional parameters if you know they will be passed.
I really don't like the idea of a function forcing me to use it a specific way. It's the same reason I don't like type checking at runtime.
I can see the benefit of forcing junior members of my team to use keyword arguments, but I would be so annoyed if a library author did it to me.
Unless there is a technical reason, your callables should just work the same way as all the others.
I can't think of any reason to use positional only arguments, but keyword only arguments would be nice if I had a variable length of positional arguments that get packed, but then wanted to have an optional argument to change some behavior. But even then it's normally better to just have the caller send an iterable.
I agree with everything in the article, but I would like to add one observation:
The article's example uses two arguments: a byte string, and a string specifying the character encoding of the first argument. Having two arguments so strongly associated with each other as separate arguments is a slight code smell. One should contemplate unifying them into a compound type (like a class instance) which includes both. It's not always reasonable, of course, and may add more complexity than it saves. But one should keep an eye out for this.
For more complicated cases I would agree, having a class that encapsulates logic that's inter-dependent is good. However I'd caution against reaching for the class approach for scenarios that can be captured by a function call. Python programmers love functions ;)
Agreed, especially if the related values occur a lot in a program and are thought of as "thing". E.g. a Point or Line class.
I do think this moves the problem of possible argument confusion to the construction of the new type. At some point the association must be expressed in code, and if the association is between two things of the same type the type system can't flag mistakes as errors. A simple data class still helps, even if it doesn't eliminate the problem, because people can more easily remember one Point(x, y) constructor than where "x" and "y" appear in dozens of different functions.
I'd recommend dataclasses over namedtuples. In almost every case the "iterability" of a namedtuple ends up not being necessary (and can be a footgun!) what you really want is a lightweight namespace with types i.e. dataclasses!
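E.g., a minimal sketch of the dataclass version:

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: float
        y: float

    p = Point(1.0, 2.0)
    p.x         # access by name
    # x, y = p  # unlike a namedtuple, unpacking raises TypeError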
I can't see the use of this in almost any code. All of the "bad" examples they gave just seemed perfectly fine, and if they make sense to the user of the library, there's no reason to disallow them.
There are rare cases that actually call for keyword-only and (even more rarely) positional-only arguments, but unless they are required for your usecase, please do not enforce arbitrary argument types on your users.
The "encoding" argument seems like the perfect use case for a strict type, ideally something that works like a sum type, or union types in TS. I think the difference is that in Python unions are unions of types, and not of values. I think "typing.Literal" would be what you can use here in Python.
Depending on the approach that can work, although there are cases where trying to make encoding a literal would be too restrictive (imagine a system that dynamically loads parser plugins, which isn't as uncommon as it sounds): a literal restricted to ascii and utf-8 would prevent someone from parsing latin-1 or, idk, protobuf bytes.
That's a good point. I can see it working with static typing by being more liberal in what you accept as encoding, and accepting a validation function that would take the type of the encoding (for example, string) and return a boolean. But you're right, the type itself of the encoding will have to be wider.
Thank you for sharing. I wish this were the default behavior of positional and keyword arguments rather than requiring a `/, *` in every function definition.
For example, this doesn't do what people expect:
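    def foo(bar=Bar()):  # Bar() runs once, when foo is defined, so every call shares one Bar
        ...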
Because this creates exactly one Bar at function definition time, not a Bar per function call. Instead, what people usually want is:
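    def foo(bar=None):
        if bar is None:
            bar = Bar()  # build a fresh Bar for each call that omits the argument
        ...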
I hate having to have a block of `if bar is None` at the top of my functions. I wonder if there's a library that helps make creating functions with default args easier? Or any tips people have?