Asterisks in Python (treyhunner.com)
222 points by ingve 5 months ago | 106 comments

This is one of those areas where I find Python a little contradictory. About 50% of the time it's "explicit is better than implicit" and "there should be one and only one good way" and then the other half of the time it's "here's this cool feature for doing something you could do a different way that looks like hieroglyphics and nobody understands but you should totally use it because it's awesome!"

> This is one of those areas where I find Python a little contradictory.

It's not contradictory. You should never use args and kwargs unless you need them. Usually they're only needed in relatively rare situations, like A) when you're building a dictionary dynamically without knowing what the key names will be in advance B) when building a library where you want to allow people to subclass a parent class, but also reserve the right to change method signatures in the parent class without breaking the user code.

If developers are just using args and kwargs when calling one normal function from another, then you absolutely should not merge their pull requests until this gets removed.
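
Case B above can be sketched roughly as follows. The class names `Base` and `Child` and their parameters are hypothetical, chosen only for illustration; this is one way to do the forwarding, not the only one:

```python
class Base:
    # Imagine this lives in a library whose signature may grow over time.
    def __init__(self, name, verbose=False):
        self.name = name
        self.verbose = verbose


class Child(Base):
    # User code: consume the one argument we care about and forward
    # everything else untouched, so new Base parameters keep working.
    def __init__(self, colour, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.colour = colour


c = Child("red", "widget", verbose=True)
print(c.colour, c.name, c.verbose)
```

If `Base.__init__` later gains another optional parameter, `Child` accepts and forwards it without being edited.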

They absolutely do solve a problem. But I can't help but wonder if it would have been possible to find a way to let people solve that problem in a more explicit manner. The specific solution the Python team chose smells to me like one that was optimized for the convenience of the language's implementors more so than its users.

This is one of the things that gets me about Python.

It makes this big noise about being a super-friendly form of executable pseudocode, but then you open any code example and the first thing you see is two asterisks and the mysterious word "kwargs" (a Swedish dessert perhaps?).

I wouldn't mind except for all the haughty pretense about Python being a language that doesn't do this sort of thing. Dear Python, get a grip, you're just another ASCII-abusing interpreted monstrosity like Perl, you just decided to make programmer life suck with whitespace instead.

Key word arguments. And I would consider the pythonic approach to be more pragmatic than anything. The language isn't haughty, it just does what it's told. That being said, you've provided a perfect example of what haughty looks like.

In the parent’s defense, people write a lot more cryptic Python than they do other languages. There are no metaclasses in Go nor abused operator overloads or half-baked DSLs. People don’t use dictionaries as structs to hang off whatever they like.

It’s kind of a penny wise and pound foolish approach. Arguably it’s not python’s fault, but that’s a poor consolation for folks who have to deal with these messes.

Is that really true? Sure you can write cryptic code using operator overloads, but the convention is to only use them in cases where they improve readability. For example numpy code would be a lot less maintainable without operator overloading IMHO.

I haven't written numerical code in Go but I am skeptical it will be as maintainable without operators. The lack of a particular feature does not in itself make code more maintainable.

I can't speak to metaclasses, since I have never had to maintain code using them. In my experience they are pretty rare.

It's been true in my experience. Operator overloads are marginally beneficial and for some reason they're prone to be abused. Put differently, `Add(x, y)` isn't any less maintainable than `x + y`, but the former is pretty much never abused, while the latter is a popular abuse target.

I've been writing Python and Go extensively for the last 10 and 5 years respectively. I really recommend giving Go a shot; you can pick it up in an afternoon and it really complements Python well since it excels at a lot of Python's weaknesses (parallelism, performance, static typing, static compilation, dependency management, etc). For example, if I have to write a CLI tool, I almost always use Go since it's so much easier for coworkers to install a single binary than a Python program + dependencies + make sure you have the right version of the interpreter installed.

Oh I know Go, just haven't used it for numerical tasks.

If Add(x, y) is just as maintainable as x + y, why does Go use operators for the regular numeric types? Why have + at all?

Aside: Swedish does have the word 'kvarg', which is a product made out of sour milk. Some people eat it for breakfast but personally I can't stand it.

This is legitimately my favourite comment of the day.

Funny, in Dutch we call it kwark

At least the kwargs word is googleable. So it is relatively easy to find documentation. Unlike list comprehensions which are really hard to google until you know the name.

"for loop inside list python" seems to find a stackoverflow thread explaining them. It took me quite a while to bother trying that when I was first learning python though.

From the Zen of Python:

> (...) Although practicality beats purity.


Essentially, then, expressiveness (language complexity for the sake of flexibility), beats ease of learning or "one right way to do things".

I don't really think so. It's expanding a "language-level" iterable (list, etc) into a "syntactic-level" comma-separated list of those values. `**` does the same but turns dictionaries into syntactic `name=value, ...` constructs. It's a great little feature with loads of handy uses!
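
A minimal sketch of that expansion, with a made-up function `describe` standing in for any callable:

```python
def describe(a, b, c=0):
    return f"a={a} b={b} c={c}"


values = [1, 2]
options = {"c": 3}

# *values expands to the positional arguments 1, 2;
# **options expands to the keyword argument c=3.
unpacked = describe(*values, **options)
explicit = describe(1, 2, c=3)
print(unpacked == explicit)
```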

True, Python allows a lot of crazy hacks like monkeypatching, dynamically creating classes and so on. It is there if you need it. It is not like in Java where those things are simply impossible. The Python culture on the other hand discourages obscure hacks and unreadable cleverness, and the language is generally designed to encourage the straightforward and readable solution. I don't really see a contradiction in that.

You cannot prevent people from doing stupid things anyway. What you can do is encourage good style, and making the "right way" the path of least resistance. You can write readable code in any language, even Perl, if you use enough effort and discipline, but Python tries to make readable code the easy default.

The "quote" you made up is not representative of the Python culture or the article here, although I'm sure you can find some blogger somewhere with that attitude. They seem to have mostly migrated to other more hip languages though.

> The Python culture on the other hand discourages obscure hacks and unreadable cleverness

I guess Pandas, Matplotlib, SQLAlchemy, etc, etc didn’t get the memo.

GP's quotes are paraphrased from PEP 20 and I think _are_ part of Python culture, at least as I've experienced it.


I'm afraid you've misunderstood olavk's point. 'The "quote" you made up' certainly refers to

> "here's this cool feature for doing something you could do a different way that looks like hieroglyphics and nobody understands but you should totally use it because it's awesome!"

not to

> "explicit is better than implicit" and "there should be one and only one good way"

Yeah that is what I meant, sorry for being unclear.

Doh, I think you're right!

You might be surprised at how many of those things are possible in a language like Java. Class loader and dynamic proxy tricks add a lot of power to the language.

I feel you, however I also love https://github.com/Knio/dominate - a Python library to generate HTML.

Generating HTML always felt awkward for me in many languages. From PHP where you sprinkle PHP between HTML or HTML between PHP, to frameworks in various languages where you populate variables and HTML is magically generated.

Somewhere in between there are template languages.

"Dominate" uses Python itself as the template language - this means there never is that "disconnect" where you have to validate your output, neither is it that disconnect coming from generators, nor do you have to learn and use a separate template language. Dominate can look like this:

    return hr(), p(
        a('Start',  # link text assumed; the original lines were lost
            href = '/start'),
        style = 'text-align: center;'
    )

If the Python is valid, so is the HTML.

For me it hit a sweet spot. Dominate uses kwargs. I discovered this and (a?)bused kwargs when I wanted to add some tags of my own.

Oooh, that's nice. Thanks for sharing this. I like how it merges your mental model for HTML with Python in what feels like a very unsurprising way.


And regarding the kwargs I had this feeling first of oh no "what is this!?"

I have been programming Python since 2000 and I never used kwargs before. I knew they were there somewhere but I never bothered with them.

I was at first a bit annoyed, but I really wanted to add my own tags and discovered it was actually quite nice if I could intercept arguments in my own tags. You don't have to use kwargs to add your own tags, but you can make the tags more powerful and composable that way.

But kwargs are really only worth it to me if there is a strong component of, "write once, use many times" to it.

You don't have to use kwargs to add your own tags to Dominate, you can just return tuples of building stone tags if you want to.

Which part do you think is inexplicit? What do you think is the equally clear different way of doing it?

How much Python experience do you have? I find that in places where it is applicable, this syntax is clear, unambiguous, easy to read, overall much nicer than alternatives.

The Zen of Python clearly states:

    There should be one-- and preferably only one --obvious way to do it.
I think that "obvious" is the key word here -- it says nothing about the non-obvious ways, nor about which ones are better.

The problem here is not with Python itself, but with people looking for a silver bullet which will solve all their problems. In Python, this is presently very obvious with async programming, and people insisting on using it for everything; however, async is useful for only a certain subset of problems one might encounter (and in a few cases it's incredibly useful) while in most situations it adds avoidable complexity. However, this approach is definitely not limited to Python, and every language has its own examples (classes vs prototypes in Javascript is one example).

My advice is: use what you're comfortable with; don't be afraid of learning new techniques, but don't feel obligated to use them at all costs; ignore the Kool Kidz™ pushing fads.

You are right, but sometimes, I don't see how you can easily achieve the same task without using the asterisk. For instance, if you want to plot a list of (x,y) coordinates which is likely to happen quite often, you can do :

What do you usually do? (Probably creating two intermediary lists with a loop?)
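
The snippet for the plotting case did not survive here. A minimal sketch of what was presumably meant, assuming the usual `zip(*...)` idiom and a made-up list `points`:

```python
points = [(0, 1), (1, 3), (2, 2)]

# zip(*points) passes each pair as a separate argument to zip,
# which regroups them into one tuple of xs and one tuple of ys.
xs, ys = zip(*points)
print(xs, ys)

# With matplotlib this would presumably be written as
# plt.plot(*zip(*points)), i.e. plt.plot(xs, ys).
```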

How would you approach decorating /any/ function without args, kwargs or something very similar?

  def decorator(f):
      def new_f(*args, **kwargs):
          return f(*args, **kwargs)
      return new_f
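
A slightly fuller sketch of the same pattern, showing one wrapper applied to functions with different signatures. The names `logged`, `add` and `greet` are invented for the example, and `functools.wraps` is an addition that the snippet above does not use:

```python
import functools


def logged(f):
    calls = []

    @functools.wraps(f)  # keep f's name and docstring on the wrapper
    def new_f(*args, **kwargs):
        # Record the call, then forward every argument unchanged.
        calls.append((args, kwargs))
        return f(*args, **kwargs)

    new_f.calls = calls
    return new_f


@logged
def add(a, b=0):
    return a + b


@logged
def greet(*, name):
    return "hello " + name


print(add(1, b=2), greet(name="world"))
```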

It wouldn't be HN if the top comment weren't critical of asterisks in Python.

I've been programming with Python for over a decade. I mostly understand them, but I do try to avoid them when possible for maximum clarity. Expanding function variables is fine and clear enough, but multiple levels deep in a comprehension and it can get pretty thick to try and keep it all straight. This article is nice in that it covers all the patterns I've seen.

Honest question: why would anyone make a list comprehension that's multiple levels deep? Isn't the purpose of comprehensions to provide quick-and-dirty inline for loops, where a full for loop is too verbose?

Multiple for loops in list comprehensions (if that's what's meant by "multi-level") are pretty straightforward once you realise the simple rule that translates them into loops: write everything to the right of the expression in the same order with nesting. For example this:

    l = [f(x, y) for x in X for y in Y if x > y]
is the same as this:

    l = []
    for x in X:
        for y in Y:
            if x > y:
                l.append(f(x, y))
If a list comprehension doesn't fit on one line I try to highlight this to any future reader by using one line per thing that would get nested:

    l = [
        f(x, y)
        for x in X
        for y in Y
        if x > y
    ]
Needless to say, there are still list comprehensions that are complex enough that they ought to be broken out into loops. But being nested isn't enough by itself.

People who come from math backgrounds often become infatuated with complicated list comprehensions when they realize that they can be used as set-builder notation: {(x,y) for x in range(10) for y in range(10) if y == 2*x}

I use stuff like that. But my rule of thumb is that I try and make it readable by breaking it up into multiple lines and using indentation. If I can't make it readable, then I will consider another way of writing it. This is mainly for self preservation, since I'm likely to be the person who has to read it later.

That looks exactly like my notes from math courses and I'm overjoyed that it's meaningful in Python.

Comprehensions have some other benefits too, for example, they're much less likely to be accidentally stateful. My general experience is that for complex logic, a comprehension is harder to write initially, but much less likely to have bugs when I get it right.

That said, I usually won't use deeply nested comprehensions, but sometimes it actually is the clearest way to parse something (e.g. extracting a field from nested JSON.)

For me it’s very rare and used where I would write something like concatMap in Haskell.

    citiesByState = {
        'WA': ['Seattle', 'Spokane', 'Olympia'],
        'OR': ['Portland', 'Salem'],
        'CA': ['San Francisco', 'Los Angeles', 'San Diego'],
    }
    cityToState = {city: state
                   for state, cities in citiesByState.items()
                   for city in cities}
I would generally avoid multiple levels in a comprehension but there are some very simple two-level cases like this that I use occasionally. I feel that the code is easy to read, at least.

In Haskell, these direct equivalents aren’t bad at all:

    cityToState = Map.fromList
      [ (city, state)
      | (state, cities) <- Map.toList citiesByState
      , city <- cities
      ]

    cityToState = Map.fromList $ do
      (state, cities) <- Map.toList citiesByState
      city <- cities
      pure (city, state)
For multiple “nested loops”, I generally prefer do-notation over both list comprehensions and combinators such as concatMap, unless the structure is simple enough that the combinator version is much shorter.

And for those following who don’t know Haskell, both of these are basically sugar for concatMap.

(>>=) = flip concatMap

While supported, list comprehensions that are multiple levels deep aren't considered good style. There is an example of this in Google's Python style guide: https://github.com/google/styleguide/blob/gh-pages/pyguide.m...

It used to be for performance reasons: the way Python runs comprehensions is faster than plain loops, so the more you could cram into a single comprehension, the faster it ran.

These days though, you can split it into generator expressions and make it run at the end inside a comprehension, and you get the best of both worlds: splitting helps readability, while running it all at once in a single comprehension at the end keeps it performant.
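
A minimal sketch of that splitting, with made-up stage names. It only illustrates the structure and makes no claim about actual timings:

```python
numbers = range(10)

# Each stage is a lazy generator expression; nothing runs yet.
evens = (n for n in numbers if n % 2 == 0)
squares = (n * n for n in evens)

# The final comprehension pulls values through every stage in one pass.
result = [s + 1 for s in squares]
print(result)
```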

It's useful for working with permutations. E.g,

    [f(a, b) for a in some_list for b in some_list]
or even

    [f(a, b) for a in some_list for b in some_list if a is not b]

itertools has functions that make that sort of nesting unnecessary (and some would say make the code clearer), eg:

    import itertools as it
    [f(a, b) for a, b in it.product(some_list, some_list) if a is not b]

Oh, or more directly:

    [f(a, b) for a, b in it.permutations(some_list, r=2)]
And then you could also use itertools.starmap instead of using a comprehension at all, if you wanted.
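
A sketch of the `starmap` variant next to the comprehension, assuming a placeholder `f` and a small `some_list`:

```python
import itertools as it


def f(a, b):
    return a - b


some_list = [1, 2, 3]

with_comprehension = [f(a, b) for a, b in it.permutations(some_list, r=2)]
# starmap unpacks each (a, b) tuple into f(a, b), like f(*pair).
with_starmap = list(it.starmap(f, it.permutations(some_list, r=2)))
print(with_comprehension == with_starmap)
```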

I never understand why people feel the need to find shortcuts. It’s like reading smthg[i++] in C. It’s not clear and it could be clearer if written over two lines instead but yet everyone does it.

If "everyone does it", then you should treat it as clear and just part of writing the language idiomatically. Pick your battles for when even the cognoscenti favour the more verbose approach.

It doesn't help people not too familiar with the language.

It's perfectly clear and unambiguous. For some reason everyone loves it when functional languages are 'pithy' but hates it when C-like languages are equally terse.


Smthg[i++] and smthg[++i] have two different behaviors. If you tell me that this is clear I probably don't want to read your code.

They're two different things. Are you going to complain that a/b and b/a give different results?

You're comparing mathematical terms that have been established since before your grandfather was born with a peculiarity of the C language that confuses more people than it helps. People willing to obtain a small reduction in LOC at the cost of clarity deserve neither.

Unrelated to the article: the 'x' on the newsletter popup doesn't work on mobile. Makes the article very difficult to read. If the author is reading this, you should fix that!

Thanks for reporting this issue!

> print(*more_numbers, sep=', ')

This alone makes me appreciate print as a function in py3.

Now how much nicer would it be if even more things were functions, rather than special syntax like here? This can be done with an `apply`.

I think the star syntax is a lot nicer than `apply`. Python actually had an `apply` builtin, but it was deprecated in Python 2.3.

Apply, and first class / higher order functions in general are good in languages where they compose well. Python's * won't compose well.

To add a small footnote to the existing replies to this comment: A related, but slightly different, function in the `functools` module (part of the Python standard library) is the `partial` function. It is like apply() but returns a function instead of calling the function and returning its return value:

    from functools import partial
    f1 = partial(f, 1, 2)
    f2 = partial(f1, 3, 4)
    f2(5, 6)  # equivalent to f(1, 2, 3, 4, 5, 6)
It also supports kwargs.
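
A small sketch of the keyword-argument side, using a hypothetical `f` that joins its arguments with a separator:

```python
from functools import partial


def f(*args, sep="-"):
    return sep.join(str(a) for a in args)


f1 = partial(f, 1, 2)
f2 = partial(f1, 3, 4, sep="+")  # keyword arguments are stored too

print(f2(5, 6))
print(f2(5, 6, sep="/"))  # a keyword given at call time overrides the stored one
```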

    def apply(f, a):
        return f(*a)

How would * not compose as well as apply?

It is a function - print() is being called as a function, taking a number of positional arguments followed by a named argument. I don't think you're correct about this being special syntax, except that the '*' is being used to expand a list into the positional arguments.

I believe they meant replacing the special * syntax with a use of apply so that there was no special syntax to achieve that functionality at all.

I'm not the sharpest functional programmer in the cohort, but isn't `apply` a parallelizable for-loop?

    apply(l, print)
is the same as

    for el in l:
        print(el)
which is a number of print statements, rather than 1 print statement with a number of arguments.

The single * way to specify keyword only arguments is a total monstrosity.

It's a logical and consistent outgrowth of the existence of *args, and having (finally) allowed parameters after that (which would be keyword-only). I'm sure you could make up other ways to do it, but not that they'd be any better.
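
Both forms can be sketched side by side; the function names here are made up for illustration:

```python
def with_varargs(*args, key=None):
    # Everything after *args can only be passed by keyword.
    return args, key


def with_bare_star(iterable, *, key=None):
    # A bare * collects nothing; it only marks key as keyword-only.
    return iterable, key


print(with_varargs(1, 2, key=3))
print(with_bare_star([1, 2], key=3))

try:
    with_bare_star([1, 2], 3)  # key cannot be passed positionally
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```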

What's with the / in the help for sorted?

    Help on built-in function sorted in module builtins:

    sorted(iterable, /, *, key=None, reverse=False)

It looks like it's a semi-internal(?) API notation for position-only fields.

That is, you cannot say sorted(iterable=range(10)) because 'iterable' is a position-only parameter.

Some pointers. https://www.python.org/dev/peps/pep-0436/

> / Establishes that all the proceeding arguments are positional-only. For now, Argument Clinic does not support functions with both positional-only and non-positional-only arguments. ... (The semantics of / follow a syntax for positional-only parameters in Python once proposed by Guido. [5] )

https://www.python.org/dev/peps/pep-0457/ ("PEP 457 -- Syntax For Positional-Only Parameters" - draft)

> All parameters before the / are positional-only. If / is not specified in a function signature, that function does not accept any positional-only parameters.

This is only for documenting API signatures. There is no way to implement it directly in Python code. E.g., that PEP goes on to say:

> This PEP does not propose we implement positional-only parameters in Python. The goal of this PEP is simply to define the syntax, so that:

> - Documentation can clearly, unambiguously, and consistently express exactly how the arguments for a function will be interpreted.

> - The syntax is reserved for future use, in case the community decides someday to add positional-only parameters to the language.

See also https://bugs.python.org/issue21314 , "Add support for partial keyword arguments in extension functions"

It's a documentation-only notation (currently; I remember seeing posts on python-dev about adding it as a Python-level construct, but I don't believe that's been implemented) for positional-only parameters, a feature currently only available to "native" functions.

Basically, at the C level you can define positional-only parameters such that sorted can not be called as `sorted(iterable=xxx)`, and this was specified using a "/" in argument clinic (https://www.python.org/dev/peps/pep-0436/#special-syntax-for...) and now appears in some docstrings. This is pretty common in builtins e.g. you can't give a name to the first argument of `max`, it doesn't have one (except informally).

Anything before the / is positional-only (the name is informational but not definitional), between / and * it's both positional and keyword, and after * (or *args) it's keyword-only.

Python has a bit of a wart in that arguments can normally be passed as keyword or positional. Python 3 also added the ability to pass keyword-only args (the ones after the `*`). That is,

    def f(a, b):
        return a+b
can be called as `f(1, 2)` or `f(b=2, a=1)`, although I'd look at you funny if you did the second.

Certain functions implemented in C also have positional only args, so the function of signature

    weird(p_only, /, pos, kw=None, *, kw_only=None)
can be called in the following ways (1 is p_only, 2 is pos, 3 is kw, 4 is kw_only)

    weird(1, 2, 3)
    weird(1, 2, 3, kw_only=4)
    weird(1, pos=2, kw=3)
    weird(1, pos=2)
    weird(1, 2, kw_only=4)
but not

    weird(p_only=1, 2)
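
On Python 3.8 and later the `/` marker is also accepted in ordinary `def` statements (PEP 570), so the signature above can be tried directly. A minimal sketch, assuming such an interpreter; the body of `weird` is a placeholder that just echoes its arguments:

```python
def weird(p_only, /, pos, kw=None, *, kw_only=None):
    return p_only, pos, kw, kw_only


print(weird(1, 2, 3))
print(weird(1, 2, 3, kw_only=4))
print(weird(1, pos=2, kw=3))
print(weird(1, pos=2))
print(weird(1, 2, kw_only=4))

try:
    weird(p_only=1, pos=2)  # p_only cannot be passed by name
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```

The last call uses `pos=2` rather than a bare `2`, because a positional argument after a keyword argument is rejected by the parser before the function is ever called.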

That is curious, it looks like it might just be a typo in the documentation. Since it is written in C and not Python, it is not automatically generated. But I don't know enough C to say for sure, maybe someone else can.


Nope. It represents positional-only arguments. You can have positional-only arguments for C functions but not Python functions. Just a weird Python wart.

my favorite cheat is using locals() with string formatting something like this:

  def example_cheater(color, flavor, age):
      template = 'I am {age} years old and I like to wear {color} hats and eat {flavor} icecream'
      return template.format(**locals())

Obviously a contrived example, and it can be argued that using locals() is not very pythonic, but I think it makes the code look much nicer.

In python 3.6 and up, that becomes

    def example_cheater(color, flavor, age):
      return f'I am {age} years old and I like to wear {color} hats and eat {flavor} icecream'
Even nicer!

OMG! that is so exciting, Thank you for telling me! I have only been using 3.x as my primary language for a few months now, I can't wait to try this out. I feel like a huge dork right now, because that made my day.

For reference, this feature is Formatted String Literals ("f-strings"). https://docs.python.org/3/reference/lexical_analysis.html#fo...

Of course I didn't read your comment until several Google searches later.


    >>> pi = 3.14159; two = 2
    >>> f"{two:02d} pi is {2*pi:.2f}"
    '02 pi is 6.28'

Didn't know this. Thanks for sharing!

Does this trip up pylint?

Trips up Vim's syntax highlighter.

Hasn’t for me.

Taken straight out of Ruby, I see. Very convenient feature (and one of my favorites about Ruby), though it does contradict the do-it-one-way mentality.

According to Rosetta Code Algol 68 had string interpolation. [1]

It'd be interesting to see if there are any older languages that supported it too.


>though it does contradict the do-it-one-way mentality.

not really: use f-strings when you're passing in variables as-is (or nearly no work), and format() when you need to do work on them before stringifying them; f-strings are naturally (and obviously) more difficult to read when the variables are big, as it obscures the actual text they're being fit into, and where.

The only natural area for preference to apply is whether to do the work before the format() call, name the variables, and change it to an f-string.. or stick with a multi-line format()

I'm not sure there's any real situation where the choice isn't obvious. Maybe if you're doing something like f"list1: {sorted(a)}\nlist2: {sorted(b)}", where the work is rather small, but even then "list1: {}\nlist2: {}".format(sorted(a), sorted(b)) is just as nice, or rather unsatisfying, as the f-string.
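
The two spellings from that last example, written out with sample lists `a` and `b` (made up here) so they can be compared:

```python
a = [3, 1, 2]
b = [9, 7, 8]

# f-string: expressions are evaluated inline, where the text is written.
with_fstring = f"list1: {sorted(a)}\nlist2: {sorted(b)}"

# str.format: a plain template with the values supplied afterwards.
with_format = "list1: {}\nlist2: {}".format(sorted(a), sorted(b))

print(with_fstring == with_format)
```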

> not really:

Yes it does. You could construct the string using a for-loop. Thank you for proving my point and then down-voting me.

...if you’re going down that route, you could claim that any feature not necessary for turing completeness conflicts with the one way to do it strategy, since you could alternatively build up whatever feature from scratch, thus offering additional solutions

But practically, there remains (close to) one obvious way to do things, with the choice boiling down to whether or not to name your outputs before string construction (and the choice of f-string vs format naturally falling out). But that's always been a choice.

On a side note: I don’t use comment voting systems, for up or down votes, on pretty much any site including HN. Even if I did, I don’t see why you’d care

> if you’re going down that route, you could claim that any feature not necessary for turing completeness conflicts with the one way to do it strategy

I suppose one could do that, philosophically if not practically. What I had in mind was the various syntax sugars used for the same thing, particularly in Ruby. While some ways of doing things in Ruby were convenient, there was a host of other "shortcuts" that added to the confusion, and it was evident that it was the design of the language itself - not an accident - that allowed these shortcuts, at least imo.

> On a side note: I don’t use comment voting systems

I had a feeling you might say that, but I'd already clicked the "reply" button. Sorry for being presumptuous.

Not that I care about the voting system either, but people tend to use it as a cowardly way of showing disapproval.

f-strings are also unusable for i18n.
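
The likely reasoning: an f-string is evaluated right where it appears in the source, so there is no template string left to swap for a translated one, whereas a `str.format` template is ordinary data. A rough sketch, with a hypothetical `TRANSLATIONS` table standing in for a real catalogue such as gettext:

```python
# Hypothetical translation table; real projects would typically use gettext.
TRANSLATIONS = {
    "en": "Hello, {name}!",
    "nl": "Hallo, {name}!",
}


def greet(name, lang):
    # The template is chosen at runtime, then filled in with str.format.
    return TRANSLATIONS[lang].format(name=name)


print(greet("Ada", "nl"))
```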

My favorite unpacking snippet is transposing a matrix

    >>> matrix = [(1,2,3),(4,5,6),(7,8,9)]
    >>> list(zip(*matrix))
    [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

Not as readable as a real maths library, but pretty cool and educational.

Think I came across it in Python in a Nutshell, by Alex Martelli

Edit: formatting
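
Strictly speaking, `zip(*matrix)` swaps rows and columns; reversing the rows first turns that into a clockwise quarter turn. A short sketch of both, using the same matrix:

```python
matrix = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Rows become columns.
transposed = list(zip(*matrix))

# Reverse the row order first, then transpose: a clockwise rotation.
rotated_clockwise = list(zip(*matrix[::-1]))

print(transposed)
print(rotated_clockwise)
```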

My frustration with locals() and globals() is the same as with "from foo import *": it's really difficult to move that code around or clean up variables because their use is obscured. The worst is when your template variable is defined somewhere else.

Calling a Python standard library function "not pythonic" is a great example of when the term "pythonic" is beyond usefulness.

So this is like the rest and spread operator in javascript (`...`).

An interesting observation is that while Python has separate operators for lists (`*`) and dictionaries (`**`), JavaScript has only `...`.

> You need to be careful when using `**` multiple times though. Functions in Python can’t have the same keyword argument specified multiple times, so the keys in each dictionary used with `**` must be distinct or an exception will be raised.

This is not an issue in javascript, the unpacked object with repeated keys takes the value of the last one.

    {...{a: 1}, ...{a: 2}}
    // => {a: 2}
edit: formatting

I suppose both observations are reminiscent of the fact that Python has language-level support for named parameters. Given that named and sequential parameters are treated differently everywhere in Python (sequential params must be provided, keyword args are optional, amongst other things), it would be extremely surprising behaviour if the run-time type of the value being spread determined what kind of spread happened.

In JavaScript, the surrounding syntactical context solely decides if it's an object or array splat. But since Python has 2 kinds of parameters, to preserve obviousness when reading code using splats, they went with 2 operators for each parameter kind, and then separated list and dictionary splat analogously.

> it would be extremely surprising behaviour if the run-time type of the value being spread determined what kind of spread happened.

The other reason to distinguish * and double-* is because * works with any iterable, and dicts are themselves iterables (of their keys). So you can actually use * on dicts, it just does something different:

   >>> d = {'a': 1, 'b' : 2}
   >>> [*d]
   ['a', 'b']
You could arguably say that only * makes sense here since it's a list context. But then, as you say, function calls would still be ambiguous because they support both positional and named arguments. This keeps it all unambiguously consistent in all contexts.
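
The same difference shows up in a function call. A small sketch with a made-up function `show` that just returns what it received:

```python
d = {"a": 1, "b": 2}


def show(*args, **kwargs):
    return args, kwargs


print(show(*d))   # keys become positional arguments
print(show(**d))  # items become keyword arguments
```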

Yes, the Python feature is much older. From what I can tell, Python inspired Ruby, Ruby inspired CoffeeScript, and CoffeeScript inspired JavaScript to adopt this feature.

It's also not an issue in Python's dictionary literal syntax:

>>> {{'a': 1}, {'a': 2}} {'a': 2}

So the multiple keyword argument problem isn't an inherent issue with so much as it's a problem with keyword arguments being specified twice (which has been a restriction since before they allowed multiple to be used).
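
A minimal sketch of that distinction, assuming the literal above was meant to unpack both dictionaries with `**`; the function `f` is a placeholder:

```python
# In a dict literal, a repeated key is fine: the later value wins.
merged = {**{"a": 1}, **{"a": 2}}
print(merged)


def f(**kwargs):
    return kwargs


try:
    f(**{"a": 1}, **{"a": 2})  # same keyword supplied twice
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```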

Same with Golang:

    A := []int{1,2}
    B := append(A, A...)
    > [1, 2, 1, 2]

Golang has no equivalent to double-* since it doesn't do keyword arguments.

Neither does javascript, "keyword expansion" only occurs for object literals (which is why it's unambiguous: objects can't be created from sequences and don't implement the iteration protocol, so expansion within an object context can easily be special-cased).

Looks like you lost some characters to HN's pseudo-markdown.

Side note... Why does everything needs to be bold on that page? It made reading it very difficult.

I'm curious about what browser you're using.

Side note: I typically bold many sentences in my articles, but this is the first article in a while where I actually bolded nearly no words at all. I wonder what my other articles look like in your browser.

It's... not?

It looks like this: https://i.imgur.com/AfEnDf0.png

And this is what it would look like if it was bold. I fiddled in the devtools to make this: https://i.imgur.com/zQQR1gA.png
