Dictionary union (PEP 584) is merged

fyp · on Feb 25, 2020

  {**d1, **d2}

is very natural if you also write JavaScript, where the spread operator looks like:

  {...d1, ...d2}

whalesalad · on Feb 25, 2020

This is reason enough to upgrade from Python 2.7 if you are still on it. I use this convenience almost daily.

TylerE · on Feb 26, 2020

No it isn’t.

tempay · on Feb 26, 2020

One of the issues with Python 3 is that there isn’t really one killer feature, it’s countless little ones. Many believe 3.6 was the first release where they added up to enough of a benefit (though f-strings are a big help for many projects).

Regardless it no longer matters, the Python 2 ecosystem is now rotting as packages drop support. Every week I have to make one or two hot fixes somewhere to forcibly pin to an old version to fix something.

edudobay · on Feb 26, 2020

For me, proper handling and distinction of Unicode vs. binary data was a game changer. I don't know if that's related to my first language being non-English, but I remember it being really important to me and a strong reason why I made the switch years ago.

heavenlyblue · on Feb 26, 2020

I used to work for a huge SEM business working with keywords in all languages of the world. No-brainer.

martopix · on Feb 26, 2020

I would definitely mention integer division as a massive change in py3 too. In my field, the @ multiplication operator is also useful.

teddyh · on Feb 26, 2020

> I would definitely mention integer division as a massive change in py3 too.

To be fair, you can get that in Python 2 too:

  from __future__ import division
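As a small illustration of the division change being discussed (the variable names are made up): on Python 3 the behaviour below is the default, while on Python 2 the `/` line only gives 3.5 with the `__future__` import shown above.

```python
# On Python 3, "/" is always true division and "//" is floor division.
true_quotient = 7 / 2       # 3.5
floor_quotient = 7 // 2     # 3
negative_floor = -7 // 2    # -4, because floor division rounds toward negative infinity
```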

modal-soul · on Feb 26, 2020

Even more natural if you write Ruby, which uses an identical syntax.

devy · on Feb 26, 2020

So PEP 584 is a JavaScript inspired feature request?

beanpup_py · on Feb 26, 2020

No, it adds a new merge (|) and update (|=) operator to avoid using

    {**d1, **d2}

because it apparently looks ugly[1].

[1]: https://www.python.org/dev/peps/pep-0584/#d1-d2
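For reference, a minimal sketch of the two operators, assuming Python 3.9 or later; the dictionary names and contents are invented for illustration.

```python
# Requires Python 3.9+ (the release where PEP 584 landed).
defaults = {"host": "localhost", "port": 8080}   # hypothetical example data
overrides = {"port": 9000}

# Merge: builds a new dict and leaves both operands untouched.
merged = defaults | overrides

# Update: modifies the left-hand dict in place, like dict.update().
settings = dict(defaults)
settings |= overrides

# The older spelling gives the same result for plain dicts.
unpacked = {**defaults, **overrides}
```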

privateSFacct · on Feb 26, 2020

And in doing so creates multiple ways to do the same thing which is SUPER SUPER annoying.

There has been a recent effort to add all these new operators that don't actually let you do anything you couldn't but now you can confuse everyone by doing it in other ways.

kzrdude · on Feb 26, 2020

Python doesn't use formal interfaces. This makes dicts implement unions in a duck typeable way - the operators.

py3001 · on Feb 26, 2020

instead of actually useful features like dict deconstruction

banannaise · on Feb 27, 2020

The bigger reasons are discoverability (as noted in the section you linked) and the ever-vague notion of Pythonic-ness.

{**d1, **d2} is not intuitive to a primarily-Python developer, and looks nothing like typical Python. The dict unpacking operator it uses is almost never seen outside function arguments.

kam · on Feb 25, 2020

The PEP document describing the feature: https://www.python.org/dev/peps/pep-0584/

pansa2 · on Feb 25, 2020

> > Dict union will violate the Only One Way koan from the Zen.

> There is no such koan. "Only One Way" is a calumny about Python originating long ago from the Perl community.

wwright · on Feb 26, 2020

For what it is worth, from https://www.python.org/dev/peps/pep-0020/:

> There should be one-- and preferably only one --obvious way to do it.

Personally, I’ve always thought that Python missed its Zen, both here and on “explicit is better than implicit.”

ehsankia · on Feb 26, 2020

That ship had long sailed with string formatting anyways.

orf · on Feb 26, 2020

F-strings are the one obvious way to do string formatting. There may be other ways, for legacy backwards compatibility reasons, but f-strings are the way to do string formatting.

kzrdude · on Feb 26, 2020

I think you are right. Just left a string of legacy (%-formatting) and dead end (.format()) solutions on the way there.

AlexandrB · on Feb 29, 2020

I think F-strings are bad for i18n. It sucks to use them with a database of string localizations because the variable name is now embedded in dozens of translated F-strings and basically becomes immutable.

mixmastamyk · on Feb 26, 2020

Yes, unless a template is needed later. That's why the others continue to exist.
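A rough sketch of the "template needed later" case, assuming a hypothetical translation table (the `TEMPLATES` dict, the `greet` function and the strings are all invented here). The template is picked at runtime, so it cannot be written as an f-string literal.

```python
# Hypothetical translation table; keys and strings are invented for illustration.
TEMPLATES = {
    "en": "Hello, {name}! You have {count} new messages.",
    "de": "Hallo, {name}! Sie haben {count} neue Nachrichten.",
}

def greet(lang, name, count):
    # The template is looked up first and filled in afterwards with str.format().
    return TEMPLATES[lang].format(name=name, count=count)

english = greet("en", "Ada", 3)
german = greet("de", "Ada", 3)
```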

cochne · on Feb 26, 2020

I disagree. In logging for example, ‘%s’ with the value as an argument to the logger is preferred because the formatting can be ignored if the log level is not sufficient to print.
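A small sketch of the difference, using a made-up `Expensive` class that counts how often it is turned into a string, and a hypothetical logger name. With `%s` and an argument, the formatting is skipped when the level is too low; with an f-string, the string is built before the logging call happens.

```python
import logging

class Expensive:
    """Counts how often it is converted to a string."""
    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"

logger = logging.getLogger("lazy_demo")   # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.addHandler(logging.NullHandler())
logger.propagate = False

lazy = Expensive()
logger.debug("value: %s", lazy)    # below WARNING, so the message is never formatted

eager = Expensive()
logger.debug(f"value: {eager}")    # the f-string is built before debug() is called
```

After running this, `lazy.str_calls` is 0 and `eager.str_calls` is 1.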

heavenlyblue · on Feb 26, 2020

Except log formatting isn’t the same as actually executing the expression passed to the logging function.

orf · on Feb 26, 2020

Log formatting is a different problem than string formatting. With log formatting you pass the formatting arguments as function parameters, which is completely different from any other way you format strings.

AlexandrB · on Feb 29, 2020

I think you're getting downvoted because logging is basically doing:

    def log(fmt, *args):
        print(fmt % args)

Hardly a huge change.

saagarjha · on Feb 26, 2020

It sailed long before that.

eesmith · on Feb 26, 2020

Go back to 1.0 and we still had two ways to write strings ("abc" and 'abc'), and two ways to write not equal ("!=" and "<>").

The last died with 3.0.

But my point is that the Zen of Python must be seen as a post hoc description overlaid onto whatever the actual Python philosophy is. Aligned, certainly, but at times only roughly aligned.

So I don't see it as having sailed (with string formatting, or any other specific event) but as never having been there in the first place. More like sailing in the same waters.

bjterry · on Feb 26, 2020

    The Tao that can be told
    is not the eternal Tao

eesmith · on Feb 27, 2020

    Can the Tao be found
    where there is no Tao?

mokus · on Feb 26, 2020

It sailed with Turing completeness! At least, the popular interpretation of the phrase that ignores the word “obvious” did.

orf · on Feb 26, 2020

It sailed when they added the for loop - everyone was happy using while loops, things worked and it was simple.

Now there are TWO ways of calling a block of code repeatedly based? How confusing for new users. Python really went downhill since then.

anon102010 · on Feb 26, 2020

I think Go dropped one of these (for vs. while) to keep to just one approach to loops, so Python providing lots of ways to do the same thing is something that other languages targeting entry-level folks are seeking to avoid.

pwdisswordfish2 · on Feb 26, 2020

Ah, so that's what the migration to Python 3 was all about!

misnome · on Feb 26, 2020

Fantastic. Especially glad they went with | over +, that’s always felt like the natural way I’ve wanted to do this. Looking forward to more set-like operators in the future!

kbd · on Feb 26, 2020

Thank goodness sanity prevailed on the operator!

We had a whole discussion on HN last time[1] about this, where I argued that dicts are logically subclasses of sets and therefore should share operators.

When I saw this headline I accepted my fate of typing the "wrong" operator from now on and liking Python just a tiny bit less for the inconsistency. So glad they reconsidered.

[1] https://news.ycombinator.com/item?id=19314646

mixmastamyk · on Feb 26, 2020

Guido stated his preference for | and the PEP was changed.

kbd · on Feb 26, 2020

The PEP has this section:

> The new operators will have the same relationship to the dict.update method as the list concatenate (+) and extend (+=) operators have to list.extend. Note that this is somewhat different from the relationship that |/|= have with set.update; the authors have determined that allowing the in-place operator to accept a wider range of types (as list does) is a more useful design, and that restricting the types of the binary operator's operands (again, as list does) will help avoid silent errors caused by complicated implicit type casting on both sides.

Would someone please explain what they mean with regard to being different from set.update, and what could lead to silent errors?
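One possible reading of the quoted passage, sketched in code (Python 3.9+ assumed; the names are invented). The in-place dict operator accepts anything `dict.update` accepts, as `+=` does for lists, while the binary operator insists on a dict. Sets are stricter: both `|` and `|=` require a set, and only the `update()` method is lenient.

```python
pairs = [("b", 2)]          # an iterable of key/value pairs, not a dict

# list: += accepts any iterable, + insists on another list.
items = [1]
items += (2, 3)
try:
    [1] + (2, 3)
    list_plus_ok = True
except TypeError:
    list_plus_ok = False

# dict: |= accepts what dict.update() accepts, | insists on a dict.
d = {"a": 1}
d |= pairs
try:
    {"a": 1} | pairs
    dict_or_ok = True
except TypeError:
    dict_or_ok = False

# set: both | and |= insist on a set; only update() takes any iterable.
s = {1}
try:
    s |= [2]
    set_ior_ok = True
except TypeError:
    set_ior_ok = False
s.update([2])
```

The binary form raising `TypeError` instead of quietly converting a list of pairs is, on this reading, the kind of implicit casting the PEP authors wanted to avoid.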

seemslegit · on Feb 25, 2020

I wouldn't have thought about dict unpacking as a solution either but once suggested it seems satisfactory and I don't see how adding a new operator is more discoverable or natural than just putting this method in a more prominent place in the documentation.

Znafon · on Feb 25, 2020

Guido himself said he had forgotten about this trick and since it's syntactic sugar, it does not respect dict subclasses or other mappings.

seemslegit · on Feb 25, 2020

imo

  defaultdict(callback,{**a,**b})

is more readable than a | b without knowing that a or b are defaultdicts and having to reason about which default callback will be used
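To make the comparison concrete, a short sketch assuming Python 3.9+, where `defaultdict` supports `|` (the values are invented). The merged result takes its default factory from the left operand, whereas the explicit form states the factory directly.

```python
from collections import defaultdict

a = defaultdict(int, {"x": 1})
b = defaultdict(list, {"y": [2]})

merged = a | b                              # factory comes from the left operand
explicit = defaultdict(list, {**a, **b})    # factory is spelled out by the caller
```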

orf · on Feb 26, 2020

It’s also really expensive. This new operator works with any Mapping type without needing silly hacks.

anonymoushn · on Feb 26, 2020

Note that a | b is already a disaster if you try to use subclasses of set. In python 2.7, iirc it would return a set of a's type but without calling the constructor. In python 3 it seems to return a set (not the type of a or the type of b).

NewJazz · on Feb 26, 2020

Not sure about the set operator, but this operator is meant to handle subclasses better than the status quo.

mdrachuk · on Feb 26, 2020

Wow, one more point for python 3.

Just override ‘__or__’ in your whatever class to replace default return.

Pretty explicit in my book.

blackandblue · on Feb 26, 2020

the easy to forget justification is surprising to me. especially when most modern languages have the concept of unpacking, rest, spread or etc.

making the trick work with other mapping types and making it faster is totally understandable though.

banannaise · on Feb 27, 2020

It's pretty obvious with the context of other languages, but wildly outside the norm for Python. I rarely see dict unpacking outside function signatures.

war1025 · on Feb 25, 2020

It mentions in the PEP discussion that the {**a, **b} trick only works for string keys. So it isn't applicable in as many cases as the new operator.

uryga · on Feb 25, 2020

actually, only

  dict(d1, **d2)

has that problem. it works fine if you unpack into a dict literal:

  >>> d1 = {1: 'a'}
  >>> d2 = {2: 'b'}
  >>> {**d1, **d2}
  {1: 'a', 2: 'b'}

iirc the pep mostly just says that it's suboptimal because it's syntactically heavy/noisy, non-obvious and can't be overloaded in dict subclasses

---

i was curious why the two double-stars behave differently despite syntactic similarity. so i went and checked the bytecode, and it turns out they compile down to different opcodes! `{**d1, **d2}` yields a BUILD_MAP_UNPACK, while `dict(d1, **d2)` yields a CALL_FUNCTION_EX/CALL_FUNCTION_KW (depending on the CPython version)

dahfizz · on Feb 26, 2020

This seems pretty straightforward. When doing

`dict(d1, **d2)`

You are calling the dict function, and using the normal syntax for unpacking a dictionary into kwargs. In this case, the names of the kwargs must be strings.

uryga · on Feb 26, 2020

yeah, that side of the (in)equation was pretty obvious, i was mostly interested in the `{...}` one. i admit that a bytecode listing probably isn't the best exposition, i just like digging into VM stuff :)

seemslegit · on Feb 25, 2020

  dict(d1, **d2)

only works with string keys

  {**d1,**d2} or dict({**d1,**d2})

works with all key types it seems
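A quick check of both spellings with non-string keys (a sketch; the variable names are invented):

```python
d1 = {1: "a"}
d2 = {2: "b"}

# Unpacking into a dict literal works with any hashable keys.
literal = {**d1, **d2}

# Unpacking into a call turns the keys into keyword names, which must be strings.
try:
    dict(d1, **d2)
    call_ok = True
except TypeError:
    call_ok = False
```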

Rotareti · on Feb 26, 2020

I wish there was a union operator for typing as well to replace `Union[str, int]` with just `[str|int]`.

ash · on March 2, 2020

PEP 604 (draft) proposes this:

  def f(list: List[int | str], param: int | None) -> float | str:
      pass

  f([1, "abc"], None)

https://www.python.org/dev/peps/pep-0604/

LyndsySimon · on Feb 26, 2020

Naively, couldn’t you just overload `or` to make that work?

ash · on Feb 26, 2020

Did you mean `str | int`?

ptx · on Feb 25, 2020

Yay! I was wishing for this feature just a few days ago. It's somewhat analogous to how sorted (since Python 2.4) frees us from having to tediously make copies of lists to sort them in place.

gnulinux · on Feb 25, 2020

You can already do

   {**d1, **d2}

today for the same effect.

NewJazz · on Feb 26, 2020

Here I was wondering what a dictionary union was, having already stumbled upon it and used it.

Note that the method you show is slightly different for cases of dict subclasses.

The PEP notes the difference: https://www.python.org/dev/peps/pep-0584/#d1-d2.

wodenokoto · on Feb 26, 2020

I haven’t read the entire bug tracking thread, but it seems like people were mostly against it, and have been many times in the past.

What made decision makers change their mind and accept this change?

_-___________-_ · on Feb 26, 2020

Most of the bug tracking thread was just about whether `somedictsubclass() | somedictsubclass()` should be `dict()` or `somedictsubclass()`

The latter (returns `somedictsubclass`) would cause the `|` operator to rely on the `copy()` method from `dict` which would be the only case where an operator relies on a non-double-underscores method. Based on that, two core devs were against it. The core devs prevailed, and the behaviour will be the former (returns `dict`).
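A sketch of that outcome, assuming Python 3.9+ and two hypothetical subclasses: a subclass that does nothing special gets a plain `dict` back, and one that wants to keep its own type has to override `__or__` itself.

```python
class MyDict(dict):
    """Hypothetical subclass that does not override __or__."""

class KeepType(dict):
    """Hypothetical subclass that opts in to preserving its own type."""
    def __or__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        new = type(self)(self)   # copy into an instance of the subclass
        new.update(other)
        return new

plain_result = MyDict(a=1) | MyDict(b=2)
kept_result = KeepType(a=1) | KeepType(b=2)
```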

mrweasel · on Feb 26, 2020

It seems that just using + as the operator was rejected because it's: "Too specialised to be used as the default behavior."

What does that mean? It works for lists, obviously lists don't need to worry about duplicated values, but it's kind of non-intuitive that + won't work for dicts. I think many people view dicts and lists as the same general type of data structure.

heavenlyblue · on Feb 26, 2020

>> I think many people view dicts and lists as the same general type of data structure.

Is that a joke or are you from PHP world?

speedplane · on Feb 27, 2020

Python 2 Community: We are in hell, we have to stop working on everything to upgrade to Python 3, there is no straightforward way to upgrade, many of our python 2 libraries haven't been updated, and there are tons of little bugs that are hard to fix.

Python 3 Community: Look at this cool dictionary merging thingy!

saagarjha · on Feb 26, 2020

What does this do on duplicate keys? Keep one? Take a predicate?

realslimjd · on Feb 26, 2020

It keeps the one on the right. They explain it in the PEP: https://www.python.org/dev/peps/pep-0584/

mixmastamyk · on Feb 26, 2020

AKA Last one wins.
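A short sketch of the duplicate-key behaviour, assuming Python 3.9+ (example data invented). The value comes from the right operand, while the key keeps the position it had when it first appeared.

```python
left = {"a": 1, "b": 2}
right = {"b": 3, "c": 4}

left_then_right = left | right   # "b" takes its value from right
right_then_left = right | left   # "b" takes its value from left
```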

camgunz · on Feb 26, 2020

Can't wait for Python 4k where we TOOWTDI all the old stuff.

fulafel · on Feb 26, 2020

The problem with sprinkling operator overloading all over the place in non-numerical use is that you as the reader don't get the context hints provided by method names. I think this change is bad in the overall balance.

kbd · on Feb 26, 2020

The best way to do dictionary union is already symbolic:

    {**d1, **d2}

This provides a clearer symbolic notation for dictionaries analogous to what's already available with sets. FWIW the pep discusses what this would look like as a method vs an operator:

https://www.python.org/dev/peps/pep-0584/#use-a-method

fulafel · on Feb 26, 2020

The best way in my view is .union(); the new syntax additions are too cryptic.

kbd · on Feb 26, 2020

Not sure what you're referring to because there is no "union" method/function. There is currently no non-symbolic built-in way to combine dictionaries in an expression.

You may be interested to read PEP 584's list of examples of all the real-world code the existence of this operator makes clearer:

https://www.python.org/dev/peps/pep-0584/#examples

choward · on Feb 26, 2020

I agree. Back in the day when I used Ruby, I remember one of the arguments for Python being their belief that there should be one way to do things. Found one reference:

> There should be one-- and preferably only one --obvious way to do it.

https://www.python.org/dev/peps/pep-0020/

orf · on Feb 26, 2020

There should be one _obvious_ way to do things. Not _one_ way to do things.

The operator is better than the dict unpacking-repacking trick, and will become the obvious way to do it.

anentropic · on Feb 26, 2020

we already have `set1 | set2` for set unions

and dict keys are basically a set

fulafel · on Feb 26, 2020

I think that's cryptic too for most people, intuitive just for people who are conversant in bitwise operations.

anentropic · on Feb 26, 2020

these are standard set operations, not bitwise

fulafel · on Feb 26, 2020

The choice of the "|" operator for set union comes from bitwise operations: bitwise OR works as a union operator if you are using integers as bit vectors to represent sets of boolean attributes. And it was a common idiom back in the day when people used to program in C/assembly, using words as bit vectors was a common way to save memory.

Hence "|" as set union is intuitive for people who are familiar with this application of bit vectors.
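To illustrate the bit-vector idiom side by side with sets, a small sketch using hypothetical permission flags (all names invented here):

```python
# Hypothetical permission flags, one bit each.
READ, WRITE, EXECUTE = 0b001, 0b010, 0b100

alice_bits = READ | WRITE
bob_bits = WRITE | EXECUTE
either_bits = alice_bits | bob_bits      # bitwise OR on integers

alice_set = {"read", "write"}
bob_set = {"write", "execute"}
either_set = alice_set | bob_set         # union on sets
```

In both cases the result contains every permission that appears in at least one operand.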

heavenlyblue · on Feb 26, 2020

I think the OR operator comes from set theory and has nothing to do with the low-level boolean flag fieldsets.

fulafel · on Feb 26, 2020

I looked up set theory a couple of places (WP[1] and Britannica[2]) and didn't find any references of the OR operator in this context.. do you have a link?

[1] https://en.wikipedia.org/wiki/Algebra_of_sets

[2] https://www.britannica.com/science/set-theory/Operations-on-...

housecarpenter · on Feb 26, 2020

The analogue of the OR operator in set theory is the union operator. People think of them as basically the same thing because of the correspondence between a property and the set of things with that property. If A is the property of being either B or C, then the set of the things that are A is the union of the set of things that are B and the set of things that are C.
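The correspondence can be checked on a small example (a sketch; the universe and the two properties are arbitrary choices made here): filtering by "B or C" gives the same set as taking the union of the two filtered sets.

```python
universe = range(1, 21)   # arbitrary small universe for the demonstration

def is_even(n):
    return n % 2 == 0

def is_multiple_of_three(n):
    return n % 3 == 0

# Elements that have property B or property C...
by_property = {n for n in universe if is_even(n) or is_multiple_of_three(n)}

# ...versus the union of the set of Bs and the set of Cs.
by_union = {n for n in universe if is_even(n)} | {n for n in universe if is_multiple_of_three(n)}
```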

anentropic · on Feb 25, 2020

iddan · on Feb 25, 2020

FINALLY!