
PEP 584 – Add + and – operators to the built-in dict class - Ivoah
https://www.python.org/dev/peps/pep-0584/
======
kbd
> An alternative to the + operator is the pipe | operator, which is used for
> set union. This suggestion did not receive much support on Python-Ideas.

That's disappointing. It's always been on my Python wish list that dicts would
subclass sets, as dicts are essentially sets with values attached. Pretty much
everywhere you can use a set you can use a dict and it acts like the set of
its keys. For example:

    
    
        >>> s = {'a','b','c'}
        >>> d = {i: i.upper() for i in s}
        >>> list(d) == list(s)
        True
    

Dictionaries have been moving in this more ergonomic direction for a while.
Originally, to union two dictionaries you had to say:

    
    
        >>> d2 = {'d': 'D'}
        >>>
        >>> d3 = d.copy()
        >>> d3.update(d2)
        >>> d3
        {'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
    

Nowadays, as the PEP points out, you can just say:

    
    
        >>> {**d, **d2}
        {'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
    

There's no reason you shouldn't have always been able to say d | d2, same as
sets. Now I finally get my wish that dictionaries will behave more similarly
to sets and they use the wrong set of operators.
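
For what it's worth, the key views of a dict already behave like sets, so the set-style reading can be tried out today without any new operators. A minimal sketch (the variable names are only for illustration):

```python
d = {'a': 'A', 'b': 'B', 'c': 'C'}
d2 = {'c': 'X', 'd': 'D'}

# dict.keys() returns a set-like view that supports |, & and -.
union_keys = d.keys() | d2.keys()
common_keys = d.keys() & d2.keys()
only_in_d = d.keys() - d2.keys()

# Membership tests on a dict look only at the keys, as with a set.
has_a = 'a' in d
```

Each of these operations returns a plain set of keys, so the values are dropped; that is the gap an operator on dicts themselves would fill.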

~~~
dan-robertson
The most compelling reason to not do this is that (I claim) it’s not super
obvious what to do when the keys are equal. In:

    
    
      { 'a' : 1 } | { 'a' : 2 }
    

Should the result be:

    
    
      { 'a' : 1 }
    

(prioritise the left hand side), or

    
    
      { 'a' : 2 }
    

(prioritise the right hand side), or should it raise an error? Maybe a fourth
option would be to downgrade to sets of keys and give:

    
    
      { 'a' }
    

A fifth option is to magically merge values:

    
    
      { 'a' : 3 } or { 'a' : (1,2) }
    

With the first two choices one loses commutativity, which means that code that
never had to care about operand order suddenly has to (or it will do the wrong
thing), and one is always potentially losing data. The third choice is safe but
could cause unforeseen problems later if shared keys only happen rarely. The
fourth choice also forgets a bunch of information held in the dict.

In a language like Haskell, one can use type classes to specify how to merge
values (Monoid), but without type classes (and a way to choose which instance
to use) I think some kind of magic merge is not great.

I claim the operations one should really want with dicts are not set
operations but rather more relational ones, ie {inner,outer,left,right} joins
on the keys followed by some mapping to decide how to merge values.
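
A rough sketch of what such a join could look like, assuming a hypothetical helper called `join` (nothing like it exists in the standard library). The `how`, `merge` and `missing` parameters are made up for illustration, and using `None` for an absent side cannot distinguish "absent" from "present with value None":

```python
def join(left, right, how="inner", merge=lambda l, r: (l, r), missing=None):
    """Join two dicts on their keys; `merge` combines the two values."""
    if how == "inner":
        keys = [k for k in left if k in right]
    elif how == "left":
        keys = list(left)
    elif how == "right":
        keys = list(right)
    elif how == "outer":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError(f"unknown join type: {how!r}")
    # A key absent on one side contributes `missing` to the merge function.
    return {k: merge(left.get(k, missing), right.get(k, missing)) for k in keys}


a = {'a': 1, 'b': 2}
b = {'a': 2, 'c': 5}

inner_sum = join(a, b, "inner", merge=lambda l, r: l + r)
outer_pairs = join(a, b, "outer")
left_wins = join(a, b, "outer", merge=lambda l, r: l if l is not None else r)
```

Here the "prioritise the left hand side" behaviour is just an outer join with one particular merge function, and the "magic merge" options are others.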

~~~
dstola
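In case the keys collide you could supply a collision callback to define what
to do, eg to add the values,

    
    
      def add_func(a, b):
          return a + b
    
      def merge(d1, d2, on_collision):
          # hypothetical helper, not an existing built-in
          merged = dict(d1)
          for k, v in d2.items():
              merged[k] = on_collision(merged[k], v) if k in merged else v
          return merged
    
      d3 = merge({'a': 1}, {'a': 2}, add_func)  # {'a': 3}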
    

Or something along those lines

~~~
rbanffy
Why not raise a ValueError and let the programmer figure out what The Right
Thing To Do is when you add two dicts that have the same key with a different
value?

I assume the same key with the same value would be OK, but I'm not really sure
it's a good idea for it to be OK.
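
A minimal sketch of that behaviour, assuming a hypothetical `strict_merge` function and assuming that "the same value" means equality under `==` (which is itself a design choice):

```python
def strict_merge(d1, d2):
    """Merge two dicts, raising ValueError if a shared key has unequal values."""
    merged = dict(d1)
    for key, value in d2.items():
        if key in merged and merged[key] != value:
            raise ValueError(f"conflicting values for key {key!r}")
        merged[key] = value
    return merged


# Same key with the same value is accepted.
ok = strict_merge({'a': 1}, {'a': 1, 'b': 2})

# Same key with a different value is rejected.
try:
    strict_merge({'a': 1}, {'a': 2})
    conflict_raised = False
except ValueError:
    conflict_raised = True
```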

~~~
zimablue
You can't do value comparison without making dict item comparison a passed-in
function or making dict values immutable. If you're doing something that
really looks like a mathematical Union that will raise if there's any overlap
then it's a really confusing abuse of notation. I don't think there's a way
out.

------
zestyping
len(dict1 + dict2) does not equal len(dict1) + len(dict2) so using the +
operator is nonsense.

The operators should be |, &, and -, exactly as for sets, and the behaviour
defined with just three rules:

1\. The keys of dict1 [op] dict2 are the elements of dict1.keys() [op]
dict2.keys().

2\. The values of dict2 overwrite the values of dict1.

3\. When either operand is a set, it is treated as a dict whose values are
None.

This yields many useful operations and is simple to explain.

merge and update:

    
    
        {'a': 1, 'b': 2} | {'b': 3, 'c': 4} => {'a': 1, 'b': 3, 'c': 4}
    

pick some items:

    
    
        {'a': 1, 'b': 2} & {'b': 3, 'c': 4} => {'b': 3}
    

remove some items:

    
    
        {'a': 1, 'b': 2} - {'b': 3, 'c': 4} => {'a': 1}
    

reset values of some keys:

    
    
        {'a': 1, 'b': 2} | {'b', 'c'} => {'a': 1, 'b': None, 'c': None}
    

ensure all keys are present:

    
    
        {'b', 'c'} | {'a': 1, 'b': 2} => {'a': 1, 'b': 2, 'c': None}
    

pick some items:

    
    
        {'b', 'c'} & {'a': 1, 'b': 2} => {'b': 2}
    

remove some items:

    
    
        {'a': 1, 'b': 2} - {'b', 'c'} => {'a': 1}
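
The three rules are small enough to sketch as a plain function. This assumes a hypothetical `combine(left, right, op)` helper standing in for the operators; it is not how a real implementation would be written:

```python
def _as_dict(obj):
    # Rule 3: a set operand is treated as a dict whose values are None.
    if isinstance(obj, (set, frozenset)):
        return dict.fromkeys(obj)
    return dict(obj)


def combine(left, right, op):
    """Apply op ('|', '&' or '-') to two dict-or-set operands."""
    left, right = _as_dict(left), _as_dict(right)
    # Rule 1: the resulting keys are left.keys() [op] right.keys().
    if op == '|':
        keys = left.keys() | right.keys()
    elif op == '&':
        keys = left.keys() & right.keys()
    elif op == '-':
        keys = left.keys() - right.keys()
    else:
        raise ValueError(f"unsupported operator: {op!r}")
    # Rule 2: values from the right operand overwrite those from the left.
    return {k: right[k] if k in right else left[k] for k in keys}


merged = combine({'a': 1, 'b': 2}, {'b': 3, 'c': 4}, '|')
picked = combine({'a': 1, 'b': 2}, {'b': 3, 'c': 4}, '&')
reset = combine({'a': 1, 'b': 2}, {'b', 'c'}, '|')
```

Because the keys pass through a set, the key order of the result is not defined in this sketch.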

~~~
petters
So much better! Overloading addition for something that behaves differently is
not good.

~~~
rat9988
Addition as defined in mathematics behaves differently depending on context.

------
jerf
Unlike some of the other commenters, I'm fine with the + specification. +
hasn't been commutative in Python for a long time.

But the - bothers me, and nobody else seems to have mentioned this. {"a": 1} -
{"a": 1} = {}, sure, but it is _way_ less obvious to me that {"a": 1} - {"a":
2} = {}, and not {"a": 1}. If you consider dictionaries as an unordered list
of tuples (key, value) where keys happen to be unique and as a result of that
you get nice O()-factors on access, that doesn't make sense. You went to
remove ("a", 2), but saw ("a", 1) and thought, "eh, close enough". But it's
not the same thing.

If you think of a dict as a set that happens to have associated values, the
specification makes more sense, but if you dig into that line of thought, that
turns out to be a rather weird way of thinking of them. Values really
shouldn't be thought of as second-class citizens of a dict. If you are going
to go this route though, {"a": 1} - {"a"} = {} (where the right-hand side is a
set) actually makes _more_ sense, without the spurious value on the right-hand
side.

I'd actually rather conceive of the - operation as a "dict minus an iterable
that will yield keys to remove". This has the advantage of recovering the
original {"a": 1} - {"a": 2} = {} semantics that probably is what people want
in practice, just via a different method. But locking the right-hand side to a
dict makes it weird.
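
As a sketch of that reading, assuming a hypothetical `without_keys` function in place of the operator:

```python
def without_keys(d, keys):
    """Return a copy of d without the given keys; keys may be any iterable."""
    drop = set(keys)
    return {k: v for k, v in d.items() if k not in drop}


from_set = without_keys({"a": 1}, {"a"})
from_list = without_keys({"a": 1}, ["a", "a"])   # duplicates are harmless
from_dict = without_keys({"a": 1}, {"a": 2})     # a dict iterates over its keys
```

The dict-on-the-right case falls out for free because iterating a dict yields its keys, so the value 2 is never looked at.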

~~~
Areading314
It's consistent with iteration over a dict:

    
    
      for k in my_dict:
          print(my_dict[k])
    

In this example it is implied that unless you specify .items(), you are only
considering keys in the iteration. This would apply to the + and - operations
too as I understand

~~~
ken
Using - to mean "here's a dict and a seq -- remove all the seq's keys from the
dict" would be useful and consistent, but they specifically prohibit that.
They require the rhs to be a dict, too, even though the values are never used.
Why?

~~~
Areading314
Good point, although you'd need to make sure your seq only has unique values.
Other than that I don't see why you should have to write

    
    
      {k: v for k, v in d.items() if k not in seq}

------
wodenokoto
> Analogously with list addition, the operator version is more restrictive,
> and requires that both arguments are dicts, while the augmented assignment
> version allows anything the update method allows, such as iterables of
> key/value pairs.
    
    
        >>> d + [('spam', 999)]
        Traceback (most recent call last):
          ...
        TypeError: can only merge dict (not "list") to dict
        >>> d += [('spam', 999)]
        >>> print(d)
        {'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
    
    
    

While I get the "Because this is what lists do"-argument, I am still wondering
why there is a difference in the types allowed for `+` and `+=`?
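
For reference, this is the existing list behaviour the PEP is pointing at. It runs on current Python and has nothing dict-specific in it:

```python
items = [1, 2]

# list + tuple is rejected: both operands of + must be lists.
try:
    items + (3, 4)
    plus_failed = False
except TypeError:
    plus_failed = True

# list += tuple is accepted: it behaves like items.extend((3, 4)).
items += (3, 4)
```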

~~~
dragonwriter
The difference is because + in Python is intended to be a reflexive,
declarative operator while += is a directional, imperative operator.

If the + has heterogeneous operands, there should be a promotion process to
the most specific type that generalizes the operand types so that the addition
works the same regardless of order, as exists for numeric types. But for
general types (and collections particularly) the concept of most specific
generalized type including two other collection types is not always sensible,
so requiring homogenous operands makes more sense.

With += there is no intended symmetry between operands, the left side is the
receiver into which the right side is added.

~~~
zestyping
I don't find this convincing at all.

> so that the addition works the same regardless of order, as exists for
> numeric types

\+ is not commutative for lists, tuples, or dicts. So the promotion process
need not be commutative either. There is no good reason why list + tuple
should be forbidden, or dict + items should be forbidden.

a [op]= b is commonly and easily explained as "a = a [op] b, where a is
mutated in place". Python should not break that explanation with mysterious
inconsistencies.

~~~
dragonwriter
> \+ is not commutative for lists, tuples, or dicts

Yeah, that's a good point. I think I was thinking on the type level rather
than the value level, but I am unsure that makes a convincing argument for the
behavior here even if it is otherwise true (which I’m not sure it is always,
even at the type level.)

------
amelius
For sets I can understand what + and - means: you can add or subtract the sets
(not add or remove an element directly). This should be like lists, e.g.

    
    
        [10,20] + [30]
    

But what + and - would mean in the case of dicts is obscure. Better to just
use full method names imho.

~~~
sametmax
All my students disagree with you. They all try addition, and all expect a
resulting dict with keys from both dicts. The fact the keys from the one side
are prioritized is something they will learn once, just like with
dict.update().

~~~
amelius
That's something a student _may_ learn. But there will be plenty of people
reading Python code that don't know how "+" works on dicts; and it is
difficult to find out what it does because you can't easily Google/grep for
the function name. Doing experiments with dicts is not something you want to
be doing while you are reading other people's code.

------
comex
The + operator looks great – I've personally experienced the papercut this
solves multiple times, where it would be most natural to have a "combine two
dicts" operator:

    
    
        return {'a': 'b'} + other_dict
    

but instead I had to assign to a variable and mutate with .update(), which is
much more verbose:

    
    
        x = {'a': 'b'}
        x.update(other_dict)
        return x
    

_However_, I was working in Python 2; Python 3 has

    
    
        {'a': 'b', **other_dict}
    

and even

    
    
        {**one_dict, **other_dict}
    

though the PEP mentions that the latter doesn't work in all circumstances.
Still, it will be nice to have a more general operator; I personally don't
really care whether it's called + or |.
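
One such circumstance, as far as I can tell (treat it as my reading rather than a quote from the PEP), is that the unpacking form ignores the types of its operands and always builds a plain dict:

```python
from collections import OrderedDict, defaultdict

counts = defaultdict(int, {'a': 1})
extra = OrderedDict(b=2)

merged = {**counts, **extra}

# The result is always a plain dict, whatever the operand types were.
result_type = type(merged)

# The defaultdict behaviour is gone: a missing key now raises KeyError.
try:
    merged['missing']
    lost_default = False
except KeyError:
    lost_default = True
```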

On the other hand, the - operator seems... strange, in that it only considers
the keys of its right-hand argument, and ignores the values. Seems like a
footgun.

------
petters
I think overloading + so that a + b != b + a is problematic.

I know this is the case for strings and lists, but those cases are very well
established.

~~~
sametmax
And a = 1 is not equality.

Welcome to the world of programming, where we don't all try to match
mathematical conventions because many of us suck at maths and are practical.

~~~
gizmo385
There is no rule of math stating that + must always be commutative. It is
commutative for real numbers, yes, but there are plenty of use cases mentioned
elsewhere in this discussion where + is not commutative.
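
A quick check of the built-in types in question, which runs on any current Python:

```python
s1, s2 = 'ab', 'cd'
l1, l2 = [1], [2]
t1, t2 = (1,), (2,)

# Concatenation depends on operand order for all three sequence types.
str_commutes = (s1 + s2) == (s2 + s1)
list_commutes = (l1 + l2) == (l2 + l1)
tuple_commutes = (t1 + t2) == (t2 + t1)

# Numeric addition does not.
int_commutes = (1 + 2) == (2 + 1)
```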

------
mk89
Reminds me of Scala Maps[0].

Edit: after reading more carefully,...

> Analogously with list addition, the operator version is more restrictive,
> and requires that both arguments are dicts, while the augmented assignment
> version allows anything the update method allows, such as iterables of
> key/value pairs.

But why? Consistency in API behavior is important, and as a user I don't want
to have to read that I can add lists of pairs only with assignments. I hope
the draft gets fixed.

[0]: [https://docs.scala-lang.org/overviews/collections/maps.html](https://docs.scala-lang.org/overviews/collections/maps.html)

~~~
guitarbill
You can always allow it later, but deprecating such a "feature" is a pain. And
subtle errors/outright abuse can happen with some of these automatic
coercions, so Python tends to be a bit more conservative than other dynamic
languages. The most (in)famous example being comparison (edit: not addition)
of an integer and `None`. Allowed in Python 2, non-intuitive IMO, and
responsible for a few bugs in its time. Disallowed in Python 3:

> TypeError: '<' not supported between instances of 'int' and 'NoneType'

~~~
Sean1708
Do you mean _comparison_ with None?

    
    
      Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
      [GCC 5.4.0 20160609] on linux2
      Type "help", "copyright", "credits" or "license" for more information.
      >>> 1 + None
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
      >>> 1 < None
      False
    

vs

    
    
      Python 3.5.2 (default, Nov 23 2017, 16:37:01)
      [GCC 5.4.0 20160609] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> 1 + None
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
      >>> 1 < None
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      TypeError: unorderable types: int() < NoneType()

~~~
guitarbill
doh, you're absolutely right, cheers! edited the parent for posterity, too.

------
speedplane
Python continues to introduce more non-intuitive semantics that may be a small
boon to the expert class of programmers, but come at the expense of ease of
adoption for beginners. It started by making everything a generator, which
are not very easy to master, and for which there were plenty of perfectly good
substitutes (e.g., xrange, iteritems). And now you "add" sets of items (which
you can't do in math) when the update function worked well.

Python 3 is such a sad mess.

~~~
varelaz
Yeah, I hate this. Now dict.items() is no longer thread safe, just because of
iterators. It could crash at any time just because you modified the dict in
another thread while iteration is in progress.

~~~
quietbritishjim
I'm pretty sure that the situation you're describing was not thread safe in
Python 2 either.

Sure, once you're in the body of the for loop, the dictionary must have been
copied to the list so you're safe. But _while d.items() is being evaluated_ at
the start of the for loop, there is an internal iteration that could be
preempted by the other thread. The GIL doesn't save you because Python
operations aren't guaranteed to be atomic, and I doubt something that complex
would be (it would be a serious problem if iterating over a large dictionary
in one thread held up all other threads for an arbitrarily long time). Even if
it is GIL-atomic, you're risking breakage if you move to another
implementation (e.g. pypy) or if Python changes its atomicity in future.

In general, if you want to modify an object in one thread and read it in
another thread, you should add locking to prevent this happening
simultaneously.

It is however true that the Python 2 items() method allows you to modify the
dictionary in the body of the same for loop. But this is a surprising
exception compared to iterating over a list or other container, so it makes
sense overall to demand you explicitly make a copy if that's what you want.
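
The single-threaded version of this is easy to show in Python 3. The sketch below only covers modifying the dict inside the loop body; it says nothing about threads, where a lock is still the right tool:

```python
d = {'a': 1, 'b': 2, 'c': 3}

# Deleting while iterating over the live view raises RuntimeError.
try:
    for key, value in d.items():
        if value % 2:
            del d[key]
    live_view_failed = False
except RuntimeError:
    live_view_failed = True

# Iterating over an explicit snapshot makes the same loop safe.
d = {'a': 1, 'b': 2, 'c': 3}
for key, value in list(d.items()):
    if value % 2:
        del d[key]
```

The `list(d.items())` call is the explicit copy; it is what Python 2's items() did implicitly.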

~~~
scbrg
What do you mean by "threadsafe" here? Could dict.items() actually break in
Python 2? I've never seen that happen.

~~~
d33
Unexpected/undefined behavior?

~~~
varelaz
There are a lot of cases where you don't need strict consistency and the
current state is enough for processing. For example, you want to save request
stats from web servers. Would you stop all operations while you are counting
and writing to the DB, just to be precise? Of course not. Whatever current
number you have is good enough. Of course you need to be aware of side effects.

------
rurban
dict.merge(d, ...) and dict.diff(d, ...) are more expressive and have
cleaner semantics.

Overloading arithmetic ops for string, list or dict ops might look elegant at
first sight, but the type discrimination needs to be done at runtime, slowing
down the most important arithmetic ops, and it does not help the casual code
reader much. It also cannot be used in normal Python code, as older Pythons
will fail on it, only in special internal code.

Normal method names can be provided by external modules, so they are backwards
compatible and will find wider adoption.

~~~
sametmax
Teacher here.

All my students eventually try {} + {}.

I'll bet on it to be the most intuitive.

~~~
rplnt
It's intuitive to try, but it's not obvious what it actually does when keys
overlap.

~~~
mixmastamyk
It’s obvious what you want done, and thankfully that is what’s going to be
done.

------
s17n
It's weird to use + for a non-commutative operation, right?

~~~
Znafon
\+ is already non-commutative for list and tuples for example.

~~~
dbrgn
And for strings.

------
js2
> The implementation will be in C. (The author of this PEP would like to make
> it known that he is not able to write the implementation.)

I hope this is for a reason other than the author being unfamiliar with C.
Otherwise the author is cheating themselves, because adding functionality to
an existing code base is probably my favorite motivator for learning a new
language.

~~~
mixmastamyk
C isn’t the hard part, rather the Python api, and dev workflow.

~~~
js2
I’ve contributed code to a handful of open source projects. Learning how to do
that is also a worthwhile (and sometimes humbling) experience. I haven’t
contributed to Python but it has an entire guide for how to do so, which
already puts it above many projects:

[http://devguide.python.org/](http://devguide.python.org/)

The CPython API is as straight-forward as any I’ve seen.

So my original comment still stands. :-)

~~~
mixmastamyk
I agree it is quite worthwhile. However, it will take longer than expected for
a new person, and most of us have obligations.

------
varelaz
I don't like the idea, because a + b should produce a new object c without
modifying either operand, which is not memory-optimal and has a non-obvious
cost. Also, dict is subclassed a lot, so this could break a lot of existing
functionality with potentially no benefit for most developers. I don't think
merging is a very common operation for dicts, and even so it can be done with
one or two update() calls, which is obvious to the reader, while a '+' deep in
duck-typed code is not. The absence of a '+' operation for dicts also acts as
a kind of type-validation guard in case someone passes a dict instead of an
integer, which is pretty common when you parse some JSON from a client.

~~~
sametmax
It does produce a new list/dict, etc. That's what you want to avoid side
effects in many cases. Modifying is actually an exception. Handy to have, but
not common. Even numpy arrays recreate everything every time.

------
sametmax
One of the long awaited features, rejected by Guido many times, and finally
accepted. Maybe we'll get list.get(), functools.partial() as a c written
builtin, pathlib.Path() as a primitive or inline try/except, one day.

~~~
detaro
I don't think this has been accepted yet?

~~~
sametmax
No but Guido said yes
[https://bugs.python.org/issue36144#msg336848](https://bugs.python.org/issue36144#msg336848)
and even if he is not BDFL anymore, it usually means it will be done.

------
mroche

        def __add__(self, other):
            if isinstance(other, dict):
                new = type(self)()  # May be a subclass of dict.
                new.update(self)
                new.update(other)
                return new
            return NotImplemented  # otherwise the method would return None
    

Is there something I’m missing? To me it would be cleaner and more memory/time
performant to just `self.update(other)` rather than having a third dict
instance at operation time. But that would really only apply if you have truly
massive dicts.

~~~
Znafon
Yes, Python supports in place operators to avoid this:

    
    
        def __iadd__(self, other):
            if isinstance(other, dict):
                self.update(other)
                return self
            return NotImplemented  # otherwise d would be rebound to None
    

This would get called for `d += other`.

~~~
mroche
My follow up question would be why not do this for both union functions? I
don’t see why they would want to have two functions that do the exact same
thing (called differently, though) be written in two different ways.
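
One way to see why the two are written differently is to look at what happens to other references to the left operand. A sketch using a hypothetical `AddDict` subclass, written in pure Python only to illustrate the semantics:

```python
class AddDict(dict):
    """Toy dict subclass; illustrates the semantics only."""

    def __add__(self, other):
        if isinstance(other, dict):
            new = type(self)(self)  # copy, so neither operand is mutated
            new.update(other)
            return new
        return NotImplemented

    def __iadd__(self, other):
        self.update(other)  # in place; accepts anything update() accepts
        return self


a = AddDict(x=1)
alias = a

b = a + {'y': 2}     # builds a new object; a and alias are untouched
a += [('z', 3)]      # mutates the object that alias also refers to
```

If `__add__` called `self.update(other)` instead, the expression `a + {'y': 2}` would silently change `a` (and everything aliasing it), which is not what people expect from a binary operator.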

------
projektfu
Is there a real-world demand for the dictionary difference operator or is it
just being proposed for completeness? I'm racking my brain to think of reasons
to use it that would be more expressive than simply giving a list of keys to
delete.

------
Grue3
I like this. The current kwargs syntax is _very_ confusing since it behaves
very differently from funcall kwargs syntax.

~~~
Znafon
What exactly differs?

~~~
Grue3
For example multiple values for the same keyword argument are not allowed in
funcall so you can't "update" default arguments with a dict of arguments that
need to be changed.
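
A small demonstration of the difference, assuming a made-up `connect` function and made-up argument dicts:

```python
def connect(host='localhost', port=80):
    return (host, port)


defaults = {'host': 'localhost', 'port': 80}
overrides = {'port': 8080}

# In a call, repeating a keyword across ** unpackings is an error.
try:
    connect(**defaults, **overrides)
    call_failed = False
except TypeError:
    call_failed = True

# In a dict display, later unpackings silently win, so merging first works.
result = connect(**{**defaults, **overrides})
```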

~~~
Znafon
Thanks, I did not know that.

------
IceDane
This is what you get when you try to implement general mathematical concepts
in a language that is horrible at expressing them. What a clusterfuck python
is going to be in a few years.

