

Python: The Dictionary Playbook - Nurdok
http://blog.amir.rachum.com/post/39501813266/python-the-dictionary-playbook

======
elbear
You mentioned it a bit, but I want to make it clear that, even if you don't
use 2.7, you can still count by doing:

    
    
        from collections import defaultdict
        counter = defaultdict(int)
    

There is a difference though, because you have to count manually, i.e:

    
    
        for i in 'supercalifragilisticexpialidocius':
            counter[i] += 1
    

Also, because defaultdict accepts any callable, you can have a dict of
counters by doing:

    
    
        counters = defaultdict(lambda: defaultdict(int))
        for word in ['apple', 'berry', 'grape']:
            for letter in word:
                counters[word][letter] += 1
    

This is not very obvious, so I don't use it a lot, but sometimes it's the most
elegant solution.

~~~
ciupicri
For Python 2.7 there's _collections.Counter_ [1]:

    
    
        import collections
        c = collections.Counter('supercalifragilisticexpialidocius')
        print c
        # Counter({'i': 7, 'a': 3, 'c': 3, 'l': 3, 's': 3, 'e': 2,
        #          'p': 2, 'r': 2, 'u': 2, 'd': 1, 'g': 1, 'f': 1,
        #          'o': 1, 't': 1, 'x': 1})
    

[1] <http://docs.python.org/2.7/library/collections.html#collections.Counter>

~~~
andreasvc
It's odd that with all the 'high-brow' datatype names such as dictionaries,
tuples, and sets, they decided to call this one Counter instead of the obvious
choice 'multiset' (or 'bag'); it supports set operations such as intersection,
so it's more generally useful than just collecting top ten lists &c.
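
To make the multiset point concrete, here is a small sketch of the set-style operators on Counter (Python 3 syntax; the sample strings are made up for illustration):

```python
from collections import Counter

a = Counter("aabbbc")   # a:2, b:3, c:1
b = Counter("abbd")     # a:1, b:2, d:1

common = a & b    # intersection: minimum of each count
either = a | b    # union: maximum of each count
total = a + b     # multiset sum: counts are added
left = a - b      # difference: only positive counts are kept

print(common)
print(total.most_common(1))
```

Elements whose resulting count is zero or negative are dropped from the result of these operators.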

------
avolcano
You know a guide is good when it makes you want to go back and refactor old
code. Great information, thanks :)

~~~
alexpopescu
Interestingly after reading the post I've realized that while I've been using
the more advanced stuff, I was missing the first 2 basic ones.

~~~
DeepDuh
Happened to me too, I always used 'for key in mydict: do_something(key,
mydict[key])' instead of 'for key, value in mydict.items()'.

I guess python leaves so many ways to do things that you tend to settle on
some style quickly even if it's not the most efficient one.

When self learning new languages I still miss an efficient way to get all the
idioms. However I think that in python it's not really that important, as long
as you get stuff done - your code is still going to be quite readable.

~~~
RyanMcGreal
Python doesn't always hew too closely to TSBOAPOOOWTDI, despite Tim Peters'
lovely entreaty. <http://www.python.org/dev/peps/pep-0020/>

------
aidos
This is a really nice guide. I've been up to my eyes in python dicts over the
last few days so already had most of these figured out.

setdefault is new to me, which is cool. Unfortunately I can only see one place
to use it in my code and it would be inefficient [0]. Best stash it away for
later use :)

[0] r = re_subs.setdefault(s, re.compile(s))

~~~
andrewcooke
iirc re caches regexps. so unless you're using many (more than the cache size
[edit: 100]), i wouldn't worry (well, and also measuring before optimising...)

~~~
aidos
I did wonder if that was the case but I couldn't find anything about it. Just
discovered re._MAXCACHE which implies it does cache. Tested and in my case the
performance is the same.

Thanks for the tip (I was posting half hoping someone would have a better
solution). That's another 3 lines of code removed :)
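
For anyone wondering why the setdefault version looked inefficient in the first place: the default argument is evaluated before setdefault is even called, so the compile call happens on every lookup, hit or miss. A rough sketch (Python 3; tracked_compile and get_pattern are hypothetical helpers, only there to count calls and show the explicit alternative):

```python
import re

compile_calls = []

def tracked_compile(pattern):
    # Hypothetical wrapper: records each call so we can see when compilation is requested.
    compile_calls.append(pattern)
    return re.compile(pattern)

re_subs = {}

# setdefault's second argument is evaluated before the lookup happens,
# so tracked_compile runs on both calls even though the key exists the second time.
first = re_subs.setdefault(r"\d+", tracked_compile(r"\d+"))
second = re_subs.setdefault(r"\d+", tracked_compile(r"\d+"))

def get_pattern(cache, pattern):
    # Explicit membership test: compile only on a miss.
    if pattern not in cache:
        cache[pattern] = tracked_compile(pattern)
    return cache[pattern]

third = get_pattern(re_subs, r"\d+")
print(len(compile_calls))
```

With re's own cache the repeated compile is cheap, which matches the measurement above; the explicit version matters more when the default is genuinely expensive to build.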

------
simon_weber
Good advice. I've got one to add:

    
    
      #x and y are dictionaries
      z = dict(x.items() + y.items())
    

It merges two dictionaries, giving precedence to the second (in Python 2;
Python 3 is a bit more nasty because items() returns views, which can't be
added with +:
<http://stackoverflow.com/questions/38987/how-can-i-merge-union-two-python-dictionaries-in-a-single-expression>).

~~~
kc0bfv
Yeah, I like:

    
    
        z=dict(x)
        z.update(y)
    

It's clear and concise, and it's obvious which one gets precedence, even if
it's two lines.

~~~
masklinn
I tend to be partial towards

    
    
        z = dict(x, **y)
    

if the keys of y are compatible with unpacking.
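
A quick side-by-side of the merge styles in this subthread, plus what "compatible with unpacking" means in practice. This is a sketch assuming Python 3.5 or later (for the `{**x, **y}` form); the sample dicts are made up:

```python
x = {"a": 1, "b": 2}
y = {"b": 20, "c": 30}

# Copy-then-update: y wins on the shared key "b".
z1 = dict(x)
z1.update(y)

# Keyword unpacking gives the same result when every key of y is a string.
z2 = dict(x, **y)

# Python 3.5+ also allows unpacking both dicts inside a literal.
z3 = {**x, **y}

# With a non-string key, dict(x, **y) is rejected because keyword names must be strings.
try:
    dict(x, **{1: "one"})
except TypeError as exc:
    unpack_error = exc
else:
    unpack_error = None

print(z1, unpack_error)
```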

------
pcote
I love curiosity sparking posts like this.

>>key in dct

is much better than

key in dct.keys()

Of course, that got me curious to find out if there is a magic method out there
that takes advantage of the "in" keyword. Turns out __contains__ does that.

Always exciting to stumble upon new stuff in my favorite language.
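
For the curious, a minimal sketch of hooking into "in" on your own class (Python 3; EvenNumbers is a made-up example class):

```python
class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, item):
        # Called for `item in self`; the result is interpreted as a boolean.
        return isinstance(item, int) and item % 2 == 0

evens = EvenNumbers()
print(4 in evens)
print(7 not in evens)
```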

------
textminer
Love defaultdict. It and dict/set/list comprehensions are a big part of what
makes Python so fast to write in.

Great practice for 2.7 that's probably quashed in 3.0. For large dicts, no
need to create a giant list en route when iterating over keys, values, or both.
Use "for k in d.iterkeys()", "for v in d.itervalues()", "for k, v in d.iteritems()".

While I'm at it-- if you ever find yourself using a huge number of
awfully rigid objects from a single class, use __slots__ to declare the needed
attributes! Python will otherwise store each object's namespace in a dict
(called __dict__), which can allocate roughly a kilobyte per object. Bad news
if you have several hundred thousand... Guessing this is why Guido loves
namedtuples so much for basic attribute storage.
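
A small sketch of what __slots__ changes (Python 3; both Point classes are made up, and the actual memory savings depend on the interpreter version, so no numbers are claimed here):

```python
class PointWithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointWithSlots:
    __slots__ = ("x", "y")  # only these attribute names are allowed

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointWithDict(1, 2)
q = PointWithSlots(1, 2)

has_dict_p = hasattr(p, "__dict__")   # regular instances carry a per-object dict
has_dict_q = hasattr(q, "__dict__")   # slotted instances do not

try:
    q.z = 3                           # not listed in __slots__
except AttributeError:
    slots_rejected_new_attr = True
else:
    slots_rejected_new_attr = False

print(has_dict_p, has_dict_q, slots_rejected_new_attr)
```

The "awfully rigid" part is the trade-off: you give up adding attributes on the fly.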

~~~
cardamomo
In Python 3.3:

Iterate over keys: "for k in d.keys()" ...over values: "for v in d.values()"
...over both: "for k, v in d.items()"

~~~
andreasvc
"for k in d" is even shorter. I wonder what the point of the .keys() method
is actually, perhaps it's just as redundant as .has_key(). Most times you can
iterate over the dictionary itself; when you need to explicitly pass an
iterator or list, iter(d) or list(d) is shorter than d.keys().

~~~
andrevoget
If you iterate over .keys(), you are allowed to use del d[key].

~~~
andreasvc
Only in Python 2, which returns something to the effect of list(dict); so my
point remains, the keys method doesn't add anything which you don't get (more
explicitly) by coercing to a list when needed.
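
A sketch of the "coerce to a list when needed" approach for deleting while looping (Python 3; the sample dicts are made up):

```python
d = {"a": 1, "b": 2, "c": 3, "d": 4}

# Snapshot the keys first so deleting entries does not disturb the loop.
for key in list(d):
    if d[key] % 2 == 0:
        del d[key]

# Python 3: changing the dict's size while iterating over it directly fails.
e = {"a": 1, "b": 2}
try:
    for key in e:
        del e[key]
except RuntimeError as exc:
    iteration_error = exc
else:
    iteration_error = None

print(d, iteration_error)
```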

------
Goopplesoft
One of my favorites, safe deep searching: by returning a dict you can run
another .get

    
    
        var = {'a' : 'b' , 'c' : {'d' : 'f'}}
    
        print var.get('c', {}).get('d')
        print var.get('DNE', {}).get('d')
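
One caveat with chained .get: it only stays safe while every intermediate value is a dict. In the example above, var.get('a', {}).get('d') would fail because 'b' is a string with no .get method. A possible wrapper, purely as a sketch (Python 3; deep_get is a hypothetical helper, not something from the standard library):

```python
def deep_get(mapping, *keys, default=None):
    """Hypothetical helper: follow keys through nested dicts, returning default on any miss."""
    current = mapping
    for key in keys:
        # Stop as soon as we hit a non-dict or a missing key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

var = {'a': 'b', 'c': {'d': 'f'}}

print(deep_get(var, 'c', 'd'))
print(deep_get(var, 'DNE', 'd'))
print(deep_get(var, 'a', 'd'))
```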

------
hartror
I hate to be that guy but what about dictionary comprehensions?

~~~
andreasvc
Or dictionary views: they stay current as the dictionary is updated, and allow
set operations.

~~~
te
I don't get it. Why would I use a view instead of the dictionary itself?

~~~
e12e
<http://stackoverflow.com/questions/340850/python-3-0-dict-methods-return-views-why>

Views are lighter than a full copy of a list, yet behave like a list (eg:
support `key in view`).
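
A short sketch of both properties mentioned upthread, staying current and set operations (Python 3, where keys() returns a view; the sample dicts are made up):

```python
d = {"a": 1, "b": 2}
keys = d.keys()          # a view, not a copy (Python 3)

d["c"] = 3               # the view reflects the new key without being rebuilt
seen_after_update = sorted(keys)

other = {"b": 20, "c": 30, "z": 0}
shared = keys & other.keys()      # set intersection of the two key views
only_in_d = keys - other.keys()   # keys present in d but not in other

print(seen_after_update, shared, only_in_d)
```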

[edit] Also, this seems to be the relevant PEP:

<http://www.python.org/dev/peps/pep-3106/>

------
solox3
"not key in dct", or "not (key in dct)", is never slower than "key not in dct"
because when key is found in dct, the expression is immediately true.

~~~
pjscott
Those actually generate the same bytecode in Python 2.7. Probably other
versions, as well.
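
If you want to check this on your own interpreter, something along these lines should do it (a sketch in Python 3; the function names are made up, and no particular result is assumed since the bytecode differs between versions):

```python
import dis

def with_not_in(key, dct):
    return key not in dct

def with_not(key, dct):
    return not key in dct

# Compare the raw bytecode of the two functions on the running interpreter.
same_bytecode = with_not_in.__code__.co_code == with_not.__code__.co_code
print("identical bytecode:", same_bytecode)

# Show the instructions themselves for a closer look.
dis.dis(with_not)
```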

------
taejo
In item 3, the "boilerplate" and the "awesome way" are not equivalent. The
boilerplate does the equivalent of setdefault, which is mentioned later

    
    
        dct[key] = dct.setdefault(key, 0) + 1

~~~
isbadawi
Since you're assigning to dct[key] anyway, it doesn't really matter.

------
stillinbeta
In most circumstances for (key, val) in data won't work, as the default
dictionary iterable only contains the keys. You want for key, val in
data.items():
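
A small sketch of the difference (Python 3; the sample dict is made up):

```python
data = {"apple": 1, "berry": 2}

# Iterating a dict yields keys only.
keys_only = [item for item in data]

# Unpacking each key into (key, val) fails unless the key happens to be a 2-item iterable.
try:
    for key, val in data:
        pass
except ValueError as exc:
    unpack_error = exc
else:
    unpack_error = None

# items() yields (key, value) pairs, which is what the loop needs.
pairs = [(key, val) for key, val in data.items()]

print(keys_only, pairs, unpack_error)
```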

~~~
pcote
I use dictionary comprehensions personally but I have mixed feelings about the
syntax. It looks too much like set comprehensions on first glance. Compare the
following to see what I mean.

myset = {x for x in "This is my stuff".split()}

mydict = {x:len(x) for x in "This is my stuff".split()}

~~~
andreasvc
Well I see what you mean but then again a comma looks very much like a period
...

The advantage of this syntax is that : unambiguously introduces a key: value
pair, whereas (key, value) could also occur in a list comprehension (e.g., by
accident).

------
danielwozniak
Nice! Some good information here. Good for beginners, good for intermediates,
some expert will probably say it's good for them too.

------
goronbjorn
Thanks for writing this; it's very useful.

Are you planning on doing similar posts about other parts of Python in the
future?

------
mixedbit
Can a logical operator in Python be used to conditionally initialize an item in
a dict? For example, can:

    
    
        group = dct.setdefault(key, []) 
        group.append(value)
    

be replaced with some equivalent of this Ruby snippet:

    
    
       (dict[key] ||= []) << value
    
    ?

~~~
masklinn
No.

But of course if ``group`` is not used afterwards it can be inlined to

    
    
        dct.setdefault(key, []).append(value)
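
For grouping specifically, the inlined setdefault and a defaultdict(list) end up with the same result; a sketch (Python 3; the sample pairs are made up):

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

# Plain dict: setdefault creates the list on first sight of a key.
groups = {}
for key, value in pairs:
    groups.setdefault(key, []).append(value)

# defaultdict: the list factory is called automatically for missing keys.
groups_dd = defaultdict(list)
for key, value in pairs:
    groups_dd[key].append(value)

print(groups)
print(dict(groups_dd))
```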

------
simgidacav
Thanks, I didn't know about "x not in D"! On the same note:

<http://dacavtricks.wordpress.com/2011/05/23/python-default-values-in-a-dictionary/>

------
bismark
I also find the combination of lambda and defaultdict quite useful:

    
    
      d = defaultdict(lambda: False)
    

or

    
    
      d = defaultdict(lambda: {'foo':set(), 'bar':False})
      d['baz']['foo'].add(1)

~~~
ralph
For the first I do defaultdict(bool) ;-)
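
Two behaviours worth knowing with these factories, as a sketch (Python 3; the key names are made up): the factory runs once per missing key, so every key gets its own fresh value, and a plain lookup of a missing key stores it, while .get does not.

```python
from collections import defaultdict

d = defaultdict(lambda: {'foo': set(), 'bar': False})
d['baz']['foo'].add(1)
d['qux']['foo'].add(2)

# The factory ran once per missing key, so the two sets are separate objects.
independent = d['baz']['foo'] is not d['qux']['foo']

flags = defaultdict(bool)
was_set = flags['missing']            # False, and the lookup itself stores the key
stored_by_lookup = 'missing' in flags

probe = flags.get('other')            # .get does not trigger the factory

print(independent, was_set, stored_by_lookup, probe)
```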

------
kriro
Pretty cool, bookmarked. I always enjoy different forms of presenting the
info, playbook was a nice touch :)

------
manojarcom
my Bitdefender is blocking the site; says it's insecure.

------
eagle9
nice! very useful, thanks for putting this together.

------
nu2ycombinator
This is ... wait for it.. Legendary. :)

------
blt
Rocking it out! Awesome! Getting down! Make it happen!

