Python 201: namedtuple

dalke · on March 15, 2016

"I have seen some programmers use it like a struct."

I think namedtuple is entirely too tempting to use for that purpose, given that it's the wrong solution. I don't want my struct to be iterable or indexable, and I think len(my_struct) should fail.

The namedtuple is a great solution for when you have a tuple API (like the old output from os.stat() or time.gmtime()) and want to transition to an attribute API.
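As a small illustration of that transition, time.gmtime() still answers to both styles. (Strictly speaking it returns a struct_time rather than a collections.namedtuple, but the idea is the same.) The get_size() function and the Size type below are hypothetical, made up only to show the same migration with namedtuple itself:

```python
import time
from collections import namedtuple

# The Unix epoch: 1970-01-01 00:00:00 UTC.
t = time.gmtime(0)

year_by_index = t[0]       # old tuple-style access
year_by_name = t.tm_year   # newer attribute-style access
year, month, day = t[:3]   # slicing and unpacking still work

# Hypothetical legacy function that used to return a bare (width, height)
# tuple. Returning a namedtuple keeps old callers working while letting
# new callers use names.
Size = namedtuple('Size', ['width', 'height'])

def get_size():
    return Size(640, 480)

width, height = get_size()   # old caller: tuple unpacking
size = get_size()            # new caller: attribute access
assert (size.width, size.height) == (width, height)
```

Here both access styles are a feature, because existing callers depend on the tuple behaviour.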

I would prefer a "namedstruct" for a tuple-less version of the same API, to help avoid the attractive nuisance of namedtuple repurposed as a struct creator.
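To make the idea concrete, here is a minimal sketch of what such a factory might look like. Nothing called namedstruct exists in the standard library; the name, the keyword-only constructor, and the use of __slots__ are all assumptions made for illustration, not a proposal for a real API:

```python
def namedstruct(typename, field_names):
    """Hypothetical helper: a class with named fields and no tuple API."""
    field_names = tuple(field_names)

    def __init__(self, **kwargs):
        # Require exactly the declared fields, passed by keyword.
        if set(kwargs) != set(field_names):
            raise TypeError('%s() expects exactly the fields %r'
                            % (typename, field_names))
        for name in field_names:
            setattr(self, name, kwargs[name])

    def __repr__(self):
        args = ', '.join('%s=%r' % (name, getattr(self, name))
                         for name in field_names)
        return '%s(%s)' % (typename, args)

    # __slots__ blocks attributes that were not declared as fields.
    namespace = {'__slots__': field_names,
                 '__init__': __init__,
                 '__repr__': __repr__}
    return type(typename, (object,), namespace)


Point = namedstruct('Point', ['x', 'y'])
p = Point(x=1, y=2)
p.x = 10   # fields stay writable, unlike a namedtuple's

# None of the tuple operations are defined, so each raises TypeError.
for operation in (len, iter, lambda obj: obj[0]):
    try:
        operation(p)
    except TypeError:
        pass
    else:
        raise AssertionError('unexpected tuple-like behaviour')
```

The instance still gets a namedtuple-style repr(), but len(p), iter(p) and p[0] all fail, which is the behaviour I want from a struct.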

"The other day I was looking for a way to convert a Python dictionary into an object and I came across some code that did something like this:"

   >>> from collections import namedtuple
   >>> Parts = {'id_num': '1234', 'desc': 'Ford Engine',
   ...          'cost': 1200.00, 'amount': 10}
   >>> parts = namedtuple('Parts', Parts.keys())(**Parts)

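(As a minor tweak, that last line can use 'namedtuple('Parts', Parts)(**Parts)'. namedtuple accepts any iterable of field names, and iterating over a dictionary yields its keys, so the .keys() call is unnecessary.)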

I'll point out that the namedtuple-less alternative is a class like this:

   class PartsClass(object):
       def __init__(self, d):
           # Copy every dictionary key onto the instance as an attribute.
           self.__dict__.update(d)

   parts = PartsClass(Parts)

There are some important differences: 0) my version can't be done inline, 1) my PartsClass doesn't have the useful repr() of a namedtuple, 2) my PartsClass attributes can be modified while a namedtuple's attributes are read-only, and 3) the inline namedtuple is dreadfully slow, over 600x slower than making a class instance, because every call builds a brand-new class:

    >>> import timeit
    >>> timeit.timeit(number=1000, stmt=lambda: namedtuple('Parts', Parts)(**Parts))
    1.4882779121398926
    >>> timeit.timeit(number=1000, stmt=lambda: PartsClass(Parts))
    0.0022149085998535156

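4) hoisting the Parts namedtuple creation out of the timed call, so that only instance creation is measured, results in code that is about 8% faster than creating a class instance (these runs use timeit's default of 1,000,000 iterations rather than the 1,000 used above):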

    >>> timeit.timeit(stmt=lambda: PartsClass(Parts))
    1.1018929481506348
    >>> PartsTuple = namedtuple('Parts', Parts)
    >>> timeit.timeit(stmt=lambda: PartsTuple(**Parts))
    1.0174322128295898

5) the tuple API requires the ** notation, while the class can take the dictionary directly, and 6) object instance attribute lookup is faster than a namedtuple property lookup:

    >>> parts_instance = PartsClass(Parts)
    >>> timeit.timeit(stmt=lambda: parts_instance.desc)
    0.19052886962890625
    >>> parts_tuple = PartsTuple(**Parts)
    >>> timeit.timeit(stmt=lambda: parts_tuple.desc)
    0.3053739070892334
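Going back to points 1 and 2 for a moment, the repr() and read-only differences can be checked directly. This is only a sketch that repeats the definitions from above so it runs on its own; the helper variable names (tuple_repr, tuple_is_writable and so on) are mine:

```python
from collections import namedtuple

Parts = {'id_num': '1234', 'desc': 'Ford Engine',
         'cost': 1200.00, 'amount': 10}

class PartsClass(object):
    def __init__(self, d):
        self.__dict__.update(d)

PartsTuple = namedtuple('Parts', Parts)
parts_tuple = PartsTuple(**Parts)
parts_instance = PartsClass(Parts)

# 1) The namedtuple repr lists every field; the default object repr
#    only gives the class name and a memory address.
tuple_repr = repr(parts_tuple)
instance_repr = repr(parts_instance)

# 2) Instance attributes can be rebound; namedtuple fields cannot.
parts_instance.cost = 999.0
try:
    parts_tuple.cost = 999.0
    tuple_is_writable = True
except AttributeError:
    tuple_is_writable = False

# The tuple behaviour comes along whether you want it or not.
field_count = len(parts_tuple)
```

If you do need a changed copy of a namedtuple, its _replace() method returns a new instance with the given fields swapped out.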

If there are even two attribute lookups per object then it's faster to use a class instance (1.10 + 2×0.19 = 1.48) than a namedtuple (1.01 + 2×0.305 = 1.62).
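The same arithmetic can be written out for any number of lookups. The sketch below just plugs in the rounded timings from the runs above (seconds per million calls, so roughly microseconds per call); the total_cost() helper is a name I made up for the calculation:

```python
# Rounded timings from the timeit runs above, in seconds per 1,000,000 calls.
CLASS_CREATE, CLASS_LOOKUP = 1.10, 0.19
TUPLE_CREATE, TUPLE_LOOKUP = 1.01, 0.305

def total_cost(create, lookup, n_lookups):
    """Cost of creating one object and reading n_lookups attributes from it."""
    return create + n_lookups * lookup

for n in range(4):
    class_cost = total_cost(CLASS_CREATE, CLASS_LOOKUP, n)
    tuple_cost = total_cost(TUPLE_CREATE, TUPLE_LOOKUP, n)
    winner = 'class' if class_cost < tuple_cost else 'namedtuple'
    print(n, round(class_cost, 3), round(tuple_cost, 3), winner)

# Number of lookups at which the two approaches cost the same.
break_even = (CLASS_CREATE - TUPLE_CREATE) / (TUPLE_LOOKUP - CLASS_LOOKUP)
```

On these numbers the break-even point is a bit under one lookup, so the class instance already comes out ahead at a single attribute access (1.29 vs. 1.315). The timings are from one machine and one Python version, so treat the exact crossover as indicative only.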