
Python dicts are now ordered - signa11
https://softwaremaniacs.org/blog/2020/02/05/dicts-ordered/en/
======
BiteCode_dev
I live in the Python bubble, so I haven't realized until this post that so few
people knew about that.

This post is massively popular despite talking about a feature we had since
Python 3.6, in 2016, that was posted on HN at the time and that is featured in
most popular tutorials.

A good reminder that most of the world doesn't revolve around my favorite
language. And that information is not that fast to spread.

~~~
rplnt
> about a feature we had since Python 3.6, in 2016

No, you had that feature in one implementation. Now it's in the language
specification. That's vastly different because only now you can rely on it
without fearing it can go away with the next release.

~~~
BiteCode_dev
First, CPython is 99% of Python deployments. So much so that if a script works on
it but not on another implementation, people often consider the latter broken.

As this post proves that most people don't even know about this feature, you
can be pretty sure the vast majority of people don't know about pypy,
micropython, etc.

Secondly, even if you want to nitpick, Python 3.7 made it official more than
one year ago. We are currently in the 3.9 alpha.

In fact, 3.7 didn't touch the implementation, just merely declared "yep, good
idea, let's keep it that way".
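
For anyone who has not seen the behaviour in question, a minimal sketch of what the 3.7 guarantee covers might look like this (the keys and variable names are made up for illustration):

```python
# Plain dicts keep insertion order (a language guarantee from Python 3.7).
d = {}
d["banana"] = 3
d["apple"] = 1
d["cherry"] = 2
order_after_insert = list(d)

# Updating an existing key leaves its position unchanged.
d["banana"] = 30
order_after_update = list(d)

# Deleting a key and inserting it again moves it to the end.
del d["banana"]
d["banana"] = 3
order_after_reinsert = list(d)

print(order_after_insert, order_after_update, order_after_reinsert)
```

Note that the order is insertion order, not sorted key order.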

~~~
dr_zoidberg
There's a great talk by David Beazley[0], whose title I don't remember
right now, where he bashes all previous Python versions except the latest at
the time[1], which was 3.6. And he says "since I'm not a core developer, I can
tell you this: rely on dicts being ordered so much that eventually they'll
make it official". Guess he "won" in the end ;)

[0] Well, he usually gives great talks anyway

[1] Alright, he literally started many talks doing that

~~~
crashbunny
Raymond Hettinger does or did a few talks on dictionaries too. He covers the
history of dictionaries in python and how they evolved. It was mentioned in
the article's comment section.

[https://www.youtube.com/watch?v=p33CVV29OG8](https://www.youtube.com/watch?v=p33CVV29OG8)

------
skrebbel
This is awesome, because the ordered map is the best data structure out there
for easy & predictable programming, possibly only barring the array.

There's so many cases where it's a benefit for map entries to retain order
(and none where it's a problem). PHP really got this one right (and
immediately messed it up by mixing ordered maps with arrays into a big soup,
but hey, PHP). And so did JS, sorta-kinda-by-accident.¹

When I went from PHP to Ruby back in the day, Hashes not being ordered
definitely was my biggest gotcha. Everything about Ruby (except the
deployment) was nicer, but the hashes... I've spent serious amounts of extra
thinking just to make code work that would've worked out of the box in PHP.

Yay for ordered maps! Every language should have them, they're the best!
There's just so much ergonomics packed in such a simple API.

1) JS objects are ordered maps, _iff_ the keys are strings that cannot be
parsed as an integer (really) (yeah it's nuts) (but still better than
unordered maps!!)

~~~
tene
Could you explain some examples of what this is useful for? What sort of
algorithms or operations do you have in mind where you both want insertion
order, and also key-based lookup in the same data structure?

I've made heavy use of all kinds of maps, and of queues and channels and
arrays, but I don't recall ever noticing a situation where I wanted the
properties of both mixed into the same data structure.

I'd love to learn more about useful tools to add to my toolbox!

~~~
mehrdadn
For "algorithms", I mean, some really do care about insertion order. Like idk,
if you have a priority queue, then you generally want FIFO ordering when the
priority is the same? It seems like a pretty obvious desire in most cases... imagine
a thread scheduler or a packet scheduler or what have you.

But generally it's about determinism and avoiding loss of information, not
just whether a particular algorithm needs it. For example, you'd want
serialize(obj) and serialize(deserialize(serialize(obj))) to produce the same
output, otherwise you e.g. might not be able to cache stuff. But for a data
structure like a hashtable, it's pretty tough (not logically impossible, but
rather pointlessly difficult) to make that happen without preserving insertion
order.

As another example, it's incredibly handy for a user to see items in the order
in which they were inserted. Like say you're parsing command-line arguments,
and the command is ./foo --x=y --w. If the user sees {x: y, w: None} then that
tells them w was passed after x. That can be extremely useful for debugging;
e.g. maybe you expected the caller to specify w earlier, in a different
context and for an entirely different reason. Seeing that it came afterward
immediately tells you something is wrong. But when such information is lost
it's harder to debug code.
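
The round-trip property above can be sketched with the standard json module. The `serialize` and `deserialize` helpers are hypothetical wrappers named after the ones in the comment, and the data is the command-line example from the same comment:

```python
import json

def serialize(obj):
    # json.dumps writes keys in dict iteration order, i.e. insertion order.
    return json.dumps(obj)

def deserialize(text):
    # json.loads builds plain dicts, inserting keys in document order.
    return json.loads(text)

args = {"x": "y", "w": None}   # parsed from: ./foo --x=y --w
once = serialize(args)
twice = serialize(deserialize(once))

print(once)
print(once == twice)
```

With an unordered map in the middle, the second serialization could legitimately come out with the keys swapped, which is what breaks caching on the serialized form.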

~~~
dTal
If --w can be specified in two places for "an entirely different reason", then
an unordered data structure is simply inappropriate, full stop. That's not a
question of debugging, that's a question of correctness.

------
jkbbwr
Am I the only one that thinks this is a stupid decision? This will silently
break code that starts to rely on this behaviour that gets executed on
Python 3.5 and lower. I would consider changing how a builtin works to be a
major breaking change. It would have been fine if this was a change between 2
and 3 but on a minor version? That's insane.

~~~
setr
I'm not clear on how this breaks existing code.

Code that assumed it was arbitrary, would expect to handle any arbitrary
order, including a happens-to-be sorted order.

Code that assumed it was random, like actually inserted by random(), was
already broken, because that simply isn't the case.

Code that assumed the order would stay constant was relying on implementation-
specific behavior, and could potentially break on any version update; as with
any reliance on implementation-specific behavior, you'd break if the
dictionary code ever got touched -- even if it were for a bugfix.

Code that ordered the dictionary keys before iterating is now slightly
inefficient due to the extra work of sorting a sorted list.

~~~
biddlesby
It doesn’t break existing code. Code written for Python 3.7 might break on
older versions of Python

~~~
setr
You're right, I read the gp too quickly.

But in the case of downgrading, I'm fairly sure there's a number of other
breaking changes that can't trivially downgrade minor versions. Like f-strings
were only introduced in Python 3.6 as I recall. The async keyword only exists as of
3.5 as well I think?

~~~
ses1984
Introducing things is different than changing things.

~~~
setr
Sure, but you can't safely take everything from a higher version to a lower
version in any case; if insertion order became guaranteed due to a bugfix, and
wasn't backported, you'd be in the same boat.

The only way to consistently code cross-version is to start with the lowest
you plan to support (assuming the higher versions are actually backwards-
compatible).

Does any language guarantee that code is both backwards and forwards
compatible?

~~~
peteradio
The issue seems to be silent incorrect behavior. What happens if you attempt to
run Python code containing f-strings using an older Python version? Does it
raise an exception? That's good! What happens now if you write code for 3.7
which takes advantage of the new ordering and someone grabs it from your repo
and runs it using 3.2? It would happily give incorrect results and no one is
the wiser.

~~~
visarga
If you expect this situation you can assert the language version.
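
A minimal sketch of such a check, assuming you want it to fail loudly at import time. The function name and message are made up, and it raises explicitly rather than using `assert`, since asserts are skipped under `python -O`:

```python
import sys

MIN_VERSION = (3, 7)  # first release where dict ordering is a language guarantee

def require_ordered_dicts(version_info=sys.version_info):
    """Fail early instead of silently producing misordered results."""
    if tuple(version_info[:2]) < MIN_VERSION:
        # No f-string here, so this file still parses on old interpreters.
        raise RuntimeError("ordered dicts required: run this on Python 3.7+")

require_ordered_dicts()
```

This only helps if the author thinks to add it, of course.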

~~~
drdaeman
But the whole point is that some developer won’t expect that someone would run
their code on an older Python, isn’t it?

------
_aleph2c_
This is an amazing contribution to the language. A mixture of speed and
convenience, probably made by volunteers. As for people criticizing a change
to what used to be a non-deterministic ordering of a dict iteration; I don't
know what to say to them, other than, are you serious? There are people out
there who are working for us, they work for free and they did some heavy
lifting to give us this. They might read what you wrote and think, "Why
bother? Maybe I should spend my weekends playing with my kids instead."

~~~
alayne
Other languages don't generally have special order guarantees about standard
maps. This seems very idiosyncratic.

~~~
_whiteCaps_
Golang map iteration is returned in random order specifically to make sure
that people don't rely on the order.

I think this feature says a lot about the philosophy of Python vs Go.

~~~
MereInterest
Sounds like a fast and idiomatic way to shuffle a deck of cards is then to
convert to a map and back.

~~~
thedirt0115
Convert a deck of cards ([]Card?) to a map (map[Card]bool?) and back just to
shuffle? That's unlikely to be faster or more idiomatic than a straightforward
implementation of the Fisher-Yates shuffle[1]. Try writing the code to do it
both ways and compare.

[1]:
[https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle)
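
For reference, a straightforward Fisher–Yates shuffle is only a few lines. This sketch is in Python rather than Go, to stay in the language of the article, and the card naming is just an illustration:

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Return a new list with the items in uniformly random order."""
    result = list(items)
    # Walk from the last index down, swapping each slot with a randomly
    # chosen slot at or before it.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randrange(i + 1)  # 0 <= j <= i
        result[i], result[j] = result[j], result[i]
    return result

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
deck = [suit + rank for suit in "SHDC" for rank in ranks]
shuffled = fisher_yates_shuffle(deck)
print(shuffled[:5])
```

In real Python code `random.shuffle` already does this in place; the loop is spelled out only to show how little there is to it.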

~~~
kragen
I think "idiomatic" would be using rand.Perm, which implements the
Fisher–Yates shuffle. But aside from whether it's idiomatic,
converting to a map isn't random enough.

~~~
kragen
With Golang 1.11, I get orderings like this. S is spades, H is hearts, D is
diamonds, and C is clubs, because HN eats the Unicode characters I was
actually using.

    
    
         S9  SJ  SK  HQ  D9  S4  S5 S10  HJ  S8  H6  DA  D2
         D4  DQ  C6  C8  CJ  H3  H9  DK  C3  CQ  SA  S2  S3
         S6  SQ  H4 H10  D8 D10  C4  CK  S7  H7  D5  D6  D7
         C7  H2  HK  D3  DJ  C2  C5  C9  HA  H5  H8  CA C10
    

On average you would expect about one card in a shuffled deck to be
(cyclically) followed in the deck by its succeeding card, and on about one out
of 50 shuffles, that card would be followed by its successor. Here we have S4
S5, DA D2, SA S2 S3, and D5 D6 D7.

This is very strong evidence of bias.

I'm interested to hear if my Golang can be made more idiomatic, or if there
are bugs in it:
[http://canonical.org/~kragen/sw/dev3/mapshuffle.go](http://canonical.org/~kragen/sw/dev3/mapshuffle.go)

In particular it seems like there ought to be a less verbose way to express
the equivalent of the Python list({card: True for card in deck}) in Golang.

------
paulhodge
Great decision IMO. I remember maintaining a bunch of Python code that we
supported on both OSX and Windows. Out of all the platform-specific bugs we
had (where it worked on one OS and not the other), one of the most common
causes was code that relied on a certain key order. And we knew that relying
on key order was bad, we're supposed to use things like OrderedDict, blah blah
blah. It was still a really easy mistake to make, just like it's really easy
to have memory safety bugs in C.

At some point, if a human error is common enough, it makes pragmatic sense to
change the design so the error is impossible.

~~~
rob74
The Go designers went the other way (as they often do):

> When iterating over a map with a range loop, the iteration order is not
> specified and is not guaranteed to be the same from one iteration to the
> next.

Actually it is not only "not guaranteed to be the same", the runtime actively
makes sure that the iteration order is actually different so you don't even
start to rely on it...

~~~
tele_ski
That seems like it would be a performance hit to actively make it different?
That seems like a very weird and strange design decision if true

~~~
jerf
It's a very small performance hit because it doesn't do the work to be
_uniformly_ random, just "not always the same". If you think about how you
have to iterate through a hash table or similar data structure anyhow, it's
either O(1) or O(log n) paid once per "range" on the data structure, which is
dwarfed by the actual act of ranging on the data structure.

Go's philosophy is also definitely willing to pay that price to avoid a large
class of known bugs that has hit all kinds of code bases. It is not about
being the fastest language. As compiled languages go, it's solidly middle-
tier, and not likely to go up much from there. (Among the C-style compiled
languages, it's low-tier on performance, at around half the speed of C in
general. However there's enough compiled languages like Haskell that are still
generally relatively slow so that Go is mid-tier for compiled languages over
all.)
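
The "not always the same, but not uniformly random either" idea can be sketched as follows. This is an illustration of the general technique under the assumption that the table is a list of buckets, not a description of what the Go runtime actually does:

```python
import random

def iterate_from_random_offset(buckets, rng=random):
    """Yield every item once, starting at a random bucket and wrapping around."""
    n = len(buckets)
    if n == 0:
        return
    start = rng.randrange(n)  # the only randomisation cost, paid once per loop
    for step in range(n):
        yield from buckets[(start + step) % n]

buckets = [["a"], [], ["b", "c"], ["d"]]
print(list(iterate_from_random_offset(buckets)))
print(list(iterate_from_random_offset(buckets)))
```

Each pass still visits every bucket exactly once, so the per-element cost of iteration is unchanged; only the starting point moves.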

------
zackmorris
I'm glad to see other languages finally catching up to PHP. I'm joking (kind
of) but after a lot of years of doing this, I've begun de-prioritizing pure
abstractions and favoring the way that humans tend to do things on their own.

Technically this is along the lines of the worse-is-better philosophy. The
single biggest cost in software development is friction. Performance, size,
etc are all less important, because they become less important with each
passing year as computers grow more powerful. And along those lines, I think
that silly concepts like assigning letter names to drives in Windows, or
having a paltry few registers in x86, made those architectures more accessible
to the masses than the more generalized Unix and Mac platforms. Not that I
completely agree with that, just, it's all I can come up with (other than cost
and familiarity) for why people are so attached to worse-is-better paradigms.

Preserving insertion order has some small cost associated with it, but that's
going to be dwarfed by the cost of bugs introduced when junior devs are
surprised by unordered dictionaries. And the pedantic people can just use an
unordered dictionary manually since they are already converting pure
abstractions to whatever imperfect implementations are provided by languages
anyway.

~~~
upofadown
>Performance, size, etc are all less important, because they become less
important with each passing year as computers grow more powerful.

That isn't as true as it once was. We are running up against fundamental
limits and people are even starting to talk about the environmental impact of
computing.

~~~
masklinn
It should be noted that the ordering of python dict occurred as a side-effect
of a new implementation with better memory profile (and as good or better
performance).

------
geophile
"Ordered dict" is ambiguous, and from the title I thought that key order was
meant. Reading the article, I see that it's actually insertion order. Which
makes much more sense. Key order would have been a much more significant
change.

~~~
_verandaguy
Not to mention there's often no meaningful way to order by key value since
keys can be any hashable values -- so things like this break this idea:

    
    
        _dict = {}
        _dict.update({MyObject: 3})
        _dict.update({'MyObject': 3})
    

There's no ordering over _most_ hashable values since they span multiple
types, so insertion ordering is the only sane way to do it.
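
To make that concrete, here is a small sketch on Python 3 (`MyObject` is just a placeholder class):

```python
class MyObject:
    pass

d = {}
d.update({MyObject: 3})
d.update({"MyObject": 3})

# Insertion order is always well defined ...
keys_in_order = list(d)

# ... but key order is not: Python 3 refuses to compare a class with a str.
try:
    sorted(d)
    sortable = True
except TypeError:
    sortable = False

print(keys_in_order, sortable)
```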

~~~
masklinn
A sorted map would be tree-based and require its keys to be orderable but not
hashable. For instance Rust's HashMap has its key bound on Hash + Eq, while
BTreeMap's is bound on Ord.

They're different data structures, with different use cases and different
requirements.

> There's no ordering over _most_ hashable values since they span multiple
> types, so insertion ordering is the only sane way to do it.

Python 2 actually had total ordering of all values. The result was usually
stupid but it was there.

~~~
_verandaguy
>The result was usually stupid but it was there

This is what I meant by "sane" in my comment. It's _way_ more meaningful to
the user if data's insertion-ordered.

------
throwawaylolx
Where "now" dates back to 2018?
[https://docs.python.org/3/whatsnew/3.7.html](https://docs.python.org/3/whatsnew/3.7.html)

~~~
throw18374
Came to say the same.. insertion order of dict has been here for years for
90+% of Python users (CPython implementation) and a part of the language spec
also for almost as many years.

I guess it’s good to spread awareness to HN readers who apparently were
unaware, but the headline is very misleading.

~~~
raziel2p
90% seems optimistic. I don't think that many Python projects are running the
most recent version in production. From personal experience, upgrading to 3.7
was especially a pain because of packages that used "async" for variable/kwarg
names.

~~~
throw18374
That’s a very good point, in that case even 50% would probably be optimistic.

We can probably attribute this article’s novelty to the slow adoption of
Python 3. In that way it’s probably a good thing that it’s such a popular
topic, even if it is old news.

------
rcfox
I still see dict ordering as an implementation detail; not a technical one but
a descriptive one. If you want to rely on insertion order, use
collections.OrderedDict. It communicates your intention far better, and there
should be no overhead.

~~~
xapata
OrderedDict is less efficient. It's best to replace usage with basic dict
where possible to improve efficiency. There is one obscure feature of
OrderedDict that isn't in the basic, so it can't always be swapped.

~~~
rcfox
Really? I would have figured they'd just do something like:

    
    
        class OrderedDict(dict):
            pass
    

(Plus a little bit of API shimming.)

~~~
zurtex
No, OrderedDict has its own C implementation which was created just before it
was decided that dict would preserve order across iteration.

Further, there is a big difference: a regular dict preserves order across
iteration, but OrderedDict also takes order into account when testing equality.

I.e. this returns True:

    
    
        {1: 1, 2: 2} == {2: 2, 1: 1}
    

Whereas this returns False:

    
    
        OrderedDict({1: 1, 2: 2}) == OrderedDict({2: 2, 1: 1})
    

To make that difference speedy it needs to be done on the C level.

~~~
masklinn
Also ordereddicts provide methods to move items to the start or end, and
remove items specifically at the start or end, not so for regular dicts.

~~~
xapata
dict has popitem for removing at the end. That used to be arbitrary, but now
it (de facto) means last-inserted.
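
A short sketch of the methods being discussed, using the standard library as documented (the keys are arbitrary):

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
od.move_to_end("a")              # order is now b, c, a
od.move_to_end("c", last=False)  # order is now c, b, a
first = od.popitem(last=False)   # removes from the front
last = od.popitem()              # removes from the back

d = {"a": 1, "b": 2, "c": 3}
newest = d.popitem()             # plain dict: removes the last-inserted pair
# A plain dict has no move_to_end; popping and re-inserting is the usual workaround.
d["a"] = d.pop("a")              # order is now b, a

print(first, last, list(od), newest, list(d))
```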

------
daenz
Cool, but, this just encourages people to rely on hashmaps being ordered by
insertion. The nature of a hashmap is not one where you rely on the ordering.
This is going to trip up newbies who then move to other languages.

Some comments on the post don't seem to understand the changes, for example,
the C++ comment saying that std::map is ordered, so this isn't a strange
change. The difference is that std::map is a red-black tree implementation,
which has a traversal order, not a hashmap with insertion order. So there is
already confusion.

------
japhyr
Brandon Rhodes gave a great talk at PyCon 2017 about the evolution of
dictionaries in Python.

I really appreciate the people who dive deep into these implementation
details, and continue to optimize the languages we all use.

[https://www.youtube.com/watch?v=66P5FMkWoVU](https://www.youtube.com/watch?v=66P5FMkWoVU)

~~~
kstrauser
I usually feel like a reasonably bright person, and then I go to a talk like
this and suddenly remember how terribly much I don't (and will never) know.
There are some very sharp people at play here.

------
taeric
This is a solid argument for why you should have more datatypes available to
you. If your program relies on an insertion order, say so. If anyone ever
moves to another implementation, they will thank you for it.

~~~
duckerude
Ordered dictionaries already existed as a separate type. Much of the gain was
in the use of dicts for core language features.

In particular, Python 3.6 guarantees that the order in which class attributes
are defined is preserved, and that functions which take variadic keyword
arguments receive them in the order in which they were passed. It permits
other implementations to use types other than dict to accomplish that.

Dataclasses make use of ordering of dictionaries in a subtle way. If you
write:

    
    
      class Foo:
          bar: int
          baz: str
    

Then Foo.__annotations__ is a dictionary {'bar': int, 'baz': str}. The
@dataclass decorator transforms that into standard boilerplate, and it needs
to know the order to do that.
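
A minimal sketch of both guarantees, assuming Python 3.7+ (the `collect` helper is made up for illustration):

```python
from dataclasses import dataclass, fields

@dataclass
class Foo:
    bar: int
    baz: str

# Annotations are stored in definition order ...
annotation_order = list(Foo.__annotations__)
# ... and the generated __init__ takes positional arguments in that order.
foo = Foo(1, "one")
field_order = [f.name for f in fields(Foo)]

def collect(**kwargs):
    # Keyword arguments arrive in the order in which they were passed.
    return list(kwargs)

kwarg_order = collect(z=1, a=2, m=3)
print(annotation_order, field_order, kwarg_order)
```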

~~~
taeric
And this is all the more reason using a type would make sense. Curious why
things use ordered? Look at all the places that declare they need ordered. :)

~~~
duckerude
You don't know in advance when ordering might be useful, though.

Class __annotations__ was new in 3.6. For its original use there's no need for
ordering. They were ordered, because dicts were ordered, but only as an
implementation detail.

Dataclasses were added to 3.7, after 3.6's features made a nice syntax
possible. If __annotations__ hadn't happened to be ordered already then I
would guess it wouldn't have been made ordered just for dataclasses -
dataclasses just wouldn't have existed.

Making everything ordered in one go opens up possibilities you haven't even
thought of yet.

~~~
taeric
That argument cuts both ways. You don't know when the benefits of random will
be there...

As soon as you do something that cares about order, state how it is derived.
Sometimes, insertion order is right. Sometimes, not.

Don't get me wrong alists are nice. And order def matters to those. But it is
part of their definition. And reinsertion changes the order in obvious ways.
Not even clear what it does to just "dict".

Now, I will concede this is overblown. Life will easily go on.

------
ebg13
For some definition of "now". Python 3.7 was released on June 27, 2018. So
"now" is "for the past year and a half". Python 3.6, which implemented the
change originally, was released on December 23, 2016. By that measure "now"
actually means "for the past 3 years".

~~~
toyg
The whole Python 3 affair made me realise that a lot of people upgrade their
tools like once every 5-10 years. Which is shocking, in an industry as fast as
this. I’m not saying you should churn your whole build setup every 6 months (I
think release speed for Python is a bit excessive at the moment, for example),
but complaining in 2020 about a feature that was released almost two years ago,
was in development for a year or so before that, _and was widely
publicised_... I mean, c’mon.

------
strenholme
This is a refreshing update for those of us that like having code which will
always act in the same way across multiple invocations.

Now, if only Lua could follow the same path with their “tables” (“tables” is
what Lua programmers call their form of Python’s “dictionaries” and Perl’s
“hashes”).

I just spent eight hours earlier this week debugging Lua code which would run
differently on different invocations of the same code.

The standard way to iterate in a table with Lua is like this:

    
    
      for key, value in pairs(foo) do
    

One problem: The order we get elements from the table “foo” is undefined, and
_it can change between different invocations of the same Lua code, even if the
elements were put in the table in the same order_. In order to fix things so
that we can iterate a table in a consistent manner, this is my fix (public
domain [1], for those who want to copy and paste it):

    
    
      function sorted_table_keys(t)
        local a = {}
        local b = 1
        for k,_ in pairs(t) do -- pairs() use OK; will sort 
          a[b] = k
          b = b + 1
        end
        table.sort(a, function(y,z) return tostring(y) < tostring(z) end)
        return a
      end
    

Then we iterate the table like this:

    
    
      for _, key in ipairs(sorted_table_keys(foo)) do
        local value = foo[key]
    

(In Lua, two dashes indicates a comment.)

Note that this code will not always sort all tables in the same order. If we
have a table with the keys 1 (as a number) and "1" (as a string), iteration
order is still undefined.

The code is open source, and is a procedural (“random”) map generator for Doom
written mainly by Andrew Apted, to which I have added some features and in which
I have fixed some bugs. It’s here:
[https://github.com/samboy/ObHack](https://github.com/samboy/ObHack) and the
issue is here:
[https://github.com/samboy/ObHack/issues/4](https://github.com/samboy/ObHack/issues/4)

[1] The project I added this code to is GPL, but this function, which I wrote
entirely by myself, is one I am donating to the public domain.

~~~
wruza
You can write an iterator yourself, no need to leave ?pairs idiom:

    
    
      function sortpairs(t)
        local keys = { }
        for key in pairs(t) do
          table.insert(keys, key)
        end
        table.sort(keys, function (a, b)
          return tostring(a) < tostring(b)
        end)
        local i = 1
        return function (t)
          local key = keys[i]
          i = i + 1
          if key ~= nil then
            return key, t[key], i-1
          end
        end, t
      end
    
      t = {a=10, c=20, d=30, b=40, 50}
      for k,v,i in sortpairs(t) do
        print(k, v, i)
      end

~~~
strenholme
What Lua is lacking here (and why the above iterator function needs 17 lines)
is the ability to have “for” go through a list ( _without_ converting the list
in to values returned by an iterator function), which would let us quickly and
easily sort lists that “for” can use. Something like:

    
    
      d = {"foo": 2, "bar": 1, "zoo": 4}
      for k in sorted(d.keys()):
        print(k)
    

(I’m not advocating Python here, since Perl has a similar way of using “for”
to go through lists which can _also_ be easily sorted)

However, with Lua, “for” only accepts a numeric range, or an iterator
function, so customizing “for” requires understanding function closures:
Understanding how a function, when called multiple times, stores variables
altered in previous invocations of the function, and understanding how to give
those variables initial values (usually in the “function factory” function
which creates the function we use).

In other words, “for”, in most modern high-level languages, can be one of:

1\. for variable in [something that specifies a numeric range]

2\. for variable in [iterator function]

3\. for variable in [list]

But Lua only has “something that specifies a numeric range” and “iterator
function”; it can not natively go through a list.

~~~
wruza
You can convert a list first and then feed it to a simple iterator. I don’t
fully understand what your exact real-code issues can be, but hope this
snippet may help:

    
    
      function vs(t)
        local i = 0
        return function (t)
          i = i + 1
          return t[i]
        end, t
      end
    
      function sorted(t, cmp)
        table.sort(t, cmp or function (a, b)
          return tostring(a) < tostring(b)
        end)
        return t
      end
    
      function keys(t)
        local keys = { }
        for key in pairs(t) do
          table.insert(keys, key)
        end
        return keys
      end
    
      t = {a=10, c=20, d=30, b=40, 50}
      for k in vs(sorted(keys(t))) do
        print(k, t[k])
      end
    

I.e. if “natively” means strictly “for in t” that generates values, then no,
Lua can’t do that. But if “for in vs(t)” is okay, then that vs() is the
solution.

~~~
strenholme
That looks good, and I think putting these in a prominent place of the Lua
documentation (along with a notice that the code is public domain) would help
us who are used to the AWK/Perl/Python/PHP way of having “for” natively
traverse a list without needing a complicated list-to-iterator function that
uses function closure (i.e. the iterator function remembers the value “i” --
I’m writing this for the lurkers because code like this can be difficult to
follow).

One honest question: Is there any reason why the function factory (i.e. a
function which returns a function) which converts a list (Actually, table with
ascending integer indexes) in to an iterator Lua can use with “for” returns
both the element and the entire table here? Here is the code I am asking
about:

    
    
        return function (t)
          i = i + 1
          return t[i]
        end, t
    

I’m curious why we’re returning both the table element for the iterator and
the entire table.

------
awkward
1) I am personally annoyed as someone who has had to unwind knotty dict soup
python codebases.

2) This will make life easier for thousands of programmers and prevent a
massive number of extremely difficult bugs from hurting users.

------
svnpenn
It seems people don't realize that you can have your cake and eat it too. You
can have both ordered and unordered maps in the same language:

[https://yaml.org/type/map](https://yaml.org/type/map)

[https://yaml.org/type/omap](https://yaml.org/type/omap)

No reason to argue about which one to make "dict". In fact it would be better
to have both because you're taking a performance hit (a significant one) by
ordering the entries.

~~~
lvh
The new dict does not come with a performance hit, let alone a significant
one. It’s much more memory efficient and generally slightly faster.

~~~
svnpenn
pretty strong evidence to the contrary

[https://apps.dtic.mil/dtic/tr/fulltext/u2/a627127.pdf](https://apps.dtic.mil/dtic/tr/fulltext/u2/a627127.pdf)

do you have any references to back your claim that it's actually faster?

~~~
joshuamorton
[https://morepypy.blogspot.com/2015/01/faster-more-memory-
eff...](https://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-
and-more.html)

The new python dict implementation is benchmarked as faster in both
microbenchmarks and in practice for almost all real-world workloads.

Note that this isn't a sorted map, like C++'s "Ordered Map", but an Ordered
map. C++ gets the name wrong. Items aren't ordered by key comparison, but
ordered by insertion time.

~~~
svnpenn
Good link, thanks

------
oefrha
Not sure why this is posted and upvoted to front page now. After all this is a
major bullet point in py37 What’s New, and even py38 has been out for a while.

Anyway, I’ll keep using collections.OrderedDict (except for personal scripts)
until py35 EOL.

~~~
gerikson
Probably because it was posted to lobste.rs and got good traction there:

[https://lobste.rs/s/htcz5f](https://lobste.rs/s/htcz5f)

------
jmilloy
I think this change is great, but this really only becomes news again once all
major LTS are shipping with at least Python 3.7, right? No one can really use
it in code they plan to distribute at the moment. Maybe I am underestimating
the amount of Python code that is meant for internal or personal use only.

~~~
sandgiant
We've been running FROM python:3.7 in production for months. Works just fine.
;)

------
raymondh
This 37 minute Pycon video covers the essential details of how Python
dictionaries are implemented:
[https://www.youtube.com/watch?v=npw4s1QTmPg&t=3s](https://www.youtube.com/watch?v=npw4s1QTmPg&t=3s)

------
ummonk
Wait, if it is held in a separate dense array, is removal of a key from a
dictionary O(N)?

~~~
Rapzid
You can use memcpy to remove items from the middle of a memory segment. It's
probably not a single memory segment either; I imagine it works more like a
Golang slice.

~~~
ummonk
Memcpy is certainly O(N). But yeah, if they implement it with multiple slices
(e.g. a B tree) it could be fast.

~~~
Rapzid
Hrm, I'm pretty sure I meant memmove:
[https://golang.org/src/runtime/slice.go](https://golang.org/src/runtime/slice.go)

EDIT: I could have sworn they were shifting entire pages around without using
a cycle per byte but I can't find any reference to that now haha.
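
On the original question: a compact dict does not have to shift anything on removal. The toy model below is a sketch under the assumption that a deleted entry is simply left as a hole in the dense array and marked with a tombstone in the index table, with holes squeezed out on resize. Resizing is omitted, the class and method names are invented, and the real CPython code should be consulted before relying on any detail:

```python
class CompactDict:
    """Toy model: a sparse index table pointing into a dense entries list."""

    _FREE, _DUMMY = -1, -2

    def __init__(self, size=8):
        self._indices = [self._FREE] * size  # sparse table of positions
        self._entries = []                   # dense, in insertion order

    def _find(self, key):
        # Linear probing over the index table; returns (slot, position).
        n = len(self._indices)
        slot = hash(key) % n
        for _ in range(n):
            pos = self._indices[slot]
            if pos == self._FREE:
                return slot, None
            if pos != self._DUMMY and self._entries[pos][0] == key:
                return slot, pos
            slot = (slot + 1) % n
        raise RuntimeError("no free slot; resizing is omitted in this sketch")

    def __setitem__(self, key, value):
        slot, pos = self._find(key)
        if pos is not None:
            self._entries[pos] = (key, value)  # update keeps the position
            return
        if len(self._entries) + 1 >= len(self._indices):
            raise RuntimeError("table full; resizing is omitted in this sketch")
        self._indices[slot] = len(self._entries)
        self._entries.append((key, value))

    def __getitem__(self, key):
        _, pos = self._find(key)
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][1]

    def __delitem__(self, key):
        slot, pos = self._find(key)
        if pos is None:
            raise KeyError(key)
        self._indices[slot] = self._DUMMY  # tombstone in the index table
        self._entries[pos] = None          # hole in the dense array, no shifting

    def __iter__(self):
        return (entry[0] for entry in self._entries if entry is not None)


cd = CompactDict()
cd["x"], cd["y"], cd["z"] = 1, 2, 3
del cd["y"]
cd["y"] = 4
print(list(cd))
```

Deletion touches one index slot and one entry, so it costs the same as a lookup; the price is that iteration has to skip holes until the next resize.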

------
jillesvangurp
Java has many types of Map, Set, and List implementations. And that's just in
the core library. Here are some Map implementations that come with Java:
TreeMap, HashMap, LinkedHashMap, ConcurrentHashMap, EnumMap. There are
many more and they each have their own uses, features, and performance/memory
characteristics.

I always wondered why languages like ruby, javascript, and python don't
include a bit more choice on this front. It seems like sets are still somewhat
of a novelty for javascript developers and people just wing it with half-assed
solutions involving stupid O(N) contains operations.

------
divbzero
The historical background for those interested…

The ordered dict for Python was originally an implementation detail, part of a
more compact dict representation that was first proposed for Python in 2012
[1], implemented for PyPy in 2015 [2], and merged into CPython 3.6 in 2016 [3]
[4].

[1]: [https://mail.python.org/pipermail/python-
dev/2012-December/1...](https://mail.python.org/pipermail/python-
dev/2012-December/123028.html)

[2]: [https://morepypy.blogspot.com/2015/01/faster-more-memory-
eff...](https://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-
and-more.html)

[3]: [https://mail.python.org/pipermail/python-
dev/2016-September/...](https://mail.python.org/pipermail/python-
dev/2016-September/146327.html)

[4]:
[https://news.ycombinator.com/item?id=12460936](https://news.ycombinator.com/item?id=12460936)

The decision to add it officially to the language spec wasn’t made until
Python 3.7 in 2017. [5]

[5]: [https://mail.python.org/pipermail/python-
dev/2017-December/1...](https://mail.python.org/pipermail/python-
dev/2017-December/151283.html) "Guido says ‘Make it so.’"

------
jnwatson
It sure is annoying seeing people complain about a well-received feature
implemented 3 years ago that has already transitioned smoothly.

------
diN0bot
question: does the following expression still evaluate to True with ordered
dicts?

    
    
        {'a':1, 'b':2}  ==  {'b':2, 'a':1}

~~~
ebg13
Yes. Dict comparison doesn't rely on iteration order.
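
A quick way to check this, as a minimal sketch assuming Python 3.7+ (the variable names are only illustrative). It also shows the one place where order does affect equality, namely comparing two `collections.OrderedDict` instances:

```python
from collections import OrderedDict

a = {'a': 1, 'b': 2}
b = {'b': 2, 'a': 1}

# Plain dicts compare by their key/value pairs; insertion order is ignored.
same_contents = (a == b)

# Iteration order still differs, because each dict remembers its own
# insertion order.
same_order = (list(a) == list(b))

# Two OrderedDicts compare order-sensitively.
ordered_equal = (OrderedDict(a) == OrderedDict(b))

# An OrderedDict compared with a plain dict is order-insensitive again.
mixed_equal = (OrderedDict(a) == b)

print(same_contents, same_order, ordered_equal, mixed_equal)
```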

~~~
diN0bot
thanks :-)

i mean, i would hope not if it's an implementation detail and not a change in
the abstraction itself... feels a bit like it is breaking the abstraction,
though.

~~~
ebg13
The definition of a dict hasn't changed. It's still defined as a set of
key:value pairs. It just happens to also have a particular guarantee now that
iterating over the keys will return them in insertion order.

------
yason
Funny, I've never expected or needed dict to be ordered. I've known about
collections.OrderedDict but never really found much use for it. A dictionary
is a hashtable and the order generally isn't important for typical uses of a
hashtable. So, what is it that people are putting in dicts where the order of
insertion actually matters?

------
lunias
Pretty important to note that it's "insertion order". I immediately thought,
ordered by what comparator function?
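
To make the distinction concrete, here is a small sketch (assuming Python 3.7+; the keys are made up for illustration) contrasting insertion order with sorted order, and showing what happens to a key's position on update versus delete-and-reinsert:

```python
d = {}
d['banana'] = 3
d['apple'] = 1
d['cherry'] = 2

insertion_order = list(d)   # the order in which keys were first inserted
sorted_order = sorted(d)    # sorting by key is a separate, explicit step

# Updating an existing key keeps its original position.
d['banana'] = 30
after_update = list(d)

# Deleting and re-inserting a key moves it to the end.
del d['banana']
d['banana'] = 300
after_reinsert = list(d)

print(insertion_order, sorted_order, after_update, after_reinsert)
```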

------
jxramos
A great talk behind the redesign which achieved this can be found here:

BayPIGgies at LinkedIn June 2017: "Techniques for Design Reviews" by Raymond
Hettinger
[https://www.youtube.com/watch?v=cNqJDRsefg8](https://www.youtube.com/watch?v=cNqJDRsefg8)

There's a lengthy lead-in, but it's worth listening to.

------
faraggi
I've always thought this was a bit annoying in Python, good to know a (very)
small itch is now scratched.

~~~
Recursing
It's been this way since python3.6 in 2016, so you really only have to worry
about very old versions

------
c-smile
Same thing as in JavaScript: [https://www.stefanjudis.com/today-i-
learned/property-order-i...](https://www.stefanjudis.com/today-i-
learned/property-order-is-predictable-in-javascript-objects-since-es2015/)

------
Sami_Lehtinen
Unfortunately ordered dict doesn't provide index access. In many cases, it's
still required to maintain parallel list + dict approach, even if dictionary
is ordered. List for index access and dict for fast look ups and containing
the data.
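
A rough sketch of the two options, assuming Python 3.7+ (`nth_key` is a hypothetical helper, not a standard-library function). Walking the dict gives positional access but costs O(n); the parallel list gives O(1) indexing at the price of keeping two structures in sync:

```python
from itertools import islice

d = {'x': 10, 'y': 20, 'z': 30}

def nth_key(mapping, n):
    """Return the n-th key in iteration order.

    This walks the mapping, so it is O(n), not O(1). It raises
    StopIteration if n is out of range.
    """
    return next(islice(mapping, n, None))

second = nth_key(d, 1)

# Parallel-list approach: O(1) positional access, but the list must be
# updated whenever keys are added to or removed from the dict.
keys = list(d)
second_via_list = keys[1]
value_via_list = d[keys[1]]

print(second, second_via_list, value_via_list)
```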

------
stevefan1999
But how does an ordered map (a self-balancing binary search tree) handle
hash-only keys? I mean, you can compare character strings or numbers
lexicographically, but not class instances, right?

------
leksak
It'd be nice if using collections.OrderedDict could elicit a warning. Is there
a tool like flake8/autoflake for flagging things that used to be Pythonic but
no longer are?
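
As a rough idea of what such a check could look like, here is a sketch built on the standard `ast` module (`find_ordereddict_uses` is a hypothetical name, not an existing flake8 plugin). Note that `OrderedDict` still has legitimate uses, such as `move_to_end` and order-sensitive equality, so a real tool would probably only warn:

```python
import ast

def find_ordereddict_uses(source):
    """Return the sorted line numbers where OrderedDict is imported or referenced."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == "OrderedDict":
            lines.append(node.lineno)            # bare name: OrderedDict(...)
        elif isinstance(node, ast.Attribute) and node.attr == "OrderedDict":
            lines.append(node.lineno)            # collections.OrderedDict(...)
        elif isinstance(node, ast.ImportFrom) and node.module == "collections":
            if any(alias.name == "OrderedDict" for alias in node.names):
                lines.append(node.lineno)        # from collections import OrderedDict
    return sorted(lines)

sample = "from collections import OrderedDict\nd = OrderedDict()\n"
print(find_ordereddict_uses(sample))
```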

------
6gvONxR4sf7o
I'd love to know more about where an ordered dict comes in handy. Anyone have
use cases? Otherwise, guaranteeing this behavior just seems ¯\\_(ツ)_/¯

If this is useful, I wonder if an ordered set is useful.

~~~
masklinn
> If this is useful, I wonder if an ordered set is useful.

An ordered set is a unique list. It's useful. However it's not present in
Python, dicts and sets have separate implementations and sets were _not_ moved
over to the ordered implementation (because the ordering was initially a side-
effect of a change in implementation which was not considered useful or
advantageous for sets).
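
A common workaround, sketched here under the assumption of Python 3.7+, is to use a dict's keys as the ordered set (the sample data is made up):

```python
items = ['b', 'a', 'b', 'c', 'a']

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
ordered_unique = dict.fromkeys(items)

unique_in_order = list(ordered_unique)
has_c = 'c' in ordered_unique   # membership test is a hash lookup

# A built-in set removes duplicates too, but makes no promise about order.
unordered = set(items)

print(unique_in_order, has_c, len(unordered))
```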

~~~
6gvONxR4sf7o
An ordered multiset is a list, not an ordered set.

------
deathanatos
How does one have insertion order and maintain O(1) deletions?
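
One way to get both, shown as a toy sketch rather than CPython's actual code: keep entries in a dense list, replace a deleted entry with a tombstone instead of shifting its neighbours, and compact only occasionally so the cost is amortized. The class name and the compaction threshold are assumptions, and a built-in dict stands in for the sparse hash index table:

```python
class CompactDictSketch:
    """Toy insertion-ordered map: a dense entry list plus an index into it."""

    _DELETED = object()  # tombstone marker left behind by deletions

    def __init__(self):
        self._entries = []  # dense list of [key, value] in insertion order
        self._index = {}    # key -> position in _entries (stand-in for the hash index)

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is None:
            self._index[key] = len(self._entries)
            self._entries.append([key, value])
        else:
            self._entries[pos][1] = value  # update in place, position unchanged

    def __getitem__(self, key):
        return self._entries[self._index[key]][1]

    def __delitem__(self, key):
        pos = self._index.pop(key)          # raises KeyError if the key is missing
        self._entries[pos] = self._DELETED  # O(1): later entries are not shifted
        # Rebuild once tombstones clearly outnumber live entries (arbitrary threshold).
        if len(self._entries) > 2 * len(self._index) + 8:
            self._compact()

    def _compact(self):
        live = [e for e in self._entries if e is not self._DELETED]
        self._entries = live
        self._index = {key: i for i, (key, _) in enumerate(live)}

    def __iter__(self):
        return (e[0] for e in self._entries if e is not self._DELETED)

    def __len__(self):
        return len(self._index)


m = CompactDictSketch()
for k in 'abcde':
    m[k] = ord(k)
del m['b']
del m['d']
print(list(m))
```

Iteration skips tombstones, so a dict that has seen many deletions pays a little extra per pass until the next compaction.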

------
TimMurnaghan
From TFA "... it brings Python on par with PHP"

Faint praise indeed

------
dfox
I had firsthand experience with this otherwise useful feature masking an
obvious bug. In one codebase there was a function along the lines of (for
calling into MSSQL with its "peculiar" stored procedures):

    
    
        def call_stored_procedure(name, **args):
            cursor.call_proc(name, *args.values())
    

Of course this is obviously wrong and on Python 3.6 this will break horribly
on first use, while in Python 3.7 this worked as long as the keyword arguments
were in the correct order.

~~~
doubleunplussed
On 3.6 it will work, it's just relying on an implementation detail (dicts are
ordered in 3.6, it's just that it is considered an implementation detail, not
a guarantee)
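
For what it's worth, the order of `**kwargs` itself has been guaranteed since Python 3.6 (PEP 468), so the snippet above passes values in whatever order the caller wrote them. A sketch of a version that does not depend on call order, assuming the procedure's parameter names are known up front (`ordered_values` and `param_names` are hypothetical, not part of any database driver):

```python
def collect(**kwargs):
    """Return keyword argument names in the order the caller wrote them."""
    return list(kwargs)

caller_order = collect(b=2, a=1)

def ordered_values(param_names, **kwargs):
    """Return the values in the declared parameter order, ignoring call order."""
    missing = [name for name in param_names if name not in kwargs]
    if missing:
        raise TypeError(f"missing arguments: {missing}")
    return [kwargs[name] for name in param_names]

values = ordered_values(['user_id', 'amount'], amount=5, user_id=42)
print(caller_order, values)
```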

------
dekhn
A bit of a side topic, but when Judge Alsup (himself a descendant of Boole and
with a middle name of Haskell, and a programmer to boot) said "It is so
ordered" ([https://casetext.com/case/oracle-america-inc-v-
google-8](https://casetext.com/case/oracle-america-inc-v-google-8)) in the
Oracle Google case, I could only think that's what a computer scientist would
say when quicksort was done.

------
peter_retief
Must say I am glad about that. It confused me when I first discovered this and
was told it was not a bug

------
dehrmann
I've written a lot of Python, but more Java. This is where I have a gripe with
"batteries included." In Java, I'd have to think slightly about this, then use
a LinkedHashMap. It's been in Java since 2002. It also has a Set flavor.
Python just doesn't have as rich of a collection of included data structures,
and the APIs are more limited.

~~~
masklinn
What java calls linkedhashmap, Python calls ordereddict.

It's not quite as old as java's linkedhashmap but is no spring chicken either
(it's a bit above 10 years old).

~~~
dehrmann
Where's orderedset, though?

~~~
masklinn
Nowhere because so far the core team has remained unconvinced. People have
brought up replacing standard set implementation by a variant of dict's (even
expressing surprise that that wasn't already the case) but no dice yet.

There are discussions on the subject on python-dev once in a while e.g.
[https://mail.python.org/pipermail/python-
dev/2019-February/1...](https://mail.python.org/pipermail/python-
dev/2019-February/156466.html)

------
amelius
Title should say what "ordered" means. Is it by key, or by insertion-order?

------
Grimm1
This has been true since 3.7. Why is this such a seemingly major revelation on
HN?

------
ibic
So isn't that a backward-incompatible change for versions prior to 3.7?

------
codesnik
Ruby introduced a similar change in 1.9 back in 2007, and we're fine.

------
cocoa19
Unpopular opinion: I'd rather have classes that have precise meanings such as
in Java. ArrayList, HashMap, TreeMap.

Having classes with ambiguous names makes it seem like a toy language.

------
neycoda
You ordered snake dicks?!

------
dmitriid
Javascript taught me to never assume any ordering in a dict/map/hashmap
(whatever the term)

~~~
pfdietz
Is the ordering there at least fixed, or can it vary from run to run even on
the same data, or between implementations or versions of the same
implementation?

EDIT: to clarify, I was asking about JavaScript there.

~~~
nobleach
It is deterministic, but it's an odd rule-set. For example: `var o = { 3:
'foo', 1: 'bar', b: 'baz', a: 'quux' }` Will yield: `{1: "bar", 3: "foo", b:
"baz", a: "quux"}`.

Numeric keys get sorted. String keys are insertion-order. If key order is a
priority, a Map should be used instead. Often when JavaScript's objects are
used and a particular sort order is required, an accompanying (sorted) array
is used.

~~~
iiodiiod
(deleted--missed context)

~~~
nobleach
Go do `var o = { 3: 'abc' }` and then `o[1] = 'def';` Now tell me the output
of `o`. (Doing this in your console should suffice). Is that insertion order?

