

Ask HN: Why do I always want to use an ordered dictionary? - noaharc

I routinely find myself yearning for this data structure in almost every project, but I've scoured the web, and inevitably I just find forum postings saying something along the lines of "dictionaries are unordered, idiot".  I usually end up just making a list and dictionary, but it strikes me as inelegant.  And, if it were truly a common problem, I can't believe someone hasn't made this data structure before.  So I must be doing something wrong in the way I'm thinking about these problems.  Any idea what it is?
======
hbien
Here's a library for it in Python:
<http://www.voidspace.org.uk/python/odict.html>

~~~
noaharc
This looks great. Thank you so much!

~~~
noaharc
For posterity, here is a good discussion about the existence (or lack thereof)
in Python:
[http://groups.google.com/group/comp.lang.python/browse_frm/t...](http://groups.google.com/group/comp.lang.python/browse_frm/thread/5ac716f9ad14cc04).

------
gills
I usually want that when what I really need is ordering but for some reason I
want key-based/random access to the values.

I think Ruby added this recently. I'm not using it yet. A crappy hack can just
dump the next/prev key into the value, plus a pointer to the first, and drop
them all in a hash. It's not really pretty, but would give you ordering if you
want it. Another hack is to store just the keys in a sorted list and keep the
values in the hash (but this is equally inelegant to the list+hash hack and it
has the same maintenance overhead).

I think most of the ordered-hashes are implemented with something like list
pointers between the hash keys, but then again I haven't looked too closely
under that kimono.

Edit: trees are also a good compromise.

------
makecheck
As for whether or not the data structure exists, it does. :) The std::map in
C++ is a sorted dictionary, for instance.

But it can be uncommon to require this, usually for two reasons. One, the data
set may be huge. And two, much of your code may not even "care" if the data is
sorted. In either of those cases, if automatic sorting occurs, a lot of
compute time may be wasted for no net benefit. (A solution is to have data
sorted "on demand" within the code that actually cares about sort order.)

------
lacker
Often, an algorithm that requires iteration in a certain order is a sign that
you are keeping information implicitly in the ordering instead of explicitly
in a more appropriate data structure. You also may be trying to pack lots of
unrelated functionality into a single object when multiple objects would serve
better.

But it's hard to say without better examples, and there are times when you do
want order and lookup. Can you give a couple situations where you want ordered
dictionaries?

~~~
noaharc
Right now I'm working on a large data set where there is a set of players and
each player has a score. I'm using a Python dictionary from player ID to the
player score, but I also want an easy way to grab the top N players from the
set. The players' scores are updated continuously, so the quick mapping from a
player ID to their updated score is also critical (if the scores were constant
I'd just put it in a list, sort it, and be done with it).

~~~
lacker
Hmmm, does an ordered dictionary actually solve this problem? It seems like
you want to retrieve based on player id, but get the top N based on score. An
ordered dictionary would be ideal if you wanted to retrieve and get the top N
based on the same key.

I guess you could use two data structures, with one of them being an ordered
dictionary. But then the ordered dictionary doesn't actually have to support
lookup; you could just use a heap.

------
dhotson
Ruby 1.9 makes Hash ordered by default.

The major reason I care about it is that when loading data from yaml files the
key order doesn't get messed up. In ruby 1.8 the order got messed up.

So yeah.. use ruby 1.9 :)

------
Tichy
Java has TreeMap for that purpose.

------
joe_the_user
Seems totally reasonable. The rise of Ruby shows that modern languages should
optimize for programmer efficiency first and speed second.

Modern scripting languages have several data structures: mostly arrays, hashes
and strings. But it is a hassle move data between them.

The _next_ language should only have one, the ordered dictionary.

x = [], x << "hi" # 1=>"hi", x["seven"]=6 # 1=>"hi", "seven"=>6,

Etc

~~~
ggchappell
Very thought-provoking idea.

So, here I am designing the Next Big Language. And I'll take your suggestion.

Now, first issue: if I say x = "dog", then the value of x is really

1 => 'd', 2 => 'o', 3 => 'g'

Right? And despite the fact that this is an O.D., it should print in a string-
like way: dog. Now suppose I do

x['z'] = (1 => 2, 2 => 3)

Now x is

1 => 'd', 2 => 'o', 3 => 'g', 'z' => (1 => 2, 2 => 3)

How should it print now?

Next issue: If I do y[1] = 'a' and y[3] = 'b', and then y << 'c', what is the
value of y?

Final issue: Do the above questions really matter?

One could argue in the first case that if I want something to print nicely,
then I will probably keep it in a "reasonable" form, and not mix data types in
odd ways. So it really doesn't matter much how the O.D. I came up with prints.

Similarly, in the second case, if I want a collection to work well as a stack
or queue, then I am probably not going to use the bracket operator to do
strange things with it. So it doesn't matter which of the multiple reasonable
options for the behavior of << is chosen.

Or maybe it does matter?

~~~
joe_the_user
Hey, Someone took it seriously!

I would assume that strings and "ordinary" O.D. would differ in their object
type: each would inherit from an abstract O.D. and each would have a different
print/display function (you could have a "hidden" variable specifying this
function if you wished to keep in all concrete).

 _If I do y[1] = 'a' and y[3] = 'b', and then y << 'c', what is the value of
y?_

probably y = { 1=>'a', 3=>'b', 4=>'c'}, though like any good
OO/metaprogramming language, "<<" would be an overidable function.

 _"Do the above questions really matter?"_

I am arguing for the _next_ language and by that token, I haven't answered all
the questions this raises. I think old languages like lisp and new languages
like Ruby prove that new languages can unleash new possibilities, even more
than fifty years after the original computer languages were created.

Thanks for your detailed reply, seriously...

