
Go’s map iteration order is random - zenlikethat
http://nathanleclaire.com/blog/2014/04/27/a-surprising-feature-of-golang-that-colored-me-impressed/
======
blahedo
It's completely unclear from this post whether the poster is a) a student who
doesn't understand hash tables and is using the word "random" in its
colloquial sense of "arbitrary", or b) someone who did know that hash tables
store in a non-obvious order and is using the word "random" in its technical
sense.

Given that the poster says that "you would think" the iteration order has
anything to do with the insertion order, and doesn't show multiple iteration
runs with different results, I'm rather inclined to assume the former. Is
there evidence in the other direction?

~~~
josteink
To play devils advocate here... There's nobody saying a map has to be a _hash_
map. It could be any sort of map.

Look at JavaScript. Those are just plain maps. Why should expecting
sophisticated hashes for simple maps be the default?

~~~
kevingadd
JS Object is actually an _ordered_ map, because they failed to specify whether
it was ordered or not early on, and we ended up stuck with the particular map
implementation IE used (there are now lots of applications out there that
depend on getting back k/v pairs in the order they were added).

~~~
bshimmin
In the ECMAScript language spec, at least as far back as 1999, it quite
specifically does say "unordered".

~~~
magicalist
yes, but kevingadd was (almost certainly) talking about in practice. In fact,
I vaguely recall reading that the V8 team originally wanted to abandon
insertion-order enumeration for performance but couldn't because too much code
broke on the web.

It's not _quite_ insertion order, though, which is one of the issues with just
specifying it in ES6 as it's implemented. If you insert with a numeric key (or
a string that's strictly a number), I believe the JS engines don't agree on
the order in which the numeric key will be enumerated in relation to the
alphanumeric ones (v8 and spidermonkey will put the numeric ones ahead of all
the keys, sorted ascending, then do the rest in insertion order, for
instance).

------
azernik
I think this reflects a lot of the philosophy of Go (and, probably the goals
of Google in supporting it). They didn't just want a language that was
productive and shiny and new, but a language whose features would actively
prevent coders (Google employees in particular) from writing bad code,
sometimes even at the cost of brevity and coder convenience.

So in addition to this feature (actively preventing developers from depending
on a particular hash map implementation), there are:

1\. No implicit type conversions.

2\. No assignment/increment operators as expressions (only as statements).

3\. No unchecked array dereferences (though this is pretty standard in new
languages).

4\. You have to go out of your way to do manual locking.

5\. A canonical whitespace style, and tools for enforcing it.

etc. Some of these are useful if you're careful, but the language assumes that
you're not careful and _will_ mess it up if you try doing something tricky.

This can be annoying, but it also creates this feeling that, if your code
compiles, it doesn't have the same kind of hidden problems that, say C++ or
Java code usually does, which is a big draw. (If I understand correctly, those
are the other two main languages used at Google on the server-side.)

~~~
kyrra
> ...if your code compiles, it doesn't have the same kind of hidden
> problems...

Except that Golang has the equivalent of C void*, or Java's Object. This is an
issue with working with common data structures like Linked Lists or Trees as
Go doesn't have generics. The data section of these in go are all interface{},
so they have to be cast to use the data. This cast can cause unexpected
behavior if you're not careful.

EDIT: example[0] here. This is a linked list. PushBack takes an interface{},
and the ".Value" is also an interface{}. The ".(int)" and ".(string)" are
casts to get the data out. The assignment to 'b' is a runtime error (not
compile time). You can get the type using switch statements or reflection, but
there are no compile-time checks for common data structures.

[0] [http://play.golang.org/p/K6PpnOkOq6](http://play.golang.org/p/K6PpnOkOq6)

~~~
kkowalczyk
You're blaming a language for a bug in your code.

interface{} is nothing like void* because it's type-safe. interface{} is
(type, value) i.e. it remembers the type of the value it contains. void * is
just a value.

Your code snippet has a bug because you didn't check the type assertion
result. Here's a fixed version:
[http://play.golang.org/p/vlsXtEmgMs](http://play.golang.org/p/vlsXtEmgMs)

In C a cast is unsafe because it doesn't tell you if you did something wrong.

In Go, type assertion is safe. It tells you if it succeeded and if you ignore
it, it'll panic, informing you that you have tried to perform an illegal
operation.

For more see:
[http://golang.org/ref/spec#Type_assertions](http://golang.org/ref/spec#Type_assertions)

~~~
kyrra
I was partially incorrect. I said "reflection" could look it up, but you can
check it during the conversion as you said (from Effective Go[0]). But my
original point (before the edit) still holds true. You need to convert any
data stored in a common data structure implementations and you don't easily
know its type. So you have to track the type of the data in that structure
yourself. So if you pass around a tree, linked list, or whatever, it's not
clear what the type is of the stored data by the type information.

But you are right, I'm new enough to Go to forget about the 'ok' check on the
type conversion. Checking errors on any unsafe operations is really the best
way to go. If you are unsure of the type of a structure you are getting, you
really should be checking it.

[0]
[http://golang.org/doc/effective_go.html#interface_conversion...](http://golang.org/doc/effective_go.html#interface_conversions)

------
lnanek2
Kind of strange they don't just have a type for that. In Java we have unsorted
ones (hashmap), linked ones that retain the sort order (linkedhashmap), and
even automatically sorting trees (treemap / sortedmap). Honestly, considering
users were relying on the key order being different from the hash order, the
default type should have just been made the sorted or linked one and the type
added just made unsortedmap or something. Seems like a failing of the language
libraries that you have to code your own of something so basic

------
crawshaw
For those interested, this was added a few months before Go 1.0:

[https://codereview.appspot.com/5285042/](https://codereview.appspot.com/5285042/)

This has proven surprisingly useful in cutting down on fragile tests, where
order was relied on for a set of items, the code was later changed to insert
in a different order, and then the test was exposed as never being right.

Also of note, if you range over map of comparable keys with text/template, the
order is reliable:
[http://golang.org/pkg/text/template/](http://golang.org/pkg/text/template/)

------
jevinskie
Perl started to enforce this in 5.18 due to security issues. This also gives
wider latitude to the data structures used in implementations due to the
looser guarantees.

[http://onionstand.blogspot.com/2012/12/are-you-relying-on-
ha...](http://onionstand.blogspot.com/2012/12/are-you-relying-on-hash-keys-
being.html)

~~~
eCa
A minor detail: The key order in Perl hashes was always arbitrary (as opposed
to in the order the keys were added) but identical hashes had the keys
returned (by keys(), values(), etc) in the same order before Perl 5.18
(released about a year ago) [1].

[1]
[https://metacpan.org/pod/release/RJBS/perl-5.18.0/pod/perlde...](https://metacpan.org/pod/release/RJBS/perl-5.18.0/pod/perldelta.pod)

------
yepguy
Since everyone here seems to be confused: yes, the Go compiler _actively_
randomizes the order of the keys beyond the unpredictability inherent in hash
tables. If your function iterates over a map twice in a row without modifying
it, you will get different orders.

~~~
itripn
That is empirically not true, at least under go 1.2.1. The poster's own code
produces identical iteration results even when iterating the same unchanged
Map multiple times in a single code execution.

------
lyinsteve
>The Go language designers noticed that people were relying on the fact that
keys were normally stored in the order they were added in, so they randomized
the order in which the keys are iterated over.

Um, I'm just a first-year CS student here, but isn't that a core part of how
hash tables work?

They can so efficiently store and lookup data because they essentially store
contents in a list with the indices as the hashes of the content.

As such, one can never reliably predict the order of the keys in a hash table,
by design. That's not a feature of Go.

~~~
Scaevolus
Yes, but there's more going on. Simple hash tables use a fixed hash function--
the order of keys is unpredictable, but consistent between runs of your
program. This allows algorithmic denial of service attacks-- if you have a web
server that puts querystring parameters into a hash table, and you know what
bucket a key will hash to, you can force collisions and O(n^2) runtime
complexity.

Hash randomization avoids this attack. Go bug:
[https://code.google.com/p/go/issues/detail?id=2630](https://code.google.com/p/go/issues/detail?id=2630)

~~~
rurban
Hash randomization avoids the O(n) attack on collisions if the randomized seed
is not detectable. But if you don't sort the output of keys, rather depend on
the random seed to iterate over the keys, the seed provides no protection
anymore for the O(n) attack. It is very exposed and a minor security risc for
DDoS attacks.

Perl is bit better than Go here, but not as good as Python. The real problem
is exposing too much information of the seed by the ordering of the keys, and
not preventing the algorithmic explosion on collision attacks. Such languages
should really rework their collision handling strategy and not rely on
randomized seeds (which can be brute forced or easily guessed such as here) or
even worse, by strengthening the hash function.

TL;DR: The poster has no idea about hash tables, what Go did, why they did it,
what the problem is and what should be improved.

PS: His cygwin git blog post is also completely wrong. He uses msysgit with
some msys openssl version and complains about cert warnings. Of course, he is
not using cygwin.

------
vardump
Just tried go1.2.1. Hashmap iteration order was not random, it was same every
time. Same happens with Go playground:
([http://play.golang.org/p/px7C_lCn6d](http://play.golang.org/p/px7C_lCn6d)).

I got this output by running same locally (go1.2.1.darwin-amd64-osx10.8.pkg):

    
    
      $ go run test.go 
      There are 500 views for unix
      There are 300 views for python
      There are 100 views for go
      There are 4000 views for javascript
      There are 800 views for testing
      There are 500 views for unix
      There are 300 views for python
      There are 100 views for go
      There are 4000 views for javascript
    

... same order is repeated forever.

~~~
ahuth
The Go playground and tour won't generate random numbers: "Note: the
environment in which these programs are executed is deterministic, so
rand.Intn will always return the same number."

Could this cause iteration order to be the same when executed there?

~~~
vardump
Perhaps, but I'm getting analogous results when running exactly same code
locally.

------
pedrocr
There are at least 4 different types of ordering behavior in a map:

1\. Returning in insertion order. Quite strange thing to do as it requires
extra bookkeeping.

2\. An order unrelated to insertion order that is consistently the same for a
given set of keys. Using a hash it would be sorting by the hash value, in a
balanced tree, sorting by the value itself.

3\. An order unrelated to insertion order and that can be different for the
same set of keys but is always the same on repeated iterations on the same
map. This could be a hash table where on creation a random salt is generated
on creation and hashed together.

4\. An order that actually changes on every iteration. Is there an efficient
way of doing this?

From the discussion here and some quick testing:

Type1: Ruby 1.9.3 (very strange)

Type2: Ruby 1.8.3, python 2.7.3, Perl <5.18

Does anyone know if Go and Perl 5.18 are doing 3 or 4? Ruby >1.9 behavior is
also quite strange but seems to be intended, they've implemented a hash
together with a doubly linked list[1].

[1] [http://www.igvita.com/2009/02/04/ruby-19-internals-
ordered-h...](http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-
hash/)

------
itripn
People needs to stop confusing "random" with "non-deterministic". It is a
stated attribute of a Hash Map (leaving aside special classes, such as a Tree
Map), that iteration of keys will occur in a non-deterministic order.

That is different than random. Random implies that iterating the same Map
twice (without intervening updates) will produce different results. I've never
met a Hash Map for which that was true.

The irony is that the poster's own code violates the point he was trying to
make -- when run under go 1.2.1, it produces insertion order results every
time (at least for me). I've also never met a Hash Map (until this example in
Go) which predictably produced insertion order key iteration. I am sure even
here, it is mere coincidence.

I do believe the poster lacks a formal understanding of data structures. My
apologies if that is incorrect, and I am missing his point.

~~~
kkowalczyk
Yes, you are missing the point.

In Go, iteration order of keys was always non-deterministic.

At some point it was changed to be also random i.e. iterating the same hash
the second time would produce a different sequence of key/values than the
first time.

This is exactly what the article says, except using more words.

~~~
itripn
Listen. Order of key iteration is well understood attribute of a Hash Map, and
it's always non-deterministic. That's what I have said, and I am not missing
the point. The poster tries to imply an insertion order iteration in hash maps
that has never existed.

My second, arguably more interesting point, is that the poster's own code
violates the premise of his post.

~~~
tcheard
Order of key iteration in a Hash Map is not non-deterministic, unless the hash
is using a random salt. Unless you are using a random salt for each Hash Map,
hashes can be precomputed and will always be the same for the same input of
keys, therefore deterministic even though it may look non-deterministic in
nature.

On the other hand, I agree that the author of the article, does not seem to
understand the underlying data structures. Either that, or the way he has
written the article portrays lack of understanding.

It is quite possible that pre 1.0 they may not have been using a hash map at
all, and instead they were using an ordered map, which would have given an
insertion order iteration.

Although note this is pure speculation as I have not tested this on pre 1.0,
and in fact you may be absolutely correct that he is implying an iteration
order that never existed.

What should be noted though is this:
[https://code.google.com/p/go/issues/detail?id=6719](https://code.google.com/p/go/issues/detail?id=6719).
It looks like that as of Go-1.3, there will be a sort of semi-random
iteration, where while iterating each bucket in the hash map will be iterated
over in increasing or decreasing order, chosen at random. Which is good as it
is not too much of a performance hit, with the benefit that the iteration
order will be non-deterministic for small maps, which is currently not the
case.

EDIT:

Here is an explanation of iteration over Maps in Go <1.3 and >1.3:

Map iteration previously started from a random bucket, but walked each bucket
from the beginning. Now, iteration always starts from the first bucket and
walks each bucket starting at a random offset. For performance, the random
offset is selected at the start of iteration and reused for each bucket.

------
seth1010
What? Is unordered traversal of a hashmap unexpected? I would be surprised if
it did anything different.

~~~
PeterisP
In Java, you could write an unittest that inserts 10 fixed items in a hashmap,
checks if the first one is "foobar", and it'd always pass, since the order is
predictable - it's essentially sorted by the fixed hashfunction of those keys.

In Golang, it'd change between each iteration.

------
kzrdude
With uniquely keyed hashes for each hashmap, all environments that iterate in
hash bucket order should have effectively truly random iteration order: across
equal copies of the same hash map, each copy should have its own random order.

Python 3 dicts iterate in random order; random across invocations of the
interpreter in this case.

------
tcheard
Ok here is a correct clarification of what is happening when iterating over
maps in Go. The author of this article doesn't seem to understand what is
going on.

Ever since Go 1.0, they implemented a feature where when iterating over a map
it will choose a random bucket, in the map, to start from and iterate from the
start of each bucket to the end of the bucket in order and then move to the
next bucket in order. The non-deterministic nature in this stems from the
initial bucket to start from, which was chosen at random. This means that the
number of different possible iterations of the same map is equal to the number
of buckets in the map.

This leads to a problem with small maps (those less than 8 items). Because the
max size of a bucket is 8 items, if a map has less than 8 items there will be
only one ordering, and therefore for small maps (those with less than 8 items)
the programmer could still rely on the iteration order of the map, which is
not desired.

So along came this issue:
[https://code.google.com/p/go/issues/detail?id=6719](https://code.google.com/p/go/issues/detail?id=6719)
which seems to have a fix implemented in Go version 1.3. (Here is the revision
log for the committed change:
[https://code.google.com/p/go/source/detail?r=6ea672cbfe43](https://code.google.com/p/go/source/detail?r=6ea672cbfe43)).

So as of Go version 1.3, the behaviour will be as follows. Buckets will be
iterated over from bucket 1 to bucket N. But within each bucket iteration will
start at a random index and then iterate through the bucket. For performance
reasons, the random starting index for within each bucket will be determined
at the start of iteration and will be the same for every bucket. Iteration
over a map with 8 or fewer elements, a single bucket, will now be non-
deterministic. It should be noted that this also means that the max number of
possible different iterations of a map is 8, but that does not matter as it
solves the problem of having programmers relying on iteration order no matter
what size the map is (except for size 0 and 1).

------
panzi
Wait, what exactly is happening? Do they randomize iterations or just use a
hash algorithm that will spread the keys more randomly into the table? The
former would lead to cache misses on big hash tables, while the latter would
still performs the same, but it wouldn't be random. It just would be more
obviously not the insertion order.

------
millstone
I'm lost after reading this. Is the order actually randomized, or does it
merely reflect the underlying storage order?

~~~
jerf
Put this in a test.go file locally and run it repeatedly with "go run
test.go":

    
    
        package main
    
        func main() {
            x := map[string]string{"A": "", "B": "", "C": "", "D": "", "E": ""}
            for key, _ := range(x) {
                 print(key + "\n")
            }
            print("\nNext iter:\n")
            for key, _ := range(x) {
                 print(key + "\n")
            }
        }
    

You'll see the order can change even within a single run. It does not appear
to be uniformly random; I had to put 5 keys in because it happened far less
often than expected with 3 keys, but you can see randomization often enough
with this.

Normally, I'd link to the Go Playground, but this doesn't manifest there
"correctly", probably due to caching of the results.

------
Locke1689
I think this is a good thing to get right in a new language. The .NET hash
function for dictionaries is fixed because of this problem -- there is too
much legacy code which depends on the ordering of that particular hash
function to change it.

------
NodeMuppet
Nice post. I admire the tenacity of the golang team to keep pushing this
language forward. Adoption was slow at first, and Google can be somewhat quick
to drop projects as we all know, so it's nice to see this one moving on.

------
eternalban

            fmt.Println("Key:", k, "Value:", m[k])
    

Interesting that, given the assumption that it makes regarding the
implementation of the method. (Hash order is order just the same.)

------
TheLoneWolfling
Fuzzing as a language feature!

Hmm... What other things could be done like this, I wonder?

------
callesgg
Either is is hash mapped and the order should be based on the hash value.

Or golang wastes shitloads of good CPU power to help a few developers to write
correct loops, help the developers so they don't get blamed for their faulty
loops when shit goes haywire.

~~~
coldtea
Or you missed some tricks on writing hash maps.

------
tbarbugli
Thats not surprising at all to me.

