
Go maps don't appear to be O(1) - alpb
https://medium.com/@ConnorPeet/go-maps-are-not-o-1-91c1e61110bf
======
mindslight
The basic complexity of any map lookup is Ω(log n): as the number of elements
grows, key size must grow logarithmically (to distinguish n distinct keys), and
so the comparison time grows as log n as well.

With reasonable sizes on common hardware, various operations can be considered
constant time. But this can be misleading when, for example, the size of your
working set overflows a level of cache hierarchy and your "constant" time
changes.
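
A minimal Go sketch of that effect (the sizes and the helper name are my own,
purely illustrative): time random lookups in maps of growing size and watch the
"constant" per-lookup cost step up as the working set outgrows each cache level.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// benchLookups builds a map with n random int64 keys, then times `lookups`
// random successful lookups and returns the average cost per lookup.
// Absolute numbers vary by machine; the interesting part is the trend as
// n outgrows L1, L2, and L3.
func benchLookups(n, lookups int) time.Duration {
	m := make(map[int64]int64, n)
	keys := make([]int64, n)
	for i := range keys {
		k := rand.Int63()
		keys[i] = k
		m[k] = k
	}
	var sink int64
	start := time.Now()
	for i := 0; i < lookups; i++ {
		sink += m[keys[rand.Intn(n)]]
	}
	_ = sink // keep the loop from being optimized away
	return time.Since(start) / time.Duration(lookups)
}

func main() {
	for _, n := range []int{1 << 10, 1 << 16, 1 << 20} {
		fmt.Printf("n=%8d  avg lookup ≈ %v\n", n, benchLookups(n, 1<<18))
	}
}
```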

~~~
SilasX
Yes! This has been a big source of frustration for me! That you can't
logically look at the hashtable algorithm and derive O(1). Instead, it's some
ridiculous gentlemen's agreement where we call it O(1) even though it can't
be. See my previous "am I the crazy one here" posts:

https://news.ycombinator.com/item?id=9807739

~~~
mindslight
Any big-O analysis is necessarily done with a context of assumptions, most
importantly the machine model.

You can derive that a hash table is O(1) if you add the assumption that the
size of the keys is fixed. This assumption isn't generally wrong, but like all
assumptions it has its limits.
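
To make that assumption concrete: the hash step itself does work proportional
to the key's length. A small Go sketch (hashKey is my own illustrative helper)
using the standard library's FNV-1a hash:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hashKey hashes a string key with FNV-1a, which touches every byte once.
// If key length is bounded by a constant, this step is O(1); if keys must
// grow (e.g. to stay distinct as n grows), it is O(len(key)).
func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	for _, n := range []int{8, 8192} {
		key := strings.Repeat("k", n)
		fmt.Printf("len=%5d  hash=%#016x\n", n, hashKey(key))
	}
}
```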

I personally think the blind mantra that "hash tables are O(1)" is destructive
because it fosters an intuition that they're akin to a pointer dereference and
strictly better than trees, when really that constant factor is _huge_. After
all, when memory is bounded (and it always is on any physical machine),
_every_ algorithm is O(1)!

Given that the hash table construction is designed specifically to be
constant-time on the model of a modern microprocessor, I think it makes sense
to look under the hood a bit and see how it really behaves if memory and nkeys
are truly unbounded.

And in this case, that result is indeed informative when you start thinking
about things like cache hierarchy. The assumption of a table "fitting in
memory" goes out the window when, for a given size, "memory" could be L2 cache
with its higher access speed, or not.

In that previous thread you say:

> _the defining (good) thing about STEM is that there's actual logic to the
> discipline, where if I forget an answer, I can re-derive it_

Each letter of S.T.E.M. is its own field with its own methods. It's easy to
forget this, because when all you're doing is rote learning in school, they
seem awfully similar - follow a pattern of technical steps through to its
conclusion. Specifically, math works by having a foundation of assumptions for
each problem, and outside of that context any solution is meaningless.

For this particular topic, "assume keys are fixed size" is an assumption you
just have to memorize the idea of, just like for big-O notation in general you
have to memorize the assumption that constant factors don't matter.

~~~
SilasX
Having to "memorize" the idea of assuming bounded key size is fine.

What's _not_ fine is selectively making that assumption without being aware of
it, resulting in a situation where the kewl kids recite back "hashtables are
O(1)" with no understanding of why, or what different assumptions you have to
make to get there, resulting in an exception-laden body of knowledge. That's
not what rigor looks like.

One time I was asked for the run time of a random-line-getter function for a
file that seeks to a random byte and fans out to the line breaks. I said,
"well, that would depend, not on the file size, but the line length, and it
would be linear in that."
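
For concreteness, here is a minimal Go sketch of that random-line-getter
(randomLine and all of its details are my own reconstruction of the described
function): pick a random byte, then fan out to the enclosing newlines. The
scan is linear in the line's length, not the file size.

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
)

// randomLine picks a uniformly random byte position in data, then scans
// outward to the nearest newline on each side and returns the enclosing
// line. The scan cost is linear in the line length, independent of len(data).
// (Side note: this samples lines weighted by their length, since longer
// lines cover more byte positions.)
func randomLine(data []byte) string {
	pos := rand.Intn(len(data))
	start := bytes.LastIndexByte(data[:pos], '\n') + 1
	end := bytes.IndexByte(data[pos:], '\n')
	if end == -1 {
		end = len(data)
	} else {
		end += pos
	}
	return string(data[start:end])
}

func main() {
	data := []byte("alpha\nbeta\ngamma\n")
	fmt.Println(randomLine(data))
}
```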

He "got me": "Wrong, it's constant." Indeed, if you implicitly assume that
the relevant parameter is file size, and don't understand the implications of
"does not depend on ...", then you can compare answers against an answer key
and regard someone as wrong for not saying exactly that.

"Hashtables are O(1)" results from exactly that lack of rigor, where it's all
a game of which password to spit back, rather than "can you identify the
constraints on solving this problem?"

~~~
mindslight
I totally agree but for one point - seek() in a file is probably O(log
<filesize>). So he wasn't even right in his own context...

~~~
SilasX
Right, but the point is, even under the assumption of constant seek times, he
didn't seem to get that "does not depend on x" is the same as "constant
[implicitly taking the only relevant parameter to be x]".

------
rudolf0
Reddit discussion thread:
https://www.reddit.com/r/golang/comments/3n0lf8/go_maps_dont_appear_to_be_o1/

No consensus regarding the true cause, yet.

