Hacker News new | comments | show | ask | jobs | submit login
Spectro: Adventures in Go (markcrossfield.co.uk)
39 points by WestCoastJustin on Nov 21, 2015 | hide | past | web | favorite | 21 comments

> For example, I particularly liked that since map iteration order is not guaranteed, it is actively randomised at runtime to make that clear

What is there to like about this feature?

1. It introduces unavoidable noise into performance testing

2. It makes your functions impure

3. Your simple script will have different output across successive runs, even with the same input

4. It removes a reasonable and useful feature that all other languages support, namely, that you can iterate the same map twice in a row in the same order

5. Working around this usually involves sorting, which in turn requires a lot of boilerplate: https://gobyexample.com/sorting-by-functions

It seems all this does is make life harder. I'd be interested to hear from anyone who thinks this feature has helped them.

I like it.

If you're not going to have guaranteed order, then explicitly breaking it to avoid accidental dependency that will break your programs later when the language implementation changes is very much a good thing.

Compare this to python, which has deterministic dictionary order, but it changes across implementations. That's just begging for problems.

I think the whole philosophy of Go is kind of "try to avoid shit breaking in weird and unexpected a mysterious ways", an approach I become more sympathetic to the longer I program.

Is there any advantage to randomizing on every iteration over just randomizing a single seed at process startup?

Generating one seed at startup would solve the problem of accidental de facto stabilization of dictionary iteration order while avoiding the issues mentioned in the parent comment.

That does seem more reasonable, actually, unless there's some crazy reason you might actually want to vary the order w/in a single process lifetime I'm overlooking...

The performance cost is negligible, the randomization happens only at the beginning of a range call:

range init's the hashmap iterator: https://github.com/golang/go/blob/6327e8dc69c47d6355e2623cc4...

hashmap iterator init sets a random starts: https://github.com/golang/go/blob/master/src/runtime/hashmap...

I wasn't referring to the performance cost of randomizing itself, but to the fact that randomization can affect the performance of whatever it is you're measuring.

For example, say you want to benchmark a function that extracts the keys from a map into a slice, and then sorts it via quicksort. Quicksort's performance is affected by the initial ordering, so different runs of the function will behave differently. Was this run really faster, or did I just get lucky with the order this time?

True then, it's not deterministic. I guess that's quite an annoyance actually, you can't use maps in a single threaded program if you want deterministic outcomes? I wonder if the seed of fastrand1 is constant.

1. I don't see how.

2. Impure?

3. That's the point. It's to force people to stop using maps as if they're sorted.

4. Why would you want to rely on the arbitrary nature of a hash table for order?

5. That's the best type of boilerplate. It's short. It brings clarity. You might not need custom sort functions. The sort package supports basic types, which you'd use if you're keeping track of an index with []int for instance. https://golang.org/pkg/sort/

It's not about relying on hash tables for order. It's about the invariant that traversing a hash table N times will produce the same ordering each time, as long as the hash table is not mutated, even though that ordering is unpredictable.

Disclaimer: I've never used this feature myself and I sympathize with the desire on the part of the Go designers to avoid accidental de facto stabilization of hash table iteration order--though it sounds like it could have been done without drawbacks by randomizing at process start.

It is to prevent hash tables being relied on for order. Randomization was added so maps aren't treated as invariant. The behavior in a range of a map is no less correct to be changing (in Go) rather than unchanging (in other languages). In other words, invariant relative to what? It's not too important. It's one of numerous ways Go promotes clearer design, at least in my eyes, to have disorder represent unordered and order represent ordered.

This is a very philosophical comment, but I'm not seeing a specific, practical advantage that randomizing on every iteration confers over randomizing on process start.

Can you provide one? (I'm genuinely curious.)

It's no more philosophical than saying the alternative is proper. Uniformity of expectations seems more practical to me. In contrast, I wouldn't see an advantage in map ranges ever staying the same even per process. There's no reason to promote that reliance.

Well I gave 5 reasons up at the top :)

But from a philosophical perspective, referential transparency is a big one. We should prefer to make functions referentially transparent, because it makes them easier to reason about. Sometimes we do not, because it would be too slow or too awkward. But in this case, the loss is gratuitous.

Or to put it concretely: in Python (and every other language), we have the fact that `dict.keys() == dict.keys()`. In Go, this is true sometimes. Doesn't that seem weird to you?

But there is one, which the original post in the thread described: you could for example convert a map of arrays to a flat array by iterating over it repeatedly and printing the ith element of each array. This doesn't work if every loop over the elements gets a different ordering.

Runtime order randomization was added to avoid breaking code that depends on map iteration order; this is necessary to support some of the upcoming planned optimizations.

Committing to supporting any specific sorted order iteration would ultimately come with a performance penalty rather than the reverse.

In fact, this strategy is being introduced to several other aspects of golang's runtime like the test execution order and goroutine scheduler (currently only enabled in -race mode).

Thanks for your reply. You and aric both jumped to sorted order as the alternative to randomized order. Why is that? Are you coming from a C++ background?

When I say "any specific sorted order" (e.g. whether it's insertion order or ascending value order or whatever) I mean as opposed to unsorted order. That is, not necessarily randomized but necessarily unspecified and not to be relied on. Randomizing is the most explicit example of unsorted order.

Go is fairly serious about language stability and they want to hold onto some guarantees for the remainder of Go 1.x. In order to be able to achieve this, they need to be careful about what explicit features they commit to as much as what implicit features people end up depending on. If changing an implemention of a datastructure changes the implicit order of how keys are stored and the entire community has grown to depend on the original undocument order as a feature... then that's a big problem.

> Are you coming from a C++ background?

I don't think it particularly matters in this case, but the majority of my programming career so far was spent in Python. :)

There's an odd, big complaint about numeric casts that culminates in this line of code:

    fmt.Fprintf(os.Stderr, "%"+strconv.FormatInt(int64(paddingWidth), 10)+"s %s\r", "", legend)
Which could be improved greatly, and made a lot more readable with:

    fmt.Fprint(os.Stderr, strings.Repeat(" ", paddingWidth), legend+"\r")
In my experience with Go over the last few years, any time I have this "ugh" moments it pays to take a step back. It's usually a code smell that there's a better way.

You can do better -- fmt provides a way to give a dynamic width or precision as an int argument.

  fmt.Fprintf(os.Stderr, "%*s %s\r", paddingWidth, "", legend)
I'd further wonder whether a separate padding calculation is needed at all (too lazy to go read the source on Github at the moment).

I started with that, but there's no reason to use the formatting specifiers when the padding string is already blank. With the dynamic width Fprintf still needs to waste time parsing the format string.

Thanks pbnjay… I agree that’s much more readable and have improved the code.

Applications are open for YC Summer 2018

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact