
The major problem I have with that is that performance is atrocious [1] if the iterator isn't doing something very nontrivial. Since the vast majority of iterators amount to incrementing at most a handful of things and then returning something indexed by those things, you're paying the cost of channel communication per item without amortizing that cost over any significant work in the iteration itself. If the thing consuming the iterator is also not doing anything significant (e.g., adding integers together), this has very bad performance implications.
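
For concreteness, this is the shape of channel-backed iterator I'm talking about (just a sketch, with illustrative names): a producer goroutine pushes every element through a channel, so each item pays for a channel send, a channel receive, and goroutine scheduling.

    // Sketch of a channel-backed iterator: the producer goroutine sends
    // each element through a channel, so every item costs at least one
    // channel send/receive plus scheduler overhead.
    func intChanIter(ints []int) <-chan int {
        ch := make(chan int)
        go func() {
            defer close(ch)
            for _, v := range ints {
                ch <- v // one channel operation per element
            }
        }()
        return ch
    }

    // The consumer side looks pleasant enough:
    for v := range intChanIter([]int{0, 1, 2, 3, 4}) {
        fmt.Println(v)
    }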

As a rule of thumb, the amount of work transferred across any concurrency primitive should significantly exceed the cost of the concurrency primitive itself. I do have a couple of uses of this pattern where what is on the other side of the channel is something reading off a network and parsing lines of JSON into internal structs, in which case the overhead of the channel isn't necessarily too bad. (In one case, it even chunks the lines of JSON into a slice of several parsed structs, reducing channel overhead even more; see the sketch below.) But it's a terrible solution in general: iterating over an array and doing some very fast "thing" to each element that costs only a handful of assembler instructions, a very common use case, has terrible overhead.
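
The chunked variant looks roughly like this (a sketch only; Record and the batch size are placeholders, not the real code):

    // Sketch of the chunked version: parse several JSON lines into a slice
    // and send the whole batch, so one channel operation is amortized over
    // many parsed structs. Record stands in for the real struct type.
    func streamRecords(r io.Reader, batchSize int) <-chan []Record {
        out := make(chan []Record)
        go func() {
            defer close(out)
            sc := bufio.NewScanner(r)
            batch := make([]Record, 0, batchSize)
            for sc.Scan() {
                var rec Record
                if err := json.Unmarshal(sc.Bytes(), &rec); err != nil {
                    continue // error handling elided in this sketch
                }
                batch = append(batch, rec)
                if len(batch) == batchSize {
                    out <- batch
                    batch = make([]Record, 0, batchSize)
                }
            }
            if len(batch) > 0 {
                out <- batch // flush the final partial batch
            }
        }()
        return out
    }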

It's a real pity, because the semantics of that solution are pretty close to the right answer. But it's a huge performance trap. Something as basic as iteration needs to not be a huge performance trap.
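
If anyone wants to measure it, a benchmark shaped roughly like this (reusing the intChanIter sketch above) would show the gap against a plain loop:

    // Plain range loop over a slice: no concurrency machinery at all.
    func BenchmarkRangeLoop(b *testing.B) {
        ints := make([]int, 1000)
        for n := 0; n < b.N; n++ {
            sum := 0
            for _, v := range ints {
                sum += v
            }
            _ = sum // keep the result so the loop isn't trivially removable
        }
    }

    // Same work, but every element crosses a channel.
    func BenchmarkChanIter(b *testing.B) {
        ints := make([]int, 1000)
        for n := 0; n < b.N; n++ {
            sum := 0
            for v := range intChanIter(ints) {
                sum += v
            }
            _ = sum
        }
    }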

[1]: Relative to Go, anyhow. I haven't timed it directly, but I wouldn't be surprised if a channel-based iterator like that were comparable to Python's general iteration speed, or at least not off by a very large factor. It's just that "Python's normal level of performance" is "atrocious Go performance".




Hmm, can you provide some examples of how a context-based iterator is slow?

If you need raw performance and you are just doing some minimal operations on a slice, then yeah, you’d want to use a simple for loop instead.

I typically use this pattern in situations where a ton of data is coming back and you want to avoid storing all of it in memory.


The problem with using channels is that they require multiple goroutines and locking for a problem that's inherently single-threaded. Instead, you can define an iterator as a function that returns a function:

    func intSliceIter(ints []int) func() (int, bool) {
        i := 0
        return func() (int, bool) {
            if i < len(ints) {
                ret := ints[i]
                i++
                return ret, true
            }
            return 0, false
        }
    }

    iter := intSliceIter([]int{0, 1, 2, 3, 4})

    for x, ok := iter(); ok; x, ok = iter() {
        fmt.Println(x)
    }
Of course, there's not much benefit to a SliceIter; this is a contrived example, but you can apply this pattern in more complicated cases as well. Similarly, you can define an iterator as an interface (which is similar to bufio.Scanner and a few others in the standard library—a closure is an object is a closure):

    type IntIter interface {
        Next() (int, bool)
    }

    type IntSliceIter struct {
        Cursor int
        Ints []int
    }

    func (isi *IntSliceIter) Next() (int, bool) {
        if isi.Cursor < len(isi.Ints) {
            ret := isi.Ints[isi.Cursor]
            isi.Cursor++
            return ret, true
        }
        return 0, false
    }
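
Usage mirrors the closure version, just spelled as a method call:

    // Cursor starts at its zero value, so no explicit initialization needed.
    isi := &IntSliceIter{Ints: []int{0, 1, 2, 3, 4}}

    for x, ok := isi.Next(); ok; x, ok = isi.Next() {
        fmt.Println(x)
    }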



