

Python's Iterators are a Bad Implementation of Laziness - llimllib
http://sandersn.com/blog/index.php?title=python_s_iterators_are_a_bad_implementat&more=1&c=1&tb=1&pb=1

======
henryprecheur
I can understand the frustration caused by this kind of bug, but it's actually
pretty easy to fix:

    
    
        generator = (x * x for x in xrange(10)) # A generator
        my_list = list(generator) # Build a list, reusable
    

That's the Python way of doing things: trust the programmer, & let him make
mistakes.
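
For anyone who hasn't hit this bug yet, here is a minimal Python 3 sketch of it (using `range` rather than `xrange`; the variable names are only for illustration). The generator gives up its elements once, while the list can be walked as often as you like:

```python
# A generator expression can be consumed only once.
squares = (x * x for x in range(10))

first_pass = sum(squares)   # walks all ten elements
second_pass = sum(squares)  # the generator is already exhausted

# Materializing the values into a list gives a reusable sequence.
squares_list = [x * x for x in range(10)]
list_first = sum(squares_list)
list_second = sum(squares_list)

print(first_pass, second_pass)  # 285 0
print(list_first, list_second)  # 285 285
```

Note that the second `sum` does not raise anything; it just quietly returns 0.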

Also, how does C# implement this kind of "smart" laziness? It seems to me there
will be other, more obscure problems: what happens if the Enumerable has side
effects? Is the side effect repeated, or is the result cached? Python's
generator implementation is simple: there are no tricks, magic, or layers of
abstraction.

~~~
mquander
In C#, it's basically just idiomatic to produce enumerables that don't have
side effects as you enumerate them. There's nothing strictly enforcing it.

Like in your Python example, one is expected to put the results of an
enumerable into a real data structure if you need it more than once and you
suspect that the enumeration might be expensive or involve side effects.

Both ways "let the programmer make mistakes" -- in C#, you might accidentally
enumerate something more than once without realizing it, which may be
inefficient or cause subtle bugs. In Python, you might accidentally screw
something up the way the fellow writing the post above did. One might think
that the C# behavior is overly clever, and it's best to fail fast, but it's
convenient in practice when you know that cycling again won't cause problems
(the majority of the time.) I'm not sure which I prefer. One thing to note is
that having static types a la C# does make it clear at all points whether you
are dealing with an enumerable or a real list.
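
To make the comparison concrete, here is a rough Python analogue of a re-enumerable object. It is only a sketch of the idea, not how C# implements `IEnumerable`, and the `Squares` class and its `passes` counter are made up for illustration. Each call to `__iter__` starts a fresh generator, so side effects are repeated rather than cached:

```python
class Squares:
    """Re-iterable container: every __iter__ call starts a fresh generator."""

    def __init__(self, n):
        self.n = n
        self.passes = 0  # side effect: counts how many iterations were started

    def __iter__(self):
        self.passes += 1
        return (x * x for x in range(self.n))


squares = Squares(4)
first = list(squares)
second = list(squares)

print(first, second)   # [0, 1, 4, 9] [0, 1, 4, 9]
print(squares.passes)  # 2
```

The second pass works, but the work (and the side effect) happens twice, which is the trade-off described above.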

------
RiderOfGiraffes
I observe that when reading this code, there is nothing to show when the
parameter is a list versus a generator. Modern programming is all about
abstracting away detail to allow the programmer to concentrate on the
structure and other stuff that matters, but it seems to me (and I may be
wrong) that increasingly we find things that look the same, get used the same,
generally behave the same, and then are different in a corner case.

It seems that allowing lists and generators to be used in identical ways
generally makes things easier to create and easier to use, but the differences
in the corner cases suddenly mean you have to know about the detail. Then it
becomes hard.

The problem isn't that it's easy to fix. We all know how to fix these sorts of
things. The problem is that details you need to know are being hidden, because
you shouldn't need to know about them.

When I write a routine should I always "assert" that its parameter is a list
and not a generator? Do I have to write the code to use a slower technique
just in case the parameter is a generator? Do I "ungenerate" the generator out
into a list?

That can't be right ...
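
For what it's worth, the "ungenerate" option can be sketched in a few lines. This assumes the routine may receive either a container or a one-shot iterator; `as_reusable` and `value_range` are hypothetical names, not anything from the article:

```python
from collections.abc import Iterator


def as_reusable(values):
    """Copy one-shot iterators into a list; pass containers through as-is."""
    if isinstance(values, Iterator):
        return list(values)
    return values


def value_range(values):
    """Needs two passes over the data: one for max, one for min."""
    values = as_reusable(values)
    return max(values) - min(values)


print(value_range([3, 9, 4]))                 # 6
print(value_range(x for x in [3, 9, 4]))      # 6
```

The cost is exactly the one being complained about: the routine now has to know about the distinction, and `list()` will never return if the generator is infinite.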

~~~
abstractbill
_Modern programming is all about abstracting away detail to allow the
programmer to concentrate on the structure and other stuff that matters, but
it seems to me (and I may be wrong) that increasingly we find things that look
the same, get used the same, generally behave the same, and then are different
in a corner case._

I agree. I would paraphrase this as: Leaky abstractions cause bugs, and
unfortunately all abstractions turn out to leak at some point (I think someone
else said it this way before, but I can't remember who right now). EDIT: Found
it -
<http://www.joelonsoftware.com/articles/LeakyAbstractions.html>

This doesn't mean we shouldn't use abstractions of course - it just means each
abstraction comes with a cost.

~~~
jp_sc
Yes, but the cost in this case is the need to write meaningful variable names.
<http://www.joelonsoftware.com/articles/Wrong.html>

~~~
skwiddor
The use of strongly typed languages would resolve all those problems.

------
tetha
So. The problem is: Someone used an aggressive language (aggressive in the
sense: The language does not defend the programmer from his own stupidity,
but also does not hinder his freedom), used a certain feature of this language
and fell on his face. Thus, the feature is bad. Uhm. I guess, by this logic,
everything in C, C++, Python (at least) is bad, because you can easily fall on
your face with the more efficient idioms. Certainly, you cannot iterate over
the generator twice and you cannot iterate over an iterator twice (remember: a
list can return multiple iterators, one each time it is asked), but if you
know a certain function requires multiple iterations, well, pass it a list or
a set or whatever is capable of generating an arbitrary number of iterators
(or even do things like itertools.repeat(generator) :) ). The point remains:
This is a feature of an aggressive language, and you need to know your code
slightly better (meaning: more documentation, or more code immersion) in order
to reach maximum efficiency. This point holds with pointers, with templates
and with generators. But it also holds that well-used pointers, well-used
templates and well-used generators can speed up code magnificently :)
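
A small sketch of the "a list can return multiple iterators" point, with made-up variable names. It uses `itertools.tee` for the two-pass case, which is a different standard-library helper from the one named above: `itertools.repeat(obj)` hands back the same object each time, so it would not rewind a generator.

```python
import itertools

numbers = [1, 2, 3]

# A list hands out a new, independent iterator every time it is asked.
list_iters_differ = iter(numbers) is not iter(numbers)

# A generator is its own iterator, so there is only ever one cursor.
gen = (n for n in numbers)
gen_is_own_iter = iter(gen) is gen

# tee splits one iterator into two that can each be consumed once.
a, b = itertools.tee(n * 10 for n in numbers)
first = list(a)
second = list(b)

print(list_iters_differ, gen_is_own_iter)  # True True
print(first, second)                       # [10, 20, 30] [10, 20, 30]
```

`tee` buffers whatever one branch has consumed and the other has not, so when both passes run to completion it saves no memory over calling `list()` up front.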

So, overall, I don't really see the point of this article, besides someone who
should go and use some defensive language.

------
notmyname
Yes, the non-repeating nature of generators is a little unintuitive to start
with, especially when you are familiar with the way lists work. But, they are
no more unintuitive than floating point rounding errors (1.1 + 2.2 != 3.3).
Simply learn the type and move on.

And if you need to repeat the generator, use itertools.cycle.
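
Both claims are easy to check in a Python 3 session. A quick sketch follows; the `islice` call is there only because `cycle` never stops on its own:

```python
import itertools
import math

# The floating point surprise mentioned above.
float_sum_differs = (1.1 + 2.2) != 3.3
close_enough = math.isclose(1.1 + 2.2, 3.3)

# cycle saves each element on the first pass, then replays them forever.
letters = (c for c in "abc")
looped = list(itertools.islice(itertools.cycle(letters), 7))

print(float_sum_differs, close_enough)  # True True
print(looped)  # ['a', 'b', 'c', 'a', 'b', 'c', 'a']
```

Keep in mind that `cycle` stores a copy of every element it has seen, so memory-wise it is no cheaper than building a list.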

------
enum
"I thought laziness was the New Python 3.0 Way, but apparently the New Python
Way is susceptible to hairy non-local bugs."

You're using "laziness" in a language with mutation. What do you expect?

------
Confusion
_The use-case is: I create a generator somewhere, then return, pass it around
a lot, and finally use the generator twice somewhere else entirely._

This is just silly. You can't use a generator twice, just like you can't keep
reading from a file. When you've reached EOF, you're done. What does this guy
expect, code that magically determines that when you ask for 'next()', you
mean 'restart_at_the_beginning()'?

I also fail to see what this has to do with laziness: the generator lazily
generates its elements. However, at some point it's still done generating
elements. That's not a failure of laziness or whatever, it's a failure of
using the generator.
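
The file analogy can be sketched with an in-memory file (`io.StringIO` stands in for a real file here, purely for illustration). Reading past EOF returns nothing, just as an exhausted generator yields nothing; the one difference is that a file can be rewound explicitly, while a generator object has no such method:

```python
import io

handle = io.StringIO("line one\nline two\n")
first_read = handle.read()
second_read = handle.read()   # already at EOF, so this is ''
handle.seek(0)                # an explicit rewind, never an implicit one
after_rewind = handle.read()

gen = (n for n in range(3))
consumed = list(gen)
leftover = next(gen, "exhausted")  # default value instead of StopIteration
has_seek = hasattr(gen, "seek")

print(repr(second_read), after_rewind == first_read)  # '' True
print(consumed, leftover, has_seek)  # [0, 1, 2] exhausted False
```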

~~~
lutorm
I thought he was saying that if you think it's a list you might write the code
to iterate across it twice. Then you happen to have a generator and you might
think that "it'll be just like a list" if you pass it instead.

Doesn't seem quite so silly to me.

~~~
Goladus
It's also a fundamental issue with dynamic languages. Lots of code fails
silently if you make faulty assumptions about the types of parameters.
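
A small sketch of that kind of silent failure, using a hypothetical `normalize` function written with lists in mind. Nothing raises when a generator is passed instead; the answer is simply wrong:

```python
def normalize(values):
    """Scale values so they sum to 1. Iterates over the input twice."""
    total = sum(values)                 # first pass
    return [v / total for v in values]  # second pass


from_list = normalize([1, 1, 2])
from_gen = normalize(v for v in [1, 1, 2])

print(from_list)  # [0.25, 0.25, 0.5]
print(from_gen)   # []
```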

