
Rust Ownership Explained with Python - adamnemecek
https://paulkernfeld.com/2018/09/16/ownership-explained-with-python.html#
======
shmageggy
Not that it really matters, but since he lamented that the second Python
solution was "longer and less elegant", I'll point out that that algorithm can
be written in a functional style that is both short and (IMO) somewhat
elegant. However, since reduce was removed as a built-in in Python 3 (it now
lives in functools), this is arguably not Pythonic.

    
    
        from functools import reduce
        
        squares = (x * x for x in range(10))
        
        min_and_max = lambda current, x: [min(current[0], x), max(current[1], x)]
        minimum, maximum = reduce(min_and_max, squares, [next(squares)]*2)
        
        print(minimum)
        print(maximum)

~~~
c8g

        squares = [x * x for x in range(10)]
        print(min(squares))
        print(max(squares))
    

this one works and looks good.

~~~
wyldfire
range(10) doesn't properly demonstrate the material differences between
generators and lists.

Like posted elsewhere, tee is an appropriate solution that preserves the
generator's benefits over a list. It's too bad that the example is so
synthetic.

~~~
shmageggy
One downside of both solutions is that they both traverse the list twice,
which could be expensive when we're talking about real data and not toy
problems.
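
For comparison, here is a minimal sketch of a single-pass version written as a plain loop instead of reduce. The name `min_and_max` is made up for this example; it reads each value exactly once, so it also works on a generator that cannot be restarted.

```python
def min_and_max(values):
    """Return (minimum, maximum) of an iterable, reading it only once."""
    iterator = iter(values)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("min_and_max() arg is an empty iterable") from None
    lowest = highest = first
    for value in iterator:
        if value < lowest:
            lowest = value
        elif value > highest:
            highest = value
    return lowest, highest


squares = (x * x for x in range(10))
minimum, maximum = min_and_max(squares)
print(minimum, maximum)  # 0 81
```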

------
merlincorey
The Python problem can be solved with `itertools.tee`[0]

    
    
        from itertools import tee
        squares_min, squares_max = tee(x * x for x in range(10))
        print(min(squares_min))
        print(max(squares_max))
    

[0]
[https://docs.python.org/3/library/itertools.html#itertools.t...](https://docs.python.org/3/library/itertools.html#itertools.tee)

~~~
c8g

        squares = [x * x for x in range(10)]
        print(min(squares))
        print(max(squares))
    

this one works and looks good.

~~~
skbly7
It will have memory impact though with larger range.

A better approach might be:

    
    
        squares = lambda: (x * x for x in range(10))
        print(min(squares()))
        print(max(squares()))

~~~
rahimnathwani
Yes, your solution is in the original post, under the heading 'Python
solutions':

    
    
      def squares():
          return (x * x for x in range(10))

------
bayesian_horse
As a Rust beginner I am not quite sure why .min() is supposed to take
ownership and change the sequence. I would assume it does nothing with the
sequence except read it.

~~~
kilburn
Sequences can be generated at runtime, and it may even be impossible to
"regenerate" them (think for instance reading bytes from the network).

Therefore, just reading that sequence changes (consumes) it.
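
The Python side of the same idea, as a small sketch: once min() has read a generator to the end, nothing is left for max(). The variable names here are only for illustration.

```python
squares = (x * x for x in range(10))
smallest = min(squares)  # walks the generator all the way to its end
print(smallest)          # 0

max_failed = False
try:
    max(squares)
except ValueError:
    # The generator is already exhausted, so max() sees an empty sequence.
    max_failed = True

print(max_failed)     # True
print(list(squares))  # []
```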

~~~
MaulingMonkey
And of course for things that _can_ be iterated over multiple times, it's
likely possible that you can (re)generate the iterator. E.g. this compiles
fine:

    
    
        fn main() {
            let squares : Vec<u32> = (0..10).map(|x| x * x).collect();
            println!("{:?}", squares.iter().min()); // Some(0)
            println!("{:?}", squares.iter().max()); // Some(81)
        }
    

Edit: Or a smaller change to the original, you can add a single .clone():

    
    
        fn main() {
            let squares = (0..10).map(|x| x * x);
            println!("{:?}", squares.clone().min());
            println!("{:?}", squares.max());
        }
    

This clones the std::iter::Map<std::ops::Range<i32>>. Unlike the vec version,
this will re-evaluate the |x|x*x sequence when .max() is called.

There's no technical reason Map<Range<i32>> couldn't be made to implement the
Copy trait, which would make the extra .clone() here unnecessary - but Rust's
standard library has chosen to force you to make an explicit choice instead.

~~~
lostmsu
Curiously, C# provides the behavior one would expect.

It does so by distinguishing IEnumerable<T> [1] from IEnumerator<T> [2]. The
former is the result of the map call (Select in C#'s LINQ) and represents a
sequence that can be generated; the latter is what actually generates it
element by element, and has to be consumed.

I wonder why Rust did not take C#'s approach.

[1] [https://docs.microsoft.com/en-us/dotnet/api/system.collectio...](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)

[2] [https://docs.microsoft.com/en-us/dotnet/api/system.collectio...](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerator-1)

~~~
kibwen
Can you be more specific about what behavior from the Rust code you find
unexpected, in comparison with C#?

~~~
lostmsu
Exactly the consumption part. Perhaps I don't understand the Rust library, and
there's some other way to do this properly.

In C#, Enumerable.Range and .Select (i.e. .map) internally create an immutable
instance that implements IEnumerable<T> and describes what will have to be
returned.

Then .Min internally takes IEnumerable<T>, calls its .GetEnumerator, that
actually returns something, that can be consumed once and has a mutable state,
and finds the minimum by actually consuming that iterator (all internally).
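
Python draws the same line between an iterable and an iterator, so the split can be sketched there. The `Squares` class below is hypothetical: it plays the IEnumerable<T> role, and every call to `__iter__` hands out a fresh single-use generator in the IEnumerator<T> role.

```python
class Squares:
    """Re-iterable description of a sequence (the IEnumerable<T> role)."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # Each call builds a new single-use iterator (the IEnumerator<T> role).
        return (x * x for x in range(self.limit))


squares = Squares(10)
print(min(squares))  # 0
print(max(squares))  # 81
```

min() and max() each call `iter(squares)` internally, so neither one sees an exhausted iterator.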

~~~
MaulingMonkey
I'd say the distinction is that Rust's .min() is implemented on the equivalent
of IEnumerator<T>, Iterator - instead of the equivalent of IEnumerable<T>,
IntoIterator. This makes it slightly more flexible, since you can always get
the former from the latter by calling GetEnumerator(), but not vice versa
(without filling a container with it first or something.)

If C# also implemented Min on IEnumerator<T>:

    
    
        var enumerator = collection.GetEnumerator();
        var min = enumerator.Min();
        var max = enumerator.Max(); // Runtime bug: Min() already consumed the entire enumerator!  InvalidOperationException?
    

There's basically nothing useful you can do with 'enumerator' after the first
call to this hypothetical Min. Rust's stdlib tries to catch this bug by having
min take ownership of/consume 'enumerator', instead of merely borrowing it, so
you can't use it again:

    
    
        let iter = collection.iter();
        let min = iter.min();
        let max = iter.max(); // Compile time error: iter used after move
    

In C#, basically everything that can be is an IEnumerable<T>, and the stdlib
is built around that fact. Does System.IO.File.ReadLines hit disk every time
you re-enumerate it? Or does it store to a container? Many of the stream based
APIs have a .ReadLine() method but no .ReadLines() method, presumably because
they'd have to allocate a container to store the result in case the enumerable
is re-evaluated multiple times, and that seems bad if you don't need it?

Rust seems to lean a lot more heavily on single-shot iterators.

    
    
        let lines = stdin.lock().lines();
        for line in lines { ... }
        for line in lines { ... } // Compile time error: lines moved.
    
        let lines = BufReader::new(File::open("input.tsv")?).lines();
        for line in lines { ... }
        for line in lines { ... } // Compile time error: lines moved.
    

"Does lines() hit disk/stdin every time you re-enumerate it?" is a trick
question: You _can't_ re-enumerate it.

You can resolve these errors by being explicit about storing the input in a
container and iterating that multiple times, or by re-opening the input file,
or by resetting the File position and constructing a new BufReader, depending
on which behavior you wanted for re-enumeration.
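
Python file objects are single-shot in the same way, except that the second loop silently does nothing instead of failing to compile. A rough sketch of two of the fixes above, using `io.StringIO` as a stand-in for an open file (an assumption made to keep the example self-contained):

```python
import io

source = io.StringIO("a\nb\nc\n")  # stand-in for an open file or stdin

first_pass = [line.strip() for line in source]
second_pass = [line.strip() for line in source]  # stream is already at its end
print(first_pass)   # ['a', 'b', 'c']
print(second_pass)  # []

# Option 1: keep the lines in a container and iterate that as often as needed.
stored = list(first_pass)
for _ in range(2):
    for line in stored:
        pass  # every pass sees all three lines

# Option 2: rewind the stream and read it again.
source.seek(0)
third_pass = [line.strip() for line in source]
print(third_pass)   # ['a', 'b', 'c']
```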

------
staticassertion
It's definitely _harder_ in the general case to run into these bugs, but the
real trick, in my opinion, is building APIs where it's impossible.

The 'escape hatch' here is the .by_ref() but you could, if it's critical,
write wrappers that don't provide those escape hatches.

If it's critical that the reuse never happens, you can make it impossible,
which is really cool. But, by default, for most APIs, it's just generally more
difficult to mess up.

There are a bunch of articles on session types out there, where you represent
each state in a state machine as a separate type, with transition methods that
consume the previous state. If the correctness touted in this article sounds
appealing, I highly recommend reading about session types.
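
A rough sketch of the shape of that pattern, with one class per state and transition methods that return the next state. All class and method names are invented for illustration. Python has no ownership, so the "consume the previous state" part is only approximated here with a runtime flag; in Rust the same misuse would be rejected at compile time.

```python
class State:
    """Base class: a state object may be transitioned out of only once."""

    def __init__(self):
        self._consumed = False

    def _check_live(self):
        if self._consumed:
            raise RuntimeError(f"{type(self).__name__} was already consumed")

    def _consume(self):
        self._check_live()
        self._consumed = True


class Disconnected(State):
    def connect(self):
        self._consume()
        return Connected()


class Connected(State):
    def send(self, message):
        # Sending does not change state, so it only checks that we are live.
        self._check_live()
        return f"sent {message}"

    def close(self):
        self._consume()
        return Disconnected()


start = Disconnected()
conn = start.connect()
print(conn.send("hello"))  # sent hello
end = conn.close()

try:
    conn.send("again")  # stale state: the transition already happened
except RuntimeError as error:
    print("rejected:", error)
```

Because `send` exists only on `Connected`, a `Disconnected` value has no way to send at all; that part of the guarantee comes from the separate types rather than from the flag.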

------
thomasjames
Isn't the property of the generator expression not that it is an iterator
(like a range or list also are) but specifically that it is a generator and
thus inherently has state that is permanently/mutably exhaustible across
different scopes? I like the example, and this is the first time I've totally
wrapped my head around Rust ownership, so thanks! I just think precise Python
terminology might keep people from getting confused about different types of
Python iterator objects. Iterators are anything that has a __next__() method:
[https://www.python.org/dev/peps/pep-0234/](https://www.python.org/dev/peps/pep-0234/)

~~~
rahimnathwani
"it is an iterator (like a range or list also are)"

No, range is not an iterator. Try this in either Python 2 or 3:

    
    
      a = range(10)
      next(a)  # TypeError: a range (or list) is not an iterator
    

It will fail. Now try this:

    
    
      a = range(10)
      my_iterator = iter(a)  # calls a.__iter__()
      next(my_iterator)  # returns 0
    

"Iterators are anything that has a __next__() method"

Right, and range(10) does not have a __next__() method, so it's not an
iterator. It's an iterable, which is anything that has an __iter__() method
that returns an iterator.

In python 3, range(10) returns a range object (not a list). Because it's an
iterable and not an iterator, it doesn't get 'used up'. For example, try this
(in Python 3 only):

    
    
      a = range(1000000000) # returns instantly, as it's not building a list
      a[3] # returns 3
      a[3] # returns 3 again
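
The iterable/iterator distinction can also be checked directly. A short Python 3 sketch (the variable names are only for illustration):

```python
numbers = range(3)

print(hasattr(numbers, "__next__"))   # False: a range is not an iterator
print(hasattr(numbers, "__iter__"))   # True: it is an iterable

iterator = iter(numbers)              # same as numbers.__iter__()
print(hasattr(iterator, "__next__"))  # True
print(next(iterator))                 # 0
print(list(iterator))                 # [1, 2], the rest of this iterator
print(list(iterator))                 # [], the iterator is now used up
print(list(numbers))                  # [0, 1, 2], the range itself is not
```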

------
krispbyte
I remember seeing a visualization of the variables and values in a python
program as it ran and how they get linked to one another and that really
helped me understand python variables. Does such a visualization exist for
Rust?

I'm gonna try to hunt down the link for the Python one.

~~~
adamnemecek
Python tutor
[http://www.pythontutor.com/visualize.html#mode=edit](http://www.pythontutor.com/visualize.html#mode=edit)?

~~~
krispbyte
Yes! thank you!

This helped a lot for understanding the behavior of list vs other values, for
example:

[http://www.pythontutor.com/visualize.html#code=a%20%3D%204%0...](http://www.pythontutor.com/visualize.html#code=a%20%3D%204%0Ab%20%3D%20a%0Aa%20%3D%2010%0Aprint%28a,%20b%29%0A%0At%20%3D%20%5B1,2,3%5D%0As%20%3D%20t%0At.append%284%29%0Aprint%28t,%20s%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

I think it would do wonders for ownership.
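
For readers who don't want to follow the link, the `code=` parameter of that Python Tutor URL decodes to roughly the following snippet. The comments are added here and are not part of the original.

```python
a = 4
b = a
a = 10        # rebinding `a` does not affect `b`
print(a, b)   # 10 4

t = [1, 2, 3]
s = t         # `s` and `t` now name the same list object
t.append(4)   # so the change is visible through both names
print(t, s)   # [1, 2, 3, 4] [1, 2, 3, 4]
```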

------
bogomipz
I really enjoyed reading this; I thought the author made these concepts very
accessible.

Perhaps the author might consider writing posts on both traits and lifetimes?

Understanding ownership, traits and lifetimes seems to be the key to
understanding and using Rust effectively. Unfortunately these are also the
things I've had the hardest time getting my head around. If anyone could
suggest similarly accessible articles on traits and lifetimes I would greatly
appreciate it. Cheers.

------
mlthoughts2018
I can’t comment on the general case, but I disagree very strongly with the
author’s perspective about generator expressions here.

First, a generator is a precise type of data structure that can be exhausted.
If you don’t want that, don’t use that structure. It’s foolish to say that
just because a data structure superficially satisfies some API (like being
passable to min), that it “ought to have” some behavior that the author wants.
That’s backwards. If you want certain behavior, choose a data structure that
supports it.

It would be no different from saying if I have a heap and I call sort() it is
faster than if I have the same data in a list and call sort(). They both
support sort() so why aren’t they equally fast? It’s obviously the wrong way
to think about it. Instead, use the appropriate data structure.

And if you’re really worried about the operation being safe, use the provided
default argument, or the pattern for versions before 3.4, as in [0], or write
a wrapper function that catches whatever exceptions and handles them.
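
A brief sketch of both options mentioned here. The `default` keyword of min() and max() exists since Python 3.4; `safe_max` is a hypothetical wrapper name used only for this example.

```python
empty = (x * x for x in range(0))
print(max(empty, default=None))  # None instead of a ValueError


def safe_max(values, fallback=None):
    """Hypothetical wrapper for code that cannot rely on `default`."""
    try:
        return max(values)
    except ValueError:
        return fallback


print(safe_max([]))         # None
print(safe_max([3, 1, 2]))  # 3
```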

This isn’t even the biggest issue though. Generators generally are meant to
function as coroutines and for complex generators, you can use send() to set
the “returned value” of a yield statement inside the generator for when it
resumes, and so a single generator could go through stages of being exhausted
and empty, then having a value set again so that on resumption it’s no longer
empty, and could build back up a whole iterable of data that way.
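
For readers who have not used send(), here is a minimal sketch of a coroutine-style generator. The `running_total` function is made up for illustration and is not taken from the article.

```python
def running_total():
    """Yield the total so far; receive the next amount through send()."""
    total = 0
    while True:
        amount = yield total  # the value passed to send() appears here
        if amount is not None:
            total += amount


accumulator = running_total()
print(next(accumulator))     # 0, runs the generator to its first yield
print(accumulator.send(5))   # 5
print(accumulator.send(10))  # 15
```

Note that send() only works while the generator is suspended at a yield. Once a generator function has returned, further next() or send() calls raise StopIteration.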

So you absolutely do want min() or max() to throw errors when trying to
consume from an empty generator, and you absolutely do want a generator’s
state to become modified by the way values are consumed or added, because it
meaningfully represents the state of the generator. Suppose for a more complex
generator, someone uses send() to feed it a value between your call to min()
and your call to max().

Generator expressions that superficially look like list comprehensions are
just one use case of the general data structure and in fact one of the
“generators 101” pieces of advice you always see is that if you need to
iterate the values multiple times, then populate a persistent data structure
out of the generator, like by wrapping in list().

I bring this up because it frustrates me to see people acting like a type
safety or borrow checking idea is somehow adding something new or solving
some type of problem that the dynamic typing approach doesn’t or can’t solve,
but this is emphatically incorrect and a severely myopic way to look at it.

The dynamic typing approach solves all the same problems, it just facilitates
the solution differently: use the default argument or other patterns for
safety on functions that can fail on empty containers, and use custom wrappers
with exception handling when you need to. Choose a persistent data structure
if you need persistence.

Again, not saying this has anything to do with people’s big picture reasons
for liking static types or borrow checking... just this particular example
strikes a chord with me, highlighting the super annoying myopia about it.

[0]: [https://stackoverflow.com/questions/36157995/a-safe-max-func...](https://stackoverflow.com/questions/36157995/a-safe-max-function-for-empty-lists)

~~~
dnautics
> It’s foolish to say that just because a data structure superficially
> satisfies some API (like being passable to min), that it “ought to have”
> some behavior that the author wants.

I disagree. There is also the "principle of least surprise". To take an
extreme example, suppose you wrote a standard library where push!(array, item)
actually takes the array and deletes it. Would it be foolish to call that bad
design, on the grounds that nothing "ought" to behave a particular way and it's
up to the user to know what's going on under the hood?

~~~
mlthoughts2018
That’s a silly abuse of the idea of least surprise. Documented, long-lived
built-in data structures like generators that support behavior like exhaustion
and mutation in a time-tested way are _clearly_ not some kind of surprising
and useless or pedantic quirk.

I mean, consider implicits in Scala. It’s hard to think of a worse violation
of least surprise, but because it’s documented and time-tested and becomes a
de facto standard in that language, the argument of least surprise becomes
nothing but a preference debate.

The behavior where max raises an exception on an exhausted generator is not at
all surprising, from the point of view of language documentation, ubiquitous
and popular advice on that data structure, etc. etc.

And besides all that: if you did happen to find a serious design bug that
created a surprise problem, there are tons of ways to solve that problem in a
dynamically typed language that wouldn’t require static typing or borrow
checking or any explicit state management. It still would not be evidence that
those things are a superior way to solve it.

------
vbsteven
For me Rust ownership _clicked_ when I was reading a C++ book. More
specifically the sections about references, move and copy.

