
Strange line of Python - Swizec
http://swizec.com/blog/strangest-line-of-python-you-have-ever-seen/swizec/3012
======
danieldk

        state = [st for s in state for st in states[(s,letter)]]
    

What's so hard about this?

\- _states_ is a dict with transitions. The key is a tuple of a state and a
symbol. If the value is a non-empty list, that means there is an outgoing arc
from the state to another state with that symbol.

\- _state_ is the list of states you are currently in (remember, this is an
NFA, so you can be in more than one state).

So, what this fragment declaratively says: the new list of states is obtained
by following the outgoing arcs with the given symbol from the current list of
states.

The comprehension would be more comprehensible (no pun intended) like this:

    
    
        newStates = [newState for curState in curStates
          for newState in transitions[(curState,letter)]]
        # curStates = newStates
    

Edit: line break to avoid the need to scroll.
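
To make that concrete, here is a minimal runnable sketch. The NFA itself (states 0 to 2, the arcs, and the helper name `step`) is made up for illustration and is not taken from the article; a `defaultdict` stands in for "no outgoing arc means an empty list".

```python
from collections import defaultdict

# Hypothetical NFA: the states and arcs below are assumptions for the example.
transitions = defaultdict(list)  # a missing key means "no outgoing arc"
transitions[(0, "a")] = [0, 1]
transitions[(0, "b")] = [0]
transitions[(1, "b")] = [2]


def step(cur_states, letter):
    """Follow every arc labelled `letter` out of every current state."""
    return [new_state
            for cur_state in cur_states
            for new_state in transitions[(cur_state, letter)]]


states_after_a = step([0], "a")               # [0, 1]
states_after_ab = step(states_after_a, "b")   # [0, 2]
```

Note that duplicates are kept: if two current states lead to the same target, it appears twice in the result.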

~~~
pak
You made a great point, indirectly; a lot of the confusion from the original
line of code is the crappy naming of the variables. They should have thrown in
_sta_ and _stat_ to really amp up the Scrabble-like atmosphere.

Once you clearly distinguish the _purpose_ of each variable with a well chosen
name, as you did in your rewrite, the code suddenly lights up the room. Good
variable names are important!

The linebreak to break up the three-way expression also helps to suggest the
order of evaluation. For some reason, our brains do not appreciate nested
ternary operators on the same line. It's the same problem as

    
    
        A==B ? C : D==E ? F : G;
    

If you are familiar with the left- or right-associativity of the operator,
this isn't impossible to read. For instance, if you write C, you already know
it's right-associative and therefore executes much like it reads from left to
right. If so, it's still nicer to convey that with some line breaks:

    
    
        A==B ? C :
        D==E ? F :
               G ;
    

On the other hand, if you are using a language like PHP where it's left-
associative, or you're not 100% positive either way, you will be shooting
yourself in the foot trying to do something like that and brilliantly
misleading anybody else that reads the code.
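
For comparison, Python's own conditional expression also groups to the right, so the same layout trick carries over. A small sketch, where the function name `classify` and the string results are placeholders of mine:

```python
def classify(a, b, d, e):
    # Groups as: "C" if a == b else ("F" if d == e else "G")
    return ("C" if a == b else
            "F" if d == e else
            "G")


first = classify(1, 1, 3, 3)   # a == b wins, so the second test is never consulted
second = classify(1, 2, 3, 3)
third = classify(1, 2, 3, 4)
```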

------
tudorachim

        state = [st for s in state for st in states[(s,letter)]]
    

can be read as

    
    
        new_state = []
        for s in state:
            for st in states[(s, letter)]:
                new_state.append(st)
        state = new_state

~~~
lmm
I'm a big python user and I didn't realize this. It seems like the nesting
order for list comprehensions is backwards? I would naturally read the line as
state = [st for s in [state for st in states[(s,letter)]]] because that's the
way the natural language brackets, but this line is obviously broken. I would
expect the syntax for the for loop you write to look like "state = [st for st
in states[(s,letter)] for s in state]", which I think reads very clearly and
understandably. So can anyone explain why the nesting works the way it does?

~~~
divtxt
I too expected the syntax of list comprehensions to be chained.

Turns out that nesting is a feature:

<http://www.python.org/dev/peps/pep-0202/>

 _\- The form [... for x... for y...] nests, with the last index varying
fastest, just like nested for loops._
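
A quick way to see "last index varying fastest" is to put both loop variables in the result. The inputs here are arbitrary, picked only for the demonstration:

```python
pairs = [(x, y) for x in "ab" for y in range(3)]
# [('a', 0), ('a', 1), ('a', 2), ('b', 0), ('b', 1), ('b', 2)]

# The same thing as explicit nested loops, in the same left-to-right order.
loop_pairs = []
for x in "ab":
    for y in range(3):
        loop_pairs.append((x, y))
```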

------
lloeki
If I understand correctly what is asked:

> At first it looks just like a double loop. But then you notice the right-
> most for
    
    
        for st 
    

> is taking the list to iterate over from its own body,
    
    
        in states[(s,letter)]
    

> which is the iterator of the left-most for loop
    
    
        for s in state
    
    

Well I don't get what's weird (or I'm actually missing something in the post),
it actually IS a double loop. [0] says:

    
    
        Only the outermost for-expression is evaluated immediately, the other expressions are deferred until the generator is run:
        
            g = (tgtexp  for var1 in exp1 if exp2 for var2 in exp3 if exp4)
        
        is equivalent to:
    
            def __gen(bound_exp):
                for var1 in bound_exp:
                    if exp2:
                        for var2 in exp3:
                            if exp4:
                                yield tgtexp
            g = __gen(iter(exp1))
            del __gen
    

So exp2, 3, 4 can each depend on the previous levels var1, 2, 3...
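
The "evaluated immediately" versus "deferred" part can be observed directly. This is only a sketch: `missing_outer` and `missing_inner` are deliberately undefined names I made up to provoke a `NameError`.

```python
def make_eager():
    # The outermost iterable is evaluated when the generator is created,
    # so the NameError is raised right here.
    return (x for x in missing_outer)  # deliberately undefined name


def make_deferred():
    # The inner iterable is only evaluated once iteration starts.
    return (y for x in [1] for y in missing_inner)  # deliberately undefined name


try:
    make_eager()
    eager_failed_at_creation = False
except NameError:
    eager_failed_at_creation = True

deferred = make_deferred()  # no error yet
try:
    next(deferred)
    deferred_failed_on_next = False
except NameError:
    deferred_failed_on_next = True
```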

I wrote something like that in a comment on SO[1]

    
    
        return reduce(lambda a, v: (x for v in a for x in kidsFunc(v)), xrange(generation), [val])
    

[0] <http://www.python.org/dev/peps/pep-0289/#the-details>

[1] <http://stackoverflow.com/questions/1016997/generate-from-generators/1017105#1017105>

~~~
buro9
It reminded me of the SQL double NOT EXISTS:

    
    
      SELECT DISTINCT store_type
        FROM stores
       WHERE NOT EXISTS (
               SELECT *
                 FROM cities
                WHERE NOT EXISTS (
                        SELECT *
                          FROM cities_stores
                         WHERE cities_stores.city = cities.city
                           AND cities_stores.store_type = stores.store_type
                      )
             );

~~~
ghoul2
find the store_types such that at least one store of that type exists in at
least one city?

this is probably nerd sniping :(
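
Unpicking the double negation in Python suggests a stricter reading, though: "there is no city that lacks this store type", i.e. store types present in every city. A minimal sketch with made-up table contents (the data and the name `in_every_city` are assumptions, not anything from the query's real schema):

```python
# Hypothetical table contents, invented for the example.
cities = ["Ljubljana", "Maribor"]
store_types = ["bakery", "bookshop"]
cities_stores = {  # (city, store_type) pairs that exist
    ("Ljubljana", "bakery"),
    ("Maribor", "bakery"),
    ("Ljubljana", "bookshop"),
}

# NOT EXISTS (a city WHERE NOT EXISTS (a matching cities_stores row))
# is the same as: for all cities, a matching row exists.
in_every_city = [store_type for store_type in store_types
                 if all((city, store_type) in cities_stores for city in cities)]
```

Here only "bakery" survives, because "bookshop" is missing from Maribor.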

------
chernevik
Strangest line of python . . . ? This guy should come look at my code base.

------
llambda
I ran a quick test to see what's going on:

    
    
        >>> state = [1, 2, 3]
        >>> states = [4, 5, 6]
        >>> [st for s in state for st in states]
        [4, 5, 6, 4, 5, 6, 4, 5, 6]
    

So basically it's looping over 'states' three times. Once for each element in
'state'. The syntax seems strange at first but in reality it's just acting
like a nested for loop.

This could be written as:

    
    
        new_list = []
        for s in state:
            for st in states:
                new_list.append(st)
    

First we say for every element s in state, then for every element st in
states, append st to our list new_list. Because the inner loop never uses s,
the contents of states are simply appended once per element of state.

edit: formatting and better explanation and fixing result (thanks jhdevos,
it's too early to think without coffee!)

~~~
jhdevos
You've made a mistake with cut'n'paste, methinks. Actually, it is:

    
    
      >>> state = [1, 2, 3]
      >>> states = [4, 5, 6]
      >>> [st for s in state for st in states]
      [4, 5, 6, 4, 5, 6, 4, 5, 6]

~~~
Herald_MJ
So the code can be simplified as

        >>> states * len(state)

~~~
masklinn
This code yes, in TFA's code the inner iteration (on `states`) depends on the
result of the outer iteration (on `state`).

For what it's worth, it's pretty easy to "unwrap" a Python listcomp: it's a
sequence of nested loops, nested tests and a mapping.

    
    
        result = [mapping for iteration1 for iteration2 for iteration3 if test1 if test2]
    

is equivalent (modulo some scoping interaction with the code outside it) to

    
    
        result = []
        for iteration1:
            for iteration2:
                for iteration3:
                    if test1:
                        if test2:
                            result.append(mapping)
    

Apart from the final mapping, listcomps are executed strictly left to right:
an evaluation can use anything produced to the "left" of it.
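
A concrete instance of that unwrapping, with two iterations and two tests. The data and variable names are invented for the example:

```python
# Hypothetical data, made up for illustration.
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# `row` is produced by the first iteration and used by the second;
# `value` is produced by the second and used by both tests and the mapping.
comp = [value * 10
        for row in rows
        for value in row
        if value % 2 == 1
        if value > 2]

unwrapped = []
for row in rows:
    for value in row:
        if value % 2 == 1:
            if value > 2:
                unwrapped.append(value * 10)
```

Both give `[30, 50, 70, 90]`.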

------
perfunctory
> this line nobody understands.

Gosh, this is a trivial nested loop.

------
jlarocco
Well that was a let down. That author needs to spend more time reading the
docs and less time creating hyperbole.

------
gujk
Strangest line of python is introductory Haskell. The use of 's' and 'st' and
'state' is the most bizarre part. Why not 'sta' to maintain arithmetic
progression? Or s t u since brevity is the goal?

~~~
nooneelse
Indeed. Renaming everything in most functions (lines/snippets/etc) to be short
names that all start the same will result in nigh unto gibberish.

Humans distinguish words from similar words largely by the first distinct
letter in them, with the size/shape of the remaining bit also helping (no way
I could find a citation at the moment, but I remember this from several
studies I read in a cog-sci course). So the work (and time) to read something
goes up with more similarly starting (and shaped) words in the possibility
pile.

Also known as the "don't name characters Sauron and Saruman and then blame the
readers for getting confused" rule.

------
dcolish
This is really well explained in the docs:
<http://docs.python.org/reference/expressions.html?highlight=generator%20series#list-displays>

------
densh
Python version is definitely written in Perl style (pragmatic and write-only).
I've rewritten it to make it readable:

<https://gist.github.com/1377187>

~~~
BrandonM
Your version looks like a Java/C# style to me. It's not really necessary to
create a class for this.

~~~
densh
At first I had a version that wasn't using classes but it was sort of ugly as
we have relatively large state (current_states, final_states, transitions) and
you have to pass it along to every other function. It looks a little bit ugly:

<https://gist.github.com/1377685>

Actually I'm a big fan of using plain functions whenever possible, but in this
case it doesn't seem to be the right solution.

------
cousin_it
I think the easiest way to understand the code is just to switch the loops
around, like this:

    
    
        state = [st for st in states[(s,letter)] for s in state]

~~~
acqq
Not the same, the loop you suggest doesn't work when s is not initialized
before:

    
    
        >>> state
        [0, 1, 2]
        >>> states
        [[11, 12], [21, 22], [31, 32]]
        >>> s=None
        >>> [ st for st in states[ s ] for s in state ]
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: list indices must be integers, not NoneType
        >>> s
        >>> [ st for s in state for st in states[ s ] ]
        [11, 12, 21, 22, 31, 32]

~~~
cousin_it
Thanks! Ouch, indeed my code doesn't work. I didn't test it, but honestly
expected it to work. In fact it still seems perverse to me that Python doesn't
allow consistent right-to-left scoping in comprehensions! In other words,
given a list of lists:

    
    
        z = [[1],[2],[3]]
    

Python wants me to write something like this:

    
    
        [x for y in z for x in y]
    

instead of the more natural-looking:

    
    
        [x for x in y for y in z]
    

Does this feel weird to anyone else? Are there any good reasons why Python
can't allow both?

~~~
llimllib
> Are there any good reasons why Python can't allow both?

That is, you want [a for b in c for a in b] to represent _both_ foreach(c,
foreach(b)) _and_ foreach(b, foreach(c))? Surely it must choose one or the
other.

You can sensibly argue that the order should be reversed, but not that python
should allow both. There should be only one obvious way to do things.

> Does this feel weird to anyone else?

Not to me; [(x,y) for x in a for y in b] translates simply to:

    
    
        ret = []
        for x in a:
            for y in b:
                ret.append((x,y))

~~~
cousin_it
Yeah, you're right. Allowing both is a non-starter. Well, it could try to
infer the direction by looking which way the bindings go, e.g. [a for b in c
for a in b] would represent foreach(c, foreach(b)) because otherwise b would
be unbound in the outer loop... but that doesn't sound like an especially good
idea, having syntax change depending on which variable bindings exist.

So all in all, I guess I'd prefer it to always go right-to-left: [a for a in b
for b in c]. That would be consistent with nested comprehensions (which read
right-to-left), but not with nested loops (which read top-to-bottom). YMMV.
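
To spell out the two cases being compared, a small sketch (the list `z` and the doubling are arbitrary choices for the example):

```python
z = [[1], [2, 3], [4]]

# Flat comprehension: the loops read left to right, outer loop first.
flat = [x for y in z for x in y]

# Nested comprehension: the inner comprehension is the *expression* of the
# outer one, so the outer loop (for y in z) ends up on the right.
nested = [[x * 2 for x in y] for y in z]
```

The first gives `[1, 2, 3, 4]`, the second `[[2], [4, 6], [8]]`, so the two forms are not interchangeable: one flattens, the other preserves the structure.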

------
zidar
may be trivial but to me it looks weird

    
    
      print aa for aa in someList
    

means "print aa" is the body of the for loop in this case, and not knowing
Python at all, I thought it would be the same for more nested for loops

    
    
      (print x for x in something) for something in somethingElse
    

anything else would just seem weird to me, but I guess that print x is actually
the body of both nested for loops.

------
djtriptych
Anyone else just learn that 'beseda' means 'word' in Slovenian?

~~~
darkmethod
Thanks for pointing that out. That was clearly the only _strange_ part.

