Hacker News new | comments | show | ask | jobs | submit login
The list monad in Perl and Python (plover.com)
67 points by smcgivern on Aug 4, 2015 | hide | past | web | favorite | 17 comments



(The Internet being what this is, this comment is NOT criticism... it's elaboration and explanation.)

So part of what makes this so clean is that this is the "list monad" implementation, and nothing else at all. This is a perfectly fine implementation of the specific "monad", but there's no abstraction between the specific list implementation and the generic monad machinery.

This probably helps more clearly see what's going on with the list monad (arguably the simplest monad implementation that is also non-trivial and slightly hard to follow when you start mixing it with guards and such), but it will get a bit nastier when you factor out the monad machinery and have to invoke it too. Python also gets nastier when you can't fit everything into a lambda because of its restrictions. Plus once you have to put a "def" in the middle, it "pollutes" its way back up the stack, because in order for the closure to work properly it has to be defined inline (inner functions should have access to the outer scope's variables), which means everything above it has to be turned into a full function too.

So, on the one hand it is fair to point out this solution does work and isn't impossibly nasty to look at, but, on the other hand, it isn't generic and it will also tend to break down in practice for Python. Perl's actually comes out slightly nicer since the subroutines are indeed full subroutines so you don't have the lambda limitations to work with, but still impractically complicated if you try to abstract out the "monad" machinery.


This topic is amusing - the question of why you would want to do this has never come up, it is truly just art and playfulness.


An alternative using Python's builtin monadic machinery (comprehensions) with a utility function to aid with threading around the remaining alphabet:

    def to_number(*digits):
        return sum(d * 10 ** n
                   for n, d in enumerate(reversed(digits)))


    def choose1(digits, *exclude):
        for digit in digits.difference(exclude):
            yield digit, digits - {digit}

    all_digits = set(range(10))

    print [
        (send, more, money)
        for s, e, n, d, send, digits in (
            (s, e, n, d, to_number(s, e, n, d), digits)
            for s, digits in choose1(all_digits, 0)
            for e, digits in choose1(digits)
            for n, digits in choose1(digits)
            for d, digits in choose1(digits))
        for m, o, r, more, digits in (
            (m, o, r, to_number(m, o, r, e), digits)
            for m, digits in choose1(digits, 0)
            for o, digits in choose1(digits)
            for r, digits in choose1(digits))
        for y, money, digits in (
            (y, to_number(m, o, n, e, y), digits)
            for y, digits in choose1(digits))
        if send + more == money]
I could've actually had the base alphabet called `digits`, but found calling it `all_digits` perhaps less confusing so then `digits` is the one that's only internally visible in the comprehension. This solution is lazy internally, then eagerly pumped to full exhaustion; usually changing the outer `[...]` to `next(...)` is more useful.

I find it very useful to remember that comprehensions deeply connected to monads: https://en.wikipedia.org/wiki/Monad_(functional_programming)...


Further showing the advantage of the pythonic way, is that the above construction remains about as clear when you refactor it for better pruning:

    (
        (to_number(s, e, n, d),
         to_number(m, o, r, e),
         to_number(m, o, n, e, y))

        for d, digits in choose1(digits)
        for e, digits in choose1(digits)
        for y, digits in choose1(digits)
        if (d + e) % 10 == y

        for n, digits in choose1(digits)
        for r, digits in choose1(digits)
        if (n + r + (d + e) // 10) % 10 == e

        for o, digits in choose1(digits)
        if (e + o + (n + r + (d + e) // 10) // 10) % 10 == n

        for s, digits in choose1(digits, 0)
        for m, digits in choose1(digits, 0)
        if (s + m + (e + o + (n + r + (d + e) // 10) // 10) // 10) % 10 == o
        if (s + m + (e + o + (n + r + (d + e) // 10) // 10) // 10) // 10 == m)
Ends up being several order of magnitudes faster since it partially sums each place and prunes aggressively. You can eliminate the sub-expressions, but it has noise impact on the run time (likely due to trading off temporary tuples for cheap arithmetic):

    (
        (to_number(s, e, n, d),
         to_number(m, o, r, e),
         to_number(m, o, n, e, y))

        for d, digits in choose1(digits)
        for e, digits in choose1(digits)
        for y, digits in choose1(digits)
        if (d + e) % 10 == y

        for n, digits in choose1(digits)
        for r, digits in choose1(digits)
        if (n + r + (d + e) // 10) % 10 == e

        for o, digits in choose1(digits)
        if (e + o + (n + r + (d + e) // 10) // 10) % 10 == n

        for s, digits in choose1(digits, 0)
        for m, digits in choose1(digits, 0)
        if (s + m + (e + o + (n + r + (d + e) // 10) // 10) // 10) % 10 == o
        if (s + m + (e + o + (n + r + (d + e) // 10) // 10) // 10) // 10 == m)



Nice work!

In the perl version, perhaps `remove` would be a bit faster to just delete a hash slice.

    sub remove {
      my ($b, $a) = @_;
      my %h = map { $_ => 1 } @$a;
      delete $h{@$b};
      return [ keys %h ];
    }


Anyone care to explain why this doesn't return a gigantic list of empty arrays and one solution tuple? I don't see where the non-solution []'s are removed from the list.


There's an explanation in the earlier article (http://blog.plover.com/prog/haskell/monad-search.html), but it's because the `bind`/`bd` function takes the solution arrays from its callees, and concatenates the arrays into one big array of solutions, which it then returns to its caller.

In the Python code I think this is built into the behavior of the `+=` operator.

In the Perl code it's part of the way the `map` call works; it combines what would be `concat . map` in Haskell.


Thanks for explaining and thanks for the post. It was helpful seeing an example of why you might want to use a monad.


Can someone explain why the Python version is lazy? I don't understand where the lazy generator appears.


> Can someone explain why the Python version is lazy? I don't understand where the lazy generator appears.

It isn't. It could be lazy if `bind` was defined as

    return itertools.chain.from_iterable(map(f, xs))
or

    for x in xs:
        yield from f(x)
(`unit` could also be defined as `yield x` but that wouldn't make much of a difference)

The provided program returns a `list` of all solutions though, not any sort of iterator, generator or otherwise.

This is confirmed by the runtime being anything other than instantaneous, a lazy version would just print the repr of the lazy iterator (say `<generator object bind>`), not the actual result (unless the iterator defined an __repr__/__str__ but very very few do, and those that do have little to nothing to compute e.g. dict views)

Original program:

    > time python3.4 test_list.py
    [(9567, 1085, 10652)]
    python3.4 test_list.py  6.43s user 0.02s system 99% cpu 6.480 total
Converted to `yield from` and `yield`:

    > time python3.4 test_yield.py
    <generator object bind at 0x1066ba7e0>
    python3.4 test_yield.py  0.04s user 0.01s system 84% cpu 0.062 total
nb: this comment assumes Python 3 is being used given the print function, Python 2 doesn't have `yield from` and you'd need `itertools.imap` as the builtin `map` is eager

nb2: on my machine, reifying the lazy version has roughly the same runtime as the eager version in CPython (2.7.10 and 3.4.3), in PyPy (2.6.0 ~ CPython 2.7.9) the reified lazy version (using chain.from_iterable) runs in half the time (~3s) as CPython and the eager version runs in 1/6ths the time (~1s)


Thanks for the thorough analysis--you answered several of the follow-up questions I might have had.


What is the time to actually run the iterator to the end and extract and print the solutions?


As indicated in the second note, in CPython (2.7 or 3.4) it's roughly the same as the eager version. In pypy the lazy version is about 3 times slower than the eager version.

Version from your post:

    > time python2.7 test_eager.py                           
    [(9567, 1085, 10652)]
    python2.7 test_eager.py  6.41s user 0.02s system 99% cpu 6.451 total
    > time python3.4 test_eager.py                           
    [(9567, 1085, 10652)]
    python3.4 test_eager.py  6.28s user 0.03s system 99% cpu 6.372 total
    > time pypy test_eager.py                                
    [(9567, 1085, 10652)]
    pypy test_eager.py  0.95s user 0.06s system 98% cpu 1.022 total
Using `yield from` (3.4-only) and wrapping the solutions() call in a `list()`:

    > time python3.4 test_yield.py                    
    [(9567, 1085, 10652)]
    python3.4 test_yield.py  6.28s user 0.02s system 99% cpu 6.313 total
Using chain.from_iterable + imap (or map in 3.4) and wrapping the solutions() call in a `list()`:

    > time python2.7 test_imap.py                    
    [(9567, 1085, 10652)]
    python2.7 test_imap.py  6.52s user 0.02s system 99% cpu 6.558 total
    > time python3.4 test_imap.py                    
    [(9567, 1085, 10652)]
    python3.4 test_imap.py  6.26s user 0.02s system 99% cpu 6.292 total
    > time pypy test_imap.py                    
    [(9567, 1085, 10652)]
    pypy test_imap.py  2.87s user 0.06s system 99% cpu 2.943 total
Versions used: Python 2.7.10, Python 3.4.3, PyPy 2.6.0 with GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57); all from macports


[flagged]


Gotta love the people downvoting you without thinking twice.


For those who didn't get it - mjd is the author of the blog post linked to. Although he's also right that it's a stupid comment that deserves downvotes :)


It's a stupid comment and deserves downvotes.




Applications are open for YC Winter 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: