Pampy: Pattern Matching for Python (github.com)
126 points by fagnerbrack 4 months ago | 44 comments

Just a quick FYI, but in vanilla Python 3 [1] you can already do a head/tail split on any iterable:

    >>> hd, *tl = range(5)
    >>> hd
    0
    >>> tl
    [1, 2, 3, 4]
[1] Tested on 3.6.5, don't know about older versions

This is due to PEP 448 "Additional Unpacking Generalisations" [1], which took effect in Python 3.5. Other examples:

    x, y, *r, z = range(7)
    print("x: {}, y: {}, r: {}, z: {}".format(x, y, r, z))
    # prints: x: 0, y: 1, r: [2, 3, 4, 5], z: 6
    x, (qa, *qr), *r = [1, [2, 3, 4], 5, 6]
    print("x: {}, qa: {}, qr: {}, r: {}".format(x, qa, qr, r))
    # prints: x: 1, qa: 2, qr: [3, 4], r: [5, 6]
The same PEP allows expanding lists and dictionaries in literals:

    r = [*r1, *r2]    # Same as r = r1 + r2 if they are lists
    d = {**d1, **d2}  # Merges d1 and d2 into d
[1] https://www.python.org/dev/peps/pep-0448/

I was mistaken, sorry. Those unpackings work as of Python 3.0 due to PEP 3132 "Extended Iterable Unpacking" [2]. PEP 448 is just for those examples in the second half of my comment.

[2] https://www.python.org/dev/peps/pep-3132/

One of the very reasons Haskell and Rust are so safe is that pattern match checking in these languages is exhaustive. If you don't cover every possible type constructor for an enum or pattern, the compiler will throw an error.

For example, the Maybe monad used with match must have both Nothing and Just handled during pattern matching. Precompile-time logic checking.

The below will throw a precompile time error:

  handleMaybeNum :: Maybe int -> int
  handleMaybeNum Just a = a
The below will not:

  handleMaybeNum :: Maybe int -> int
  handleMaybeNum Just a = a
  handleMaybeNUm Nothing = -1
Could the same be said for this library? If so, when combined with mypy type checking and functional programming, this could transform Python into a language with Haskell-level safety.

    $ cat maybe.hs 
    handleMaybeNum :: Maybe int -> int
    handleMaybeNum Just a = a
    $ ghci maybe.hs 
    GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
    [1 of 1] Compiling Main             ( maybe.hs, interpreted )
        Constructor ‘Just’ should have 1 argument, but has been given none
        In the pattern: Just
        In an equation for ‘handleMaybeNum’: handleMaybeNum Just a = a
    Failed, modules loaded: none.
You're right about the "precompile time error", although it's not what you claim...

Let's fix the syntax error:

    $ cat maybe_fixed.hs 
    handleMaybeNum :: Maybe int -> int
    handleMaybeNum (Just a) = a
    $ ghci maybe_fixed.hs 
    GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
    [1 of 1] Compiling Main             ( maybe_fixed.hs, interpreted )
    Ok, modules loaded: Main.
No error now. You do get a warning with -Wall, but that's not quite what you claimed. (Newer versions may be more strict, I don't know.)

Hmm my mistake. The match operator in rust is exhaustive though and will raise a compile time error rather than a warning. Elm does this as well. I'm just getting mixed up.

Yeah, I think it's the same for OCaml. I found it funny that Haskell would be the odd one out here.

Author here. Pampy and Pampy.js (https://github.com/santinic/pampy.js) are more similar to Lisp-style Pattern Matching. Of course they cannot provide static analysis, but they can improve understandability, and they are useful with nested structures which are typical of how you solve problems in dynamic languages (less types, more nested dicts-and-lists).

An example:

    pampy.matchAll(json, {_: {age: Number}}, (key, age) => age)

This finds any object with a numeric age that is the value of some key, in a JSON object whose structure we don't know in advance.

> Could the same be said for this library?

This is clearly answered on the page.

"By default match() is strict. If no pattern matches, it raises a MatchError"

That's different. Haskell and Rust pattern matching isn't just sugar for convenience. It doesn't just check if a pattern matched. It checks if your patterns covers every possible state.

Please note that my pseudocode in Haskell was not function calls. It's a function definition using pattern matching in the function signature.

Think of Maybe as an enum with two values. A Just(a) and a Nothing.

The definition below basically says: if the enum enters the function as a Just(int a), return the int a (Just(str) would be a type error caught by the signature). If the enum enters the function as a Nothing, return -1.

  handleMaybeNum :: Maybe int -> int
  handleMaybeNum Just a = a
  handleMaybeNum Nothing = -1 
If the last pattern match (Nothing) isn't written, the compiler knows that your logic isn't handling a specific case, and it lets you know about this BEFORE doing any matching. The Haskell compiler will literally tell you that your function is missing logic to handle certain cases.

It is a simple mechanism that actually proves your programs correct. Type checking proves your program type safe. Exhaustive pattern matching in Haskell proves that your program logic handles every possible permutation of that type entering a function.

It is a powerful mechanism that goes far beyond unit testing. Unit testing just says your program is correct for a specific test case. Haskell pattern matching and type checking provide a proof across every parameter permutation.

I realize Python is an interpreted language, so an error can only be thrown at runtime. What I expect from an exhaustive match check in Python is something like this (pseudocode):

    match(Just(1),
        Just(a), lambda a: a)

    match(Just(1),
        Just(a), lambda a: a,
        Nothing, 1)
Both matches above produce a successful match, with Just(1) matching Just(a) in the first expression. However, with Haskell pattern matching the first match would STILL throw an error, because the Nothing case for the Maybe enum was not handled.

From the examples in the docs it clearly does not do this. You can match across arbitrary types, and it would be unreasonable for the match operator to throw an error if you didn't cover every possible type that exists.

> That's different.

I didn't say it was the same. I said that if he bothered to read the link he would have seen the answer to his question.

Yes, while interesting this is pretty irrelevant for Python which does not have static checks.

You clearly aren't caught up with the latest changes in python. The interpreter doesn't have static checks, but the language supports external tooling to do this.

Look up type annotations for python 3 and the mypy external type checker.

Static type checking is now a very real thing with modern versions of python.

Additionally it is possible to code the match function to implement the functionality I am talking about during runtime instead of compile time. Definitely not irrelevant.
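To make the mypy point concrete, here is a sketch of the exhaustiveness idiom mypy understands; the Just/Nothing classes and the hand-rolled assert_never helper are illustrative only, not from pampy or any particular library:

```python
from typing import NoReturn, Union

class Just:
    def __init__(self, value: int) -> None:
        self.value = value

class Nothing:
    pass

MaybeInt = Union[Just, Nothing]

def assert_never(x: NoReturn) -> NoReturn:
    # mypy flags the call site if any variant was left unhandled,
    # because the argument's narrowed type is then not NoReturn
    raise AssertionError("unhandled case: %r" % (x,))

def handle_maybe_num(m: MaybeInt) -> int:
    if isinstance(m, Just):
        return m.value
    if isinstance(m, Nothing):
        return -1
    assert_never(m)  # delete a branch above and mypy rejects this line
```

This gives a static, Haskell-style exhaustiveness error from `mypy`, even though the interpreter itself never checks it.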

> Additionally it is possible to code the match function to implement the functionality I am talking about during runtime instead of compile time.

Maybe, but I'm not sure there's more value in that.

Having match crash immediately helps you identify the branch error without explicitly writing a test case for it. It is equivalent to raising an exception.

That's misleading. It will only raise an exception if the actual value is not matched. Haskell generates a compile error before even running, if that's something that might happen. It is the old distinction between static and dynamic typing, applied to pattern matching.

When exactly are you expecting Python, an interpreted language, to raise a compiler error?

You can raise the error as soon as the match operator is executed. Even if match has a valid match with the expression, if all the branches aren't covered the function can still raise an error.

This allows for early catching of logic errors during unit testing and negates the need for additional tests.
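As a sketch of that proposal (all names here are hypothetical; this is not pampy's behaviour): a match operator could validate coverage of a known, closed set of variants before dispatching, so an unhandled branch fails even when the actual value would have matched:

```python
# Hypothetical runtime-exhaustive match over a closed variant set.
class Just:
    def __init__(self, value):
        self.value = value

class Nothing:
    pass

MAYBE_VARIANTS = {Just, Nothing}

def exhaustive_match(value, *pairs):
    # pairs alternate: variant class, action, variant class, action, ...
    missing = MAYBE_VARIANTS - set(pairs[::2])
    if missing:
        # raised even when `value` itself would have matched a branch
        names = sorted(cls.__name__ for cls in missing)
        raise TypeError("unhandled variants: %s" % names)
    for variant, action in zip(pairs[::2], pairs[1::2]):
        if isinstance(value, variant):
            return action(value)
```

Here `exhaustive_match(Just(1), Just, lambda j: j.value)` raises immediately, mirroring the Haskell warning at runtime.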

I guess, but I think a possible solution here would be a pylint plugin to show a warning so you don't have to wait for a runtime exception.

> That's misleading.

It's not misleading. It answers the question: there is no compile-time check. The OP didn't have to "wonder" they could have just read the link.

What's the theoretical difference between this kind of pattern matching and polymorphism? Instead of encoding the logic for each branch in a switch, do we get the same kind of exhaustive static guarantees if instead we encode each branch as a polymorphic implementation on each type?

For example, Java's Optional type seems to provide the same static guarantees without pattern matching (assuming it didn't have the unsafe get() method).

You are right to assume a rough equivalence between the two approaches.

For example, you can encode Just as fJust :: a -> (a -> b) -> b and Nothing as fNothing :: b -> b. fJust receives a computation that accepts argument of Just constructor, while fNothing receives just a value to be returned. Or, alternatively, you can look at the type of maybe function which is pattern matching on Maybe in disguise: maybe :: b -> (a -> b) -> Maybe a -> b.

(I believe this method is called Church encoding)
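Translated into Python, the encoding sketched above might look like this (a rough, illustrative sketch; the function names are mine):

```python
# Scott-style encoding of Maybe as plain functions: a value is just
# the way it dispatches to two continuations.
def just(a):
    return lambda on_just, on_nothing: on_just(a)

def nothing():
    return lambda on_just, on_nothing: on_nothing()

def maybe(default, f, m):
    # the `maybe` eliminator: pattern matching on Maybe in disguise
    return m(f, lambda: default)
```

For example, `maybe(-1, lambda x: x + 1, just(41))` dispatches to the first continuation, while `maybe(-1, lambda x: x + 1, nothing())` returns the default.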

But when you go from a structure of types and patterns to functions, you lose the ability to analyze (for exhaustivity or anything else). You are now dealing with something Turing complete instead of a fixed-size structure.

And this is the delimitation between "possible" and "impossible". With the Church encoding, analysis of matching completeness is impossible, and even writing matchers for lists or trees is nigh impossible, actually.

I believe this is actually Scott Encoding.

i think they're considered two sides of the same coin - exhaustive vs extensible, each with its benefits and tradeoffs. this is known as "The Expression Problem"; what follows is a quick and messy intro.

sum types:

you can easily add new functions that accept your sum type, but adding a new "variant" to the type will require modifying every function you've written to support the new case

polymorphism (interfaces):

you can easily add new types that implement your interface, but if you want to add a new method/function to it, you're going to have to implement it for every type that implements your interface

essentially, we've got two "extensibility dimensions": one for adding new operations, and one for adding new types. sum types are good on the operation axis, interfaces are good on the type axis. (it's an open research question if there's a solution that's good on both.) i hope this makes sense!
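A toy Python illustration of the two axes (the Shape types below are made up for the example):

```python
# Interface style: adding a new shape is one new class, but adding a
# new operation means touching every existing class.
class Circle:
    def area(self):
        return 3.14159

class Square:
    def area(self):
        return 1.0

# Sum-type / match style: adding a new operation is one new function,
# but adding a new shape means revisiting every function's branches.
def perimeter(shape):
    if isinstance(shape, Circle):
        return 2 * 3.14159
    if isinstance(shape, Square):
        return 4.0
    raise TypeError(shape)
```

Adding a `Triangle` class is trivial on the left, but silently breaks `perimeter` on the right; adding `perimeter` was trivial on the right, but would have meant editing every class on the left.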

as a side note, when discussing this, it's useful to distinguish ad-hoc polymorphism (subclassing, interfaces/typeclasses) and parametric polymorphism (generics) – here, we're talking about the "ad-hoc" variety. I mention this because i saw some confusion about this in a sibling thread but can't reply there.

Java's Optional type was directly inspired by Scala and Haskell. The concept is the same. However, because Java doesn't have pattern matching as a language feature, internally it is implemented as a design pattern.

Haskell's implementation of "match" is so elegant that it will take you a while to realize that exhaustive pattern matching is a big part of what contributes to Haskell's safety.

I've used pattern matching languages quite a bit (in my case, it's OCaml), so I know what it's like. I'm just curious if there's any fundamental difference between pattern matching and polymorphism. From a cursory glance, it seems like they can statically prove the same properties about programs and are equivalent from that perspective.

If that's the case, then the conversation shifts from "you should use pattern matching" to "when should I use pattern matching?" Which is a pretty interesting question. Logic locality seems to be an obvious criteria (does the logic live with the type, or with the consumer of the type?).

I'm not clear what you're asking here. It sounds to me like you're asking the difference between a table and a chair.

Pattern matching is syntactic sugar for extracting values from a pattern. In a statically typed language the compiler has the option of forcing the user to handle every possible type permutation with a pattern thus ensuring safety.

Polymorphism is just a generic type.

That's as far as I know... could you elaborate more on what you mean?

It seems match checking requires static typing, so that’ll be a job for a mypy extension.

or a pylint extension

Looks neat! Pattern matching is one of those features that I sorely miss when I have to program in Python.

I do feel a bit wary seeing this[1] though, given that `_` can be considered a special character in Python. I suppose you can use `ANY` instead of `_` in the pattern matches, but of course that would not look as clean.

[1] https://github.com/santinic/pampy/blob/4c8e6e0cabada82a5ed79...

I agree. Lots of Python programmers use "_" as a variable name to indicate (by convention) that they don't intend to use the variable yet are syntactically required to give it a name[1]. Some common examples:

    for _ in range(10):

    red, _, _ = rgb(color)

    first, *_, last = some_list
This convention conflicts with the use of a global variable named "_" because the first time a programmer uses this convention and rebinds "_" to something arbitrary it will mask the "_" imported from pampy. Of course a programmer is free to give it a more conventional name:

    from pampy import _ as placeholder
But for my money pampy should have given "_" a real name and left it up to the programmer to give it a short alias if so desired:

    from pampy import placeholder as _
This is in line with common conventions around say, numpy, which is conventionally aliased to the shorter "np" on import:

    import numpy as np
[1]: https://stackoverflow.com/questions/5893163/what-is-the-purp...

EDIT: So it turns out that pampy already has a more explicitly named alternative to "_" called "ANY". So anybody who doesn't like the _ syntax can use "from pampy import ANY" and use that instead.

And here is another pattern matching implementation for Python [1]. It was made for compiler construction tasks and may look familiar to those who have experience with Prolog, Stratego, or Refal.

And here is a toy term rewriting system [2].

[1] https://github.com/true-grue/raddsl

[2] https://github.com/true-grue/code-snippets/blob/master/ttrs....

> And here is an another pattern matching implementation for Python [1].

Oh, I like the prettier syntax:

    calc_rules = alt(
      rule(BinOp("+", Int(X), Int(Y)), to(lambda v: Int(v.X + v.Y))),
      rule(BinOp("-", Int(X), Int(Y)), to(lambda v: Int(v.X - v.Y))),
      rule(BinOp("*", Int(X), Int(Y)), to(lambda v: Int(v.X * v.Y))),
      rule(BinOp("/", Int(X), Int(Y)), to(lambda v: Int(v.X // v.Y)))
    )

This way of sharing variable names between the pattern and the associated action is pretty neat. It's much nicer than Pampy's use of '_' for all variables.

There is also a JavaScript port of this library, by the same author: https://github.com/santinic/pampy.js

It’s neat that Python allows writing such things, but it’d be nice to see what the effect on debugging and stack traces is before writing anything with it.

Pattern matching is actually not very hard to write. There is a portable pattern matcher for scheme that uses macros to provide high quality code generation.

In guile the module (ice-9 match) generally has zero runtime cost compared to equal hand-written code.

This may be true but misses my point: when I navigate a stack trace and try to figure out how execution went through this match process, it will be more difficult to understand than idiomatic Python code, because the tooling is line-oriented.

Good FP debuggers would be able to step through it of course so it can be restated as a “toolchain issue” but that ignores the fact that Python is largely statement oriented, in terms of its execution model.

Sorry, I should have been clearer. The macro way of doing it doesn't mess with stack traces, as it expands to pretty clean scheme code, at least after the optimizer has had its way.

I am sad that most languages don't have macros, since they are a very powerful tool. Not only do they let you abstract the ugliness away from approaches like this Python one (which is actually pretty decent), but they also let you produce good machine code from a "bolted on" construct. The Python matcher is bound to have some overhead, whereas a syntactically expanded one is not.

Ah I see what you mean (I wrote macros in Clojure years ago). Yep that would do it, but would require an AST-rewriting decorator in Python.

That’s an interesting thought! I guess I just found a good project to play on during the holidays. I wonder if something like this would work:

    with matching(expr) as m:
        @m(pattern)
        def do_for_pattern(v):
            ...
        @m(_)
        def catchall(v):
            ...

Something like this?

  class matching:
      def __init__(self, expr):
          self.expr = expr
          self.patterns = []
          self.result = None
      def __call__(self, pattern):
          def _inner(func):
              self.patterns.append((pattern, func))
              return func
          return _inner
      def __enter__(self):
          return self
      def __exit__(self, *args):
          for pattern, func in self.patterns:
              if pattern.match(self.expr):
                  self.result = func(self.expr)
                  break  # first matching pattern wins

Ooh that is a really nice syntax. And it would definitely work as it is essentially:

    m(pattern)(lambda v: ...)
The design would be to consider the execution of the match and dispatch as a “cleanup” of the specification created in the block. Which makes a ton of sense, in a weird way.

May end up using this, or one of us should send a PR to the OP!
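For reference, a self-contained run of that sketch (the matching class is repeated so the snippet executes on its own; a compiled regex stands in for a real pattern object, and a break is added so the first matching pattern wins):

```python
import re

class matching:
    def __init__(self, expr):
        self.expr = expr
        self.patterns = []
        self.result = None
    def __call__(self, pattern):
        # decorator factory: m(pattern) registers the decorated function
        def _inner(func):
            self.patterns.append((pattern, func))
            return func
        return _inner
    def __enter__(self):
        return self
    def __exit__(self, *args):
        # dispatch happens as the block is "cleaned up"
        for pattern, func in self.patterns:
            if pattern.match(self.expr):
                self.result = func(self.expr)
                break  # first matching pattern wins

with matching("hello") as m:
    @m(re.compile(r"h.*"))
    def greet(v):
        return "matched " + v

print(m.result)  # prints: matched hello
```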

Nice! Extending this further to support associative and commutative operations yields functionality similar to that provided in Mathematica. This approach was taken by the MatchPy project, hopefully to be integrated into SymPy so that they can use the Rubi integration ruleset.

Oh my god, I was just considering building exactly the same thing for my project. Thanks a lot :)
