Pattern Matching with TypeScript (alabor.me)
95 points by mweibel 139 days ago | 51 comments



TC39 published its stage 0 proposal for pattern matching in ECMAScript a couple of days ago: https://github.com/tc39/proposal-pattern-matching


Pattern matching without support for algebraic datatypes really misses the point. What you want is to be able to declare all the shapes of your data, with descriptive names, and then be able to say definitely "these are the only such shapes."

Pattern matching without exhaustiveness checking and algebraic datatypes is really just a disguise: syntax level changes when the issue is a semantic (i.e. language-level) problem.


> Pattern matching without support for algebraic datatypes really misses the point.

That's really not true. Erlang has neither static types nor sum types (which I expect is what you mean by ADT, rather than including product types), yet several of its constructs (not least of which is "assignment") make great use of pattern matching. Pattern matching is one of the most important tools in Erlang's toolbox.

> What you want is to be able to declare all the shapes of your data, with descriptive names, and then be able to say definitely "these are the only such shapes."

> Pattern matching without exhaustiveness checking and algebraic datatypes is really just a disguise

Also not necessarily true, closed sum types are probably the best default but OCaml's polymorphic and extensible variants are genuinely great tools.


'the point' that OP cares about is exhaustiveness. one can disagree of course; but without exhaustiveness what you really have is destructuring (this is not what OP cares about)- erlang is a fine example


Without exhaustiveness checking, you still get a more advanced typecase-like control flow construct, not just destructuring.

In practice (and non-trivial code), exhaustiveness checking plays much less of a role, especially if you already have an object-oriented language.

In basic OO languages, you get exhaustiveness through abstract methods AND get the ability to add subtypes to an existing sum type.

One advantage of pattern matching is that it's easier to add actions (instead of subtypes), a.k.a. the expression problem. But that's a modularity issue, has nothing to do with exhaustiveness, and can also go the other way (as pattern matching makes it hard to add new variants to an ADT in a modular fashion).

Another advantage is that pattern matching can be a poor person's tree parser, but for that, exhaustiveness doesn't buy you much, because ADT patterns also typically allow for patterns that aren't allowed by the underlying tree grammar, so you'll have an `_ -> assert false` fallback case often enough.


> But that's a modularity issue, has nothing to do with exhaustiveness, and can also go the other way (as pattern matching makes it hard to add new variants to an ADT in a modular fashion).

I would disagree on this point. With exhaustiveness checks you at least get errors in every place where you match against a data type you just extended with a new variant. So your compiler can point you to the places you have to edit after adding new variants to your ADT. That's a helpful feature.


Note that I talked about doing it in a modular fashion. With an ADT, you have to rewrite every single pattern match in your code to support the new variant. (And you don't even get warnings if you have a general fallback case that matches everything.)

In an OO language, the equivalent to simple pattern matching would be to dispatch on an abstract method of a shared supertype. If you add a new type and forget to implement the abstract method, you get a compiler error. There's your exhaustiveness checking and it's entirely modular as you don't have to touch your original code.

As I said, that's one side of the expression problem. OO languages conversely have a problem (unless they support multiple dispatch or methods external to a class) when they have to add new actions. With ADTs, that's just a new function. With inheritance, you have to add a new method to every single class that implements a variant.
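
A minimal TypeScript sketch of the two sides described above. All names here (`Expr`, `Term`, `Neg`, and so on) are hypothetical and chosen only for illustration. With the closed union, a new action is just a new function; with the abstract class, a new variant is just a new subclass that the compiler forces to implement `evaluate`.

```typescript
// Side 1: a closed union. Adding an action is one new function;
// adding a variant means revisiting every switch below.
type Expr =
  | { tag: "num"; value: number }
  | { tag: "add"; left: Expr; right: Expr };

function evalExpr(e: Expr): number {
  switch (e.tag) {
    case "num":
      return e.value;
    case "add":
      return evalExpr(e.left) + evalExpr(e.right);
  }
}

// A second action, added without touching Expr or evalExpr.
function showExpr(e: Expr): string {
  switch (e.tag) {
    case "num":
      return String(e.value);
    case "add":
      return "(" + showExpr(e.left) + " + " + showExpr(e.right) + ")";
  }
}

// Side 2: an abstract base class. Adding a variant is one new subclass;
// adding an action means adding a method to every subclass.
abstract class Term {
  abstract evaluate(): number;
}

class Num extends Term {
  readonly value: number;
  constructor(value: number) {
    super();
    this.value = value;
  }
  evaluate(): number {
    return this.value;
  }
}

class Add extends Term {
  readonly left: Term;
  readonly right: Term;
  constructor(left: Term, right: Term) {
    super();
    this.left = left;
    this.right = right;
  }
  evaluate(): number {
    return this.left.evaluate() + this.right.evaluate();
  }
}

// A new variant, added without touching Term, Num or Add.
// Leaving out evaluate() here would be a compile error.
class Neg extends Term {
  readonly inner: Term;
  constructor(inner: Term) {
    super();
    this.inner = inner;
  }
  evaluate(): number {
    return -this.inner.evaluate();
  }
}
```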


The expression problem has two sides, right. I'm not advocating that pattern matching on ADTs is "the better" solution. Neither is the OO approach.

There usually isn't a 100% modular solution, because you want to stay flexible in both dimensions, to some extent. You have to choose between trade-offs (as always).

That's why I like hybrid languages like Scala, where I can make an informed choice about whether I'm more likely to add new variants or more functionality in the future. But when I have to do the thing I didn't anticipate in the first place, I want help from the compiler to avoid errors. That's why an exhaustiveness checker for pattern matching is a very welcome and neat feature.


> There usually isn't a 100% modular solution, because you want to stay flexible in both dimensions, to some extent. You have to choose between trade-offs (as always).

There are actually language mechanisms that can get you both: multimethods, for example, or their ultimate generalization, predicate dispatch. Few languages support them, though, because they are overkill for most application domains (though there are some, such as computer algebra, where they can be very useful).

Note also that while many OO languages constrain you in having methods only occur within a class definition, this is not a requirement. Multimethods invariably need to be more flexible (because there's no privileged "self" object), but the multimethod approach can also be applied to single dispatch. See also partial classes and related techniques.

Also, ADTs in their basic form are still a very specialized, limited form of sum types. Even if you want sum types, better extensibility is often needed (though ADTs remain a very useful special case), as well as the ability to actually describe subtypes of a sum type.

See how OCaml ended up with having polymorphic variants and extensible variant types and Scala with case classes, for example.


it is well known the expression problem is trivial in dynamic (untyped) languages—however, static type safety is a requirement of the problem definition of the expression problem.


I didn't mention dynamic typing anywhere?


no, but multi-methods are 99% of the time found in dynamic languages, right?


1. Multimethods simply are a rare feature in general.

2. Multiple dispatch and static typing are completely orthogonal features.

3. Strictly speaking, you don't even need multiple dispatch. The key feature of multimethods that you need is the ability to add a method outside the class body.


under definition 3 then extension methods are also multi-methods - doesn't seem right. even c# has extension methods


Extension methods do not support dynamic dispatch and hence are not proper instance methods. Try MultiJava's open classes for an approach that works.


> without exhaustiveness what you really have is destructuring- erlang is a fine example

That's completely nonsensical. Erlang does not have exhaustiveness yet is not limited to destructuring.

In fact, mere destructuring is probably the least common of pattern matching's uses in Erlang. Much more common are asserting, filtering and dispatching (all of which may make further use of destructuring internally, but destructuring is a sub-component of the match rather than the principal).


Not just destructuring, see https://elixir-lang.org/getting-started/pattern-matching.htm... for example


The implementation in the blog post does handle exhaustiveness via the "default" case of the switch statement.


What we want from exhaustive pattern checks is the ability to check on all cases without falling back on the "default" case. Having the compiler warn you about missing checks is a powerful thing to have.

This is easy to achieve if you're matching on a variant type, since there's only a finite number of them.


> Pattern matching without support for algebraic datatypes really misses the point.

The idea that pattern matching has only one point misses the point.

> What you want is to be able to declare all the shapes of your data, with descriptive names, and then be able to say definitely "these are the only such shapes."

While that is useful, it's not always the most important thing I want from pattern matching; it's usually a nice-to-have. If I can match on the shapes I can meaningfully handle at a particular point and use a default case with appropriate behavior (which may be to report an error condition) for any others, that's often enough.


The problem is that blog posts often start out with the phrase "pattern matching like haskell or scala" and then describe a mechanism that is not like those—it misrepresents and conflates what those languages provide.


You can get exhaustiveness checking in some cases in TS by adding a default case to a switch and assigning the value to a variable of type 'never'. This is a bit cumbersome, of course, and the fact that it's opt-in partly defeats the purpose. You typically want exhaustiveness checking to get more safety, but here the safety only comes if you add it manually, so everywhere you forget, you lose it.
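
A minimal sketch of that 'never' trick, assuming a hypothetical `Shape` union and an `assertNever` helper (neither comes from the article). The helper only type-checks when the switch has already ruled out every member of the union.

```typescript
// Hypothetical shapes, used only for illustration.
type Shape =
  | { kind: "square"; side: number }
  | { kind: "rect"; width: number; height: number };

// Accepts only a value that the compiler has narrowed to `never`.
function assertNever(value: never): never {
  throw new Error("Unhandled case: " + JSON.stringify(value));
}

function area(shape: Shape): number {
  switch (shape.kind) {
    case "square":
      return shape.side * shape.side;
    case "rect":
      return shape.width * shape.height;
    default:
      // If a new member is added to Shape and not handled above,
      // `shape` is no longer `never` here and this call stops compiling.
      return assertNever(shape);
  }
}
```

The check is still opt-in: a switch written without the `default` branch gets no such protection unless the function's return type happens to catch the gap.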


The `Payment` example in the post is effectively an implementation of algebraic datatypes. `PaymentPattern` is how the code specifies "these are the only such shapes", and exhaustiveness is checked because the typechecker won't let you provide a `PaymentPattern` that omits any cases. It's certainly not as flexible as pattern matching like you'd get in a serious functional language, but I think it handles the use case "condition over cases of an algebraic type" just fine.
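
Roughly, the idea looks like the sketch below. This is a hypothetical reconstruction: only the names `Payment` and `PaymentPattern` come from the post, while the fields, the two cases and `describePayment` are assumptions made for illustration, and the article's real definitions may differ.

```typescript
// One handler per case; T is whatever the match should produce.
interface PaymentPattern<T> {
  card: (amount: number, fee: number) => T;
  cash: (amount: number, discount: number) => T;
}

interface Payment {
  match<T>(pattern: PaymentPattern<T>): T;
}

class CardPayment implements Payment {
  private readonly amount: number;
  private readonly fee: number;
  constructor(amount: number, fee: number) {
    this.amount = amount;
    this.fee = fee;
  }
  match<T>(pattern: PaymentPattern<T>): T {
    return pattern.card(this.amount, this.fee);
  }
}

class CashPayment implements Payment {
  private readonly amount: number;
  private readonly discount: number;
  constructor(amount: number, discount: number) {
    this.amount = amount;
    this.discount = discount;
  }
  match<T>(pattern: PaymentPattern<T>): T {
    return pattern.cash(this.amount, this.discount);
  }
}

// Omitting either `card` or `cash` from the object literal is a compile
// error, which is where the exhaustiveness check comes from.
function describePayment(payment: Payment): string {
  return payment.match({
    card: (amount, fee) => "card: " + (amount + fee),
    cash: (amount, discount) => "cash: " + (amount - discount),
  });
}
```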


> The `Payment` example in the post is effectively an implementation of algebraic datatypes.

Algebraic data types are what we call "initial": they're characterized by how they're built up structurally. The 'PaymentMatcher' example, while using the terms "matcher" and "pattern" in the variable names, is actually closer to a "final" solution, since they're characterized by how they behave.

This is a bit abstract, but to put it simply, the key difference between what you and I are talking about is the same difference between algebraic datatypes and type classes. Algebraic datatypes are built up with constructors, and then exist structurally. On the other hand, type classes are more ephemeral. When declaring a type class, we say "these N methods are the ways you can 'poke' and 'prod' me to get a response," that is, it's the behavior of the type class that's most important.

So in fact, it's not that algebraic datatypes are more flexible than type classes or vice versa, but rather that they're duals of each other; they codify similar concepts, but approach the problem from two different sides.
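
To make the duality a bit more concrete, here is a small TypeScript sketch, loosely in the spirit of the tagless-final material linked below. Every name in it (`ArithTree`, `ArithAlgebra`, `onePlusTwo`, and so on) is made up for illustration. The initial version is data that functions take apart; the final version is a term written against an interface of behaviors.

```typescript
// Initial encoding: a term is a data structure built from constructors.
type ArithTree =
  | { tag: "lit"; value: number }
  | { tag: "plus"; left: ArithTree; right: ArithTree };

function evalTree(tree: ArithTree): number {
  switch (tree.tag) {
    case "lit":
      return tree.value;
    case "plus":
      return evalTree(tree.left) + evalTree(tree.right);
  }
}

// Final encoding: a term is defined by what an interpreter does with it.
interface ArithAlgebra<T> {
  lit(value: number): T;
  plus(left: T, right: T): T;
}

// The term 1 + 2, written once against any interpreter.
function onePlusTwo<T>(alg: ArithAlgebra<T>): T {
  return alg.plus(alg.lit(1), alg.lit(2));
}

// Two interpreters for the same term.
const evaluator: ArithAlgebra<number> = {
  lit: (value) => value,
  plus: (left, right) => left + right,
};

const printer: ArithAlgebra<string> = {
  lit: (value) => String(value),
  plus: (left, right) => left + " + " + right,
};
```

Here `onePlusTwo(evaluator)` computes a number and `onePlusTwo(printer)` builds a string, without any intermediate tree ever being constructed.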

I find this sort of stuff fascinating! Some of my favorite articles dealing with these sorts of nuances:

- Typed Tagless Final Interpreters[1]

- Practical Foundations for Programming Languages, Chapter 15 "Inductive and Coinductive Types"[2]

[1]: http://okmij.org/ftp/tagless-final/course/lecture.pdf

[2]: http://www.cs.cmu.edu/~rwh/pfpl.html


This would be news for Prolog programmers, I think (or any other logic programming language).

Consider also tree parsers or languages with predicate dispatch (both of which are, functionality-wise, a strict superset of pattern matching).


Well, that was disappointing. I thought we left this kind of over-abstracted code behind with Java. A simple one-level switch on typeof or instanceof really does not deserve to be called "pattern matching". Given that TypeScript supports tagged union types, it would make far more sense to actually use them and stop using instanceof altogether.

If we are going to do this kind of thing, why not at least do it simply:

    switch (typeof x) {
      case "string":
        // ...
    }
or

    switch (x.constructor) {
      case Thing:
        // ...
    }
OK, so TypeScript can't handle the types for this yet, but it's valid JS, so you get the idea.


> If we are going to do this kind of thing, why not at least do it simply:

I expect it's because actual type-switching allows the compiler to type-assert at the same time, so it knows that within the String branch the value is an actual string. Whereas typeof is severely limited (only primitive types), and switching on an arbitrary property tells the compiler pretty much nothing.
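
A small sketch of the kind of narrowing meant here, using a made-up `Temperature` class and `render` function: each check both selects a branch and tells the compiler what the value is inside it.

```typescript
// Hypothetical class, used only to have something for instanceof to test.
class Temperature {
  readonly celsius: number;
  constructor(celsius: number) {
    this.celsius = celsius;
  }
}

function render(input: string | number | Temperature): string {
  if (typeof input === "string") {
    // Narrowed to string: string methods are available.
    return input.toUpperCase();
  }
  if (input instanceof Temperature) {
    // Narrowed to Temperature: the celsius field is available.
    return input.celsius + " C";
  }
  // Only number is left at this point.
  return input.toFixed(1);
}
```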


Yes, that's the current state of things. But support for control-flow typing of my above two examples seems like an easy addition to the TypeScript compiler - if that's what you want. Personally I'd like to see JS support for proper pattern matching, i.e. TC39, then TypeScript can build on that.


> But support for control-flow typing of my above two examples seems like an easy addition to the TypeScript compiler - if that's what you want.

The first one is not useful and the second one is nonsensical (as there is no guarantee that there's any relation between #constructor and the type of the variable)


The original code with the switch statement is by far the easiest to read, as well as being by far the fastest and most concise to write and (though it’s unlikely to matter) by several orders of magnitude the fastest to run as well.

The proper way to pattern match on types in JavaScript (and therefore TypeScript) is to first of all not if you can avoid it, but failing that a simple if-else chain checking `typeof(arg) === "primitiveType"` and `arg instanceof ReferenceType`, with ample use of the `debugger` statement to ensure inputs have the types and map to the branches you’re expecting them to.

TypeScript is great for adding type safety and better self-documenting qualities to a JS codebase. But attempting to preserve the semantics of its “imaginary” extensions to the JS type system (interfaces, generics) at runtime is a recipe for bugs, severe confusion and reams of pointless boilerplate.
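
To illustrate the point about "imaginary" types: an interface leaves nothing behind at runtime, so the only way to branch on it is to inspect the value by hand. The sketch below uses a hypothetical `Invoice` interface and a user-defined type guard, and it is exactly the kind of boilerplate being described.

```typescript
// Hypothetical interface; it is erased when the code is compiled to JS.
interface Invoice {
  id: string;
  total: number;
}

// `value instanceof Invoice` would not compile, because Invoice is only a
// type. A hand-written guard has to check the fields one by one instead.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return typeof candidate["id"] === "string" && typeof candidate["total"] === "number";
}

function totalOrZero(value: unknown): number {
  // Inside the if, the compiler treats `value` as an Invoice.
  if (isInvoice(value)) {
    return value.total;
  }
  return 0;
}
```

Nothing keeps the guard in sync with the interface: if a field is added to `Invoice` later, `isInvoice` still compiles and silently accepts incomplete objects.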


It would be nice to be able to use the annotations directly. We support this in flow-runtime[0], but I think it would be fairly straightforward to support in TypeScript too.

It looks like this:

    import t from 'flow-runtime';

    const makeUpperCase = t.pattern(
      (input: string) => input.toUpperCase(),
      (input: number) => input
    );

    console.log(makeUpperCase("foo bar"));
    console.log(makeUpperCase(123));
    console.log(makeUpperCase(false)); // throws

[0] https://codemix.github.io/flow-runtime/#/docs/pattern-matchi...


Shouldn't the uppercase of a number be a string?

    const makeUpperCase = t.pattern(
      (input: string) => input.toUpperCase(),
      (input: number) => '' + input
    );
Otherwise you get odd behaviours like:

    makeUpperCase('1') + makeUpperCase('1') = '11' (probably what you want)
    makeUpperCase(1) + makeUpperCase(1) = 2 (probably not what you want)


...wait. Why would the uppercase of '1' be '11'? It would just be '1'.


I had to reread it a few times to parse it. It's the result of "adding" the uppercase of 1 to itself that should be 11, not the uppercase itself.

In other words, the result of `makeUpperCase` should be a string, not a number.


I see it now.


Yeah probably, but it's just an example


Combining switch statements with destructuring [1] can be very useful:

    type ASTNode = { tag: 'BinOp', op: string, left: ASTNode, right: ASTNode }
                 | { tag: 'UnaryOp', op: string, arg: ASTNode }
                 | { tag: 'Literal', value: number };

    function evaluate(node: ASTNode): number {
        switch (node.tag) {
            case 'BinOp':   { const { op, left, right } = node;
                              return evalBinary(op, evaluate(left), evaluate(right)); }
            case 'UnaryOp': { const { op, arg } = node;
                              return evalUnary(op, evaluate(arg)); }
            case 'Literal': { const { value } = node;
                              return value; }
        }
    }
[1] https://ponyfoo.com/articles/es6-destructuring-in-depth


That's just sad if you've used languages with actual pattern matching though. It's like C's unsafe unions being back with a furor.


Cool to see a post about this sort of thing! I came up with a similar pattern a while ago (using Flow instead of TypeScript) for a lambda calculus evaluator. I ended up using a code generation approach so I define my datatypes in one file and the `match` method and `Matcher` interface are generated for me.

Here's where I define the five cases for a lambda calculus expression undergoing evaluation:

https://github.com/alangpierce/LambdaCalculusPlayground/blob...

Here are the autogenerated types for those cases (there's other code to make it work at runtime):

https://github.com/alangpierce/LambdaCalculusPlayground/blob...

And here's a simple example of using it:

https://github.com/alangpierce/LambdaCalculusPlayground/blob...


The technique mentioned in the article, using an interface for pseudo pattern matching with an anonymous implementation, is basically the visitor pattern (yes, I know patterns are BS once your language is sufficiently powerful, but let's ignore that).

It is basically how you implement pattern matching for any language that has OOP dispatch but lacks structural typing (but hopefully has anonymous classes).

One way to facilitate the above pattern is to use a code generator. A Java one I have played with is derive4j [1]. I'm not sure if TypeScript has a preprocessor, but that would be one way to make the lack of structural pattern matching easier to deal with.

[1]: https://github.com/derive4j/derive4j


This article seems relevant, https://www.typescriptlang.org/docs/handbook/advanced-types.... . Particularly the part on discriminated unions. I'm just now learning TypeScript, so I have yet to actually implement anything this way.


This looks a bit overdone. For the first example I'd just do it like this:

    const numbers = [,'one','two','three']
    function matchNumber(n: number) { return numbers[n] || n }


steal Haxe's, they even already have code to compile it to JS.

http://haxe.org/manual/lf-pattern-matching.html


I saw a talk on using interfaces to validate API calls/responses with TypeScript. I've been working in dynamically typed languages for a long time, and I've built up habits that mean I rarely run into issues where I am expecting a Number and get a String, or where I can't quickly determine that as the cause, so TypeScript didn't appeal to me previously. Using interfaces to enforce uniform object structures is a game changer, though, and a use for TypeScript that I didn't anticipate.


I actually discovered this in the TypeScript 1.x days (2013 I think) but I didn't even consider it pattern matching. I was probably calling it a type safe switch statement.

As the article points out, the value in doing this shows up when you have a lot of calling code that creates a switch statement, say on a "type". This solution will give compiler errors at all the callers if you add a new type, whereas a switch statement will not; it will just hit the default.


    function calculatePaymentAmount(payment) {
      if(typeof payment.payByCard != "number") throw new Error("payment.payByCard=" + payment.payByCard);
      if(typeof payment.cardFee != "number") throw new Error("payment.cardFee=" + payment.cardFee);
      if(typeof payment.payByCash != "number") throw new Error("payment.payByCash=" + payment.payByCash);
      if(typeof payment.cashDiscount != "number") throw new Error("payment.cashDiscount=" + payment.cashDiscount);

      if(payment.payByCash <= 0 && payment.cashDiscount != 0) throw new Error("payment.payByCash=" + payment.payByCash + " payment.cashDiscount=" + payment.cashDiscount);

      return payment.payByCard + payment.payByCard * payment.cardFee + payment.payByCash - payment.cashDiscount;
    }
When an error is thrown, a core-dump should be generated, logged, and a developer notified, and the app should restart. The compiler can only catch so many bugs.


There's pattern matching for JavaScript already: https://github.com/z-pattern-matching/z . It works pretty well, also matches objects, and is very powerful.


It's amazing that people still design languages without algebraic data types.


You can declare union types in TypeScript:

    type BinaryNode<T> = { tag: 'leaf', value: T }
                       | { tag: 'branch', left: BinaryNode<T>, right: BinaryNode<T> };
Then things like the following will typecheck correctly:

    function traverse<T>(node: BinaryNode<T>): void {
        switch (node.tag) {
            case 'leaf':
                console.log(node.value);
                break;
            case 'branch':
                traverse(node.left);
                traverse(node.right);
                break;
        }
    }
In this example, once the compiler recognizes the tag is 'leaf', it knows there is a value property and will report an error if you try to access the left or right fields, which do not exist. The reverse is true for 'branch'.

Not as nice as Haskell for sure, but the presence of union types is something I've found very handy.

You can also declare tuple types:

    type MinMaxAvg = [number, number, number];
and do pattern deconstruction (though not pattern matching) in function arguments and assignments:

    function test([min, max, avg]: MinMaxAvg): void {
        // ...
    }


The last example is a classic visitor pattern. The earlier ones (using structs of functions) are a variation on this that seems a bit closer to the spirit of pattern matching.


hi, i am the author of the article.

> The last example is a classic visitor pattern

you are absolutely right about the visitor pattern. will add the specific reference to the pattern in a revised version.

thanks for pointing out!



