Hacker News new | comments | ask | show | jobs | submit login
From Imperative to Pure-Functional and Back: Monads vs. Scoped Continuations (paralleluniverse.co)
58 points by pron on Sept 5, 2015 | hide | past | web | favorite | 31 comments

The interesting part of this whole thing to me is that you can get composition between your continuations without using something like Monad Transformers[1].

However, the part that I haven't been able to get past yet is that monads are not closed over composition (so there exist some monads that are no longer monads when you compose them). If it turns out that monads and delimited continuations are duals then it seems that there ought to be some continuations that compose only to yield invalid results. (With the alternative being that scoped continuations always compose which suggests something odd is going on.)

I haven't worked through the math yet because it looks suspiciously dreadful, but if anyone knows why scoped continuation composition is closed I would appreciate your input.

[1] - the most disappointing thing about monads to me.

Interesting, and disappointing in a way. I assumed that all Monadic compositions would be closed analogous to linear algebra operations and vector spaces. Out of curiosity, do you have a reference to read up on Monadic Composition not being closed? Particularly any lighter references (it's a Saturday after all).

Here's a simple counter-example.

The state monad is the following functor

    St s a = s -> (a, s)
if we compose it with itself (using `d` as the second state parameter)

    St s (St d a) = s -> (d -> (a, d), s)
is `F a = St s (St d a)` a monad? The return operation is clear enough

    return a = \s -> (\d -> (a, d), s)
but the join operation fails

    join :: F (F a) -> F a
         :: St s (St d (St s (St d a))) -> St s (St d a)
         :: s -> (d -> (s -> (d -> (a, d), s), d), s)
         -> s -> (d -> (a, d), s)

    join m0 = 
      \s0 -> let (m1 :: St d (F a), s1) = m0 s0
             in (\d0 -> let (m2 :: F a, d1) = m1 d0
                            (m3 :: St d a, s2) = m2 s1
                            (a, d2) = m3 d1
                        in (a, d2)
                , ??? )
The trouble is the ??? bit where we'd like to place the furthest advanced `s` state named `s2`, but it's not in scope to return at the top since we have to apply the d state to reach it.

This however only means that St does not compose with itself as a functor. It composes just fine in another method. If we install the inner monad "wrapped around" the tuple

    StT s m a = s -> m (a, s)      -- T for "transformer"
then the composition of states looks like

    StT s (St d) a = s -> St d (a, s)
                   = s -> d -> ((a, s), d)
                   = (s, d) -> (a, (s, d))
                   = St (s, d) a
which is obviously a monad.

It's an interesting article and something I would like to think deeply about, but I hope the author will forgive me if it isn't immediately obvious to me. Especially the statement that "monads have no place in imperative languages".

For example, I do a lot of Java/Coffee script lately. I don't have access to first class continuations. Don't get me wrong; I'd love to have them. I'm sick as the next person of blowing up in a callback and having my stack completely hosed.

But in Coffee script, monads are nearly trivial to implement. The use of the Maybe monad makes doing repeated transformations and filtering incredibly nicer. It is not obvious to me that a continuation based implementation would be clearer at all.

Perhaps the author is trying to make the case that first class continuations are more important than first class monads for imperative languages. If that's the case, then I tend to agree because, like I said, monads are not difficult to implement even if you don't have explicit support for them. Stack preserving continuations on the other hand... :-P

I think the idea of using composable continuations is a very interesting one and I'd like investigate it further, but unless I'm being very dim-witted (a definite possibility) I don't think the matter is as clear as the author hopes.

Author here. I'm not trying to claim that nested scoped continuations (what the literature calls "delimited continuations with multiple prompts") are objectively easy, although I do think they're a lot easier than monad transformers, especially for imperative programmers. In fact, in the post I say that I'm not sure languages should expose this abstraction directly.

It is merely discussed here to show that the theory behind the imperative style is not different from the one behind the PFP style, aimed for those who believe "PFP is just math, while imperative is 'not math'", or to those who are curious to see (as I was) if the imperative style has something like PFP's monad, namely an elegant abstraction that's at the heart of everything.

In practice, however, most continuations should be used sparingly if at all, except one -- the thread (in particular, the lightweight thread). Lightweight threads make the most annoying problems of monads in an imperative language go away, and are easy for imperative programmers to use.

Thanks for the clarification. That makes some sense to me.

Also, thank you for the article. It has made me think quite a lot. It's one of those times where my subconscious is saying, "You should pay attention to this", but I'm not quite smart enough to understand what I want to do with it. It will be fun to puzzle it out for a while ;-)

You don't need to pay attention to scoped continuations. They're a powerful "general" abstraction (like monads), but in practice all you need to know is just the most common kind of continuations -- the (blocking) thread. You'll rarely need to use other instances of continuations, as imperative languages already has their function built into the language (exceptions and even mutable variables).

I like this because it addresses part of the problem of why I've stopped paying any attention to PLT related topics like denotational semantics, monads, category theory, and higher-order/dependent static type systems in general.

In my day to day work the programming language is never the problem because hacking around the limitations of the language to bridge the gap between the problem domain and the computational domain is never the bottleneck. The real problem is when these systems go into production and all static guarantees and beautiful abstractions become basically useless. What I want is something to help me reason about dynamic and evolving large scale software systems. What I get is PLT researchers and library authors bickering over dependent types, monads, and continuations.

Here's a concrete and unsolved problem. How do you specify some kind of topology for your production environment and then have whatever system understands this specification reason about performance, error recovery, failover, and scaling issues. In fact the lower level components don't even matter because you could write it in Haskell or Python or Java or whatever is your favorite language. Right now there is nothing because everyone is arguing about continuations and monads. Basically everyone involved is thinking too small. The one person that I think comes close to approaching the problem from the right angle is http://www.cl.cam.ac.uk/~srk31/.

Doesn’t an expressive and strong type system, as an example, help you reason about software systems?

Nope, my problem isn't reasoning about the lower level components. In fact you can assume I have correctness proofs for every piece of the infrastructure. What I don't have is the higher-level piece that can reason about the lower level components and do so when new pieces are added, old pieces removed, new guarantees enforced, old ones relaxed, etc. A static type system no matter how advanced doesn't help with any of it.

Yep, that's called the "halting problem", and I hear it's very hard to solve :)

Here's how I generally see it. Writing large software is hard (and a software composed of multiple components is large software, even if you decide using the network stack when calling some functions is a good idea ;)). "Type" people (and I'm certainly for some kinds of static types, I just don't think that they're the solution to everything) would tell you that types help restrict your computation in such a way that reasoning about it is easier. To that I say, but I want to do something specific which it's pretty annoying in PFP, and they answer, just do it a different way, to which I say, if I can do it differently, so can the compiler on my behalf. In short, I don't think that explicit types (i.e. those that can be defined in the language you use) are stronger than implicit types (i.e. anything that a verifier can prove about your program), and at least in theory, they are both "the same" (though, see another comment here on what I think about two things being the same in a mathematical sense). We should use whatever is easier for our minds to grasp.

The more essential problem -- although I don't know if it's more interesting or less interesting -- is what exactly do we want to do in our programs? If we require a Turing complete model, then no matter how many types -- explicit or implicit -- we throw at the problem, we just won't be able to formally prove properties we might care about. Some people, however, claim that we don't really need Turing completeness, and if we don't, that might open a whole new world of possibilities. But once we start restricting the computational model, the representation we use -- say, PFP or imperative -- might become much more important, each being able to compute things that the other cannot (we know that lambda calculus and Turing machines both compute the same functions, but once we restrict each in some arbitrary ways, it gets harder to find equivalences between restrictions in the PFP model and those in the imperative model).

Finally, I think that the notions of what it means "to reason about" or even what it means to be correct (and certainly how important it is to be correct) are themselves vague and require answers from empirical psychological/sociological/economic research than from math.

I think the people that say we don't need Turing completeness say that because their models can't really accommodate proper reasoning about exactly the kinds of systems I want to reason about. Proving theorems is nice and all but at the end of the day the temporal and dynamic aspects of the system I'm building are what I really care about. Not some static snapshot of it but the actual evolving gestalt.

Oh, absolutely, but there are formal methods for working with that, too -- well, maybe just the "evolving" part if not the "gestalt" -- mostly based on temporal logic (I can't believe I'm arguing on the side of formal methods now...) See, e.g. the Esterel language[1] used in the industry (directly or through its descendent languages) to write full, safety critical real-time systems. It is an imperative, non-Turing complete language, that is fully verifiable, and actually able to track an evolving gestalt for some restricted definition of "gestalt". Esterel, BTW, has had much more success than Haskell in the industry.

Another example is TLA+, which is used by Amazon to verify their (Java, I assume) web services[2]. Unlike Esterel, TLA+ is Turing complete, and can therefore fail to prove many things. It will also take longer and longer to run the bigger the code is, and therefore only verify algorithms, not entire systems.

Also, see the history of Statecharts[3], a predecessor to Esterel, that was designed first and foremost to assist humans in writing specifications and only later combined with Pnueli's temporal logic to facilitate formal methods.

BTW, the same David Harel who worked with Pnueli on Statecharts and temporal logic, is now involved with Behavioral Programming[4] a very interesting programming style that makes human specifications easier to translate to code. For a usage example, see Harel et al., Programming Coordinated Behavior in Java[5]

[1]: https://en.wikipedia.org/wiki/Esterel

[2]: http://cacm.acm.org/magazines/2015/4/184701-how-amazon-web-s...

[3]: http://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.H...

[4]: http://www.wisdom.weizmann.ac.il/~bprogram/

[5]: http://www.wisdom.weizmann.ac.il/~amarron/BPJ%20ECOOP%20FINA...

Thanks for the references.

Then, it seems to me, you are not complaining about programming languages, but some higher-level tools. In production, the “static guarantees and beautiful abstractions” you write about do not become useless. They are just not meant to directly solve the larger-grain emergent problems of distributed systems.

That's exactly my point. PLT stuff is a minor nuisance for all the stuff that I deal with day to day. It might be the echo chamber but whether a language has support for monads, higher-order types, etc. is not even in list of top 100 things of my concerns and yet it seems like every other PLT researcher is constantly worried about some category theoretical construct or another and trying to jam it into yet another language.

I still honestly haven't given this enough time to digest—my Java-fu is quite rusty so I'm sort of reading between the lines on what I know about delimited continuations generally.

First I want to state that delimited continuations form a monad and (I believe) that in order to get the Java implementation working you need to implement that monad in the "ambient monad" of Java.

Second, I want to dissect the notion of "monad" as used here into two pieces. First, there is the "PFP Notion" of monad which tends to be represented as a data type and some associated operations. Actually, I mean this even more specifically as being aimed precisely at the standard library monads of Haskell. Secondly, there's the mathematical "monad" which is just a qualifier assigned to any structure which satisfies the monad laws.

Delimited continuations are definitely a mathematical monad. They are represented by one PFP monad, but are generally eschewed as a style there. Why? Because delimited continuations are very "large" and PFP tends to favor the idea of having your data "forbid invalid state" so... large types are disfavored.

So now, to the third point: I think this is all just "implementing a specific overlarge monad in Java and Clojure". It turns out that this is really convenient because overlarge things always are. They are of course more convenient in PFP, too. But this convenience has the price of making code difficult to treat as data, difficult to analyze statically.

All that said, it doesn't mean at all that this isn't a useful technique for implementation of monads in Java/Clojure.

Four, what about composability? (Specifically, here, we mean functor composition.) Turns out that delcont composes nicely compared to the PFP-style monads. The reason why is again because of lack of restrictions on state space. Non-composability of monads is a good thing because it relates to non-commutativity of certain effects.

You can get better composability in three ways: throw out all the non-commutative effects, work with subsets of effects a particular ambient monad (see Purescript's impl of Eff), split the specification of effects from the implementation of effects and leave the decision of ordering up to the implementer (see Haskell's mtl).

If you don't want any of the above, just realize that non-composition is telling you something and forcing you to make choices. One choice is to build a new kind of composition (monad transformer composition) which does work the way we want by being more sensitive to ordering.

Fifth, and finally, all the above is not to say that there's any lack of merit to using delcont for effects. It's a great technique. It may fit the "algorithmic mind" better, too.

I am trying to say, however, that it's not some grand trick that frees the user from thinking about monads. You still are, you just trade one set of implementation concerns for another. But since things are still monads then the same mathematical techniques will translate. Which is a great thing—it means there are more tools to think about, understand, and become confident in the application of this technique.

> I am trying to say, however, that it's not some grand trick that frees the user from thinking about monads.

This brings us back to the discussion of what it means to be "thinking about monads". When a cashier hands me change, is he "thinking about groups"? Is a Rubik's Cube whiz thinking about groups? Sometimes people are better at something by thinking about it in a completely different way. That is the exact meaning of duality: two things are the same, but different circumstances may make looking at it in one way more useful than the other. A "push" API (of which "PFP monads" or the "monadic style" as I call it, is an example) and a "pull" API are also duals, yet I believe that when writing concurrent code in imperative languages, the pull shape is style preferable.

As discussions about cognition might make some PFP people uncomfortable -- we all should just think "in math", right? (I'm kidding. Mostly ;)) -- we can take this back to theory and say that switching between different representations requires computation. In fact, it can be argued that all computation is simply about changing representations and encoding things. 4 is 2 x 2, but going from representing that same thing as "2x2" to "4" is what computation does (and the lambda calculus computes just by substitution -- namely, by "doing nothing" from a mathematical point of view). All this is to say that going from looking at something one way to looking at it another way actually requires some computational power and is not free.

I don't think we're disagreeing at all in principle. I'm mostly trying to point out that your change is a group whether you think about it or not (actually, several). Then, let the best representation win.

Monads aren't a representation, they're a functionality.

Oh, I know we're not disagreeing in principle, but I do think monads are a representation of a functionality -- effects -- in an algebra that does not have impure functions. Once effects are introduced as fundamental constructs, there is no need to be "thinking in monads" anymore.

Imagine if we defined our algebra in such a way that there were no functions -- i.e. nothing that can take parameters and return values -- just sequences of instructions that read and assign global variables (and then define -- outside the calculus -- other effects, such as IO, as changing certain variables). We could then define pure functions in terms of this calculus, rather than doing the opposite and defining change to state in terms of pure functions (i.e. monads). We could then say that a pure function is really a subroutine that performs a transformation on variable addresses which we'll call "variable renaming", and writes its output in some well-known address that the "caller" then renames.

In fact, that calculus exists in the Turing Machine, or in our actual computers, and the work of the Haskell compiler is to transform the "high level" concept of pure functions to the "fundamental" representation of instructions and effects... We could then say that even when working with functions you still have to think in terms of instructions (this is actually true if you care about complexity or its earthly form -- performance).

Of course, there is no objective way to decide which of these representations is more fundamental.. They are just different, requiring a compiler -- or a mind -- to switch from one to the other. It is, however, possible that one of them results in more efficient programs on current hardware than the other, and one of them might be more natural for human cognition than the other. Or the same one could be both ;) but actually I think reality is more interesting than that.

I think both human cognition and compilers might not prefer either but a combination of both. Messy from a mathematical perspective, but that's the nature of computation -- in neurons or silicon. When we need to formally reason about the program, the formal verification tool can then transform the messy representation into one or the other "robust" algebras and do its thing there.

Uh... What do you think monad transformers do?

Very interesting piece. It sounds a lot like how exceptions be used to implement non-error control flow [1]. The one bone I have to pick is the juxtaposition of PFP and imperative as opposites. Certainly it's the case that PFPers can and do implement imperative logic monadically, but it's also the case that imperative programmers often use frameworks that largely treat the business logic, where most of the programming is actually done, as more or less pure functions. Moreover, explicitly hybrid languages like JavaScript, Scala, Rust, Swift, etc. seem to be where a lot of the practical PL excitement is these days. I think the real debate is in the best patterns to use for particular generic problems. That's the core of the blog post of course, I'm just disagreeing with some of the rhetorical framing.

Anyhow, the premise reminds me a lot of What Color Is Your Function [2] and its critique of async programming. I think monads are super useful and intuitive when representing data. Promises/futures blew my mind, once I finally understood them, but I think there are important points about how they can become extremely cumbersome within a large architecture. I have to admit that I'm starting to agree that they may be a dead end, no matter how much syntactic sugar you sprinkle on it.

I think the big issue is that if you can't pull data out of a monad, you can't really encapsulate its logic. For containers, this isn't much of a problem, because the monadic data structure is either inherent to the data itself or you can fold your way down to something more primitive. In those cases, the monadic model seems completely reasonable and very nicely encapsulated to me. I'm not sure I buy the idea that scoped continuations should replace this monad usage.

Connecting back with async programming, I wonder how this jives with the actor model, which has been so widely praised as a model of concurrency that is conceptually simple, easy to reason about, fault tolerant, and imperative. When the author mentioned the thread being the atom of imperative programming, actors immediately came to mind.

Lastly, much like monads, futures and every other abstraction out there, the usefulness of scoped continuations may come down to how easily it can be expressed and comprehended within real programming languages. I found the examples in the original talk and the blog post to both be rather difficult to follow. The former, probably because of the use of = to select the scope and the latter because of the verbosity of Java. It would be interesting to see if there's a syntax that really captures the operation in an intuitive way.

[1] http://stackoverflow.com/questions/2742719/how-do-i-break-ou...

[2] http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y...

> The one bone I have to pick is the juxtaposition of PFP and imperative as opposites.

Well, I most certainly didn't say FP and imperative are opposites; quite the contrary: most FP languages are imperative. Part of the confusion is that there is no definition of FP. PFP, OTOH, is well defined: it is a style of programming where all functions are pure. If you can write impure functions, you're not doing PFP, and if you're doing PFP, you can't write impure functions.

> I'm not sure I buy the idea that scoped continuations should replace this monad usage.

I'm not entirely sure what you mean by "this usage" (lists?), but, like I say in the talk, I think monads are great for parallel streams, where you explicitly don't care about the stack context.

> I wonder how this jives with the actor model... When the author mentioned the thread being the atom of imperative programming, actors immediately came to mind.

Exactly. Threads (and by that I mean any thread implementation, not just kernel threads) are the best use of continuations, and blocking actors like Erlang's (or our own Quasar) make great use of (lightweight) threads, as do Go's goroutines (and Quasar fibers).

> may come down to how easily it can be expressed and comprehended within real programming languages. I found the examples in the original talk and the blog post to both be rather difficult to follow.

My opinion is that it's not important for scoped continuations to be directly exposed at all, and if they are, they should be rarely used directly. Using them to create lightweight threads solves almost all of the problems with the monadic style, and the code looks just like any plain-old thread-blocking code (we've had very good success with that in Quasar; Code stays the same, but instead of blocking a kernel thread, it blocks a fiber; the APIs don't even need to change).

> The former, probably because of the use of = to select the scope

I admit that when I gave the talk the precise details of the API and usage weren't yet clear to me, which is why I wrote the post.

> because of the verbosity of Java

Where does Java's verbosity bother you here? (although I'm probably biased, as I like Java's, or Go's, intentional verbosity)

Fair point on PFP.

> I'm not entirely sure what you mean by "this usage" (lists?) [...]

By "this usage", I meant containers, generally. Maybe algebraic data types would be more specific.

> [...] blocking actors like Erlang's (or our own Quasar) make great use of (lightweight) threads [...]

Have you been following the Reactive Manifesto [1] and reactive streams [2]? If so, do you have any thoughts on whether this it builds out design practice in a complementary way to your suggestion here, or is it divergent?

> Using them to create lightweight threads solves almost all of the problems with the monadic style, [...]

My concern here would be whether you'd then be recreating the old problem you mentioned of having many implicitly monadic constructs in imperative programming but no good way of adding your own. In the analogous case in PFP, even if the continuation monad is the mother of all monads (and I haven't finished reading that post you linked to), I'd have to imagine it wouldn't be a whole lot of fun to use it exclusively. Or else people would do it!

> Where does Java's verbosity bother you here?

I probably should have withheld comment on Java's verbosity. There's probably nothing wrong with it. I think I'm just used to Scala and dynamically typed languages.

Anyway, awesome work. You make a lot of great points.

[1] http://www.reactivemanifesto.org/

[2] http://www.reactive-streams.org/

> Have you been following the Reactive Manifesto [1] and reactive streams [2]? If so, do you have any thoughts on whether this it builds out design practice in a complementary way to your suggestion here, or is it divergent?

Reactive streams' design is the dual to threads/fibers/continuations, and I believe that as a user facing construct it is the wrong one (I'll get to that in a minute), but, as Java does not yet have a standard, built-in implementation of continuations/fibers, Reactive Streams is meant as an interoperability standard and does a very good job at that. I like that it exists and I like the spec, which is why Quasar is also RS compliant (turning every stream to a channel and vice-versa).

I don't like the reactive manifesto, though. It conflates some well-known design practices -- which we simply used to call "robust soft-realtime" -- with a specific implementation (nonblocking), and in the process, overloads a term, "reactive programming", which has a very different meaning (dataflow, i.e. spreadsheet style). Even Erlang, the inspiration to the manifesto, doesn't conform to it (because it discusses implementation). Also, I'm not too fond of manifestos in general.

As to push-based streams, I didn't have time to get into that in the talk (the slide is there, though), but I believe that in imperative languages a pull-based API (like channels) is always superior to a push API for several reasons:

* It is more general -- both are duals, but going from pull to push only requires a while loop. Going the other way requires adding a queue.

* It conveys more information to the developer -- in the push approach, the concurrency is clear: the item is received on the same thread that calls `receive` (the only thing you need is maybe just a tag saying if `receive` itself is thread-safe or not). With a push API, the developer needs to read the documentation to figure out whether the callback can be called on multiple threads concurrently, and, if not, whether it's always called on the same thread.

* It conveys more information to the library -- a pull API has implicit, automatic backpressure. Calling `receive` means we're ready to receive more. With a push API you either don't have backpressure, or, like with reactive streams, you need explicit backpressure (`request`) along with more documentation on when `request` may be called. Look how much of the RS spec is devoted to specifying exactly that, and how much shorter it would be if it were a pull API (like Go or Quasar channels).

> My concern here would be whether you'd then be recreating the old problem you mentioned of having many implicitly monadic constructs in imperative programming but no good way of adding your own.

Well, yeah, but that's a secondary issue. I'm not a language purist, and you always need to weigh which is better -- a very general abstraction that is then put to use by everyone to create their own incompatible DSLs, or choose the most useful specific abstractions and make sure everyone can read everybody's code, even if not everything now has the most elegant expression. For large projects I think the latter is preferable.

It's pretty interesting that the author did not talk about async/await which gives us back the imperative coding style with futures, effectively solving the problem - it has been adopted by many languages like Python, C# and JavaScript and can be implemented with generators. Generators, are the generalization that gives you the expressiveness you need (not fibers) here. Where are they ineffective here?

async/await are a form of stackless coroutines (or stackless delimited continuations although that term is rarely, if ever used). They're a limited form of delimited continuations, and are mostly used as a compromise when implementing full coroutines is hard (or not wanted for some reason). Clojure's core.async's go blocks are also stackless coroutines. Basically, they're delimited continuations that only allow suspension ("shift") at the topmost stack frame of the continuation, and don't allow suspending at deeper stack frames.

I think that most languages don't want the complexity of coroutines, as they are a unfamiliar concept to most programmers. It is a shame that they've never gained popularity as they are very general concept.

A different take on the merits of explicit yielding (whether by async/await or raw generators), and the perils of full coroutines, by Glyph Lefkowitz of Twisted Python fame:


I'd say that Lefkowitz's claims are mostly relevant when your runtime has no multicore support (like Python). If your runtime is single-threaded, explicit yields might (it's mostly a matter of taste), make explicit reasoning about state-change ordering easier. But then you're using shared-state concurrency again, and unsafe shared-state to boot. If your runtime has multicore support -- like Java or .NET -- all this doesn't matter, because those concurrent changes Lefkowitz is talking about may happen any time -- not just at yield points.

The solution, of course, is not to use unsafe-shared-state-concurrency! Either use threads with messages (like Erlang), or threads with transactional shared-state, like Clojure, or both (in fact even Erlang has both actors and ETS, and Clojure has agents and channels as well as atoms and refs).

He does (briefly) mention that kind of variation and why it's a slightly different idea in the original presentation. It's around 28 minutes into the video.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact