However, the part that I haven't been able to get past yet is that monads are not closed over composition (so there exist some monads that are no longer monads when you compose them).
If it turns out that monads and delimited continuations are duals then it seems that there ought to be some continuations that compose only to yield invalid results. (With the alternative being that scoped continuations always compose which suggests something odd is going on.)
I haven't worked through the math yet because it looks suspiciously dreadful, but if anyone knows why scoped continuation composition is closed I would appreciate your input.
 - the most disappointing thing about monads to me.
The state monad is the following functor
St s a = s -> (a, s)
St s (St d a) = s -> (d -> (a, d), s)
return a = \s -> (\d -> (a, d), s)
join :: F (F a) -> F a
:: St s (St d (St s (St d a))) -> St s (St d a)
:: s -> (d -> (s -> (d -> (a, d), s), d), s)
-> s -> (d -> (a, d), s)
join m0 =
\s0 -> let (m1 :: St d (F a), s1) = m0 s0
in (\d0 -> let (m2 :: F a, d1) = m1 d0
(m3 :: St d a, s2) = m2 s1
(a, d2) = m3 d1
in (a, d2)
, ??? )
This however only means that St does not compose with itself as a functor. It composes just fine in another method. If we install the inner monad "wrapped around" the tuple
StT s m a = s -> m (a, s) -- T for "transformer"
StT s (St d) a = s -> St d (a, s)
= s -> d -> ((a, s), d)
= (s, d) -> (a, (s, d))
= St (s, d) a
For example, I do a lot of Java/Coffee script lately. I don't have access to first class continuations. Don't get me wrong; I'd love to have them. I'm sick as the next person of blowing up in a callback and having my stack completely hosed.
But in Coffee script, monads are nearly trivial to implement. The use of the Maybe monad makes doing repeated transformations and filtering incredibly nicer. It is not obvious to me that a continuation based implementation would be clearer at all.
Perhaps the author is trying to make the case that first class continuations are more important than first class monads for imperative languages. If that's the case, then I tend to agree because, like I said, monads are not difficult to implement even if you don't have explicit support for them. Stack preserving continuations on the other hand... :-P
I think the idea of using composable continuations is a very interesting one and I'd like investigate it further, but unless I'm being very dim-witted (a definite possibility) I don't think the matter is as clear as the author hopes.
It is merely discussed here to show that the theory behind the imperative style is not different from the one behind the PFP style, aimed for those who believe "PFP is just math, while imperative is 'not math'", or to those who are curious to see (as I was) if the imperative style has something like PFP's monad, namely an elegant abstraction that's at the heart of everything.
In practice, however, most continuations should be used sparingly if at all, except one -- the thread (in particular, the lightweight thread). Lightweight threads make the most annoying problems of monads in an imperative language go away, and are easy for imperative programmers to use.
Also, thank you for the article. It has made me think quite a lot. It's one of those times where my subconscious is saying, "You should pay attention to this", but I'm not quite smart enough to understand what I want to do with it. It will be fun to puzzle it out for a while ;-)
In my day to day work the programming language is never the problem because hacking around the limitations of the language to bridge the gap between the problem domain and the computational domain is never the bottleneck. The real problem is when these systems go into production and all static guarantees and beautiful abstractions become basically useless. What I want is something to help me reason about dynamic and evolving large scale software systems. What I get is PLT researchers and library authors bickering over dependent types, monads, and continuations.
Here's a concrete and unsolved problem. How do you specify some kind of topology for your production environment and then have whatever system understands this specification reason about performance, error recovery, failover, and scaling issues. In fact the lower level components don't even matter because you could write it in Haskell or Python or Java or whatever is your favorite language. Right now there is nothing because everyone is arguing about continuations and monads. Basically everyone involved is thinking too small. The one person that I think comes close to approaching the problem from the right angle is http://www.cl.cam.ac.uk/~srk31/.
Here's how I generally see it. Writing large software is hard (and a software composed of multiple components is large software, even if you decide using the network stack when calling some functions is a good idea ;)). "Type" people (and I'm certainly for some kinds of static types, I just don't think that they're the solution to everything) would tell you that types help restrict your computation in such a way that reasoning about it is easier. To that I say, but I want to do something specific which it's pretty annoying in PFP, and they answer, just do it a different way, to which I say, if I can do it differently, so can the compiler on my behalf. In short, I don't think that explicit types (i.e. those that can be defined in the language you use) are stronger than implicit types (i.e. anything that a verifier can prove about your program), and at least in theory, they are both "the same" (though, see another comment here on what I think about two things being the same in a mathematical sense). We should use whatever is easier for our minds to grasp.
The more essential problem -- although I don't know if it's more interesting or less interesting -- is what exactly do we want to do in our programs? If we require a Turing complete model, then no matter how many types -- explicit or implicit -- we throw at the problem, we just won't be able to formally prove properties we might care about. Some people, however, claim that we don't really need Turing completeness, and if we don't, that might open a whole new world of possibilities. But once we start restricting the computational model, the representation we use -- say, PFP or imperative -- might become much more important, each being able to compute things that the other cannot (we know that lambda calculus and Turing machines both compute the same functions, but once we restrict each in some arbitrary ways, it gets harder to find equivalences between restrictions in the PFP model and those in the imperative model).
Finally, I think that the notions of what it means "to reason about" or even what it means to be correct (and certainly how important it is to be correct) are themselves vague and require answers from empirical psychological/sociological/economic research than from math.
Another example is TLA+, which is used by Amazon to verify their (Java, I assume) web services. Unlike Esterel, TLA+ is Turing complete, and can therefore fail to prove many things. It will also take longer and longer to run the bigger the code is, and therefore only verify algorithms, not entire systems.
Also, see the history of Statecharts, a predecessor to Esterel, that was designed first and foremost to assist humans in writing specifications and only later combined with Pnueli's temporal logic to facilitate formal methods.
BTW, the same David Harel who worked with Pnueli on Statecharts and temporal logic, is now involved with Behavioral Programming a very interesting programming style that makes human specifications easier to translate to code. For a usage example, see Harel et al., Programming Coordinated Behavior in Java
First I want to state that delimited continuations form a monad and (I believe) that in order to get the Java implementation working you need to implement that monad in the "ambient monad" of Java.
Second, I want to dissect the notion of "monad" as used here into two pieces. First, there is the "PFP Notion" of monad which tends to be represented as a data type and some associated operations. Actually, I mean this even more specifically as being aimed precisely at the standard library monads of Haskell. Secondly, there's the mathematical "monad" which is just a qualifier assigned to any structure which satisfies the monad laws.
Delimited continuations are definitely a mathematical monad. They are represented by one PFP monad, but are generally eschewed as a style there. Why? Because delimited continuations are very "large" and PFP tends to favor the idea of having your data "forbid invalid state" so... large types are disfavored.
So now, to the third point: I think this is all just "implementing a specific overlarge monad in Java and Clojure". It turns out that this is really convenient because overlarge things always are. They are of course more convenient in PFP, too. But this convenience has the price of making code difficult to treat as data, difficult to analyze statically.
All that said, it doesn't mean at all that this isn't a useful technique for implementation of monads in Java/Clojure.
Four, what about composability? (Specifically, here, we mean functor composition.) Turns out that delcont composes nicely compared to the PFP-style monads. The reason why is again because of lack of restrictions on state space. Non-composability of monads is a good thing because it relates to non-commutativity of certain effects.
You can get better composability in three ways: throw out all the non-commutative effects, work with subsets of effects a particular ambient monad (see Purescript's impl of Eff), split the specification of effects from the implementation of effects and leave the decision of ordering up to the implementer (see Haskell's mtl).
If you don't want any of the above, just realize that non-composition is telling you something and forcing you to make choices. One choice is to build a new kind of composition (monad transformer composition) which does work the way we want by being more sensitive to ordering.
Fifth, and finally, all the above is not to say that there's any lack of merit to using delcont for effects. It's a great technique. It may fit the "algorithmic mind" better, too.
I am trying to say, however, that it's not some grand trick that frees the user from thinking about monads. You still are, you just trade one set of implementation concerns for another. But since things are still monads then the same mathematical techniques will translate. Which is a great thing—it means there are more tools to think about, understand, and become confident in the application of this technique.
This brings us back to the discussion of what it means to be "thinking about monads". When a cashier hands me change, is he "thinking about groups"? Is a Rubik's Cube whiz thinking about groups? Sometimes people are better at something by thinking about it in a completely different way. That is the exact meaning of duality: two things are the same, but different circumstances may make looking at it in one way more useful than the other. A "push" API (of which "PFP monads" or the "monadic style" as I call it, is an example) and a "pull" API are also duals, yet I believe that when writing concurrent code in imperative languages, the pull shape is style preferable.
As discussions about cognition might make some PFP people uncomfortable -- we all should just think "in math", right? (I'm kidding. Mostly ;)) -- we can take this back to theory and say that switching between different representations requires computation. In fact, it can be argued that all computation is simply about changing representations and encoding things. 4 is 2 x 2, but going from representing that same thing as "2x2" to "4" is what computation does (and the lambda calculus computes just by substitution -- namely, by "doing nothing" from a mathematical point of view). All this is to say that going from looking at something one way to looking at it another way actually requires some computational power and is not free.
Monads aren't a representation, they're a functionality.
Imagine if we defined our algebra in such a way that there were no functions -- i.e. nothing that can take parameters and return values -- just sequences of instructions that read and assign global variables (and then define -- outside the calculus -- other effects, such as IO, as changing certain variables). We could then define pure functions in terms of this calculus, rather than doing the opposite and defining change to state in terms of pure functions (i.e. monads). We could then say that a pure function is really a subroutine that performs a transformation on variable addresses which we'll call "variable renaming", and writes its output in some well-known address that the "caller" then renames.
In fact, that calculus exists in the Turing Machine, or in our actual computers, and the work of the Haskell compiler is to transform the "high level" concept of pure functions to the "fundamental" representation of instructions and effects... We could then say that even when working with functions you still
have to think in terms of instructions (this is actually true if you care about complexity or its earthly form -- performance).
Of course, there is no objective way to decide which of these representations is more fundamental.. They are just different, requiring a compiler -- or a mind -- to switch from one to the other. It is, however, possible that one of them results in more efficient programs on current hardware than the other, and one of them might be more natural for human cognition than the other. Or the same one could be both ;) but actually I think reality is more interesting than that.
I think both human cognition and compilers might not prefer either but a combination of both. Messy from a mathematical perspective, but that's the nature of computation -- in neurons or silicon. When we need to formally reason about the program, the formal verification tool can then transform the messy representation into one or the other "robust" algebras and do its thing there.
Anyhow, the premise reminds me a lot of What Color Is Your Function  and its critique of async programming. I think monads are super useful and intuitive when representing data. Promises/futures blew my mind, once I finally understood them, but I think there are important points about how they can become extremely cumbersome within a large architecture. I have to admit that I'm starting to agree that they may be a dead end, no matter how much syntactic sugar you sprinkle on it.
I think the big issue is that if you can't pull data out of a monad, you can't really encapsulate its logic. For containers, this isn't much of a problem, because the monadic data structure is either inherent to the data itself or you can fold your way down to something more primitive. In those cases, the monadic model seems completely reasonable and very nicely encapsulated to me. I'm not sure I buy the idea that scoped continuations should replace this monad usage.
Connecting back with async programming, I wonder how this jives with the actor model, which has been so widely praised as a model of concurrency that is conceptually simple, easy to reason about, fault tolerant, and imperative. When the author mentioned the thread being the atom of imperative programming, actors immediately came to mind.
Lastly, much like monads, futures and every other abstraction out there, the usefulness of scoped continuations may come down to how easily it can be expressed and comprehended within real programming languages. I found the examples in the original talk and the blog post to both be rather difficult to follow. The former, probably because of the use of = to select the scope and the latter because of the verbosity of Java. It would be interesting to see if there's a syntax that really captures the operation in an intuitive way.
Well, I most certainly didn't say FP and imperative are opposites; quite the contrary: most FP languages are imperative. Part of the confusion is that there is no definition of FP. PFP, OTOH, is well defined: it is a style of programming where all functions are pure. If you can write impure functions, you're not doing PFP, and if you're doing PFP, you can't write impure functions.
> I'm not sure I buy the idea that scoped continuations should replace this monad usage.
I'm not entirely sure what you mean by "this usage" (lists?), but, like I say in the talk, I think monads are great for parallel streams, where you explicitly don't care about the stack context.
> I wonder how this jives with the actor model... When the author mentioned the thread being the atom of imperative programming, actors immediately came to mind.
Exactly. Threads (and by that I mean any thread implementation, not just kernel threads) are the best use of continuations, and blocking actors like Erlang's (or our own Quasar) make great use of (lightweight) threads, as do Go's goroutines (and Quasar fibers).
> may come down to how easily it can be expressed and comprehended within real programming languages. I found the examples in the original talk and the blog post to both be rather difficult to follow.
My opinion is that it's not important for scoped continuations to be directly exposed at all, and if they are, they should be rarely used directly. Using them to create lightweight threads solves almost all of the problems with the monadic style, and the code looks just like any plain-old thread-blocking code (we've had very good success with that in Quasar; Code stays the same, but instead of blocking a kernel thread, it blocks a fiber; the APIs don't even need to change).
> The former, probably because of the use of = to select the scope
I admit that when I gave the talk the precise details of the API and usage weren't yet clear to me, which is why I wrote the post.
> because of the verbosity of Java
Where does Java's verbosity bother you here? (although I'm probably biased, as I like Java's, or Go's, intentional verbosity)
> I'm not entirely sure what you mean by "this usage" (lists?) [...]
By "this usage", I meant containers, generally. Maybe algebraic data types would be more specific.
> [...] blocking actors like Erlang's (or our own Quasar) make great use of (lightweight) threads [...]
Have you been following the Reactive Manifesto  and reactive streams ? If so, do you have any thoughts on whether this it builds out design practice in a complementary way to your suggestion here, or is it divergent?
> Using them to create lightweight threads solves almost all of the problems with the monadic style, [...]
My concern here would be whether you'd then be recreating the old problem you mentioned of having many implicitly monadic constructs in imperative programming but no good way of adding your own. In the analogous case in PFP, even if the continuation monad is the mother of all monads (and I haven't finished reading that post you linked to), I'd have to imagine it wouldn't be a whole lot of fun to use it exclusively. Or else people would do it!
> Where does Java's verbosity bother you here?
I probably should have withheld comment on Java's verbosity. There's probably nothing wrong with it. I think I'm just used to Scala and dynamically typed languages.
Anyway, awesome work. You make a lot of great points.
Reactive streams' design is the dual to threads/fibers/continuations, and I believe that as a user facing construct it is the wrong one (I'll get to that in a minute), but, as Java does not yet have a standard, built-in implementation of continuations/fibers, Reactive Streams is meant as an interoperability standard and does a very good job at that. I like that it exists and I like the spec, which is why Quasar is also RS compliant (turning every stream to a channel and vice-versa).
I don't like the reactive manifesto, though. It conflates some well-known design practices -- which we simply used to call "robust soft-realtime" -- with a specific implementation (nonblocking), and in the process, overloads a term, "reactive programming", which has a very different meaning (dataflow, i.e. spreadsheet style). Even Erlang, the inspiration to the manifesto, doesn't conform to it (because it discusses implementation). Also, I'm not too fond of manifestos in general.
As to push-based streams, I didn't have time to get into that in the talk (the slide is there, though), but I believe that in imperative languages a pull-based API (like channels) is always superior to a push API for several reasons:
* It is more general -- both are duals, but going from pull to push only requires a while loop. Going the other way requires adding a queue.
* It conveys more information to the developer -- in the push approach, the concurrency is clear: the item is received on the same thread that calls `receive` (the only thing you need is maybe just a tag saying if `receive` itself is thread-safe or not). With a push API, the developer needs to read the documentation to figure out whether the callback can be called on multiple threads concurrently, and, if not, whether it's always called on the same thread.
* It conveys more information to the library -- a pull API has implicit, automatic backpressure. Calling `receive` means we're ready to receive more. With a push API you either don't have backpressure, or, like with reactive streams, you need explicit backpressure (`request`) along with more documentation on when `request` may be called. Look how much of the RS spec is devoted to specifying exactly that, and how much shorter it would be if it were a pull API (like Go or Quasar channels).
> My concern here would be whether you'd then be recreating the old problem you mentioned of having many implicitly monadic constructs in imperative programming but no good way of adding your own.
Well, yeah, but that's a secondary issue. I'm not a language purist, and you always need to weigh which is better -- a very general abstraction that is then put to use by everyone to create their own incompatible DSLs, or choose the most useful specific abstractions and make sure everyone can read everybody's code, even if not everything now has the most elegant expression. For large projects I think the latter is preferable.
The solution, of course, is not to use unsafe-shared-state-concurrency! Either use threads with messages (like Erlang), or threads with transactional shared-state, like Clojure, or both (in fact even Erlang has both actors and ETS, and Clojure has agents and channels as well as atoms and refs).