Hoping for a little advice: I'm trying to push for some development culture changes at work, one of which is doing more local, incremental refactors rather than large, all-at-once changes (i.e. the sort of thing for which one books 'architecture meetings'). Since we'll hopefully be massaging large parts of the codebase, there's an opportunity to jump into something new with both feet.
Question: what do you think of the thesis of pure4j [1] vs. Frege [2]? Namely, are the advantages gained by switching to a Haskell-like language far above those attained by simply forcing effects into a small part of the codebase?
To expand on this suggestion: Scala isn't big on purity (AFAIK the language has no concept of pure functions out of the box) but it is big on immutable data types, and mostly-pure functions arise naturally when you're working with immutable types.
Immutable types happen to be something that is familiar and comfortable to Java developers. I would therefore expect to meet a lot less resistance to "syntactic sugar and immutable classes" than "functional programming with pure functions" even if the end result is 80 percent the same.
This was the compromise our team made. Our team was having a bad time dealing with Spring. Our lead dev is a functional buff who really likes Haskell, and got us all hyped about trying something new. We picked Scala in lieu of Haskell because the learning curve with Haskell was a much harder sell. We avoid mutable data types unless there's a compelling reason (interop with Java libraries, for example), and most of our functions are implicitly pure. Scalaz might have some purity constraint, idk.
5 months in with Scala and it has been a wonderful experience. Now I just have to convince our front end dev to switch over to Scala.js...
We tried the same thing, but it did not go well at all. We picked Scala because people should be able to learn it more easily and gradually. But we found that as people learned FP using Scala, they realized how much Scala held them back. We ended up redoing a PHP project in Scala, then, when it was 80% done, redoing it again in Haskell, and from there on out using Haskell for everything.
Do you have any specific examples of how scala would hold y'all back in relation to haskell? I actually have trouble imagining how one could justify, in a business, scrapping an 80% complete rewrite to rewrite it again in a different language. I wonder what would be so compelling.
Scala is an OO language with FP tacked on, so it basically defaults to "make everything a pain". It was easy to justify rewriting it again because it only took two weeks to do it. A re-write is much faster than initially writing it, you're just copying yourself.
I wonder why Haskell people feel so hurt about Scala...
It's much closer to Haskell's capabilities than pretty much any other functional language.
It has better type classes, the compiler just got type inference for partially applied type constructors, it supports programming with dependent types and implicit parameters allow you to let the compiler prove almost arbitrary things.
If your definition of FP means "purity", you pretty much discarded all languages traditionally considered to be functional.
There is nothing wrong with liking Haskell more than Scala, but I think your claims are not very well-founded.
I wonder why you assume I am a haskell person, why you assume I am "so hurt about scala", and mostly why even if those two absurd assumptions were true, that you could generalize that to "haskell people".
I am obviously not a haskell person, or I would have just used haskell rather than trying scala first. I am not hurt about scala, it simply was much less productive than haskell.
Scala is not remotely the closest to haskell. Not even the closest JVM language.
And my "claims" are my personal experience, how can they not be "very well founded". You seem to be putting some emotional context into this that has nothing to do with me.
Haskell and Scala win and lose on different criteria, but for type classes I find Haskell's syntax and semantics infinitely more elegant, terser and more flexible than Scala's. Defining type classes in Scala is a cacophony of syntactic boilerplate riddled with implicits and brackets. Haskell is much more elegant in that area.
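For what it's worth, a minimal sketch of the Haskell side (the class, instance and constrained function are all invented for illustration):

    -- A toy type class, one instance, and a function constrained by it.
    -- Instance selection is left entirely to the compiler.
    class Pretty a where
      pretty :: a -> String

    instance Pretty Bool where
      pretty True  = "yes"
      pretty False = "no"

    greet :: Pretty a => a -> String
    greet x = "value: " ++ pretty x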
Scala's typeclasses are first-class values and they actually work as advertised.
Haskell's type classes are a mess because Haskell-the-spec (anti-modular, coherent) and Haskell-the-language (incoherent, "modular") work completely differently, nobody cares, and there is no real fix because both options suck as soon as your project consists of more than one module.
It's the reason why no language after Haskell does type classes the Haskell way, and all of them do it similar to Scala's approach.
Could Scala have more syntax sugar? Sure! But people are more focused on getting them right first, before adding additional syntax.
What on earth are you talking about? Tons of other languages do type classes "the haskell way". Which is also known as "type classes". Scala doesn't have type classes, it has a close approximation.
So basically what you are saying is "Haskell has Haskell's typeclasses!".
That's a yuuuuge amount of languages that follow "Haskell"'s design! (Maybe you should start counting Haskell-to-X compilers separately to pad the number "1" a bit?)
I have already mentioned three languages that haven't followed Haskell's broken mess, while you have failed to produce even a single bit of information, while constantly crying foul at everything you don't like?
That's just amazingly hypocritical, but I guess the tactic of "blame other people for your own behaviour to make it look like tu-quoque if they respond" hasn't gotten old for you yet?
I did not read your responses to other people, since you are obviously trolling. Your three examples are nonsense, two are haskell style, and one doesn't have type classes at all (ocaml).
Ok, so if I understand you correctly: you claim that your failure to contribute anything of substance should be blamed on your inability to read and your lack of any knowledge of the topic.
The spec mandates that type class instances can only be defined together with the type class or with the corresponding data type.
While it's a very poor idea as no person except the author of the type class or the author of the data type can define a type class instance, it at least guarantees the global uniqueness and coherence of type class instances. ("There can only ever be a single typeclass instance for typeclass A and data type B.")
GHC (the only implementation that kind of matters today) takes a YOLO approach: it allows as many typeclass instances as you want and doesn't bother enforcing the rules laid out in the spec.
Additionally, as type class instances are not first-class values of the language a developer doesn't even have the option to make sure he is getting the instance he expects.
There exists a GHC warning for "orphan type class instances", but as the warning triggers on everything, not only the relevant cases where multiple instances clash and break the program, it's about as useful as saying "we made sure that C's goto-fail bug can never happen again by emitting a warning for every suspicious indentation – you only have to sift through 1732 warnings until you find the dangerous one!".
Your C comparison at the end is disingenuous. It is entirely possible to use the orphan instances warning to ensure coherence in an application that contains orphan instances without a single spurious warning. Simply make a habit of putting all orphan instances in a single module, and mark only that one module -fno-warn-orphans. The rest of your code can be organized appropriately. It is not ideal, but not painful.
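To make that habit concrete, here's a rough sketch (module, class and type names are all invented here, not from any real library):

    {-# OPTIONS_GHC -fno-warn-orphans #-}
    -- The one module in the application allowed to hold orphan instances;
    -- everything else builds warning-free.
    module App.Orphans () where

    import SomeLib.Types  (Thing)             -- hypothetical: defines the type
    import OtherLib.Class (Describable (..))  -- hypothetical: defines the class

    -- An orphan: neither Describable nor Thing is defined in this module.
    instance Describable Thing where
      describe _ = "a thing"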
It's true that you can't safely define orphan instances in a library, but that also should involve no "sifting through" warnings for valid uses. It is genuinely bad practice to put orphan instances in a library, even if we had global checking - it would produce cases where you simply cannot use otherwise compatible libraries in the same project.
Your idea of centralizing things into a module assumes you have control over all code ever written.
This might be true for the 90% of Haskell that is created and abandoned by PhDs writing their theses, but if you look at other language communities: People like to use a library from time to time.
You're being deliberately obnoxious in your phrasing. Cut it out.
Regarding your content:
You are making (or missing) a point I made above, and moving goal posts.
If you are writing a library that defines a new type and would like to provide an instance of some classes, there is no problem, those are not orphans.
If you are writing a library that defines a new class and would like to provide an instance for some types, there is no problem, those are not orphans.
If you are writing an application, the approach I suggested for the context of working on an application is entirely appropriate - you write orphan instances, and have a static check that they are coherent.
There is a bit more trouble if I am a library wanting to provide an instance of a class that I do not define, for a type that I do not define. Allowing orphan instances in this case, with or without a static check for coherence, assumes consumers of my library "have control over all code ever written" - it will either be dangerous (without a check) or forbidden (with a check) that they use my library with (potentially seemingly unrelated) libraries. The correct workaround is usually to provide a newtype, for which you can define non-orphan instances. This is not bad, but it certainly is a bit of a tradeoff.
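A sketch of that newtype workaround, reusing the same invented names as above:

    -- In a library: we define neither the class nor the type, so instead of
    -- exporting an orphan instance we export a wrapper with a non-orphan one.
    module MyLib.DescribeThing (DescribeThing (..)) where

    import SomeLib.Types  (Thing)             -- hypothetical upstream type
    import OtherLib.Class (Describable (..))  -- hypothetical upstream class

    newtype DescribeThing = DescribeThing Thing

    -- Not an orphan: DescribeThing is defined right here.
    instance Describable DescribeThing where
      describe (DescribeThing _) = "a thing"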
But again, you moved the goal posts. I made no general claim about the suitability of Haskell's expectation of coherence. I objected to your false claim that maintaining coherence involves picking over spurious warnings about orphan instances. It does not.
On the latter front, that JVM work is a Summer of Code student project. So while it's definitely exciting work, there's a fair bit of distance between it working and it being a tier-one backend for GHC.
I've been Haskelling in a large megacorp environment for the past 1.5 years, with lots of smaller freelance work before that.
There are a lot of different value props one can make for different tools. One pragmatic case I make is that in Haskell land it's super easy and rewarding to engage with upstream to resolve any problems I hit. One other angle that the IMVU article, and I think a lot of the comments, overlook is that using Haskell as a shared concrete specification / engineering design substrate for communicating actually makes it much, much easier to collaborate with colleagues on pretty crazy projects, even if the deployed system isn't ultimately written in Haskell.
That said, with the right colleagues, having the freedom / trust to choose tools that help you best deliver is always the ideal. :)
Personally, I think you'll have more success with something more gradual than a complete syntactic overhaul. I've never used pure4j or frege but it sounds like pure4j is something you could try out incrementally, which could be quite immediately useful for your team.
I can't speak for pure4j, but I would be very wary about suggesting Frege at my workplace to (what I'm assuming are) Java programmers.
I feel like some Java programmers would have almost as much culture shock with something like Haskell as some of the php programmers at IMVU apparently did.
Fair point, but it might almost be worth suggesting it just to see what the opposition is like -- if there isn't much, or any, go with Frege; otherwise, fall back to pure4j. I'd imagine framing the decision as picking from two choices would improve the odds of at least one being selected.
That definitely made the short list! I picked up ocaml for a side project and enjoyed the language quite a bit. However, ocamljava is either dead or dormant at this point (last commit to master in June of last year) and it seemed to only have had one major contributor -- it would seem irresponsible to adopt such a solution in light of that.
Aah, fair point, it is pretty niche. Still, merely going off of their comparative contribution graphs [1] [2] and issue trackers [3] [4], it seems like Frege has a fair bit more throughput: for better or for worse, it seems like more _things_ are happening. Although I'll be the first to admit it's a misleading metric for determining the maturity of a project, it's still something. Imagine finding a bug in Frege: it seems like they merge even small patches from contributors, whereas ocamljava is essentially a one man show. Adopting ocamljava would probably mean forking and maintaining it too, whereas there's a possibility of just contributing to Frege.
That's all just my opinion, of course, and I hardly have enough industry experience to back up its validity! 'Just sanity checking my reasoning :)
Yes, but I have ... reservations. Same as with OCaml, I took on a small project with Scala and enjoyed the language, but only after carving out a subset of it that was palatable. Although it fits the bill, I'd be worried that enforcing conventions would become a large burden.
Unlike Frege where I (might? Frege's the only language of the three that I haven't investigated thoroughly) can point learners to essentially any Haskell tutorial and have them vaguely do 'the right thing', there exists a subset of Scala developers who use the language like Java with type inference. With Frege I could, for example, merely enforce the minimal usage of IO or State, but doing so with Scala is more complicated (maybe force the usage of cats/scalaz and their equivalent monads? It's still very easy to not use them, however). If there were an equivalent to F# on the JVM I'd settle on it in a heartbeat, but the dearth of well supported, pure functional languages makes this a tough choice. All in all, I'll probably go with pure4j, but it would be nice to stop writing Java at some point in my career.
Don't underestimate Haskell style variability--there are well-known 'schools' of Haskell writing style ... look up the Glasgow style, the Utrecht style etc. Frege, being on the JVM, adds its own twists on top of the standard Haskell syntax, from having to interop with Java classes. Essentially, it treats most Java classes as being in the IO monad because of their mutability and unbounded effects.
Yes, people can write Scala in 'Java style' or 'Haskell style' (not exactly), but I've come to think of that as a strength. Your colleagues can always start out near their comfort zone of imperative Java, and you can start out at yours, and you can slowly refactor your way towards each other. You can give Scala design talks to familiarise the team with best practices. With Haskell/Frege, you don't have that luxury. Everyone jumps in at the deep end.
About enforcing monad use: yes, it's possible to do that in Scala, but you're right that it has to be under the 'honour system'. About F#, it's not a pure functional language either. It's about as pure as OCaml and Scala.
As for pure4j, the reason I don't recommend it is that the project itself makes very clear that it's all at an experimental stage. So I have no idea which direction it's going.
Not relevant to this particular blog post, but I just wanted to say thank you: one of your old projects, Sphere, had a big influence on me as a young programmer (in my early teens), and is a big part of why I got into programming. SCons is also awesome. I'm glad to see you are still doing cool stuff :)
I toyed around with Sphere a bit, but RPG 6.5 was where it was at for me. I spent weeks poring over that code, and it was my first encounter with C++. I learned so much, especially from the implementation of the toolkit (tile editor, map editor, etc.). I never was able to compile it myself, though, as I didn't have access to a proper Borland C++ compiler for DOS, but that didn't matter so much as long as I could read the code and see the end result.
I'm just learning Haskell through a university course. It's quite traditional in the author's nomenclature, with extensive discussion of laziness, currying and point-free notation.
But it introduces monads very similarly to how he did it: starting with Arrays, drawing parallels to Maybes, then introducing Control.Monad.State through an assignment before moving on to IO. This has helped me a lot in grasping the concept behind monads.
It makes me sad how people think imperative programming is a "natural" way to think. I have known lots of people who resist using functional languages for serious projects, marginalizing the "functional" way of thinking as though thinking in terms of "labeled boxes with data in them" (Von Neumann architecture) is some sort of more normal way to think.
Before computers people modeled the real world using a Haskell-like pseudocode called "mathematics".
I find both mental models useful. For example, "accumulate all of the monoids in this collection" (https://hackage.haskell.org/package/base-4.9.0.0/docs/Data-F...) makes total sense as a functional operation. You have a collection and some pure operations you want to apply to combine it all into one thing.
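For instance, something along these lines (toy values, using Data.Foldable's fold plus foldMap):

    import Data.Foldable (fold)
    import Data.Monoid (Sum (..))

    -- "Accumulate all of the monoids in this collection":
    total :: Sum Int
    total = fold [Sum 1, Sum 2, Sum 3]                   -- Sum {getSum = 6}

    -- Or map each element into a monoid first, then accumulate:
    totalLength :: Sum Int
    totalLength = foldMap (Sum . length) ["ab", "cde"]   -- Sum {getSum = 5}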
But then, sometimes you just want to iterate across a list of directories, read some files, and put some results in lists. Tail-recursion here feels dumb. Like, dammit, I'm the human and you're the computer, I know you can turn a sequence of instructions into a tail recursive loop for me, so don't make me think.
Haskell, due to its effectful / monadic vs. pure syntax distinction isn't terribly great here. Ideally I'd write all of my code in an "imperative style" all the time, but a row-typed effect inference system would determine which functions were pure, or in IO, or async, or in ST, or threw exceptions, etc. Koka [1] is an interesting language that explores some of these ideas.
Some other projects that are doing cool things in this space are Idris [1] and F-Star [2] (the latter is also a Microsoft Research project). I am particularly impressed by Idris's algebraic effect approach, since it gives you a general effect system that combines the easy extensibility that monad transformer approaches provide with the ease of use and inference of F*'s effect system.
I think the theory behind algebraic effects is a key step in making it easy to develop code with more stringent and customized safety properties than is available in languages like Rust.
> Ideally I'd write all of my code in an "imperative style" all the time, but a row-typed effect inference system would determine which functions were pure, or in IO, or async, or in ST, or threw exceptions
Yes, this would be a really nice approach (and it's not one that Haskell lovers would be upset by, BTW).
I would be upset if it meant no visible distinction between effectful and non-effectful calls. When refactoring it's really important to know when the order of calls is significant vs when it's incidental.
Agreed. I just recently took some recursive F# code from a photobooth (display countdown, async sleep, take photo, download from camera, recur with new byte[] list) -> reverse list, and rewrote it as an imperative loop appending to a normal List<byte[]>, and the code is much more intuitive.
Some things are more intuitive as a list of things to do and changes to make.
Yes they are, but I think "for loops" and "mutable lists" à la Java's ArrayList and Vector and C#'s List<T> don't exist even in IO (I'm not a Haskell expert). That kind of thing has to be done with recursion. They exist in F# but are not idiomatic. However, sometimes they're more intuitive (to me at least).
forM is a for each loop over anything iterable (Traversable in Haskell) that will execute a given monadic action on each element in the collection. A more traditional for loop is this over a range of integers or whatever else you feel like.
forM_ [0 .. 4] print
Mutable collections exist in various forms depending on what kind of mutability you want. For example, there's transactional mutability that gives you mutable data structures inside an STM but forbids IO. And there's IO mutability that removes all the bounds but restricts you to the IO Monad. TVars, MVars, and the various kinds of Refs are what you're looking for here.
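A rough sketch of the IO-mutability flavour (IORef here; the MVar/TVar variants look similar):

    import Control.Monad (forM_)
    import Data.IORef (newIORef, modifyIORef', readIORef)

    -- Roughly the moral equivalent of appending to an ArrayList/List<T> in a loop.
    collect :: IO [Int]
    collect = do
      frames <- newIORef []
      forM_ [1 .. 5] $ \i ->
        modifyIORef' frames (i :)      -- prepend, O(1)
      reverse <$> readIORef frames     -- restore insertion order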
I suspect he didn't keep the size of his STM atomically blocks small. The docs and associated papers and book materials all make it very, very clear that you want to keep STM transactions small and try to defer computation into thunks that are only evaluated when the transaction succeeds.
Compare the cost of optimally, expertly built STM-using code to, say, just assigning some byte values in a *char array. Instruction count? Cache locality? Allocations?!?
EDIT: Oh, and isn't STM typically just a wrapper over immutable stuff anyways? So he said "immutable copying is too slow for me", and you said "here, use this instead, it has all the copy overhead plus some additional overhead".. that's a kick in the teeth.
> Compare the cost of optimally, expertly built STM-using code to, say, just assigning some byte values in a *char array
This is comparing the wrong tasks. Apples and oranges.
STM is usable for this particular task, but will almost assuredly be far slower. If performing the exact same task in Haskell, you could use IO and mutable memory for comparison.
Where STM really shines, however, is in highly-concurrent environments. There's no great "C comparison" for hundreds of concurrent operations locklessly reading/writing to/from the same memory. STM makes this safe and performant without a change in semantics or additional mental overhead from the "non-concurrent" case.
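A small sketch of what that looks like (assumes the stm package; the thread and iteration counts are arbitrary):

    import Control.Concurrent (forkIO)
    import Control.Concurrent.STM
    import Control.Monad (forM_, replicateM_)

    main :: IO ()
    main = do
      counter <- newTVarIO (0 :: Int)
      done    <- newTVarIO (0 :: Int)
      -- 100 threads each bump the shared counter 1000 times; every bump is a
      -- transaction that commits atomically or retries, with no explicit locks.
      forM_ [1 .. 100 :: Int] $ \_ -> forkIO $ do
        replicateM_ 1000 $ atomically (modifyTVar' counter (+ 1))
        atomically (modifyTVar' done (+ 1))
      -- Block until all workers have finished, also via STM (retry until the
      -- condition holds).
      atomically $ readTVar done >>= \d -> check (d == 100)
      readTVarIO counter >>= print     -- 100000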
I'd limit your case to "highly-concurrent environments where you want lots of shared, mutable state". There are always cases for some shared or global thing that you're transacting with, but I think 99% of code in highly-concurrent environments should be written with message-passing and limited scope at any point in the program.
... STM in GHC is definitely mutable. Before commit it has an op-log representation, but it's most assuredly doing very imperative things like fine-grained locking and writes to the same reference :)
Though maybe I stand corrected. I revisited that. Here's what I had initially:
(include the following stubs to compile):
let upload _ = async.Return()
let takePhoto i = async.Return [|0uy|]
// i stands for "photo i of N", a required parameter for our workflow
---
let runSequence n =
    async {
        let rec runSequenceRec (framesSoFar: byte[] list) =
            async {
                if framesSoFar.Length = n then
                    return List.rev framesSoFar
                else
                    let! bytes = takePhoto framesSoFar.Length
                    do! upload bytes
                    return! runSequenceRec (bytes::framesSoFar)
            }
        return! runSequenceRec []
    }
Nasty dual-level async calls, list reversal, recursion, if/else, an initializing []. A lot of visual junk that takes away from what the code is actually doing. I rewrote it imperatively to this:
let runSequence' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i = 0 to n - 1 do
            let! bytes = takePhoto i
            do! upload bytes
            frames.Add bytes
        return frames
    }
Much cleaner, you can see exactly what's going on. I was fine with that, and that's where I left it until now.
However reading some of the Haskell comments, I thought there might be something better. So I tried replicating Haskell's forM:
let forNAsync f n =
    let rec runOnce (soFar: _ list) =
        async {
            if soFar.Length = n then
                return List.rev soFar
            else
                let! result = f soFar.Length
                return! runOnce (result::soFar)
        }
    async.ReturnFrom(runOnce [])
Okay, not pretty but it's a library function so we should never have to look at it again. With that in hand though, we can define:
let runOneShot i =
    async {
        let! bytes = takePhoto i
        do! upload bytes
        return bytes
    }

let runSequence'' = forNAsync runOneShot
This is terrific. Shorter and cleaner than the imperative method, preserves immutability throughout, and composes easily.
If you were to try to compose `runOneShot` from the imperative method, you'd end up with this, because of your frames variable:
let runSequence''' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i = 0 to n - 1 do
            let! bytes = runOneShot i
            frames.Add bytes
        return frames
    }
Or if you wanted to try to rework the `runOneShot` to include the frames variable, it wouldn't be much easier:
let runOneShot'''' i (frames: System.Collections.Generic.List<byte[]>) =
    async {
        let! bytes = takePhoto i
        do! upload bytes
        frames.Add bytes
    }

let runSequence'''' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i = 0 to n - 1 do
            do! runOneShot'''' i frames
        return frames
    }
So, maybe things that seem imperative do work better functionally. You've just got to create some combinators that let you hook them up.
Additionally, in languages with good, convenient support for functional programming like Haskell and Scala, I find that I tend to write lots of short functions, often making use of higher-order functions to decompose code into really simple, and potentially reusable, pieces. And like you say, this really helps to show intent. The same sort of thing, in fact, that object-oriented programmers try to do with design patterns such as the Strategy pattern - but with fewer lines of code.
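For example, the whole "Strategy" ends up being nothing more than a function argument (a toy pricing example, names invented):

    -- The "strategy" is just a higher-order function parameter; no interface,
    -- no class hierarchy.
    priceWith :: (Double -> Double) -> [Double] -> Double
    priceWith discount = sum . map discount

    regular, holiday :: Double -> Double
    regular = id
    holiday = (* 0.9)

    -- priceWith regular [10, 20] == 30.0
    -- priceWith holiday [10, 20] == 27.0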
I'm not sure I agree. Functional Reactive Programming is a declarative way of doing user interaction, but it's still a computer science research topic (Elm used to claim to be FRP, but has now decided to move away from its old allegedly FRP-like style). I need to play around with the latest FRP research libraries some time!
In addition there is also machines/pipes/conduits, which all mix imperative I/O with declarative style code in an interesting and new way.
FRP is awesome and is an example of how something previously thought of as an inherently imperative problem can be done much more easily in a declarative way; we completely agree on this!
What I was talking about is: almost all lower-level abstractions are designed with the view of computer as a state machine. If we are solving a problem at an abstraction low enough to make us deal with this analogy, it is easier not to have a declarative layer in between.
machines/pipes/conduits are all very interesting and I was going to mention them too, but I still don't think they solve this problem in the way OP was referring to.
Before computers, far more people did bookkeeping or followed recipes or read sheet music than did abstract mathematics. I think one can reasonably say that lists of instructions that change mutable variables/objects is one of the most natural ways people can operate. Anyone learning a task essentially does this in the real world. It may indeed be terrible when done at a large scale. But there are a lot of people who can naturally understand these.
Immutable data may be a great idea. But is it the first that occurs to anyone? I have an MA in mathematics and the idea of mutable variables never seemed strange in the process of learning to program.
But even more, for a lot of people, even algebra isn't a natural way of thinking. What's natural is a series of instructions about how to perform arithmetic. Learning what an abstract variable means is a hurdle for a lot of students.
The thing is that "natural", in this context, just means "what occurs to you first, all things being equal".
You may have a good and correct argument that learning Haskell is the right way to do things, that the costs of learning are more than exceeded by the benefits.
But I don't think you can argue people into having functional approaches be the first thing that occurs to them.
I think it's impossible to disentangle the reality of common programming languages and how they work from what is inherently more "natural". The bottom line is programming is not natural, not in any way shape or form. It's hard to learn imperative programming, it's hard to learn functional programming, and it's really hard to learn object-oriented programming (well). If all languages were functional I don't think that would really be harder than learning BASIC, people would just accept that is what programming is and get on with their lives.
I acknowledge that imperative programming is more analogous to real world tasks, but then as soon as you try to write your first program you run headlong into the fact that programming is nothing like other tasks you have done before. The tricks you have to learn to make imperative programming work are not really any easier than the tricks for other paradigms, and in fact you can mix and match from various paradigms to great effect (see: ruby).
"...it's really hard to learn object-oriented programming (well)."
You have no idea how desperately I wish more people understood that. I could swear that every introductory programming book produced in recent years was written by someone who either completely forgot how hard this material was to master originally, or was just some freak of nature who intuitively grasped it.
Please, please, please, give your students a solid grounding in procedural (or even functional!) programming before teaching OOP. You can teach them about objects and how to use them, but please stop making people write entire classes before they have at minimum a month of coding exercises behind them! (Universities that teach CS101 in Java, I'm looking at you.)
Anyway, I strongly agree with you and reached for the upvote button, and ended up downvoting you instead. Stupid touchscreens.
> but then as soon as you try to write your first program you run headlong into the fact that programming is nothing like other tasks you have done before. The tricks you have to learn to make imperative programming work are not really any easier than the tricks for other paradigms.
There you go again, equating good and natural.
It seems like even functional decomposition can be unnatural for people, given how common "one big loop" mega-functions are as the solution created by a lot of contractors.
Certainly, the problem with ball-of-mud and simple imperative programming styles is that they get harder as you go along.
Indeed, programming methodologies are more or less unnatural ways to program that allow programmers to be more productive and programs to scale in compensation for that unnaturalness.
Consider that mathematicians among the general public, pre-computers, were actually far more rare than functional programmers among programmers. Mathematics is also an unnatural construction. I grew up having a great deal of difficulty with handwriting before computers appeared, and I learned to love math shortcuts for the writing they saved me. But I also learned most people are more mentally lazy than physically lazy and on average would rather do more work than learn a new thing.
If you view it in terms of psychological development, understanding concrete imperative steps happens before children are capable of understanding abstract concepts like mathematics. The works of Seymour Papert describe how these findings relate to education.
That said, people who connect this to software in a professional context are doing a disservice since hopefully software professionals have progressed in their intellectual development beyond their teens and are capable of abstract mathematical thought.
Concrete imperative steps are extremely mathematical -- even elegantly so -- and most of software verification is predicated on this idea. Just like you can describe physics pretty precisely with simple math, you can do it with computation. The math of PFP is one particular approximation, chosen for its supposed utility as the basis for a language. It is more linguistic math than descriptive math. I also hope people adopt abstract mathematical thought when reasoning about their programs, but that has little to do with the kind of math they choose to model their programs' source code with.
Sure, you can model imperative steps in a variety of ways using mathematics, but for 99% of mathematics there is a far more direct link to pure functions. In how many branches of mathematics do you see a function that is anything but a mapping between sets (or types, or objects in a category). In terms of using the same kind of processes and intuitions, pure functional programming is much closer to mathematics as done by humans than imperative programming. Even theoretical computer science looks this way outside of particular algorithms being studied.
Not that this is some sort of indictment of imperative programming; it's a much better fit for a lot of different situations. But if you're trying to model something in abstract mathematics, pure or pure-ish function programming is likely to be a better solution. For example, look at the various proof assistants like Coq. Many of them are written in functional languages, and if you look at the code you can see why: the manipulation of formulas and proofs fits very well with an immutable, functional approach.
> but for 99% of mathematics there is a far more direct link to pure functions.
But state transitions are pure! Deterministic state machines are nothing more than mathematical functions from one algorithm state to the next, and nondeterministic ones are binary relations! (see this for an overview: http://research.microsoft.com/en-us/um/people/lamport/pubs/s...) Those can then be mathematically manipulated, substituted and what-have-you elegantly and simply. This is how formal verification has been reasoning about programs since the seventies (early on the formalization was a bit more cumbersome, but the concept was as simple).
> For example, look at the various proof assistants like Coq.
For example, look at the specification language (and proof assistant) TLA+. It is far simpler than Coq, just as pure, and requires no more than highschool math.
Coq and other PFP formalizations are built on the core idea of denotational semantics, an approximation which at times only complicates the very simple math of computation, requiring exotic math like category theory.
> the manipulation of formulas and proofs fits very well with an immutable, functional approach.
The manipulation of formulas and proofs fits very well with the imperative approach, only no category theory is required, and core concepts such as simulation/trace-inclusion arise naturally without need for complex embeddings. "PFP math" is what arises after you've decided to use denotational semantics. The most important thing is not to confuse language -- for which PFP has an elegant albeit arcane mathematical formalization -- with the very simple mathematical concepts underlying computation.
PFP is an attempt to apply one specific mathematical formulation to programming languages (other similar, though perhaps less successful attempts are the various process calculi). But it is by no means the natural mathematics of computation. Conflating language with algorithms is a form of deconstructionism: an interesting philosophical concept, but by no means a simple one, and perhaps not the most appropriate one for a science. Short of that, we have the basic, simple core math, and on top of it languages, some espousing various mathematical formalizations, related to the core math in interesting ways, and some less mathematical. But the math of computation isn't created by the language!
> But state transitions are pure! Deterministic state machines are nothing more than (obviously, "pure") functions from one program state to the next, and nondeterministic ones are binary relations!
Okay, yes, but 99% of mathematics isn't deterministic state machines. So again, pure functions can model a lot of things, but deterministic state machines are foreign to the way people think about mathematics in most instances.
> For example, look at the pure specification language (and proof assistant) TLA+. It is far simpler than Coq, just as pure, and requires no more than highschool math.
Okay, that's completely fine as an example. I just mentioned Coq because it's one of the better known ones.
> Coq and other PFP formalizations are built on the core idea of denotational semantics, an approximation which at times only complicates the very simple math of computation, requiring exotic math like category theory.
I'm not sure where CiC (or any of the other recent type theories, including HoTT) requires category theory (which is hardly exotic in 2016). People who are doing meta-mathematics on these systems are interested in categorical models and such, but implementing and using these systems requires no category theory.
Categories are not fundamental objects in these theories; you have to build them just like you do in material set theories, although some parts of that process are easier (especially in HoTT). But that is the reverse of the link that you're claiming.
> The manipulation of formulas and proofs fits very well with the imperative approach, only no category theory is required, and core concepts such as simulation/trace-inclusion arise naturally without need for complex embeddings.
Again, no category theory is required (though I'm still not sure why this is a bad thing?) to develop a prover like Coq or TLA+. If you're bringing simulation and trace-inclusion into this, then you're just saying the stateful, imperative approach is well adapted to working with stateful, imperative systems. I agree, but how exactly does that equate to that approach having any benefit whatsoever for formalizing the rest of mathematics?
> Okay, yes, but 99% of mathematics isn't deterministic state machines. So again, pure functions can model a lot of things, but deterministic state machines are foreign to the way people think about mathematics in most instances.
Sorry, I wasn't clear. 99% of mathematics isn't Schrödinger's equation either, but Schrödinger's equation is still relatively simple math. State machines are simple math, but math isn't state machines. State machines are the concept that underlies computation, and simple math is used to reason about them.
> I just mentioned Coq because it's one of the better known ones.
... and one of the least used ones, at least where software is concerned. There's a reason for that: it's very hard (let alone for engineers) to reason about computer programs in Coq; it's much easier (and done by engineers) to reason about computer programs in TLA+ (or SPIN or B-Method or Alloy).
> which is hardly exotic in 2016
I think it's safe to say that most mathematicians in 2016 -- let alone software engineers -- are pretty unfamiliar with category theory, and have hardly heard of type theory. Engineers, however, already have nearly all the math they need to reason about programs and algorithm in the common mathematical way (they just may not know it, which is why I so obnoxiously bring it up whenever I can, to offset the notion that is way overrepresented here on HN that believes that "PFP is the way to mathematically reason about programs". It is just one way, not even the most common one, and certainly not the easiest one).
> but implementing and using these systems requires no category theory.
Right, but we're talking about foundational principles of computation. Those systems are predicated on denotational semantics, which is a formalization that identifies a computation with the function it computes (yes, some of those systems also have definitional equality, but still, denotational semantics is the core principle), rather than view the computation as built up from functions (in fact, this is precisely what monads do and why they're needed, as the basic denotational semantics fails to capture many important computations). This formalization isn't any better or worse (each could be defined in terms of the other), but it is more complicated, and is unnecessary to mathematically reason about programs. It does require CT concepts like monads to precisely denote certain computations.
> If you're bringing simulation and trace-inclusion into this, then you're just saying the stateful, imperative approach is well adapted to working with stateful, imperative systems.
There are no "imperative systems". Imperative/functional is a feature of the language used to describe a computation, not the computation itself (although, colloquially we say functional/imperative algorithms to refer to those algorithms that commonly arise when using the different linguistic approaches). The algorithm is always a state machine (assuming no textual deconstructionism) -- whether expressed in a language like Haskell or in a language like BASIC -- and that algorithm can be reasoned about with pretty basic math. And I am not talking about a "stateful" approach, but a basic mathematical approach based on state machines (a non-stateful pure functional program also ultimately defines a state machine).
> I agree, but how exactly does that equate to that approach having any benefit whatsoever for formalizing the rest of mathematics?
Oh, I wasn't talking about a new way to formalize the foundation of mathematics (which, I've been told, is the goal of type theory), nor do I think that a new foundation for math is required to mathematically reason about computation (just as it isn't necessary to reason about physics). I just pointed out that algorithms have a very elegant mathematical formulation in "simple" math, which is unrelated to PFP. This formulation serves as the basis for most formal reasoning of computer programs.
> I think it's safe to say that most mathematicians in 2016 -- let alone software engineers -- are pretty unfamiliar with category theory
I'm sure that's true for software engineers, but my experience is that category theory has permeated most fields to a significant degree. Many recent graduate-level texts on fields of mathematics from topology to differential geometry to algebra incorporate at least basic category theory like functors and natural transformations. It's even more common at the research level. And I say all of this not having actually met a single actual category theorist, only those in other fields who used at least some of it.
> Those systems are predicated on denotational semantics, which is a formalization that identifies a computation with the function it computes ... rather than view the computation as built up from functions
You know there's operational semantics too, right? Operational semantics typically describes the behaviour of your program as a state machine, especially the small step type of operational semantics.
Of course. But I was talking about FP there, and, I think, operational semantics must be embedded in FP. In the state machine view, the operational semantics is simply a refinement (a more detailed state machine) of the denotational semantics (which could also be specified as a nondeterministic SM).
I was being vague and imprecise. Obviously, any semantics is semantics of a language, and every language has its own perfectly fine operational semantics. I think that FP is a non-optimal first-choice for operational reasoning; let's call it that.
But while I have your attention, let me try to put the finger on two concrete problem areas in typed-PFP reasoning. Now, I'm not saying that this isn't useful for a programming language; I am saying that it is a poor choice for reasoning about programs. I think that FP has two discrete "jumps" between modes that are simply unnecessary for reasoning and do nothing but complicate matters (again, they may be useful for other reasons):
The first is the jump between "plain" functional code and monadic code. Consider an algorithm that sorts a list of numbers using mergesort. The program could be written to use simple recursion, or it could be written using a monad, with the monadic function encoding just a single step. Those two programs are very different, but they encode the very same algorithm! The first may use a more denotational reasoning, and the second a more operational one. In TLA, there is no such jarring break. You can specify an algorithm anywhere you want on a refinement spectrum. Each step may be just moving a single number in memory, splitting the array, all the way to sorting in one step. The relationship between all these different refinement levels is a simple implication:
R1 => R2 => R3
where R3 may be completely denotational and not even algorithmic as in:
done => result = Sorted(input)
(assuming some operator Sorted).
The second discrete jump is between the program itself and the types. I was just talking to an Agda-using type theorist today, and we noted how a type is really a nondeterministic program specifying all possible deterministic programs that can yield it. This is a refinement relation. Yet, in FP, types are distinct from programs (even in languages where they use the same syntax). In TLA the relationship between a "type", i.e. a proposition about a program and the program is, you guessed it, a simple refinement, i.e. simple logical implication (he figures that intermediate refinement steps are analogous to a "program with holes" in Agda). So, the following is a program that returns the maximal element in a list:
but it is also the type (assuming dependent types) of all programs that find the maximum element in a list, say (details omitted):
A1 ≜ done = FALSE ∧ max = {} ∧ i = 0
     ∧ [](IF i = Len(input) THEN done' = TRUE
          ELSE (input[i] > max => max' = input[i])
          ∧ i' = i + 1)
Then, A1 => A2, because A1 is a refinement of A2.
So two very central concepts in typed-PFP, namely monads and types, are artificial constructs that essentially just mean refinements. Not only is refinement a single concept, it is a far simpler concept to learn than either monads or types. In fact, once you learn the single idea of an "action" in TLA, which is how state machines are encoded as logic (it is not trivial for beginners, but relatively easy to pick up), refinement is plain old familiar logical implication.
So I've just removed two complicated concepts and replaced them with a simple one that requires little more than highschool math, all without losing an iota of expressivity or proving power.
That depends what it is that you want to specify: a function or a computation (i.e., a machine). In TLA+ as in "reality" computations are not functions, and functions are not computations.
We'll start with a function. It's basically `[a -> [a -> a]]` or `[a × a -> a]`. More completely (to show that `a` is abstract):
CONSTANT a, f
ASSUME f ∈ [a -> [a -> a]]
`CONSTANT` means that a can be anything (formally, any set)
If you want to specify a computation, that, when terminates[1] returns a result in `a`, the "type" would be `[](done => result ∈ a)` (where [] stands for the box operator, signifying "always"), or more fully:
CONSTANT a, Input
ASSUME Input ∈ a × a
VARIABLES done, result
f ≜ ... \* the specification of the computation
THEOREM f => [](done => result ∈ a)
So the first type is purely static. A set membership. The second has a modality, which says that whenever (i.e., at any step) `done` is true, then `result` must be in `a`.
------
[1]: Let's say that we define termination as setting the variable `done` to TRUE. BTW, in TLA as in other similar formalisms, a terminating computation is a special case of a non-terminating one, one that at some finite time stutters forever, i.e., doesn't change state, or, formally <>[](x' = x), or the sugared <>[](UNCHANGED x) namely eventually x is always the same (<> stands for diamond and [] for box; for some reason HN doesn't display those characters)
Reduce(Op(_, _), x, seq) ==
    LET RECURSIVE Helper(_, _)
        Helper(i, y) ==
            IF i > Len(seq) THEN y
            ELSE Helper(i + 1, Op(y, seq[i]))
    IN Helper(1, x)
A couple of things. First, the need for the helper is just a limitation of the language, which doesn't allow recursive operators to take operators as arguments. Second, I used indexing, but you can use the Head and Tail operators; it's just that I use this often, and I care about the model-checking performance. Finally, Reduce is an operator, not a function (it is not a function in the theory). Hence, its domain (for all arguments that aren't operators) is "all sets", something that would be illegal for a "proper" function (Russell's paradox). Operators, like functions, can be higher order, but they're not "first-class", i.e., they can't be considered "data". Functions require a proper domain (a set).
> Do you mean you can't pass an operator to a function
Right. Functions are objects in the theory, and must have a domain which is a set. Operators are outside the theory, and aren't in any sets. You also can't return an operator as a result of an operator.
A function can, of course, return a function, but functions can't be polymorphic within a single specification. So they must have a particular (though perhaps unspecified) set as their domain. Why? Because there is no such thing as polymorphic functions in math. Polymorphism is a feature of a certain language or "calculus", and TLA+ is about algorithms, not about a specific linguistic representation of them. "Polymorphism" of the kind I've shown does make sense, because you determine the level of the specification, and you can say that you want to reason about the use of a function (or a computation) without assuming anything more specific other than your explicit assumptions about a certain set. But that is not to say that you can’t have an operator that “creates” functions generically. E.g.:
Foo(a) ≜ [x ∈ a × a ↦ x[1]] /or/ Foo(a) ≜ LET f[x ∈ a × a] ≜ x[1] IN f
Which you can then use like so: Foo(Nat)[1, 2]. Foo must be an operator because its argument `a` ranges over all sets, so it's not a proper domain (Russell, etc.)
> or make it the argument of a computation
Ah, that actually you can do, and it's quite common as it's very useful. A constant can be an operator (a variable can't because a variable is state, and state must be an object in the theory). For example, it's useful for taking a relation as input (although a relation can also be defined as a set of pairs, so it’s an object in the theory). If in the polymorphic example you want to say that the set `a` is partially ordered (because you're specifying a sorting algorithm), you can write[1]:
CONSTANTS a, _ ⊑ _
ASSUME ∀x, y, z ∈ a :
    ∧ x ⊑ x
    ∧ x ⊑ y ∧ y ⊑ x => x = y
    ∧ x ⊑ y ∧ y ⊑ z => x ⊑ z
All of this, BTW, is actually the "+" in TLA+ (which is a bit technical, and trips me up occasionally), and not where the core concepts lie, which is the TLA part. Lamport would say you’re focusing on the boring algebraic stuff, rather than the interesting algorithmic stuff… I guess you could come up with another TLA language, with a different "+", but the choice of this particular "+" (i.e, set theory) was made after experimentation and work with engineers back in the '90s (Lamport says that early versions were typed). There is actually a type inferencer for TLA+, which is used when proofs are passed to the Isabelle backend (or intended to be used, I don't know; the proof system is under constant development at INRIA). So you get rich mathematical reasoning using simple math — no monads or dependent types (nor any types) necessary. Those things can be useful (for automatic code extraction, maybe, or for deep mathematical theories of languages — not algorithms), but for mathematically reasoning about programs, this is really all you need. And it works well, in the industry, for programs as small as a sorting routine or as large as a complex distributed database.
[1]: The aligned conjunctions are a syntactic convenience to avoid parentheses.
Constants are external inputs to a specification. All you know about them is what you assert in the assumptions. When you use the proof system, the assumptions are your axioms and if you don't have any, then it's really polymorphic. If you use the model checker, you'll need to supply the checker with a concrete -- and finite[1] -- set for all constants.
[1]: The one supplied with TLA+, called TLC, is powerful enough to model-check an expressive language like TLA+, but its algorithm is rather primitive; it's an "old-school" explicit-state model checker. A more modern, state-of-the-art model checker is under research: http://forsyte.at/research/apalache/. BTW, modern model checkers are amazing. Some can even check infinite models.
You always want to decompose your model to help reasoning about it though, no? Even if you're modeling as a state machine, rather than a single big state machine you'd want to separate it into small orthogonal state machines as much as possible.
I see the functional approach as taking that one step further: separate the large proportion of the program that doesn't depend on state at all (i.e. that's conceptually just a great big lookup table - which is the mathematical definition of a function) from the operations that fundamentally interact with state. I find anything involving state machines horrible to reason about, so I'd prefer to minimize the amount of the program where I have to think about them at all.
> Even if you're modeling as a state machine, rather than a single big state machine you'd want to separate it into small orthogonal state machines as much as possible.
Your notion of state is too specific for the theoretical meaning. Which function is executing in an FP computation is also state, as is the values of its arguments. Every software system is a single, possibly nondeterministic state machine. The decomposition you speak of is merely a decomposition of the state transition into various chunks (formulas). You can think of an abstract state machine as a single (pure) function -- just like a simple FP program -- except not a function from the input to the output, but a function from one state to the next (kinda like a state monad's monadic function, but expressed more naturally). One last complication is that it isn't quite a function but a relation, as you want to express the fact that your state machine may do one of several things in the next state, e.g. to describe different interleaving of threads, or even to model the user generating input.
Another thing you want to do is refinement and abstraction, i.e. specify your algorithm in different levels of detail (machines with more or fewer states) and show that the properties you want are preserved in the abstraction. Of course, you won't do that for something as simple as a sorting algorithm, but you also want to reason about large complex things, like an OS kernel or a distributed database.
So TLA simplifies things further by saying that the whole world is a single state machine, and your specifications are restricted views of that "world machine". This allows you to specify a clock with minute and second hands, and another specification of a clock with just a minute hand, and then say that both are views of the same clock, with the first being just a more refined description of it than the second (this is a problem with multiple state machines, as one takes a step every second, and the other only every minute).
> I see the functional approach as taking that one step further: separate the large proportion of the program that doesn't depend on state at all (i.e. that's conceptually just a great big lookup table - which is the mathematical definition of a function) from the operations that fundamentally interact with state.
Again, what you consider state and the "abstract state" in abstract state machines are not quite the same. There is no such thing as a program that doesn't depend on state. Wherever there's a program, there's state. If you implement an algorithm that finds the maximum element in a list of numbers by a simple traversal in a PFP language and in an impure imperative language, the result would look very different, but the algorithm is the same, hence the state and the state transitions are identical.
That's the whole point in thinking of algorithms, not of code. I'd guess that this is what you do anyway -- regardless of the language. You don't necessarily always reach the same level -- e.g. once your complex distributed blockchain needs to find the maximal number in a list, you may go down to the index level in an imperative language, yet stop at the fold level in FP, and that's fine (you decide what an abstract machine can do at each step) -- but ultimately, at some point, you always think of your algorithm in the abstract -- you see it running in your mind -- rather than in linguistic terms, and that is the interesting level to reason about it. Forget about FSMs. What you imagine is pretty much the abstract state machine you can reason about mathematically. A mathematical state machine is simply a description of your program's steps (and every program is composed of steps) at any level of detail.
> That's the whole point in thinking of algorithms, not of code. I'd guess that this is what you do anyway -- regardless of the language. You don't necessarily always reach the same level -- e.g. once your complex distributed blockchain needs to find the maximal number in a list, you may go down to the index level in an imperative language, yet stop at the fold level in FP, and that's fine (you decide what an abstract machine can do at each step) -- but ultimately, at some point, you always think of your algorithm in the abstract -- you see it running in your mind -- rather than in linguistic terms, and that is the interesting level to reason about it. Forget about FSMs. What you imagine is pretty much the abstract state machine you can reason about mathematically.
This is a fascinating view, but no, I really don't. I think of the function that finds the maximal number in a list as a mathematical function (that is, "really" a set of pairs - a giant lookup table, that the implementing computer will have some tricks for storing in a finite amount of memory but those tricks are implementation details). I think of a function composed of two functions (when I think about it at all) as a bigger table (like matrix multiplication) - not as anything stateful, and not as anything involving steps. Like, if I think about 5 + (2 * 3), I think of that as 5 + 6 or 11, and sure you can model that as a succession of states if you want but I don't find that a helpful/valuable way to think about it. It seems like you want to think of the process of evaluation as first-class, when as far as I'm concerned it's an implementation detail; I see my program as more like a compressed representation of a lookup table than a set of operations to be performed.
> I think of the function that finds the maximal number in a list as a mathematical function
My point is not how you would classify the object which is the program, but how you would go about programming it. You can think of it as a lookup table all you want, but you will define your algorithm as some state machine. Maybe looking for the maximum value is too simple, as the use of a combinator is too trivial, but think of a program that sorts a list. You can imagine the program to be a function if you like, but when you design the algorithm, there is no way, no how, that you don't design it as a series of steps.
> not as anything stateful
Again, don't conflate what programmers normally think of as "stateful" with the concept of a state in an abstract state machine.
> and sure you can model that as a succession of states if you want but I don't find that a helpful/valuable way to think about it.
I don't find it helpful, either, so I don't do it (more on that later). Whether you want to say that 5 + (2 * 3) in TLA is one step or several is completely up to you. TLA is not a particular state machine model like lambda calculus or a Turing machine, but an abstract state machine.
> It seems like you want to think of the process of evaluation as first-class, when as far as I'm concerned it's an implementation detail
You are -- and I don't blame you, it's hard to shake away -- thinking in terms of a language, which has a given syntax and a given semantics and "implementation". When reasoning about algorithms mathematically, there are no "implementation" details, only the level of the algorithm you care to specify. In TLA+ you can define higher-order recursive operators like "reduce" that do a reduction in one step -- or in zero time if you like -- or you can choose to do so over multiple steps. It is up to you. There is no language, only an algorithm, so no "implementation", only the level of detail that interests you in order to reason about the algorithm. Personally, for simple folds I just use the reduce operator in "zero time", because I really don't care about how it's done in the context of my distributed data store. But if you wanted to reason about, say, the performance of a left-fold or a right-fold, you'll want to reason about how they do it, and how any computation is done is with a state machine.
> You can think of it as a lookup table all you want, but you will define your algorithm as some state machine. Maybe looking for the maximum value is too simple, as the use of a combinator is too trivial, but think of a program that sorts a list. You can imagine the program to be a function if you like, but when you design the algorithm, there is no way, no how, that you don't design it as a series of steps.
Again, disagree. I think of it more as defining what a sorted list is to the computer. (An extreme example would be doing it in Prolog or the like, where you just tell the computer what it can do and what you want the result to look like.) It's very natural to write mergesort or bubblesort thinking of it as a lookup table - "empty list goes to this, one-element list goes to this, more-than-one-element list goes to... hmm, this".
Now that's not an approach that's ever going to yield quicksort, but if we're concerned with correctness and not so much about performance then it's a very usable and natural approach.
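For what it's worth, here is roughly what that "table" reading looks like written down -- a sketch in Haskell, though any pattern-matching language would do; each equation is one "row" of the table:

    mergeSort :: Ord a => [a] -> [a]
    mergeSort []  = []                                        -- empty list goes to this
    mergeSort [x] = [x]                                       -- one-element list goes to this
    mergeSort xs  = merge (mergeSort front) (mergeSort back)  -- longer lists go to this
      where (front, back) = splitAt (length xs `div` 2) xs

    merge :: Ord a => [a] -> [a] -> [a]
    merge [] ys = ys
    merge xs [] = xs
    merge (x:xs) (y:ys)
      | x <= y    = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys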
> You are -- and I don't blame you, it's hard to shake away -- thinking in terms of a language, which has a given syntax and a given semantics and "implementation". When reasoning about algorithms mathematically, there are no "implementation" details, only the level of the algorithm you care to specify. In TLA+ you can define higher-order recursive operators like "reduce" that does a reduction in one step -- or zero-time if you like -- or you can choose to do so over multiple steps. It is up to you. There is no language, only an algorithm, so no "implementation", only the level of detail that interests you in order to reason about the algorithm.
I think you're begging the question. If you want to reason about an algorithm - a sequence of steps - then of course you want to reason about it as a sequence of steps. What I'm saying is it's reasonable and valuable, under many circumstances, to just reason about a function as a function. If we were doing mathematics - actual pure mathematics - it would be entirely normal to completely elide the distinction between 5 + (2 * 3) and 11, or between the derivative of f(x) = x^2 and g(x) = 2x. There are of course cases where the performance is important and we need to include those details - but there are many cases where it isn't.
> if you wanted to reason about, say, performance of a left-fold or a right-fold, you'll want to reason about how they do it, and how any computation is done is with a state machine.
Sure. (I think/hope we will eventually be able to find a better representation than thinking about it as a state machine directly - but I completely agree that the functional approach simply doesn't have any way of answering this kind of question at the moment).
> Again, disagree. I think of it more as defining what a sorted list is to the computer. (An extreme example would be doing it in Prolog or the like, where you just tell the computer what it can do and what you want the result to look like).
Exactly, but the point is at what level you define it. Take your Prolog example. This is how you can define what a sorted list is in TLA+:
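Something along these lines, say, where Permutations(s) is assumed to be an operator yielding all reorderings of s, and IsSorted is the predicate used again further down:

IsSorted(s) ≜ ∀ i ∈ 1..(Len(s) - 1) : s[i] ≤ s[i + 1]
SortOf(s) ≜ CHOOSE t ∈ Permutations(s) : IsSorted(t)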
So this is a definition, but it's not an algorithm. Yet, in TLA+ it could describe a state machine of sorts: the nondeterministic state machine that somehow yields a sorted list. Any particular sorting algorithm is a refinement of that machine. Of course, the refinement does not need to happen in one step. You can describe a mergesort as a machine that somehow merges, which is a refinement of the magical sorting machine. Any machine that merges, say, using a work area in a separate array, is then a refinement of that. You decide when to stop.
If there's a difference in how you "define" a sorted list via bubble-sort or mergesort in your lookup-table image, that difference is an abstract state machine. For example, in the mergesort case your composition "arrows" are merges. You have just described a (nondeterministic) state machine. Otherwise, if your lookup-table algorithm doesn't have any steps, it must simply map the unsorted array to the sorted array and that's that. If there are any intermediate mappings, those are your transitions.
The state machines are nondeterministic, so you don't need to say "do this then do that". It's more like "this can become that through some transformation". Every transformation arrow is a state transition. You don't need to say in which order they are followed.
Here is a certain refinement level of describing mergesort:
∃x, y ∈ sortedSubarrays : x ≠ y ∧ x[2] + 1 = y[1]
∧ array' = Merge(array, x, y)
∧ sortedSubarrays' = (sortedSubarrays \ {x,y}) ∪ {<<x[1], y[2]>>}
That's it; that's the state machine for mergesort (minus the initial condition). It is completely declarative, and it basically says "any two adjacent sorted subarrays may be merged". Note that I didn't specify anything about the order in which subarrays are merged; this machine is nondeterministic. Even that may be too specific, because I specified that only one pair of subarrays is picked at each step. I could have said that any number of pairs is picked, and then my specification would be a refinement of that.
You can define an invariant:
Inv ≜ [](∀x ∈ sortedSubarrays : IsSorted(x))
The box operator [] means "always". Then you can check that
MergeSort => Inv
to assert that the sorted subarrays are indeed all sorted.
> I think/hope we will eventually be able to find a better representation than thinking about it as a state machine directly
There are two requirements for any such representation: first, the formulation must be able to describe anything we consider an algorithm. Second, the formulation must allow refinement, i.e. the ability to say that a mergesort is an instance of a nondeterministic magical sorting algorithm, and that a sequential or a parallel mergesort are both instances of some nondeterministic mergesort.
> If there's a difference in how you "define" a sorted list via bubble-sort or mergesort in your lookup table image that difference is an abstract state machine.
Perhaps. But very often that's precisely the kind of difference I want to elide, since the two are equivalent for the purposes of verifying correctness of a wider program that uses one or the other.
I find it a lot more difficult to think about a state machine than about a function. What is it I'm supposed to be gaining (in terms of verifying correctness) by doing so? (I might accept that certain aspects of program behaviour can only be modeled that way - but certainly it's usually possible to model a large proportion of a given program's behaviour as pure functions, and I find pure functions easier to think about by enough that I would prefer to divide my program into those pieces so that I can think mostly about functions and only a little about states).
I think we're talking at two different levels here. I am not talking about using a sorting routine in a program, but about writing a sorting routine. You want to know that your sorting routine actually sorts. Your routine is a state machine; its blackbox description is a function. You want to prove that your state machine implements the function. Our goal is to verify that mergesort actually sorts. Not that the air traffic control program using mergesort works (we could, but then the ATC program would be the state machine, and sorting would be just a function).
> Perhaps. But very often that's precisely the kind of difference I want to elide, since the two are equivalent for the purposes of verifying correctness of a wider program that uses one or the other.
This means that both are refinements of the "magical sorting machine". You can use "magical sorting" in another TLA+ spec, or, as in this running example, prove that mergesort is a refinement of "magical sorting". If what you want to specify isn't a mergesort program but an ATC program that uses sorting, of course you'd use "magical sorting" in TLA+. You specify the "what" of the details you don't care about, and the "how" of the details you do.
> I find it a lot more difficult to think about a state machine than about a function.
Sure, but again when you want to verify an algorithm, it may use lots of black boxes, but the algorithm you are actually verifying is a white box. That algorithm is not a function; if it were, then you're not verifying the how just stating the what. The how of whatever it is that you actually want to verify (not the black boxes you're using) is a state machine. Other ways of thinking about it include, say, process calculi, lambda calculus etc., but at the point of verifying that, considering it a function is the one thing you cannot do, because that would be assuming what you're trying to prove.
> Sure, but again when you want to verify an algorithm, it may use lots of black boxes, but the algorithm you are actually verifying is a white box. That algorithm is not a function; if it were, then you're not verifying the how just stating the what. The how of whatever it is that you actually want to verify (not the black boxes you're using) is a state machine. Other ways of thinking about it include, say, process calculi, lambda calculus etc., but at the point of verifying that, considering it a function is the one thing you cannot do, because that would be assuming what you're trying to prove.
Well a function is necessarily a function (if we enforce purity and totality at the language level). Presumably what we want to verify is that its output has certain properties, or that certain relations between input and output hold. But that seems like very much the kind of thing we can approach in the same way we'd analyse a mathematical function.
Let me also address this with some snarky quotes from Lamport[1] (and just add that, having tried what he suggests on real-world programs, although it may sound more complicated, it is actually simple):
> Computer scientists collectively suffer from what I call the Whorfian syndrome — the confusion of language with reality
> [R]epresenting a program in even the simplest language as a state machine may be impossible for a computer scientist suffering from the Whorfian syndrome. Languages for describing computing devices often do not make explicit all components of the state. For example, simple programming languages provide no way to refer to the call stack, which is an important part of the state. For one afflicted by the Whorfian syndrome, a state component that has no name doesn’t exist. It is impossible to represent a program with procedures as a state machine if all mention of the call stack is forbidden. Whorfian-syndrome induced restrictions that make it impossible to represent a program as a state machine also lead to incompleteness in methods for reasoning about programs.
> To describe program X as a state machine, we must introduce a variable to represent the control state—part of the state not described by program variables, so to victims of the Whorfian syndrome it doesn’t exist. Let’s call that variable pc.
> Quite a number of formalisms have been proposed for specifying and verifying protocols such as Y. The ones that work in practice essentially describe a protocol as a state machine. Many of these formalisms are said to be mathematical, having words like algebra and calculus in their names. Because a proof that a protocol satisfies a specification is easily turned into a derivation of the protocol from the specification, it should be simple to derive Y from X in any of those formalisms. (A practical formalism will have no trouble handling such a simple example.) But in how many of them can this derivation be performed by substituting for pc in the actual specification of X? The answer is: very, very few.
> Despite what those who suffer from the Whorfian syndrome may believe, calling something mathematical does not confer upon it the power and simplicity of ordinary mathematics.
Obviously, we all suffer from the Whorfian syndrome, but I think it's worthwhile to try and shake it off. Luckily, doing it doesn't require any study -- just some thinking.
I feel like you have confused quite a few concepts here.
First, type-theory-based proof assistants (like Coq) and model checkers are very different tools with very different guarantees. Most model checkers are used for proving properties about finite models and cannot reason about infinite structures.
A type theory with inductive types gives one a mechanism for constructing inductive structures, and the ability to write proofs about them.
I am not sure how you have equated denotational semantics and type theory, but there is no inherent connection; denotational semantics is just one way to give a language a semantics. One can use it to describe the computational behavior of your type theory, but it is not fundamental.
Category theory and type theory have a cool isomorphism between them, but otherwise can be completely siloed: you can be quite proficient in type theory and never touch or understand any category theory at all.
On the subject of TLA+, Leslie Lamport loves to talk about "ordinary mathematics", but he just chose a different foundational mathematics to build his tool with; type theory is an alternative formulation in this regard. One that is superior in some ways, since the language for proof and the language for programming are one and the same, and it does not require a strange stratification or layering of specification and implementation languages.
Another issue with many of these model-based tools is the so-called "formality gap": building a clean model of your program and then proving properties is nice, but without a connection to your implementation the exercise has questionable value. Sure, with distributed systems for example, writing out a model of your protocol can help find design bugs, but it will not stop you from incorrectly implementing said protocol. In distributed systems, even with testing, finding safety violations in practice is hard, and many of them can occur silently.
Proof assistants like Coq make doing this easier, since your implementation and proof live side by side and you can reason directly about your implementation instead of a model. If you don't like dependently typed functional languages, you can check out tools like Dafny, which provide a similar work style but with more automation and imperative programming constructs.
> This formulation serves as the basis for most formal reasoning of computer programs.
On this statement I'm not sure what community you come from, but much of the work going on in the research community uses things like SMT (which exposes SAT and a flavor of first-order logic), an HOL-based system like Isabelle, or type theory; very few people use tools like set theory to reason about programs.
> Engineers, however, already have nearly all the math they need to reason about programs and algorithms in the common mathematical way (they just may not know it, which is why I so obnoxiously bring it up whenever I can, to offset the notion that is way overrepresented here on HN that believes that "PFP is the way to mathematically reason about programs").
Finally, this statement is just plain not true; abstractly, it's easy to hand-wave on paper about the correctness of your algorithm. I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper. I wager even proving the implementation of merge sort correct would prove too much. I've spent the last year implementing real, low-level systems using type theory, and this stuff is hard; if you can show me a silver bullet I would be ecstatic, but any approach with as much power and flexibility is at least as hard to use.
It's not that "PFP" (and it's not PFP; it's type theory that makes this possible) is the "right way" to reason about programs, but that it makes it possible to reason about programs at all. For example, how do you prove a loop invariant in Python? How would you even start? I know of a few ways to do this, but most provide a weaker guarantee than the type theory version would, and require a large trusted computing base.
> > This formulation serves as the basis for most formal reasoning of computer programs.
> On this statement I'm not sure what community you come from, but much of the work going on in the research community is using things like SMT which exposes SAT and a flavor of first order logic, an HOL based system like Isabelle, or type theory, very few people use tools like set theory to reason about programs.
Pron is an engineer and he cares about what's easy for engineers to use. He's uninterested in research.
Well, interested intellectually, but yes, I'm an engineer and I judge the utility of tools by their applicability to the work of engineers. That a tool like Coq could theoretically be used to fully reason about and verify a large program is of little interest to me, as no one so far has been able to do that, let alone enable engineers to do that.
BTW, I'm not too sure what the point on SMT was. SMT just uses SAT for whatever theory is needed (and possible). TLA+ uses SMT (as well as Isabelle) extensively as automated tactics in deductive proofs (in set theory), which I have only just started to use. SMTs (and certainly SATs) are not related in any way to one mathematical formulation or another, just like garbage collection isn't. In fact, SATs are commonly used today in bounded temporal-logic model checking.
No. It's medium-sized, and it required a world expert, and it required a lot of effort, and even he had to skip the termination proofs because they proved too hard/time-consuming, so he just put in a counter and throws a runtime exception if it runs out.
> building a clean model of your program and then proving properties is nice, but without a connection to your implementation the exercise has questionable value
This is one thing that has always confused me about TLA+ since pron introduced me to it. Maybe translation of specification into implementation is always the easy part, though ...?
> Maybe translation of specification into implementation is always the easy part, though ...?
Not only is it the easy part, but we're talking about reasoning about programs. If reasoning requires end-to-end certification you're working too hard.
Even within TLA+, you don't work at one refinement level. The lowest level is just too low, and therefore too hard for some proofs. You create a couple of sound refinements -- simpler state machines that capture the relevant essence -- and verify the behavior of yours. It's simpler than it sounds (refinement is just plain old logical implication, =>, in TLA) but it does take effort to reason about -- let alone prove -- a complex algorithm regardless of the formalism you use. FP doesn't make it easier, and it does require more difficult math.
> First, type theory based proof assistants (like Coq) and model checkers are very different tools with very different guarantees.
I wasn't talking about model checkers but about mathematical specification languages. Some of them have proof assistants to help prove them, some have model checkers to help prove them, and some (like TLA+) have both. But the point is the formulation, not the proof tool.
> I am not sure how you have equated denotational semantics and type theory
No, I said FP is based on denotational semantics, and that reasoning about (typed) FP programs requires some type theory.
> but he just chose a different foundational mathematics to build his tool with
(When I speak of the math I'd rather refer to the logic TLA than to the tool TLA+, but that doesn't matter.) Obviously there is no objective, external way to classify mathematical theories as easy or hard. But that tool requires little more than high-school math, and it's been shown to be readily grasped by engineers with little training and almost no support. I think this qualifies as objective evidence -- if not proof -- that it is, indeed, simpler, and that is the main reason why I encourage engineers to learn it first, and only later learn "FP math" if they wish.
> Proof assistants like Coq make doing this easier since your implementation and proof live side by side, and you can reason directly about your implementation instead of a model.
That "easier" bit is theoretical. AFAIK, there has been only one non-trivial real-world program written in Coq, it was written by a world-expert, it took a lot of effort in spite of being quite small, and even he had difficulties, so he skipped on the termination proofs.
> very few people use tools like set theory to reason about programs.
Don't use set theory if you don't want to -- though it is easier by the measure I gave above -- as that's just the "static" part of the program. I'm talking about the dynamic part and temporal logic(s). TLs are far more common when reasoning about programs in industry than any FP approach, as are other approaches that work on Kripke structures, such as abstract interpretation.
> I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper.
I'm one.
> I wager even proving the implementation of merge sort correct would prove too much.
I wager that I can take any college-graduate developer, teach them TLA+ for less than a week, and then they'd prove merge-sort all by themselves.
> I've spent the last year implementing real, low-level systems using type theory, and this stuff is hard
It is. But I've spent the past few months learning and then using TLA+ to specify and verify a >50KLOC, very complex distributed data structure, and Amazon engineers use TLA+ to reason about much larger AWS services every day. It's not end-to-end certified development, but reasoning and certified proof of implementation are two different things. State-machine reasoning is just easier, and it doesn't require the use of a special language. You can apply it to Python, to Java or to Haskell.
> but that it makes it possible to reason about programs
State machine and temporal logic approaches have made it possible to reason about programs so much so that thousands of engineers reason about thousands of safety-critical programs with them every year.
> For example, how do you prove a loop invariant in Python?
Python is not a mathematical formulation. But proving a loop invariant in TLA+ is trivial. Sure, there may be a bug in the translation to Python, but we're talking about reasoning, not end-to-end certification, which is beyond the reach -- or the needs -- of 99.99% of the industry, and will probably stay there for the foreseeable future. The easiest way to reason about a Python program is to learn about state machines, and, if you want, use a tool like TLA+ to help you and check your work.
> > I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper.
> I'm one.
This doesn't seem right. TLA helps you prove properties of its specification, not its implementation. Or are you saying you can somehow prove properties of implementation too?
> I wager that I can take any college-graduate developer, teach them TLA+ for less than a week, and then they'd prove merge-sort all by themselves.
Sure, but prove properties of its specification, not its implementation. Unless I'm missing something ...
> State machine and temporal logic approaches have made it possible to reason about programs so much so that thousands of engineers reason about thousands of safety-critical programs with them every year.
Again, surely the specification of programs not their implementation?
> > For example, how do you prove a loop invariant in Python?
> ... Sure, there may be a bug in the tranlation to Python
Absolutely. That's the whole point. Translating it to Python is going to be hard and full of mistakes.
> but we're talking about reasoning not end-to-end certification, which is beyond the reach -- or the needs -- of 99.99% of the industry
If the implementation part truly is (relatively) trivial then this is astonishingly eye opening to me. In fact I'd say it captures the entire essence of our disagreements over the past several months.
> The easiest way to reason about a Python program is to learn about state machines, and, if you want, use a tool like TLA+ to help you and check your work.
No, that's the "easiest way" of reasoning about an algorithm that you might try to implement in Python.
> TLA helps you prove properties of its specification, not its implementation. Or are you saying you can somehow prove properties of implementation too?
What do you mean by "implementation"? Code that would get compiled to machine code? Then no. But if you mean an algorithm specified to as low a level as that provided by your choice of PL (be it Haskell or assembly), which can then be trivially translated to code, then absolutely yes. In fact, research groups have built tools that automatically translate C or Java to TLA+, preserving all semantics (although, I'm sure the result is ugly and probably not human-readable, but it could be used to test refinement).
Usually, though, you really don't want to do that because that's just too much effort, and you'd rather stop at whatever reasonable level "above" the code you think is sufficient to give you confidence in your program, and then the translation may not be trivial for a machine, but straightforward for a human.
> Translating it to Python is going to be hard and full of mistakes.
Not hard. You can specify the program in TLA+ at "Python level" if you like. Usually it's not worth the effort. Now, the key is this: full of mistakes -- sure, but what kind of mistakes? Those would be mistakes that are easy for your development pipeline to catch -- tests, types, whatever -- and cheap to fix. The hard, expensive bugs don't exist at this level but at a level above it (some hard-to-catch bugs may exist at a level below it -- the compiler, the OS, the hardware -- but no language alone would help you there).
But I can ask you the opposite question: has there ever been a language that can be compiled to machine code, yet feasibly allow you to reason about programs as easily and powerfully as TLA+ does? The answer to that is a clear no. Programming languages with that kind of power have, at least to date, required immense efforts (with the assistance of experts), so much so that only relatively small programs have ever been written in them, and at great expense.
So it's not really like PFP is a viable alternative to TLA and similar approaches for reasoning. Either the reasoning is far too weak (yet good enough for many purposes, just not for uncovering algorithmic bugs) or the effort is far too great. Currently, there is no affordable way to reason with PFP at all.
> No, that's the "easiest way" of reasoning about an algorithm that you might try to implement in Python.
You specify until the translation is straightforward. If you think there could be subtle bugs in the translation, you specify further. I'm in the process of translating a large TLA+ specification to Java. There's a lot of detail to fill in that I chose not to specify, but it's just grunt work at this point. Obviously, if a program is so simple that you can fully reason about it in your head with no fear of subtle design bugs, you don't need to specify anything at all...
Well, the classic here is his book Mindstorms, where he talks about how to construct learning environments for children that take into account that bridge between concrete and abstract thought and turn it into an asset instead of a liability.
If you want more on the topic that's not from the hands-on perspective of an educator, you might have to go further back to Piaget's papers, but unfortunately I haven't had the opportunity to dive into these much.
Well, you don't get to decide what people find natural, they do.
And deep below, computers work under a model that's anything but functional (assembly language is basically a big mutable state machine).
There is value in understanding both imperative and functional programming and applying each of them wherever they are the best fit and there are some real world situations where sticking to immutability is a terrible choice (e.g. neural networks).
Programming is about making a piece of hardware do what you want to do. Many people seem to forget that, or treat it as an afterthought. "Oh yeah, it's slow. Hopefully the compiler will do its magic!". Except it does not, because it cannot rewrite your code to match what the hardware wants.
This is why we now live in a world where many programmers don't think there is a problem with running web servers on programming languages that slow down computation by orders of magnitude, in addition to being very cache- and memory-inefficient. We can scale! Yeah sure, use 50 servers where one could have done it. We might as well use a Saturn V to launch small individual LEO satellites. Why not? It works!
Taking a functional approach is fine for a lot of tasks, and when you can do it without cryptic or inefficient code, you should do it. But when imperative is easier, use that instead.
You seem to have completely misunderstood my point.
> Well, you don't get to decide what people find natural, they do.
I am not contending that functional/declarative is a more natural way to think; I am saying that there is nothing inherently more natural about imperative programming. Since the latter comes from imitating the machine architecture further up the abstraction hierarchy, we should consider for a moment whether this is really desirable.
> And deep below, computers work under a model that's anything but functional (assembly language is basically a big mutable state machine).
This is entirely irrelevant. Computers are highly stateful, but the languages we design don't have to be stateful because of this. They have to meet the criterion of easily expressing the mental constructions we designate, which I would definitely say is much better handled with statically typed functional programming languages.
> There is value in understanding both imperative and functional programming and applying each of them wherever they are the best fit...
We completely agree on this! Not all problems are best solved with functional languages (e.g., things actually involving the machine).
> there are some real world situations where sticking to immutability is a terrible choice (e.g. neural networks).
I don't see why immutability is a terrible choice for neural networks? Could you elaborate on this? It might be that almost all neural network implementations have been imperative so you have come to accept that as more normal.
Typical neural networks have multiple layers of hundreds of thousands of nodes, which then get updated very frequently during the learning phase.
It would be suicidal for these updates to be immutable (e.g. returning brand new neurons with the updated value instead of updating the neuron that's already in the graph).
I have a cup of coffee; if I drink a little, I don't think a new cup of coffee with less coffee has been created, just that the old one is different. There are some times where imperative thinking seems more natural.
> There are some times where imperative thinking seems more natural
This is certainly true. But there are also times when there is nothing natural about it at all. Consider this scenario:
I give you a cup of coffee. You set it on your desk to cool. Then when you go to take a sip, the cup is empty! You see, I gave you the same cup I was drinking from.
A race condition I guess. At least we didn't try to drink at the same time.
For the record, I usually write functional code. Let me give another example instead of coffee. If I have a differential equation, "dx/dt = f(x,t)", I think of the solution as a function "x(t)", but if I try to solve it numerically, or even just reason about it, I visualize how "x" changes with time. Is this the reason why numerical code is usually imperative? Maybe it's just because of performance, but I really think that sometimes there is a cognitive burden in trying to reason fully functionally (I can visualize the parabolic trajectory of a ball, but I cannot visualize the 6-dimensional trajectory of an airplane doing a maneuver without seeing it "change with time").
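To make that concrete with a toy sketch (explicit Euler, made-up names): the "changes with time" picture is the step function, while the functional picture is the whole trajectory as one value.

    -- one step of explicit Euler for dx/dt = f(x, t)
    eulerStep :: Double -> (Double -> Double -> Double) -> (Double, Double) -> (Double, Double)
    eulerStep h f (t, x) = (t + h, x + h * f x t)

    -- the whole (approximate) trajectory x(t) as a single lazy value
    trajectory :: Double -> (Double -> Double -> Double) -> (Double, Double) -> [(Double, Double)]
    trajectory h f = iterate (eulerStep h f)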
I see your point. Learning how to reason effectively in higher and higher levels of abstraction can be daunting; however, it can pay off very well. Ask a mathematician about the set of interesting problems that can be visualized and they will tell you it has measure zero.
I think in a block-time model of the world. The coffee cup now and the coffee cup a minute ago are different things; my memory of the coffee cup a minute ago doesn't suddenly have less coffee in it.
The recipe analogy is certainly very natural, but it does not extend well once we get into things like assignments and references, which are a crucial part of von Neumann languages (which is really the thing I am criticizing).
Tactic languages in theorem provers like Coq are stateful too: you have a proof state and you keep giving instructions for it to change. I wouldn't say that this is an unnatural way to think; for that subtask of the problem, we can view the proof as something with state without any problems.
"Mathematical modeling" sounds too sophisticated to be natural for humans. Let me illustrate what I meant; the kind of naturality I was talking about is this:
sorted(xs): a permutation xs' of xs such that, as we traverse xs', each element is greater than or equal to the previous one.
This is a perfectly fine sorting algorithm. It contains all the information we need to sort a list of comparable things, and it is very unlikely to contain an error; it is the _definition_ of sorting.
There are legitimate performance concerns about this, but ignoring all performance issues, this is still more natural than listing out the steps needed to carry out the task of sorting. I don't think it is possible to come up with a simpler recipe-like algorithm for this task.
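For concreteness, the definition can even be run as-is if you ignore performance entirely -- a sketch in Haskell, with made-up names:

    import Data.List (permutations)

    -- the definition transcribed: elements in nondecreasing order
    isSorted :: Ord a => [a] -> Bool
    isSorted xs = and (zipWith (<=) xs (drop 1 xs))

    -- some permutation of xs that is sorted: correct by definition, O(n!) at worst
    sortedOf :: Ord a => [a] -> [a]
    sortedOf = head . filter isSorted . permutations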
There are recipe-like tasks in programming: ones that don't involve modeling the world; for those the functional paradigm would not be a good one. But these are rare if you are doing programming the right way.
You are confusing algorithms (Operational Semantics) with definitions (Denotational Semantics). One of the key differences between mathematics and computing is that mathematics is far less concerned with constructing results, and far more concerned with reasoning about the properties of objects. Mathematics sits at a level of abstraction unsuited to physical hardware.
In Haskell, this is evident in the complex and difficult-to-reason-about graph-reduction engine that lies hidden in the runtime. Denotational semantics => Haskell math, yay. Operational semantics => Haskell's downfall with real-world programmers.
I don't see how either operational or denotational semantics is relevant as I was not talking about semantics of any kind.
I don't think I am confusing anything; you are reacting exactly as I would expect from someone who does not get the point I'm making: we don't necessarily have to program with steps of instructions. If we have a good solver for our language, we can very efficiently execute definitions like my example. I agree that this presents the challenge of efficiently executing such high-level definitions, but that is irrelevant to my point.
> Mathematics sits at a level of abstraction unsuited to physical hardware.
Precisely what I am talking about. Mathematics sits at a level of abstraction convenient for humans not machines.
Have you ever programmed in Prolog? I strongly suggest you try it to get a feeling of why programs are not inherently lists of instructions and how computation can be viewed as deduction.
But in a way, imperative is more natural, because it captures the notion of computation more precisely. Haskell -- like other pure FP languages -- is built around the approximation of denotational semantics, which does have a bit of a mismatch with computation (not to say it isn't extremely useful much of the time). Anyway, mathematical thinking about programs didn't start with PFP, nor is PFP the most common way of formalizing programs. See:
> I believe that the best way to get better programs is to teach programmers how to think better. Thinking is not the ability to manipulate language; it’s the ability to manipulate concepts. Computer science should be about concepts, not languages. But how does one teach concepts without getting distracted by the language in which those concepts are expressed? My answer is to use the same language as every other branch of science and engineering—namely, mathematics.
It makes me sad that some PFP enthusiasts enjoy the mathematical aspects of it -- as they should -- yet are unfamiliar with the more "classical" mathematical thinking. I think it's important to grasp the more precise mathematics first, and only then choose which approximations you'd make in the language of your choice. Otherwise you get what Lamport calls "Whorfian syndrome — the confusion of language with reality".
Dijkstra, in his "on the cruelty of really teaching computing science" makes a very similar point that people should learn how to reason about programs before learning to program.
This is why dependent types are a great framework to program in. If a program is worth writing, it should at least contain everything the programmer thinks about the program, including the reasoning of why it is the program he wants to write!
Right. In computational semantics there are two schools. The "school of Dijkstra" (now championed most vocally by Lamport) -- which has largely taken hold in the field of formal verification -- and the "school of Milner" (Backus?) -- which has largely taken hold in the field of programming language theory. The former reasons in concepts and abstract structures (computations, Kripke structures), and the latter reasons in languages ("calculi").
The interesting philosophical question is this: can programs be said to exist as concepts independent of the language in which they are coded (in which case the language is an artificial, useful construct) or not (in which case the concept is an artificial useful construct)?
Whatever your viewpoint, the "conceptual" state machine math is a lot simpler than the linguistic math offered by PFP.
> Haskell -- like other pure FP languages -- is built around the approximation of denotational semantics,
Interesting, do you have any references for this? I thought that the primary reason for purity was to enable equational reasoning, but I have no sources for this. Also, AFAIK, there are no formal semantics for Haskell?
I have programmed in Java for some time now, and I have only recently dabbled in the functional world (Haskell, and I still have a ways to go before getting any good at it). My observation so far is that functional programming in a typed and immutable world provides better composability, reusability, terseness and maintainability. It does have its drawbacks - Haskell is known to be difficult for estimating complexity, and I think it lags in many benchmarks.
I think OO can be thought of as an extension of functional programming, except it has some fancy names and idioms (single responsibility principle, polymorphism, abstractions ...) which in the end are incarnations of functions. For example, int String.length(){..} could be thought of as a length(char[]):int function, and List.size() as length(List):int. Node Tree.next(DepthFirstStrategy) could be thought of as traverse(Tree, traversalFunction):Node. But because OO languages tend to be mutable, they lose the benefits of the FP world. However, because they are mutable, they tend to perform better in benchmarks. I think OOP implemented in an immutable language would be over-engineered FP, whereas OOP implemented in a mutable language is just glorified imperative programming.
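A rough illustration of that correspondence (a sketch in Haskell, with invented names): the OO receiver simply becomes an explicit argument.

    data Tree a = Leaf | Node (Tree a) a (Tree a)

    size :: [a] -> Int            -- ~ List.size()
    size = length

    depthFirst :: Tree a -> [a]   -- ~ Tree.next(DepthFirstStrategy), as a whole traversal
    depthFirst Leaf = []
    depthFirst (Node l x r) = x : depthFirst l ++ depthFirst r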
With my experience so far learning functional programming, I think it is a better tool: all the things you can do in the OO world, you can also do in the FP world, mostly better. When performance is a real concern, those bits can be factored out into some imperative code - whether that's done in OOP, assembly, or C is a matter of choice.
In other words, I think that in the order of technology preferences, FP should be prioritized higher than the others, although there seems to be a steeper learning curve. I think FP is not getting the attention it deserves as a better tool.
Interesting, especially about the cultural division and politics.
Something to think about, since I'm thinking of swapping out some problematic C# code for F#. Pattern matching and a stronger, more expressive type system would be very nice. F# isn't as far off the reservation as Haskell, but I'm not sure how well throwing a monkey wrench at the team will go.
Moving from C# to F# is much less politically difficult than moving to Haskell. For a start the framework they're on is the same, and actually, you can write object oriented F# that looks almost the same as C# (not that I would advise that, I prefer to use the OO side of F# as a last resort).
I'm the CTO at my place of work, and our primary app is C#. I wanted to tame some of the excesses of the language, and started looking at Haskell - I knocked up a few small projects with it; got it talking to the C# app through RabbitMQ, all happy days. But a few months later I came back to the projects to find they had rotted because of cabal. Some projects I found it next to impossible to recover. That left a pretty sour taste, which was gutting because I loved the language.
Then I took a look at F#. It's not as impressive as Haskell as a language, but it really does sit in that sweet spot for 'fitting in' to a big project environment. I've found that creating satellite projects to our core app in F# has been the perfect way to go. It's moved the team into that mindset slowly without having to force the issue.
> I've found that creating satellite projects to our core app in F# has been the perfect way to go
Can you elaborate on how that is laid out? I've identified some existing C# code in various of our current projects that could benefit from conversion to F#, but I'm not sure how the resulting projects should be structured.
Some things are just projects in the main VS solution. So for example I wanted to create a document search system. So I created an F# project in the main solution, brought in Lucene.NET from nuget. Half an hour later, done. You can call the functions in F# directly from C#. There are sometimes some type interop issues to deal with, but mostly it's painless.
Other projects are separate microservices that talk to our main app through my messaging framework [1]. It's an actor-model that I built a C# and F# API for, and they communicate via Redis. But tbh you can use anything you like: make it RESTful, use WebServices, RabbitMQ, etc. The great thing is you have access to the same libraries on both languages.
If you're thinking of refactoring existing projects then you could go the mechanical route of just converting classes like-for-like. But you miss out on some of the amazing type system benefits of F# if you do that.
What I tend to do is design all of the types first, as records or discriminated unions. A bit like if you were doing a DB related project you might spend some time designing the schema first, thinking about how the data should be laid out. With ADTs you can design the states of your application, whatever it is, and then you write the functions that transform those states in a referentially transparent way.
Then you book-end your lovely functional code with the IO.
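A tiny sketch of that shape (written in Haskell syntax here, though the layout is the same with F# records and discriminated unions; the domain and names are invented):

    data OrderState = Draft | Submitted | Shipped deriving (Show, Eq)
    data Order = Order { orderId :: Int, state :: OrderState }

    -- pure, referentially transparent state transition
    submit :: Order -> Either String Order
    submit o@(Order _ Draft) = Right (o { state = Submitted })
    submit _                 = Left "only draft orders can be submitted"

    -- IO book-ends the pure core
    main :: IO ()
    main = either putStrLn (print . state) (submit (Order 1 Draft))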
This whole process is much simpler when you start with smaller services; so if you've found sections of your existing app that you want to break out, then that's perfect.
By the way, if you want some of the functional niceness in C# too, then check my library below. It was built with your situation in mind (because it's mine too). It also has some utility functions for F#/C# interop.
Thank you, that is a very nice summary of the design process. I'll be keeping it in mind as I think about my Skype for Business Online UCWA wrapper later this week...
To have the equivalent outside the .NET world, there could be an OCaml for the JVM that would allow falling back on an ecosystem of libraries and toolchains. I think there have been some experiments to do that, but no finished one to my knowledge.
I think Scala is the main functional offering for JVM, no?
Although with C#'s direction of travel (bringing in pattern matching, record types, etc.), I reckon C# and F# will end up more like Scala and OCaml than like Java and OCaml.
Great read. I love it when oral history in our field gets recorded. One of my great fears is that more interesting stories than not will be lost to time.
I like Haskell, perhaps because of my strong mathematics background. My dilemma is: should I spend more time developing my Haskell skills, or should I spend some time with the new C++ features, or learning Rust, or just programming in the languages that I am already productive in?
When I assess a programming language, one of the questions I ask myself is "How will this programming language help me write correct programs?" In more detail I want a language that helps with four goals related to the ability to produce programs that achieve their specifications:
(1) Partial correctness: the program doesn't give wrong answers
(2) Total correctness: the program does (1) and doesn't deadlock or get stuck in an infinite loop. It eventually produces a correct answer.
(3) The program does (2) within the desired performance bounds specified.
(4) The program does (3) while making efficient use of hardware.
Haskell definitely helps with (1). Reasoning about the partial correctness of Haskell programs is easier than it is with imperative programming languages. Total correctness is very hard, especially with multi-threaded, multi-process, or multi-processor programs, but I think that Haskell is helpful here as well.
The real problem is that Haskell makes (3) and (4) very difficult to understand. Some of this is that compilers for functional programming languages are magic, making it hard to reason about what they are doing. This is further aggravated by Haskell's non-strictness.
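A standard small illustration of that last point (illustrative only; the exact behaviour depends on the GHC version and optimization flags):

    import Data.List (foldl')

    -- both compute the same sum, but the lazy foldl can build a huge chain of
    -- unevaluated thunks before collapsing it, while foldl' evaluates as it goes
    lazySum, strictSum :: [Int] -> Int
    lazySum   = foldl  (+) 0   -- may blow up in space on large inputs (when not optimized)
    strictSum = foldl' (+) 0   -- constant space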
A discouraging look at Haskell: "Why Is Haskell Used So Little In The Industry"[1] and
"Disadvantages of Purely Functional Programming"[2]
> Citing Credit Suisse and Standard Chartered is just name-dropping big banks in the hope that it will deceive people into thinking that these institutes are seriously invested in Haskell. They are not.
Lol, Standard Chartered is hiring like 30 Haskellers _this year_, and I'm pretty sure they were heavily invested in Haskell in 2010. The writer of that article has an axe to grind, and is twisting facts to suit it.
Update: sorry you're getting downvoted, that's not me. I think worrying about the predictability of performance is perfectly reasonable, especially in the presence of lazy evaluation.
Jon is a professional troll. He's done the same nonsense with lisp, ocaml, F# and haskell. And just as with all his other trolling, very little in those posts is accurate.
The benefits amounted to:
* GHC’s support for safe concurrent programming is world-class
* Sound type safety led to refactoring being a breeze
* Fun, stretches the mind (for some people)
* Perfectly reliable unit tests.
* Concurrent batched IO.
* Abstractions for things like timeouts and running tasks on other threads while correctly bubbling failures.
* Fantastic support for safely encoding and decoding JSON.
* One of the best benchmarking tools I’ve ever used.
And the complaints amounted to:
* Devs have to think more about which String representation to use
* Devs have to think more about syntactical differences from the more familiar C-derived languages
* The team has to support more backend platforms, duplicate infrastructure
* Not a mainstream language (harder to learn, hire for, bus factor)
* “at this stage in my career I don’t want another programming language”
75% were for and 25% were against. Frankly, the only complaints worthy of serious response were infrastructure duplication and hiring/skill development/bus factor. The rest were people just being close-minded and lazy.
Andy, the same Andy mentioned in the article, wrote a blog post in 2014 with his perspective of what it's like to use Haskell in production, if you're interested. https://engineering.imvu.com/2014/03/24/what-its-like-to-use...