StandardML really hits a nice sweet spot in language design.
The syntax is super-easy to learn (the BNF for the whole language fits in a mere 2 pages[0]), but contains a lot of features in that small package. Rather than being tacked on (e.g., Java with lambdas), the functional features have been carefully considered and streamlined, and include bits like proper tail calls and currying.
You get nice bits like actually sound types (Hindley-Milner types, as Milner was also one of the SML spec authors), generics (way better than typical interfaces), type inference that actually works, and modules (super-powerful encapsulation). Pattern matching in all its awesomeness is also on display.
SML has an amazing concurrency story (CML is rather like golang channels, but better, with better typing and a bit more flexibility) and compilers like Poly/ML or MLton are very fast (once again, around the same as golang).
Despite this, the language doesn't have the academic flaws of its descendants like Haskell.
SML isn't a lazy language, so reasoning about performance is much easier than some other functional languages.
SML doesn't pretend the world is a pure function. You are free to make functions that have side effects.
While most primitives and data structures are immutable by default, they either have mutable variants (eg, vector and array) or can be used as if mutable with refs (something like typesafe pointers without all the reference/dereference bits).
To be a bit pedantic (because this thing is interesting), it's more like a context. In Haskell you'll say "this code must run on the real world".
Mercury literally pretends the world is a piece of state. You explicitly say "Here, run this code. The world before it runs is on variable `a`, place the world after it runs on variable `b`".
The classic paper Imperative Functional Programming https://www.microsoft.com/en-us/research/publication/imperat... introduced the IO monad and explained that its internals are based on passing around the state of the world. The monad hides the world and keeps it linear. The compiler (ghc) optimizes out the world so that it is implicit in the compiled program. I believe this is still the way that ghc works.
Oh, yes, the stdlib does pretend the world is a piece of data. But it does a very good job on keeping this hidden from any developer, so this is more of a compiler design that does not leak into the language.
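To make the "passing the world around" idea concrete, here is a toy sketch in plain Haskell. It is not GHC's actual IO implementation; `World`, `Action`, `readLine` and `writeLine` are hypothetical names invented for the illustration, and the "world" is just a list of pending input lines plus the output written so far.

```haskell
-- Toy model only: a hypothetical stand-in for "the state of the world".
data World = World { pendingInput :: [String], outputLog :: [String] }
  deriving (Show, Eq)

-- An action maps the world before it runs to a result and the world after.
newtype Action a = Action { runAction :: World -> (a, World) }

instance Functor Action where
  fmap f (Action g) = Action $ \w -> let (a, w') = g w in (f a, w')

instance Applicative Action where
  pure a = Action $ \w -> (a, w)
  Action f <*> Action g = Action $ \w ->
    let (h, w')  = f w
        (a, w'') = g w'
    in (h a, w'')

instance Monad Action where
  -- The world is threaded linearly: each step receives the previous step's world.
  Action g >>= k = Action $ \w ->
    let (a, w') = g w in runAction (k a) w'

-- Consume one pending input line (empty string if none is left).
readLine :: Action String
readLine = Action $ \w -> case pendingInput w of
  []       -> ("", w)
  (l : ls) -> (l, w { pendingInput = ls })

-- Append one line to the output log.
writeLine :: String -> Action ()
writeLine s = Action $ \w -> ((), w { outputLog = outputLog w ++ [s] })

-- User code never mentions the world; the monad hides it.
greet :: Action ()
greet = do
  name <- readLine
  writeLine ("Hello, " ++ name)

main :: IO ()
main = print (snd (runAction greet (World ["Ada"] [])))
```

Note that `greet` reads like ordinary imperative code, while the explicit `World` values only show up inside the instance definitions, which is roughly the Mercury style (explicit before/after variables) versus the Haskell style (hidden) described above.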
The profusion of category-theoretic abstractions and some of the more recent purely functional norms in Haskell are like the PhD-level version of `AbstractVisitorContextFactoryBuilder` which, aside from all the unnecessary cognitive load, lead to enormous dependency graphs of the npm variety.
Personally, I wonder if uniqueness types in languages like the sadly forgotten Clean (a close relative of Haskell) would have been better than the whole Monad/Arrow thing. All of that abstract theoretical stuff is certainly fascinating but it's never seemed worth all the bother to me.
On the other hand, in SML you get all the benefits of strong HM type inference/checking, immutability by default, etc., while still being able to just `print` something or modify arrays in place. SML's type system isn't higher-order like Haskell's, so its solution to the same problem Haskell solves with type classes isn't quite as elegant, but otherwise SML is the C to Haskell's C++.
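For readers who haven't seen both approaches, here is a rough sketch of the difference, written entirely in Haskell. The explicit-dictionary version is only an analogy for how an SML structure or functor argument is passed by hand; `OrdDict`, `insertSortedWith`, `intAscending` and `intDescending` are hypothetical names made up for this example.

```haskell
-- Type-class style: the compiler finds the Ord instance from the element type.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted x [] = [x]
insertSorted x (y : ys)
  | x <= y    = x : y : ys
  | otherwise = y : insertSorted x ys

-- Explicit-dictionary style: the ordering is an ordinary value passed by hand,
-- loosely like choosing which structure to hand to a functor in SML.
newtype OrdDict a = OrdDict { lessEq :: a -> a -> Bool }

insertSortedWith :: OrdDict a -> a -> [a] -> [a]
insertSortedWith _ x [] = [x]
insertSortedWith d x (y : ys)
  | lessEq d x y = x : y : ys
  | otherwise    = y : insertSortedWith d x ys

-- Two different orderings for the same type, which a single instance cannot give.
intAscending, intDescending :: OrdDict Int
intAscending  = OrdDict (<=)
intDescending = OrdDict (>=)

main :: IO ()
main = do
  print (insertSorted (3 :: Int) [1, 2, 4, 5])
  print (insertSortedWith intDescending 3 [5, 4, 2, 1])
```

The explicit version is noisier at every call site, which is the "not quite as elegant" part, but it lets you pick between several orderings for one type without wrapper types.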
I find there are two main problems with monads in Haskell:
- Monad is just an interface (with nice do-notation, to be sure), but it's elevated to an almost mythic status. This (a) causes Monad to be used in places which would be better without it (e.g. we could be more generic, like using Applicative; or more concrete by sticking to IO or Maybe, etc.) and (b) puts off newcomers to the language, who think they need to learn category theory or whatever. For this reason, I try to avoid phrases like "the IO monad" or "the Maybe monad" unless I'm specifically talking about their monadic join operation (just like I wouldn't talk about "the List monad" when discussing, say, string splitting)
- They don't compose. Monad on its own is a nice little abstraction, but it forces a tradeoff between narrowing down the scope of effects (e.g. with specific types like 'Stdin a', 'Stdout a', 'InEnv a', 'GenRec a', 'WithClock a', 'Random a', etc.) and avoiding the complexity of plugging all of those together. The listed advantages of SML are essentially one end of this spectrum: avoiding the complexity by ignoring the scope of effects; similar to sticking with the 'IO a' type in Haskell (although even there, it's nice that Haskell lets us distinguish between pure functions and effectful actions).
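As a small illustration of the first point (being more generic with Applicative, or more concrete with Maybe), here is a sketch using only `base`. The function names are hypothetical; `traverse` and `readMaybe` are the standard library functions.

```haskell
import Text.Read (readMaybe)

-- Applicative is enough here: parsing one element never depends on the
-- result of parsing another, so no Monad constraint is involved.
parseAll :: [String] -> Maybe [Int]
parseAll = traverse readMaybe

-- Division that fails on a zero divisor instead of throwing.
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide n d = Just (n `div` d)

-- Monadic sequencing is genuinely needed here: whether the division runs
-- depends on the values produced by the earlier steps. The type is kept
-- concrete (Maybe) rather than abstracted over an arbitrary monad.
parseThenDivide :: String -> String -> Maybe Int
parseThenDivide ns ds = do
  n <- readMaybe ns
  d <- readMaybe ds
  safeDivide n d

main :: IO ()
main = do
  print (parseAll ["1", "2", "3"])
  print (parseThenDivide "12" "4")
  print (parseThenDivide "12" "0")
```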
Haskell has some standard solutions to composing narrowly-scoped effects, like mtl, but I find them to be a complicated workaround to a self-imposed problem, rather than anything elegant. I still hold out some hope that algebraic effect systems can avoid this tradeoff, but retro-fitting them into a language can bring back the complexity they're supposed to avoid (e.g. I've really enjoyed using Haskell's polysemy library, but it requires a bunch of boilerplate, restrictions on variable names and TemplateHaskell shenanigans to work nicely).
Well said. This is exactly my gripe with Haskell monads. There are too many places where the hierarchy isn't uniform and there are a bunch of special monads like IO and List. It leads to the same ambiguous "is a" problem of complicated OO hierarchies and is one of the reasons they don't compose well.
The ironic thing is that these algebraic effect systems have a feel/control flow pattern that's quite similar to exception handling from the OO languages or interrupt handlers in low-level code.
It's much easier to just say "this piece of code is impure, it may do X, Y, and Z and if so then ..." than to try and shove everything ad-hoc into the abstract math tree. But then you lose the purity of your language and it's really awkward in a language whose primary concern is purity. That may be a reason why algebraic effects seem a bit more natural in OCaml.
I don't think there's a way to not use monads, because a ton of everyday things just work in a monadic way, things like lists, or statement sequences in presence of exceptions. I think it's wiser to admit and use these properties instead of ignoring them.
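Here is a minimal sketch of those two everyday cases, using only the Prelude. The function names and the age limits are made up for illustration; the point is just that the same do-notation covers "stop at the first error" and "try every element".

```haskell
-- Statement sequencing in the presence of failure: the first Left aborts the rest.
checkAge :: Int -> Either String Int
checkAge n
  | n < 0     = Left "negative age"
  | n > 150   = Left "implausible age"
  | otherwise = Right n

averageAge :: Int -> Int -> Either String Int
averageAge a b = do
  a' <- checkAge a
  b' <- checkAge b
  return ((a' + b') `div` 2)

-- Lists: the same notation now means "for each choice of a, b and c".
pythagorean :: Int -> [(Int, Int, Int)]
pythagorean n = do
  a <- [1 .. n]
  b <- [a .. n]
  c <- [b .. n]
  if a * a + b * b == c * c then return (a, b, c) else []

main :: IO ()
main = do
  print (averageAge 30 40)
  print (averageAge (-1) 40)
  print (pythagorean 13)
```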
Ignoring maths that underlie computation when writing software is like ignoring math that underlies mechanics when building houses: for some time you can get by, but bigger houses will tend to constantly fall or stand a bit askew, and the first woodpecker to fly by would ruin the civilization, just as the saying goes.
Uniqueness types were a nice idea indeed. I suspect linear types (or their derivative in Rust) do a very similar thing: data are shared XOR mutable.
I recently read a blog post by someone who worked as a carpenter for a summer. He said he disliked the saying “measure twice, cut once” because actual carpenters avoid measuring as much as possible! For example, to make a staircase, you could try using simple trig to calculate the lengths of the board, angle of stair cut outs, step widths, etc. But that would all be wasted effort because you can’t actually measure and cut wood to sufficient precision. Instead, you need to just get rough measurements and then use various “cheats” so the measurements don’t matter, like using a jig to cut the parallel boards at exactly the same length (whatever it turns out to be).
Analogy to monads is this: yes, there are mathematical formalisms that can describe complex systems. Everything can be described by math! That’s literally its one job. But will the formalisms actually help you when the job you’re doing has insufficient achievable precision (eg finding market fit for a website)?
I think that the metaphor leaks badly. Finding market fit is like deciding on the general shape of the staircase. Execution of whatever design still needs the stairway to be strong and reliable all the same.
Writing code is a much more precise activity than cutting wood. Using "cheats" can get you somewhere, but due to the precise nature of the machine logic, the contraption is guaranteed to bend and break where it does not precisely fit, and shaving off the offending 1/8" is usually not easy, even if possible.
> But will the formalisms actually help you when the job you’re doing has insufficient achievable precision (eg finding market fit for a website)?
In my experience, this is exactly the problem Haskell's core abstractions—monads among them—help with!
What do you need when you're trying to find product-market fit? Fast iteration. What do Haskell's type system and effect tracking (ie monads) help with? Making changes to your code quickly and with confidence.
1. Make clean, well-factored code the path of least resistance.
2. Provide high-level, reusable libraries that keep your code expressive and easy to read at a glance despite the static types and effect tracking.
3. Give you guardrails to change your code in systematic ways that you know will not break the logic in a wide range of ways.
If I had a problem where I'd need to go through dozens of iterations before finding a good solution—and if libraries were not a consideration—I would choose Haskell over, say, Python any day. And I say this as someone who writes a lot of Python professionally these days. (Curse library availability!)
Honestly, Haskell has a reputation of providing really complicated tools that make your code really safe—and I think that's totally backwards. Haskell isn't safe in the sense that you would want for, say, aeronautical applications; Haskell programs can go wrong in a lot of ways that you can't prevent without extensive testing or formal verification and, in practice, Haskell isn't substantially easier to formally verify than other languages. And, on the flipside, Haskell's abstractions really aren't complicated, they're just different (and abstract). Monads aren't interesting because they do a lot; they're interesting because they only do a pretty small amount in a way that captures repeating patterns across a really wide range of types that we work with all the time (from promises to lists to nullable values).
Instead, what Haskell does is provide simple tools that do a "pretty good" job of catching common errors, the kind of bugs that waste a lot of testing and debugging time in run-of-the-mill Python codebases. This makes iteration faster rather than slower because you get feedback on whole classes of bugs immediately from your code, without needing to catch the bug in your tests or spot the bug in production.
As a bit of an aside, it's interesting that the CS (Haskell, mainly) descriptions of a Monad are much more complicated than the math.
Knowing a little bit of maths, but not much category theory, the wiki entry for Monads(Category Theory) is pretty clear. First sentence: A Monad is an endofunctor (a functor mapping a category to itself), together with two natural transformations required to fulfill certain coherence conditions. Easy.
Knowing a bit of programming, but not much Haskell, reading the entry for Monad (functional programming) or any blog post titled "A Monad is like a ...", it almost seems as if the author is more confused about what a Monad is than me. The first sentences of the Wikipedia article for example are a word-salad. With dressing.
From an outsider's perspective, it's almost as if a monad in functional programming is not a 1:1 translation of the straightforward definition of category theory, leading to an overall sense of confusion.
The formal definitions are straightforward enough, but the definitions alone don't really motivate themselves. A lot of mathematical maturity is about recognizing that a good definition gives a lot more than is immediately apparent. Someone without that experience will want to fully understand the definition, and fairly so -- but a plain reading defies that understanding. That is objectively frustrating.
> As a bit of an aside, it's interesting that the CS (Haskell, mainly) descriptions of a Monad are much more complicated than the math.
I actually do agree with this, though. I feel like monads are much simpler when presented via "join" (aka "flatten") rather than "bind"; and likewise with applicative functors via monoidal product (which I call "par") rather than "ap". "bind" and "ap" are really compounds of "join" and "par" with the underlying functorial "map". That makes them syntactically convenient, but pedagogically they're a bit of a nightmare. It's a lot easier to think about a structural change than applying some arbitrary computation.
Let's assume the reader knows about "map". Examples abound; it's really not hard to find a huge number of functors in the wild, even in imperative programs. In short, "map" lets us take one value to another, within some context.
Applicative functors let us take two values, `f a` and `f b`, and produce a single `f (a, b)`. In other words, if we have two values in separate contexts (of the same kind), we can merge them together if the context is applicative.
Monads let us take a value `f (f a)` and produce an `f a`. In other words, if we have a value in a context in a context, we can merge the two contexts together.
Applicative "ap", `f (a -> b) -> f a -> f b`, is "par" followed by "map". We merge `(f (a -> b), f a)` to get `f (a -> b, a)`, then map over the pair and apply the function to its argument.
Monadic "bind", `(a -> f b) -> f a -> f b`, is "map" followed by "flatten". We map the given function over `f a` to get an `f (f b)`, then flatten to get our final `f b`.
It's a lot easier to think about these things when you don't have a higher-order function argument being thrown around.
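The decomposition described above can be written down directly. This is a sketch with hypothetical names (`par`, `ap'`, `bind'`); Haskell's `Applicative` class doesn't expose "par" as a method, so it is recovered here from the standard operators, and `join` comes from `Control.Monad`.

```haskell
import Control.Monad (join)

-- "par": merge two values in separate contexts into one context holding a pair.
-- (Recovered from <$> and <*> because the class has no such method.)
par :: Applicative f => f a -> f b -> f (a, b)
par fa fb = (,) <$> fa <*> fb

-- "ap" is "par" followed by "map": pair up, then apply function to argument.
ap' :: Applicative f => f (a -> b) -> f a -> f b
ap' ff fa = fmap (\(f, a) -> f a) (par ff fa)

-- "bind" is "map" followed by "flatten" (join).
bind' :: Monad f => (a -> f b) -> f a -> f b
bind' k fa = join (fmap k fa)

main :: IO ()
main = do
  print (par (Just 1) (Just "a"))
  print (join [[1, 2], [3]])
  print (ap' [(+ 1), (* 2)] [10, 20])
  print (bind' (\x -> [x, x * 10]) [1, 2])
```

In this sketch `ap'` and `bind'` behave like the standard `<*>` and `=<<`, which is the sense in which the standard operations are compounds of the structural ones.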
The problem isn't with the mathematical concept of monads, it's that Monad is a typeclass in a hierarchy along with other totally abstract category-theoretic classes introduced at different stages, and people use them with varying levels of knowledge and ignorance, so things are placed arbitrarily and there are a bunch of ambiguities. Just look at this [0] mess. Is that really what we want?
It's used pervasively in low-level code because the idealism of monads just doesn't cut it. In my opinion, this proves that the pragmatism of side effects is a necessary evil for actually getting things done in a performant way.
Lazy evaluation has the same kind of issues. Most humans don't think that way, so performance suffers. This may be a universal problem as in my experience, Haskell doesn't have good tooling to help with this issue.
Always immutable is another large issue. Yes, there's a ton of safety in immutability and it's the right tool for MOST code. But compilers aren't perfect, and there seems to be an endless stream of situations where the compiler can't figure out if it is safe to mutate "immutable" data to gain performance. For the foreseeable future, the ability to mutate can have huge performance dividends.
Finally, Haskell and its libraries are prone to rather academic programming styles and techniques. These are amazing and beautiful. They also can be hard to grok even if you know the math. When you consider that the overwhelming majority of programmers don't know the math, it seems plain that these constructs are a detriment to pragmatic, non-academic usage.
> It's used pervasively in low-level code because the idealism of monads just doesn't cut it. In my opinion, this proves that the pragmatism of side effects is a necessary evil for actually getting things done in a performant way.
It seems like you aren't very familiar with the ways Haskell programmers deal with side effects and mutation. `unsafePerformIO` is sometimes needed, and there are a few common idiomatic ways to use it, but there are usually better options. What at first looks like an impenetrable wall between the IO monad and pure code is actually surprisingly permeable, just not (usually) in ways that violate expected semantics. For instance, you can read a file into a string in the IO monad and pass the string into pure code to process it. The string is lazy, so the pure code actually triggers the file to be read as it's consumed.
Another escape hatch is the ST Monad. It allows you to run imperative computations (i.e. use mutable variables and arrays) within pure code. Since the ST computations are deterministic, they don't violate any important guarantees about pure code. You can't read and write files from the ST Monad, but if you want to do array-based quicksort or something similar it's available.
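A minimal sketch of what that looks like, using `Control.Monad.ST` and `Data.STRef` from `base` (`sumSquares` is just an illustrative name). The function has an ordinary pure type, but inside it updates a mutable accumulator in place.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Pure from the outside, imperative on the inside: the mutable reference
-- cannot escape runST, so callers can only ever observe the final result.
sumSquares :: [Int] -> Int
sumSquares xs = runST $ do
  acc <- newSTRef 0                               -- mutable accumulator
  mapM_ (\x -> modifySTRef' acc (+ (x * x))) xs   -- strict in-place update
  readSTRef acc

main :: IO ()
main = print (sumSquares [1, 2, 3])
```

The same pattern extends to mutable arrays for things like in-place quicksort, though those live in the separate `array` package rather than `base`.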
Some aspects of Haskell are hard to work with. You're right that laziness introduced some performance issues that can be tedious to fix, and some of the libraries are hard to use. However, it's a perfectly reasonable tool for many general-purpose programming applications. It's not a replacement for C, but you could say the same thing about C# or Java or any other language with a garbage collector.
Wait, I thought that lazy io (e.g. getting a lazy string back from "reading" a file, which triggers subsequent reads when you access it) was widely considered bad and a mistake.
This depends very much on the context. Using something like `readFile` without carefully exhausting it is absolutely a mistake in a long lived application or a high volume web server, and in a setting like that reaching for that kind of an interface and hoping future modifications preserve the "reads to exhaustion in reasonable time" is questionable at best.
On the other hand, in a program that handles a small number of files and doesn't live long after file access anyway (say a small script to grab a couple of things, crunch a few numbers, and throw the result at pandoc), there's nothing wrong with lazy IO and it can be quite convenient.
Lazy IO is really useful for commands which stream data over stdio, e.g. this is a really useful template, where 'go' is a pure function for producing a stdout string from a stdin string:
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy.Char8 as BS
main :: IO ()
main = BS.interact go
go :: BS.ByteString -> BS.ByteString
go input = input -- Placeholder (echoes stdin); generate output here
If you don't mind Haskell's default String implementation, then it's just:
main :: IO ()
main = interact go
go :: String -> String
go input = input -- Placeholder (echoes stdin); generate output here
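To show the template with something filled in, here is a sketch where `go` numbers each input line and upper-cases it. The transformation itself is an arbitrary choice made for the example; only `interact`, `lines`, `unlines` and `toUpper` are standard.

```haskell
import Data.Char (toUpper)

-- Pure stdin-to-stdout transformation: prefix each line with its number
-- and upper-case its contents.
go :: String -> String
go = unlines . zipWith label [1 :: Int ..] . lines
  where
    label n l = show n ++ ": " ++ map toUpper l

main :: IO ()
main = interact go
```

Because `lines` and `unlines` work lazily, output for early lines can be produced before later input has arrived, which is what makes this style convenient for streaming filters.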
Yes, that's true. I thought about mentioning that in my comment, but it seemed like a bit of a digression and it was already getting a bit wordy.
Anyways, the problem is that you might open a file, read it into a string, close the file, and then pass the string into some pure function. The problem is that the file was closed before it was lazily read, and so the string contents don't get populated correctly; you get the empty string instead, or whatever was lazily read before the file was closed.
That was a design oversight from the early days. If you know about it, it's pretty easy to work around it in simple applications. There are some more modern libraries for doing file IO in a safer way, but I haven't used them and don't really know what the details are.
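One common workaround, sketched with functions from `base` only, is to force the whole string while the handle is still open. `readFileStrict` is a hypothetical helper name and `example.txt` is just a scratch file for the demo; recent versions of `base` also ship strict variants such as `readFile'` in `System.IO`, if your compiler is new enough.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read the entire file before withFile closes the handle.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h          -- still lazy at this point
    _ <- evaluate (length contents)     -- walking the string forces the read now
    return contents

main :: IO ()
main = do
  writeFile "example.txt" "line one\nline two\n"
  text <- readFileStrict "example.txt"
  putStr text
```

Without the `evaluate` line, the string would be returned unread and the handle closed behind it, which is the failure mode described above.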
Reasoning about Haskell performance is hard because it is a lazy language and the compiler pulls a lot of tricks. In SML, code is executed in some sense roughly in the order it is written, whereas in Haskell it is very hard to know when something will be executed.
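A standard small example of this, assuming nothing beyond `base`: two folds that compute the same sum but evaluate in different orders. The function names are illustrative, and how much the difference matters in practice depends on optimisation settings, since GHC's strictness analysis can sometimes rescue the lazy version.

```haskell
import Data.List (foldl')

-- Lazy left fold: without optimisation this first builds a long chain of
-- unevaluated additions and only collapses it when the result is demanded.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step, so the work
-- happens roughly in the order the code reads.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- The flip side of laziness: only the demanded part of an infinite list runs.
firstSquares :: Int -> [Int]
firstSquares n = take n (map (^ 2) [1 ..])

main :: IO ()
main = do
  print (sumLazy [1 .. 100])
  print (sumStrict [1 .. 100])
  print (firstSquares 3)
```

Both sums give the same answer; the difference is in when the additions happen and how much memory is held in the meantime, which is exactly what is hard to see from the source.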
They're very similar. Some people prefer Standard ML's syntax, though I actually prefer OCaml's (most likely just a result of me learning OCaml first).
The module systems have some small differences, but are very similar (I believe OCaml's was inspired by SML's).
Standard ML is also standardized (hence the name), and there are multiple implementations, whereas OCaml is basically defined by its single implementation.
But OCaml does have a larger ecosystem, Opam, more features (first-class modules, polymorphic variants, etc.).
Rust's performance is almost certainly much better, with a larger library ecosystem.
(Common) Lisp has macros, which allow me to implement pattern-matching; an even simpler BNF (as short as one line, depending on how you define it); dynamic typing, which makes generics unnecessary, and for all the hate that it receives is regularly deployed to production systems on large scales; tooling that is almost certainly better than SML's; and a condition system, which beats any other type of error-handling system, bar none (meaning that I can make my programs more robust than yours).
Rust is very much an ML variant. It's all the things you claim and probably more. It's also at least an order of magnitude more complex. If you know Rust, you'll probably find SML super easy to learn and refreshingly easy to code.
People gravitating toward SML likely want a few things: easy to learn, simple infix syntax with functional support, compile-time static type checks, multithreaded, very fast without a drastic recode.
Common Lisp libraries are better, but not drastically so (well, I've never had 5 grand to throw at LispWorks, so I can't comment there). CL is more complex to learn (most people can probably learn the entire SML language in the time it takes to master the loop macro). Macros are powerful, but mean that even if you can find a Lisp dev, it'll take a long time for them to learn your custom variant of the language.
CL offers unofficial threading, but last I checked it was much harder than the SML solutions. CL is pretty fast out of the box, but if you ever need peak performance, the code changes a lot and can become very finicky (though CL makes seamlessly hiding those bits easier than most languages). Even at its most optimized, I don't know that it can match MLton in performance or memory usage.
Dynamic vs static typing is a battle that's all but over. When you have devs coming and going on a team, types make transitions easier. Loads of dynamic languages have started adding them and even CL sort of does.
CL has loads of interesting features from well known ones like macros/reader macros or metaobject protocol to less known ones like optional dynamic scoping.
I'm just saying why I enjoy SML (not trying to argue that it's the end all, be all of programming). There are other great languages too. Do what you love.
Your words betray your ignorance. Common Lisp has a type system. Have you ever written it? Do you actually know anything about it? Moreover, macros are a plus - you clearly have not actually used Lisp macros, which are categorically different than macros in other languages.
> * Doesn't have parantheses and the annoying prefix notation
Spelling mistake. Additionally, the parenthesized syntax is a choice, and one that you get used to quickly. Subjective, and therefore not valid as one of the items you listed.
> * Your knowledge of SML can be translated easily to OCaml, Java, Scala, which are more or less part of the same family
Meanwhile, knowledge of Common Lisp transfers to almost every dynamically-typed language ever made, and a good many static ones - including Java and Scala - because Lisp influenced all of those languages. That is, Lisp knowledge transfers to far more languages than SML knowledge does.
Ok, I don't want to start a language war again, when I say types, I mean the modern reference to the word that it has a static type system. Indeed, Lisp languages are strongly typed but I'm sorry, in the real world it doesn't help me, I need types, statically checked types.
I don't know if Lisp, the language, influenced Java and Scala so much; it was more about its runtime, the garbage collector, etc. The only thing that I liked about Lisp is the composability, which, coupled with objects from Simula, created my favorite industrial languages, Java and TypeScript. I'm passionate about parsers, compilers and transformers, and this is why I have a soft spot for SML and OCaml. I simply dislike Lisp, interesting language but not for me.
> when I say types, I mean the modern reference to the word that it has a static type system
Can you show me a significant body of literature or a whole community that uses "types" to mean "a strong type system"? Because I've never heard that aliasing done before.
> I don't know if Lisp, the language, influenced Java and Scala so much, it was more about its runtime, the garbage collector, etc.
Lisp pioneered lambdas and first-class functions, in addition to garbage collection, all of which were adopted by Scala and Java later.
> I simply dislike Lisp, interesting language but not for me.
I understand! I have things that I like and dislike, too. However, it's extremely frustrating when you ask for an analytical comparison between several languages, and some random person lacking knowledge on at least one of the languages (I suspect you don't know Rust either) comes in and, instead of actually providing relevant information, gives fallacies and opinions.
All languages associate types to expressions. But, all dynamic languages can have only one type for expressions. Indeed, dynamic languages associate types to values all the time but there is no static distinction. What people from dynamic languages are calling "types" are in fact runtime tags. What makes Lisp strongly typed are those runtime tags that prevent implicit conversions at runtime, something that JavaScript lacks.
In Common Lisp objects are values, so runtime tags are attached to data objects => which are actually values => which are stored in variables => which are not typed.
You can make it optionally typed at compile time to help the compiler but that means to annotate your code with 'type' keyword and it's verbose. The compiler does not enforce it.
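The "one static type, runtime tags" view can be imitated inside a statically typed language, which may make the distinction easier to see. This is only an analogy, not a claim about how any Common Lisp implementation works; it uses `Data.Dynamic` from `base`, and `mixedBag`, `ints` and `strings` are illustrative names.

```haskell
import Data.Dynamic (Dynamic, dynTypeRep, fromDynamic, toDyn)
import Data.Maybe (mapMaybe)

-- Statically, every element has the single type Dynamic; what each value
-- "really" is lives in a tag that is only inspected at run time.
mixedBag :: [Dynamic]
mixedBag = [toDyn (1 :: Int), toDyn "two", toDyn (3 :: Int), toDyn True]

-- Checking the tag can fail at run time, hence the Maybe in fromDynamic;
-- mapMaybe keeps only the elements whose tag matches the requested type.
ints :: [Int]
ints = mapMaybe fromDynamic mixedBag

strings :: [String]
strings = mapMaybe fromDynamic mixedBag

main :: IO ()
main = do
  print (map dynTypeRep mixedBag)  -- the runtime tags
  print ints
  print strings
```

The compiler accepts the heterogeneous list because it only sees `Dynamic`; any mismatch shows up as a `Nothing` while the program runs rather than as a compile error.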
Again, I've never seen this distinction made before. Please show me a body of literature that claims that dynamically typed languages do not have a type system - your claims on their own do not convince me.
> Your words betray your ignorance. Common Lisp has a type system. Have you ever written it? Do you actually know anything about it? Moreover, macros are a plus - you clearly have not actually used Lisp macros, [...]
This isn't really okay. It's fine if you disagree, but I don't get why such aggression is needed.
I specifically avoided insulting them. "Ignorance" is not an insult; it's a description of your knowledge on a topic, and if it isn't the right word to describe such a blatant and fundamental misunderstanding, then what is?
I first came across Standard ML in Dan Grossman's Programming Language course on Coursera[1]. After previously trying Haskell and Scala, this is where the benefits of functional programming and static typing really clicked for me. Really wish the language had caught on outside of academia so I had an excuse to use it more. In addition to the implementations mentioned in the blog post, The University of Chicago created Manticore[2] and Tohoku University in Japan created SML#[3]
Thanks for sharing Manticore! I didn't mention SML# because most of the documentation is in Japanese and it's hard to find recent info. There is an unofficial fork on Github but it explicitly states that it is unofficial so... not something I'd recommend to a general audience.
Edit: never mind! I checked out their docs again and they seem to be decently translated. I'm still not sure where their code is actually hosted, just that you can download releases. So I added it to the list of major implementations!
My favourite thing about SML is that it's a relatively idiot-proof language. It has something of Python's "one obvious way to do it" - a good chance of being able to understand someone else's code. This is partly because the language is small and features many sensible decisions, and partly because it isn't particularly malleable at the syntactic level (compared to your Lisps or Haskell).
The tradeoff is that it can be more laborious than other functional languages. My code would be shorter if it had record updating, a single-character lambda syntax, contextual operators for real or int arithmetic, etc. Not only does it lack syntactic sugar for associative containers, it doesn't even have them in the standard library. Nor sorting, nor random numbers, nor complex numbers or matrices, nor uniform type-to-string formatting or string interpolation. Some of these holes can be filled by libraries, but then library choice becomes a problem.
I can live with an awful lot of that though, given the relative clarity of the language, the really practical module system, and the delightful feeling you get when using a language standardised over 20 years ago that, once written, your code can stay written.
Here is a paper[1] written by Andreas Rossberg, specification author of WebAssembly, that discusses the defects in the definition of Standard ML. I believe he tried to address many of these issues with Alice ML[2].
My biggest personal gripe is no nested functors. SML/NJ does support this as an extension but since no other implementation does you effectively can't use it if you want decent performance (that SML/NJ doesn't really give).
I'm also pretty jealous of OCaml's modular implicits, just because it saves you typing the module name before an operator (i.e. MyModule.+ vs just + implied by context).
It also kinda sucks that operator precedence is defined at the module level and cannot be exported. You have to always redefine a library's operator precedence yourself.
They haven’t been merged. One feature that gives some brevity is type directed constructor disambiguation which allows you to omit the module name when accessing record fields, matching values, or constructing values, provided the type can be inferred in a certain directional way (rather than the more general unification based type inference which is used by the type checker)
I've recently picked it up, reading "ML for the working programmer". The language is fairly simple, everything just fits. However:
1. Both Vim and Emacs are horribly annoying with their automatic indentation for SML. There is a lot of fighting against the editor in this department. In Emacs e.g. you have to delete whitespace all the time, because otherwise you'd have top-level function definitions shifted 80 characters to the right.
2. I've used Poly/ML and SML/NJ so far. Both of them are purely interactive, meaning I can't just compile a program into ELF and ship it somewhere else without the compiler. That makes it a no-go for me for real-world use.
3. The interactive modes of Poly/ML and SML/NJ don't support readline shortcuts. They are the most cumbersome REPLs I've ever used.
4. Inline type declarations (as opposed to Haskell-style type declarations on a separate line) are very noisy - they make reading the code harder. Omitting them (which is the rule in SML in practice) leads to hard-to-decipher compilation errors when you write a new piece of code and you made an error somewhere which confused the type-inference about your intentions. Suddenly forgetting about a word or a set of parentheses in one function results in errors in another perfectly-good function. It's the horror of C++ templates all over again.
Not a huge fan of ML for the Working Programmer, personally. I'd love to see (or one day write) the equivalent of Practical Common Lisp (which itself needs an update at this point) because MftWP is sooo dated. But I can see how it's a decent enough intro.
Regarding your points:
1. Yeah editor support sucks. I normally edit in text mode with my own minimal keyword highlighting or ocaml-mode.
2. Poly/ML can definitely generate binaries! Most distros ship with `polyc` that will build the binary for you. But this is just a shell script around opening the REPL and calling some dump image function (like how you build a binary on SBCL). MLton and Poly/ML definitely allow you to build binaries. I don't know about SML/NJ.
3. For sure a pain. I use rlwrap [0] to work around this, which is ultimately simple enough!
4. Interesting! I personally find Haskell-style decorations so much more a pain since they're not inline. SML is very much like other major languages in the way it does inline types (TypeScript, Go, C#, etc.).
Just want to push back a bit and leave a dissenting view on ML for the Working Programmer. If one actually works their way through the text, by the end they will have written a basic interpreter for the lambda calculus, and a small tactics oriented theorem prover, either one of which is highly illuminating to someone who hasn't done this before. The text may not educate a reader on dependent type theory or similar developments but I'm struggling to understand how it is otherwise "dated".
Other than that, thank you for keeping SML in front of people. I for one appreciate it; it's probably my favorite language these days.
I see your point. I guess it's just the title that has turned me off even after reading the book.
It doesn't seem to send the right message these days if the idea is that working programmers are building lambda calculus interpreters or theorem provers? Even interpreting a lisp would seem more practical.
I think Practical Common Lisp was more on the right track for content, but today I'd focus more explicitly on language design or backend system design.
Andrew Appel's book does already fill the language design slot though nicely.
I absolutely love ML for the Working Programmer, but I think you have to see it as a philosophical text as much as a tutorial.
To these points, I'd say
1. I find Emacs sml-mode is just about up to it. I do have complaints - I wish it would pull back the indentation more in lines that continue an expression, it doesn't handle multiple "where" clauses elegantly, it misaligns anonymous function alternatives - but every time I consider trying to fix them I decide they don't quite upset me enough. It's a slight pity though because I strongly believe a language should be auto-indentable (life's too short to indent code yourself), and SML is, just not quite with the existing mode.
2. I like to use Poly/ML for automatic builds during development and MLton for "production" builds - both producing executables. There are still problems on Windows, which doesn't have a properly native MLton port - the existing one uses MinGW which is ok-ish but not what I would prefer. (MLton has a code generator that produces C, so the limitation is that the runtime hasn't been ported rather than with the compiler itself.)
3. Agree, "rlwrap poly"
4. I like inline type decorations, but I also like to omit them most of the time. I think you do get some feel for when it's a good idea to add them, to clarify things for the call site or check your own intuition about the deduced types. Module boundaries (with signatures) also form a natural firebreak for out-of-control type errors.
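To make point 4 concrete, here is a small sketch of both ideas: an inline annotation where inference needs a hint, and a signature acting as the firebreak. The names `area`, `STACK` and `ListStack` are invented for this example.

```sml
(* Without the annotations, w * h would default to int * int -> int;
   the inline types pick the real overload instead. *)
fun area (w : real, h : real) : real = w * h

(* The signature states the types once, at the module boundary. *)
signature STACK =
sig
  type 'a stack
  val empty : 'a stack
  val push  : 'a * 'a stack -> 'a stack
  val pop   : 'a stack -> ('a * 'a stack) option
end

(* Inside the structure no annotations are needed. The opaque
   ascription (:>) checks the inferred types against STACK, so a
   mistake in push or pop is reported here rather than in client code. *)
structure ListStack :> STACK =
struct
  type 'a stack = 'a list
  val empty = []
  fun push (x, s) = x :: s
  fun pop [] = NONE
    | pop (x :: s) = SOME (x, s)
end
```

The bodies stay free of type noise, and any disagreement between what was inferred and what was intended surfaces at the ascription.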
> 2. I like to use Poly/ML for automatic builds during development and MLton for "production" builds
Do you have/use any dependencies? Like an HTTP/2 (or TLS-capable) web server, a database client or a GUI library? If you do, how does that work with two implementations? If not, what kind of programs do you write and what problems are you solving?
> 1. Both Vim and Emacs are horribly annoying with their automatic indentation for SML. There is a lot of fighting against the editor in this department. In Emacs e.g. you have to delete whitespace all the time, because otherwise you'd have top-level function definitions shifted 80 characters to the right.
I've used sml-mode in Emacs for years, and I've never had the problem you're describing. On the contrary, I find sml-mode to be quite adequate for my needs. Can you elaborate a bit on what's causing you problems?
Edit: Ah, I see what you're saying. After finishing a function definition and starting a new one, the cursor is wildly indented, that's true. But if you just type `fun` and press tab, sml-mode automatically indents the definition correctly.
> 2. I've used Poly/ML and SML/NJ so far. Both of them are purely interactive, meaning I can't just compile a program into ELF and ship it somewhere else without the compiler. That makes it a no-go for me for real-world use.
I can recommend MoSML for interactive development. It can also produce compiled binaries. When I want to produce efficient compiled code, I usually use MLton.
> 3. The interactive modes of Poly/ML and SML/NJ don't support readline shortcuts. They are the most cumbersome REPLs I've ever used.
Use the REPL in emacs, it's excellent.
> 4. Inline type declarations (as opposed to Haskell-style type declarations on a separate line) are very noisy - they make reading the code harder. Omitting them (which is the rule in SML in practice) leads to hard-to-decipher compilation errors when you write a new piece of code and you made an error somewhere which confused the type-inference about your intentions. Suddenly forgetting about a word or a set of parentheses in one function results in errors in another perfectly-good function. It's the horror of C++ templates all over again.
I agree, this is a pain point. I usually leave type declarations in comments before the definitions, but that of course has obvious drawbacks.
I started learning SML earlier this year - it's quite a nice language and it's a shame that the ecosystem isn't better.
For fun and educational purposes, I began working on a Standard ML compiler [1] and VSCode extension in Rust as well, with the end goal being a pseudo-clone of MLton. Currently taking a short break from it to work on some other stuff, but I'm mostly done with monomorphization.
I used Standard ML for a research project in college and really fell in love with it. Yes, at times it feels a bit old, but nothing felt bad. One of the creators of Standard ML (Harper, I think) wrote a really beautiful textbook, Introduction to Programming in Standard ML, which I think is the best textbook on programming I've ever come across. I'm glad that Standard ML is still kicking.
So I use F# and plan to look at Rust in 2021 for my "compile to native code" toolset, since .NET is not everywhere and .NET Native appears to be moving very slowly toward releasing F#-to-native-code support.
Should I consider taking a look at some description of Standard ML instead of Rust? It seems like Rust has pattern matching and immutability, which match F# pretty well?
My use cases are mostly desktop class machines, with optional extensions to single board computers and maybe, maybe, mobile devices of some kind. Utility programs, soap and rest web services, guis for the rest apis etc.
> So I use F# and plan to look at Rust in 2021 for my "compile to native code" toolset, since .NET is not everywhere and .NET Native appears to be moving very slowly toward releasing F#-to-native-code support.
I have projects that generate native binaries on Linux. Here's an example Makefile [0].
> Should I consider taking a look at some description of Standard ML instead of Rust? It seems like Rust has pattern matching and immutability, which match F# pretty well?
Library and tooling support in F# and Rust is significantly more developed than in Standard ML.
To me the best use case for Standard ML today is in language development.
If its ecosystem were more mature it would be a great choice for application development. But today that's just not the case unless you use a version like Morel that is backed by the JVM (and its ecosystem).
Which SML implementation has the best editor tooling?
The article mentions that certain SML implementations have seen better support for parallel and concurrent programming than OCaml, but Merlin and ocamlformat for OCaml are great as far as IDE tooling goes. Which SML implementation has the best tooling for things like go to definition, reveal type at cursor, and format file?
None of them have invested very much in tooling as far as I know. There are some small sml-modes for emacs or plugins for vim but they only do basic syntax highlighting.
One of the biggest missing things for SML tooling is parsers for SML written in SML. They tend to be written in the implementation language and not exposed as a library. So you don't see SML formatters or documentation generators so much.
There have been a few attempts at this but they haven't seriously caught on.
I am designing a language based on OCaml/Standard ML, what are some of the problems present in these languages that can be fixed by a new design without being constrained by a spec or backwards compatibility?
Avoid Ocaml's syntax soup. I'd also avoid making it easy to bolt on a bunch of syntactic extensions ala camlp4
Keep your standard library standard.
Module Typeclasses
One, immutable string type. Make it UTF-32 by default (you can always store in a smaller format behind the scenes and extend). Make it multiline. Allow variable interpolation (typeclasses should help).
SML structural typing and anonymous records are better than nominal typing.
Skip on syntax bloat like named parameters.
Overload your math operators instead of having special float operators like Ocaml (once again, typeclasses make this much easier).
Rethink imports. F# requires you to add tons of files in the correct order to an XML. Ocaml imports all the things from the folder and blows up if you use the same module name twice. SML needs to standardize `use`. I'd recommend something like JS static import statements.
If you build good documentation, good compiler errors and good tooling, more people will come to the project and help make a better compiler (the reverse of this is the big issue with SML right now, in my opinion).
- you want `deriving`
- you might want to look at staging (MetaOCaml)
- you might want to look at resource management (uniqueness typing, for example)
- you might want to look at parallelism and maybe you can even release it before Ocaml/multicore arrives ;)
I'd no knowledge of SML until I came across the Programming Languages course by Washington Uni on Coursera. I tried learning Scala first but then never got anywhere. This course was amazing. Unfortunately, didn't have enough time to dedicate to it. Thanks to this post, I'm excited to try again. :)
I learned Common Lisp in high school and took a course called "Discrete Math and Functional Programming"[0] my freshman year of college (the prof wrote the textbook). To me it felt like a math course (covering predicate logic, sets, graphs, and eventually the lambda calculus. It was also the first time I had to write proofs outside of geometry class), but all of the teaching was paired with examples in ML. For example, half the homework one day would be writing proofs about powersets, and the other half would be writing ML functions to create and manipulate powersets.
It was a really neat idea, and it felt like ML got out of the way and just let me program without having painful syntax or semantics. That said, my biggest ML programs would be five functions spanning ~30 lines loaded into an smlnj interpreter. I never got to deal with functors and modules and who knows what else, and am sorry that I don't know how to "program in the large" in ML. I'm now a professional Clojure dev and would love to see how ML does things.
I own a copy of ML for the Working Programmer (which I see @eatonphil is panning below) and the ML compiler book by Appel, neither of which I've read. Maybe I'll make the time one of these months…it'd be nice to have a Motivating Project though.
It was the first language that was taught in computer science at the University of Copenhagen until around 2015, then they changed that language to F#.
Judging from the activity on Stack Exchange, it seems to be pretty dead. Maybe a few universities around the world still teach it? It can't be more than a handful, I guess.
https://stackoverflow.com/tags/sml/topusers
As someone reasonably versed in OCaml who knows nothing about SML, what are the primary differences? Given that OCaml has much better tooling and more of a community, what still draws people to SML?
SML records are structurally typed and may be anonymous which makes a lot of things easier to write. I've heard Ocaml has been attempting to tack that on, but that leads to the next point of Ocaml syntax being a complete mess. Ocaml needs keyword arguments, but in SML, structural typing gives named arguments without all the extra syntax.
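A quick sketch of what that looks like in practice. The function and field names here (`fullName`, `first`, `last`, `point`) are made up for the example.

```sml
(* No record type has to be declared up front: the pattern itself
   names the fields, which gives named-argument call sites for free. *)
fun fullName {first, last} = first ^ " " ^ last

(* Field order at the call site does not matter. *)
val name = fullName {last = "Milner", first = "Robin"}

(* Anonymous record values work the same way; #x and #y are the
   built-in field selectors. *)
val point = {x = 3, y = 4}
val sum = #x point + #y point
```

Two records with the same field names and types are the same type, wherever they were written.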
I find the `ref` syntax of SML much nicer to use than the mutable record syntax in Ocaml.
SML doesn't have all the dotted math operators for floats.
SML strings are immutable which lends itself to a lot of potential optimizations (and there's always byte arrays if you actually need to mutate).
Ocaml's top-level `let` vs `let ... in` expressions are annoying to me.
SML has a standard. "The implementation is the spec" in my observation always leads to problems.
My wishlist for SML:
FIX USE COMPATIBILITY. It's intentionally not specified by the standard. This makes portable code very hard. I'd love to see an approach more like JS, but without dynamic imports (useless) and with support for something like GOPATH plus maybe the ability to recognize and import from URLs (especially .git and .sml).
JS style template strings with interpolation (the type system is at least smart enough to convert primitives to strings) and multi-line capabilities. This would also be an easy backward-compatible way to add support for Unicode. Guarantee the ability to implement them as UTF-32 with the knowledge that an advanced compiler could choose to internally represent them as UTF-16 or even Latin1 to save space (JS implementations have optimized to convert their ropes from UCS-2 to Latin1 when possible with huge memory savings as even Chinese sites are 90+% ASCII).
Module Typeclasses should keep typeclasses from happening everywhere (looking at you Haskell) while still allowing them to be used for more than equality.
Unofficial first-class functor support needs to be made official.
Pick one of the 4-5 slightly different concurrency models and standardize it.
Things I want that are in SuccessorML:
Guards, "OR" shorthand, optional leading pipe in matches
> I find the `ref` syntax of SML much nicer to use than the mutable record syntax in Ocaml.
Maybe I'm missing something in SML, but OCaml has `ref` as well and it works just like SML's?
let x = ref 0
let () = x := 12
let () = print_int !x
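For comparison, a sketch of the same three steps in Standard ML. The only real differences are `val` instead of `let` and the explicit conversion before printing, since the Basis `print` takes a string.

```sml
val x = ref 0                               (* allocate a mutable cell *)
val () = x := 12                            (* assignment *)
val () = print (Int.toString (!x) ^ "\n")   (* dereference with ! *)
```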
> SML strings are immutable which lends itself to a lot of potential optimizations (and there's always byte arrays if you actually need to mutate).
OCaml's strings are also immutable. They have been immutable by default since 2017 and prior to that one could opt into this behavior via a compiler flag.
You are correct that Ocaml has a ref type (though I believe it is actually a special case of a mutable record with one field). Most stuff I've run into uses mutable records more than refs (granted, I'm hardly deep into the Ocaml way of doing things). There is a small difference between an immutable record of which one property is a reference to changing data and a mutable record where the record itself changes. I prefer the first though I realize this is mostly preference.
2017 was just 3 years ago. Loads of Ocaml stuff rely on versions much, much older than that. In any case, the idea of outright changing a formerly mutable structure to an immutable one would be unthinkable in most languages due to all the breakages it is likely to cause.
> You are correct that Ocaml has a ref type (though I believe it is actually a special case of a mutable record with one field).
> There is a small difference between an immutable record of which one property is a reference to changing data and a mutable record where the record itself changes. I prefer the first though I realize this is mostly preference.
In OCaml ref is defined as
type nonrec 'a ref = 'a ref = { mutable contents : 'a }
so it is an immutable record which contains one item which is mutable.
> In any case, the idea of outright changing a formerly mutable structure to an immutable one would be unthinkable in most languages due to all the breakages it is likely to cause.
I agree with you. I started using OCaml in 2018, but I believe many Linux distributions stayed on OCaml 4.05 (the release before safe-string became the default) for a while because of breakages.
> Module Typeclasses should keep typeclasses from happening everywhere (looking at you Haskell) while still allowing them to be used for more than equality.
I was under the impression that Haskell's typeclasses plus a couple of the most common extensions, minus the namespace pollution, are pretty much equivalent to ML modules+functors, so could you elaborate on how exactly that should work?
Yes and no. They can do the same thing, but with varying amounts of effort. SML has what is essentially a hard-coded typeclass for equality, with special syntax for it as well (two single quotes instead of one). There can also be issues creating and overloading operators. There's still ongoing discussion about modular typeclasses for SuccessorML.
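To show what the two-quote syntax means, a small sketch (the function name `member` is invented for the example):

```sml
(* ''a ranges only over types that admit equality, so using = on
   values of that type is allowed inside the function. *)
fun member (x : ''a, xs : ''a list) : bool =
  List.exists (fn y => y = x) xs

val foundInt = member (3, [1, 2, 3])
val foundStr = member ("d", ["a", "b"])

(* member (1.0, [1.0]) would be rejected by the type checker:
   real is not an equality type in SML '97. *)
```

That is the whole mechanism: equality is the one overloaded "class" built into the type system, and there is no way to declare another one yourself.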
Yeah... not only the fact that SML modules always left an impression of arbitrariness (SML97 and OCaml have some subtle semantic differences and neither of those choices seem to be inherently wrong), but also the sheer complexity of their theory compared to Hindley-Milner... seems that there just has to be a better way.
Ocaml's optimization is very good, but not perfect. For example, if you need 32-bit integers instead of 31-bit integers, your performance is going to tank due to boxing.
MLton is a whole program optimizer (rather than function at a time like Ocaml) and wrings out a ton of performance though compile times are quite a bit longer.
As of 3 years ago it wasn't, but maybe that has changed.
Whole program compilation isn't without its downsides though. It seems pretty common to develop in SML/NJ because of its fast compiles, then do final performance profiling and deployment with MLton (having a standard helps). Function-at-a-time compilation is also more amenable to caching and reusing parts of the previous compile.
SML is standardized, which is useful if you want to build an implementation for academic purposes. I don't think anyone seriously uses it outside academia, however.
I have worked professionally with Standard ML for a few years, but my friends and I used to joke that I was probably the only professional Standard ML developer in existence.
A portfolio management website (frontend and backend) and a portfolio performance calculation engine. You're welcome to contact me on my personal email, if you want to hear more.
Not totally sure, but it's definitely not as active as the first three I mentioned. The last commit is from over a year ago [0]. Although the SML/NJ website shows releases presumably from 2020.
I’ve always enjoyed working in SML - usually my default choice is SML/NJ. Most recently Ocaml has been my ML of choice due to a coworker preferring it, but lately I’ve revisited SML for a new project since I like how small the language and basis library are.
Dumb question, but since there are many Standard ML implementations, do they all implement the same language? Or just variants/visions of the same language? And is there any standardization on any particular implementation?
It's like Common Lisp or Scheme: there's a formal spec and many implementations implement the spec. There's no reference implementation like CPython or MRI Ruby.
Can anybody point me to a thorough tutorial and matching impl that will install cleanly on OSX? Or should I do it in a VM? I'd very much like to learn me some ML.
SML is a nice language with a cleaner syntax and is standardized.
Still, my answer would be OCaml without questions. It has better tooling, better performance, a larger community, a larger ecosystem, more features and more work being done on it with basically no downsides.
[0] https://cse.buffalo.edu/~regan/cse305/MLBNF.pdf