I remember when I first got into programming around a decade ago, I was super excited that OCaml would soon get multicore, and I regularly checked the progress. Although it took a lot longer than I imagined it would, it feels amazing to see it finally here: almost like a dream, when you've waited so long for something and you can't believe it's finally happening.
Off-topic, but Signals and Threads is my absolute favourite programming podcast, maybe even my favourite podcast overall. They go into interesting topics deeply, instead of the way-too-common "interviewer read the summary of the Wikipedia page about the subject and is now interviewing someone who actually read the entire page."
They are doing a really good job at employer branding. They make the requirements/constraints of the finance industry sound interesting.
A field that cares about correctness, ordering and accurate clocks.
thanks for mentioning this! I've been looking for a new one that's good. Even if they're behind schedule, it looks like I've got some catching up to do to keep me busy in the meantime.
One thing to point out to anyone listening to that episode is that at the time, the plan was to only upstream the multicore GC for 5.0 and then follow up with effects. Instead, they both went into 5.0.
People who aren't PL theorists understand the multicore benefits even if the retrofitting paper might be over their heads. On the other hand, the "bounding data races in time and space" paper/presentation is also a fairly significant change that I don't think the average dev has an easy-to-understand relationship with.
> Ron: Do you have a pithy example of a pitfall in multicore Java that doesn’t exist in multicore OCaml?
> Anil: There’s something called a data race. And when you have a data race, this means that two threads of parallel execution are accessing the same memory at the same time. At this point, the program has to decide what the semantics are. In C++, for example, when you have a data race, it results in undefined behavior for the rest of the program, the program can do anything. Conventionally, daemons could fly out of your nose is an example of just what the compiler can do.
> In Java, you can have data races that are bounded in time so the fact that you change a value can mean later on in execution, because of the workings of the JVM, you can then have some kind of undefined behavior. It’s very hard to debug because it is happening temporally across executions of multiple threads.
> In OCaml, we guarantee that the program is consistent and sequentially consistent between data races. It’s hard to explain any more without showing you fragments of code. But conceptually, if there’s a data race in OCaml code, it will not spread in either space or time. In C++, if there’s a data race, it’ll spread to the rest of the codebase. In Java, if there’s a data race, it’ll spread through potentially multiple executions of that bit of code in the future.
> In OCaml, none of those things happen. The data race happens, some consequence exists in that particular part of the code but it doesn’t spread through the program. So if you’re debugging it, you can spot your data race because it happens in a very constrained part of the application and that modularity is obviously essential for any kind of semantic reasoning about the program because you can’t be looking in your logging library for undefined behavior when you’re working on a trading strategy or something else. It’s got to be in your face, at the point.
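A minimal illustration (mine, not from the episode) of what "bounded in space and time" means in practice: two domains racing on a plain `ref` may lose updates, but the damage stays local to that reference, and the `Atomic` module gives you the race-free version.

```ocaml
(* Two domains incrementing a plain ref: the final count may be wrong
   due to lost updates, but [r] still holds some int that was actually
   written, and no unrelated memory is affected. *)
let racy_count () =
  let r = ref 0 in
  let work () = for _ = 1 to 100_000 do incr r done in
  let d1 = Domain.spawn work and d2 = Domain.spawn work in
  Domain.join d1; Domain.join d2;
  !r

(* The race-free version uses [Atomic]. *)
let atomic_count () =
  let r = Atomic.make 0 in
  let work () = for _ = 1 to 100_000 do Atomic.incr r done in
  let d1 = Domain.spawn work and d2 = Domain.spawn work in
  Domain.join d1; Domain.join d2;
  Atomic.get r

let () =
  assert (atomic_count () = 200_000);
  assert (racy_count () <= 200_000)  (* races lose increments, never invent them *)
```

Either way, debugging stays modular: the surprising value can only show up where the race is, not in some unrelated part of the program.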
Specifically how the “better” crowd spent a lot of time trying to solve PCLSRing while the “worse” crowd just said: meh, throw an error and let the program figure it out.
Honestly, the best of both worlds would have been to have the simple kernel implementation and put the boilerplate recovery code in the system libraries. For instance, libc's write() should handle interruption resumption, and libc should also provide a write_interruptable() (okay, appropriately shortened for the symbol limits in early linkers) to expose the raw system call in the uncommon case that the programmer wants to manually handle interruptions.
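In OCaml terms, the "recovery code in the library" idea looks roughly like this sketch: a helper (name is mine) that resumes a `Unix.write` interrupted by a signal, while the raw syscall remains available for anyone who wants to handle interruption manually.

```ocaml
(* Resume a partial or EINTR-interrupted write until [len] bytes
   have been written. [Unix.write] is the thin wrapper over the raw
   syscall; this loop is the library-level recovery boilerplate. *)
let rec write_all fd buf pos len =
  if len > 0 then
    match Unix.write fd buf pos len with
    | n -> write_all fd buf (pos + n) (len - n)       (* partial write: continue *)
    | exception Unix.Unix_error (Unix.EINTR, _, _) ->
        write_all fd buf pos len                      (* interrupted: retry *)
```

Callers who want PCLSR-style transparency use `write_all`; callers who want the "worse is better" raw behaviour call `Unix.write` directly.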
I guess you still want well-defined semantics even if most users use some modes-based or other high-level API. E.g. normal Rust code gets to be "data race free", but low-level implementations of, say, Arc could be racy if they were incorrect. But maybe the OCaml thing is more than just doing a good job of specifying semantics, and there is some overhead in generated code because of it. I confess I haven't looked into the details.
OCaml is one of those languages that is a real joy to use, and it makes me wonder why it's not used more often. I feel like if I jump into a Lisp or ML language, I am so productive and can write some really complicated software with relative ease. But they are relatively rare in industry, and I have never really pinned down why, despite various reasonings about it.
This is a great achievement for OCaml, but does anyone have an explanation on why it was so difficult to implement for them?
I think John Carmack made a good point on this topic which I did not think of before.
He said that although he spent years working with Lisps (CL and Racket, mostly) and Haskell, jumping into projects in those languages instantly requires you to pay the price of learning the abstractions and DSLs that previous users wrote for the project before being able to understand anything.
He compared it with C or Go, where he realized that it was easier for him to read kernel code without any context, because there is no abstraction price to pay. What you see is what there is to understand.
He basically says that those languages (MLs, Lisps) are great for personal projects or very small teams, but they don't scale well. This also shows in open source, where many will build their own libraries and programs, but very few end up collaborating in those communities.
I agree with Carmack's analysis of Lisp but don't think it really applies to OCaml, which I've used professionally.
IMO, OCaml is significantly easier to ramp up on than Haskell. The object model and the fact that the language allows you to ramp up on imperative code let you write working code and the language is feature rich enough you're not generally writing a DSL. Granted, I think it is still harder to ramp up programmers than in JS, Python, Java, or Go.
I think OCaml's biggest problem has been one of timing and value proposition. The language matured to some level of production readiness in the mid 2000s at which point multicore started to become important.
If you wanted extreme performance, OCaml increasingly couldn't compete with C++ or Java. Most shops don't need the robustness the superior type system could provide, and the other downsides of an unpopular language always loomed large when adoption was being considered.
I still think OCaml or something OCaml-like might become popular with time, but it will require the kind of improvements the multicore project is hinting at providing and more.
Those don't apply as heavily to F# and OCaml, though. F# and OCaml are much more practical than Haskell and much more strict than Lisp/Scheme/Racket, and so they sit in a very nice sweet spot of programming language design. The MLs, despite their influence on other languages, are the most overlooked and underused languages.
In F#, even for some DSLs, there's not much funny business at all. You have types and functions. I'm not as familiar with OCaml, but in F#, the only really confusing thing along the lines of DSLs or macros are computation expressions. But for the most part, you don't need them aside from the built-in `async` one.
> F# and OCaml are much more practical than Haskell
For industrial applications in most deployment environments, Haskell is a much more practical language than OCaml, due to having many more modern runtime features (green threads, STM, etc., which for many years have made the lack of multicore support in OCaml seem embarrassing). It's not as if that gap has been fully closed now either; OCaml is still many years behind Haskell in basic runtime features.
F# obviously has a practically useful and featureful runtime, but has a scheduler that makes it easy to get thread exhaustion, whereas Haskell has a preemptive one that will make that a non-issue.
I find that the "practical language" argument is usually used by people who have never used either OCaml or Haskell for solving real world problems. In practice OCaml is a did-not-finish versus the comparatively (to most other languages, notably losing to the BEAM languages) excellent finishing time of Haskell.
I was mainly addressing Carmack's points on the "culture" of these languages, but F# is definitely more practical overall than Haskell, in my opinion. Eager by default with optional lazy evaluation, mutability, OOP, functional, pragmatic and huge standard library, easy FFI, CLI tooling, easy to use async and concurrency, and a strong VM that basically runs anywhere and is easy to install all go a long way. I am less familiar with the particulars of OCaml because I have never needed to reach for it over F#, and yes, the longstanding multicore support deficiency was not great.
Also, to elaborate on my reply to Carmack, I think he is grossly underestimating the prevalence of esoteric implementations and DSLs in C, C++, and their ilk.
And yes, Elixir and Erlang, especially Elixir, are very practical languages. I'm hard pressed these days to look any further than F# or Elixir for projects because they effectively cover all bases from soft real-time embedded and up.
The community burned me in the end - fortunately a little committee took over my libraries when I left in a rage at some of its truly insufferable vindictive self appointed leaders. I think this comes and goes and the main body of users is tirelessly helpful.
But the rest of this is nonsense. The compiler is a flat out miracle, a monument to human understanding, and produces unbelievably fast programs given the weird and wonderful abstract material one hands to it.
I think golang is designed intentionally to make it easy to read unfamiliar code.
To paraphrase: the idea being that a well-designed library API gets you 90% of the benefits of a richer language's features, but without exactly the penalty that Carmack calls out.
Doesn't that kinda presume that everyone should just "write a well-designed library"? If it's that much easier to just use a richer language shouldn't we just do that?
There's even a paradox. Lisp-like languages aim at not scaling, since you have control over parsing (to an extent) and interpretation (macros). If something requires a lot of hands, people will compress that into a DSL and keep the team small.
It does create dialects and a potential silo effect. I never worked on real CL projects, but books and articles advise against abusing macros and DSLs for exactly that reason. Old Lispers are also often very educated, and they rarely do things for trivial reasons. I'm regularly surprised by how well thought out things are.
> He compared it with C or Go, where he realized that it was easier for him to read kernel code without any context, because there is no abstraction price to pay. What you see is what there is to understand.
I don't get it. In C you're programming with structures, data types and functions. In ML and Lisp, you're programming with structures, data types and functions. Lisp lets you muck with syntactic forms so maybe that has some obscuring effect, but I expect most programs are written in fairly direct style. Where people do add abstractions, it's to reuse code so there's ostensibly less code to understand overall.
It's just very hard to write a solid concurrent GC. There are not many in existence today. Here the challenge was doubled by the obligation to preserve the single core performance of existing programs, with the very fast allocation path on the minor heap. It's always harder to add these features to an existing language that already has significant programs written in it.
If your language matches OCaml's runtime semantics, then sure. It means uniform representation, 63-bit integers, etc., but it can be quite useful if your language is ML-ish.
Also worth pointing out there's design constraints on the OCaml 5 GC imposed by some of OCaml's language features (looking at you ephemerons) and C API invariants.
There may be different constraints for other runtimes.
Congratulations and a big thank you to the OCaml team! I hope that multicore support finally ticks all the requirement boxes that had prevented many from taking a serious look at OCaml. The language certainly deserves it: it hits that sweet spot between expressiveness, performance, and pragmatism like no other.
As an Elixir (which builds on Erlang) and Rust dev, I'm curious whether you think OCaml 5.0 / Eio will give Erlang's BEAM VM and Rust's tokio a run for their money in the parallel runtime space.
I'm super curious about OCaml, picked it up and left it several times in the last 3 years. Now that multicore is here I'll absolutely be picking it up again and try to use it for parallel scripting and for some of my work. Great job!
Are the plans for typed algebraic effects solidifying, or are they still nebulous? Concretely, are you willing to take a guess as to when we are expected to see OCaml 6? ;-)
Thanks for the reply. I hope that the array and list comprehensions land soon in upstream; it's a useful and hopefully not-too-controversial feature.
I'm more ambivalent regarding the local allocations and the unboxed types. I totally understand why they'd be useful when you are trying to squeeze every last drop of performance, but they do require a not-so-trivial complexification of the language.
The local types are less invasive than the full support for typed effects. In particular, they are opt-in and the associated complexity is pay-as-you-go. In my initial experiments, they seemed pretty nice to program with.
The type system for algebraic effects is still in the research and design phase at this point.
Right now, I am not even taking a guess of what will be the defining new major features of OCaml 6 (effect system + modular implicits maybe? Maybe not?).
Thanks for the reply. I'm hoping that modular macros land soon. I'm very ambivalent about the PPX mechanism, and I hope that modular macros reduces the need of PPX.
How does "bounding races in space and time" impact the generated machine code? Do you have some interesting examples of how it's different from the code C++ or JVM JIT would generate? What's the performance impact? Thanks!
Right. It is hard in general to compare performance or generated code across languages, as they make different trade-offs. Tables 3 and 5 show the compilation of our memory model to x86 and ARMv7. The compilation section and the related-work section discuss the trade-offs.
To summarise, atomic reads and writes in OCaml are more expensive than in C++, since C++ SC atomics only establish a total order between SC operations, not on operations that are weaker but related to SC operations by happens-before. We need stronger atomics for local data race freedom.
Our non-atomics are also stronger than non-atomics in C++ and Java. That said, the non-atomic operations are free on x86 (compiled to plain loads and stores) and involve a lightweight fence before stores on weaker architectures such as ARM, Power and RISC-V. The performance results show that the extra fences before stores have a barely noticeable impact on ARM and Power.
Note that The Whitespace Thing actually preceded F#. (Mike and I were in the same fraternity, I remember talking to him about TWT, and we haven't talked much since our college days, which pre-date F#.)
IIRC, Mike was also doing a bunch of stuff with a Jabber client in OCaml around that time.
OCaml syntax is whitespace-insensitive. You don't need to worry about indentation, copy-pasting code snippets, etc. I think that makes for a better experience especially for beginners when the parser is more forgiving.
Yes, I started in OCaml but switched to F# for the last 3 years and will likely move back to OCaml when effects are mature (when lwt and async are snuffed out).
Being able to use C# libs is very nice but also it sucks that I don't have true null safety.
Who knows, maybe F# is always more performant than OCaml and it doesn't make sense to switch back.
> when effects are mature (when lwt and async are snuffed out).
That's actually on the way right now. The main concurrency library which is designed for OCaml 5, eio, internally uses effects but doesn't require its users to be exposed to them. Users write code in direct style and don't need to care about internal implementation with effects.
F# is .NET, so you can have pretty fine grained control over memory layout, right? Spans, packed structs, 8/16/32/64 bit signed/unsigned ints, etc.
That stuff all gets a bit screwy in OCaml: an int has 63 bits of information but is 64 bits in size (31 bits in 32 on 32-bit platforms), a record with two int64s ends up being twice the size you'd expect, etc.
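You can observe the representation difference directly with the `Obj` introspection module; a small illustration (not something you'd do in production code):

```ocaml
(* Native ints are immediate values (one bit is reserved for the tag),
   while int64 values are heap-allocated custom blocks. *)
let () =
  assert (Obj.is_int (Obj.repr 42));               (* int: unboxed immediate *)
  assert (Obj.tag (Obj.repr 1L) = Obj.custom_tag)  (* int64: boxed custom block *)
```

So a record with two int64 fields is a block of two pointers, each pointing at a separate boxed int64, rather than 16 flat bytes.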
opam 2.2's release cycle has fallen a bit behind the compiler's (actually because of the Windows support). It's an experimental branch, but this works with opam-repository-mingw to get a vanilla mingw-w64 build of OCaml 5.0.0:
I saw some kind of other installer that might come with either MSYS2 or Cygwin I forget which, but it also said that installer takes 2 hours to run. They seem to strongly recommend WSL over it based on how I read things.
Not for 5.0; the aim of 5.0 was to focus on getting multicore support out of the door. Thus only mingw64 is supported for Windows. Support for MSVC will come with later versions. At the same time, improving opam ecosystem support for Windows is one of the major goals of opam 2.2. Thus, hopefully, support for native Windows will improve in the future (next year?).
Presumably, in cases where performance really matters, they're using single-threaded processes pinned to cores and shared memory ring buffers to communicate across processes. It ends up looking not that much different from a high performance C++ project, except for the lack of full address space sharing.
Modern MMUs don't require full TLB flushes when switching address spaces, and physically tagged cache lines allow the ring buffers to be shared in cache even if they're mapped at different virtual addresses in the different processes. Also, you're guaranteed to not have false sharing, and there's zero contention for locks within malloc/free.
I don't mean to suggest there are no advantages to full address space sharing, but they're both fewer and less than one would initially assume.
Great, so, can someone send me a PR with an OCaml implementation of "my"/our GC-content benchmark (a simple string processing benchmark counting the fraction of Gs and Cs in DNA sequences, compared to all the As, Cs, Gs and Ts)?
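Not a PR, but a minimal sketch of what such an implementation might look like (the function name and exact counting rules are my guesses from the description above, not from any existing repo):

```ocaml
(* Fraction of G/C among all A, C, G, T characters in a DNA string;
   other characters (headers, newlines, ambiguity codes) are ignored. *)
let gc_fraction (seq : string) : float =
  let gc = ref 0 and total = ref 0 in
  String.iter
    (fun c ->
      match c with
      | 'G' | 'C' | 'g' | 'c' -> incr gc; incr total
      | 'A' | 'T' | 'a' | 't' -> incr total
      | _ -> ())
    seq;
  if !total = 0 then 0.0 else float_of_int !gc /. float_of_int !total

let () = assert (gc_fraction "ATGC" = 0.5)
```

A real benchmark entry would wrap this in FASTA parsing and buffered I/O, but the hot loop would look much like this.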
Amazing work from the team! I wonder if this is actually the first mainstream language which has managed to remove its "global lock" without breaking changes?
To clear up any misconception, out of the box OCaml will behave like OCaml 4 with a single domain and a "domain lock". Programs currently using multiple threads for concurrency will remain single-core for the time being, as they will need to opt-in to parallelism features. In this sense, adding parallelism to OCaml does not break existing programs, but they still might have to be audited for thread-safety depending on how they want to use parallelism. There is no magic.
This backwards compatibility decision to separate threads from domains has been very useful and allows to gradually "port" existing code to 5.0: first just fixup C bindings to avoid naked pointers, and then code can safely run on OCaml 5 just as before.
And once a program (and all its dependencies) have removed dependence on global state they can opt-in to multicore by spawning additional domains.
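What "opting in" looks like in practice (a toy example, not from the release notes): nothing runs in parallel until you explicitly spawn a second domain.

```ocaml
(* Sum two lists in parallel: one list on a freshly spawned domain,
   the other on the current domain. Existing single-domain code is
   untouched; parallelism only appears where Domain.spawn is called. *)
let par_sum a b =
  let d = Domain.spawn (fun () -> List.fold_left ( + ) 0 a) in
  let sum_b = List.fold_left ( + ) 0 b in
  Domain.join d + sum_b

let () = assert (par_sum [1; 2; 3] [4; 5; 6] = 21)
```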
There's nothing "modern" about the C-style curly brace and semi-colon syntax in Reason. In fact, the rise of Python and it's lightweight syntax means that C-style languages are starting to look dated.
Indentation-sensitive syntax is totally obsolete in a world with autoformatters. In addition, with this kind of syntax you also lose the ability to have your code autoformatted in certain situations. Here's a trivial example to illustrate the point:
    def example():
        x = 5
    print("Hello world")
What's the mistake here? Depending on whether the print is part of the function, it should either be indented or have a newline before it. The point is you (and any formatting tool) can't know what the horizontal alignment of this code should be just by examining the vertical line order. You can only determine this by knowing (or reanalyzing) the semantics of the code. During a refactor where you're moving around lots of code, this can be a significant PITA. However, in the JS example,
    function example() {
    let x = 5
    console.log("Hello world")
    }
it's unambiguous what the mistake is because you can determine the correct formatting entirely from the line order, without having to know anything about the code's semantics.
This is a really good point, but one still doesn't need curly braces and semis. Despite having a lightweight syntax, OCaml isn't indentation sensitive, so doesn't suffer from this. The Ocamlformat tool also works really well.
> OCaml isn't indentation sensitive, so doesn't suffer from this
True, I was just making the point that the lightweight syntax of indentation-sensitive languages like Python can be problematic. I don't know how OCaml is able to achieve this style of syntax without the use of significant indentation, but it is impressive.
Adding spurious curly braces to make OCaml look more like JS is not 'ergonomic and modern' syntax, it's a step backwards if anything (though I can appreciate there is some sense to that in context of a compile-to-JS language for frontend dev)
I started learning OCaml and I stopped when I heard about the lack of multicore support (it felt like it wasn't ready for general-purpose use), so this is an awesome development. Of all the functional languages I've sampled so far, OCaml is the most intuitive. Hoping to finally get into it in 2023.
Did you compare it to F#? I’ll probably try to learn an ML language next year and without doing much research F# looks really nice since it gets all the .Net benefits.
F# is nicer than the alternatives if you need to be (or want to be) in the Microsoft ecosystem, but there are some things from OCaml I miss when I use it, like modules and functors. It also doesn't support named arguments, which are a fairly trivial thing that drastically cuts down the bugspace and increases usability for me.
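For readers unfamiliar with the feature being missed: OCaml's labeled arguments make call sites self-documenting and order-insensitive (a trivial example of my own):

```ocaml
(* Labels [~width] and [~height] travel with the values, so swapping
   the argument order at the call site cannot introduce a bug. *)
let rect_area ~width ~height = width * height

let () =
  assert (rect_area ~width:3 ~height:4 = 12);
  assert (rect_area ~height:4 ~width:3 = 12)  (* labels, not position *)
```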
I would expect F# to be faster than OCaml, given that it doesn't box floats, and is backed by a JIT that proactively does monomorphization of generics.
Can you give examples of the latter? I know from personal experience (of looking at disassembly of JIT output) that, when CLR generics are instantiated with structs, it is perfectly capable of full monomorphization with inlining of stuff similar to C++ <algorithm>.
Yes, structs are generally very good because they are mostly monomorphized. I haven't tested lately whether structs of the same size but different types will fully monomorphize or whether they share code and so require a small dispatch table. There was a CLR release a while back where they discussed sharing code in this way to reduce code bloat.
There are still a few pitfalls. Off the top of my head: accessing generic static fields, i.e. static class Foo<T> { public readonly static T Member }. Mainly for non-struct members, this typically requires a hashtable-like lookup to resolve the offset for the field.
Also, generic interface dispatch can't be monomorphized, i.e. interface IFoo<T> { void Method<T>(T value); }, so this too costs more than a regular virtual call, because it requires a hashtable-like lookup to resolve the generic overload to invoke.
It’s faster in some ways and slower in others. But in general this is “very fast” vs. “very fast”. You won’t find yourself wanting in web service performance in either language, for example.
I wouldn't expect F# to be faster in anything actually, speaking as a .Net developer. OCaml is very well optimized and the abstract machine was well designed to have an efficient execution. Do you have any specific examples where the CLR is better?
Yes, you can find several benchmarks on the language benchmarks game, for example. And I would expect that any situation where you use Spans in F# to outperform almost everything outside of native code (just like with C#).
It's complex [1]: OCaml does have an unboxed representation of floats in arrays [2] and in records (provided all fields are floats), but elsewhere they are indeed boxed.
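These special cases are observable through each value's heap tag; a small demonstration (using `Obj` purely for illustration):

```ocaml
(* Float arrays and all-float records get a flat (unboxed) layout,
   marked with Double_array_tag; a lone float is a boxed block. *)
type point = { x : float; y : float }

let () =
  assert (Obj.tag (Obj.repr [| 1.0; 2.0 |]) = Obj.double_array_tag);
  assert (Obj.tag (Obj.repr { x = 1.0; y = 2.0 }) = Obj.double_array_tag);
  assert (Obj.tag (Obj.repr 1.0) = Obj.double_tag)  (* standalone float: boxed *)
```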
Good point. Polymorphic code that doesn't reduce to an unboxed float representation can be slower because of the boxing, but I don't think such code is very common exactly for this reason. I wonder if OCaml developers have tried NaN-boxing to see how it would impact performance.
Maybe so. But it benefits from the larger ecosystem that it can piggyback onto. This is often important to people selecting a language bound by constraints in their product, company, etc and can't be dismissed.
.NET/JIT/GC improvements, improvements to core API's, enterprise libraries/SDK's, etc. C#'s improvements do tend to flow into F#, if not from a language/syntax perspective from an ecosystem perspective. An upgrade of .NET version for example also benefits a F# developer greatly even if no work is done on the language at all for example. A performance improvement in say ASP.NET Core benefits many of the F# web frameworks too. A language/tool is more than just its syntax - you need to learn the libraries, package management, build tools, etc as well and be confident of their long term support/improvement. All dimensions are important.
At this stage, F# does have broader technology support and interoperability as a result of this "second class" status. Whether this matters depends on the use case, company, and engineering resources at hand. Being second class in an ecosystem whose tailwind you can ride may be better than standing on your own two feet. Right now, in my personal context, I could use F# for my company's apps and not hit too many blockers; I probably couldn't use OCaml, given the technologies we use day to day.
Sure, but does that really matter? Second class doesn't mean they aren't investing in it... and some of the coolest use cases for F# (Fable ecosystem) falls outside of MSFT's scope entirely.
It's a small community with great libraries/support, and you can generally bet that the folks who are excited by it and want to use it are pretty strong technically.
I'd say a significant portion of F# users are using Fable to generate JavaScript. F# can compile to Rust and Python as well.
https://github.com/ncave/fable-raytracer
Fable -> JS is our main use case for F#; having had to hire for this, I wouldn't say it's accurate. Most are still focused on the .NET implementation side of things. Compilation to Python is nice, I've yet to figure out why we wouldn't just use Rust rather than F# -> Rust... Rust is a great language and it feels a little silly to abstract over it.
I haven't tried F# because I am already having to work with powershell and C# and that whole .net/visualstudio ecosystem. The last thing I will look at on my personal time is a language in that ecosystem but since you recommended it I will have a look.
Since we have a Unity product, I’m moving us to an F# backend! I actually quite like it since we’re building a typical web api for a game.
You don’t have to deal with the typical need to determine what packages to use as you would in ocaml. I have aspnet and co. I’m okay with OOP leaking here and there when I get most of what I need already: an ML tool with an huge ecosystem.
My interest is in (soft) real-time music systems. I'm curious if anyone can comment whether this will make OCaml a potential language in that domain. I've held off on Haskell because of the whole "you might not know how long this will take", which is a non-starter for music. I mostly use Scheme and C (and Max/MSP but with custom C/Scheme in it), but I'd love to use something other than C for the low level stuff.
Wondering if anyone can comment on whether this might mean OCaml can be a contender in that space now?
The road to multicore OCaml was indeed longer and harder than expected. At the end of the day, the constraint of trying to preserve the behavior of almost all existing programs ended up driving a majority of design choices. Typically, this required at least one major rewrite of the multicore runtime along the way.
Is there a guide on how to use the new constructs? The link to the 5.0 docs was broken on the site, and after manually fixing the URL all I found was some type annotations in the `Effects` module.
The link to the Effect section of the manual at https://v2.ocaml.org/releases/5.0/manual/effects.html works for me and should contain a higher-level description of what effect handlers are. Which link was broken for you?
The Effects module is kind of low level right now, as I understand it. You should look at Eio for a library that gives you nice fibers and non-blocking IO on top of effects! It's a neat library.
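To give a feel for how low level the raw API is, here is a toy "reader" effect handled with `Effect.Deep` (the effect name is mine; this is the style of example the manual uses, and exactly the plumbing that libraries like Eio hide from you):

```ocaml
open Effect
open Effect.Deep

(* Declare an effect: performing [Ask] yields an int. *)
type _ Effect.t += Ask : int Effect.t

(* A computation that performs the effect twice. *)
let comp () = perform Ask + perform Ask

(* Handle [Ask] by resuming the continuation with 21 each time. *)
let run () =
  match_with comp ()
    { retc = (fun v -> v);
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask -> Some (fun (k : (a, _) continuation) -> continue k 21)
        | _ -> None) }

let () = assert (run () = 42)
```

Direct-style libraries wrap this machinery so user code just calls ordinary-looking functions.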
IIRC, while they added the underlying language support for effects to this version to get the multicore support working, the standard library support for effects isn't really ready yet, so you may want to hold off another year or so.
I think eventually it will hopefully turn into something like what languages like purescript have which would be really cool.
(I've only used ocaml a tiny bit and use f# a lot more but I keep periodically checking the status of this because it's something that would make ocaml a lot more interesting to me.)
Still early days, but I had done some exploratory work in the past on Reagents, a composable lock-free library [1]. Now that OCaml 5 is released, we're reviving this work.
Its semantics are weaker than STM's -- unlike STM, it doesn't provide serializability -- but Reagents can compile down to multi-word compare-and-swap operations, which can be implemented with the help of hardware transactions (when present) or efficient software implementations [2]. Hence, Reagent programs should be faster than STM.
I haven't heard anyone talk about STM for OCaml, funny. People talk about, or work on, lightweight fibers, lockfree data structures, io_uring, etc. but not STM. Is it falling out of fashion? Even in clojure I hear that few people actually use it.
I'm curious about what design decisions lead to OCaml not having Multi-threading when version 1.0 came out. Majorly impressive work getting something as complex as that added on afterwards. Kudos to everyone involved!
Caml 1.0 was released in '85 and OCaml (the O is for Object Orientation) was 1996. Multithreading wasn't a high priority for anybody back then. The JVM didn't even have threads until 1997 and those threads were green threads, OS threads came to Java later.
However, this is all Unix- and academia-based. If you wrote code for Windows and OS/2 in the commercial world, you were using threads since '89, and thus did not want to use the languages that did not support threads, e.g. Python, OCaml.
In the late 80s and early 90s we wrote things with threads but they were primarily a kind of convenience to get multitasking behaviour and not any kind of performance boost.
Multicore / multiprocessor systems were not a mainstream thing in consumer hardware until the 21st century.
It would be pretty difficult to write threaded code for Windows in 1989, since it didn't support threads until WinNT 3.1 (1993).
But even in late 90s it was still common for desktop Win9x apps to use the main window message loop for async processing (Win32 API itself heavily encouraged it at the time - e.g. that's how OS timers work) in lieu of threads.
That was much later, in the mid 90s. But who did have threads in the mid 80s was Erlang, which back then only existed in Ericsson's research lab. Ericsson had another internal language which Erlang drew inspiration from, which also had support for concurrency.
This will search in the current directory tree for all files that contain the code pattern 'foo(x, y)' and replace it with 'foo(x)', using Scala syntax rules. It's super convenient for doing large-scale codemods. E.g. https://github.com/tinymce/rescript-webapi/pull/40
A lot of projects also started on ocaml and then later moved off of it once they had succeeded by showing the concept works and got some momentum going. People like it for exploratory compiler dev, then switch off it when their language can self host. IIRC both rust and elm started like this, certainly others as well.
Oh you're right. Now that I think about it elm is more stylistically similar to haskell too. Last time I used elm I had never played with haskell so I probably just assumed it was related to ocaml which I did know.
I've recently completed bugfixing/testing on 4.14.1+no-naked-pointers, and 5.0 compatibility is not far behind (we're usually 1 or 2 compiler versions behind latest, e.g. current production releases are built using 4.13.1)
Disclaimer: I work on the XAPI project as part of my job, the project itself is >15 years old at this point.
One of the top use cases for OCaml is to build interpreters and compilers for new languages. Its syntax and type system makes it very natural to navigate tree-structured data, such as ASTs. It has a great parser generator library, Menhir, as well as an up-to-date LLVM API. The first version of Rust was written in OCaml.
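A tiny example (mine, not from any particular compiler) of why tree-shaped data is so natural in OCaml: an expression AST and its evaluator are a few lines of variants and pattern matching.

```ocaml
(* A toy expression language and its evaluator: the shape of the code
   mirrors the shape of the AST, and the compiler checks exhaustiveness. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

let () = assert (eval (Add (Int 1, Mul (Int 2, Int 3))) = 7)
```

A real compiler front end scales this pattern up: Menhir produces the `expr`-like tree, and passes are just recursive functions over it.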
One famous example (from 1997) is to lay out the rules for generating Fast Fourier code (for any input range, not just powers of two) and have correct efficient C code generated.