Multicore OCaml: July 2021 (ocaml.org)
60 points by rbjorklin on Aug 2, 2021 | 14 comments



I've never used OCaml, except for some toy side projects, but I'm curious about this multicore saga. It feels like I've been reading about it for years and years.

What's the reason it's taking so long? Inherent difficulty in changing the language? Lack of resources? Lack of prioritization / urgency / industry support? (it's a rather niche language after all, apart from some prominent users like Jane Street)


I think the main difficulty consists of two parts that affect each other. On the one hand, you have to design language features, and that requires a sound theoretical foundation (i.e., what does "multicore" mean when programming in OCaml?). On the other, there is the profound technical challenge of implementing those semantics in a performant runtime system. The thing is, single-core OCaml is actually pretty efficient, especially considering its relative simplicity and elegance (compared to, e.g., Scala and Haskell). People would not accept losing these properties.

As an example, consider the straightforward implementation of heap allocation in the single-core runtime: to allocate a small value (something OCaml programs tend to do often), you just bump the pointer into the young generation (usually a 2 MB buffer). Only when that fails are more complicated operations triggered. Because it seldom fails, and because the compiler can keep the pointer in a register the whole time, most allocations cost very little.
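
To make that concrete, here is a toy model of the fast path, written in OCaml purely for illustration; the real allocator is C code inside the runtime, and the names and sizes below are assumptions:

    (* Toy model of minor-heap bump allocation. In the real runtime this
       is C code and the allocation pointer is kept in a register. *)
    let minor_heap_size = 2 * 1024 * 1024   (* the ~2 MB young generation *)

    type minor_heap = {
      buffer : Bytes.t;      (* the young-generation buffer *)
      mutable ptr : int;     (* next free offset *)
    }

    let create () = { buffer = Bytes.create minor_heap_size; ptr = 0 }

    (* Fast path: just bump the pointer. Only on overflow does the slow
       path (a minor collection) run. *)
    let alloc heap nbytes =
      if heap.ptr + nbytes <= minor_heap_size then begin
        let offset = heap.ptr in
        heap.ptr <- heap.ptr + nbytes;
        Some offset
      end else
        None  (* here the real runtime would trigger a minor GC *)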

If you want to do that for multicore, you have one obvious problem to solve: how do different cores communicate values that live in this young buffer? Do they share the same buffer? Then you'd have to coordinate access, wasting time. Do they copy the values? When? Could we infer shared values and allocate them specially? Do we need type-system support? Etc., etc.
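
As it turned out, the design multicore OCaml settled on keeps sharing transparent at the language level: each domain gets its own minor heap, with a shared major heap underneath, and the runtime coordinates the young generations. A minimal sketch using the Domain API from the multicore branch (which later shipped in OCaml 5; it was not in released OCaml when this thread was written):

    (* Two domains computing in parallel; the runtime, not the
       programmer, manages how young values cross domain boundaries. *)
    let sum_squares lo hi =
      let total = ref 0 in
      for i = lo to hi do total := !total + (i * i) done;
      !total

    let () =
      let d1 = Domain.spawn (fun () -> sum_squares 1 500) in
      let d2 = Domain.spawn (fun () -> sum_squares 501 1000) in
      Printf.printf "total = %d\n" (Domain.join d1 + Domain.join d2)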

Languages that target the JVM or .NET have a much easier time answering these kinds of questions, because the answers are built into the VM itself.


It's taking a long time because they don't want to break existing code, and don't want to lose too much single-core performance. I'd say that counts as inherent difficulty in changing the language. I'm not aware of other languages trying to retrofit multicore without breaking existing code, but if any have, I'm sure other people will correct me and can say how it went.

Edit: they're also working on an effect system at the same time; making everything fit into a 25-year-old language that's really stable is not trivial.
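
For a taste of the effects work, here is a minimal handler written against the Effect API that eventually shipped in OCaml 5; the exact interface was still in flux at the time of this thread, and the Ask effect is made up for the example:

    open Effect
    open Effect.Deep

    (* A hypothetical effect: the computation asks the enclosing
       handler for an int. *)
    type _ Effect.t += Ask : int Effect.t

    let comp () = perform Ask + perform Ask

    let () =
      let result =
        match_with comp ()
          { retc = (fun x -> x);
            exnc = raise;
            effc = (fun (type a) (eff : a Effect.t) ->
              match eff with
              | Ask ->
                  (* Resume the suspended computation with the value 21. *)
                  Some (fun (k : (a, _) continuation) -> continue k 21)
              | _ -> None) }
      in
      Printf.printf "%d\n" result  (* prints 42 *)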


I kind of wonder why they don't make multicore a compiler option, so that if you were willing to sacrifice single-core performance to get multicore benefits you'd just say 'ocaml -mc' or something.


It can be hard to retrofit multicore capability. See Python, for example.


Or Ruby (i.e., Ractors), for an example of how the current Python attempt might turn out.


The real question is whether a thread-based actor model performs any better than existing approaches to CPU parallelism in Python. At this point, about 95% of cases that need CPU parallelism are already implemented as native modules that bypass the interpreter and the GIL entirely in hot paths; for the rest, the overhead of process-parallel Python code via Celery or the stdlib's multiprocessing module is acceptable.


I think you're downplaying the importance of first-class parallelism in Python land. I'd agree if you asserted that "95% of cases where CPU parallelism is actually used" are implemented by "bypassing the interpreter and the GIL entirely in hot paths", but that's largely down to limitations, not because there was no further need.

There is plenty of Python code whose authors didn't realise it would be CPU-critical or in hot paths until it was too late. It's a small fraction of those cases where the authors have the fortune/opportunity to subsequently rewrite their code "via Celery or the stdlib's multiprocessing module": sharing data in-memory is simply a lot easier than sharing across processes, and avoiding mutation of data structures is easier than writing serialisation code for them.


I think (but I'm not sure) that Swift might currently be going through the same challenges.


It’s hard to retrofit, but they also want to do it as seamlessly as they can, and make sure they're building on a sound theoretical foundation.


What the OCaml story shows is how hard it is to retrofit any sort of proper multithreaded implementation onto a language/runtime not designed for it.

I can't help but admire the people who are doing this for OCaml.


Ah yes, OCaml. The only course in my computer science degree that I disliked and failed.


OCaml is more than just a badly-taught CS course; it's an industrial-strength general-purpose language that's being mined for ideas by basically every modern language.


I recently started learning F#. Would anyone recommend learning OCaml or F#?



