OCaml 5.0 Multicore is out (ocaml.org)
465 points by sadiq on Dec 16, 2022 | 173 comments



I remember around a decade ago when I first got into programming, I was super excited that OCaml would soon get multicore, regularly checking the progress. Although it took a lot longer than I imagined it would, nevertheless it feels amazing to see it finally here, almost like a dream when you've waited so long for something and you can't believe it's finally happening.


I guess I was too impatient, switched to F# right away because it felt more complete.


Signals and Threads had an interview with Anil Madhavapeddy last year:

https://signalsandthreads.com/what-is-an-operating-system/

He talked about the work to put a multicore-ready memory model[1] and GC[2] under OCaml.

[1] https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf

[2] https://arxiv.org/abs/2004.11663


Offtopic, but Signals and Threads is my absolute favourite programming podcast, maybe even favourite podcast overall. They go into interesting topics deeply, instead of the way too common "interviewer read the summary of the wikipedia page about the subject and is now interviewing someone who actually read the entire page."

Hoping they'll get more content out soon.


They are doing a really good job at employer branding. They make the requirements/constraints of the finance industry sound interesting. A field that cares about correctness, ordering and accurate clocks.


Yaron Minsky is such a charismatic speaker. I've watched all of his recordings that you can find on the internet more times than I’d like to admit. S&T is great.


thanks for mentioning this! I've been looking for a new one that's good. Even if they're behind schedule, it looks like I've got some catching up to do to keep me busy in the meantime.


One thing to point out to anyone listening to that episode is that at the time the plan was to only upstream the multicore GC for 5.0 and then follow up with effects. Instead they both went into 5.0.

(Was a very enjoyable episode though!)


Just FYI the talk about multicore starts at 44:22


People who aren't PL theorists understand the multicore benefits even if the retrofitting paper might be over their heads. On the other hand, the bounding data races in time and space paper/presentation is also a fairly significant change that I don't think the average dev has an easy to understand relationship with.

Anil covers it in a bit more plain English in this Signals and Threads episode. Here's a bit of the transcript, starting at about 50 minutes in: https://signalsandthreads.com/what-is-an-operating-system/

> Ron: Do you have a pithy example of a pitfall in multicore Java that doesn’t exist in multicore OCaml?

> Anil: There’s something called a data race. And when you have a data race, this means that two threads of parallel execution are accessing the same memory at the same time. At this point, the program has to decide what the semantics are. In C++, for example, when you have a data race, it results in undefined behavior for the rest of the program, the program can do anything. Conventionally, daemons could fly out of your nose is an example of just what the compiler can do.

> In Java, you can have data races that are bounded in time so the fact that you change a value can mean later on in execution, because of the workings of the JVM, you can then have some kind of undefined behavior. It’s very hard to debug because it is happening temporally across executions of multiple threads.

> In OCaml, we guarantee that the program is consistent and sequentially consistent between data races. It’s hard to explain any more without showing you fragments of code. But conceptually, if there’s a data race in OCaml code, it will not spread in either space or time. In C++, if there’s a data race, it’ll spread to the rest of the codebase. In Java, if there’s a data race, it’ll spread through potentially multiple executions of that bit of code in the future.

> In OCaml, none of those things happen. The data race happens, some consequence exists in that particular part of the code but it doesn’t spread through the program. So if you’re debugging it, you can spot your data race because it happens in a very constrained part of the application and that modularity is obviously essential for any kind of semantic reasoning about the program because you can’t be looking in your logging library for undefined behavior when you’re working on a trading strategy or something else. It’s got to be in your face, at the point.

(and so on)
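
For anyone who wants the "fragments of code" Anil alludes to, here is a minimal sketch, assuming OCaml 5's standard `Domain` and `Atomic` modules. The function names and the loop count are made up for illustration. Two domains bump a shared counter: the plain `ref` version has a data race and may lose updates, while the `Atomic` version is race-free.

```ocaml
(* Each of two domains increments a shared counter n times. *)
let n = 100_000

(* Racy version: a plain ref updated without any synchronisation. *)
let racy_count () =
  let counter = ref 0 in
  let work () = for _i = 1 to n do counter := !counter + 1 done in
  let d1 = Domain.spawn work in
  let d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  !counter

(* Race-free version: the same loop on an Atomic.t. *)
let atomic_count () =
  let counter = Atomic.make 0 in
  let work () = for _i = 1 to n do Atomic.incr counter done in
  let d1 = Domain.spawn work in
  let d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  Atomic.get counter

let () =
  Printf.printf "racy:   %d (at most %d)\n" (racy_count ()) (2 * n);
  Printf.printf "atomic: %d\n" (atomic_count ())
```

The racy total can vary from run to run, but it is still an ordinary int no larger than 2 * n, and the rest of the program keeps behaving normally. That is the "bounded in space and time" part.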


This reminds me of the worse-is-better debate.

Specifically how the “better” crowd spent a lot of time trying to solve PCLSRing while the “worse” crowd just said: meh, throw an error and let the program figure it out.

Previously

https://news.ycombinator.com/item?id=20225555


Honestly, the best of both worlds would have been to have the simple kernel implementation and put the boilerplate recovery code in the system libraries. For instance, libc's write() should handle interruption resumption, and libc should also provide a write_interruptable() (okay, appropriately shortened for the symbol limits in early linkers) to expose the raw system call in the uncommon case that the programmer wants to manually handle interruptions.


Aren’t these problems better resolved by linear types and the like?

Successful implementations have been made in several languages now.


I guess you still want well-defined semantics even if most users use some modes-based or other high-level api. Eg normal rust code gets to be ‘data race free’ but low level implementations of eg Arc could be racy if they were incorrect. But maybe the OCaml thing is more than just doing a good job of specifying semantics and there is some overhead in generated code because of it. I confess I haven’t looked into the details.


Can you name a single successful implementation that can e.g. implement modern concurrent lock-free data structures without using `unsafe`?


ATS has it [1], you can safely implement lock-free queues and ring buffers in it.

[1] https://ats-lang.sourceforge.net/DOCUMENT/INT2PROGINATS/HTML...


OCaml is one of those languages that is a real joy to use and makes me wonder why it's not used more often. I feel like if I jump into a Lisp or ML language, I am so productive and can write some really complicated software with relative ease. But they are relatively rare in industry, and I have never really crossed the t on why, despite various reasonings about it.

This is a great achievement for OCaml, but does anyone have an explanation on why it was so difficult to implement for them?


I think John Carmack made a good point on this topic which I did not think of before.

He said that although he spent years working with Lisps (CL and Racket mostly) and Haskell, jumping into a project in those languages immediately requires you to pay the price of learning the abstractions and DSLs that its authors wrote before you can understand anything.

He compared it with C or Go, where he realized that it was easier for him to read kernel code without any context, because there is no abstraction price to pay. What you see is what there is to understand.

He basically says that those languages (MLs, Lisps) are great for personal projects or very small teams, but they don't scale well. This is also reflected in open source, where many people build their own libraries and programs but very few collaborate within those communities.


I agree with Carmack's analysis of Lisp but don't think it really applies to OCaml, which I've used professionally.

IMO, OCaml is significantly easier to ramp up on than Haskell. The object model and the fact that the language allows you to ramp up on imperative code let you write working code and the language is feature rich enough you're not generally writing a DSL. Granted, I think it is still harder to ramp up programmers than in JS, Python, Java, or Go.

I think OCaml's biggest problem has been one of timing and value proposition. The language matured to some level of production readiness in the mid 2000s at which point multicore started to become important.

If you wanted extreme performance, OCaml increasingly couldn't compete with C++ or Java. Most shops don't need the robustness the superior type system could provide, and the other downsides of an unpopular language always loomed large when adoption was being considered.

I still think OCaml or something OCaml-like might become popular with time, but it will require the kind of improvements the multicore project is hinting at providing and more.


I'm certainly no expert, but isn't (linux) kernel C full of pre-processor macros that muddy reading the code?


Those don't apply as heavily to F# and OCaml, though. F# and OCaml are much more practical than Haskell and much more strict than Lisp/Scheme/Racket, and so they sit in a very nice sweet spot of programming language design. The MLs, despite their influence on other languages, are the most overlooked and underused languages.

In F#, even for some DSLs, there's not much funny business at all. You have types and functions. I'm not as familiar with OCaml, but in F#, the only really confusing thing along the lines of DSLs or macros are computation expressions. But for the most part, you don't need them aside from the built-in `async` one.


> F# and OCaml are much more practical than Haskell

For industrial applications in most deployment environments Haskell is a much more practical language than OCaml, due to having many more modern runtime features (green threads, STM, etc., that for many years have made the lack of multicore support in OCaml seem embarrassing). It's not as if that gap has been closed now, either: OCaml is still many years behind Haskell in basic runtime features.

F# obviously has a practically useful and featureful runtime, but has a scheduler that makes it easy to get thread exhaustion, whereas Haskell has a preemptive one that will make that a non-issue.

I find that the "practical language" argument is usually used by people who have never used either OCaml or Haskell for solving real world problems. In practice OCaml is a did-not-finish versus the comparatively (to most other languages, notably losing to the BEAM languages) excellent finishing time of Haskell.


I was mainly addressing Carmack's points on the "culture" of these languages, but F# is definitely more practical overall than Haskell, in my opinion. Eager by default with optional lazy evaluation, mutability, OOP, functional, pragmatic and huge standard library, easy FFI, CLI tooling, easy to use async and concurrency, and a strong VM that basically runs anywhere and is easy to install all go a long way. I am less familiar with the particulars of OCaml because I have never needed to reach for it over F#, and yes, the longstanding multicore support deficiency was not great.

Also, to elaborate on my reply to Carmack, I think he is grossly underestimating the prevalence of esoteric implementations and DSLs in C, C++, and their ilk.

And yes, Elixir and Erlang, especially Elixir, are very practical languages. I'm hard pressed these days to look any further than F# or Elixir for projects because they effectively cover all bases from soft real-time embedded and up.


I have used both. Ocaml is a joy. I would rather swallow shards of glass than work with Haskell again.

The language is a mess. The compiler is slow. Performance is poor. The community is insufferable.

I think that's what is meant by practical.


The community burned me in the end. Fortunately a little committee took over my libraries when I left in a rage at some of its truly insufferable, vindictive, self-appointed leaders. I think this comes and goes, and the main body of users is tirelessly helpful.

But the rest of this is nonsense. The compiler is a flat out miracle, a monument to human understanding, and produces unbelievably fast programs given the weird and wonderful abstract material one hands to it.


I think golang is designed intentionally to make it easy to read unfamiliar code.

To paraphrase: The idea being that a well designed library API gets you 90% of the benefits of a richer language’s features, but without exactly that penalty that Carmack calls out.


Doesn't that kinda presume that everyone should just "write a well-designed library"? If it's that much easier to just use a richer language shouldn't we just do that?


I don't know what JC said exactly but I would guess that some of the issues of scaling Lisps (dynamic typing, DSL hell) don't really apply to OCaml


There's even a paradox. Lisp-like languages aim at not scaling, since you have control over parsing (to an extent) and interpretation (macros). If something requires a lot of hands, people will compress it into a DSL and keep the team small.

It does create dialects and a potential silo effect. I have never worked on real CL projects, but books and articles advise against abusing macros and DSLs for that reason. Old Lispers are also often very educated, and they rarely do things for trivial reasons. I'm regularly surprised by how well thought out things are.


> He compared it with C or Go, where he realized that it was easier for him to read kernel code without any context, because there is no abstraction price to pay. What you see is what there is to understand.

I don't get it. In C you're programming with structures, data types and functions. In ML and Lisp, you're programming with structures, data types and functions. Lisp lets you muck with syntactic forms so maybe that has some obscuring effect, but I expect most programs are written in fairly direct style. Where people do add abstractions, it's to reuse code so there's ostensibly less code to understand overall.


It's just very hard to write a solid concurrent GC. There are not many in existence today. Here the challenge was doubled by the obligation to preserve the single core performance of existing programs, with the very fast allocation path on the minor heap. It's always harder to add these features to an existing language that already has significant programs written in it.


Would it be possible to take this work and turn it into a more general GC library that language implementers can use beside LLVM?


If your language matches OCaml's runtime semantics, then sure. That means uniform representation, 63-bit integers, etc., but it can be quite useful if your language is ML-ish.
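
To make the 63-bit point concrete, here is a small sketch using only the standard library. The name `wraps_around` is made up for illustration. On a 64-bit platform `Sys.int_size` should report one bit less than the word size, because the runtime reserves a tag bit to tell immediate integers apart from pointers.

```ocaml
(* Print the machine word size and the usable width of an OCaml int. *)
let () =
  Printf.printf "word size: %d bits\n" Sys.word_size;
  Printf.printf "int size:  %d bits\n" Sys.int_size;
  Printf.printf "max_int:   %d\n" max_int

(* Native int arithmetic wraps around silently at that reduced width. *)
let wraps_around = max_int + 1 = min_int
```

Any language reusing the runtime would inherit this tagged representation, which is why it is listed as a constraint.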


Also worth pointing out there are design constraints on the OCaml 5 GC imposed by some of OCaml's language features (looking at you, ephemerons) and C API invariants.

There may be different constraints for other runtimes.


Congratulations and a big thank you to the OCaml team! I hope that multicore support finally ticks all the requirement boxes that had prevented many from taking a serious look at OCaml. The language certainly deserves it: it hits that sweet spot between expressiveness, performance, and pragmatism like no other.


If you want to see an example of it: https://v2.ocaml.org/releases/5.0/manual/parallelism.html
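
For a quick taste without clicking through, here is a minimal sketch in the same spirit, assuming the OCaml 5 `Domain` module. The helper name `parallel_fibs` is hypothetical. `Domain.spawn` starts a computation on a new domain and `Domain.join` waits for its result.

```ocaml
(* Deliberately naive Fibonacci, used only as CPU-bound work. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Run two independent computations in parallel and collect both results. *)
let parallel_fibs a b =
  let da = Domain.spawn (fun () -> fib a) in
  let db = Domain.spawn (fun () -> fib b) in
  (Domain.join da, Domain.join db)

let () =
  let x, y = parallel_fibs 25 30 in
  Printf.printf "fib 25 = %d, fib 30 = %d\n" x y
```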


What is the most fun and best source to learn OCaml for a programmer?

I know this one [0] so far. Also the famous Coursera PL course [1] covers ML.

[0]: https://cs3110.github.io/textbook/cover.html

[1]: https://coursera.org/learn/programming-languages


Check out Real World OCaml too:

https://dev.realworldocaml.org/


There are a few OCaml contributors lurking and happy to answer questions if you have them.


As an Elixir (which steps on Erlang) and Rust dev I'm curious if you think OCaml 5.0 / Eio will give Erlang's BEAM VM and Rust's tokio a run for their money in the parallel runtime space.

I'm super curious about OCaml, picked it up and left it several times in the last 3 years. Now that multicore is here I'll absolutely be picking it up again and try to use it for parallel scripting and for some of my work. Great job!


We certainly hope so. You may find Thomas Leonard's talk from last year's workshop interesting: https://watch.ocaml.org/videos/watch/74ece0a8-380f-4e2a-bef5...

The paper we wrote on retrofitting effect handlers: https://arxiv.org/abs/2104.00250 also has some http benchmarks


Are the plans for typed algebraic effects solidifying, or are they still nebulous? Concretely, are you willing to take a guess as to when we are expected to see OCaml 6? ;-)


Rather than a full effect system, we're very likely to have lexically scoped "checked" effects with the help of modal types. I briefly talked about it at the end of my ICFP keynote: https://icfp22.sigplan.org/details/icfp-2022-papers/48/Retro...

There is other cool stuff being worked on, which I am very excited about: https://discuss.ocaml.org/t/jane-street-compiler-development.... Hopefully, we will see many of these make it into OCaml 6.


Thanks for the reply. I hope that the array and list comprehensions land soon in upstream; it's a useful and hopefully not-too-controversial feature.

I'm more ambivalent regarding the local allocations and the unboxed types. I totally understand why they'd be useful when you are trying to squeeze every last drop of performance, but they do require a not-so-trivial complexification of the language.


The local types are less invasive than the full support for typed effects. In particular, they are opt-in and associated complexity is pay-as-you-go. In my initial experiments, they seemed pretty nice to program with.


The type system for algebraic effects is still in the research and design phase at this point.

Right now, I am not even taking a guess of what will be the defining new major features of OCaml 6 (effect system + modular implicits maybe? Maybe not?).


Thanks for the reply. I'm hoping that modular macros land soon. I'm very ambivalent about the PPX mechanism, and I hope that modular macros reduce the need for PPX.


How does "bounding races in space and time" impact the generated machine code? Do you have some interesting examples of how it's different from the code C++ or JVM JIT would generate? What's the performance impact? Thanks!


See examples in the second section and the results in the paper: https://kcsrk.info/papers/pldi18-memory.pdf


Do you mean section 2 of the paper? It only contains C-like pseudo-code, no machine code.

The results section of the paper only compare the performance of Multicore OCaml with plain OCaml, not OCaml vs C++ / Java.


Right. It is hard in general to compare performance or generated code across languages as they make different trade-offs. Tables 3 and 5 show the compilation of our memory model to x86 and ARMv7. The compilation and related work sections discuss the trade-offs.

To summarise, atomic reads and writes in OCaml are more expensive than in C++, since C++ SC atomics only establish a total order between SC operations, and not between operations that are weaker but related to SC operations by happens-before. We need stronger atomics for local data race freedom.

Our non-atomics are also stronger than the non-atomics in C++ and Java. That said, the non-atomic operations are free on x86 (compiled to plain loads and stores) and involve a lightweight fence before stores on weaker architectures such as ARM, Power and RISC-V. The performance results show that the extra fences before stores have a barely noticeable impact on ARM and Power.
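
At the source level, the interplay between atomics and non-atomics can be sketched like this, assuming OCaml 5's `Atomic` and `Domain` modules. The function name is hypothetical, and this shows the programming pattern rather than the generated machine code. A plain write to `data` followed by an atomic write to `ready` is visible to any domain that first reads `ready` atomically and then reads `data`.

```ocaml
(* Message passing: publish a plain value through an atomic flag. *)
let message_passing () =
  let data = ref 0 in               (* non-atomic location *)
  let ready = Atomic.make false in  (* atomic flag *)
  let consumer =
    Domain.spawn (fun () ->
      (* Spin until the producer has set the flag. *)
      while not (Atomic.get ready) do Domain.cpu_relax () done;
      (* This read happens after the producer's write to data. *)
      !data)
  in
  data := 42;
  Atomic.set ready true;
  Domain.join consumer

let () = Printf.printf "consumer saw %d\n" (message_passing ())
```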


    $ opam update
    $ opam switch create 5.0.0 --repositories=default
    $ eval $(opam env)
    $ ocaml
    OCaml version 5.0.0
    Enter #help;; for help.


Been dabbling in F# lately, but this has me wanting to give OCaml a try for comparison. Very interested in the new effects stuff.


I wish OCaml had something like F#'s lightweight syntax: https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...



Note that The Whitespace Thing actually preceded F#. (Mike and I were in the same fraternity, I remember talking to him about TWT, and we haven't talked much since our college days, which pre-date F#.)

IIRC, Mike was also doing a bunch of stuff with a Jabber client in OCaml around that time.


I'd never seen the verbose syntax for f#, I thought you had to write it the whitespace dependent way. Huh.


Lightweight syntax is a pain to use for blind people. Glad OCaml hasn't fallen for that fad.


Do blind people use special code editors? If so, they should probably have a whitespace sensitive mode to make this easier.


ReasonML or ReScript not your cup of tea? They are different syntactical front-ends for OCaml.


OCaml syntax is whitespace-insensitive. You don't need to worry about indentation, copy-pasting code snippets, etc. I think that makes for a better experience especially for beginners when the parser is more forgiving.


This so-called lightweight syntax gives me anxiety just scrolling through these examples.


Yes, I started in OCaml but switched to F# for the last 3 years and will likely move back to OCaml when effects are mature (when lwt and async are snuffed out).

Being able to use C# libs is very nice but also it sucks that I don't have true null safety.

Who knows, maybe F# is always more performant than OCaml and it doesn't make sense to switch back.


> when effects are mature (when lwt and async are snuffed out).

That's actually on the way right now. The main concurrency library which is designed for OCaml 5, eio, internally uses effects but doesn't require its users to be exposed to them. Users write code in direct style and don't need to care about internal implementation with effects.
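
To show what "direct style" means underneath, here is a minimal sketch using only the standard library's `Effect` module from OCaml 5.0. This is not Eio's API, and the `Ask` effect and function names are made up for illustration. The computation is written as ordinary code, and the handler decides what each `perform` returns.

```ocaml
(* A hypothetical effect that asks the enclosing handler for an int. *)
type _ Effect.t += Ask : int Effect.t

(* Direct-style code: no monads, no callbacks, just perform. *)
let computation () = Effect.perform Ask + Effect.perform Ask

(* Run the computation, answering every Ask with the given value. *)
let run_with (value : int) =
  Effect.Deep.try_with computation ()
    { Effect.Deep.effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask ->
            Some (fun (k : (a, _) Effect.Deep.continuation) ->
              (* Resume the suspended computation with the answer. *)
              Effect.Deep.continue k value)
          | _ -> None) }

let () = Printf.printf "%d\n" (run_with 21)
```

A library can keep a handler like this internal, so its users only ever see plain function calls.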


F# is .NET, so you can have pretty fine grained control over memory layout, right? Spans, packed structs, 8/16/32/64 bit signed/unsigned ints, etc.

That stuff all gets a bit screwy in OCaml: an int has 31 bits of information but is 32 bits in size, a record with two int64s ends up being twice the size you expect, etc.


Still no native Windows build. To install on Windows, I guess your best bet is via WSL (Windows Subsystem for Linux).

I now run mainly on Windows, and this is an issue for me when it comes to trying OCaml.


opam 2.2's release cycle has fallen a bit behind the compiler's (actually because of the Windows support). It's an experimental branch, but this works with opam-repository-mingw to get a vanilla mingw-w64 build of OCaml 5.0.0:

opam switch create 5.0 --repos=dra27=git+https://github.com/dra27/opam-repository#windows-compilers --packages=ocaml.5.0.0,ocaml-option-mingw


I saw some kind of other installer that might come with either MSYS2 or Cygwin I forget which, but it also said that installer takes 2 hours to run. They seem to strongly recommend WSL over it based on how I read things.


Do I misremember? I thought that was a goal for 5.0.


Not for 5.0; the aim of 5.0 was to focus on getting multicore support out of the door. Thus only mingw64 is supported for Windows. Support for MSVC will come with later versions. At the same time, improving opam ecosystem support for Windows is one of the major goals of opam 2.2. Thus hopefully the support for native Windows will improve in the future (next year?).


Thank you, that's good.


Jane Street must be so happy today! I think this opens up a whole new host of fast applications on ocaml.

I wonder how this will affect ReScript


Presumably, in cases where performance really matters, they're using single-threaded processes pinned to cores and shared memory ring buffers to communicate across processes. It ends up looking not that much different from a high performance C++ project, except for the lack of full address space sharing.

Modern MMUs don't require full TLB flushes when switching address spaces, and physically tagged cache lines allow the ring buffers to be shared in cache even if they're mapped at different virtual addresses in the different processes. Also, you're guaranteed to not have false sharing, and there's zero contention for locks within malloc/free.

I don't mean to suggest there are no advantages to full address space sharing, but they're both fewer and less than one would initially assume.


OCaml has a reputation for being fast.

https://sixthhappiness.github.io/articles/python-scheme-and-...

Does anyone have any multicore benchmarks that illustrate performance increases in 5.0?


Great, so, can someone send me a PR with an OCaml implementation of "my"/our GC-content benchmark (a simple string processing benchmark counting the fraction of G:s and C:s in DNA sequences, compared to all the A:s, C:s, G:s and T:s)?

https://github.com/samuell/gccontent-benchmark

:D
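
Not a PR, but the core computation as described might be sketched like this. It is a single-threaded illustration that works on an in-memory string, the function name is made up, and it does not attempt to follow the benchmark repository's input format or rules, which are not spelled out here.

```ocaml
(* Fraction of G and C among the A, C, G and T bases of a sequence.
   Any other character (N, newlines, lowercase, ...) is ignored. *)
let gc_fraction seq =
  let gc = ref 0 and at = ref 0 in
  String.iter
    (function
      | 'G' | 'C' -> incr gc
      | 'A' | 'T' -> incr at
      | _ -> ())
    seq;
  let total = !gc + !at in
  (* Avoid dividing by zero on input with no bases at all. *)
  if total = 0 then 0.0 else float_of_int !gc /. float_of_int total

let () = Printf.printf "%.3f\n" (gc_fraction "GATTACAGC")
```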


Here are some parallel benchmarks from the Sandmark continuous benchmarking service: https://sandmark.tarides.com/?app=Parallel+Benchmarks&parall...


Amazing! An incredible amount of effort! This was holding back OCaml for almost as long as I can remember, and now it's just gone.

Congratulations to the team!

What are the next plans for the project? Spread newly available features..?


There's definitely some work to build atop the new functionality available in 5.0 and make sure there's plenty of good learning material.

In terms of the compiler and runtime development, the OCaml and ML Workshops at ICFP in October have videos that cover some of the experimental work happening: https://watch.ocaml.org/video-channels/ocaml2022/videos and https://www.youtube.com/playlist?list=PLyrlk8Xaylp7f8T7L5SFF...

There's also a compiler development newsletter that's posted on discuss.ocaml.org at regular intervals, which details some of the other work happening: https://discuss.ocaml.org/t/ocaml-compiler-development-newsl...


Amazing work from the team! I wonder if this is actually the first mainstream language which has managed to remove its "global lock" without breaking changes ?


To clear up any misconception, out of the box OCaml will behave like OCaml 4 with a single domain and a "domain lock". Programs currently using multiple threads for concurrency will remain single-core for the time being, as they will need to opt-in to parallelism features. In this sense, adding parallelism to OCaml does not break existing programs, but they still might have to be audited for thread-safety depending on how they want to use parallelism. There is no magic.


Just to add to the sibling comment. To maintain backwards compatibility, OCaml 5 has both threads and domains.

Threads belong to a domain and only one thread can hold the runtime lock for the domain. This is the same behaviour as in OCaml 4.

With OCaml 5 you can have as many domains as you want, though we recommend no more than you have cores.
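
As a rough sketch of sizing the number of domains to the machine, assuming the standard `Domain.recommended_domain_count` function from OCaml 5.0 (the `parallel_sum` helper is made up for illustration): the current domain takes one chunk of the work, and each remaining chunk gets one spawned domain.

```ocaml
(* Sum 1..n, split across at most the recommended number of domains. *)
let parallel_sum n =
  (* Never use more domains than the runtime recommends, or than items. *)
  let domains = max 1 (min n (Domain.recommended_domain_count ())) in
  let chunk = (n + domains - 1) / domains in
  (* Sum the k-th chunk; an empty range (lo > hi) contributes 0. *)
  let sum_chunk k =
    let lo = k * chunk + 1 and hi = min n ((k + 1) * chunk) in
    let acc = ref 0 in
    for i = lo to hi do acc := !acc + i done;
    !acc
  in
  (* The current domain handles chunk 0; the rest are spawned. *)
  let workers =
    List.init (domains - 1) (fun j ->
      Domain.spawn (fun () -> sum_chunk (j + 1)))
  in
  let mine = sum_chunk 0 in
  List.fold_left (fun total d -> total + Domain.join d) mine workers

let () = Printf.printf "%d\n" (parallel_sum 1_000_000)
```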


This backwards compatibility decision to separate threads from domains has been very useful and allows existing code to be "ported" to 5.0 gradually: first just fix up C bindings to avoid naked pointers, and then the code can safely run on OCaml 5 just as before.

And once a program (and all its dependencies) has removed its dependence on global state, it can opt in to multicore by spawning additional domains.


Is it still possible to use Reason as a frontend for OCaml 5?

I felt an ergonomic and modern syntax is the only missing piece in OCaml.


There's nothing "modern" about the C-style curly brace and semi-colon syntax in Reason. In fact, the rise of Python and its lightweight syntax means that C-style languages are starting to look dated.


Indentation-sensitive syntax is totally obsolete in a world with autoformatters. In addition, with this kind of syntax you also lose the ability to have your code autoformatted in certain situations. Here's a trivial example to illustrate the point:

    def example():
        x = 5
    print("Hello world")

What's the mistake here? Depending on whether the print is part of the function, it should either be indented or have a newline before it. The point is you (and any formatting tool) can't know what the horizontal alignment of this code should be just by examining the vertical line order. You can only determine this by knowing (or reanalyzing) the semantics of the code. During a refactor where you're moving around lots of code, this can be a significant PITA. However, in the JS example,

    function example() {
        let x = 5
    console.log("Hello world")
    }

it's unambiguous what the mistake is because you can determine the correct formatting entirely from the line order, without having to know anything about the code's semantics.


This is a really good point, but one still doesn't need curly braces and semis. Despite having a lightweight syntax, OCaml isn't indentation sensitive, so doesn't suffer from this. The Ocamlformat tool also works really well.


> OCaml isn't indentation sensitive, so doesn't suffer from this

True, I was just making the point that the lightweight syntax of indentation-sensitive languages like Python can be problematic. I don't know how OCaml is able to achieve this style of syntax without the use of significant indentation, but it is impressive.
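
A small sketch of how that works (the function names are made up for illustration): scope is delimited by keywords such as `let ... in` and by `;` sequencing rather than by layout, so the two definitions below parse identically even though the second is badly indented.

```ocaml
(* Conventional layout. *)
let example_a () =
  let x = 5 in
  print_endline "Hello world";
  x

(* Same tokens, arbitrary indentation: the parser does not care. *)
let example_b () = let x = 5 in
        print_endline "Hello world";
  x

let () = Printf.printf "%d %d\n" (example_a ()) (example_b ())
```

An autoformatter can therefore recover the first layout from the second without knowing anything about the code's meaning.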


Strong disagree from me

Adding spurious curly braces to make OCaml look more like JS is not 'ergonomic and modern' syntax, it's a step backwards if anything (though I can appreciate there is some sense to that in context of a compile-to-JS language for frontend dev)


Great news! Kudos to everyone that helped make this happen.


Christmas came slightly early this year.


The Christmas presents will be all the issues and bugs that will be discovered between now and Christmas eve in the shiny new multicore runtime.


I started learning OCaml and I stopped when I heard about the lack of multicore support (it felt like it wasn't ready for general-purpose use). This is an awesome development. Of all the functional languages I've sampled so far, OCaml is the most intuitive. Hoping to finally get into it in 2023.

Great work, OCaml devs!


Did you compare it to F#? I’ll probably try to learn an ML language next year and without doing much research F# looks really nice since it gets all the .Net benefits.


F# is nicer than the alternatives if you need to be (or want to be) in the Microsoft ecosystem, but there are some things from OCaml that I miss when I use it, like modules and functors. It also doesn't support named arguments, which are a fairly trivial thing that drastically cuts down the bug space and increases usability for me.

I also don't think F# is as fast as OCaml?

Fairly relevant piece of culture on why a company switched from OCaml to F# https://blog.darklang.com/new-backend-fsharp/
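
For readers who haven't seen OCaml's named (labelled) arguments, here is a minimal sketch. The `transfer` function and its labels are made up for illustration. Labelled arguments can be passed in any order, which rules out the classic bug of swapping two arguments of the same type.

```ocaml
(* Both accounts are strings, so positional arguments would be easy to swap. *)
let transfer ~from_account ~to_account amount =
  Printf.sprintf "%d: %s -> %s" amount from_account to_account

(* The call site names each argument, so order no longer matters. *)
let a = transfer ~from_account:"alice" ~to_account:"bob" 10
let b = transfer ~to_account:"bob" ~from_account:"alice" 10

let () = print_endline a
```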



I would expect F# to be faster than OCaml, given that it doesn't box floats, and is backed by a JIT that proactively does monomorphization of generics.


I doubt very much that F#/.NET is faster than OCaml. The latter is very fast, and the CLR's runtime generics have runtime costs.


Can you give examples of the latter? I know from personal experience (of looking at disassembly of JIT output) that, when CLR generics are instantiated with structs, it is perfectly capable of full monomorphization with inlining of stuff similar to C++ <algorithm>.


Yes, structs are generally very good because they are mostly monomorphized. I haven't tested lately if structs of the same size but different types will fully monomorphize or if they share code and so require a small dispatch table. There was a CLR release awhile back where they discussed sharing code in this way to reduce code bloat.

There are still a few pitfalls. Off the top of my head, accessing generic static fields, i.e. static class Foo<T> { public readonly static T Member }. Mainly for non-struct members, this typically requires a hashtable-like lookup to resolve the offset for the field.

Also, generic interface dispatch can't be monomorphized, ie. interface IFoo<T> { void Method<T>(T value); }, so this too costs more than a regular virtual call because it too requires a hashtable-like lookup to resolve the generic overload to invoke.


It’s faster in some ways and slower in others. But in general this is “very fast” vs. “very fast”. You won’t find yourself wanting in web service performance in either language, for example.


I wouldn't expect F# to be faster in anything actually, speaking as a .Net developer. OCaml is very well optimized and the abstract machine was well designed to have an efficient execution. Do you have any specific examples where the CLR is better?


Yes, you can find several benchmarks on the language benchmarks game, for example. And I would expect any situation where you use Spans in F# to outperform almost everything outside of native code (just like with C#).

Have you compared the two before?


I recall OCaml wasn't particularly fast with anything involving floats, IIRC because they often require boxing?


It's complex [1]: OCaml does have an unboxed representation of floats in arrays [2] and records (provided all fields are floats), but elsewhere they are indeed boxed.

[1] https://discuss.ocaml.org/t/optimizing-small-vector-operatio... [2] https://v2.ocaml.org/api/Float.Array.html
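
A minimal sketch of the two unboxed cases described above, assuming a reasonably recent standard library that has `Float.Array`. The `vec3` type and the function names are made up for illustration.

```ocaml
(* A record whose fields are all floats is stored flat:
   the three floats live inline, not behind pointers. *)
type vec3 = { x : float; y : float; z : float }

let dot a b = (a.x *. b.x) +. (a.y *. b.y) +. (a.z *. b.z)

(* Float.Array.t is a flat array of unboxed floats. *)
let sum_floatarray (a : Float.Array.t) : float =
  Float.Array.fold_left ( +. ) 0.0 a

(* Build a float array of the first n squares. *)
let squares n = Float.Array.init n (fun i -> float_of_int (i * i))
```

Whether intermediate floats in a given function end up boxed depends on the compiler and flags, so treat this as an illustration of the data layout only, not a performance claim.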


Good point. Polymorphic code that doesn't reduce to an unboxed float representation can be slower because of the boxing, but I don't think such code is very common exactly for this reason. I wonder if OCaml developers have tried NaN-boxing to see how it would impact performance.


I like F# and I think it's a great language. But I also think it will always be second class for Microsoft.


Maybe so. But it benefits from the larger ecosystem that it can piggyback onto. This is often important to people selecting a language bound by constraints in their product, company, etc and can't be dismissed.

.NET/JIT/GC improvements, improvements to core APIs, enterprise libraries/SDKs, etc. C#'s improvements do tend to flow into F#, if not from a language/syntax perspective then from an ecosystem perspective. An upgrade of the .NET version, for example, benefits an F# developer greatly even if no work is done on the language at all. A performance improvement in, say, ASP.NET Core benefits many of the F# web frameworks too. A language/tool is more than just its syntax: you need to learn the libraries, package management, build tools, etc. as well, and be confident of their long-term support/improvement. All dimensions are important.

At this stage F# does have broader technology support and interoperability as a result of this "second class" status. Whether this matters depends on the use case, company, and engineering resources at hand. For a language, being second class and riding the tailwind may be better than standing on its own two feet. Right now, in my context, I could use F# for my company's apps and not hit too many blockers; I probably couldn't use OCaml given the technologies we use day to day.


Sure, but does that really matter? Second class doesn't mean they aren't investing in it... and some of the coolest use cases for F# (Fable ecosystem) fall outside of MSFT's scope entirely.

It's a small community with great libraries/support, and you can generally bet that the folks who are excited by it and want to use it are pretty strong technically.


I'd say a significant portion of F# users are using Fable to generate JavaScript. F# can compile to Rust and Python as well. https://github.com/ncave/fable-raytracer


Fable -> JS is our main use case for F#; having had to hire for this, I wouldn't say it's accurate. Most are still focused on the .NET implementation side of things. Compilation to Python is nice, I've yet to figure out why we wouldn't just use Rust rather than F# -> Rust... Rust is a great language and it feels a little silly to abstract over it.


I haven't tried F# because I am already having to work with powershell and C# and that whole .net/visualstudio ecosystem. The last thing I will look at on my personal time is a language in that ecosystem but since you recommended it I will have a look.


Since we have a Unity product, I’m moving us to an F# backend! I actually quite like it since we’re building a typical web api for a game.

You don’t have to deal with the typical need to determine what packages to use as you would in OCaml. I have ASP.NET and co. I’m okay with OOP leaking here and there when I get most of what I need already: an ML tool with a huge ecosystem.


My interest is in (soft) real-time music systems. I'm curious if anyone can comment whether this will make OCaml a potential language in that domain. I've held off on Haskell because of the whole "you might not know how long this will take", which is a non-starter for music. I mostly use Scheme and C (and Max/MSP but with custom C/Scheme in it), but I'd love to use something other than C for the low level stuff.

Wondering if anyone can comment on whether this might mean OCaml can be a contender in that space now?


You can tune the behavior of OCaml’s GC. See the discussion of best-fit/next-fit/first-fit allocation here:

https://dev.realworldocaml.org/garbage-collector.html
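
As a rough sketch of what tuning looks like in code, the standard `Gc` module lets you read and update the runtime's control record. The helper name `set_space_overhead` and the value 200 are only illustrative assumptions; which settings help (and which are honored by a given runtime version) depends on your workload and OCaml release.

```ocaml
(* Update one field of the GC control record, leaving the rest as-is.
   A larger space_overhead trades memory for less major GC work. *)
let set_space_overhead (percent : int) : unit =
  Gc.set { (Gc.get ()) with Gc.space_overhead = percent }

(* Read the current setting back. *)
let current_space_overhead () : int = (Gc.get ()).Gc.space_overhead

let () = set_space_overhead 200
```

For soft real-time use you would want to measure pause times under a realistic load rather than rely on any particular setting.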


good to know, thanks!


What is the most fun and best source to learn OCaml for a programmer?

I know this one [0] so far. Also the famous Coursera PL course covers ML.

[0]: https://cs3110.github.io/textbook/cover.html


this was being talked about when i was still in high school and last year i did my masters


The road to multicore OCaml was indeed longer and harder than expected. At the end of the day, the constraint of trying to preserve the behavior of almost all existing programs ended up driving a majority of design choices. For instance, this required at least one major rewrite of the multicore runtime along the way.


Is there a guide on how to use the new constructs? The link to the 5.0 docs was broken on the site, and after manually fixing the URL all I found was some type annotations in the `Effects` module.


The link to the Effect section of the manual at https://v2.ocaml.org/releases/5.0/manual/effects.html works for me and should contain a higher-level description of what effect handlers are. Which link was broken for you?
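
For a quick taste before reading that chapter, here is a minimal sketch using the `Effect` and `Effect.Deep` modules from the 5.0 standard library. The effect `Ask` and the functions `computation` and `run_with` are made-up names for illustration, and the interface is documented as subject to change.

```ocaml
(* Silence the "unstable" alert attached to the Effect module, if enabled. *)
[@@@alert "-unstable"]

open Effect
open Effect.Deep

(* Declare an effect that, when performed, yields an int. *)
type _ Effect.t += Ask : int Effect.t

(* Code that performs the effect without knowing who handles it. *)
let computation () = perform Ask + perform Ask

(* Run the computation, answering every Ask with [value]. *)
let run_with (value : int) : int =
  try_with computation ()
    { effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask ->
            (* Resume the suspended computation with the answer. *)
            Some (fun (k : (a, _) continuation) -> continue k value)
          | _ -> None) }
```

Calling `run_with 21` resumes both `perform Ask` sites with 21, so the handler behaves a bit like a dynamically scoped reader.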


I also didn't think to look in the "language extensions" section of the manual, instead going to the API docs and clicking on the "Effect" module.

This looks much more interesting, on a skim. Thanks!


The "Manual" link on this page https://ocaml.org/releases is broken


This is fixed now, thanks!


Make sure to check out the new chapter on parallelism too that is linked from the release notes: https://v2.ocaml.org/releases/5.0/manual/parallelism.html
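
A minimal sketch of the basic building block from that chapter, `Domain.spawn` and `Domain.join`. The `fib` and `parallel_fibs` names are just illustrative.

```ocaml
(* A deliberately slow function so there is some work to parallelize. *)
let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Compute two results on two separate domains, then wait for both. *)
let parallel_fibs a b =
  let d1 = Domain.spawn (fun () -> fib a) in
  let d2 = Domain.spawn (fun () -> fib b) in
  (Domain.join d1, Domain.join d2)
```

Domains are heavyweight (roughly one per core is the usual advice), so for many small tasks a scheduler library on top of domains is probably a better fit than spawning directly.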


The Effects module is kind of low level right now, as I understand it. You should like at Eio for a library that gives you nice fibers and non-blocking IO on top of effects! It's a neat library.


> like at Eio

Look at Eio here: https://github.com/ocaml-multicore/eio


IIRC, while they added the underlying language support for effects to this version to get the multicore support working, the standard library support for effects isn't really ready yet, so you may want to hold off another year or so.

I think eventually it will hopefully turn into something like what languages like purescript have which would be really cool.

(I've only used ocaml a tiny bit and use f# a lot more but I keep periodically checking the status of this because it's something that would make ocaml a lot more interesting to me.)



No way, it's finally out! This is fantastic news.


Multicore is there, STM instead of mutable imperative logic in all the libraries developed for the past two decades: not so much though.


Still early days, but I had done some exploratory work in the past on Reagents, a composable lock-free library [1]. Now that OCaml 5 is released, we're reviving this work.

Its semantics are weaker than STM's: unlike STM, it doesn't provide serializability. But Reagents can compile down to multi-word compare-and-swap operations, which can be implemented with the help of hardware transactions (when present) or efficient software implementations of it [2]. Hence, Reagent programs should be faster than STM.

[1] https://github.com/ocaml-multicore/reagents [2] https://arxiv.org/pdf/2008.02527.pdf
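
To be clear, the following is not the Reagents API. It is only a sketch of the single-word compare-and-swap primitive from the standard `Atomic` module, which is the kind of building block such libraries compose. The `push` and `pop` names are made up; the structure is a simple Treiber-style stack.

```ocaml
(* Push onto a shared stack: retry until our CAS wins the race. *)
let push (stack : 'a list Atomic.t) (x : 'a) : unit =
  let rec loop () =
    let old = Atomic.get stack in
    if not (Atomic.compare_and_set stack old (x :: old)) then loop ()
  in
  loop ()

(* Pop from the shared stack, or return None if it is empty. *)
let pop (stack : 'a list Atomic.t) : 'a option =
  let rec loop () =
    match Atomic.get stack with
    | [] -> None
    | x :: rest as old ->
      if Atomic.compare_and_set stack old rest then Some x else loop ()
  in
  loop ()
```

Each operation here touches one memory location; atomically updating several locations at once is where multi-word CAS comes in.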


I haven't heard anyone talk about STM for OCaml, funny. People talk about, or work on, lightweight fibers, lockfree data structures, io_uring, etc. but not STM. Is it falling out of fashion? Even in clojure I hear that few people actually use it.


It works great in Haskell. The problem is that it really needs typed effects, hopefully OCaml will get typed algebraic effects at some point.


I'm curious about what design decisions led to OCaml not having multi-threading when version 1.0 came out. Majorly impressive work getting something as complex as that added on afterwards. Kudos to everyone involved!


Caml 1.0 was released in '85 and OCaml (the O is for Object Orientation) was 1996. Multithreading wasn't a high priority for anybody back then. The JVM didn't even have threads until 1997 and those threads were green threads, OS threads came to Java later.


However, this is all Unix and academia based. If you wrote code for Windows and OS/2 in the commercial world, you were using threads since '89 and thus did not want to use the languages that did not use threads, e.g. Python, OCaml.


In the late 80s and early 90s we wrote things with threads but they were primarily a kind of convenience to get multitasking behaviour and not any kind of performance boost.

Multicore / multiprocessor systems were not a mainstream thing in consumer hardware until the 21st century.


It would be pretty difficult to write threaded code for Windows in 1989, since it didn't support threads until WinNT 3.1 (1993).

But even in late 90s it was still common for desktop Win9x apps to use the main window message loop for async processing (Win32 API itself heavily encouraged it at the time - e.g. that's how OS timers work) in lieu of threads.


That was much later, in the mid 90s. But who did have threads in the mid 80s was Erlang, which back then only existed in Ericsson's research lab. Ericsson had another internal language which Erlang drew inspiration from, which also had support for concurrency.


> I'm curious about what design decisions lead to OCaml not having Multi-threading when version 1.0 came out

You do know that C (edit - didn't have multithreading in the language spec until C11, right?)

It was common on languages of that era. Also Ocaml has had libraries for multithreading for many years, just like C has POSIX threads and things...


I thought C11 had threads and atomic and concurrency primitives?


Ah yeah my bad. Still like 30+ years after its creation though. And the vast majority of C software out there doesn't use C11 anyway.


What OSS is out there that uses OCaml?




One that I've found pretty helpful is comby: https://comby.dev/

It's a code-syntax aware large-scale search-and-replace tool. E.g.,

    comby -matcher .scala -review 'foo(:[x], :[y])' 'foo(:[x])'

This will search in the current directory tree for all files that contain the code pattern 'foo(x, y)' and replace it with 'foo(x)', using Scala syntax rules. It's super convenient for doing large-scale codemods. E.g. https://github.com/tinymce/rescript-webapi/pull/40



FFTW is written in C, but it used a FFT compiler written in OCaml to generate snippets of C code. [1]

[1] https://www.fftw.org/fftw3_doc/Generating-your-own-code.html


A lot of projects also started on ocaml and then later moved off of it once they had succeeded by showing the concept works and got some momentum going. People like it for exploratory compiler dev, then switch off it when their language can self host. IIRC both rust and elm started like this, certainly others as well.



nitpick: Elm was and is written in Haskell.


Oh you're right. Now that I think about it elm is more stylistically similar to haskell too. Last time I used elm I had never played with haskell so I probably just assumed it was related to ocaml which I did know.


The XAPI toolstack is written in OCaml: https://xenproject.org/developers/teams/xen-api/

I've recently completed bugfixing/testing on 4.14.1+no-naked-pointers, and 5.0 compatibility is not far behind (we're usually 1 or 2 compiler versions behind latest, e.g. current production releases are built using 4.13.1)

Disclaimer: I work on the XAPI project as part of my job, the project itself is >15 years old at this point.


One that I use is the unison sync tool


Anyone know if OCaml was used at FTX, btw? Given that it's big at Jane Street, where SBF worked.


Probably not, given that their public sample code[1] has C++, Go, and Python.

[1]: https://github.com/ftexchange/ftx


They did not. They were a Python and React shop, with efforts underway to rewrite in Rust.


Python for trading, yikes

Would love to see that codebase

Then again, Stripe has been wildly successful with Ruby

Albeit they’ve had type checking with sorbet for years


OCaml syntax is a little hard to read for a plebeian like me.

What are the use cases for OCaml?


One of the top use cases for OCaml is to build interpreters and compilers for new languages. Its syntax and type system make it very natural to navigate tree-structured data, such as ASTs. It has a great parser generator library, Menhir, as well as an up-to-date LLVM API. The first version of Rust was written in OCaml.
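
To give a feel for the AST point, here is a toy sketch: a variant type for arithmetic expressions, an evaluator, and a single simplification pass, all written with pattern matching. The `expr` type and the function names are invented for this example.

```ocaml
(* A tiny expression language as a variant type. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* Evaluate an expression by recursing over its shape. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg e -> -(eval e)

(* One top-down pass of algebraic simplification (not a fixpoint). *)
let rec simplify = function
  | Add (Num 0, e) | Add (e, Num 0) -> simplify e
  | Mul (Num 1, e) | Mul (e, Num 1) -> simplify e
  | Mul (Num 0, _) | Mul (_, Num 0) -> Num 0
  | Add (a, b) -> Add (simplify a, simplify b)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | Neg e -> Neg (simplify e)
  | Num n -> Num n
```

The compiler checks that every constructor is handled, so adding a new node type to `expr` flags each function that needs updating.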


One famous example (from 1997) is to lay out the rules for generating fast Fourier transform code (for any input size, not just powers of two) and have correct, efficient C code generated.

[1] https://www.fftw.org/

[2] https://en.wikipedia.org/wiki/FFTW


What's the current status of Esy? https://github.com/esy/esy

Any plans to backport its design back to Opam?


Hallelujah! I think they were at it for more than a decade.


What an amazing ending of 2022!

Just Hotwire Strada left.


Does it support Apple Silicon Macs?


Yes.


Congrats! Super excited to start playing around with the effects system.


Meta: I believe the title should be "OCaml 5 is out!", as it appears on the OCaml site.

At first glance, I understood the title "OCaml 5.0 Multicore is out" to mean "Multicore is no longer part of OCaml 5.0".


Great news! Kudos to everyone that helped make this happen.


Oh congratulations! Well done to all involved!


any easy way to try it on windows (not wsl) ?


It's from an experimental branch, so not very easy, but this works with opam-repository-mingw to get a vanilla mingw-w64 build of OCaml 5.0.0:

opam switch create 5.0 --repos=dra27=git+https://github.com/dra27/opam-repository#windows-compilers --packages=ocaml.5.0.0,ocaml-option-mingw


Thanks I will have a look



