Multicore OCaml: September 2020

adenozine · on Oct 8, 2020

Looks like 2021 is shaping up to be a fantastic year for OCaml. I know I've been digging into it, trying to sharpen my skills in FP.

I think OCaml's learnability is a huge asset. I'm surely biased: my main skillset is in Python, so of course low friction in a new language matters more to me than it would to, say, a Haskell or Scala user. The 2nd edition of Real World OCaml on Yaron Minsky's site is a great resource for me, though once multicore is actually stable and fully activated there will be still more learning to do.

It's a really nice change to embrace types so close to the top of the software process. Things like Mypy have been a godsend, and I've said in the past that Mypyc represents the biggest quantum leap that Python will ever make, but having such a mature and FAST compiler, with a syntax that's clean and consistent on top of a huge, mature ecosystem (not as huge as pypi, ofc) is just such a powerful value proposition.

Aside from Multicore, I think OWL (ocaml.xyz) is a big leap forward for the ecosystem too. I can see some finance shops putting together Ocaml-Jupyter notebooks and that sort of thing. Maybe bio-tech researchers with FP backgrounds could get more into it.

This is all just my opinion, but having leaned pretty heavy on Python's ecosystem for a pretty long period of time, I am VERY optimistic about finding ways to replace things with Ocaml for speed and correctness. I do enjoy Python, but I think the typed-Python renaissance is about 3 years too late now. It'll win more battles in the years to come, but I see war on the horizon, and not just with Ocaml either.

chrischen · on Oct 9, 2020

We've been converting our code from Typescript to Ocaml/ReasonML. The genType project from the Reason team makes it possible to gradually integrate the ReasonML codebase into our existing Typescript project because it generates fully typed Typescript code and not just plain JS. And best of all, if something is hard to do in Ocaml/ReasonML we can actually just use a nodejs package.

sandeepc24 · on Oct 9, 2020

You might be interested in F# https://fsharp.org/

adenozine · on Oct 9, 2020

The .NET ecosystem is too heavily built for C#. The documentation is hard to use, and nearly nonexistent for F#. I looked at it before I started digging into OCaml. I interact with Linux far more than anything else, so I think OCaml is just going to be more useful to me, more often.

I really admire Don Syme though. He's one of the most intelligent people in software, by my reckoning. Literally a brilliant genius in a field littered with mediocre hacks that work real hard to ship things. He's a real legend.

Thanks for the suggestion though. Maybe some folks will come across it for the first time now. It's a well designed syntax, and has some really interesting features. Units of Measure in particular is a really cool thing I've needed before, I usually just use data classes and decorators in python to do a lookup, and just pay the performance penalty. It's neat to have that info at compile time for free.
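
A minimal sketch of that kind of runtime-checked approach in Python, assuming a hypothetical `Quantity` dataclass (the name and fields are illustrative, not taken from any library). The unit is compared on every addition at run time, which is the cost a compile-time check such as F#'s Units of Measure avoids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit name (hypothetical helper type)."""
    value: float
    unit: str

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # The unit check happens at run time, on every single addition.
        if other.unit != self.unit:
            raise TypeError(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.value + other.value, self.unit)


total = Quantity(1.5, "m") + Quantity(2.0, "m")
```

Adding `Quantity(1.0, "m")` to `Quantity(1.0, "s")` raises `TypeError` only when that line actually executes, not before the program runs.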

sandeepc24 · on Oct 9, 2020

dotnet core runs on linux, mac and windows - just mentioning.

ducaale · on Oct 9, 2020

I was introduced to F# through the excellent fsharpforfunandprofit blog[1]. I wonder if there is a similar thing for other ML-like languages.

[1] https://fsharpforfunandprofit.com

clockworkfrogs · on Oct 9, 2020

https://blog.janestreet.com/ is fantastic.

Jane St are a trading firm doing some seriously heavy lifting in OCaml land, and they sometimes write up really interesting stuff they've done. It's less tutorial-like than fsharp for fun and profit, though.

lmm · on Oct 9, 2020

If you're willing to be judicious about which libraries you use then you can absolutely pick up Scala directly from Python - a lot of things you'd write in Python translate directly into Scala, particularly if you were already writing map/filter style code.

(I like OCaml too, don't get me wrong)

adenozine · on Oct 9, 2020

Oh yeah, for sure. I've seen some scala code that appeared quite palatable. I just meant that I don't know very much about writing it.

There's been some times where I needed python for systems that weren't so lax about what I was able to install, and python was already there. Other than that, I've never really come across a reason disqualifying scala, barring know-how and time to learn it.

lmm · on Oct 9, 2020

> There's been some times where I needed python for systems that weren't so lax about what I was able to install, and python was already there.

FWIW if a system has Java and Maven installed (i.e. any machine set up for Java) then you can use Scala there (indeed I'd say that's a much nicer way to use it than installing it at the system level). Unfortunately Scala tooling isn't really set up for starting quickly if you're not familiar with it, even though it's actually a pretty decent language for one-off scripts.

as_keyof_typeof · on Oct 8, 2020

You might find this interesting

> A conversation with Laurent Mazare about how your choice of programming language interacts with the kind of work you do, and in particular about the tradeoffs between Python and OCaml when doing machine learning and data analysis. Ron and Laurent discuss the tradeoffs between working in a text editor and a Jupyter Notebook, the importance of visualization and interactivity, how tools and practices vary between language ecosystems, and how language features like borrow-checking in Rust and ref-counting in Swift and Python can make machine learning easier.

https://signalsandthreads.com/python-ocaml-and-machine-learn...

adenozine · on Oct 9, 2020

I actually read a good chunk of this today!

My experience in software is pretty different than his, though I find his insights very welcome and quite valuable. I don't really have to build large systems, I work mainly with programs I'd describe as duct tape and wrenches. They're small and one-off, and the main goals are to compose them quickly and arrive at a stable behavior that I can write solid tests for. With ocaml, I think the way I'd write a lot of programs I use python for is roughly the same way, just piping a bunch of crap together and then cramming it into a test harness, but with ocaml I can have type safety and speed and a higher level of footgun to do work with. For me. I do weird work.

If I was a more typical software guy, I'd be able to relate a lot more.

jurip · on Oct 9, 2020

Signals and Threads has been a fantastic podcast thus far, I've really enjoyed all the episodes.

yaseer · on Oct 8, 2020

Awesome stuff. I'm an ocaml novice, but love the language. I've found its fusing of object oriented and functional styles far, far more elegant than scala.

wjsetzer · on Oct 8, 2020

This is unrelated to multicore, but Ocaml is a language I want to like. I wanted to learn OCaml with the make a lisp project. That is, until I realized it doesn't have Perl regex built in (yes, I have been spoiled by Python, which has practically everything in the standard library). The best way to get Perl regex was a rarely updated 3rd party library which was missing key features like lookahead and lookbehind.

jlrubin · on Oct 8, 2020

https://ocaml.janestreet.com/ocaml-core/109.55.00/tmp/re2/Re...

should do the trick! ocaml core is well maintained AFAIK

laylomo2 · on Oct 8, 2020

Here's a link to the latest documentation: https://ocaml.janestreet.com/ocaml-core/latest/doc/re2/Re2__...

philzook · on Oct 8, 2020

S expressions are one of the most used serialization formats in OCaml. You can get pretty far relying on the standard parsers and printers. It's what I would do for a lisp in OCaml. Maybe you wanted to DIY as a learning thing? There are other options for parsing including parser combinators or using Menhir/ocamllex which are interesting in their own right.

https://dev.realworldocaml.org/data-serialization.html

https://dev.realworldocaml.org/parsing-with-ocamllex-and-men...

djur · on Oct 9, 2020

This isn't really a good answer for someone who wants a regex engine. People use regexes for a lot of things where a full-fledged parser would be overkill, and I'm not sure what serialization has to do with anything.

bgorman · on Oct 9, 2020

I completed the make a lisp project in Reason (alternative syntax for OCaml) a few months ago.

I used the PCRE library, which has pretty much all the features you expect, and it is actively maintained. Note: the heavy lifting is done by C libraries.

https://opam.ocaml.org/packages/pcre/

If you want to see how I integrated it with the interpreter the code is here:

https://github.com/briangorman/reason-mal/blob/master/reader...

johnisgood · on Oct 9, 2020

Might be outdated (or not), but this page has tons and tons of examples, incl. lookahead: http://pleac.sourceforge.net/pleac_ocaml/patternmatching.htm...

And the GitHub page of the aforementioned library: https://mmottl.github.io/pcre-ocaml/

Joker_vD · on Oct 8, 2020

Were you going to parse S-expressions with regular expressions? I guess that saves you from learning how to write loops and conditionals, and what substr() is called (and what arguments it takes) in this new language, but isn't that against the point of learning the new language?

bgorman · on Oct 9, 2020

The make a lisp tutorial provides a PCRE regular expression to generate the tokens that are later fed into the reader.
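
As a rough illustration (in Python rather than OCaml or Reason, to match the other snippet in this thread), a tokenizer built on that regular expression might look like the sketch below. The function name `tokenize` is an assumption, and the pattern is copied from the one quoted further down the thread. Empty matches are dropped because the last alternative in the pattern is allowed to match nothing.

```python
import re

# Pattern as quoted in this thread: skip whitespace and commas, then capture
# one token (splice-unquote, special character, string, comment, or symbol).
TOKEN_RE = re.compile(
    r"""[\s,]*(~@|[\[\]{}()'`~^@]|"(?:\\.|[^\\"])*"?|;.*|[^\s\[\]{}('"`,;)]*)"""
)


def tokenize(text):
    """Return the list of non-empty tokens found in text."""
    return [m.group(1) for m in TOKEN_RE.finditer(text) if m.group(1)]


tokens = tokenize('(+ 1, "a b" ~@x) ; c')
```

Here `tokens` holds the parentheses, the symbols, the quoted string with its quotes intact, and the trailing comment as a single token.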

bjoli · on Oct 9, 2020

Now, I am a grumpy old fart, but I would suggest writing a one-pass recursive descent lexer+reader. That should be trivial for the MAL lisp. Writing an OCaml recursive descent parser should be amazingly straightforward, especially since you can just tail-call the different states.
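
A minimal sketch of what a one-pass recursive descent reader can look like, written in Python for consistency with the other snippet in this thread rather than in OCaml. All function names are hypothetical, and only lists, strings, integers, and symbols are handled; it is not the MAL reference reader. Each function takes the input and an index and returns the parsed value plus the index where it stopped, so lexing and reading happen in the same pass.

```python
DELIMS = set('()"; \t\r\n,')


def skip_blank(s, i):
    """Skip whitespace, commas, and ';' comments; return the next index."""
    while i < len(s):
        if s[i].isspace() or s[i] == ",":
            i += 1
        elif s[i] == ";":
            while i < len(s) and s[i] != "\n":
                i += 1
        else:
            break
    return i


def read_form(s, i=0):
    """Read one form starting at index i; return (value, next_index)."""
    i = skip_blank(s, i)
    if i >= len(s):
        raise SyntaxError("unexpected end of input")
    if s[i] == "(":
        return read_list(s, i + 1)
    if s[i] == ")":
        raise SyntaxError("unexpected ')'")
    if s[i] == '"':
        return read_string(s, i + 1)
    return read_atom(s, i)


def read_list(s, i):
    """Read forms until the matching ')' and return them as a Python list."""
    items = []
    while True:
        i = skip_blank(s, i)
        if i >= len(s):
            raise SyntaxError("missing ')'")
        if s[i] == ")":
            return items, i + 1
        item, i = read_form(s, i)
        items.append(item)


def read_string(s, i):
    """Read up to the closing quote; strings come back as ('str', text)."""
    chars = []
    while i < len(s) and s[i] != '"':
        if s[i] == "\\" and i + 1 < len(s):
            i += 1  # keep the escaped character verbatim
        chars.append(s[i])
        i += 1
    if i >= len(s):
        raise SyntaxError("missing closing quote")
    return ("str", "".join(chars)), i + 1


def read_atom(s, i):
    """Read an integer if the text parses as one, otherwise a symbol name."""
    start = i
    while i < len(s) and s[i] not in DELIMS:
        i += 1
    text = s[start:i]
    try:
        return int(text), i
    except ValueError:
        return text, i


ast, _ = read_form('(+ 1 (* 2 3) "hi") ; done')
```

The nested list in `ast` mirrors the nesting of the parentheses in the input, with the string tagged so it cannot be confused with a symbol.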

Joker_vD · on Oct 9, 2020

While re-using an existing regexp like

    [\s,]*(~@|[\[\]{}()'`~^@]|"(?:\\.|[^\\"])*"?|;.*|[^\s\[\]{}('"`,;)]*)

is easier than writing a tokenizer manually (keeping track of offsets and looping and stuff), writing that regexp is definitely harder than writing the tokenizer like this:

    def isspace(c):
        return c.isspace() or c == ','      # the regexp skips commas as whitespace too

    def isspecial(c):                       # isspecial(c) matches c against []{}()'`~^@
        return c in "[]{}()'`~^@"

    def isquote(c):                         # isquote(c) matches c against "
        return c == '"'

    def tokenize(s):
        curr, end = 0, len(s)
        while True:
            while curr < end and isspace(s[curr]):
                curr += 1

            if curr >= end:
                break

            if s[curr:curr + 2] == "~@":
                yield s[curr:curr + 2]
                curr += 2

            elif isspecial(s[curr]):
                yield s[curr]
                curr += 1

            elif isquote(s[curr]):
                start = curr
                curr += 1

                # check this condition out: you can totally support several quotes,
                # and accurately match the closing and opening ones. Imagine doing it
                # with a regexp: either duplicate it, or use some back-referencing magic

                # a backslash escapes the next character, so step over both at once
                while curr < end and s[curr] != s[start]:
                    curr += 2 if s[curr] == '\\' else 1
                curr = min(curr + 1, end)   # include the closing quote, if there is one
                yield s[start:curr]

            elif s[curr] == ';':
                yield s[curr:]              # a comment runs to the end of the input line
                break

            else:
                start = curr
                while curr < end and not (isspace(s[curr]) or isspecial(s[curr])
                                          or isquote(s[curr]) or s[curr] == ';'):
                    curr += 1
                yield s[start:curr]

Yeah, it's more verbose and somewhat repetitive, but on the other hand, it's way more readable, and debuggable too: with regexps, it's always a mystery which part of it exactly didn't match what you wanted or captured something you didn't want to match. Here, the loop invariants and preconditions are almost immediately obvious.

bjoli · on Oct 12, 2020

Not only that: having a generator produce the tokens means you can do it in one pass, while writing code that has the clarity of two passes.
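
A small sketch of that idea, assuming a deliberately simplified token grammar (parentheses and bare atoms only) and hypothetical function names. The reader pulls tokens from the generator one at a time, so the input is scanned once even though tokenizing and reading are written as two separate functions.

```python
import re


def tokens(text):
    """Lazily yield tokens; simplified to parentheses and bare atoms."""
    for match in re.finditer(r"[()]|[^\s()]+", text):
        yield match.group()


def read(stream, tok=None):
    """Build a nested list from a token stream, consuming it on demand."""
    if tok is None:
        tok = next(stream)
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    if tok != "(":
        return tok
    items = []
    # Nested calls advance the same generator, so no token is visited twice.
    for tok in stream:
        if tok == ")":
            return items
        items.append(read(stream, tok))
    raise SyntaxError("missing ')'")


tree = read(tokens("(a (b c) d)"))
```

No intermediate token list is ever built; `read` simply stops asking for tokens once the outermost list is closed.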

lowleveldesign · on Oct 8, 2020

Can anyone point me to the tool they use to generate the concurrency graphs, such as this one https://user-images.githubusercontent.com/410484/93755338-ba...? Is it something OCaml specific?

sadiq · on Oct 8, 2020

That's the Chrome tracing tool: https://www.chromium.org/developers/how-tos/trace-event-prof...

It uses the event logging infrastructure in multicore, which outputs JSON.

Normal OCaml has integrated this since 4.11: https://ocaml.org/releases/4.11/htmlman/instrumented-runtime... but switches to the Common Tracing Format (CTF) so you'll need an extra step to convert it to a format Chrome tracing can ingest.
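
For a sense of what the Chrome tracing tool ingests, here is a hedged sketch in Python that writes "complete" events (`"ph": "X"`) in the Trace Event JSON format. The input tuple shape, the function name `to_chrome_trace`, and the example event names are all assumptions made for illustration; this is not the OCaml runtime's own converter and it does not parse CTF.

```python
import json


def to_chrome_trace(spans):
    """Serialize (name, start_us, duration_us, thread_id) tuples as trace JSON.

    The tuple layout is hypothetical; timestamps and durations are microseconds.
    """
    events = [
        {
            "name": name,
            "ph": "X",           # "complete" event: a start time plus a duration
            "ts": start_us,
            "dur": duration_us,
            "pid": 0,
            "tid": thread_id,    # one row per thread (or domain) in the viewer
        }
        for name, start_us, duration_us, thread_id in spans
    ]
    return json.dumps({"traceEvents": events})


# Example spans with made-up names and timings.
trace_json = to_chrome_trace([("minor_gc", 100, 50, 0), ("major_slice", 120, 300, 1)])
```

Saving `trace_json` to a file gives something the trace viewer can load, with each `tid` shown as its own timeline row.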

lowleveldesign · on Oct 8, 2020

Great, thanks for the detailed answer.

DanielBMarkham · on Oct 8, 2020

Algebraic Effects look awesome. Can't wait to play around with all the cool new stuff.

agambrahma · on Oct 8, 2020

The event tracing here is pretty cool; does Haskell have something equivalent in its runtime?

horiz0n · on Oct 9, 2020

Haskell has ThreadScope https://wiki.haskell.org/ThreadScope

dunefox · on Oct 9, 2020

How do the type systems of F# and Ocaml differ? I'm trying to decide which I should learn over the next months. I don't need many libraries or anything, the decision is rather made regarding the language and tools themselves.

person_of_color · on Oct 9, 2020

Has anyone learnt OCaml, just to improve their chances of getting a job at Jane Street?

fractionalhare · on Oct 9, 2020

There's no point, they very explicitly don't expect candidates to know OCaml before joining. You can interview in just about any language you want.

ducaale · on Oct 9, 2020

I wonder why, every time I search for a functional programming related concept, Jane Street always pops up in the search results.

person_of_color · on Oct 9, 2020

Any tips to crack a JS interview? You seem to be employed there.

johnisgood · on Oct 9, 2020

There are blog posts by them regarding that. Might be of use to you.

https://blog.janestreet.com/interviewing-at-jane-street/

https://blog.janestreet.com/what-a-jane-street-dev-interview...

https://blog.janestreet.com/jane-street-interview-process-20...

https://www.janestreet.com/join-jane-street/interviewing/

pjmlp · on Oct 9, 2020

Not the OP, but I failed at a Jane Street interview for their UK office.

Get updated on algorithms and data structures, Google interview style.

I wasn't (having been around for a few decades), and while I still managed to swim, failed to impress.

person_of_color · on Oct 9, 2020

Are you going to keep trying?

pjmlp · on Oct 9, 2020

Nope, it was a different phase of my life and I don't plan to move to the UK any longer.

Plus I am not keen on companies that follow Google hiring practices.

person_of_color · on Oct 9, 2020

The ROI is worth it, probably.

pjmlp · on Oct 9, 2020

Probably, but they aren't the only ones in town if ROI is the thing you care about.