Real World OCaml (realworldocaml.org)
183 points by anandabits on Nov 11, 2013 | 79 comments

For the early adopters and experimenters amongst you, you might like Felix http://felix-lang.org/share/src/web/tutorial.fdoc

It is a whole-program optimized, strongly typed, polymorphic, ML-like language that can interact effortlessly with C and C++ code and has coroutines baked in. Its own demo webserver is based on coroutines. It uses a mix of lazy and eager evaluation for performance and compiles down to C++. Execution speed is comparable to C++, mostly better. Its grammar is programmable in the sense that it is loaded as a library.

With inaccuracies in analogies assumed, Felix is to C++ what F# is to C# or to some extent Scala is to Java.

It is also mostly a one man effort but with a feverish pace of development so it comes with its associated advantages and disadvantages.

Tooling info is here http://felix-lang.org/share/src/web/tools.fdoc

The author likes to call it a scripting language, but it really is a full-fledged, statically compiled language with a single push-button build-and-execute command. http://felix-lang.org/ The "fastest" claim is a bit playful and tongue in cheek, but it is indeed quite fast, and it is not hard to match or beat C with it.

Hrm, what do you think about rust vs Felix? They seem to be targeting the same problem.

Rust and Felix both try to be general languages, so they're both targeting that. Felix has a better type system. Rust provides a more secure but more restricted protocol for concurrency. Felix has no such restrictions: it's specifically designed to support shared-memory concurrency, which Rust specifically doesn't allow. Rust uses message passing, but organises it via the memory management mechanism so that it is very fast.

Note that Rust does allow shared memory, either via "unsafe" code (i.e. same flexibility & dangers as in C, but the compiler only accepts it when wrapped in an `unsafe {}` block, so it's clear that you need to be careful), or higher level wrappers around this like Arc[1] for immutable shared memory, or RWArc[2] & MutexArc[3] for mutable shared memory.

> Felix has a better type system

What do you mean by this? From what I can see, the only way Felix encodes any form of memory safety (e.g. dangling pointers) in the type system is by garbage collection.

[1]: http://static.rust-lang.org/doc/master/extra/arc/struct.Arc....

[2]: http://static.rust-lang.org/doc/master/extra/arc/struct.RWAr...

[3]: http://static.rust-lang.org/doc/master/extra/arc/struct.Mute...

I have to delegate that to John (Felix's author); I'm not much of a language theorist myself. It seems to me that Rust wants to be the better/safer C replacement. I think Felix's sweet spot is at a slightly higher level. For example, Felix's garbage collector can indeed be avoided for lower level code. It was meant to be used as an optional feature, but to me it seems it requires more than superficial knowledge to avoid it successfully.

I had a look at it and it sounded pretty cool. The one thing I found unfortunate is the lack of separation between safe and unsafe code, but it certainly has a lot going for it.

There's no such thing as safe code. So to do as you suggest requires some set of suitable concepts of relative safety, together with some way to enforce them. I am very interested in implementing mechanisms that provide guarantees. And not just safety. Another would be licence management (e.g. allow only BSD licenced code to be used) .. that's a legal safety guarantee :)

Yeah, that is D's strength, that and compile time function execution.

This looks really awesome! Thanks for bringing it to my attention. Why is this so much under the radar? It seems like it would have a ton going for it.

The historical answer is that originally Felix programs were being written for the Shootout and the compiler was upgraded so it performed well. In fact it trashed everything. Then control of the Shootout changed hands and Felix got dropped by the new manager. Today, there are no forums for developing new languages.

Felix "targets" people that would like to use Haskell or OCaml but have a ton of code in C and C++ to interface with. Felix is a C++ upgrade: it discards the syntax but retains ABI compatibility, at quite some cost to things like safety, for example.

Felix is more or less guaranteed to perform on par with C/C++ or better for the simple reason you can embed C++ directly into Felix, this works because Felix generates C++. And of course you can link to your favourite libraries with minimal syntax.

  type mytype = "My::Type";
  ctor mytype : int = "My::Type ($1)";
  fun addup : mytype * mytype -> mytype = "$1.addup($2)";

Unlike OCaml, which requires a lot of hard-to-write glue logic, Felix and C++ share types and functions. Typically only type glue is required to create a bridge.

No forums for developing new languages? Seems like Go, Nimrod, Rust, D, CoffeeScript and many other relatively new languages have large and active development.

That's interesting that they were booted from the shootout. I wonder what happened there. Anyway, I haven't really looked at this enough to see how its features really play out, but on the surface it looks great. Certainly as an alternative to C++ it sounds miles ahead, and indeed many of the languages being developed today are intended precisely as alternatives to C++. A guarantee of C/C++ performance or better is very enticing. :)

I think skaller meant that there is no common shared forum for (new) languages. Well, there is LTU, but the discussion there tends towards the theoretical side; its focus is language implementers rather than language users. The language game site used to be one place a user could get exposed to different languages. It has largely become autocratic, arbitrary and "don't tell me how to spend my free time" hostile, in the sense that languages get dropped from the list for no clear reason. I think it has stopped being the de facto go-to place for language comparison as well, though I'm not sure of the latter.

Lack of marketing I guess, not many users, and maybe because people panic when they cannot immediately find the things they are familiar with, for example OO hierarchies or dynamic types. Plus I found it hard to understand until I got a rudimentary understanding of OCaml and Haskell's typeclasses. So it's not ready to be used by everyone.

To do something nontrivial with it you would need to be on the mailing list though.

@dllthomas As far as I know OCaml doesn't. By no standard am I an OCaml user, although I want to get better.

OCaml doesn't have typeclasses, does it? It's been a while since I've used it, so things may've changed or my memory may be failing...

It now has first-class modules, which can be used to accomplish almost the same thing. The biggest difference is that first-class modules are explicit while typeclasses are implicit. Whether this is a good or bad thing is up for debate.
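
For readers who haven't seen the comparison spelled out, here is a minimal sketch of a typeclass-style interface built with what the OCaml manual calls first-class modules. The names `SHOW`, `Show_int`, `Show_bool` and `show_list` are made up for illustration and do not come from any library. The thing to notice is that the "instance" is handed over by the caller, which is the explicit-versus-implicit difference described above.

```ocaml
(* A "Show"-like interface written as a module signature
   (hypothetical names, for illustration only). *)
module type SHOW = sig
  type t
  val show : t -> string
end

(* Two implementations, playing the role of typeclass instances. *)
module Show_int : SHOW with type t = int = struct
  type t = int
  let show = string_of_int
end

module Show_bool : SHOW with type t = bool = struct
  type t = bool
  let show = string_of_bool
end

(* The implementation to use is passed explicitly as a first-class module. *)
let show_list (type a) (module S : SHOW with type t = a) (xs : a list) : string =
  "[" ^ String.concat "; " (List.map S.show xs) ^ "]"

let ints = show_list (module Show_int) [1; 2; 3]
let bools = show_list (module Show_bool) [true; false]
```

With a Haskell typeclass the compiler would choose the instance from the element type. Here the call site names it, which is more typing but also makes it obvious which implementation runs.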

It had first-class modules when I was using it, but people may well have found uses for them that make the comparison more apparent.

> compiles down to C++. Execution speed is comparable to C++, mostly better.

Uh, what. Can you clarify?

This seems counter-intuitive to people who don't write compilers, but a language that compiles to C++ can perform "better than C++" because it can emit code that no human would bother writing, usually taking advantage of information available in the source language that couldn't be safely determined by the C++ compiler alone. Whole program optimization, for example, allows you to do a lot of aggressive reorganization of code that would be very ugly if done by hand (think of putting all your code in one file and putting everything in an anonymous namespace).

Then say "better than hand-written C++", not "better than C++". It's like saying "C++ performs better than assembly". No it doesn't. It may perform better than hand-written assembly, and that's completely different.

It's just annoying when people will do anything to get on the "faster than C++" wagon.

Debian describes OCaml as follows:

Objective Caml (OCaml) is an implementation of the ML language, based on the Caml Light dialect extended with a complete class-based object system and a powerful module system in the style of Standard ML.

OCaml comprises two compilers. One generates bytecode which is then interpreted by a C program. This compiler runs quickly, generates compact code with moderate memory requirements, and is portable to essentially any 32 or 64 bit Unix platform. Performance of generated programs is quite good for a bytecoded implementation: almost twice as fast as Caml Light 0.7. This compiler can be used either as a standalone, batch-oriented compiler that produces standalone programs, or as an interactive, toplevel-based system.

The other compiler generates high-performance native code for a number of processors. Compilation takes longer and generates bigger code, but the generated programs deliver excellent performance, while retaining the moderate memory requirements of the bytecode compiler. It is not available on all arches though.

> It is not available on all arches though.

Although in practice the arches that lack a native OCaml compiler don't matter.

In RHEL we ship OCaml natively for everything except S/390 and AArch64. AArch64 will be important, but since hardware doesn't exist in a form you can buy for servers, we're happy to wait for upstream to implement this. We'll probably help them out with hardware too.

Edit: We maintain our own PPC64 backend.

Xavier Leroy committed experimental native AArch64 support to OCaml trunk over the summer. I'm planning to add it to the OPAM package testing pool as soon as I get my hands on some Calxeda (or other) hardware.


We actively use/used OCaml in production on SPARC/Solaris, POWER/AIX, HP-UX/IA64. (The optimized compiler wouldn't build on AIX.)

Various backends for less used architectures are maintained out of tree, eg here:


OCaml: Love the language, but I don't think I've seen a worse standard library. I don't mean that it's sparse -- I don't mind that so much. I mean that it's really just badly designed. For example, global, mutable variables in a functional language? Really?

Thankfully, Jane Street and the Batteries Included projects are supplementing it, but I'm still of the opinion that the standard library should be torn out and replaced with something nice.

It's important to remember that the standard library is the compiler standard library. It's actually very useful to have it be so minimal when compiling OCaml to odd embedded and microkernel targets (such as our own MirageOS at https://openmirage.org).

We took an explicit decision not to use the compiler standard library in Real World OCaml, and instead work using the Core stdlib from Jane Street. I think it's quite a testament to the modular power of OCaml that they managed to not only separate the standard library from millions of lines of internal code, but also to make it so usable for external users in a brief 12 months.

There are a lot more developments coming soon, of course: see my group's research page at http://ocaml.io for some of the projects. I've been lax about updating it in the past few months, but normal service shall resume very shortly...

Even if you don't use / plan to use the Core stdlib, the book is quite useful.

I was pretty impressed with how, with a single `open Core.Std` line, you can basically do the whole "tear out the standard library and replace with something nice" on a per-module basis. Unfortunately it has the side-effect of making small native binaries impossible, but depending on the project it's nice to have the choice.

> has the side-effect of making small native binaries impossible

Fear not; support for module aliases in signatures will resolve that problem quite soon, and quite elegantly too.

That would be fantastic; looking forward to it.

Here is a description of OCaml usage at Jane Street : https://queue.acm.org/detail.cfm?id=2038036

ocaml.org have a list of users: http://ocaml.org/companies.html

Some "big" companies in the list: Facebook, Citrix, Dassault Systèmes.

Another cool one from CUFP this year is how Facebook is using OCaml to add incremental type inference to their vast PHP code base. There's a (slightly hard to see) video here, but I'm sure Facebook will publicize it more widely when they're ready.


That ACM article was very interesting, thank you! Lots of nice little concrete examples that are simple, yet not too simple.

I've yet to play much with ocaml — we had a bit of standard ml in our programming paradigms-class at university (at the time the course used [1] "Programming Languages: Concepts and Constructs (2nd Edition) by Ravi Sethi" — now they've (unfortunately, yet understandably) replaced standard ml with Haskell). I've since had a little trouble adapting to similar-yet-different languages like both Haskell and OCaml.

[1] http://www.amazon.com/Programming-Languages-Concepts-Constru...

For anyone who is curious to hear more and is at QCon SF today, I'll be speaking about a new library operating system we've been building in OCaml for the past few years. The slot is in a couple of hours at 1030 PST.

qcon link: http://qconsf.com/presentation/my-other-internet-mirage my slides: http://decks.openmirage.org/qcon13/

Where is OCaml's place in the current world?

I find it interesting, as it seems to generate little "buzz" but has two new books this year, and is a precursor to another functional language that itself seems to be gaining traction and yet is produced by Microsoft: F#

My observational / untested impression is OCaml seems to be more practical, and maybe a little easier to transition to for someone like me who uses mostly Python and Go, and a lot of bash/awk/sed/grep.

If you want to use a functional language on Unix, _and_ have a type system, _and_ avoid JVM/.Net then you don't have many choices. There is Haskell of course, but it is lazy by default, which makes it harder to reason about how your program will execute, and how much space it'll use. Rust is something interesting to keep an eye on, but AFAIK it is not ready yet for production use. Hence I prefer OCaml.
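
To make the evaluation-order point concrete: OCaml evaluates strictly by default, and laziness is something you opt into with `lazy` and `Lazy.force` from the standard library. A small sketch, where the counter `evaluated` and the other names exist only for illustration:

```ocaml
(* Counts how many times the suspended computation actually runs. *)
let evaluated = ref 0

(* Nothing inside [lazy (...)] runs until the value is forced. *)
let expensive = lazy (incr evaluated; 21 * 2)

let before = !evaluated           (* still 0: not forced yet *)
let v1 = Lazy.force expensive     (* runs the computation *)
let v2 = Lazy.force expensive     (* reuses the cached result *)
let after = !evaluated            (* 1: the body ran exactly once *)
```

Because the suspension is visible in the source (and in the type, `int Lazy.t`), it is easier to see where work and memory are being deferred than when everything is lazy unless stated otherwise.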

From the little I know of OCaml, Rust isn't really in the same category. While it has functional elements, it's really aimed at being a cleaner C++ (for instance, it doesn't have TCO).
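
For anyone unfamiliar with the TCO point, here is a sketch of what it buys you in OCaml. As far as I know, the compiler turns a self-recursive call in tail position into a jump, so the accumulator version below runs in constant stack space. The function names are made up for illustration, and the expected total assumes a 64-bit OCaml.

```ocaml
(* Not tail-recursive: the addition happens after the recursive call returns,
   so each element costs a stack frame. *)
let rec sum_naive = function
  | [] -> 0
  | x :: rest -> x + sum_naive rest

(* Tail-recursive: the recursive call is the last thing the function does. *)
let sum xs =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 xs

let small_total = sum_naive [1; 2; 3]

(* A million elements is fine for the tail-recursive version. *)
let big = List.init 1_000_000 (fun i -> i)
let total = sum big
```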

Rust was originally implemented in OCaml, though. You're correct that they're not exactly in the same category, but there is influence there.

OCaml is a popular tool for implementing languages. IIRC the JavaScript reference standard was implemented in OCaml as well.

Whoops, no, that was actually Standard ML:


Why OCaml over ML?

I was shown some OCaml code at the university, so learning OCaml later seemed easier than Standard ML. Also I use Debian, and after a quick look at the repositories it has far more libraries for OCaml (-ocaml-dev) than for Standard ML, which tipped the balance in favour of OCaml as the language that I wanted to learn.

Why did I stick with OCaml after that? Mostly for two reasons: there are some libraries that I like (OCamlnet, Lwt, just to mention a few), and the community appears to be more active in recent years (lots of work on build systems, packaging, new libraries coming out, user meetings, etc.).

TBH I never followed the Standard ML community, so I don't know if it's similar.

OCaml looks clearly more verbose/shittier to me, but I did learn SML/NJ first.


Jane Street gets mentioned a lot but Facebook is also a user. See the recent video from CUFP.


A major limiting factor is that OCaml has no support for parallelism, and due to its use of a global interpreter lock for GC it can't run multiple threads in parallel.

That said it's great to program in (like Haskell with convenient IO and semicolons) and it compiles to blazing fast executables.

This is by choice, because single thread performance matters more, and you can use fork or MPI to scale across NUMA nodes and clusters much more scalably and safely.

Having used fork/join to parallelize an OCaml genetic algorithm, I can say from experience it is neither safe nor practical.

Edit: Intentionally only supporting single-threaded processes is a perfectly fine design decision, however I haven't seen this argument made for OCaml, rather I've seen "multi-threading our GC would be hard" as the justification for the single-thread limitation. Admittedly it was 3-4 years ago when I last had to deal with this.

There are now several experimental multithreaded OCaml runtimes floating around, which also have the key property of not slowing down the single-threaded case. I don't expect multicore to remain a limitation for OCaml in 2014.

My impression was that the core team was actively opposed to adding parallelism support. Has this changed?

That's a feature, not a bug: the reason is that it would slow down Coq.

Haskell has optional semicolons, though it probably isn't idiomatic to use them most of the time.

It's extremely practical, and used in virtualization in the real world. Virt-* tools, Xen.

See also: https://news.ycombinator.com/item?id=6711893

I think people are starting to see the value, or at least popularity, of functional languages (Haskell/Scala/F#), and some of those are picking up OCaml as a language that fits into their workflow better - one that compiles to native binaries, without the extreme functional purity of Haskell.

Here's the other new book, which is for a different audience: (http://www.ocaml-book.com)

How do the audiences differ? Experience / newcomers vs experienced?

Edit: Deciding which to read first.

I did read (and comment a bit on) the 'Real World OCaml' beta book, but I had already known and used the language for quite some time before, so I can't really judge which book would be better for a beginner. Take this with a grain of salt:

"OCaml from the very beginning" seems to focus more on teaching the language itself. If you've never used OCaml before then this might be the place to start.

"Real World OCaml" in addition to teaching the basics of the language has some intermediate-level chapters (dealing with json, S-expressions, asynchronous events, parsing), and some advanced-level chapters (GC, compiler frontend/backend). I definitely recommend reading it at some point.

It's a great teaching tool (so in a university/academic setting at the very least)

I have been taught fundamental concepts of programming and programming paradigms with OCaml, and I am very grateful for that. To me, it is an expressive, clean, powerful multi-paradigm language that deserves a lot of love.

Also interesting read,

"Unix system programming in OCaml", http://ocamlunix.forge.ocamlcore.org/

Although not a book per se, the tutorials on the OCaml.org site are based on the old ocaml-tutorial.org ones, which are among the ones I used to get started (there was no Real World OCaml or OCaml from the Very Beginning at the time): http://ocaml.org/tutorials/

For future reference, all books should be listed here: http://ocaml.org/books.html

In my progression of Haskell-OCaml-Common Lisp-Clojure-Scala, I remember OCaml having odd edge cases, modules of functors or some such that didn't really help me in solving real world problems (this was 2007-2008 so my memory could be off). I'm currently a fan of Scala, which has all the functional and Algebraic Data Type goodness I remember from OCaml but is more "practical", more companies and projects are using it, plus I work in the JVM ecosystem. Any reason to go to OCaml now instead of Scala?

It depends on what you are working on. OCaml lets you interact more closely with Unix and with C libraries, if you need that. It also has a lot faster startup time and lower memory overhead, making it somewhat more suitable for writing Unix-style programs that need to run quickly with low overhead.

I also moved from OCaml to Scala for my primary programming, and have really enjoyed it. The functional goodness on top of JVM is a major win for the kinds of things that I primarily work on these days.

When I was working with OCaml, the community was going through a lot of work on figuring out what the ecosystem should look like. This involved at least two competing standard library extensions or replacements (Batteries and Jane St. Core), growing pains in packaging & deployment, etc. The language was (and still is) nice, but I could not, at the time, invest the time into dealing with the ecosystem. Things seem to have improved a lot since then, particularly with things like OPAM emerging, but Scala is still a better fit for the work I do, and Haskell has been serving me well for command-line kinds of things. But OCaml is a fine, practical language for a lot of things.

Is OCaml, then, a good fit for the kinds of programs many people have started writing in Go?

Native, stand alone binaries. High performance (not sure how the Go vs. OCaml benchmarks look right now). Good networking. More productive and less error prone than C or C++. Less verbose than Java.

Haven't written any OCaml programs, but seems to check the same boxes. Go seems to have a much better concurrency story with channels.

OCaml seems to have a much better type system and functional programming support.

Go does have a much better concurrency story, OCaml's is rather weak (as others have observed).

For the work I do (data analysis and scientific research, particularly on recommender systems, with system-building to support that), if I need concurrency, it's suitable to run the program on the JVM. So I would use Scala in that case, and OCaml would be fine for the other systems-y stuff.

I personally have taken to writing such code in Haskell these days, but that's largely to practice my FP skills in a manner that's easier to transfer back to Scala.

My understanding is that Go has a focus on concurrency which OCaml presently lacks, and so it probably has an edge there (possibly a big one). Otherwise, yes, OCaml is a good fit for the kinds of problems I understand Go to be used for.

If you only need concurrency for I/O, and in particular networking, then OCaml is pretty good here:

- you can use system threads. Yes you'll only be able to run one OCaml thread at once, but when one OCaml thread gets blocked on I/O it switches to another.

- you can do co-operative threading if you write your code in a monadic style (there are a few libraries providing this: Lwt, Async, and in some sense Equeue). Then you don't need to worry about the thread/context-switching overhead, and the event-driven architecture should allow you to scale to thousands of concurrent events. While this may sound scary at first, the syntax is quite easy to understand, and it really is a lot easier than trying to write non-blocking code in C with lots of callbacks.
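
To give a feel for the monadic style mentioned in the second bullet, here is a toy sketch using only the standard library. It is not Lwt or Async and does not use their APIs; the type `t` and the functions `return`, `bind`, `yield`, `spawn`, `run` and `worker` are all invented here just to show the shape: a thread is a chain of binds, and every `yield` hands control back to a run queue.

```ocaml
(* A toy cooperative-thread monad (NOT Lwt/Async; illustration only).
   A computation takes "what to do next" and eventually calls it. *)
type 'a t = ('a -> unit) -> unit

let return (x : 'a) : 'a t = fun k -> k x
let bind (m : 'a t) (f : 'a -> 'b t) : 'b t = fun k -> m (fun x -> f x k)
let ( >>= ) = bind

(* Run queue holding the suspended threads. *)
let ready : (unit -> unit) Queue.t = Queue.create ()

(* Park the rest of the current thread and let the others run. *)
let yield () : unit t = fun k -> Queue.push k ready

let spawn (m : unit t) : unit = Queue.push (fun () -> m (fun () -> ())) ready

(* Single-threaded scheduler loop: run parked threads until none are left. *)
let run () =
  while not (Queue.is_empty ready) do
    (Queue.pop ready) ()
  done

(* Records the order in which the workers get to run. *)
let log = Buffer.create 16

let rec worker name n : unit t =
  if n = 0 then return ()
  else
    yield () >>= fun () ->
    Buffer.add_string log name;
    worker name (n - 1)

let () =
  spawn (worker "a" 2);
  spawn (worker "b" 2);
  run ()
```

The two workers end up interleaved without any OS threads or locks. The real libraries add what matters in practice (I/O readiness, timeouts, error handling), but application code has roughly this bind-chain shape.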

In particular, have a look at Ocsigen and Eliom, which provide a web server and framework: http://ocsigen.org/tutorial/

Regarding the book, have a look at the 'Concurrent Programming with Async' chapter, which shows how you can write a TCP server/client that doesn't block.

Good points.

Once you get used to OCaml's type inference that just works, Scala seems a bit clunky.

Marius Eriksen (who leads a lot of the Scala work at Twitter) gave us a great quote for the back of the book that says it best.

    Programmers are digital choreographers, carefully 
    balancing correctness, modularity, concurrency, 
    and performance. Real World OCaml teaches you how 
    to perform this balancing act in simple, elegant ways.

OCaml's the simplest systems language I've ever used with a decent static type system to prevent common errors from creeping into your code (most other systems languages sacrifice modularity in favour of even more simplicity, but OCaml strikes a balance).

For me, OCaml is just easier on the eyes. For example, compare ADT declaration in OCaml vs. Scala case classes.
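
As a small illustration of the OCaml side of that comparison, here is a sketch of an algebraic data type with a function over it. The `expr` type and `eval` function are made up for this example. The Scala version would typically need a sealed trait plus one case class per constructor.

```ocaml
(* A tiny expression language: one type declaration, three constructors. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* Pattern matching covers every constructor; the compiler warns if one is missed. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* 1 + 2 * 3 *)
let seven = eval (Add (Num 1, Mul (Num 2, Num 3)))
```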

This is a really good presentation on OCaml by one of the authors of this book.


I find the recent OCaml meeting presentations quite useful and interesting too, in the sense that they present what people are using OCaml for, and what tools people are working on: http://ocaml.org/meetings/ocaml/2013/program.html http://oud.ocaml.org/2012/

Another one: Effective ML


Crazy coincidence; I just met someone who works at Jane Street on Saturday. My jaw pretty much dropped when he told me what his company does. When I got home and did some research, I was blown away by their dedication. I'm very happy to see a company realize the value in their language ecosystem and contribute as much back as they have.

People asking about Golang may be interested in this article, where someone considering rewriting a large Python project chooses a new language, with both Golang and OCaml among the candidates.



>(Though to my understanding, Jane Street capital, a proprietary trading firm, is the only company that uses this language in real world applications)


In case anyone's interested, there's also a Real World Haskell book freely available to read online: http://book.realworldhaskell.org/read/

I'd be interested to see how they compare.
