Real World Ocaml announced (realworldocaml.org)
116 points by j2labs on Feb 28, 2012 | 55 comments

Discussions of whether OCaml is "practical" or useful or good aside, this book really, really needed to be written.

It may be the case that there are truly excellent resources for learning OCaml, but I know the language, and I've never heard of them. There's Jason Hickey's tutorial, an introduction or two scattered around the internet, and some book that was fan-translated from French, sure, but OCaml lacks the hackerly dialogue that typifies discussion of languages like Haskell. It is a problem, for example, that these usually very incomplete hobbyist-curated tutorials almost always outstrip the professional literature: when people do publish books about OCaml, they are often littered with errors or flat-out wrong (I'm looking at you, Practical OCaml).

Regardless of whether we can all get on board with asynchronous whatever, or pronouncements about its "real" speed, or whether it should be used when Haskell is around, it is clear that there is a huge divide between the people who actually can argue these things well, and those who cannot, and moving from the second group to the first by yourself is tempestuous and trying. Are there no OCaml experts in the world? If you are going by the amount of information on the Internet, it is not really obvious that there are any. How can you, then, become one?

Besides that, though, having a sound OCaml counterpoint will do communities like Haskell good. I hope this finally ushers in the golden era of learning OCaml. Not everyone can take CS51 at Harvard, and besides, the credibility of a real (and good?) book about OCaml will hopefully advance the dialogue further than the typical questions of whether it can actually be done in X, Y and Z enterprise environments. That is a debate I (and probably many OCaml fans) have heard enough of.

In meagre defence of the catastrophe that was Practical OCaml, I'll point out this:


I'm reviewing drafts of Real World OCaml too, but after the sheer amount of work (and what went wrong) with P.O., I'm just going to informally comment on them.

OCaml as a language is extremely useful and practical. A good book on it really needs to be written.

Are OCaml or Haskell taught in CS51 these days? That would be wonderful. Neither language was used on the general (non-PL theory, except a bit in the compilers/languages area) CS track in the 90s.

Also, OCaml is spelled "F#" in English.

Yaron Minsky is the technical director at Jane Street Capital, which is one of the most prestigious functional programming shops in the world.


A while ago I truly enjoyed his talk at CMU: https://ocaml.janestreet.com/?q=node/61 both as an introduction to high frequency trading, arbitrage and stuff, and to FP in industry too.

EDIT: it was already posted below.

I wonder how an interview at their company goes.

a few people have blogged about it, and it's supposedly one of the hardest software interviews in the world. one blogger said the final in-person interview prompt was to write a regular expression matcher on the whiteboard in ocaml in 45 minutes.

Perhaps OT: I've tried to like OCaml 2 or 3 times so far and have not succeeded. I've loved Haskell (until I hit the wall), enjoy Python, enjoyed Ruby, am very happy with JS, had a love/hate relationship with C/C++ (really, header files?!), loved Java compared to C/C++ (until I learned to hate Java). And I'm generally really excited by what's going on in languages right now.

I just wish there was an SML for the JVM. But until then Ocaml is probably the gear and I look forward to reading this book.

And, oh my golly, have you seen OCaml's Eliom for Ocsigen? Absurdly interesting. http://ocsigen.org/eliom/ One language for the browser and server, with client/server transparency and best-in-class server performance.

My only experience with OCaml is two hours that I spent coding in it for an interview problem. So, from a complete beginner's perspective:

i) It needs better libraries, or perhaps better library documentation. A full hour of my time was spent working out how to read lines from a file.

ii) It is a beautiful language. The code I wrote, as someone completely new to the language and not using any libraries, was much shorter than the equivalent code I would have written in Python, a language which I know much better, and an order of magnitude shorter than what I would have written in Java or C. It makes functional abstraction easy in the same way that Haskell does, but the ability to mix in imperative code if it gets the job done in fewer lines (or in a less mind-bending way) is supremely useful.

For i), it's in progress: see OCaml Batteries Included http://batteries.forge.ocamlcore.org/
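For anyone stuck at the same spot, here is a minimal sketch of reading a file's lines with nothing but the standard library. The name `read_lines` is my own, not a stdlib function, and the sketch does not close the channel if anything other than `End_of_file` is raised.

```ocaml
(* Read every line of the file at [path] into a list, in order.
   [read_lines] is a hypothetical helper name, not part of the stdlib. *)
let read_lines path =
  let ic = open_in path in
  let lines = ref [] in
  (* input_line raises End_of_file once the channel is exhausted,
     which is how the loop terminates. *)
  (try
     while true do
       lines := input_line ic :: !lines
     done
   with End_of_file -> ());
  close_in ic;
  (* Lines were accumulated in reverse, so restore file order. *)
  List.rev !lines
```

It also happens to show the mixing of imperative and functional style mentioned in ii): a `while` loop and a mutable reference, wrapped in a function that returns an ordinary immutable list.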

I'm curious. How many folks here have used OCaml for production work? I'd just like to get an idea of the numbers.

Edit: By 'production work' I meant built things that are being used by other people.

You might find this interesting: https://ocaml.janestreet.com/?q=node/61

TL;DR: Jane Street is a quantitative trading firm whose language of choice is OCaml and has several people programming in it full-time. Besides experiencing the commonly-cited advantages of FP (productivity, expressiveness, local reasoning), they also found it highly performant, useful for rapid prototyping and resilient to changing requirements.

By "several people programming in it full-time" you mean "everyone who writes code in the company (IT, developers, researchers and traders) writes code in OCaml" right? Yaron has said (perhaps in that video, perhaps somewhere else) that the founding partners (who were voice traders, not programmers) learned OCaml as their second programming language, after VBA. The language pervades the company.

That's right. All I know is what I saw in that video, though, so you probably know more about this than I do.

Simcorp is another OCaml financial software shop.

I haven't, but the large company I work for has a product partially written in OCaml. I came across it in source control once. I'd guess it's tens of thousands of lines of code, but that's a complete guess.

There's also a list of companies using it here: http://ffconsultancy.com/products/ocaml_journal/free/introdu...

I worked on a Coverity-like static analysis tool product which used OCaml. It used EDG C/C++ parser with FFI.

I worked at a company that based a lot of stuff on OCaml, and now that I'm working at Red Hat we use OCaml for a few things. In no particular order ...

* generating code, bindings etc in libguestfs (http://libguestfs.org/)

* analysing Windows Registries for hivex, using ocaml-bitstring

* virt-top

* virt-dmesg

* whenjobs

I used to. We even have an AppStore application written in OCaml.

The search index that backs latexsearch.com is written in ocaml.

As a primarily imperative programmer who is just becoming more familiar with Haskell, I'd love to see a compare/contrast between Haskell and Ocaml from someone with expertise in this area.

The biggest differences are that Haskell defaults to lazy evaluation and OCaml defaults to strict and that OCaml allows you to mix in imperative code without explicitly using constructs like monads.
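A small sketch of both points, using only the standard library. The names (`evaluations`, `costly`, `ignore_arg`) are made up for the illustration.

```ocaml
(* Counts how many times [costly] has actually run. *)
let evaluations = ref 0

(* An ordinary function with a side effect; no monad is involved. *)
let costly () =
  incr evaluations;
  42

(* A function that never looks at its argument. *)
let ignore_arg _ = 0

(* Strict by default: the argument is evaluated before the call,
   even though it is never used. *)
let strict_result = ignore_arg (costly ())
let count_after_strict_call = !evaluations

(* Laziness is opt-in: [lazy] suspends the computation... *)
let suspended = lazy (costly ())
let count_after_suspend = !evaluations

(* ...and Lazy.force runs it once, caching the result afterwards. *)
let forced_once = Lazy.force suspended
let forced_twice = Lazy.force suspended
let count_after_force = !evaluations
```

Here `count_after_strict_call` and `count_after_suspend` are both 1, and `count_after_force` is 2 even though the suspension was forced twice.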

I think modules vs. typeclasses are a bigger difference. The Haskell approach to IO isolation would be much more horrible without exploiting features peculiar to the typeclass system. The IO system in early versions of Haskell did not exploit these features, and it was a disaster (http://research.microsoft.com/en-us/um/people/simonpj/Papers...).
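To make the module side concrete: where Haskell's `Data.Map` picks up the key ordering from an `Ord` instance, OCaml's standard `Map.Make` functor takes the ordering as an explicit module argument. A small sketch; `StringMap`, `CaseInsensitive` and `CIMap` are names invented for the example, and `find_opt` assumes a reasonably recent compiler.

```ocaml
(* The String module already provides [t] and [compare], so it can be
   passed straight to the Map.Make functor. *)
module StringMap = Map.Make (String)

let ages =
  StringMap.empty
  |> StringMap.add "ada" 36
  |> StringMap.add "alan" 41

let ada_age = StringMap.find_opt "ada" ages
let bob_age = StringMap.find_opt "bob" ages

(* A different ordering is a different module, chosen explicitly
   rather than inferred from the key type. *)
module CaseInsensitive = struct
  type t = string
  let compare a b =
    String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)
end

module CIMap = Map.Make (CaseInsensitive)

let ci = CIMap.add "Ada" 1 CIMap.empty
let found_ignoring_case = CIMap.mem "ADA" ci
```

The two map types are incompatible with each other even though both are keyed by strings, which is roughly the trade the module approach makes: more ceremony, but no ambiguity about which ordering is in play.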

Sort of. It is more like every function in OCaml is in IO. There is just no purity guarantee anywhere.

I'm not well versed on Ocaml, but my impression is that the biggest conceptual difference is that Ocaml is more lenient with side effects. Haskell restricts any code with side effects like IO to Monads, whereas Ocaml allows IO, mutable data, etc. It may be possible to write Ocaml like Haskell, and manually restrict side effects to Monads, I'm not sure, but in Haskell you have no choice.

There are other differences as well, but imho that one sets Haskell apart moreso than any of the others, and not just from Ocaml but from every other language (that I know of at least).

Both are great, both have their strengths and weaknesses. Start with whichever you feel comfortable with, and branch out to the other when you want to discover a new dimension. Ocaml has a lot of strong points and a statically typed module system, Haskell has a less restrictive type system and a different way of dealing with heterogeneous polymorphism, and has all the good / bad that come with laziness by default (mostly good, with a dash of gotchas)

Besides laziness/immutability, the HN modules vs type class debates (there's also Lambda-ultimate debates, but they take that stuff so seriously





(yes, from 1998, but worth the PS conversion if you're on windows)


not mentioned by others is the real reason why Haskell is winning: concurrency & parallelism. The GHC compiler for Haskell has an incredible runtime with async IO by default, it is capable of running millions of threads, works on multi-core, and has a good STM implementation.

Is there something about Haskell-the-language that offers concurrency/parallelism benefits or is this primarily an advantage of GHC-the-implementation?

Oh.. I thought that this was going to be an actual version of Ocaml that was practical in the real world, rather than a book.

So what's impractical about Ocaml? I just looked at it and it looks cool, but I haven't tried it for anything.

If they actually have event driven programming that seems like a big step over most functional languages, where I/O is kind of an afterthought.

Steve Yegge wrote some points about Ocaml quite a while ago. http://sites.google.com/site/steveyegge2/ocaml He says it is quite fast and I've seen that in other places.

The thing I don't like about competing with C/C++ in "fast" is that those languages invariably are memory hogs. They hide that in the benchmarks.

Anyone know if you have control over memory layout in Ocaml? Or if you can reason about it like you can in C/C++?

OCaml isn't actually a pure functional language: it has side-effects and mutation. In fact, you can write code that looks an awful lot like C if you want, except with automatic type inference. But then you can also lift up your level of abstraction in the same codebase to use a purely functional idiom where it's appropriate, or use an object-oriented style for something else. All of those styles are supported as consistent, first-class features in the language.
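As a rough illustration, here is the same array sum written in the three styles. All the names are invented for the example.

```ocaml
(* C-like: a mutable accumulator and an index loop. *)
let sum_imperative (a : int array) =
  let total = ref 0 in
  for i = 0 to Array.length a - 1 do
    total := !total + a.(i)
  done;
  !total

(* Functional: a fold with no visible mutation. *)
let sum_functional (a : int array) = Array.fold_left (+) 0 a

(* Object-oriented: a class holding mutable state behind methods. *)
class accumulator = object
  val mutable total = 0
  method add x = total <- total + x
  method total = total
end

let sum_object (a : int array) =
  let acc = new accumulator in
  Array.iter acc#add a;
  acc#total
```

All three have the same inferred type, `int array -> int`, so callers cannot tell which style sits behind the function.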

There are several event-driven programming libraries in OCaml, and two that I like that use monadic abstractions (to hide the messy control flow that makes node.js so painful) are Async (https://ocaml.janestreet.com/?q=node%2F100) and Lwt (http://ocsigen.org).

You have good low-level memory layout in OCaml, either via an FFI or the Ancient module to import in non-GCed values. You can read about the OCaml heap and GC at this blog series by Richard Jones. http://rwmj.wordpress.com/2009/08/04/ocaml-internals/ (and yes, the memory representation is a straightforward mapping from the type declaration; very little magic happens in the compiler)

OCaml is used all over the Xen Cloud Platform (http://github.com/xen-org/xen-api), and I'm running an ongoing microkernel research project called Mirage (http://www.openmirage.org) which has a full network stack written in pure OCaml (and is also competitive performance-wise).

Very cool, thank you. I watched the openmirage talk since I'm interested in distributed computing -- this is great work. I like anything that gets rid of excessive layers, and the typical cloud stack is really suffering from this.

I have been meaning to give OCaml a try for a while but it's bumped on my list now. (Though it seems that the thing that people keep complaining about is lack of a polymorphic print...)

> So what's impractical about Ocaml?

The biggest issue Ocaml has is that its GC is single-threaded which basically removes the ability to do parallel programming inside a single executable. You have to resort to message passing between multiple processes.

For what it's worth, I think I heard rumors there's work on an improved GC.

I'd add that Unicode support is also a problem. latin-1 isn't even good for French. My brother's name is Jérôme and the first thing I try with any language to check strings is this:

    # print_endline (String.uppercase "Jérôme");;
Oops, two characters didn't get properly uppercased.
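What happens underneath can be sketched like this, under two assumptions of mine: the string is UTF-8 encoded, and the compiler is recent enough to have `String.uppercase_ascii` (the session above uses the older `String.uppercase`). OCaml strings are plain byte sequences, so each accented letter is two bytes that an ASCII-only case conversion leaves alone.

```ocaml
(* The accented name from above, spelled out as UTF-8 bytes so the
   example does not depend on the encoding of the source file. *)
let name = "J\xc3\xa9r\xc3\xb4me"

(* Six characters on screen, but eight bytes in the string. *)
let byte_length = String.length name

(* Only the ASCII letters are converted; the two-byte sequences for
   the accented letters pass through unchanged. *)
let upper = String.uppercase_ascii name
```

So `upper` still contains the lowercase accented letters, with only `R`, `M` and `E` changed.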

I really like OCaml, it's probably my favorite language, and I hope that RWO will increase its popularity and that this will in turn help improve the language. I'm reading a lot about Haskell these days because of the times when I want to do some multi-programming with unicode text.

As annoying as that is, it could be turned into a design feature of your programs - only doing message passing.

Hoare's CSP pretty much takes this approach, and for my money, I think that sort of approach makes the common case of parallel programming easier.

I'm doing a project using Python on AppEngine. AppEngine forces you to use a bunch of single threads doing what is really message passing, but implemented in Python so it's slower than Ocaml.

I think this is going to turn out to be a killer feature of Ocaml once people get used to the fact that using 8 cores at once isn't that great of a goal when on-demand computing platforms let you scale much much larger. Pretty soon no one will be all that concerned about how much happens on any one machine. It will be how much gets done per process, and how much gets done per 'cloud'. AppEngine already does this.

And it already is a design feature. The GC is not single threaded b/c they're too dumb to figure out how to fix it (not that you were saying that, but people tend to jump to that conclusion). It's because they want the fastest possible single threaded performance, which means not polluting the GC with the kinds of locking mechanisms necessary to support multi-threaded operation.

> I think this is going to turn out to be a killer feature of Ocaml once people get used to the fact that using 8 cores at once isn't that great of a goal when on-demand computing platforms let you scale much much larger.

I think this point isn't made often enough. There certainly are times when scaling within a single machine is useful but, as is more often the case, if you have to scale beyond a single physical instance you might as well plan for process-oriented parallelism up front.

Certainly, but there are circumstances - say high-performance math & data heavy stuff - that benefit greatly from a shared memory implementation.

And the ease of use of multi-process architectures needs to be improved as well.

Note that OCaml is just fine for heavy math and data stuff: the garbage collector has a global lock, but threads work just fine for CPU intensive activities. Just don't generate much garbage (which is important for any data-heavy processing), and all is good.

The common case of parallelism is usually fine with process/message passing work IMO.

For the other 20% of the time... threads are awesome. :)

JoCaml, my friend.

Not sure why this got downvoted. JoCaml is actually an implementation.

Does JoCaml rewrite the GC? Or how else does it get around the GC issue?

JoCaml uses multiple processes. (But it hides the IPC details from you.)

I've tried OCaml a few times over the years, and I think its biggest problems are chicken/egg problems. It's impractical for "real world" use because not many people are using it in the real world.

My biggest complaints:

* Poor libraries - This is getting better, but it used to be a real pain to do stuff like make an HTTP request, parse JSON or XML, or connect to a database. They definitely weren't in the stdlib, and it was even pretty hard to find good third party libraries for it.

* Awkward library interfaces - Even the libraries that are included are awkward to use, like the Set and Map.

* Poor documentation - There's nothing (or was nothing) like http://perldoc.perl.org/ or http://docs.python.org/

And, off-hand, one language nitpick:

* Poorly handled strong typing - For example '+' is used to add integers while '+.' is used to add floats. The underlying reason makes sense, but there's awkward stuff like that all over. Haskell handles it better, IMO.
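For anyone who has not seen it, the nitpick looks like this in practice. The binding names are just for illustration.

```ocaml
(* Integer arithmetic uses the plain operators. *)
let int_sum = 1 + 2

(* Float arithmetic has its own dotted operators. *)
let float_sum = 1.5 +. 2.25

(* There is no implicit conversion, so mixing the two needs an
   explicit float_of_int (or int_of_float going the other way). *)
let mixed_sum = float_of_int 3 +. 0.5

(* let rejected = 1 + 2.5   is a type error: 2.5 is not an int *)
```

The upside is that type inference never has to guess which numeric type was meant; the downside is exactly the noise complained about above.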

The first three are indeed a big irritation, and a big motivation for writing this book. There is a lot of work going on to improve all of these at the moment, notably through Core from Jane Street providing a well-maintained accessible standard library, and a convergence of packaging systems. The good folks at OCamlPro are also working hard to better editor and documentation support, so I expect there will exist a documentation/package repository before this book is released.

> connect to a database

This is something I have been working on http://gaiustech.github.com/ociml/

Really overdue. Not sure if it closely follows Hickey's PDF draft from a couple years ago, but thought it was well done.

Of the 8 most frequently encountered FP languages (clojure, scheme, CL) (haskell, ocaml, F#, erlang, scala) all have at least a handful of good intro texts, plus a few intermediate +.

Except Ocaml

It's called F#. It comes with a much more succinct syntax, too.
