Python to OCaml: Retrospective (roscidus.com)
235 points by Envec83 on June 6, 2014 | 71 comments

As a language, OCaml gets most things right: being functional by default appeals to me very much, as does the easy path into imperative programming when you need it. Its type system is awesome not only at catching bugs at compile-time, but also at guiding you in your design. There are a couple of nice-to-haves that would be cool though: Haskell-like type classes come to mind, and some minor syntax warts could be improved. Its implementation is also extremely solid: I always expect the code generated by OCaml to perform well. With tools like OPAM, getting the latest compiler and libraries is very easy. Definitely one of my favorite language toolsets.

And there is always an option of using F# when needed.

This is possibly the best language writeup I have seen in years, if ever. Congratulations, sir!

His earlier post "OCaml: What you gain"[1] is also an interesting read.

[1] http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain...

This was a fascinating read. What I'd like to see more of is how the migration was managed, concentrating on how the mix of the two languages was managed. It talks about replacing functions and using JSON to glue them together and/or allow them to communicate.

I'd like to know more about that. Have I missed something?

Regardless, excellent write-up. Thank you.

Running both in a single process looked tricky because we need to work with the already-installed Python version, which could be 2.6, 2.7, 3.3, etc... so we'd have to provide a separate Python module for each one to do that.

Instead, the OCaml created a pair of pipes and spawned a Python subprocess (you need to use two pipes, not a single socket, because Windows doesn't support Unix sockets).

The protocol was asynchronous (replies can arrive out of order). Although the original JSON bridge is now gone, the new "0install slave" command (which allows other programs to control a 0install subprocess) uses a very similar system. It's documented here:
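To give a feel for what "replies can arrive out of order" implies on the calling side, here is a minimal sketch in plain OCaml. It is not the 0install code: the "ticket:payload" line format, the request names and all function names are made up for illustration, and the pipe I/O and JSON encoding are left out. The idea is just that every request carries a ticket and the caller keeps a table of callbacks waiting on each ticket.

```ocaml
(* Hypothetical sketch: matching out-of-order replies to their requests.
   The "ticket:payload" framing is an assumption, not the real protocol. *)

(* Callbacks waiting for a reply, keyed by the ticket sent with the request. *)
let pending : (int, string -> unit) Hashtbl.t = Hashtbl.create 16
let next_ticket = ref 0

(* Register a callback and return the line that would be written to the
   subprocess's stdin pipe. *)
let send_request payload on_reply =
  let ticket = !next_ticket in
  incr next_ticket;
  Hashtbl.add pending ticket on_reply;
  Printf.sprintf "%d:%s" ticket payload

(* Handle one line read from the subprocess's stdout pipe. *)
let handle_reply line =
  match String.index_opt line ':' with
  | None -> failwith "malformed reply"
  | Some i ->
    let ticket = int_of_string (String.sub line 0 i) in
    let body = String.sub line (i + 1) (String.length line - i - 1) in
    (match Hashtbl.find_opt pending ticket with
     | None -> failwith "reply for unknown ticket"
     | Some callback ->
       Hashtbl.remove pending ticket;
       callback body)

(* Replies collected by the demo below, most recent first. *)
let results = ref []

let () =
  let remember name reply = results := (name, reply) :: !results in
  let first = send_request "request-a" (remember "first") in
  let second = send_request "request-b" (remember "second") in
  Printf.printf "sent %s and %s\n" first second;
  (* The second request happens to finish before the first one. *)
  handle_reply "1:ok";
  handle_reply "0:finished";
  List.iter (fun (name, reply) -> Printf.printf "%s -> %s\n" name reply) !results
```

Each callback fires with the reply that carries its own ticket, regardless of arrival order, and the table is empty again once every request has been answered.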


The thing not described by the author is what OCaml is like to program. Python is easy to write and understand and makes sense and is a joy and not cryptic. What about OCaml?

Maybe not in the main article, but he gives an overall view of OCaml in the sublinks. For example, in the summary for "OCaml: what you gain": http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain...

The summary:


"OCaml’s main strengths are correctness and speed. Its type checking is very good at catching errors, and its “polymorphic variants” are a particularly useful feature, which I haven’t seen in other languages. Separate module interface files, abstract types, cycle-free dependencies, and data structures that are immutable by default help to make clean APIs.

Surprisingly, writing GTK GUI code in OCaml was easier than in Python. The resulting code was significantly shorter and, I suspect, will prove far more reliable. OCaml’s type checking is particularly welcome here, as GUI code is often difficult to unit-test.

The OCaml community is very good at maintaining API stability, allowing the same code to compile on old and new systems and (hopefully) minimising time spent updating it later."


This gives me the impression that OCaml is also a joy to program with, and additionally is more reliable and better at catching errors earlier than Python.
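For anyone wondering what the "polymorphic variants" in the quoted summary look like, here is a small sketch. The function names and tags are invented for illustration. Tags are written with a backtick and need no prior type declaration; the compiler infers which tags each function can produce or accept.

```ocaml
(* No type declaration is needed: the accepted tags are inferred from the
   patterns in this match. *)
let describe_result = function
  | `Ok n -> Printf.sprintf "got %d" n
  | `Not_found -> "nothing there"
  | `Timeout seconds -> Printf.sprintf "gave up after %.1fs" seconds

(* This function only ever produces `Ok or `Not_found. Its result can still
   be passed to describe_result, which accepts a superset of those tags. *)
let lookup table key =
  match List.assoc_opt key table with
  | Some v -> `Ok v
  | None -> `Not_found

let () =
  let table = [ ("a", 1); ("b", 2) ] in
  print_endline (describe_result (lookup table "a"));
  print_endline (describe_result (lookup table "z"))
```

Passing a tag that describe_result does not handle (say, `Cancelled) is rejected at compile time, which is the error-catching property the summary is praising.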

It is a joy when you get used to it, but it's not as immediately accessible as Python. As I recall, my first hour or so with OCaml was full of frustrating syntax errors. Keep going though, it's worth it!

One thing I don't think the tutorial mentions, but which helps a lot, is turning on all warnings ("-w A"). The tutorial also fails to mention ocamlbuild, so you start by compiling things manually, which makes everything harder than it should be.

And this is the reason why, while I like Python a lot, nowadays I only use it for scripting under UNIX.

Well, I would say this is a pretty positive post for Python as well. Only 8x slower than OCaml... Not bad for an interpreted dynamically typed language.

This makes a good case for prototyping in Python and only moving to something else when performance becomes an issue.

A Hindley-Milner type system also makes strongly typed languages quite pleasant to use for prototyping.

I mentioned UNIX, because on Windows my go-to language for private scripts is F#.

Interesting. Do you find it more or less terse than PowerShell?

I use both actually.

F# scripts for my own stuff, Powershell when those scripts need to be shared across the team or I need to interoperate with third party cmdlets.

It is quite terse, especially because PowerShell has a bit of VB flavour to its syntax, mixed with attributes everywhere.

I'd rather use the cleanliness of the ML syntax, if possible.

It can be as terse as you want it to be. The bigger virtues are its speed, language features, and the environment.

F# interactive is what powershell/ISE wants to be, when it grows up and drinks the magic potion of awesomeness.

OCaml is very enticing in many ways, with a lot of the features I love about Haskell; a few are missing, but it also has several that are not present in Haskell. Certainly, the strictness is a big plus for me, as is being able to do mutability and IO without a monad (purity isn't as important to me). I've been averse to picking it up because I've invested so much time and effort in Haskell, because I'm used to the Haskell way of FP, and because I much prefer Haskell's syntax. But maybe if I just dived in, then I'd find I even preferred it to Haskell.

I've actually thought about OCaml myself, but one silly thing keeps putting me off. It feels like typing "let" over and over again would drive me crazy!

I guess it's not that bad after looking at this for example:


Here's a copy of a previous post I've made on HN about why you might want to choose OCaml instead of another language. Nothing has changed about my opinion since the time I wrote it: Original link: https://news.ycombinator.com/item?id=7766315


A couple of people have asked why you might choose OCaml over other languages. I've not done as much OCaml work as others on this thread (I work primarily on ReactJS, Facebook/Instagram's functional UI framework), but I can offer a different perspective as someone who is outside of the OCaml community, but asking the same questions. Here are some of my personal findings.

I'll narrow any comparison down to the ML family of languages. Java/C++ and many other languages are just now beginning their slow but inevitable evolution into becoming a dialect of ML (which IMHO is a sort of admission of the ML/functional family's superiority). Once you embrace the power of pattern matching, it's hard to use anything but an ML family language (StandardML/Haskell/F#/OCaml). I would program in any one of those languages over Java/C++/Objective-C/JS.

Practical reasons why you might choose OCaml:

- OCaml's records aren't as elegant as SML's but OCaml has labeled arguments with optional default values which can satisfy many of the reasons why you'd use records as arguments in the first place (and may be even more powerful in some cases).

- Two modes of compilation (fast native executable XOR fast compilation). Who doesn't like options.

- All the benchmarks I can find show that OCaml is very fast (around as fast as C++).

- Excellent JS target and an apparent commitment to maintaining it (as someone building a JS library, this is very important to me) (and as someone who wants to build apps and be able to instantly share them with everyone in the world.)

- Someone has built an autocomplete plugin for Vim/Emacs (merlin). ("VimBox" (https://github.com/jordwalke/VimBox/) has configured it to complete as you type - like in Visual Studio etc.)

- On very rare occasion, you'll run into a problem that is inherently better suited to OO (dynamic dispatch). I can usually find a way to solve it with functors/modules, but it's nice to know that you have OO in your back pocket in case you ever need it. It's also nice to know you probably won't have to.

- Finally, a common package manager (OPAM) is becoming standard. I look forward to seeing how OPAM helps make the new dev experience and the code-sharing/development experience seamless.

- The module system is very powerful (SML's). Haskell does not have this, and strangely F# dropped it. (I hear, Haskell's type classes fulfill similar roles (but with more sugar)).

- There are usually ocamlyacc grammars for most languages. Most examples of languages, type systems, parsers are already in OCaml (or ML). It's a nice (but small) perk.

- Predictability. OCaml is not lazy by default. Lazy computations could become problematic for low-latency applications (such as UIs) if a lot of computation is deferred until the moment a final dependency has been satisfied, but by that time you may be close to your screen refresh deadline and it may lead to a dropped frame. It would have been better to have been computing while waiting for a final dependency. I'm not sure if Haskell (a lazy language) has had this problem. You can opt into laziness in OCaml if you would like to.

- Mutability. I feel strange saying this, as such a huge proponent of immutability, but sometimes you just need to hack something in place, mutate some state and come back to clean it up later. (You can still use monads in OCaml).

- Tagged Variants (no need to predeclare variants, just pattern match on them and OCaml ensures that only properly matched values ever make their way into that expression).

- Industry use is growing. OCaml is used here at Facebook and many other places as mentioned.

- There are many abstractions to choose from (Records, Objects, Modules, Functors, First Class Modules, GADTs, ...).

OCaml Cons:

- There are many abstractions to choose from (Records, Objects, Modules, Functors, First Class Modules, GADTs, ...).

(Edited for formatting)
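Two of the bullets above (labeled arguments with optional defaults, and opt-in laziness) are easy to show in a few lines. This is only a sketch: connect and its parameters are hypothetical names picked for the example, not part of any real library.

```ocaml
(* Labeled arguments with optional defaults, used where one might otherwise
   pass a configuration record. The trailing unit argument lets callers omit
   the optional ones. *)
let connect ?(port = 80) ?(timeout = 30.0) ~host () =
  Printf.sprintf "%s:%d (timeout %.0fs)" host port timeout

(* Laziness is opt-in: the body of a lazy value does not run until
   Lazy.force is called, and then it runs only once. *)
let evaluations = ref 0
let expensive = lazy (incr evaluations; 6 * 7)

let () =
  print_endline (connect ~host:"example.org" ());
  print_endline (connect ~port:8080 ~host:"localhost" ());
  assert (!evaluations = 0);           (* nothing has been computed yet *)
  assert (Lazy.force expensive = 42);  (* computed here *)
  assert (Lazy.force expensive = 42);  (* cached, not recomputed *)
  assert (!evaluations = 1)
```

Because evaluation is strict unless you write lazy, the point at which work happens is visible in the source, which is the predictability argument made in the list.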

Two weeks after the original time of writing, Apple released Swift - which is considerably influenced by the ML family of languages.

Indeed. I heard someone comment that every FP conference will be doing a 'victory lap' due to Apple's announcement. At the very least, I hope people/devs will explore ML based languages for their next project.

It would be interesting to get your take on the toolstack we're building using OCaml - http://nymote.org (I'm aware the copy on the site needs work but it's the tools I'm referring to).

Bryan O'Sullivan commented that https://mobile.twitter.com/bos31337/status/47356953136621158... and Graydon Hoare referenced it in his post on Swift http://graydon2.dreamwidth.org/5785.html

Yup, that's what I heard (obviously 3rd hand). Second link is very useful as I've not had a chance to look at the Swift docs yet.

Each of the components of nymote are interesting in their own right. I'm very interested to see how they can all be brought together to accomplish your goals which are ambitious and noble. How can I keep up to date on your progress?

"- The module system is very powerful (SML's). Haskell does not have this, and strangely F# dropped it. (I hear, Haskell's type classes fulfill similar roles (but with more sugar))."

F# runs on the standard .NET type system, which is not able to do type classes, modules or higher-kinded generics. On the other hand you have compatibility with all .NET libraries, which is a huge plus.
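For readers who haven't met the ML module system being discussed: a functor is a module parameterised by another module. A minimal sketch follows, with ORDERED, MakeCounter and CaseInsensitive as names invented for this example; only Map.Make and String come from the standard library.

```ocaml
(* The interface a key module must provide. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* A functor: given any ORDERED module, build an occurrence counter for it. *)
module MakeCounter (K : ORDERED) = struct
  module M = Map.Make (K)

  (* Number of times key was seen, 0 if never. *)
  let get key counts =
    match M.find_opt key counts with Some n -> n | None -> 0

  let count items =
    List.fold_left (fun acc key -> M.add key (get key acc + 1) acc) M.empty items
end

(* The standard String module already satisfies ORDERED. *)
module StringCounter = MakeCounter (String)

(* A different notion of ordering gives a different counter from the same code. *)
module CaseInsensitive = struct
  type t = string
  let compare a b =
    String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)
end

module CiCounter = MakeCounter (CaseInsensitive)

let () =
  let words = [ "OCaml"; "ocaml"; "Haskell" ] in
  Printf.printf "exact: %d, case-insensitive: %d\n"
    (StringCounter.get "ocaml" (StringCounter.count words))
    (CiCounter.get "ocaml" (CiCounter.count words))
```

Haskell would typically express the same thing with a type class constraint on the key type; the functor version makes the choice of ordering an explicit module argument instead.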

It takes a full minute to build the finished project - is this normal or..?

That's without parallelisation. There is a slowdown when building OCaml projects with a lot of packages, but this is being fixed with some build system improvements (removing the need for ocamlfind on every compiler invocation, switching from camlp4 to ppx extensions, and using the native code compiler instead of a bytecode one). It only affects large projects though, as most of the time you just modify a few files and do an incremental build.

That made me curious, so I did a quick profile:

    25.52%        camlp4  ocamlrun      [.] caml_interprete
    14.60%  ocamlopt.opt  ocamlopt.opt  [.] 0x000000000016ec61
    10.51%     ocamlfind  ocamlfind     [.] caml_interprete

Looks like ocamlfind is the bytecode version too ... I opened a bug.

Is my impression correct that the general interest in OCaml has waned & its development has somewhat stalled? I played with OCaml when it was at version 3. I remember the mailing list being rather vibrant. Now there seem to be fewer posts. I think this is a pity because I really would like to see OCaml take off. Projects like ocamljava and js_of_ocaml also sound quite promising.

That impression is incorrect. I don't think the core devs and the overall community have ever been as vibrant as they are now.

Take for instance compiler development: half a decade ago, updates were rare and marginal. Nowadays you can expect a new compiler version every single year with lots of exciting features.

I don't think you've looked very closely (if at all?). There is tons of stuff happening in the OCaml ecosystem. Compiler improvements, large scale projects, more industrial uses and also new books. Try looking at http://OCaml.org for some links (and the planet newsfeed), as well as the websites of OCaml Labs (see the article) or OCamlPro.

He's right about the mailing list though. I regret there is not more discussion happening there.

Why? Frankly, I'm glad that there is more stuff happening out on the net at large rather than trapped on one list. I consider it a sign of success (there are probably many more users out there than who are on the mailing list anyway). I've used Python, Ruby, etc but I haven't ever joined their main lists and I don't feel I'm missing out (anything important will spread in other ways).

As a beginner, it is interesting to read more advanced questions and their answers. I learned a lot of Ruby that way at the time, and I miss it for OCaml now.

This does still happen (e.g. [1]), though I've no idea if the frequency of such posts has changed. If no-one asks questions, then there's nothing to answer (so please do ask!). On the other hand, I've noticed that more people have started to ask/answer questions on StackOverflow [2] so perhaps this just represents the general shift that other langs have already been through.

[1] https://sympa.inria.fr/sympa/arc/caml-list/2014-05/msg00146....

[2] http://stackoverflow.com/questions/tagged/ocaml

Is it me, or are the "Big and Slow" and "Small and Fast" labels in the Speed vs Size chart placed in the wrong corners?

it's startup time, not general speed

I was just misreading the graph, assuming farther out on each axis was better (faster/smaller)

I wonder what he means by "lack of support for shared libraries".

Does OCaml support dylibs/.so files/DLLs?

Very informative post, thanks a lot!

  Mirage unikernel - an operating system written in OCaml
  (the Mirage web-site is all implemented in OCaml, down to
  and including the TCP/IP stack!). That will have to be
  the subject for another blog post though...

I am looking forward to such posts. Keep up your good work!

Mirage was also featured on the latest Software Engineering Radio podcast: http://www.se-radio.net/2014/05/episode-204-anil-madhavapedd...

You can read about the Mirage work at http://openmirage.org It's also a core piece of a toolstack we're putting together for building distributed systems/applications (http://nymote.org).

Here's a little more discussion about Mirage, and the related Arrakis:



A year. A year of an apparently excellent programmer's output -- perhaps 2% of everything he will ever accomplish. To port a stable project to an 18-year-old language.

and in which he gained a massive speedup in the project, made it a lot easier to maintain and extend, learnt a lot about OCaml, and finally managed to get a really exciting job due to the exposure his blog gave him.

Easier to maintain? Well at least you should add that this move from Python to OCaml made it much harder to find someone able to maintain this project.

Wow, hate-haters gonna hate my hate. The slowest metric for the old code is about 1/3 of a second -- how much benefit does slicing that deliver in real world 0install invocations? "Easier to maintain and extend" -- well, he just cut the population of potential contributors to the project by a factor of 1000. The exciting job he got because it required knowing OCaml, and he spent a year advertising himself as a competent OCaml hacker. I find it very strange to care more about what language a project is written in than what the project actually does.

I'm going to feed the troll here, but "cut the population of potential contributors to the project by a factor of 1000" is a flawed argument: the value of potential contributors is very low. You're better off trying to increase the comfort and power of the _current_ contributors.

> hate-haters gonna hate my hate

Yeah, and fanbois gonna fanboi. He found something that works better for his use case. Sure, Python has more programmers available, but that's because it's the flavor of the week (thanks to Google, let's be honest). In 3 years it may be the language time forgot. It makes a lot more sense to pick something you enjoy programming in and makes your life easier than something that everyone else thinks is the shit. Python has its place, and that place is not everywhere.

Python itself is 23 years old and older than OCaml. Java, C, C++ are also older than OCaml and all 4 are highly used today.

I don't think language age has any bearing on quality or usefulness.

Actually, the core of the language is way older. See http://en.wikipedia.org/wiki/Caml_Light and it can be considered a dialect of ML, which is even older (from 1973).

Considering most languages (even most new ones) ignore many more decades of programming language research, 18 years is pretty good: it was cutting edge PL research at the time.

OCaml hasn't stood still either. The latest releases take advantage of some "fairly recent" research. Python on the other hand didn't even take advantage of cutting edge research at the time it was created, and as far as I can tell hasn't since.

During which time he became proficient in the language and landed what sounds like a pretty cool job.

Youth, for a programming language (and many other things), is not a positive benefit.

The thing is that strong static inference beats dynamicity.

It's just an opinion, not "the thing", whatever that would be.

OK, my opinion is that there is no need to throw away type safety to gain succinctness, especially if your language gives you a REPL.

The advantage of dynamic languages is not in succinctness; the advantage is being able to defer the concrete definitions of the objects you're dealing with, or to ignore them entirely: it's what lets you pass arbitrary objects around in your middleware stack, or do things like ActiveRecord, click.py or sh.py which do crazy things with the object and module system, etc. There are things that are really hard in a statically typed language, or that you simply can't do, that dynamic languages let you do effortlessly. Of course, all of this freedom comes with cost. But gaining succinctness is hardly the major win of dynamic languages.

Well, succinctness is an advantage, and I'll claim with no data to support me that it is the reason, combined with the lack of compilation, that most developers choose dynamic languages, without necessarily realizing what they are giving up.

As someone who works for a company that uses Python, but mostly uses Haskell in his spare time, my Python code is often far more verbose than my Haskell (for a similar degree of complexity). I suppose if your yardstick is Java or something then you'll probably be more concise in a dynamic language, but that's really beside the point. The number of characters you type in a language is one of the least important factors in choosing whether to use it or not. It's certainly not the reason that we use it.

I'm thinking about the golden age of PHP. Lower amount of typing, and typing weak enough that whatever junk you type ends up in something actually displayed. The no-compilation part also lowers the barrier to entry and makes deployment easier. Also features the fantastic ability to break your production code by debugging it on the server!

I agree, although there are also other benefits of dynamic typing.

With modern type systems employing structural subtyping, row polymorphism, proper variance and covariance handling and other advanced techniques, static typing becomes a very powerful tool.

On the other hand, there are correct programs which we still don't know how to type. This means that there are programs we just can't write in statically typed languages, at least not yet.

Anyway, I'm not arguing against or in favour of static or dynamic typing. I'm simply stating the fact that choosing one over the other is still just a matter of preference and personal decision about which set of tradeoffs you like better.

It's exactly the same as with the Vim vs. Emacs debate. I can't believe I got downvoted for pointing this out. I believe that there is a place and a good way of discussing relative merits of static and dynamic type systems, but simply stating your preference in a thread about something unrelated is most definitely not a good way.

> On the other hand, there are correct programs which we still don't know how to type. This means that there are programs we just can't write in statically typed languages, at least not yet.

I don't think that's right. If something can't be type-checked then you just add an "assert" and do the check dynamically.

For example, in 0install I need to ask the SAT solver which version of a program I should use. I pass it a load of SAT variables, each containing a user data field with the attached program details, run the solver, and get back the selected variable. The SAT problem also contains other variables, with other kinds of data.

I know the user data field for the selected result will be of type program details, but I couldn't figure out how to let the compiler check this statically. It's not a problem; I just check the type and "assert false" if it's wrong. I still get the benefit of static checking almost everywhere else in the code.
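A minimal sketch of that pattern, for illustration only. The types and the stand-in solve function below are hypothetical and much simpler than a real SAT solver; the point is just where the dynamic check goes.

```ocaml
type program_details = { name : string; version : string }

(* Every SAT variable carries some user data; only some describe programs. *)
type user_data =
  | Program of program_details
  | Restriction of string

type sat_var = { id : int; data : user_data }

(* Stand-in for the solver (an assumption for this sketch): it simply picks
   the first variable that describes a program. Its return type says nothing
   about which kind of user data comes back. *)
let solve (vars : sat_var list) : sat_var =
  List.find (fun v -> match v.data with Program _ -> true | Restriction _ -> false) vars

(* The caller knows the selected variable holds program details, but the
   compiler does not, so the impossible case is checked at run time. *)
let selected_program vars =
  match (solve vars).data with
  | Program details -> details
  | Restriction _ -> assert false

let () =
  let vars =
    [ { id = 0; data = Restriction "needs-x11" };
      { id = 1; data = Program { name = "editor"; version = "1.2" } } ]
  in
  let chosen = selected_program vars in
  Printf.printf "selected %s %s\n" chosen.name chosen.version
```

Everything downstream of selected_program works with a plain program_details value, so the single run-time check does not leak into the rest of the code.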

> On the other hand, there are correct programs which we still don't know how to type. This means that there are programs we just can't write in statically typed languages, at least not yet.

I know this is true, but is it common? What would be a realistic case for this? Realistic as in, "code most people who are fans of dynamic typing find themselves writing". Contrived examples can always be found, but if someone told me they don't use static typing because there are programs that cannot be typed, I'd ask them "how often do you write that kind of program?".

Well, the parent post is not exactly an example of objectivity either. I agree that dynamic typing can be a useful tool. I am likely to use both in any n-tier project and benefit from that.

Why especially if the language has a REPL?

Common Lisp guy gonna chime in here...having a REPL that integrates with your editor (vim/slimv in my case) is an intense improvement on any other language I've ever used. With C you edit, leave, `make`, run. With PHP you edit, open browser, F5, run...same with JS.

With lisp, I edit, eval (simple command in my editor), see result. If I'm building a game or a long-running app, I can redefine functions while it's running without doing a full recompile and having to try and get back to the state the app was in when I wanted to test my change.

So you get the ability to live-code while the app is running. Once you get used to being able to do this, leaving your editor and opening a shell to test your app seems like a barbaric exercise.

So a good REPL (lisp or not, doesn't matter) can make the experience a lot more interactive and experimental than your average write -> save -> compile -> run cycle.

I'm no JavaScript advocate, but I just found out that you can use Chrome like that for JS, where I can edit a function, press Ctrl+S, the Chrome REPL will say something like "functions recompiled in 80ms," and then your running app will use that function. I have to say that developing like that is pretty great.

Unfortunately the OCaml toplevel pales in comparison to CL environments like slime in terms of visibility and debuggability. Redefinition is also off the table, as OCaml is a very static language. (You can rebind the same name, which can be helpful in exploratory programming but is not at all the same thing.)

It remains an extremely useful tool though.
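The difference between rebinding and redefinition can be seen in four lines, whether typed into the toplevel or compiled as a file (greet and call_greet are just example names):

```ocaml
let greet () = "hello"

(* call_greet refers to the binding of greet that exists at this point. *)
let call_greet () = greet ()

(* This does not change the function above; it introduces a new binding
   that shadows the old name from here on. *)
let greet () = "goodbye"

let () =
  assert (greet () = "goodbye");     (* new code sees the new binding *)
  assert (call_greet () = "hello")   (* existing code still uses the old one *)
```

In a Lisp image the second call would pick up the new definition, which is the live-coding behaviour described further up the thread.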
