Hacker News
HLVM -- the High-Level Virtual Machine (ffconsultancy.com)
37 points by njn on Feb 16, 2010 | 33 comments

Neat technology, but, even as a rather serious O'Caml fan, even going so far as to use it in production ;), I tend to avoid the FF consultancy because of things like this:

> If you would like to keep up to date with respect to HLVM development, please subscribe to The OCaml Journal.

And when I go to read the online documentation?

> The design and implementation of this high-performance garbage collected virtual machine is described in detail in the OCaml Journal articles "Building a Virtual Machine with LLVM" from January to March 2009.

If you have to subscribe to a pay journal to learn all about it, chances are very good it isn't going anywhere at all.

Also, as something of a counterpoint to the whole "CLR missing on Linux or Mac OSX"... I've not run into many CLR compatibility issues with Mono at all. Also, they're not just doing the CLR; for example they've added first-class continuations (yes, that's right: call/cc!) to the mono VM; for language implementers that's rather attractive:


The HLVM implementation is for everyone to see at http://hlvm.forge.ocamlcore.org/ . There is a public mailing list at https://lists.forge.ocamlcore.org/pipermail/hlvm-list/ and I am sure Jon Harrop would be more than happy to discuss all technical details there.

Mono OTOH might be okayish for C#, but F#, for instance, so far has no open source implementation. Also, Mono is lacking w.r.t. TCO and GC, so it's not even close to a decent VM for functional languages.

Yes, it's 'open' -- not arguing that, nor would I dispute that Mono is far from perfect. [1]

But, like so many FF projects: is this the cart, or the horse?

In other words, this feels more like a promotional project for the consulting firm and its pay products than it does a 'real' project. I freely admit this judgement has some external bias; Jon H. has a reputation.

[1] footnote: (Of course, neither is the linked project... 128 bits for each and every pointer on a 32-bit architecture? Does that mean 256 bits per pointer on 64-bit? Yikes!)

I'm Jon Harrop. HLVM is a hobby project that I work on when I can find the time. If I could commercialize it then I would but there is no short-term way to make decent money from a VM/language that I can see.

I suggest you think carefully about when HLVM's fat references are a disadvantage compared to the alternatives and why. The real reasons are not at all what you're thinking (or what I thought for a long time).


The ability to interoperate safely and at a high-level between different languages, from managed C++ to F#, has greatly accelerated development on the Microsoft platform. The resulting libraries, like Windows Presentation Foundation, are already a generation ahead of anything available on any other platform.

Linux and Mac OS X do not currently have the luxury of a solid foundation like the CLR. Consequently, they are composed entirely from uninteroperable components written in independent languages, from unmanaged custom C++ dialects to Objective C and Python.


And that's the problem they're trying to solve. Bring some C# Microsoft love to poor dated OS X (and the like).

I abstain.

Making portable applications isn't a solved problem. There's Java. It's decent, especially with Clojure and other JVM languages, but I find the idea of a more powerful language (like OCaml) not without merits. Clojure has run into a lot of problems due to the JVM, probably the most famous one being the lack of tail call optimization.

You're right, portability of applications is an issue. Though I'd say it's deeper than toolkits, deeper than libraries or frameworks, and deeper than languages.

It starts in the interaction model. I like what Apple is doing with this (think iWork on the Mac and iWork on the iPad), and I'd hate to see that go away.

You could argue all you want that the Windows 7 desktop is not that different from the MacOS 10.5+ desktop, not that different from the Web interaction model; you'd have your points and I'd have mine. My main one is that if it feels different, then it is (a bit of a duck-typing mindset, understandably).

And if it is different, then it requires a fundamentally different software application, even if the problem domain you're tackling is the same. Some low-level things might be portable, but many won't be; whatever's closer to hardware (audio, storage, search, video, rasterization) is more likely to be reasonably and defensibly portable.

Everything else I want to be specific to the interaction model of a given device, within reason.

Objective-C and OCaml are fairly high-level, but that's not what they are going after here; they are going after the likes of WPF and Cocoa and Cocoa Touch, general-purpose high level libraries of code. And that's exactly what I would not want to ever happen.

Let's not go there. Because it would be terribly silly.

Why do you need HLVM to do this? OCaml already runs on multiple platforms, so all that's needed is some library work. The same is true for Haskell, Standard ML, Lisp, and many other languages. If you want to write languages for multiple platforms HLVM sounds useful. If you want to write portable programs, why not work on improving the libraries available for an existing language?

I don't think that is the point. I am pretty sure the author was using the CLR as an example. You could write WPF applications using F#, IronPython, or IronRuby because they all compile to CLR code. If the most popular languages on Mac and Linux all compiled down to LLVM, they would be a lot more interoperable, like the CLR languages. It would be much easier to leverage the Cocoa framework from Python, Lisp, or Java if all were built on LLVM (or on HLVM, which is built on LLVM).

That makes sense. MacRuby is using the Objective-C runtime as the common denominator, but I'd love to see other languages do the same.

> Clojure has run into a lot of problems due to the JVM, probably the most famous one being the lack of tail call optimization.

What in particular about the JVM prevents Clojure from being able to do tail-call optimization? My (admittedly naive) understanding of TCO is that it involves turning tail recursion into a loop, and Java can certainly handle loops OK.

> My (admittedly naive) understanding of TCO is that it involves turning tail recursion into a loop

This is close, but not the full story.

TCO turns tail recursion into a GOTO (well, a goto-with-arguments). In the case of a single function which tail-calls itself, this becomes a loop, but for multiple functions that are mutually recursive, that's not the case.

Note that the issues with the lack of TCO boil down to efficiency: there are ways of faking TCO atop a runtime that does not support it (meaning deeply tail-recursive code will run, rather than overflow the stack), but they come with a runtime cost.
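One common way of faking TCO on a runtime without it is a trampoline. Here is a minimal sketch in Java (assuming Java 8+ for the lambdas; the names `Trampoline`, `Step`, `isEven` and `isOdd` are made up for illustration, and this is not how any particular language implementation does it). Each mutually recursive "tail call" is returned as a thunk rather than invoked, and a driver loop keeps bouncing until a result appears:

```java
import java.util.function.Supplier;

// Hypothetical names throughout; a sketch of trampolining, not any runtime's actual mechanism.
final class Trampoline {
    // A step is either a finished result or a thunk that produces the next step.
    static final class Step {
        final Boolean result;       // non-null once the computation is done
        final Supplier<Step> next;  // non-null while there is more work to do

        private Step(Boolean result, Supplier<Step> next) {
            this.result = result;
            this.next = next;
        }

        static Step done(boolean value) { return new Step(value, null); }
        static Step more(Supplier<Step> next) { return new Step(null, next); }
    }

    // Mutually recursive "tail calls" are returned as thunks instead of being invoked.
    static Step isEven(int n) {
        return n == 0 ? Step.done(true) : Step.more(() -> isOdd(n - 1));
    }

    static Step isOdd(int n) {
        return n == 0 ? Step.done(false) : Step.more(() -> isEven(n - 1));
    }

    // The driver loop: constant stack depth, but one allocation per bounce.
    static boolean run(Step step) {
        while (step.result == null) {
            step = step.next.get();
        }
        return step.result;
    }

    public static void main(String[] args) {
        System.out.println(run(isEven(1_000_000)));
    }
}
```

Calling `isEven` and `isOdd` directly as ordinary recursive methods at this depth would be expected to overflow the default JVM stack. The trampolined version runs, but every call pays for a heap allocation and an indirect call, which is the runtime cost mentioned above.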

I'm naive too, but apparently there are use cases for TCO for state machines where the tail calls are between various functions, not necessarily in a simple loop.

The virtues of the CLR are (rightly) extolled for Windows, then the lack of the CLR on Linux/OS X is cited as one justification for the HLVM. Mono is alive and well on those two platforms (notwithstanding hate from the GNU/Stallman camp).

Not to start another thread, but I agree with his point on WPF being a generation ahead of similar technologies on other platforms. I hope we see work start on that at some point for Mono (running on OpenGL, for example).

How is WPF a generation ahead of what OS X has? I haven't used WPF since 3.5 came out, but it didn't seem to offer anything better than what Cocoa offers. I think it's much better than Qt, but Qt has to make sacrifices to be cross platform, so it's a different kind of framework.

I love OCaml, which this virtual machine says it is based upon, but doesn't it have issues with multi-core processors? That is, doesn't it suffer from a limitation due to its garbage collector?

This VM is based on LLVM and in particular aimed at parallel programming. As its author said: "HLVM is an experiment to see what a next-generation implementation of a language like OCaml might look like, addressing all of OCaml's major shortcomings (interoperability, parallelism and some performance issues)." [http://caml.inria.fr/pub/ml-archives/caml-list/2009/09/a94db...]

I have no idea, but I've just upvoted you back to 1. If someone knows more / better please post here and explain rather than anonymously downvote, which doesn't explain anything to anyone.

[edit: ah, ok, to expand on what 45g said, ocaml is used to build the tools, but anything that targets the HLVM will end up using LLVM for runtime code generation - so it will either compile to assembly, or use a JIT VM. either way, any limitations in ocaml will only affect the tools, not the final compiled program]

Ah, now I see. So this will get around the true multi-threading issues. Thank you both for the clarification. This is truly something to watch.

I'm curious: what about the JVM?

Kind of like Java in general, the JVM sits in the middle ground. LLVM is a good starting point for building low level languages (although nothing prevents you from writing high level languages in it, you just have to write more). The HLVM is supposed to be a good starting point for building high level languages with things like garbage collection, closures, tail call elimination, etc. The JVM isn't the fastest for low level applications and isn't the easiest for high level functional languages.

The JVM is a great platform if the language in question is relatively close to Java, and you're just layering on higher-level syntax like closures. (Sure, it would be nice if Java supported them, but they're not hard to implement as anonymous inner classes, and to simulate real closing over variables you just need to wrap all captured variables in one-element arrays or other wrapper objects at the declaration site. It's annoying, but it's livable).
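To make the one-element-array trick concrete, here is a small sketch of roughly what a compiler targeting the JVM might emit for a counter closure (the `Counter` interface and `ClosureDemo` class are hypothetical names, not taken from any real compiler):

```java
// Hypothetical interface and class names; a sketch of the workaround, not real compiler output.
interface Counter {
    int next();
}

final class ClosureDemo {
    static Counter makeCounter() {
        // Locals captured by an anonymous inner class must be final, so the
        // mutable state lives in a one-element array whose reference is final.
        final int[] count = {0};
        return new Counter() {
            @Override
            public int next() {
                count[0] += 1;  // mutates the shared cell, not the captured reference
                return count[0];
            }
        };
    }

    public static void main(String[] args) {
        Counter c = makeCounter();
        c.next();
        c.next();
        System.out.println(c.next());  // prints 3
    }
}
```

Each call to `makeCounter` allocates its own array, so separate counters do not share state, which is the behavior you would expect from a real closure.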

The further you move away from the Java typesystem or execution model, the less fun your life will be and the fewer benefits you'll get from the JVM: moving to dynamic typing will cost you a large amount of performance, for example, and you'll have to fight the Java type system a fair amount. I can only imagine what it would mean to do lazy-evaluation in a JVM language.

If it matches what you're trying to do, though, it's really a great platform: the bytecode format is pretty simple and easy to learn, the JVM runs on pretty much any platform you need, and it does a huge amount of optimization for you. If you're doing a statically-typed language and compiling down to Java-like bytecode (as opposed to handling dispatch yourself, as you have to do in a dynamically-typed language), you also basically get debuggers and profilers for free.

It's definitely a little weird, though, that they'd set up the motivation for HLVM as due to the weaknesses of the CLR's cross-platform capabilities, without mentioning the JVM at all, given that the JVM was, in spirit at least, the precursor to the CLR and still the most widely-used higher-level VM and one that works basically everywhere.

Maybe it is because the strategy more closely resembles the relationship between the CLR and DLR than the JVM and nothing. Like you said the JVM is not dynamic language friendly, and Sun/Oracle has really shown much interest in making it so.

(I am having a bit of an issue parsing the last sentence but I guess it was "hasn't")

Isn't the DaVinci/MLVM project an example of how Sun actually tried to move towards dynamic and non-Java languages in the JVM? invokedynamic and method handles are already in the upcoming Java 7, and that goes a long way towards supporting dynamic languages on the JVM.
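For a feel of what method handles look like from the Java side, here is a minimal sketch using the `java.lang.invoke` API as it shipped in Java 7 (the `MethodHandleDemo` class and `lengthViaHandle` method are illustrative names). It resolves `String.length()` by name at runtime and invokes it through the handle, which is the kind of building block a dynamic language runtime can use for its own dispatch:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustrative class name; only the java.lang.invoke calls are real API.
final class MethodHandleDemo {
    // Looks up String.length() at runtime and calls it through a method handle.
    static int lengthViaHandle(String s) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle length = lookup.findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call site's types to match the handle's type: (String)int.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException("method handle invocation failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaHandle("HLVM"));  // prints 4
    }
}
```

This only shows the method handle half; invokedynamic itself is a bytecode instruction that a language's compiler emits, so it has no direct equivalent in plain Java source.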

Yes, I meant hasn't. Sorry for not proofreading.

The great advantage of the JVM is it has tons of libraries. This means that any competitor such as HLVM will have to fight very hard to gain traction.

That is part of the allure of being built on LLVM. Recompile some C and C++ libraries using the gcc or clang front-end to LLVM and now HLVM has a ton of libraries it can access.

I've always felt that the JVM seems to be a bit of an outsider - where the CLR seems tightly integrated. Whether this is due to the startup time, the need to start apps by invoking "java ..." rather than directly, or some other factor, I'm not too sure.

I know that I could add options for .jar files to automatically invoke the JVM, but then not all use a Main class, so that's not a guaranteed fix.

Maybe with the new generation of "cool" languages for the JVM we may start to see some change on this front. Oracle may have some interesting ideas as well, as new "owners" of the Java platform...

HLVM is targeted at high performance scientific applications. The current implementation already outperforms OCaml in several benchmarks (OCaml is more optimized for symbolic computations and sometimes 'boxes' floating point values).

But "Closures for functional programming." is still on the TBD list.

Interesting exercise, but of little use: this is solving (well, trying to) a political problem with more technology. JVM is already on the table, let's use that.

A contrarian option might be to target java code for the LLVM... don't know what that would achieve, but it may lead to interesting applications.

There is a project called VMKit which allows one to build a JVM or a CLR using the LLVM framework.

