Hacker News
Clasp – A Common Lisp with LLVM back end and interoperation with C++ (drmeister.wordpress.com)
278 points by drmeister on Sept 25, 2014 | 74 comments

>Programmers can expose C++ classes as well as functions and class methods with a single line of code that provides the Clasp name and a pointer to the function/method.

Hellooooo Qt without Smoke bindings.

It would be really shiny to have a cross-platform GUI library in CL without grief. I've prodded that space repeatedly in the last 6 years and simply have not come away happy about any of it except LispWorks' library.

Sounded good until I saw this:

A faster Clasp compiler is coming soon – the current Clasp compiler generates slow native code that is about 100x slower than highly tuned Common Lisp compilers like Steel Bank Common Lisp.

No mention of the expected speedup, but 100x is a huge performance problem.

The reason is that Clasp doesn't do Lisp language level optimizations like escape analysis yet. All bindings are stored on the heap and the stack/registers are underutilized. LLVM is a great library for implementing C and C++ but more work needs to be done to support Lisp features like closures and first-class functions. We are working on that now. The goal was first to "make it correct" and now we will "make it fast". Once we have a faster compiler (give us a couple of months) I don't see why it couldn't approach the speed of SBCL (a tall order).
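To make the heap-versus-stack point concrete, here is a rough analogy in Python rather than anything from Clasp. CPython only moves a local variable into a heap-allocated cell when an inner function captures it, and keeps the rest in the frame's fast local slots. The names `make_counter`, `count`, and `scratch` are made up for illustration, and this is far simpler than the escape analysis a Lisp compiler would need.

```python
def make_counter():
    count = 0          # captured by increment(), so it must outlive this call
    scratch = 41 + 1   # never captured, so it can stay in the frame

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


code = make_counter.__code__
cell_vars = code.co_cellvars    # locals that escape into a closure
frame_vars = code.co_varnames   # locals kept in the frame's fast slots

counter = make_counter()
first = counter()
second = counter()

print("escaping:", cell_vars)
print("frame-local:", frame_vars)
print("counter values:", first, second)
```

A compiler that skips this kind of analysis has to treat every binding like `count`, which is roughly the situation described above.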

Glad to hear it. I've been wondering when new languages would be developed on top of LLVM and it's starting to happen (Julia, now this). C++ is getting a lot of more advanced features, but the syntax is so clunky compared to something like Python for example. I also like the notion of easy C++ library use for things like Qt.

I'm not too worried, since a) that makes it about as fast as CPython and b) there is a lot of low-hanging fruit. There is work being done on integrating the Cleavir compiler, which does a number of tricks. In particular, escape analysis and lambda lifting should both help a lot on certain types of code.

Any weekend compiler project can be faster than CPython.

Well, CPython is an interpreter, not a compiler, so that's not really hard to do.

Not exactly. CPython combines a compiler and a bytecode interpreter. It compiles to bytecode, which is then run by the bytecode interpreter.
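The two stages are easy to see from inside CPython itself. The sketch below uses only the built-in `compile` and `exec` plus the standard `dis` module; the `add` function and the `<example>` filename are placeholders, and the exact opcodes that `dis` prints vary between CPython versions.

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# Stage 1: the compiler turns source text into a code object (bytecode).
module_code = compile(source, "<example>", "exec")

# Stage 2: the bytecode interpreter executes that code object.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# Every function carries its compiled bytecode as raw bytes.
bytecode = add.__code__.co_code
print(type(bytecode).__name__, len(bytecode))

# Human-readable listing of the instructions the interpreter will run.
dis.dis(add)

result = add(2, 3)
print(result)
```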

> Any weekend compiler project can be faster than CPython.

Not true and not even funny.

To write a compiler (and not an interpreter) it's indeed almost impossible to reach the slowness of CPython. Especially because it's hard to create a complex language in just a weekend.

Using LLVM as the backend, it's not hard to make a toy language that runs insanely fast with only integer operations.


Considering that SBCL is among the fastest programming language implementations there is, 100x is actually not terrible for a first version.

> Considering that SBCL is among the fastest programming language implementations there is...

Source for this? I'd love to find a really fast lisp or scheme. Hadn't heard SBCL was particularly fast, and the alioth benchmarks don't show anything special there.


Edit: actually SBCL stacks up alright against languages like Go or Rust in the alioth benchmarks, so maybe that's what you had in mind.

I'd say those numbers you link to are pretty amazing for a garbage collected language. Compared to Java, SBCL is roughly on par, with some benchmarks 2-3x slower and some 2-3x faster.

edit: To clarify, I'm not saying SBCL makes Common Lisp the fastest language (I don't even think that's a meaningful statement). But to be within 2-3x of the JVM or C (and even outperforming C in some scenarios) certainly puts SBCL among the fastest language implementations. All the other ones you mention (C, C++, Go, Java..) are indeed also among the fastest. :)

Amazing for a garbage collected dynamically typed language.

> Hadn't heard SBCL was particularly fast

What did you think was particularly faster than SBCL?

> What did you think was particularly faster than SBCL?

Haskell, Scala, Java, Go and of course C, C++, Fortran all outperform SBCL in the alioth benchmarks.

Against the schemes, lisps, and "scripting" languages though SBCL stacks up favorably. I didn't notice that krig's comment was mainly comparing SBCL to this latter category ("programming language implementations").

> Haskell, Scala, Java, Go and of course C, C++, Fortran all outperform SBCL in the alioth benchmarks.

The way I read it, SBCL is in the same ballpark as what you mentioned above and orders of magnitude faster than the rest. It's a logarithmic scale.

Then again, it's a biased selection of benchmarks. Try finding a faster regular expression engine than CL-PPCRE.

I will stick to the stance pointed out by "Let Over Lambda": Common Lisp is, by language design, the fastest language around. As long as language X does not have COMPILE, it cannot beat CL at a whole class of benchmarks (not found on alioth).

Also consider that the C/C++ versions use intrinsics, which means it's basically a compiler vs. hand-tuned assembly-level code. Without that level of optimization they're fairly equivalent.

I imagine commercial Lisps like LispWorks would fare even better.

Depends on what you look at. LispWorks is fast, but not really faster than SBCL in Gabriel benchmarks. But what about real programs, garbage collection, etc.? LispWorks is used because its implementation is very capable, widely ported and it has stuff like a GUI toolkit which runs on Windows, GTK+ and Cocoa/Apple.

Thanks for the info.

I never used it, apart from reading the usual magazine reviews in the old days, so I assumed it was strong in that area as well.

It is; the 64-bit version is really fast, similar to SBCL. But that's already kind of a local maximum. The GC and the rest of the runtime are better, probably CLOS is more optimized, ... LispWorks does not type check code much, but it has type inferencing when optimizing code.

You can see my benchmarks (the last from 2013) on my Mac:



Essentially, LLVM does tons of low-level optimizations but Clasp does few high-level, CL-specific ones. SBCL, on the other hand, mostly does CL-level optimizations and few low-level ones.

Yes, LLVM is awesome. Clasp uses the inlining LLVM provides to inline C++ code within Common Lisp code by generating LLVM-IR bitcode files from C++ source and linking them with Clasp generated bitcode files and then running them through LLVM module and function optimization passes. I'm really excited about combining the CL specific optimizations with the ones that LLVM provides - stay tuned.

In his defense, SBCL is really fast...

This is extremely exciting, particularly if it can be made to work with emscripten. Common Lisp in the browser, here we come!

Almost certainly can't. The closest you can get is using clicc or ECL or something similar to generate C code, and then compile that with emscripten.

This invokes LLVM at runtime, and AFAIK LLVM isn't ported to emscripten.

FWIW, I've tried other lisps under emscripten:

Most lisps generate machine code, store it in RAM, and then execute it from RAM. This is not possible under emscripten.

CLISP is a good candidate, since it's byte-code interpreted rather than generating machine code, but it makes a lot of assumptions about how the machine works (in particular it strongly wants a C style stack and does manual stack-pointer manipulation). I actually got fairly far into the bootstrap process under emscripten, but the minimal lisp interpreter it compiles generated bizarre errors.

I disagree. Clasp could do this because it compiles Common Lisp to LLVM-IR bitcode files (using COMPILE-FILE) as well as directly to native code using LLVM's MCJIT engine (using COMPILE). emscripten (https://github.com/kripken/emscripten) says that it compiles LLVM-IR to JavaScript. I haven't used emscripten, but I believe everything I read on the internet :-) and thus Common Lisp --[Clasp]--> LLVM-IR bitcode files --[emscripten]--> run within browsers.

I'm not an expert in Common Lisp internals, but there are some things that don't look like they could be entirely put in an executable without bringing the compiler and runtime along. I'm not just talking about things like a library that outright uses eval, but the more high-level things like macros that call compiled functions and reader macros.

Then again, you might be able to compile the compiler to LLVM IR -> emscripten -> JavaScript, and use that to compile CL code.

Agreed, but it's more of a "tree-shaking problem" than anything else. Clasp is written in C++ and Common Lisp. Clasp compiles Common Lisp to LLVM-IR and Clang compiles C++ to LLVM-IR. So theoretically you could compile everything to LLVM-IR and feed that to emscripten. Granted, it's going to be a _huge_ LLVM-IR file. Then you shake out the functions and globals that aren't needed. If the compiler isn't needed then it will shake out (or not be compiled in in the first place). The question for me is "what problem does it solve". I assume there are problems that I'm not aware of that it would solve, otherwise why would someone develop emscripten? Common Lisp is a fantastic language that is really underutilized, it's fun and so expressive. I fell in love with Common Lisp three years ago so deeply that I wrote a new Common Lisp to solve my scientific programming challenges while still being able to make use of powerful C++ libraries. I think everyone should use it everywhere - it's awesome.
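The "shake out what isn't needed" step is essentially a reachability walk over a call graph. Here is a minimal Python sketch of that idea, assuming a toy graph: the function names in `call_graph` are invented for illustration and say nothing about how Clasp, LLVM, or emscripten actually organize their symbols.

```python
from collections import deque


def shake(call_graph, roots):
    """Return the set of functions reachable from the given roots."""
    keep = set(roots)
    queue = deque(roots)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in keep:
                keep.add(callee)
                queue.append(callee)
    return keep


# Hypothetical call graph: each function maps to the functions it calls.
call_graph = {
    "main": ["compute", "print-result"],
    "compute": ["add"],
    "add": [],
    "print-result": [],
    "eval": ["compile"],
    "compile": ["llvm-codegen"],
    "llvm-codegen": [],
}

kept = shake(call_graph, ["main"])
dropped = set(call_graph) - kept
print(sorted(kept))
print(sorted(dropped))

# If the program can reach eval, the whole compiler stays in the image.
kept_with_eval = shake(call_graph, ["main", "eval"])
```

In this toy graph the compiler only shakes out because nothing reachable from `main` calls `eval` or `compile`; one such call anywhere keeps all of it.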

If you can generate standalone bitcode that doesn't ever generate more bitcode at runtime, then yes emscripten becomes a possibility.

Sorry to go offtopic, but since you mentioned emscripten I thought I'd ask.

Can you explain what emscripten is used for? I'm fully aware of what it does, just not why someone would want to use it. How do you use things like jQuery or access window or document from C and then compile it to js? Or is emscripten specifically for js that doesn't interact with the DOM?

> Common Lisp in the browser, here we come!

You have Parenscript for that: http://common-lisp.net/project/parenscript/

(I haven't used it yet though.)

It works really well. It only compiles a subset of CL, though, so you'll miss out on CLOS, restarts, etc. However, the subset does include macros, so it's serviceable.

It looks like there is an extension to Parenscript called PSOS[0], which provides conditions/restarts and something similar to CLOS.

[0] https://github.com/gonzojive/paren-psos

Unfortunately, PSOS does not use the current version of Parenscript. It is based on an older and modified version.

I don't see why we couldn't do this. emscripten (https://github.com/kripken/emscripten) says it takes LLVM-IR to JavaScript. The Clasp (Common Lisp) function COMPILE-FILE compiles to LLVM-IR bitcode files. If you have some time and you would like to do this - hit me up with an email - I'd love to make this happen.

You'll probably run into all the same problems other Common Lisps run into when they try to shake out the compiler from their runtime, namely that often you really do want to create and run new code at runtime (i.e. EVAL, but much more often (compile nil '(lambda () ...))). A common example in CL is creating CLOS dispatch functions with specialized parameters to speed up generic dispatch, which I believe most CLOS implementations do at runtime (certainly SBCL/CMUCL's PCL does). You'll wind up needing to include your compiler, Emscripten itself, and all of their dependencies (LLVM libraries?), which will be quite a huge pile of JavaScript. The alternative is to include a CL interpreter, but that has obvious speed problems.
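For readers who haven't seen the `(compile nil '(lambda () ...))` pattern, here is a loose Python analogy, not CL and not how any CLOS implementation works. A program builds specialized source at runtime and compiles it on the spot, which is exactly why the compiler can't be shaken out of the image. The helper `make_power_function` is hypothetical and exists only for this sketch.

```python
def make_power_function(exponent):
    """Generate and compile a function specialized for one exponent."""
    # Build source for an unrolled multiplication, e.g. "x * x * x".
    body = " * ".join(["x"] * exponent) if exponent > 0 else "1"
    source = f"lambda x: {body}"

    # The compiler has to be available here, at runtime.
    code = compile(source, "<generated>", "eval")
    return eval(code), source


cube, cube_source = make_power_function(3)
one, one_source = make_power_function(0)

print(cube_source)   # lambda x: x * x * x
print(cube(2))       # 8
print(one(5))        # 1
```

The specialized function only exists once the exponent is known, so it cannot be compiled ahead of time. The same constraint applies to dispatch functions generated at runtime.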

ClojureScript already exists, and even if it's not Common Lisp, it's still a Lisp with homoiconicity, macros and whatnot.

Ok. Can anyone explain what I'm getting the downvotes for?

I did not downvote you, but I find your suggestion very amusing because it's so casually out of context.

Being a Lisp does not make one Lisp-family language a replacement for another. A language feature list is not sufficient for comparing the similarity of languages. Clojure and CL are very different languages with very different programming styles. You might have suggested JavaScript just as easily.

I still don't understand comparing Javascript to Clojurescript in terms of closeness to Common Lisp. Could someone explain what makes CL and Clojure so radically different?

They are about as similar as Python and Ruby. Everything differs, from the names of operators to the stance on mutability to the more obscure corners (compiler macros, reader macros, etc.).

For a person who has played with a lot of programming languages, but not many Lisps, that makes sense. Thanks.

I didn't downvote you, but I'm honestly not interested in a non-Common Lisp in the browser. I suppose some zealot(s) downvoted you.

Exciting work. CL on the LLVM has been needed.

Is this project something that you're doing to support your own research goals in computational chemistry? If so, can you share how this project fits into the bigger picture?

Fingers crossed that Azule can get a proper moving GC to work with LLVM; until then, a CL on LLVM (and even Julia) is stuck with a kinda bad GC.

I wonder why start from scratch for an LLVM backend, can't SBCL be used to generate LLVM code?

SBCL doesn't generate LLVM code. My primary goal was Common Lisp with C++ interoperation. It seemed easier at the time to start from the ECL Common Lisp code base and write a new C++ core and a Common Lisp interpreter that always interoperated with C++. As I wrote the compiler and expanded the system I maintained C++ interoperation all the time. There were a hundred problems that I had to discover and solve along the way to maintain a Common Lisp that interoperated with C++. You can get some level of interoperation between ECL and C++ up in a weekend but it won't work with C++ exceptions and RAII and there are dozens of other problems. In retrospect I don't think I would have gotten here starting from ECL because I never really understood the ECL code.

It wasn't from scratch, it was from ECL. And did you mean Azul?

Is the compacting garbage collector mentioned in the article not a proper moving GC? I am not familiar with it.

Yes, the Memory Pool System by Ravenbrook (https://www.ravenbrook.com/project/mps/) is a proper, moving garbage collector. It uses precise GC on the heap and conservative GC on the stack, as does the garbage collector in Steel Bank Common Lisp on x86 chips. I need it because I need my code to run on 100,000 CPU supercomputers with a controlled memory footprint to develop organic nano machines (seriously).

To be honest I was just kind of thinking "oh great another language implementation," before I read what you're actually doing and why you needed to create clasp. I appreciate the difficulty of writing precisely GC'd C/C++. It's pretty awesome that you were able to use clang to (I assume this is mainly what the analyzer does) track roots in C & C++ code.

Best of luck.

Thanks - yes the analyzer tracks roots through about 300 C++ classes. It also finds global variables and builds C++ code to interface with the MPS library. I exposed the Clang libraries to search the AST and describe the AST in Common Lisp and then wrote the static analyzer in Lisp. I shudder at the thought of doing this all in C++ and I write a lot of complicated stuff like Common Lisp implementations in C++ :-). Common Lisp is the language of trees and pattern recognition. Common Lisp is the perfect tool for this job.

I would point out that MOCL (https://wukix.com/mocl) pretty much provides this functionality, only with full Common Lisp compatibility. It compiles the CL to C and adds methods to expose your functions to Obj-C / C code (if you want). Works just fine. Builds with Xcode, which, as I understand it, does everything on/via LLVM.

mocl is different from many other implementations: it is a batch compiler for whole programs to C. The generated code does contain a limited evaluator and no compiler. Some dynamic features are gone...

MOCL specifically does not have full CL compatibility.

MOCL is garbage, commercial garbage at that. No thanks.

> Clasp exposes the Clang AST library and the Clang ASTMatcher library. This allows programmers to write tools in Common Lisp that automatically analyze and refactor C++ programs. Clasp can be used to automatically clean-up and refactor large C++ codebases!

Niiice. I want to program C++, but not in C++.

Yes, I'll post more on this and provide example code in the coming weeks on the blog. It's really exciting and it's something that Clasp can do right now even though the Clasp compiler isn't a highly optimizing one yet. Google uses automated Clang refactoring to clean up the googleplex (100 megaLOC). Clasp puts those capabilities into everyone's hands now.

Wow, this news almost makes me drool. I will keep my eye on that. I'm on SBCL right now, but I can't wait to see what Lisp can do with LLVM...

I hope it turns out a good implementation since there is a lot of C++ code out there and I do not know C++. But I can comprehend Lisp.

Maybe not-so-related, but "clasp" is also the name of a pretty popular Answer Set Programming (ASP) solver (link http://www.cs.uni-potsdam.de/clasp).

The name is already taken unfortunately: http://www.cs.uni-potsdam.de/clasp/

  Package: clasp
  Version: 3.1.0-1
  Description-en: conflict-driven nogood learning answer set solver
  clasp is an answer set solver for (extended) normal logic

It was also already taken by a device for fastening things. And my name was already taken by at least three people before me. Somehow we all get along.

Package names in distributions have to be unique though, so this one will probably have to be called clasp-llvm or something like that.

All names are already taken, that's why Wikipedia has the disambiguation pages.

Someone has been building OCaml's opam on Arch, I see.
