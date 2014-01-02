For instance, Simon Peyton Jones describes [1] how they took the ideas around STM from 'unsafe' languages and put it into Haskell, developed it further, and then later the 'unsafe' languages took the Haskell innovations and put it back into their languages.
Clojure isn't perfect, and Common Lisp isn't perfect, but hey, they're better than programming in assembly. We should be stoked about someone trying to innovate; why do these posts always turn into language wars?
[1] https://youtu.be/iSmkqocn0oQ around 3:50, but the video is short so it's worth watching the whole thing.
But the big elephant to me is that Common Lisp provides a very strong set of Von Neumann abstractions in addition to its functional abstractions. Lists are an abstraction that maps to sequential reads of a contiguous block of memory. Setf has pointer and offset semantics. And the much maligned `loop` is a brilliant abstraction over iteration that lets a program be explicit about the why of a loop.
That's not to say there are no advantages to functional programming. But the popularity of Go, Python, etc. suggests that abstractions that run counter to functional programming mores are not a reason to find fault in Common Lisp. Like Clojure, it's a big language that tries to meet programmers where they are in the projects that they work on. Nobody faults Clojure for allowing Java's mutable vectors because they're there in the language for practical reasons.
So I really have to sit back and shake my head when the author says he's going to be as good as Clojure, then goes off into the weeds with custom syntax, STM and actors. Really? Why actors?
This would have been a much better article if it just left Clojure out of the discussion since whenever the author talks about the language he's mostly wrong.
And as always, lies, damn lies, and benchmarks: https://benchmarksgame.alioth.debian.org/u64q/compare.php?la... If you're going to use a phrase like "faster", at least spend some time to define the word.
Probably because Common Lisp is already on a similar level of being data-centric, so it would be redundant to talk about it. Actually, the only difference between how Clojure and CL treat data is that (while both offer access to both kinds of data) Clojure strongly prefers immutable values, while CL doesn't care.
;;; adapted from
;;; https://www.ida.liu.se/ext/caisor/archive/1978/001/caisor-1978-001.pdf
(((JAN 12 2014)
  ((9 15) (10 00) (SEE ANDERSON))
  ((10 45) (11 00) (SEE LUNDSTROM))
  ((13 15) (16 00) (ATTEND Y COMMITTEE MEETING)))
 ((JAN 13 2014)
  ((9 30) (10 00) (ATTEND NEW PRODUCTS PRESENTATION))))
[{:appt/start #inst "01-02-2014T9:15:00Z" :appt/end #inst "01-02-2014T9:15:00Z" :appt/description "See Anderson"}
{:appt/start #inst "01-02-2014T10:45:00Z" :appt/end #inst "01-02-2014T11:00:00Z" :appt/description "See Lundstrom"}
{:appt/start #inst "01-02-2014T13:15:00Z" :appt/end #inst "01-02-2014T16:00:00Z" :appt/description "Attend Y Committee Meeting"}
...]
1) Clojure prefers maps over cons cells. This means that I always know exactly what I'm looking at. I don't have to guess that the second times are the end-times... I know because it's named :appt/end.
2) We don't overload data types. Have a date? Use a date type. In the CL example we see symbols and numbers sometimes used for descriptions, sometimes for times, etc. Here in Clojure we have #inst ... which creates an actual DateTime object. Now I no longer wonder what I'm looking at, I know it's a date.
3) Clojure prefers data over DSLs. What this calendar example shows is some sort of domain specific language that not only has to be parsed by a program, but it also has to be parsed by a human.
4) Got a meeting description? Use an actual string... what's up with the use of symbols as strings (I never understood that about CL).
5) From the perspective of a data modeler...in the CL example if you want a meeting to last over midnight or for longer than a day, it looks like you're sunk.
But thanks for the example, it's fun to see how data modeling is done in other languages.
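For what it's worth, the same modeling points (named keys, real date types, appointments that can cross midnight) can be sketched outside Clojure too. This is a rough Python version, not the Clojure code above: the key names mirror the example, the 10:00 end time is taken from the CL data, and the overnight entry is made up purely for illustration.

```python
from datetime import datetime, timezone

# Each appointment is a plain map with named keys and real datetime values.
appointments = [
    {
        "start": datetime(2014, 1, 2, 9, 15, tzinfo=timezone.utc),
        "end": datetime(2014, 1, 2, 10, 0, tzinfo=timezone.utc),
        "description": "See Anderson",
    },
    # Hypothetical entry: it crosses midnight, which needs no special casing.
    {
        "start": datetime(2014, 1, 2, 23, 0, tzinfo=timezone.utc),
        "end": datetime(2014, 1, 3, 1, 30, tzinfo=timezone.utc),
        "description": "Overnight deploy",
    },
]


def duration_minutes(appt):
    """Length of an appointment in whole minutes."""
    return int((appt["end"] - appt["start"]).total_seconds() // 60)


for appt in appointments:
    print(appt["description"], duration_minutes(appt))
```

Because start and end are full timestamps rather than (hour minute) pairs under a date header, the duration is just a subtraction.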
((:start @2014-01-02T09:15:00Z :end @2014-01-02T09:15:00Z :description "See Anderson")
(:start @2014-01-02T10:45:00Z :end @2014-01-02T11:00:00Z :description "See Lundstrom")
(:start @2014-01-02T13:15:00Z :end @2014-01-02T16:00:00Z :description "Attend Y Committee Meeting")
...)
Of course, depending on the programmer, this would change a lot. For example you could use a simple list or vector of 3 items instead of a property list, or maybe an association list. But if this isn't a one-off thing, most would probably define a class or a struct to store the data in, with proper accessors to get the data out. This wouldn't be so easily PRINTable as the above, but in code it would look something like the following:
(list (make-appointment :start @2014-01-02T09:15:00Z
:end @2014-01-02T09:15:00Z
:description "See Anderson")
(make-appointment :start @2014-01-02T10:45:00Z
:end @2014-01-02T11:00:00Z
:description "See Lundstrom")
(make-appointment :start @2014-01-02T13:15:00Z
:end @2014-01-02T16:00:00Z
:description "Attend Y Committee Meeting"))
[1]: https://common-lisp.net/project/local-time/manual.html#Reade...
[2]: http://www.lispworks.com/documentation/HyperSpec/Body/25_adb...
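For readers who don't know CL, the "define a struct with accessors" option is roughly what a frozen dataclass gives you in Python. This is only an analogy under my own naming: `Appointment` and the `utc` helper are hypothetical, and nothing here is the local-time or defstruct API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Appointment:
    """Rough Python stand-in for a struct with start/end/description slots."""
    start: datetime
    end: datetime
    description: str


def utc(year, month, day, hour, minute):
    """Hypothetical helper to keep the literals short."""
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc)


appointments = [
    Appointment(utc(2014, 1, 2, 10, 45), utc(2014, 1, 2, 11, 0), "See Lundstrom"),
    Appointment(utc(2014, 1, 2, 13, 15), utc(2014, 1, 2, 16, 0),
                "Attend Y Committee Meeting"),
]

# Accessors are ordinary attribute reads on the record.
print(appointments[0].description)
print(appointments[1].end - appointments[1].start)
```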
I apologize for not being clearer. The point I was trying to make is that Lisp programs often used lists in a data centric manner and my example based on the 1978 paper was optimistically intended to illustrate that.
Perhaps there's an analogy between lists in traditional Lisp programming and text in *nix systems in so far as each is used as a standard interface when composing systems from sub-systems. Or perhaps not.
Frankly, this example upset me because this is the typical "LISP has only lists and symbols" example. On the one hand, the format is not deprecated: you could write a simple parser today and extract the data back. But on the other hand, it reinforces those core myths about Lisp having no real datatype, no strings, etc.
I mean the parser is something like:
(defun appt:appointment-date (appt)
  (first (first appt)))
(defun appt:appointment-start-time (appt)
  (first (first (rest appt))))
(defun ...
Anyway, what I found interesting about the code years ago and yesterday was how lightweight it was. There's nothing inherently wrong with storing appointments in Zulu time, but if I'm talking with Lundstrom it's easier if we both say '10:45'. Reflecting that business logic in the data may make sense in the context. In other contexts it might not.
As an aside, since I can customize the readtable in CL (which is not considered useful in Clojure), I added a single entry for "#i" (and read-char to ensure it ends with "nst") to read the exact same data as in Clojure; but I got an error about invalid dates. It just happens that the Clojure example does not actually contain valid RFC3339 dates (I also tried it with Clojure => invalid date format).
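To make the "invalid date" point concrete, here is a small Python check. It is a simplified sketch of the RFC 3339 date-time shape and not a full validator: I'm assuming uppercase "T" and "Z" only, and it ignores leap seconds and out-of-range offsets.

```python
import re
from datetime import datetime

# Simplified RFC 3339 date-time shape: YYYY-MM-DDTHH:MM:SS[.frac](Z|+HH:MM).
# Assumption: uppercase "T" and "Z" only; leap seconds are not handled.
RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def is_rfc3339(text):
    """Return True if text has the RFC 3339 shape and names a real date/time."""
    match = RFC3339.match(text)
    if match is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    try:
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True


print(is_rfc3339("01-02-2014T9:15:00Z"))   # the form used in the Clojure example
print(is_rfc3339("2014-01-02T09:15:00Z"))  # year first, zero-padded hour
```

The first string fails on two counts: the year is not first, and the hour is not zero-padded. The second one passes.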
Things change: not the same JVM as 2014, not the same Clojure.
We may quote to one another with a chuckle the words of the Wise Statesman, lies, damned lies and statistics, still there are some easy figures which the simplest must understand but the astutest cannot wriggle out of. 1895 Leonard Henry Courtney
Almost all the work I do these days as a software engineer is data transformation. Going from a HTTP request, which is data (even the header is a hashmap), and a HTTP body, which is data, into some business logic that eventually writes to a database in a different format.
Even the most complex systems I've built, containing dozens of servers and multiple databases, queues, HTTP servers, etc., all boil down to transforming data from format A to format B, perhaps with conditional logic applied.
So yes, Clojure is a functional language, but functions are just a tool to be used to get the actual work done of transforming data.
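A tiny sketch of that "format A to format B" shape, in Python rather than Clojure. The request layout, field names and the row format are all invented for the example; the point is only that each step is a plain function from data to data.

```python
import json

# Hypothetical request shape: headers and body are already plain data.
request = {
    "headers": {"content-type": "application/json"},
    "body": '{"user": "Ada", "items": [{"sku": "A1", "qty": 2, "price_cents": 250}]}',
}


def parse_body(req):
    """Format A: raw JSON text -> Python data."""
    return json.loads(req["body"])


def order_total_cents(order):
    """Business logic: a pure function over the parsed data."""
    return sum(item["qty"] * item["price_cents"] for item in order["items"])


def to_row(order):
    """Format B: the flat record a database insert would take."""
    return {
        "customer": order["user"].lower(),
        "item_count": len(order["items"]),
        "total_cents": order_total_cents(order),
    }


row = to_row(parse_body(request))
print(row)
```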
Can you elaborate more on what you mean by "lackluster performance"? What is the use case? If you're looking for top speed in terms of C/assembly performance - I'd say yes, probably the JVM will get in your way. However, I spent months building a database/key-value store in Clojure and it's quite doable to write very high performance code in Clojure as long as you put the right type hints everywhere. Tools like YourKit can help you identify the bottlenecks in your program. Again, depends on the use case, but Clojure makes it very, very idiomatic to write code that makes full use of a multicore system - something possible in the likes of Java, C, et al. but definitely not idiomatic at all.
It can take long seconds for Clojure REPL to start up. No other dynamic language I had contact with was that bad on that front. Even Erlang and Elixir boot up faster!
Of course, I know about various ways of setting up a daemon process in the background, but then there is a need for reloading/hot-swapping code, which Clojure also fails to do well. Both Erlang and Common Lisp (also Emacs... even Emacs!) are better in this regard, both for different reasons.
Anyway, Clojure's overall performance impression - how fast it feels for users - is bad and will remain bad unless the startup problem is fixed.
> probably the JVM will get in your way
Not so much the JVM, but Clojure and how it uses the JVM (although, as you say, it's possible to get not too far from top JVM performance with Clojure). It's fairly easy to get C performance (and even beat it in concurrent code) for the same amount of effort on the JVM. Currently, the main handicap the JVM has is the lack of arrays-of-structs which may cause lots of cache-misses, and requires less-than-elegant code to overcome. This, thankfully, is being addressed by the addition of value types.
Before I started doing Clojure the JVM ecosystem was this "scary" thing but 4 years later I've learned to appreciate it a lot - stable AND sound libraries, high performance out-of-the box, and very,very high performance if you really need it.
But not as high performance as Java directly. And typically not as high performance as native languages, which the author addresses.
It's a faster wrapper around c code for us. We maintain and created:
https://github.com/bytedeco/javacpp
which wraps a lot of c++ components. This is also how we do GPUs and the like with our own memory management for our "numpy for java":
https://github.com/deeplearning4j/nd4j/tree/master/nd4j-back...
Short of it "off heap and NO GC" matters a lot.
We have 1 c codebase we use that's meant to be controlled via JNI here:
https://github.com/deeplearning4j/libnd4j
The speed gains we've seen are massive. Java can't compete with good ole simd and the like for numerical computing.
We are big advocates of the JVM as a platform, but let's be clear about its weaknesses. You need to use unsafe (see: netty, aeron, and any other low level database written in java or using java as the core language with some c++) and other tricks out of the box for real performance.
> Java can't compete with good ole simd and the like for numerical computing.
Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too. With 10+ years with C/C++ and 10+ years with Java, I'd bet on Java when it comes to performance bang-for-the-buck almost in every case (given that it's a large app), and even more than that the more concurrent, complex and unpredictable the app is (but this requires carefully looking at the design).
Value types will make Java competitive in absolute terms in more and more domains. Also, with the new JIT (Graal) you can control machine-code generation at whatever level of detail you want.
> You need to use unsafe (see: netty, aeron, and any other low level database written in java or using java as the core language with some c++) and other tricks out of the box for real performance.
They're not using unsafe for throughput but (mostly) for latency. That's a whole other matter. Also, some of those use "mechanical sympathy" as a driver of performance instead of algorithms that the JVM makes easier. I've built a concurrent DB in pure Java that relies on synchronization that would require at least double the effort if I were to write it in C (hazard pointers, etc.), and may not even have better performance.
[1]: https://github.com/h2oai
[2]: https://vimeo.com/105743312
I've had a personal conversation with Cliff himself. Java no matter what you do can't deal with hardware acceleration and gpus. We agreed on that. Numerical software is a different beast. I also kept mentioning "simd instructions" as well as things like openmp.
You are talking about systems software. Unfortunately that matters a lot for machine learning. The axis along which you can get equivalent speeds should be specified here. "value types" != "runs on faster chips"
You aren't likely to beat intel or nvidia's compilers at their own game here. Java will always be playing catch up to last gen's tech there.
Disclosure: I'm more than aware of what's going on in the space. We compete with them for customers and have a very clear understanding of their offerings. H2O has a great k/v store based on the exact mechanics you're talking about. That's about it though. Also of note: Cliff doesn't work on H2O anymore: https://twitter.com/cliff_click/status/700817408110399492
>> Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too....
OpenCL isn't exactly the industry standard for this stuff. You always end up using cuda, and you always end up dropping down to c. There's just no way to avoid that if you want the fastest out there.
Another disclosure, we work closely with nvidia and I may be biased:
https://blogs.nvidia.com/blog/2016/10/06/how-skymind-nvidia-...
I agree with you on the last part, but I keep mentioning "numerical software" for a reason. There are certain things the JVM is good at, writing a database and systems software is one of those things. There are still bits of HDFS in c++ though. I don't think you'll be able to get around having bits of your code in c which is what I emphasize here.
[1]: http://openjdk.java.net/jeps/193
[2]: http://openjdk.java.net/projects/panama/
There is a reason why Java 10 aims to improve Java's story in regard to mechanical sympathy.
I also work pretty closely with a lot of the spark/gpu folks at IBM.
While I do largely agree with you, we have our own JNI compiler called javacpp that alleviates a lot of those concerns already:
https://github.com/bytedeco/javacpp
Having our own pointer class and doing our own memory management has helped a lot.
What you're talking about is using java for the compute.
There's no reason you can't wrap that in a runtime that most people know how to use. A lot of python folks do that now..but then you have to deal with python's limitations at which point the majority of your code (way more than needs be) will end up being in c anyways vs java where you can write a significant part of your app in java and have it be fast out of the box.
I am mostly a line-of-business developer doing enterprise consultancy and that is how we use C++, just as infrastructure language when either JVM or .NET stacks need a bit of outside help.
Just wondering if sometimes having a 100% C++ solution would be a better approach than the added integration effort it requires, on the other hand, without people like you guys we wouldn't have access to nice tooling in Java for similar work, so congratulations on the work thus far and all the best for the project.
Ha.
That's what we were promised back in the mid-1990s. It's no more true today than it was then. Well written Java will always be about 2-4x slower than similar C code.
JVM applications have many compensating advantages, including the aforementioned ease of exploiting concurrency. That doesn't erase the reality of single thread performance where Java has never, in twenty years of strong expert effort, caught up.
I think of it as a problem akin to how you can make traffic run better in an entire city rather than optimising the performance of an engine in a single car. It doesn't invalidate making more performant engines, but it's a different level of consideration.
When saying, "JVM is as performant as C", it's easy to run the numbers and see whether it's true or not, objectively. The caveat that GP throws in is, for the same amount of effort. Then you need to specify which human and under what circumstances, and the entire thing gets hairier.
Those days are over, unfortunately. Computing stopped getting faster in general around 2010. And that also changes the calculus around single thread performance. You can no longer count on hardware eventually to solve performance problems.
That C program might be fast, but it's not a great tool for processing petabytes of data on a stampeding herd of angry elephants. So, I think performance arguments need a little nuance about what you're doing.
(My experiments include soft real time with deadline scheduling. Clojure's fine.)
Nope. A large, concurrent app is likely to be faster in Java given similar effort (of course, given enough effort -- which may be double -- C will eventually surpass that, sometimes even significantly, depending on usage). Currently the main bottleneck, which makes the above statement very dependent on application type, is lack of value types, and that's being addressed. In small sequential apps, there will be a significant advantage to C, which diminishes with the size of the app. The reason is that as the app grows, it gets harder to write manual optimizations while keeping the code modular and maintainable, while HotSpot can do all sorts of optimizations even with nice abstractions.
I "kinda" agree with you. Java has the right stability and speed trade offs and can be wicked fast. It beats the crap out of most garbage collected languages save maybe .net and the CLR.
For real applications I agree with you that c is the way to go. Java with c and off heap memory gets you a pretty long way.
In many domains it doesn't matter you can do it in 10ms in C and I do it in 1s in Java, if the customer is willing to wait 5s.
Also, I see badly written Java code all the time. For example, copying an array via a for loop instead of System.arraycopy.
Last I heard (JLS 2016 keynote by Brian Goetz) they have yet to commit/guarantee when value types will land. Java 10 is planned; if they're completed by then we're looking at probably 2020 given the delays with Java 9 release.
That's 3 years for other languages/platforms to evolve while the JVM unboxes itself.
I'm a novice Clojure programmer and haven't previously heard of this. Can you please elaborate?
* look up the type of the thing
* run the action on it
but can instead skip straight to step 2.
If you run (set! *warn-on-reflection* true) somewhere in your Clojure program (or REPL), it will print a warning to the console for each call that needs reflection.
I think it's a good idea to put it at the beginning of your core namespace - so that it is always on. Why not always wear the seatbelt when driving, you know?
There's no reason the average Clojure project ever need be using reflection for dispatch.
(fn [^WriteInterface o ^long n] (.write o n)) can be much faster because the compiler knows the type of o & n.
This is needed only when calling Java methods, because Java allows overloading. Clojure only does arity-overloading, not type overloading, and hence doesn't need the type info.
What on earth is non-idiomatic multicore c code? You have to try really hard with Macro tomfoolery before c becomes non-idiomatic :) .
That said, if there is any non-idiomatic C code then the attempts of 'experts' in other languages to mangle C to look more like their language surely fall into this category. (There are lots of examples where the author starts 'We can make C look more like fortran by defining our macros to replace these keywords...'.)
That's the first time I've heard anything like this. Lots of great FP languages exist that are not dynamic or offer such macros, why would they be a requirement for FP?
I think many of us probably tend to associate certain styles and features with functional programming that really are orthogonal to whether the program is functional or not.
As I've had it explained to me, if you write your program using functions whose output only depends on the arguments provided to that function, and nothing else, then you have yourself a functional program. But then again, people tend to treat the concept like a scale, with some programs (or languages) being more or less functional than others.
Most of the usual operators can be imported as functions:
from operator import lt
You can choose to do this in any language, but that does not make any language conducive to FP. The other main ingredient is that a function should be a primitive data type in the language, and can be stored, re-assigned at run-time, passed to other functions, and returned from functions. Without this, you definitely cannot have a functional language.
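Sticking with Python since `operator` came up: here is a small sketch of both ingredients, pure functions plus functions as ordinary values. The helper names (`compose`, `keep_if`, `less_than`, `double`) are mine, not from any library.

```python
from functools import reduce
from operator import add, lt


def compose(f, g):
    """Return a new function that applies g, then f."""
    return lambda x: f(g(x))


def keep_if(pred, items):
    """Pure: the result depends only on the arguments; items is not modified."""
    return [x for x in items if pred(x)]


def less_than(limit):
    """Build a predicate from the lt operator and return it as a value."""
    return lambda x: lt(x, limit)


def double(x):
    return x * 2


# Functions stored in a variable, passed to and returned from other functions.
double_then_check = compose(less_than(10), double)

numbers = [1, 4, 6, 9]
small_when_doubled = keep_if(double_then_check, numbers)
total = reduce(add, small_when_doubled, 0)
print(small_when_doubled, total)
```

That this works in Python doesn't make Python a functional language, of course; it just shows the two ingredients are separable from the rest of the package.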
Wikipedia[1] seems to think that functional programming is a subset of declarative programming. It would be interesting to see a table, or possibly a decision tree, of possible language design choices to see the different parameters by which a language can vary.
[1]: https://en.wikipedia.org/wiki/Functional_programming
Some languages support it better than others. Some are clearly and only functional, others offer many functional features but not the whole package.
Also, the degree to which a language needs to support certain features to be called functional varies among developers. Some languages are very clearly not functional in any way, while others, like C++ since 2011, offer a lot of functional features but do not offer the same paradigms that languages like Clojure and Haskell offer. That's because the other "primitive" a language should support to be fully functional is persistent data structures that can be bashed on by numerous functions without performance cost. Such structures don't exist yet for C++ (though many have tried, and they might eventually), but are the core of Haskell and Clojure.
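In case "persistent" is unfamiliar: the simplest instance is an immutable linked list where new versions share cells with old ones. A minimal Python sketch follows (the `Node`, `push` and `to_list` names are made up here; Clojure's vectors and maps use wider tree structures, which this does not attempt to reproduce).

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable cons cell; None stands for the empty list."""
    head: int
    tail: Optional["Node"]


def push(lst, value):
    """Return a new list with value in front; the old list is shared, not copied."""
    return Node(value, lst)


def to_list(lst):
    """Walk the cells into a regular Python list for printing."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


base = push(push(None, 3), 2)   # (2 3)
left = push(base, 1)            # (1 2 3)
right = push(base, 99)          # (99 2 3)

print(to_list(base), to_list(left), to_list(right))
print(left.tail is right.tail)  # both versions share the same tail cells
```

Each "update" is O(1) and never disturbs anyone still holding `base`, which is what lets many functions work on the same data without defensive copies.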
Firstly, JS is a very credible and easy to use target for Clojure. It has source map support and produces aggressively optimized JS via the Google Closure compiler. (Parenscript, the CL competitor, can claim neither.) I've had plenty of code where I literally rename the "clj" extension to "cljc" and it magically runs in the browser.
Secondly, I get to use ample libraries in both cases because they already have thriving ecosystems that are completely independent from all of these parentheses. For example, today I'm writing AWS code, and I get to use AWS's Java SDK painlessly. If I'm on CL, it appears I'm out of luck.
Being a hosted language, in and of itself, hence seems like a clear choice if it weren't for the (not made) performance argument. So, perhaps we should talk about that instead :)
[1]: https://news.ycombinator.com/item?id=13349747
public class Main {
    public int addTwoNumbers(int a, int b) {
        return a + b;
    }
}
(defun void-function (param)
  (let* ((class (jclass "Main"))
         (intclass (jclass "int"))
         (method (jmethod class "addTwoNumbers" intclass intclass))
         (result (jcall method param 2 4)))
    (format t "in void-function, result of calling addTwoNumbers(2, 4): ~a~%" result)))
(defn f []
  (prn "calling addTwoNumbers from f" (.addTwoNumbers (Main.) 2 4)))
Also, my concrete claim was that I could use AWS easily. I can use both the SDK directly quite painlessly (see above) or use https://github.com/mcohen01/amazonica. Where is the equivalent CL library?
;; During initialization
(require 'abcl-contrib)
(require 'jss)
CL-USER(1): (in-package :jss)
JSS(2): (lambda (x y)
          (format t
                  "~&addTwoNumbers(~A, ~A): ~A~%" x y
                  (#"addTwoNumbers" (new 'Main) x y)))
#<FUNCTION (LAMBDA (X Y)) {7EEE31B2}>
JSS(3): (funcall * 20 30)
addTwoNumbers(20, 30): 50
NIL
[1]: http://clojure-doc.org/articles/language/concurrency_and_par...
Surely "does" is meant to be negated?
Also, this is a perfect task for core.async / core.match in clojure. Definitely not 6 days worth of work.
