
Julia: A fresh approach to numerical computing - sveme
http://arxiv.org/abs/1411.1607
======
wbhart
I've been playing around with the Cxx library (currently only works with a
source build of bleeding edge Julia)
[https://github.com/Keno/Cxx.jl](https://github.com/Keno/Cxx.jl) It allows you
to basically embed ordinary C++ code in Julia code, to interface with C++
libraries at runtime, and to slurp in entire .h files unmodified. (It also in
theory allows a C++ REPL mode to be written, though such a thing does not
exist for Julia yet.)

It makes very clever use of a really amazing new language feature in Julia
0.4, namely staged functions. At first it looked like it was manipulating C++
via strings sent to the Clang compiler at runtime, which sounded really slow
and awful. But then it dawned on me how it really worked, and I was amazed.

I've actually convinced three or four of my colleagues (mathematicians) to
work on building a computer algebra system with Julia that will interface to
various packages such as Pari/GP, flint, Singular, possibly Gap. We meet
weekly to discuss design and to try and work on an initial implementation.

The rate at which Julia is improving and the community is increasing is
astonishing. We are at the stage where Julia is offering us language features
we haven't even dreamed up a use for! And yet the language remains elegant and
easy to learn.

~~~
one-more-minute
For reference, a staged function is something sorta similar to a macro, except
it lets you take advantage of Julia's inferred type info. Contrived (and
wholly redundant) example:

    
    
    stagedfunction foo(x)
        if x == Int
            :(2x)
        else
            :x
        end
    end

    y = 1
    foo(y) == 2    # true: x was Int, so the doubling branch was compiled
    y = 1.
    foo(y) == 1.0  # true: the identity branch was compiled

Again, just like a macro, except that inside the body `x` is bound to the
_type_ of the argument rather than the symbol `y`. This means you can do _even
more_ crazy code specialisation, e.g. on matrix dimensions, or generate
appropriate FFI calls based on arbitrary input types. It has zero overhead and
you can use it just like a regular function.
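A slightly less contrived sketch of the same idea (hypothetical `half` function; Julia 0.4 `stagedfunction` syntax, which later versions renamed to `@generated function`): the body sees the argument's inferred type and returns the expression that should be compiled for that type.

```julia
# Sketch only. Inside the body, `x` is bound to the argument's *type*;
# in the returned expression, `x` refers to the runtime value.
stagedfunction half(x)
    if x <: Integer
        :(div(x, 2))   # compile integer division for integer arguments
    else
        :(x / 2)       # true division for everything else
    end
end

half(7)    # == 3
half(7.0)  # == 3.5
```

Because the branch is resolved at compile time from the inferred type, the generated method carries no runtime dispatch overhead.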

Very cool, and I think people will do really interesting things with this
(especially after mixing them with macros).

~~~
infinite8s
This sounds very similar to an approach called Lightweight Modular Staging
(pioneered in Scala by the Odersky group -
[http://scala-lms.github.io/](http://scala-lms.github.io/) and also in Lua -
[http://terralang.org/](http://terralang.org/)). While this is great for
specializing numeric code based on runtime invariants like dimensionality, I
think people are finally looking at using the idea in other domains - for
example, runtime generation of DSL code on a per instantiation basis. Imagine
taking a high level SQL query, building the operator tree in a Julia DSL and
optimizing that, and then JITing the entire thing taking into account low
level storage details and the particular sets of operators and joins
([http://msr-waypoint.com/en-us/events/dcp2014/rompf.pdf](http://msr-waypoint.com/en-us/events/dcp2014/rompf.pdf)). This is very exciting to see in
another typed language!

~~~
tlmr
You mean like blaze in python?:
[http://blaze.pydata.org/docs/v_0_6_5/index.html](http://blaze.pydata.org/docs/v_0_6_5/index.html)

~~~
infinite8s
Blaze has a bit of a broader focus than what I was talking about, since blaze
mostly offloads the actual computation to a particular backend. But a
combination of blaze and a custom lowering of the computation into machine
code using numba would be similar (although without the type safety for
guaranteeing that certain optimizations are possible)

~~~
tlmr
I believe that the numba compilation is planned for the coming year.

------
wbhart
From the article: " A pervasive idea that Julia wishes to shatter is that
technical computing languages are “prototyping” languages and one must switch
to C or Fortran for performance. The consequences to scientific computing have
been more serious than many realize."

I think this provides a very succinct summary of the Julia paradigm at a very
high level.

~~~
carterschonwald
I'm trying to push Haskell for that too. But the more decent tools and
choices, the better for everyone. :-)

------
StefanKarpinski
Despite having a very similar title, this is a rather different paper from
[http://arxiv.org/abs/1209.5145](http://arxiv.org/abs/1209.5145). We had to
upload a version of this paper for a grant application, but it's still
somewhat in progress.

~~~
ViralBShah
That's right. The purpose was to have twin papers - one for the Computer
Science community focussed on language implementation, which is the earlier
paper.

This paper focusses on motivating the design of Julia for the larger
scientific community. Any comments will be incredibly valuable in improving
the quality of the paper.

~~~
trurl42
It's great to have a paper like this on the arXiv, I believe this is the right
way to reach the scientific community.

Some ideas for improvements:

The package directory could be mentioned more prominently.

It's a bit tricky to select code in the PDF, not sure if much can be done
about this?

~~~
ViralBShah
I guess we can publish the IJulia notebooks with the examples and even just
the code as plain .jl files.

------
monochr
What I would like to know before jumping into a new programming language is
what it sucks at, and how badly it sucks there.

Never in my life have I heard anyone say anything useful along the lines of
"language X is wonderful at Y, which is what it was designed for." I have,
however, heard plenty of people cursing languages for not doing something
which they thought was an "obvious" thing for a language to do and X didn't.

~~~
wbhart
This very much depends on what you want to do with it and what your background
is. Here are some things that might bother you.

* Julia's approach to OOP is via multimethods, not the usual class/inheritance model. This might be annoying for people who don't want to learn how to be an effective programmer in the other paradigm.

* Julia's garbage collector is not generational/incremental and in some corner cases, GC can take 10 times longer than the actual function you are running (typically it is between 5% and 50%). This would make Julia unsuitable for HFT, real-time games, web browsers of the future and other real-time applications. (Edit: see Viral's post in the same thread. Incremental GC is in the works. This is genuinely my experience of Julia to date. No sooner do you need something, and someone competent is already working on it, if they haven't already done it!)

* Julia sucks at predicate dispatch. It doesn't have it. Granted, neither does any other language except Gap and one other I forgot. So if you are used to that feature, you would find Julia a step down. (Edit: yes of course Julia does not need/want predicate dispatch. It's just an illustration of something that could bother you if you were really, really used to something. I just happen to have colleagues who really are used to this.)

* Julia is tied to LLVM, so if you want to be on the CLR/DLR or JVM, you are out of luck.

* Julia currently doesn't have static compilation (I hear it is being actively worked on). This makes it more difficult to deploy binaries.

* Julia does not have Haskell-like separation of effects from pure functions. This might not appeal to type purists.

* Julia functions can fail at runtime where statically typed/compiled languages would pick up the errors at compile time.

* As popular as it is, Julia is still not in the top 50 programming languages by measure of usership.

* Julia's support of mutable C-style structs allocated on the stack, as opposed to pointers to heap allocated objects is still somewhat lacking. This creates some challenges in efficient C FFI in corner cases, especially in combination with GC.

* Julia doesn't have inheritance of data types (it is really a dual to a data focused language, which is sensible -- you typically have far more functions than data types in a program -- but it's still hard for traditional OOP users to get used to).

* The Julia abstract type system is somewhat linear, which makes it a little less flexible as far as contracts/interfaces are concerned, if you choose to implement things that way.

Of course Julia has so many features in its favour that it is worth
investigating for many projects. It has multimethods, (static) dependent
typing, a very easy and efficient C interface (soon a C++ interface), the
possibility of C-like performance, garbage collection, macros, a runtime
console (REPL), great numerical features, JIT compilation, a good selection of
libraries/packages, a package manager, profiling, various development tools,
and a good (highly intelligent and helpful) community.
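As a minimal illustration of the multimethod point (hypothetical `Shape` types; Julia 0.3/0.4 `abstract`/`type` syntax, which later versions spell `abstract type` and `struct`): behaviour lives in generic functions rather than classes, and the method is chosen from the runtime types of _all_ arguments, not a single receiver.

```julia
abstract Shape                          # 0.4 syntax; later: `abstract type Shape end`
type Circle <: Shape; r::Float64; end   # later: `struct Circle <: Shape ... end`
type Square <: Shape; s::Float64; end

# The function, not a class, owns the behaviour:
area(c::Circle) = pi * c.r^2
area(sq::Square) = sq.s^2

# Dispatch on both arguments at once -- no "owning" class needed:
describe(a::Circle, b::Square) = "circle meets square"
describe(a::Shape,  b::Shape)  = "two shapes"

area(Square(2.0))                       # == 4.0
describe(Circle(1.0), Square(1.0))      # == "circle meets square"
```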

~~~
srparish
* lack of threads: they do support multiple processes which can be useful for splitting up work for large computation, but not a substitute for threads

* module compilation is not cached, so if you use very many modules your start-up times can be slow

* error messages sometimes require some head scratching. For example, it's not uncommon to get an error that there's no available function convert(::SomeType, (Some, Args)) when there's no obvious convert() to be seen in the code in question. Occasionally the stack trace will be missing from errors, or there won't be a line number. Obviously this is improving quickly, but can be frustrating.

~~~
simonster
The first two are both WIP. Threading is
[https://github.com/JuliaLang/julia/tree/threads](https://github.com/JuliaLang/julia/tree/threads)
(although it hasn't been updated in a little while) and static compilation of
modules is
[https://github.com/JuliaLang/julia/pull/8745](https://github.com/JuliaLang/julia/pull/8745)

~~~
srparish
That's great to see. It's sometimes hard to keep up with everything that's
going on. Incidentally I really got a kick out of the recent s/Uint/UInt/
rename
([https://github.com/JuliaLang/julia/issues/8905](https://github.com/JuliaLang/julia/issues/8905)).
It took a day or so to propose and do it. By comparison, Java will probably
never fix its spelling oddities (for example int and Integer). Very refreshing
to see fundamental things like that be fixed, and so quickly!

~~~
StefanKarpinski
Just to play devil's advocate about the Java capitalization, I suspect the
difference between `int` and `Integer` is quite intentional: `int` is a
"primitive", immutable value type, while `Integer` is a class. Since by
convention, Java's classes are capitalized, while its primitives are spelled
the way they are in C, there's some sense here.

------
EvanMiller
For anyone in the Chicago area, Leah Hanson is giving a free Julia workshop
next weekend (11/15):

[http://www.meetup.com/JuliaChicago/events/216950712/](http://www.meetup.com/JuliaChicago/events/216950712/)

Should be a great way to get introduced to the language and pose your nagging
questions to a community expert.

~~~
willis77
Her blog is also a great resource to pick up some of the more useful parts of
the language.

~~~
StefanKarpinski
Not to mention her forthcoming O'Reilly book, "Learning Julia"!

------
edelman
Yes, this paper, as the title suggests, was written for the community as a
whole, but perhaps in particular for the numerical computing community. Most
especially, there are many ideas in this paper that are entirely unfamiliar to
that community, and this paper hopes to change that. Hope you enjoy reading
it. (We will probably make some updates. The spelling fix of the title was
already processed yesterday and will appear tomorrow or Monday.)

------
vegabook
I am a little wary of the hype around a language whose unique selling point
is speed. Where else in Julia is there significant innovation versus Python?
The latter seriously has it all when it comes to scientific computing, and if
anybody is not already running vectorized numpy code, or trivially compiling
critical for loops to c-like speed using Numba/Cython, then they're missing
out on performance which has nothing to be ashamed of versus any of the newer
kids on the block.

I have been wary of the Julia pitch, which was basically "let's brew R, Python,
and Matlab (all >20 years old with massive and unrivalled ecosystems and all
serving their purposes very well, thank you) into one "new" language and put a
nice logo on it". How is this language seriously the leap forward that one
needs to abandon the current awesome toolsets, with all their battle tested
libraries, other than a nebulous "speed" argument which in many cases, judging
by the comments and my own experience in financial data matrix operations, is
not even fully accurate? Ready to be persuaded otherwise if I can be shown
that other than "speed" there are hefty reasons to move from Python and
abandon the almost endless choice of richly varied tools that I have at my
disposal already.

~~~
one-more-minute
You should read the paper, but Julia's also a really nicely designed language
in general. It's hard to communicate how great multiple dispatch + a powerful
type system is without trying it out (though, again, the paper does a good
job), but yeah, it really helps you get a lot of generality as well as
performance. When you use Cython, you immediately lose that generality.

Then there's the really powerful metaprogramming capabilities, which are
absent from the languages you've mentioned. I can't emphasise it enough: the
paper this comment thread is about does a great job of explaining why these
things are compelling.

Also, you may be interested in [1], an argument that language performance is
valuable even if you don't need it yourself.

[1]: [https://medium.com/the-julia-language/performance-matters-more-than-you-think-a556e6cdcd10](https://medium.com/the-julia-language/performance-matters-more-than-you-think-a556e6cdcd10)

~~~
vegabook
Compelling argument in your link about how performance for others allows for
better libraries for everyone. I am still concerned that Julia does not go far
enough in breaking the imperative programming mould, but I think it's
disingenuous not to give it a serious try in more than trivial exercises. I
have to say though that I have learned R, Python and Golang in the past 8
years and all of them are basically imperative (though R's vectorisation-
everywhere is impressive - wish it was all just faster); I hope Julia will
give me something dramatically more interesting. I say that because LLVM has
ushered in a period of radically easier language development, so we are likely
to be spoiled for choice in the next 5 years. I hope Julia has done enough to
put itself way out there in terms of innovation to make the sizeable
investment of time, for myself and for library developers, worthwhile.
Altogether however I cannot be anything other than impressed with the dogged
and convincing pitch that you and others are making for it, which somewhat
lowers the risk of investing time in a dead end. And even if it doesn't work
out all hunky, at least I'll know that Python will face serious competition,
and that can only be good, even for Python.

~~~
tlmr
How does the upcoming and even current numba+blaze power duo not allow library
designers to easily write fast code?

~~~
vegabook
Well that was the point of my original message. I am using numbapro and easily
getting to c-speed with R-like vectorized convenience right now. It's why I
question the "speed" argument as a non-argument when compared with Python. And
I haven't even started using the cuda approach... You allude to another point
though: Python is not standing still. Python and its environment is a mighty
high mountain for Julia to climb if it's not going to move the game forward
significantly so that its big ecosystem disadvantage is compensated. Julia
cannot just do incremental improvement - it doesn't have enough momentum to
make that a winning strategy. It needs to leapfrog to take on Python.

I should add one more point though. The post mentions Matlab 15 times and
Python only 7. It's possible this whole Julia effort will be successful with
the Matlab crowd which, up to now, has been watching with horrified
fascination from the sidelines as open source ate its lunch.

~~~
tlmr
Nice. Do you find yourself having to contort code to work with numbapro?

------
jostmey
I've been using Julia for a few months and I love it! Being able to process
arrays in a function as if they were a single variable is super awesome, and
as a result my code uses far fewer for/while loops.
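A minimal sketch of what that looks like, using Julia's element-wise `.` operators, which apply to whole arrays without an explicit loop:

```julia
xs = [1.0, 2.0, 3.0]

squared = xs .^ 2    # element-wise power: [1.0, 4.0, 9.0]
shifted = xs .+ 10   # element-wise add:   [11.0, 12.0, 13.0]
```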

------
dagw
Am I the only person not seeing the promised performance with Julia? As a
concrete example, I recently ported some non-trivial legacy matlab code I use
at work to both python/numpy and Julia. The port was basically a straight port
of the matlab code, making only the necessary syntax changes to get the code
to run. Much to my surprise, Julia ran twice as slowly as Octave and four
times as slowly as python. Adding type annotations helped, but I never got
closer than 50% slower than Octave. In earlier tests I've seen cases where
Julia was literally 12 times slower than python/numexpr in real-world code.

I really want to love Julia and everything I read about it makes me think it's
the language for me, but every time I try to write non-toy code the
performance is always worse than python.

~~~
srean
> The port was basically a straight port of matlab code making only the
> necessary syntax changes to make the code run.

Then it is not surprising at all that you do not get any speed benefit. I
would expect a serious slowdown.

Idiomatic MATLAB and Numpy have lots of vectorized expressions because they
are so crappy at loops. However, in vanilla Julia those vectorized expressions
cannot be JIT'ed. If you instead write them out as explicit loops, the JIT
will take a stab at making things faster.
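The difference can be sketched as follows (hypothetical `axpy`-style helpers; in 2014-era Julia each vectorized operation allocated a full temporary array, while the explicit loop compiles down to one tight native loop):

```julia
# Vectorized form: concise, but `a*x` allocates a temporary array
# before `+ y` even runs.
axpy_vec(a, x, y) = a*x + y

# Devectorized form: explicit loop, no intermediate arrays.
function axpy_loop(a, x, y)
    out = similar(x)
    for i in 1:length(x)
        out[i] = a*x[i] + y[i]
    end
    return out
end

axpy_vec(2.0, [1.0, 2.0], [3.0, 4.0])   # == [5.0, 8.0]
axpy_loop(2.0, [1.0, 2.0], [3.0, 4.0])  # == [5.0, 8.0]
```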

I do like the economy of words of vectorized expressions a whole lot, this is
the reason why I am so excited about
[https://github.com/lindahua/Devectorize.jl](https://github.com/lindahua/Devectorize.jl)

~~~
ViralBShah
On vectorized code we are on par usually, but sometimes GC performance hurts.
There is a patch for this in the works. Could you post the code snippets to
julia-users mailing list?

------
kartikkumar
The biggest challenge I'm facing now is how to convince those around me to
take the leap and step over to Julia. We have a wealth of legacy code in
MATLAB, C++ and Fortran. I've seen the various methods to call these codes
from Julia, but in the case of C++, it seems that only rudimentary support is
provided.

My goal is to port a lot of my code to set up astrodynamics simulations. I've
even created a placeholder Github repo for it [1]. Just have to find the right
strategy to involve the devs of my existing projects and get my colleagues to
chip in.

[1]
[https://github.com/kartikkumar/Astro.jl](https://github.com/kartikkumar/Astro.jl)

------
Blahah
Nice paper guys. Any plan to get something peer-reviewed? It would be useful
for motivating biologists.

Anyone interested in using Julia for bioinformatics is invited to contribute
to, or follow the progress of, the BioJulia project:
[https://github.com/BioJulia/Bio.jl](https://github.com/BioJulia/Bio.jl)

~~~
ViralBShah
Yes, we do want to submit it to a SIAM journal. It does seem that we are
already getting interesting feedback here. The BioJulia team will probably
write a paper on Julia in Biology and publish in a related journal, just like
Miles Lubin and Iain are doing with JuMP in the OR community.

~~~
Blahah
Yup, we'll definitely do one for BioJulia - but we've got a way to go before
we reach that stage :)

------
ForHackernews
Are there any mature visualization libraries for Julia yet? I'm pretty excited
about this language, but I couldn't find any visualization tools that had good
documentation.

~~~
StefanKarpinski
Gadfly is an excellent ggplot2-style plotting package:

[http://gadflyjl.org](http://gadflyjl.org)

Winston is a more traditional plotting API:

[http://winston.readthedocs.org/en/latest/examples.html](http://winston.readthedocs.org/en/latest/examples.html)

You can also plot seamlessly via Python using PyPlot:

[https://github.com/stevengj/PyPlot.jl](https://github.com/stevengj/PyPlot.jl)

They are all quite well documented.

~~~
tomrod
Still working my way through them myself. Worked examples in the docs would
help. Most seem to presume a lot of knowledge of the libraries.

~~~
ViralBShah
You are right - package documentation has some ways to go. Currently we don't
yet have a mechanism to do `help()` on a package function, but we expect that
to change in 0.4.

~~~
tomrod
Would love to help in some way.

------
gct
The one based indexing pretty much kills the entire thing for me.

~~~
CyberDildonics
Then you are missing out on a wonderful programming tool for a very
superficial reason. I wish indices were offsets as in C rather than one-based
indexes as in Lua, but I'll deal with it for all the language gives.

------
mearnsh
anyone know where I can get a juliabox.org invite code?

~~~
ViralBShah
Shoot me an email - viral@mayin.org

~~~
kartikkumar
Mind if I shoot you an email too for an invite?

~~~
ViralBShah
Yes - happy to share codes with anyone. Also, we will be simplifying the
invite system in the next few days so that JuliaBox will let you in without
any code by default.

In the rare cases of a surge, such as an HN mention, it will auto-release
account approvals as the surge subsides in a few hours. Once you do have
access, you will continue to have access, even during surges. Basically, we
are just trying to manage our compute budget better.

------
science404
is it just me or is arxiv.org down? HN effect?

~~~
science404
for anyone experiencing issues with arxiv.org, mirror:
[http://uk.arxiv.org/abs/1411.1607](http://uk.arxiv.org/abs/1411.1607)

