I work professionally as an applied mathematician and I've struggled to understand where Julia fits into the tools already available.
In terms of prototyping algorithms, MATLAB/Octave still seems to be the best choice. We have access to an enormous number of builtin routines that help diagnose what's going on when an algorithm breaks. That's not to say other languages don't, but the ability to set a single dbstop command and then run some kind of diagnostic like plotting the distribution of eigenvalues with `d=eig(A); plot(real(d),imag(d),'x')` is amazing and saves time. There's also a very straightforward workflow to run the debugger and then drop into gdb in case we need to interact with external libraries.
Now, certainly, MATLAB/Octave is weak, in my opinion, for building larger software projects that need to interact with the rest of the world. This includes things like network connections, GUIs, database access, etc. Alternatively, sometimes a new low-level driver needs to be written, which needs to be very fast. All the same, that ecosystem seems to be much better in languages like C++ and Python, though I've been experimenting with Rust as an alternative to C++ for this use case. At this point, if I have trouble with the algorithms, I can run the code in parallel with the MATLAB/Octave version to diagnose how things differ.
Coming back to Julia, where is it supposed to fit in? To me, there's already a better prototyping language, a better production language, and a better bare-metal-fast language.
I will make one last quip and that's the licensing. Frankly, the big advantage of MATLAB is that it provides license cover. Mathworks has obtained the appropriate licenses for your expensive factorizations and sparse matrix methodology. Julia has not, and those components remain largely under the GPL:

https://github.com/JuliaLang/julia/blob/master/LICENSE.md
Look at things like SuiteSparse. Practically, what that means is that I can deliver MATLAB code to my clients and I don't have to disclose the source externally due to GPL requirements. Now, maybe they choose to run Octave. That's fine, and then they assume the responsibility for GPL code. However, I maintain a MATLAB license, and that gives me coverage for a whole host of other licenses in the context of MATLAB code, which makes my life vastly easier than if I were to develop and deliver code in another language.
The beauty of Julia is in its design as a language. This is what gives people hope that the tooling and ecosystem will emerge -- they all seem easier to develop in Julia than in other languages, thanks to multiple dispatch, the type system, and powerful metaprogramming abilities.
As far as fast bare metal languages go, you can write extremely general optimized matrix multiplication libraries in Julia. For matrices that fit in the L2 cache, I achieved performance similar to Intel MKL, but with far more generic code.
Writing a kernel in C means using SIMD intrinsics, where function names and types differ for every vector size, and then different numbers/sets of calls for each kernel (you'd want several sizes per architecture). Compare this to Julia, where parametric typing and multiple dispatch mean you only need one set of functions, and using an @generated function will generate whatever kernels you happen to need at compile time.
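To make that concrete, here is a minimal sketch (not the actual matmul kernels, just the mechanism): a single @generated function that emits a fully unrolled kernel for any element type T and any compile-time width N, where in C you'd be writing a separate intrinsic-based version per size.

```julia
# One definition covers every (N, T) combination; Julia compiles a
# specialized, unrolled kernel the first time each combination is used,
# and LLVM can then vectorize it for the target architecture.
@generated function dot_unrolled(a::NTuple{N,T}, b::NTuple{N,T}) where {N,T}
    ex = :(zero(T))
    for i in 1:N
        ex = :(muladd(a[$i], b[$i], $ex))  # one fused multiply-add per lane
    end
    return ex
end

dot_unrolled((1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0))  # 70.0
```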
Looking into the future, awesome projects like Cassette speak to the potential of what's possible:
Cassette promises to let you do anything from getting automatic differentiation for arbitrary Julia code (even code that is strictly typed, buried in a chain of dependencies of the library you're using, and written by someone who never imagined the idea of autodiff) to injecting custom compiler passes.
Julia is more promise than practice right now, but that's largely because of just how much it promises. I (and many others) think it has done a great job delivering so far.
That's why there's excitement, and why many are starting to embrace it.
> Writing a kernel in C means using SIMD intrinsics, where function names and types differ for every vector size, and then different numbers/sets of calls for each kernel (you'd want several sizes per architecture)
This is not true for any recent, competent compiler. I can write a template function in C++ with a regular loop over arrays and it gets vectorized with SIMD instructions automatically according to the architecture. Sure, it's not JIT, but it still never requires intrinsics.
> This is not true for any recent, competent compiler. I can write a template function in C++ with a regular loop over arrays and it gets vectorized with SIMD instructions automatically according to the architecture. Sure, it's not JIT, but it still never requires intrinsics.
The relevant comparison here is against MATLAB. If you want to spend two weeks writing and debugging an optimized version of matrix multiplication for a custom type in C/C++, that's your business, but one can do it in a day in Julia and be fairly confident it will just work.
The problem isn't just whether or not it uses SIMD instructions, but whether it avoids redundant and unnecessary `vmov` instructions. For example, I filed a bug because, unless I use SIMD intrinsics, the code is vectorized but has a high density of unnecessary move instructions, resulting in dramatically slower code.
It can make static libraries. There are some old blog posts on how this can be done. No one has packaged it up though, and it will always be limited to chosen specializations, of course.
Use a container. I recommend Singularity; it works great with Julia. Shipped a single binary, deployed it at a supercomputing center, ran 20 batches over 500 nodes, 16 cores per node, no problem.
Then you are constructing a problem specifically for Julia to fail. Congratulations on being a jerk who poses a question that isn't relevant to anyone's real use case.
Counter question: can you, in 24 hours, ship a binary library built in C or C++ that is guaranteed to work on ARM and x86, and also detect NVIDIA or AMD GPUs and use those for acceleration as needed?
MATLAB, Python, and R don't have the metaprogramming tools for code generation. Tools like FFTW relied heavily on code generation, and the people who wrote those kinds of tools are now using Julia for generic programming. Generic programming is difficult to handle with any AOT compilation system, since you get a combinatoric explosion of possibilities leading to long compile times. Generic programming works great in Julia, so there are a lot of examples of utilizing types for "free features" that would be difficult to do (interactively) in any other language.
Of course, most programmers don't know what generic programming even is. However, a lot of libraries these days utilize it to support different precisions and the like, which is why I think Julia will at least become a language of libraries, since not too many people can invest so much time in C++.
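As a small illustration of what those "free features" look like (my own toy example, not one from a particular library): a solver written once against generic operations runs at whatever precision the caller's types provide.

```julia
# One generic Newton iteration; nothing here mentions a concrete number type.
function newton(f, df, x; iters = 20)
    for _ in 1:iters
        x -= f(x) / df(x)  # only requires -, /, and the two callables
    end
    return x
end

newton(x -> x^2 - 2, x -> 2x, 1.0)       # Float64 approximation of sqrt(2)
newton(x -> x^2 - 2, x -> 2x, big"1.0")  # same code, BigFloat precision
```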
> most programmers don't know what generic programming even is.
Where do you get this from? You're coming off as far too aggressive. Metaclasses and AST-rewriting decorators are part of Python. I know they don't fit your use case, but you don't need to set fire to multiple ecosystems because they don't invoke a full-blown compiler for every line of code.
>Generic programming is difficult to handle with any AOT compilation system since you get a combinatoric explosion of possibilities leading to large compile times.
I thought it was solved either by adding function signatures based on actual usage, or by "boxing" the argument (which I guess is just dynamic dispatch at that point).
Why would you ever see the combinatoric explosion in an AOT implementation?
If you want to ship it as a standard library that compiles once and others link to, then you have this issue. You need to compile at the point of use in order to add only the function signatures that are actually used, so if those signatures come from types the user is giving you (say, a generic differential equation solver where the state can be any type with the right overloads), then you're out of luck. If you know of a big set of types the user might want to give you, then you can compile those ahead of time, but for the diffeq example you can have different types for the dependent and independent variables, so you'd want each combination.

The solution here, of course, is just to make people statically compile against it if they want to use all of these features, but you need to compile a shared library if you want to build an easy interface to a high-level scripting language (Julia, Python, R, MATLAB, etc.). So what tends to happen is that libraries which do make some extra features possible are constrained when used from a scripting language, and they compile a bunch of choices to mitigate that a bit (see SUNDIALS and its built-in NVector compilations). The true solution here, of course, is to compile on demand based on the new types you see from the user's code, and this is what Julia does automatically.
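A toy sketch of that compile-on-demand behavior (my own example, not from any of the libraries above):

```julia
# The same generic method body is specialized per concrete argument type,
# the first time each combination is seen -- no ahead-of-time enumeration.
eulerstep(u, p, dt) = u .+ dt .* p .* u

eulerstep([1.0, 2.0], 0.5, 0.1)           # compiles a Float64 specialization
eulerstep([1f0, 2f0], 0.5f0, 0.1f0)       # a Float32 one, generated on demand
eulerstep([big"1.0", big"2.0"], 0.5, 0.1) # BigFloat state, again on demand
```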
As for boxing, boxing works, but it's slow. The developer of LightGraphs.jl tried using Swift for a while but went back to Julia when he realized that actually using generics in Swift is really, really slow because they do this boxing (your code is generic! But...). To finish the story, he tried Go a bit, but generics in Go... he went to C++, but templates were difficult to use and the compile times were too long to be anything close to interactive. He has done extremely well over the last few years in Julia (though it is admittedly missing some tools for (partial) AOT compilation and building binaries, but these tools are in development).
If you want another example, I saw this a couple of days ago:
"Also exciting - we only have 27k signatures right now, and removing all the ones taking row vectors reduces that substantially to under 10k. That seems doable pretty easily, and we can convert without performance penalty between row and column vector seamlessly (I believe)."
Those sound like big numbers to me! That's a real problem that DynamicHMC.jl doesn't have to worry about, for example.
I think Stan's autodiff is faster than ForwardDiff.jl/ReverseDiff.jl (leaving aside that Stan models regularly take several minutes to compile), but I'm betting on Cassette-powered Capstan/Zygote getting there in the next few years.
Julia combines rapid prototyping with a well-designed language and good execution speed. MATLAB only has the first of those three. Julia can plot eigenvalues no problem; almost exactly the same code works.
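For instance, the earlier MATLAB diagnostic translates almost line for line (a sketch using the standard LinearAlgebra library and the Plots package):

```julia
using LinearAlgebra, Plots

A = randn(100, 100)  # stand-in for whatever matrix you're debugging
d = eigvals(A)       # eig(A) in MATLAB
scatter(real.(d), imag.(d), marker = :x)  # plot(real(d), imag(d), 'x')
```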
The MATLAB Compiler is another part of the ecosystem that is worth its price when trying to deliver quickly. There's nothing like it in the other ecosystems (maybe Mathematica?).
The entire MATLAB toolchain is a ridiculously overpriced relic; the faster it disappears, the better, and it will. To a close approximation, the only folks who continue to use it are those who are insulated from the market -- national labs and universities. It annoys me because in places it's a ridiculous waste of taxpayers' money.
No, it's not. I've seen it used within petrochem and other engineering firms as well. For example, the real-time simulator Opal-RT uses Simulink, and Opals are used for things like simulating power systems.
Although you may not like it, it serves a purpose and it's one tool among many.
I actually work in power systems simulations and am not aware of anyone using the power systems toolbox in MATLAB outside of some academic researchers. I'm sure some folks do, but it isn't very big. This is all mainly traditional Fortran, and increasingly C++ and even C. There are some open source things like MatPower (MATLAB or Python), but it is mostly seen as a neat toy, not something I would trust to actually run grid studies with, the way I would trust an EMS vendor, PSS/E, PowerWorld, PowerGem, PSCAD, etc. I've actually seen that some vendors are looking at Julia to replace some of their tools.
When you say real time, you really mean an EMS system. These have traditional powerflow applications that use something like the Newton-Raphson method, SCADA applications that bring in data from equipment in the field (~4-second data), state-estimator applications that combine traditional powerflow with weighting from the SCADA data, and a Contingency Analysis application (which runs N-1 loss-of-equipment simulations). Those are the core apps, and the US has several vendors, including Siemens, Alstom (now GE), GE, OSI, etc.
In power system planning, people use PSS/E, PowerWorld, PowerGem TARA, and a variety of other loadflow tools for powerflow and N-1. They don't generally use much real-time data and instead use models ranging from 1 year to 20 years in the future. There are also another dozen vendors here selling specialized products that do everything from Voltage Stability to Transient Stability to optimization of topology and fault analysis. The breadth is very wide. You might use multiple vendors with some degree of product overlap because you like vendor A's powerflow functionality and graphics, while vendor B has the best dynamics on the market.
I'll have to check out the links you've sent, as I've never heard of these two products and I try to keep my ear to the ground, but it is a huge software market and I don't monitor Europe's market as closely.
Edit:
Just checked out your Opal-RT links, and yes, that is real-time power systems software, just not the kind of real-time analysis I typically use and think of. This seems closer to PMUs (phasor measurement units), which instead of 4-second SCADA data use something to the tune of 60 samples a second (that's a lot of data).
I agree that MATLAB, and especially Mathematica, is REALLY nice if they have a built-in function for what you need (and they do have a ton of things as primitives, like graphics, audio, and a single import function that can read over 100 file types... it's nuts). If they don't have what you want, the Wolfram Language is powerful but so weird.