
Julia 1.5 Highlights - tosh
https://julialang.org/blog/2020/08/julia-1.5-highlights/
======
LolWolf
Honestly, this is truly amazing. I will take a quick moment to thank the Julia
developers (and all the contributors) for such an awesome language—honestly,
if you do any kind of scientific computing (at all) I really and truly
recommend trying out Julia. The REPL and development loop are each a little
different than the usual Python/R/etc., but it's such a cool language and it's
so _fast_ that it will feel like an incredible breath of fresh air. It's
amazing to realize that writing down the first thing that comes to your head
is usually like 80% as fast as a good, performant implementation. (As someone
who has done a decent amount of work in performance engineering for embedded
platforms, I really enjoy squeezing out the last drop of performance from most
programs, but doing this for every single first-pass at a program, like in
Python, is rather annoying if this is what is needed to get a usable
implementation.)

Anyways, big props to the developers and contributors, even if this is just a
"minor" release. I'm extraordinarily excited for the future of Julia and hope
it continues to grow and be as awesome of a community as it is! :)

~~~
fermienrico
Personally, every time I try to delve into Julia, I am reminded of the massive
ecosystem that exists around Python. Speed takes a backseat; instead, the
enormous ecosystem makes Python indispensable and Julia a fringe language
despite its growing popularity.

Say you have a database and need to pull data, do some heavy math computation,
then present this data in a PDF report with a QR code, pulling in images and
processing them in the report.

In Python, it is without bells and whistles: psycopg2, numpy/pandas, reportlab,
and maybe PIL. Wanna stick this on S3? boto3.

In Julia, it's all very fragmented. LibPQ is immature, DataFrames.jl is nice,
??? (what PDF conversion tool?), Images.jl (300 stars), QR code generator?

The problem is not speed for most tasks. The problem is the availability of
tools. I am sure that will come in time, but why not just use Python at that
point? I am not a fan of Julia, and that has nothing to do with its features.
It has an immature ecosystem, with terrible IDE support (Atom/Juno raises my
blood pressure); debugging is painful if not nonexistent; error messages are
all over the place; and everything falls apart as the immature dependencies
change, with error messages that don't help _at all_.

Julia is fun in your Jupyter notebook. If you try to build apps in a production
environment on my team, expect pushback if not straight-up refusal to
initiate such a project in the first place.

The syntax was amazing around v0.4 and it went downhill from there.

I am sure someone is going to nitpick my comment and provide a way to do it in
Julia, but that's missing the point. The point is that Python is miles ahead
here. In production systems, robustness + maturity matter.

Also, don't forget the ancillary aspects of a programming language. When we put
a Python repo together, I am rewarded with an endless supply of developers that
I can hire to immediately work on it. With Julia, the supply of engineers is
limited, and it is such a pain to train people to use it, learn its quirks, and
spend nights and weekends fighting with it, and the business doesn't give a
fuck about it.

~~~
EricForgy
I am not going to try to change your mind, but I'll just point out that one of
the highlights, in my mind, of the recent JuliaCon was the merging of the Juno
and VS Code teams. All IDE development for these teams going forward will be
focused on VS Code. Have a look at the VS Code presentations available on
YouTube. It is pretty amazing to see where they are now, and the future is even
brighter.

~~~
bananaquant
Do they have a usable debugger now? My last experience was 2 years ago, and
launching it even on some trivial code took over 10 seconds.

~~~
eigenspace
The Debugger has come a long way. Here it is running on 1.5.0 on my machine:

    
    
        julia> @time using Debugger
          0.115907 seconds (202.08 k allocations: 13.215 MiB)
        
        julia> @time @run sin(1)
          1.438537 seconds (1.85 M allocations: 90.376 MiB, 0.80% gc time)
        0.8414709848078965
    
        julia> @time @run sin(1)
          0.003900 seconds (23.39 k allocations: 1.057 MiB)
        0.8414709848078965

~~~
bananaquant
Whoa. This looks really promising now.

------
ViralBShah
One of the things that the 1.5 release notes do not mention is the huge
amount of progress on the Julia GPU compiler. This doesn't get included in the
language release notes because the GPU compilation capabilities are provided
by a separate package, CUDA.jl.

It was great to see so many JuliaCon talks last week using these GPU
capabilities.

~~~
wadkar
Wait, does that mean GPU support is NVIDIA-only, or will it (hopefully in the
future) include ATI as well?

~~~
KenoFischer
NVIDIA is the most mature, but AMD and Intel are in development

~~~
ChrisRackauckas
AMD is quite good already. The ROCm stuff works quite well.

~~~
eigenspace
Unfortunately though, they're dragging their feet on supporting their current
generation of consumer-grade GPUs. It seems that, currently, they're not going
to support compute on consumer-grade gaming GPUs and will instead force people
to use enterprise-grade ones if they want ROCm, which is terrible.

For reference, the maintainer of AMDGPU.jl can't upgrade to a new AMD GPU, or
he won't be able to use ROCm.

[https://github.com/RadeonOpenCompute/ROCm/issues/887](https://github.com/RadeonOpenCompute/ROCm/issues/887)

~~~
ezluckyfree
The AMDGPU.jl people get a lot of their funding from DARPA, as the US
government apparently purchased a supercomputer with AMD GPUs.

------
coliveira
I'm glad to see that they are making improvements to the latency issues. It has
always been an issue for me that you have to wait several seconds for basic
packages to load. I don't think you need to pre-JIT so much code to make this
work.

~~~
heyitsme
People are being downvoted to oblivion for mentioning this, but for some of us
coming from different languages, it definitely takes some time to get used to
things. I'm new to Julia, so am probably more ignorant about this than most...
but importing two packages such as DifferentialEquations and Plots on my very
modern machine takes ~30 seconds (this is after they've been "precompiled").
I'm curious: assuming I haven't installed anything new or changed my
installation, why does Julia not cache this compiled binary somewhere? It
seems like this long step has to happen on each new-kernel import, but perhaps
it could be avoided? Having an option flag that would cause Julia to redo the
compilation (because the user has, say, changed something about their
installation) seems like a simple solution.

What am I missing?

~~~
kristofferc
> but importing two packages such as DifferentialEquations and Plots, on my
> very modern machine takes ~30seconds (this is after they've been
> "precompiled").

FWIW, it is quite significantly improved in the upcoming 1.6 release.

1.5:

    
    
        julia> @time using Plots
          8.859293 seconds (15.99 M allocations: 913.783 MiB, 3.86% gc time)
        julia> @time using DifferentialEquations
         25.394033 seconds (58.78 M allocations: 3.222 GiB, 3.96% gc time)
    

latest master:

    
    
        julia> @time using Plots
          3.957724 seconds (7.67 M allocations: 537.424 MiB, 5.50% gc time)
        julia> @time using DifferentialEquations
          9.708388 seconds (23.08 M allocations: 1.537 GiB, 6.20% gc time)

------
aborsy
I don't see why anyone would use Matlab in place of Julia in 2020 (unless one
really needs one of those specialized toolboxes).

Julia does the same things, faster and better, and with practically the same
syntax.

That being said, Python with NumPy has gotten really fast. If you stay within
NumPy and are careful with loops, or if Numba works for your code, don't
expect much improvement.

~~~
smabie
The problem with Python is that you have to bend over backwards to make sure
everything you're doing is vectorized or the performance falls off a cliff.
With Julia, you can just code naturally. It's really freeing.
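The cliff is easy to see in a toy Python benchmark (an illustrative sketch only; absolute numbers are machine-dependent):

```python
import timeit

import numpy as np

x = np.random.rand(200_000)

def loop_sum_sq(arr):
    # Naive element-by-element loop: every iteration pays Python
    # interpreter and float-boxing overhead.
    total = 0.0
    for v in arr:
        total += v * v
    return total

def vec_sum_sq(arr):
    # Vectorized form: a single call into NumPy's compiled loops.
    return float(np.dot(arr, arr))

t_loop = timeit.timeit(lambda: loop_sum_sq(x), number=3)
t_vec = timeit.timeit(lambda: vec_sum_sq(x), number=3)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s  ratio: {t_loop / t_vec:.0f}x")
```

On typical hardware the ratio is two to three orders of magnitude, which is why any per-element logic in hot Python code has to be rewritten in terms of array operations, while a plain Julia loop compiles to machine code as-is.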

------
xiphias2
Thanks for the hard work! It sounds great, especially the allocation
optimizations.

I didn't know why my code using views allocated memory, so I rewrote all my
performance-critical functions to return iterators (which is not that bad, as
Rust uses lots of iterators; it's just that the Julia standard library is not
optimized for them). At least now I understand what the issue was.

Regarding thread safety, I started writing a borrow-checker macro as a hobby,
as I have been bitten by accessing mutable data structures from multiple
threads. It's great that the macro and type systems are so flexible that this
can be done without modifying the language.

------
techwizrd
I'm very happy that using views no longer forces allocations. I like that I
can have both performant _and_ safe, rather than being forced to make a
choice.

Kudos to everyone who contributed to yet another great release!

~~~
doublesCs
Do you wanna say a few more words about this? I thought that `v[1:10]` always
causes allocation but `view(v, 1:10)` never does. Is this not the case, or has
this changed?

~~~
mfsch
`view(v, 1:N)` did cause an allocation in previous versions, but the
allocation was small and of fixed size, while `v[1:N]` allocates memory
proportional to `N`. From what I understand, the view's allocation was
required to prevent garbage collection of `v` in cases where the view would be
the only remaining reference to that array. In 1.5, changes to the GC allow
views to be placed on the stack.
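For contrast, NumPy makes the opposite default choice: basic slicing returns a view that pins the parent array through a `.base` reference, and only fancy indexing pays the O(N) copy. A small Python sketch:

```python
import numpy as np

v = np.arange(10)

s = v[1:5]            # basic slice: a view, no per-element copy
assert s.base is v    # the view holds a reference that keeps v alive

s[0] = 99             # writing through the view mutates v
assert v[1] == 99

c = v[[1, 2, 3]]      # fancy indexing: a genuine O(N) copy
assert c.base is None
c[0] = -1
assert v[1] == 99     # the original is untouched
```

Julia's `v[1:10]` behaves like the copy case and `view(v, 1:10)` like the slice case; the 1.5 change is about making the view wrapper itself allocation-free.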

------
UncleOxidant
> "The return of "soft scope" in the REPL"

I'm a little nervous about code in the REPL and code in a file behaving
differently. I think they should go back to the 0.x scoping rules everywhere
consistently - which was lexical scoping, IIRC.

Which brings up a question: if this behavior is context dependent (different
in REPL and in a file) does that mean you can set a flag so that code in a
file behaves like code in the REPL?

~~~
StefanKarpinski
As noted in the rewritten scope docs at
[https://docs.julialang.org/en/v1/manual/variables-and-scopin...](https://docs.julialang.org/en/v1/manual/variables-and-scoping/#Local-Scope-1):

> An important property of this design is that any code that executes in a
> file without a warning will behave the same way in a fresh REPL. And on the
> flip side, if you take a REPL session and save it to file, if it behaves
> differently than it did in the REPL, then you will get a warning.

So there's no real danger of deviating behavior unless you're in the habit of
ignoring warnings.

As to why bringing back the 0.x scope rules is not a good idea, this post
explains why it was changed in the first place:
[https://discourse.julialang.org/t/explain-scoping-confusion-...](https://discourse.julialang.org/t/explain-scoping-confusion-to-a-programming-beginner/43206/22).
In short, for non-toy programs, it's a significant source of hard-to-find
bugs.

Also: changing this behavior in non-interactive contexts would violate the
semantic versioning compatibility commitment of 1.x releases. Of course, 1.5
does sometimes introduce a warning in existing code, but (a) it will keep
working the same way it did only with a warning and (b) this only happens if
you had a global implicitly shadowed by a local, in which case there's a large
chance your code was actually broken.

Give it a try. A _lot_ of time and thought was put into this change and we
believe the new design is pretty much locally optimal. It preserves the safety
of the 1.0 behavior for non-interactive programming while recovering the
convenience of the 0.x behavior in the REPL.

~~~
UncleOxidant
> 0.x scope rules... In short, for non-toy programs, it's a significant source
> of hard-to-find bugs.

It seems like a lot of languages have lexical scoping, which IIRC is what 0.x
had (Scheme, Lisp, and OCaml are a few examples). Are there aspects of Julia
that make lexical scoping particularly problematic? Julia 1.x scoping rules
look similar to Tcl's.

~~~
StefanKarpinski
Julia has utterly standard lexical scoping—and always has. This issue has
nothing to do with lexical vs dynamic scoping, it has to do with whether an
assignment in a loop clobbers a global (convenient but dangerous) or creates a
loop-local variable (safe but sometimes inconvenient). Have you read the
linked manual section on scope? It is very clear and explains the design
considerations quite thoroughly.
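The tradeoff isn't unique to Julia, either. As a point of comparison (a Python sketch, since Python exhibits both behaviors), Python makes the "convenient but dangerous" choice at top level and the "safe" choice inside functions:

```python
# At module (global) scope, assignment inside a loop rebinds the
# enclosing variable -- the convenient behavior that Julia's REPL
# "soft scope" restores:
s = 0
for i in range(5):
    s = s + i          # updates the global s each iteration
assert s == 10

def f():
    # Inside a function, assignment makes `s` local for the whole
    # body, so reading it before the first assignment fails -- the
    # "safe but sometimes inconvenient" behavior:
    try:
        s = s + 1
    except UnboundLocalError:
        return "s is local here"
    return s

assert f() == "s is local here"
```

Julia 1.5's design applies the first behavior interactively and warns when a file would behave differently, which is exactly the seam the quoted docs describe.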

------
muska3
They should really list all the contributors to the release instead of
highlighting the few people who added some of the listed features.

~~~
StefanKarpinski
That’s a good idea, although this is a highlights blog post, not intended to
list all the changes. For that you can look at the 1.5 release notes:

[https://docs.julialang.org/en/v1.5/NEWS/](https://docs.julialang.org/en/v1.5/NEWS/)

------
eigenspace
Version 1.5 is an awesome release with some really nice stuff. That said, I
can't help but feel frustrated by all the attention and effort being poured
into compiler latency. I get that some people value it very highly, but to me
it just seems silly to put so much effort towards this big group of whiners
who find it unacceptable to wait __ seconds for the compiler to run before
their first plot is produced (especially when PackageCompiler.jl exists!). To
me, the important thing about julia is runtime performance. I'd gladly pay
hefty compile times to have even minuscule runtime improvements.

To be clear, I'm not saying the Julia devs have been ignoring runtime
performance, far from it, but I just sometimes feel the need to chime in like
this because I often feel the compiler latency complainers are
disproportionately squeaky wheels, so I'd hate to see that cause the devs to
over-correct.

~~~
tastyminerals2
But the runtime performance is not that great either.

    
    
      Dot product:      NumPy 0.03528 vs 0.03 Julia    (x1/1.1)
      Element-wise sum: NumPy 0.00379 vs 0.0063 Julia  (x1.6)
      Element-wise mul: NumPy 0.00419 vs 0.00617 Julia (x1.5)
      L2 norm:          NumPy 0.02391 vs 0.097 Julia   (x4.1)
      Matrix product:   NumPy 0.00186 vs 0.01988 Julia (x10.7)
      Matrix sort:      NumPy 0.01033 vs 0.0161 Julia  (x1.6)

~~~
celrod
I tried element-wise sum, and got

    
    
      julia> using LoopVectorization, BenchmarkTools
    
      julia> A = rand(10_000, 10_000); B = rand(10_000, 10_000); C = similar(A);
    
      julia> @benchmark vmapntt!(+, $C, $A, $B)
       BenchmarkTools.Trial:
        memory estimate:  13.19 KiB
        allocs estimate:  91
        --------------
        minimum time:     29.103 ms (0.00% GC)
        median time:      29.202 ms (0.00% GC)
        mean time:        29.329 ms (0.00% GC)
        maximum time:     30.658 ms (0.00% GC)
        --------------
        samples:          171
        evals/sample:     1
    
    

While with NumPy

    
    
      >>> import timeit
      >>> u = timeit.Timer("A + B", setup='import numpy as np; A = np.random.rand(10_000, 10_000); B = np.random.rand(10_000, 10_000)')
      >>> u.repeat(10, 1)
      [0.17918606100283796, 0.17888473700440954, 0.17893354399711825, 0.1790916720055975, 0.17922663199715316, 0.17935074500564951, 0.17939399600436445, 0.17940548500337172, 0.17933111900492804, 0.17920518800383434]
    

A 6x advantage for Julia. If I don't use multithreading, Julia slows down to
122.5 ms, or about 1.5x faster.

Elementwise multiplication is similarly fast.

If I get a little more creative and do exp(A) + log(B) instead, multithreaded
Julia still takes 30ms, while single threaded slows down to 169ms.

Meanwhile, NumPy slowed down to 847ms, making it 30x slower than multithreaded
Julia (on my computer) and 5x slower than single threaded.

Shall I keep making programs more complicated?

~~~
tastyminerals2
Never use big arrays in benchmarks; otherwise, depending on your CPU cache,
you will be testing allocation as well. If you need the code I tested Julia
on, here you go:
[https://github.com/tastyminerals/mir_benchmarks/blob/master/...](https://github.com/tastyminerals/mir_benchmarks/blob/master/other_benchmarks/julia_bench.jl)

~~~
celrod
Thanks for sharing the benchmarks.

I made the arrays huge to get roughly the same ballpark of times you reported.
I also preallocated the memory in that benchmark to avoid unnecessary
allocations (the observed allocations are because multithreading in Julia
allocates memory). Is this as easy to do in Python?

I made a few tweaks to your benchmark code. I get with Julia:

    
    
      Element-wise sum of two 100x100 matrices (int), (1000 loops)
        1.698 ms (2 allocations: 78.20 KiB)
      Element-wise multiplication of two 100x100 matrices (float64), (1000 loops)
        1.786 ms (2 allocations: 78.20 KiB)
      Dot (scalar) product of two 300000 arrays (float64), (1000 loops)
        4.801 ms (0 allocations: 0 bytes)
      Matrix product of 500x600 and 600x500 matrices (float64)
        229.475 μs (0 allocations: 0 bytes)
      L2 norm of 500x600 matrix (float64), (1000 loops)
        6.566 ms (0 allocations: 0 bytes)
      Sort of 500x600 matrix (float64)
        523.243 μs (595 allocations: 2.32 MiB)
    

Python:

    
    
      | Element-wise sum of two 100x100 matrices (int), (1000 loops) | 0.0042680892984208185 |
      | Element-wise multiplication of two 100x100 matrices (float64), (1000 loops) | 0.0027007726486772297 |
      | Dot (scalar) product of two 300000 arrays (float64), (1000 loops) | 0.0088505959516624 |
      | Matrix product of 500x600 and 600x500 matrices (float64) | 0.00107754109922098 |
      | L2 norm of 500x600 matrix (float64), (1000 loops) | 0.010691180499998154 |
      | Sort of 500x600 matrix (float64) | 0.008244920199649642 |
    

That is, Julia-speedup factors of:

    
    
      1. Small int matrix element-wise sum: 2.5x
      2. Small float matrix element-wise product: 1.5x
      3. Dot product (scalar): 1.84x
      4. Matrix multiplication: 4.7x
      5. L2 norm: 1.6x
      6. Sorting: 15.8x
    

The changes were:

1. Memory allocation was slow, but Julia makes in-place operations easy, so I
made 1, 2, and 4 in-place (although I still allocate the result within the
test function for `1.` and `2.`).

2. MKL is much faster than OpenBLAS, so I made `BLAS.vendor()` return `:mkl`.

3. I sorted in a threaded loop and transposed the destination array.
Transposing the destination array was because Julia is column major, so
sorting rows is a benchmark that will naturally favor a row-major language.

But I left all the inputs the same.

My primary point with Julia here is that it's easy to optimize and get more
performance out of it whenever you need to.
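The column-major point in change 3 generalizes beyond Julia. A quick NumPy sketch (NumPy defaults to the opposite, row-major/C order) of how the same logical array can live in either layout:

```python
import numpy as np

a = np.zeros((500, 600))          # NumPy default: row-major (C order)
assert a.flags['C_CONTIGUOUS']

b = np.asfortranarray(a)          # column-major (Fortran order), Julia's layout
assert b.flags['F_CONTIGUOUS']

# Sorting each row walks contiguous memory in C order but strided
# memory in Fortran order, so the identical operation can be much
# faster in one layout purely due to cache behavior. The results
# are of course the same either way:
r = np.random.rand(500, 600)
assert np.array_equal(np.sort(r, axis=1),
                      np.sort(np.asfortranarray(r), axis=1))
```

So a "sort the rows" benchmark implicitly hands the row-major implementation a cache-friendliness advantage, independent of language quality.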

------
newen
Very glad Pkg is moving away from GitHub. Hopefully I can now use Julia on an
airgapped computer by downloading all the available packages locally, like I
can with conda.

~~~
natemcintosh
I would like to use Julia on an airgapped computer too, but have struggled to
figure out how to get packages on it in the past. Could you explain a little
more how this might get easier with 1.5?

~~~
newen
What I've been doing so far is keep a computer parallel to my airgapped one,
install all the packages that I could possibly want on the parallel computer,
copy the .julia folder to the airgapped one, and then cross my fingers that I
won't need any other package.

In the article, they said all the packages would be in tarballs on their Pkg
server. I haven't explored this yet, but what I'm hoping to do is download
all the packages from the links in the package registry and then point the
Pkg client to my local drive somehow.

~~~
staticfloat
The PkgServer is an open source caching server, available here:
[https://github.com/JuliaPackaging/PkgServer.jl](https://github.com/JuliaPackaging/PkgServer.jl)

The PkgServer protocol serves content-addressed chunks of data to Julia Pkg
clients, so to download a certain version of a package, the Pkg client will
request things like `GET /package/${pkg_uuid}/${content_hash}`. If you can
pre-fill a PkgServer with all of the packages that you want, then it can serve
a client just fine.

The full design is that there are a small number of "Storage Servers" that
continually explore the global registry of packages (called `General`, located
here:
[https://github.com/JuliaRegistries/General](https://github.com/JuliaRegistries/General)),
downloading and storing tarballs for every version of every package that is
available. These storage servers store everything forever, while Pkg servers
contain an LRU cache to allow them to be deployed close to whatever compute
resources will be requesting packages. For an airgapped solution, you could
generate a static snapshot of some selection of the packages you want to
serve, then serve them with nginx and point to that "static storage server"
with the opensource Pkg Server, and it would all "just work".
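The content-addressed idea itself can be sketched in a few lines of Python (a toy illustration only; the real PkgServer's endpoints and on-disk layout are as described above, not this):

```python
import hashlib
import tempfile
from pathlib import Path

# Toy content-addressed store: blobs are filed under the SHA-256 of
# their bytes, so any mirror can serve them and the client can verify
# integrity independently of where the bytes came from.

def store(root: Path, data: bytes) -> str:
    digest = hashlib.sha256(data).hexdigest()
    (root / digest).write_bytes(data)
    return digest

def fetch(root: Path, digest: str) -> bytes:
    data = (root / digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("content does not match its address")
    return data

root = Path(tempfile.mkdtemp())
tarball = b"pretend this is a package tarball"
h = store(root, tarball)
assert fetch(root, h) == tarball
```

This is why a static directory of hash-named files behind nginx is enough to stand in for a storage server: the addresses are self-verifying, so the server needs no logic beyond serving bytes.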

An example of how to generate a static storage server is here:
[https://github.com/JuliaPackaging/PkgServer.jl/blob/master/b...](https://github.com/JuliaPackaging/PkgServer.jl/blob/master/bin/gen_static.jl),
you would serve the resultant directory structure with something like nginx,
then point the Pkg Server to that server as the upstream storage server.

I will note that Julia Computing offers an enterprise solution for dealing
with secure/restrictive environments called JuliaTeam which provides this in a
convenient, managed bundle, along with many other useful features.

__EDIT__: Ah, I forgot to mention that Stefan and I gave a talk at JuliaCon
2020 that touches on some of this; here's a link to the timestamp of the
relevant section:
[https://youtu.be/xPhnJCAkI4k?t=350](https://youtu.be/xPhnJCAkI4k?t=350)

And for more info, here's the original planning issue (note some things have
changed as we've implemented it over the last year, but the bones are the
same):
[https://github.com/JuliaLang/Pkg.jl/issues/1377#issue-492482...](https://github.com/JuliaLang/Pkg.jl/issues/1377#issue-492482336)

~~~
newen
Thanks! I will give this a try.

------
daxfohl
How does rand get faster? I would think rand would be so well explored that
all mature languages reuse the same implementation. Is it written in C or
assembly, does it ccall, or is it written in Julia itself?

~~~
hellofunk
> I would think rand would be so well explored

On the contrary, random number generation has no general solution and remains
an active area of research in many fields that depend on it.

~~~
pletnes
Also, different use cases put the focus on different quality measures: period,
correlations between generated numbers, performance... There is room for many
useful algorithms.
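NumPy illustrates this nicely (a small Python sketch): it ships several interchangeable "bit generators" precisely because no single algorithm wins on period, statistical quality, and speed at once:

```python
import numpy as np

seed = 1234

# Same Generator API, different engines with different
# period/quality/speed tradeoffs:
pcg = np.random.Generator(np.random.PCG64(seed))    # NumPy's default engine
mt = np.random.Generator(np.random.MT19937(seed))   # classic Mersenne Twister

# Same seed, different algorithms -> different streams:
assert not np.array_equal(pcg.random(5), mt.random(5))

# Within one algorithm, seeding gives full reproducibility:
again = np.random.Generator(np.random.PCG64(seed))
assert np.array_equal(np.random.Generator(np.random.PCG64(seed)).random(5),
                      again.random(5))
```

Swapping the engine without touching calling code is exactly the kind of flexibility a language-level `rand` can exploit when a faster or higher-quality generator comes along.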

------
mi_lk
What are some industry players using Julia heavily? Can't find the info on
the website.

I still have the impression that it's mostly used by academics or a niche
audience; I'd be happy to be proven wrong.

~~~
moelf
[https://juliacomputing.com/industries/banking-and-finance](https://juliacomputing.com/industries/banking-and-finance)

~~~
mi_lk
thanks, looks like I didn't look hard enough ...
[https://juliacomputing.com/industries/](https://juliacomputing.com/industries/)

------
randyzwitch
Looks like a lot of great content and updates/announcements came out of
JuliaCon last week

------
pjmlp
Congratulations, lots of nice improvements.

------
leanthonyrn
Racket and Julia on the same day!

~~~
dunefox
Racket is another language that is seriously cool.

------
kndjckt
Does anyone know of a good ORM for Julia? This has been holding back my
adoption of Julia for a few projects.

edit: for those who don't know what an ORM is:
[https://stackoverflow.com/questions/1279613/what-is-an-orm-h...](https://stackoverflow.com/questions/1279613/what-is-an-orm-how-does-it-work-and-how-should-i-use-one)

~~~
eigenspace
Could you clarify what you mean by ORM? You're more likely to get high quality
responses if people know what you mean.

~~~
kndjckt
added! Thanks! :)

------
m0zg
> the time to generate the first plot goes from 11.7 seconds to 7.8 seconds

This should not take more than 100 milliseconds in this day and age, folks.

~~~
fermienrico
Most people who advocate Julia have never built production ready systems.

~~~
jonniedie
Most people who talk about “production ready systems” have never had to do any
serious numerical work.

~~~
eigenspace
I might have to steal this quote.

------
tastyminerals2
Ok, I think I got the picture now. You should find Julia interesting:

    
    
      If you prefer dynamic languages and are interested in or do scientific programming.
      If you need your code to run LLVM-fast.
      If you have a lot of RAM and can live with "warm-up" lags.
      If you are comfortable with MATLAB-like syntax.

------
vsskanth
Wondering if one can make static binaries in Julia yet?

~~~
dklend122
You can precompile a binary that bakes in the compiler and distribute that.

Slimmer separate compilation is on the roadmap.

~~~
FridgeSeal
Have any details around the slimmer separate compilation you mentioned?

