
Julia library for fast machine learning - todsacerdoti
https://turing.ml/dev/
======
mark_l_watson
Turing.jl is great, as is Flux.jl (which I have used more).

I retired a year ago and was looking to settle down and just use one
programming language - easier on an old guy like myself, who has always been
fascinated with many programming languages. I ended up picking Common Lisp for
my retirement projects, mostly because of almost 4 decades of experience with CL.

For a Lisp programmer, Julia feels like a very comfortable language. The
interactive repl is very nice. I experimented with Flux a lot, and wrote
snippets for web scraping, text processing, etc. It also worked really well
for these non-numeric use cases: a very good general programming language, not
just for numeric processing.

Julia has great Python interop, so if you ever need a Python library, it's
available.
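
For instance, via PyCall.jl (a minimal sketch; assumes numpy is installed
in the Python that PyCall is linked against):

    using PyCall

    np = pyimport("numpy")
    np.linspace(0, 1, 5)  # comes back as a Julia Vector{Float64}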

------
s1t5
Looking at the examples, I still struggle with code that imports multiple
libraries at the top and then uses naked function names without telling me
where those functions come from.

~~~
Buttons840
In Julia you can run `@which naked_function(args...)` to see where the
method being called is defined. That might help.

With Julia's function overloading, a function might come from multiple
packages.
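
A quick illustration (REPL session; the exact path in the output will
differ by installation):

    using Statistics

    @which mean([1, 2, 3])
    # mean(A::AbstractArray; dims) in Statistics at .../Statistics.jl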

~~~
photon-torpedo
Or rather, a function from one package can be _extended_ with methods by other
packages.

~~~
FabHK
And to add:

Which of the many methods of that function is called depends on the types
of all the arguments (not only the first argument, as in single-dispatch
languages like C++). In other words, the implementation actually executed
(and thus the source package) might vary depending on the argument types.
If I understand Julia correctly.
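
A minimal sketch of that in code (toy types; the method is chosen by the
runtime types of both arguments):

    abstract type Pet end
    struct Dog <: Pet end
    struct Cat <: Pet end

    # these four methods of one function could live in different packages
    meets(a::Dog, b::Dog) = "sniffs"
    meets(a::Dog, b::Cat) = "chases"
    meets(a::Cat, b::Dog) = "hisses"
    meets(a::Cat, b::Cat) = "slinks"

    meets(Dog(), Cat())  # "chases"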

~~~
s1t5
I still don't get why multiple dispatch was chosen over having separate
functions with typed arguments. It just seems to add complexity with
limited benefit.

~~~
StefanKarpinski
This video is my best attempt to explain it (not a rickroll, I swear):
[https://youtu.be/kc9HwsxE1OY](https://youtu.be/kc9HwsxE1OY)

~~~
s1t5
Nice talk, you're a really good speaker. Thanks.

------
UncleOxidant
There are some amazing things happening in the Julia ecosystem. See also:
[https://sciml.ai/](https://sciml.ai/)

------
abeppu
Can someone informed compare/contrast this with other tools at the
intersection of probabilistic programming and deep learning? What are the
relative strengths and weaknesses vs Edward or Pyro?

~~~
ChrisRackauckas
Turing.jl is in an interesting spot because it is essentially a DSL-free
probabilistic programming language. While it technically has a DSL of sorts
given by the `@model` macro, anything that is AD-compatible can be used
inside this macro, and since Julia's AD tools work on code written in plain
Julia, you can throw code from other Julia packages into Turing and expect
AD-compatible things to work with Hamiltonian Monte Carlo and all of that.
So things like DifferentialEquations.jl ODEs/SDEs/DAEs/DDEs/etc. work quite
well with this, along with other "weird things for a probabilistic
programming language to support" like nonlinear solving (via NLsolve.jl) or
optimization (via Optim.jl; I mean doing Bayesian inference where a value
is defined as the result of an optimization). If you are using
derivative-free inference methods, like particle sampling methods or
variants of Metropolis-Hastings, then you can throw pretty much any
existing Julia code at it as a nonlinear function and do inference around
it.
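
As a small sketch of that composability (a hypothetical model; Turing and
Distributions APIs as of recent versions, details may vary):

    using Turing, Distributions

    # any plain Julia function can be used inside the model, as long as
    # it's AD-compatible when using gradient-based samplers like HMC/NUTS
    warp(x) = log1p(x^2)

    @model function demo(y)
        theta ~ Normal(0, 1)
        y ~ Normal(warp(theta), 1)
    end

    chain = sample(demo(0.5), NUTS(0.65), 1_000)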

So while it's in some sense similar to PyMC3 or Stan, there's a huge
difference in the effective functionality you get by supporting
language-wide infrastructure versus the more traditional method of adding
and documenting features one by one. While PyMC3 ran a Google Summer of
Code project to get some ODE support
([https://docs.pymc.io/notebooks/ODE_API_introduction.html](https://docs.pymc.io/notebooks/ODE_API_introduction.html))
and Stan has 2 built-in methods you're allowed to use
([https://mc-stan.org/docs/2_19/stan-users-guide/ode-solver-chapter.html](https://mc-stan.org/docs/2_19/stan-users-guide/ode-solver-chapter.html)),
with Julia you get all of DifferentialEquations.jl just because it exists
([https://docs.sciml.ai/latest/](https://docs.sciml.ai/latest/)). This means
that Turing.jl doesn't document (and doesn't have to document) most of its
features, but they exist due to composability.

That's quite different from a "top down" approach to library support. This
explains why Turing has been able to develop so fast as well, since its
developer community isn't just "the people who work on Turing": it's pretty
much the whole ecosystem of Julia. Its distributions are defined by
Distributions.jl
([https://github.com/JuliaStats/Distributions.jl](https://github.com/JuliaStats/Distributions.jl)),
its parallelism is given by Julia's base parallelism work plus everything
around it like CuArrays.jl and KernelAbstractions.jl
([https://github.com/JuliaGPU/KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl)),
derivatives come from 4 libraries, ODEs come from DifferentialEquations.jl,
and the list keeps going.

So bringing it back to deep learning, Turing currently has 4 modes for
automatic differentiation
([https://turing.ml/dev/docs/using-turing/autodiff](https://turing.ml/dev/docs/using-turing/autodiff)),
and thus supports any library that's compatible with those. It turns out
that Flux.jl is compatible with them, so Turing.jl can do Bayesian deep
learning. In that sense it's like Edward or Pyro, but supporting "anything
that AD's with Julia AD packages" (which soon will allow multi-AD overloads
via ChainRules.jl) instead of "anything on TensorFlow graphs" or "anything
compatible with PyTorch".
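
For example, switching the AD backend is a single call (backend symbols per
the autodiff docs linked above; the available set depends on the Turing
version):

    using Turing

    Turing.setadbackend(:forwarddiff)  # forward mode (the default)
    Turing.setadbackend(:reversediff)  # reverse mode via ReverseDiff.jl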

As for performance and robustness, I mentioned in a SciML ecosystem release
today that our benchmarks pretty clearly show Turing.jl as being more robust
than Stan while achieving about a 3x-5x speedup in ODE parameter estimation
([https://sciml.ai/2020/05/09/ModelDiscovery.html](https://sciml.ai/2020/05/09/ModelDiscovery.html)).
However, that's utilizing the fact that Turing.jl's composability with
packages gives it top-notch support (I want to work with Stan developers so we
can use our differential equation library with their samplers to better
isolate differences and hopefully improve both PPLs, but for now we have what
we have). If you isolate it down to just "Turing.jl itself", it has wins and
losses against Stan
([https://github.com/TuringLang/Turing.jl/wiki](https://github.com/TuringLang/Turing.jl/wiki)).
That said, there are benchmarks indicating that the ReverseDiff AD backend
gives about 2 orders of magnitude performance increase in many situations
([https://github.com/TuringLang/Turing.jl/issues/1140](https://github.com/TuringLang/Turing.jl/issues/1140),
note that ThArrays is benchmarking PyTorch AD here) which would then probably
tip the scales in Turing's favor. As for benchmarking against Pyro or Edward,
it would probably just come down to benchmarking the AD implementations.

~~~
cuban_MMA
Hi Chris, as a heads up, Stan actually has had three built-in methods for a
while now. There is a non-stiff Adams-Moulton solver introduced in 2018. It
unfortunately was only just exposed in the Stan 2.23 documentation:
[https://mc-stan.org/docs/2_23/stan-users-guide/ode-solver-chapter.html](https://mc-stan.org/docs/2_23/stan-users-guide/ode-solver-chapter.html). Certainly, the
devs have been talking about adding more solvers for a while, including SDE
and DDE solvers, and your DifferentialEquations.jl ecosystem is an excellent
model; it is an area that we know Stan has been lacking in. I think Steve
Gronder will be trying to work with you regarding benchmarking.

~~~
ChrisRackauckas
Awesome. If we can get FFI with Stan I'd like to connect
DifferentialEquations.jl to it and poke at it with various problems to see how
well it does on a few things. We can provide custom gradients if there's an
interface for it, but I couldn't figure out how to do it without modifying the
Stan source itself.

~~~
ivanyashchuk
In order to connect Stan with DifferentialEquations.jl, the steps would be:

1. Create "diffeqcpp", a C++ interface to DifferentialEquations.jl (which
would be similar to diffeqpy and diffeqr), possibly using CxxWrap.jl.

2. Make it possible to evaluate vector-Jacobian products (VJPs) with
"diffeqcpp". That would probably require the ODE right-hand side to be
passed as a string of Julia code, to make Julia AD libraries compatible
with it.

At this point, it should be possible to call Julia solvers from C++ and
evaluate the derivatives.

In Stan, there is stan::math::adj_jac_apply, which makes it possible to
define custom functions with a custom VJP without having to deal with Stan
autodiff types; it works, for example, with Eigen::Matrix<double>.
[https://discourse.mc-stan.org/t/adj-jac-apply/5163](https://discourse.mc-stan.org/t/adj-jac-apply/5163)

3. Make a class (let's call it JuliaODESolver) that implements two methods:

        operator()                  // calls the Julia solver for the given input
        multiply_adjoint_jacobian() // evaluates the VJP for the given vector

4. In the .stan file, add a custom function in the "functions {}" block,
and write a header file that implements that custom function. The
implementation would probably be one line:

        return stan::math::adj_jac_apply<JuliaODESolver>(ode_solver_inputs);
More info on using external C++ code is in Section 4.5 of the CmdStan Manual.

5. Modify cmdstan/main.cpp to initialize and finalize the Julia context so
that Julia functions can be called. This is probably the only place where
the Stan source itself needs to be modified.

I don't know what would be needed to make forward-mode and higher-order
derivatives work.

I think it would be much better for fair benchmarking if there were a
convenient, documented interface to Stan's algorithms that takes a
user-provided log-density function, similar to the DynamicHMC.jl and
AdvancedHMC.jl libraries. It would then be easy to call from
Julia/Python/R/C++ or anything else.
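
For reference, this is roughly what that looks like on the Julia side with
AdvancedHMC.jl (a sketch against its 2020-era API; a standard normal stands
in for the user-provided log-density):

    using AdvancedHMC, ForwardDiff

    # user-provided log-density (up to an additive constant)
    logdensity(theta) = -0.5 * sum(abs2, theta)

    dim = 2
    initial_theta = zeros(dim)
    n_samples, n_adapts = 1_000, 500

    metric      = DiagEuclideanMetric(dim)
    hamiltonian = Hamiltonian(metric, logdensity, ForwardDiff)
    integrator  = Leapfrog(find_good_stepsize(hamiltonian, initial_theta))
    kernel      = NUTS{MultinomialTS, GeneralisedNoUTurn}(integrator)
    adaptor     = StanHMCAdaptor(MassMatrixAdaptor(metric),
                                 StepSizeAdaptor(0.8, integrator))

    samples, stats = sample(hamiltonian, kernel, initial_theta, n_samples,
                            adaptor, n_adapts)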

------
spacetracks
How does this compare to gen? [https://www.gen.dev/](https://www.gen.dev/)

~~~
keorn
Turing has more things that work out of the box, so if you do not have
complex requirements it's a good first step. Gen allows for composing
models through its generative function interface, so you can specify models
in different ways. You can also have fine-grained control over inference,
rather than a few preset methods. Gen also has worse error messages and
docs.
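
For flavor, a generative function in Gen looks roughly like this (a sketch
in the @trace style from Gen's 2020 tutorials; details may differ across
versions):

    using Gen

    @gen function line_model(xs::Vector{Float64})
        slope = @trace(normal(0, 2), :slope)
        intercept = @trace(normal(0, 2), :intercept)
        for (i, x) in enumerate(xs)
            @trace(normal(slope * x + intercept, 1.0), (:y, i))
        end
    end

Every random choice gets an explicit address, which is what enables the
fine-grained control over inference mentioned above.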

------
snicker7
The documentation / project page is very well-done, something unfortunately
rare in the Julia ecosystem.

~~~
systemvoltage
To add to that: even if Julia had excellent documentation literally
everywhere and on every library, I wish there were better stack traces and
more meaningful error messages. Even if Julia performed 10x worse, fixing
this overlooked aspect would make up for it. It is rather unbelievable how
much time I need to spend to figure out what's wrong with a particular
piece of Julia code.

Founders of Julia: please focus on error messages. Some cool things from
Rust:
[https://doc.rust-lang.org/edition-guide/rust-2018/the-compiler/improved-error-messages.html](https://doc.rust-lang.org/edition-guide/rust-2018/the-compiler/improved-error-messages.html)

~~~
ViralBShah
As walnuss says below, these things are already being worked on. If you
last tried Julia a few months or a couple of years ago, you'll find that
stack traces have already gotten quite a lot better.

One of the things that the Julia community could greatly benefit from is more
compiler contributors. Languages like Rust naturally attract compiler folks,
and those like Go have the backing of Google.

Julia has a more complex compiler due to our dynamic type system, which
users absolutely love. But it also puts a lot of strain on the compiler team,
which is quite small. So if any folks with experience in compiler technology
on HN are looking for interesting projects to contribute to, please do look
into Julia.

We've done better over the years increasing our bus number on the compiler
codebase overall. More contributions in all areas of the toolchain would be
great to have!

~~~
pjmlp
As a bystander, I love the work you guys are doing; in a way, it feels like
Dylan was reborn and found a place in scientific computing.

