
JuliaCon2020: Julia Is Production Ready - ViralBShah
https://bkamins.github.io/julialang/2020/08/07/production-ready.html
======
oxinabox
Invenia has run Julia for our primary production system for over a year
(nearly 2).

The size of our Julia codebase is about 400,000 lines of code, spread over
numerous internal and open source packages. You can cite the "we're hiring"
slide from my Juliacon talk for that
[https://raw.githack.com/oxinabox/ChainRulesJuliaCon2020/main...](https://raw.githack.com/oxinabox/ChainRulesJuliaCon2020/main/out/build/index.html#2)

~~~
nickjj
I'd be super interested in having you (or someone at Invenia) on as a guest on
a podcast that talks about tech stacks. We'd cover things like what
development and deployment is like, which tools you use and why you chose
them, etc.

At the moment there are no Julia episodes yet. If you want to come on, head over
to [https://runninginproduction.com/](https://runninginproduction.com/) and
click the "become a guest" button near the top right to get the ball rolling.

~~~
LolWolf
Oh, that would be awesome! I would really love to listen to a bit more detail
about the choice of Julia for a company's stack. (I am an academic, so
choosing Julia was very easy for me :) but I can imagine this choice could
likely be harder outside of basic research.)

------
ViralBShah
People often ask me for examples of non-scientific Julia codebases. My
favourite one is Franklin.jl
([https://franklinjl.org](https://franklinjl.org)), a static site generator.
This blog discusses quite a bit more, and it doesn't even talk about all the
improvements in integrating with databases and such.

~~~
StefanKarpinski
Another example is that all the infrastructure for serving Julia packages to
Julia users in the new 1.5 release is implemented in Julia. This is a content
distribution network served by highly concurrent servers over HTTPS. The
system interacts with git servers, GitHub APIs and AWS S3 among many other
moving pieces. Code can be found here:
[https://github.com/JuliaPackaging/PkgServer.jl](https://github.com/JuliaPackaging/PkgServer.jl).

Writing servers in Julia is really pleasant thanks to the clean coroutine-
based task model. Under the hood, Julia uses libuv to get efficient high-
concurrency I/O with an event loop. But from the user's perspective, you just
write simple blocking code and use `@sync` and `@async` to spawn concurrent
tasks. If you want to use multiple threads, just use `@spawn` instead of
`@async`. This is very similar to Go's highly successful I/O and concurrency
model.
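
To make the task model concrete, here is a minimal sketch of the pattern described above. It is not taken from PkgServer.jl; `fake_request` and `fetch_all` are hypothetical names, and `sleep` stands in for real network I/O.

```julia
# Hypothetical stand-in for a blocking network call.
function fake_request(id)
    sleep(0.1)        # blocks only this task; other tasks keep running
    return id^2
end

# Run one task per id and wait for all of them to finish.
function fetch_all(ids)
    results = Vector{Int}(undef, length(ids))
    @sync for (i, id) in enumerate(ids)
        # Each iteration spawns a task; @sync waits for every task started inside it.
        @async results[i] = fake_request(id)
    end
    return results
end

results = fetch_all(1:5)
```

Because the tasks yield while sleeping, the five calls overlap instead of running back to back. Swapping `@async` for `Threads.@spawn` would let the scheduler place the tasks on multiple threads, assuming Julia was started with more than one.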

~~~
ViralBShah
That's a great example. Also, JuliaBox.com (discontinued service) ran on Julia
in production for 3 years, and has now been replaced by JuliaHub.com, which is
also largely written in Julia.

Several Julia Computing customers also run long running Julia jobs in
production, and some are discussed in our Case Studies
([https://juliacomputing.com/case-studies/](https://juliacomputing.com/case-studies/)).

------
lordgroff
I've been thinking about jumping on the Julia ship for a while. I like the
LISP-like nature of R and miss its flexibility very much while working in
Python (a language I also enjoy, but not particularly for data science).
Julia, specifically its macro story and speed, fascinates me.

One of the things that still keeps me at bay is the JIT and the pre-
compilation in general. It seems like it's still not very easy to actually
compile a Julia library or an executable. There is PackageCompiler.jl, but my
understanding is that it necessitates pulling in sys.so, which is some 130 MB
in size, making it prohibitive for many projects.

~~~
ViralBShah
We are always working on streamlining PackageCompiler.jl and building
binaries. Bug reports always appreciated.

I'm curious to understand why the 130MB is prohibitive. Are you looking at
embedded or resource constrained environments?

~~~
luizfelberti
In a few use cases, such as big-data processing and things like that, a common
pattern is to haul a tiny data-processing kernel to several machines to do an
operation on some data they hold, rather than shuffle the data around.

This pattern can be seen in several places and frameworks like Spark &
company, and it's also pretty much the foundational idea behind processing
data with Joyent's Manta[0]. It always pretty much boils down to "serialize a
function, and send it somewhere, to run on the data locally".

Some of these frameworks do it only through native functions (Spark runs
serialized JVM procedures, and for Python functions I believe it shells out to
an actual Python interpreter), others are completely agnostic to this and
work with binaries (Manta).

It'd be pretty hard to do, and have a lot of overhead, if all of those
"functions" that I'm shuffling around are, at the lower bound, 130MB in size.

I understand that a lot of this can be done directly at the Julia layer for
Julia things (much like Spark does with JVM things), but it kinda limits the
applications of this nonetheless.

[0] [https://github.com/joyent/manta](https://github.com/joyent/manta)

~~~
ViralBShah
Julia can already do this in its `remotecall` API (actually has done this
since v0.1), which serializes a closure and sends it to a remote node for
computing.

[https://docs.julialang.org/en/v1/manual/distributed-
computin...](https://docs.julialang.org/en/v1/manual/distributed-computing/)

However, this is not related to the system image (sys.so) that is used to
start the Julia process, which has a compiled version of base, stdlibs and
whatever else you choose to build in.
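
A minimal sketch of that `remotecall` pattern, using a local worker process in place of a remote node. `scale_on_worker` is a hypothetical helper name, and the single local worker is an assumption made for illustration.

```julia
using Distributed

# Hypothetical helper: build a closure locally, ship it to worker `w`, run it there.
function scale_on_worker(w, x, scale)
    f = y -> y * scale            # closure capturing the local variable `scale`
    future = remotecall(f, w, x)  # serializes `f` and `x`, returns a Future immediately
    return fetch(future)          # block until the worker sends the result back
end

pids = addprocs(1)                # start one local worker process
result = scale_on_worker(first(pids), 4, 10)
rmprocs(pids)                     # shut the worker down again
```

Only the closure and its arguments travel over the wire here; the worker is a normal Julia process that already has its own system image loaded.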

------
adamnemecek
Julia is next gen. The whole experience is so much more pleasant than anything
in the space. The interop is also crazy. How is it possible that you can call
Matlab, Python, cpp, fortran, Rust from one language?

I legit think that even if you are using say pytorch. You are better off
writing your code in julia and using the python interop.

~~~
smabie
Yeah, the quality of the language is pretty incredible. Especially compared to
Python, which has been a disaster for as long as I can remember.

I think the ecosystem could be a bit better, but that's more of a personal
opinion rather than a serious concern as, like you alluded to, the interop
story is insanely good.

~~~
adamnemecek
The ecosystem is moving very rapidly though. What do you think is missing?

~~~
smabie
Right now, the DataFrame package doesn't support row indices like Pandas does.
Row indices are especially useful for time series data, and there's another
package called TimeSeries that supports a dataframe-like type called a
TimeArray that supports dates as row indices.

TimeArray is like a DataFrame, but with far fewer features. You can't really
do that much with them, and they only have limited support in an auxiliary
package for resampling.

It would be nice if a new DataFrame library could be created that was inspired
by Pandas.

In general, I think Julia suffers from the problem of too many small packages.
For example, there are 3 libraries I have to use for time series: TimeSeries.jl,
TimeSeriesResample.jl, and TimeSeriesIO.jl. None of the libraries are
particularly large and should all be merged into one package.

Part of the reason for the proliferation of small packages is that, unlike Python,
Julia's strong support for function polymorphism allows for seamless extension
of existing functionality with new packages. Unfortunately, I think this has
created a cultural problem in which new packages are created that only
slightly modify existing functionality.

~~~
xiaodai
Please don't be "inspired" by pandas.

~~~
smabie
No, you're right. Pandas totally sucks. But the one good thing it got right is
support for row/multi-index.

------
andi999
I can currently see Julia only as a replacement for Python, with the biggest
advantage, that fast modules can be written in-language, and you do not
default back to C. So in a sense it could solve the two language problem.

But then for my application space there are problems:

\- compilation seems possible now, but one has to be careful of GPL (like in
the fft)

\- no way to turn off GC (so byebye real-time possibilities).

\- what about GUI design? Heard some mixed messages about it

~~~
eigenspace
> \- no way to turn off GC (so byebye real-time possibilities).

While Julia does not offer semantic guarantees that you can avoid the GC, it's
actually quite possible and easy to write code where you're manually managing
all your memory and the GC is never invoked.
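
One common way to do this is to preallocate buffers and update them in place. The following is a rough sketch of that idea; `saxpy!` and `allocations_in_hot_loop` are hypothetical names, and it demonstrates one technique rather than a guarantee the language makes.

```julia
# In-place update: writes into `out` instead of allocating a new array.
function saxpy!(out, a, x, y)
    @inbounds for i in eachindex(out, x, y)
        out[i] = a * x[i] + y[i]
    end
    return out
end

# Measure how many bytes the hot call allocates once the buffers exist.
function allocations_in_hot_loop(n)
    x, y = rand(n), rand(n)
    out = similar(x)               # allocate once, up front
    saxpy!(out, 2.0, x, y)         # warm-up call
    return @allocated saxpy!(out, 2.0, x, y)
end

bytes = allocations_in_hot_loop(1_000)
```

If the hot path allocates nothing, the collector has no reason to run during it. `GC.enable(false)` also exists as a blunter tool for temporarily disabling collection.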

> \- what about GUI design? Heard some mixed messages about it

There's a lot of promising things happening with Makie.jl, Stipple.jl,
Dash.jl, Pluto.jl and others. This space is still a little immature in Julia,
but it's progressing fast.

~~~
andi999
I know there are workarounds, but when you do not have guarantees the phrase
'production ready' is not the first that comes to my mind. (Of course in the
problem spaces of matlab and python, and probably R, Julia is probably on par)

~~~
DNF2
Well, how high is the bar for calling a language 'production ready' in that
case? How many production ready languages are there?

~~~
andi999
What I meant: production ready depends on the feature. So for substituting
Python (and being able to write fast in-language extensions), the language is
most likely production ready; so in a sense it is a replacement for Python +
C; but not a replacement for C. And this is a bit sad, because it could have
been (building linked executables should be possible, since the language is
compiled).

But I like your question, let's call it 'which languages are all purpose
production ready'; probably only C, C++, Java (there is a real-time version),
LabView.

------
dnautics
Having followed it since the 0.4 era I'm excited that Julia is production-ready
(though I have professionally moved to Elixir). Since I'm working in ML
infra, I'm interested in orchestrating Julia jobs (via Elixir). What I'm less
clear on is whether there are good resources on how to deploy Julia, keep
dependencies sane, and pin memory usage. Or does one just go straight to
containers?

~~~
staticfloat
Yep, to build on Stefan's sibling comment, we deploy a few services through
docker that follow the flow of <merge on github> -> <build on dockerhub> ->
<deploy through watchtower> that has been working for us really well.

Docker has some nice infrastructure that we like, such as the ability to set
memory limits, automatically restart processes on crash/if a healthcheck goes
sour/on reboot, easily spin up different versions of the same service (based
on which image you tag to a name), etc... Containerization has done a lot for
our ability to bring servers up onto heterogeneous hardware easily and quickly.

As an example, this repository [0] is deployed on 10 machines around the
world, providing the pkg servers that users connect to in geographically-
disparate regions. The README on that repository is a small rant that I wrote
a while back that walks through some of the decisions made in that repository
and why I feel they're good ones for deployment.

To answer your specific questions:

* How to deploy Julia: we use docker containers and watchtower [1] to automatically deploy new versions of Julia.

* Keep dependencies sane: it's all in the Pkg Manifests. We never do `Pkg.update()` on our deployments, we only ever `Pkg.instantiate()`.

* Pin memory usage: We use docker to apply cgroup limits to kill processes with runaway memory usage [2]. This is only triggered by bugs (normal program execution does not expect to ever hit this) but bugs happen, so it's good to not have to restart your VM because you can't start a new ssh session without your shell triggering the OOM killer. ;)

[0]
[https://github.com/JuliaPackaging/PkgServerS3Mirror](https://github.com/JuliaPackaging/PkgServerS3Mirror)
[1]
[https://github.com/containrrr/watchtower](https://github.com/containrrr/watchtower)
[2]
[https://github.com/JuliaPackaging/PkgServerS3Mirror/blob/c6a...](https://github.com/JuliaPackaging/PkgServerS3Mirror/blob/c6ae2c8e9a7b228cebbcf48c25fd2381be97b565/docker-compose.yml#L18)
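
The "instantiate, never update" rule above could look roughly like this in a container entrypoint. This is a sketch under assumptions: `setup_environment` is a hypothetical helper, not code from the linked repository, and the project directory is expected to contain a checked-in `Project.toml` and `Manifest.toml`.

```julia
using Pkg

# Hypothetical helper: install exactly the versions pinned in the Manifest.
function setup_environment(project_dir::AbstractString)
    Pkg.activate(project_dir)   # use the project's own environment, not the global one
    Pkg.instantiate()           # install what the Manifest records; no version upgrades
    return nothing
end

# Example call at service start-up, assuming the script sits next to Project.toml:
# setup_environment(@__DIR__)
```

Calling `Pkg.update()` here instead would re-resolve versions at deploy time, which is what the approach above avoids.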

~~~
StefanKarpinski
With a bit more detail, this would make a great blog post!

------
ablekh
While being a big fan of Julia in general and recognizing a very significant
progress in both language and ecosystem development [kudos to Julia's core
team as well as numerous contributors to the language and its growing
ecosystem!], I respectfully disagree with the post author's conclusion in its
entirety.

For scientific computing and advanced machine learning domains, Julia is
certainly a breath of fresh air and a no-brainer decision. However, as a
general-purpose language as well as technology stack, I think Julia still has
a long way to go. This is mostly due to the relative immaturity of that part of
the ecosystem, spotty - and sometimes even non-existent! - documentation (of
course, except for the language itself and quite a limited number of core and
popular packages) as well as some other issues, including tooling,
development/compilation performance and limited pool of skilled developers.

So, from a startup founder perspective [who has to select the optimal, risk-
minimizing, platform stack], despite Julia's many attractive features
(including powerful meta-programming facilities - in my case, for potential
DSL development), I'm now leaning toward Python and .NET ecosystems. Both
offer very mature and large package ecosystems along with comprehensive
tooling support and incomparably larger pool of skilled developers.
Additionally, .NET offers stability of consistent development improvements,
backing of a major corporation [no acquisition risk] along with support for
modern enterprise-focused architectural practices and patterns, including DDD
and multi-tenancy.

~~~
avasthe
This.

Not trolling. I don't really know why people like to shoehorn a purpose-
oriented language into all domains. Eg: Julia for scientific computing, Rust
for low level programming. They would be great if they focused on those
"strong zones".

~~~
ddragon
I read this article not as a "Julia is ready to replace the all-purpose
languages used in business", but as "Julia is ready to deploy its scientific
computing into production environments" (as opposed to just local/hobby tasks
and academic environments which were the early focus). No one is recommending
people to do their small e-commerce in Julia (or perhaps Rust) unless you're
in for the fun of the languages (or your e-commerce happens to be a special
case that will play with either language's strengths). But having a solid web
deployment story is important to both regardless, be it in Julia to have live
dashboards or integrating your scientific models in your microservice
architecture and to serve them directly to clients or Rust for embedded web
servers for IoT and other devices. For a language to be great in one domain,
it has to be at least good in everything around it for when you need to
connect that domain to the real world.

And Julia's creators do focus on its strong zone, with all the work on TPU,
HPC and parallel computing, and so does the community with the stuff presented
at JuliaCon like interactive reproducible notebooks (Pluto.jl) or data
dashboards (Stipple.jl), both using the web domain to improve the scientific
computing domain.

~~~
ablekh
I'm not disagreeing with most of your comment above, except for how the post
reads. To be accurate, the post's author set the _context_ to _general-purpose
computing_, not the scientific one. Here are two arguments for why I think
so.

1\. Intro section (italics emphasis mine).

"... delivering _complex enterprise projects_:

    Julia is fast, and has a very nice syntax, but its ecosystem is not mature enough for use in serious production projects.

For many years I would agree with it, but after JuliaCon 2020 I believe we can
confidently announce that

    Julia is production ready!"

2\. Section "Building microservices and applications". Even more so,
practically all sections in the post, except just one ("Managing ML
workflows"), describe various general-purpose and enterprise computing
aspects.

Therefore, I don't see how one could view the post and the author's conclusion
from a purely scientific computing perspective.

~~~
ddragon
Fair enough, it might be my own view on what I'd use Julia for in a production
environment clouding my interpretation of the scope of the text, especially as
the author defines this as his area at the start and says how it's what Julia
shines at. Julia being an acceptable ("production-ready") language means that
whenever my problem hits the areas where Julia shines, it becomes a valid
candidate to evaluate considering all pros and cons.

And while right now I wouldn't recommend Julia for that web domain unless you
also need its number crunching features (which is more than just scientific
computing, I work with large finance systems and there is definitely a lot of
that stuff), I do think it's more of a library/community problem than a
language problem. The language is well equipped to handle it (easy to use from
a scripting perspective, easy parallelism, fast after warm-up, allows for
clean abstractions to create sophisticated web frameworks, the mentioned easy
FFI), what it lacks is the coverage and maturity of the tools and support (so
I don't feel like I'm at a risk at any point that I need to do some
integration, such as integrating with Kafka or any other systems, without
having to write my own solution or integrating multiple languages at every
step).

That's different from systems programming, embedded, game development and even
GUI development (at least until smaller binaries with less warm-up) right now.
Even with good libraries it will not be competitive with the languages that
already claim those domains. Julia is a general purpose language already and
more than a matlab substitute, it does not need to be good at everything (and
it shouldn't try), but I don't think it should restrict the scope too much
either.

~~~
ablekh
I appreciate your thoughtful comment. Clearly, we are on the same page on the
topic at large. As mentioned in my original comment above, I see no problems
with Julia language per se (quite the opposite - I find it elegant, flexible
and powerful), but rather emphasized that present issues exist around relevant
ecosystem (documentation, packages, tools and support).

------
amacbride
I really enjoyed the workshops and the talks; I thought the SciML and MLJ
talks were particularly good.

Everything is available on their YouTube channel:

[https://www.youtube.com/playlist?list=PLP8iPy9hna6Tl2UHTrm4j...](https://www.youtube.com/playlist?list=PLP8iPy9hna6Tl2UHTrm4jnIYrLkIcAROR)

------
sgillen
Is the debugger in Vs code or anywhere else better now? I remember trying
Julia just a few months ago and the debugger frustrated me enough that I went
back to print statement debugging.

~~~
Eugeleo
I’m not using Julia actively, but I’ve been following the news and I gather there were
a lot of improvements made to the VS Code extension, with a better debugging
experience as one of them. You should try it out again.

------
usgroup
If you were trying to motivate someone embedded in Python / scipy / pandas /
etc to move to Julia at this time, what would you say?

~~~
eigenspace
I think this post:
[https://julialang.org/blog/2012/02/why-we-created-julia/](https://julialang.org/blog/2012/02/why-we-created-julia/)
presents the
aims of the Julia project at a high level quite well. That post was partly
aspirational at the time, and partly already achieved.

Now, 8 years later, I'd say essentially everything there has been achieved.
The most important difference between Julia now and Julia then is the
gigantic, vibrant and interacting ecosystem that's sprung up (though there are
other important technical differences too!)

_______________________________________________

Here's the pitch I'd give:

Julia's value proposition is mostly geared towards people trying to do things
that are just too difficult, awkward or slow in other languages. It's not just
about speed, dynamism and friendly syntax. It's also about solving the
expression problem and providing unprecedented composability.

So while it may not necessarily have great appeal to 'end users' (who are
people just calling numpy/scipy/pandas/whatever functions the way they were
meant to be called by the library writers), it does appeal to people who make
the sort of things end users want so the community is currently swelling with
very talented people making state of the art research libraries. The language
is designed for 'composability', so that it's very easy to combine packages
together in ways the authors didn't necessarily intend or foresee.

This approach is really starting to bear fruit. Our package ecosystem has some
real, state of the art stuff not easily available in other languages, and it's
often built in less time and with fewer resources. For 'end users', whether or
not Julia makes sense for them really just depends on if the things they need
are already available.

Another huge advantage is that even if you're not some super researcher
developer, Julia substantially lowers the bar for mere mortals such as myself
to do package development. The composability I mentioned above means that I
can stand on the shoulders of giants as I implement my own stuff, plus the
fact that the majority of libraries are written in pure julia means that I can
easily "peek under the hood" to figure out how they work and learn from them.
This is really special and can't be replicated in python, because 'peeking
under the hood' invariably means reading very specialized C code, rather than
Python code.

I think Julia will someday eat Python for scientific computing, but it may be
a while before that becomes obvious to Python users. In the meantime, we have
an _excellent_ package for calling and running Python code from within Julia,
called PyCall.jl [1], so that can make the burden of switching even easier.

[1]
[https://github.com/JuliaPy/PyCall.jl](https://github.com/JuliaPy/PyCall.jl)

~~~
marmaduke
This is a good pitch and I think composability is the major selling point for
programming in the large. Time will tell though if the ecosystem doesn’t
fragment as different approaches or competing packages appear, for political if
not technical reasons. The Torch vs TensorFlow choice could have been in
Julia.

> I think Julia will someday eat Python for scientific computing, but it may
> be a while before that becomes obvious to Python users.

I can’t say I understand this rather pervasive attitude. If you have well
tested code in a language which you can easily interface with, why bother
rewriting it except for reasons of performance or compositionality? Or perhaps
you mean that we’ll all be writing Julia even if some algorithms still sit in
Python?

~~~
eigenspace
> I can’t say I understand this rather pervasive attitude. If you have well
> tested code in a language which you can easily interface with, why bother
> rewriting it except for reasons of performance or compositionality? Or
> perhaps you mean that we’ll all be writing Julia even if some algorithms
> still sit in Python?

I mostly mean that I think Julia will someday have more mindshare in
scientific computing than Python.

However, I'd say that the reason you see the "rewrite it in pure julia"
attitude everywhere is that we enjoy a lot of benefits from having more and
more things in pure julia, namely that our very powerful metaprogramming and
code introspection tools can more easily 'get inside' Julia code, and that the
compiler is able to perform interprocedural optimizations (IPO). IPO are not
always important, but when you need them and don't have them, you _really_
miss them, so at least for numerical kernels and such, it's really nice to
have as much pure julia infrastructure as possible.

In the meantime though, as I daydream about a world where everything is
written in Julia, there's nothing stopping me from being pragmatic and
wrapping / bridging to and from other languages.

~~~
marmaduke
I completely agree Julia is fantastic for computational kernel construction. I
would like to never again write GPU kernels by hand (though Numba has been
quite sufficient for my needs). I just don’t see this as everything.

------
malshe
Thanks for sharing this. Are there instructor resources available to teach
Julia in a data science program? I use R mostly but I am porting one course on
ML applications to Python. The communities in both R and Python are vibrant
for instructors to create structured learning experiences for data science.
How about Julia?

~~~
amkkma
[https://juliaacademy.com/](https://juliaacademy.com/)
[https://github.com/mitmath/6S083](https://github.com/mitmath/6S083)
[https://www.youtube.com/watch?v=R84L-BQcjHw&t=905s](https://www.youtube.com/watch?v=R84L-BQcjHw&t=905s)

~~~
malshe
That's great, thanks!

------
jeanl
Do you guys know if it's possible to generate self-standing C++ code from
Julia code? (i.e., code that you can then compile into another project, and
will not require any interpreter to run or anything like that)? And if it's
possible do you know how good the generated code is (in terms of speed)?

~~~
ellisv
You can create binaries from Julia and there are APIs to call Julia code from
C/C++

[https://julialang.github.io/PackageCompiler.jl/dev/devdocs/b...](https://julialang.github.io/PackageCompiler.jl/dev/devdocs/binaries_part_2/)

[https://docs.julialang.org/en/v1/manual/embedding/](https://docs.julialang.org/en/v1/manual/embedding/)

------
vasili111
Does a Julia matrix have named columns like R, or unnamed ones like numpy?

~~~
ddragon
The built-in array (which is multidimensional) is unnamed, but Julia was made
in a way that anyone can create as a library their own types that are as high
performance as the native ones (most of which are written in Julia) so you can
just get a named array package like [1].

[1]
[https://github.com/davidavdav/NamedArrays.jl](https://github.com/davidavdav/NamedArrays.jl)
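
To illustrate the "create your own array type" point, here is a small sketch of a user-defined matrix with named columns. `NamedColsMatrix` and `column` are hypothetical names made up for this example; NamedArrays.jl's real API is different and much richer.

```julia
# A user-defined matrix type that carries a name for each column.
struct NamedColsMatrix{T} <: AbstractMatrix{T}
    data::Matrix{T}
    colnames::Vector{Symbol}
    function NamedColsMatrix(data::Matrix{T}, colnames::Vector{Symbol}) where {T}
        size(data, 2) == length(colnames) ||
            throw(ArgumentError("need exactly one name per column"))
        return new{T}(data, colnames)
    end
end

# The two methods the AbstractArray interface requires for a read-only array.
Base.size(m::NamedColsMatrix) = size(m.data)
Base.getindex(m::NamedColsMatrix, i::Int, j::Int) = m.data[i, j]

# Look up a whole column by its name.
function column(m::NamedColsMatrix, name::Symbol)
    j = findfirst(isequal(name), m.colnames)
    j === nothing && throw(KeyError(name))
    return m.data[:, j]
end

m = NamedColsMatrix([1.0 2.0; 3.0 4.0], [:height, :weight])
total = sum(m)                  # generic array functions work without extra code
weights = column(m, :weight)
```

Because the type subtypes `AbstractMatrix` and implements `size` and `getindex`, functions written for any array (such as `sum`) accept it unchanged.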

------
minetest2048
Julia feels like a good replacement for Matlab, but then Matlab has Simulink,
is there any project to create something like Simulink but for Julia?

~~~
ChrisRackauckas
ModelingToolkit.jl, JuSDL.jl, NetworkDynamics.jl, Modia.jl, etc. are all
extensions of Simulink in some form, supporting stochastic equations,
delays, automatic parallelization, acausal modeling, etc.

------
vasili111
What is the state of data frame support in Julia?

~~~
3JPLW
Exceptional — and its lead author is the author of this blog post.

[http://juliadata.github.io/DataFrames.jl/stable/](http://juliadata.github.io/DataFrames.jl/stable/)

~~~
chubot
Is that what you meant to link? It looks like the official docs, and not a
blog post.

~~~
rsfern
This post is also relevant:
[https://bkamins.github.io/julialang/2020/08/02/post_juliacon...](https://bkamins.github.io/julialang/2020/08/02/post_juliacon_1.html)

Edit: it’s a bit more into the weeds than I thought. Honestly the docs give a
better picture of DataFrames.jl

------
sadfev
Is it actually backwards compatible? I don’t think it is, they continually
introduce breaking syntax changes.

Please somebody enlighten me

~~~
amkkma
It's been backwards compatible since the 1.0 release two years ago.

~~~
StefanKarpinski
Julia follows semantic versioning [1] and is now on version 1.5, which means
there have been five non-breaking releases introducing new features (and dozens of
bugfix only patch releases). Code that worked on 1.0 also works on 1.1, 1.2,
1.3, 1.4 and 1.5 and will keep working in 1.6, etc. until such a time as a
breaking 2.0 release is made. Even then breaking changes will be kept minimal
and only made to enable worthwhile language features.

[1] [https://semver.org](https://semver.org)
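
The compatibility rule described above can be written down with Julia's built-in `VersionNumber` type. `compatible` is a hypothetical helper for illustration, not part of Base or Pkg.

```julia
# Hypothetical helper: can code written for `written_for` expect to run on `running`,
# under the semver promise of no breaking changes within a major version?
function compatible(written_for::VersionNumber, running::VersionNumber)
    return running.major == written_for.major && running >= written_for
end

same_major = compatible(v"1.0.0", v"1.5.0")   # newer minor release, same major
next_major = compatible(v"1.0.0", v"2.0.0")   # a major bump may break things
older      = compatible(v"1.3.0", v"1.2.0")   # older release may lack needed features
```

The version of the running session is available as the constant `VERSION`, so a script could check `compatible(v"1.0.0", VERSION)` at start-up.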

------
xiaodai
Does Julia webstack have middle-ware like authenticators etc etc?

------
noahamar
Highly disappointed that this isn't a convention for Julia Roberts fans.

