Every time Julia is mentioned on HN, I see a surprising number of people who dislike the language for whatever reason. I understand a lot of their concerns (time to first plot, debugging sucks, IDE support isn't amazing yet). However, I don't see a lot of alternatives for the niche that Julia occupies. If you want to do high-performance scientific computing, there aren't many options in terms of languages.
Python is the de facto standard, but it's so slow for anything that can't be neatly expressed as vectorized NumPy operations. There are ways around that like Numba, Jax, Cython, etc., but their use cases are pretty limited, and they don't compose well with other Python packages.
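To make that concrete, here's a sketch (plain Python, with the Numba usage only indicated in a comment) of the kind of loop-heavy numerical kernel that pure Python runs slowly and that doesn't vectorize cleanly; this is exactly the shape of code tools like Numba target:

```python
# A scalar, loop-heavy kernel that pure Python handles slowly. With Numba
# you would typically just decorate this with `@numba.njit` (not done here,
# so the snippet stays dependency-free) and the loops compile to machine code.

def pairwise_dist_sum(points):
    """Sum of Euclidean distances between all pairs of 2D points."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            total += (dx * dx + dy * dy) ** 0.5
    return total

pts = [(0.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
print(pairwise_dist_sum(pts))  # 12.0 (distances 5 + 4 + 3)
```

The catch the parent comment is pointing at: once a function like this calls into arbitrary Python objects or third-party packages, the JIT often can't compile it anymore, which is where the "limited use cases" complaint comes from.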
There is, of course, C and C++ which are commonly used to speed up Python or as standalone packages. However, C++ is such a complicated beast that writing performant and correct code takes forever. C is much more manageable, but I find that there are not a lot of scientific packages written in pure C. This isn't even touching on the horrendous build system that is CMake.
Fortran is a pretty simple looking language and would probably be the closest to Julia in terms of speed and expressiveness, but the writing is on the wall and Fortran's days are numbered. I am not aware of many new packages being developed using it.
Other than that, there are languages like Rust and Go, but those have ecosystems so small they make Julia's ecosystem look like Python's. I really don't want to spend my time as a grad student writing basic numerical libraries from scratch.
> Fortran is a pretty simple looking language and would probably be the closest to Julia in terms of speed and expressiveness, but the writing is on the wall and Fortran's days are numbered. I am not aware of many new packages being developed using it.
Nothing could be further from the truth. Fortran, both the language itself and the packages in the ecosystem, is under continual development. There are a few peer-reviewed studies that quantitatively tracked Fortran usage and concluded that Fortran is not just for legacy code: people continue to write new packages precisely for the reasons you mentioned (expressivity, performance).
Could you link to some of those papers? I would be interested in learning more about the frontier of Fortran development. I recently learned about the LFortran project and while I think it's an interesting project, I feared it would be too late for the language.
Fortran is a case of a language that has grown below the trend of the industry. There are probably more active Fortran projects and developers now than at any point in history. But it became a smaller piece of a much larger pie, so people think it's "dead".
Same goes with Perl - probably more Perl developers than at its heyday in the 90s but a smaller chunk of the overall picture. Measure by noise levels and anything that doesn't explode seems to be dying.
If Julia gets to where it wants to be, pretty soon there will be tasks for which there is no practical alternative to Julia. At this point, people will be in some sense forced to use it, and the floodgates of hate will open (see javascript, matlab, cmake, cpp). Maybe this is the warning trickle?
You're describing the current state of Python. Julia is kind of a response to Python. Python's become the de facto standard for ML and dataviz, two things it's kind of a terrible tool for.
I'm sure people will become more critical of Julia if it successfully becomes the de facto for those things but there's a big difference in that Julia is very focused on those things whereas Python is trying to be as general purpose as possible
Tangent, but tbh I really don't see what the future of Python is. I don't understand why we don't just teach JS or TS as a starting language. It has about as much baggage and is about as easy to understand as Python, and it's the only language with a comparable ecosystem. Plus you also get to learn the language of the web for free. And now that you have tools like Deno that can run TypeScript without any messy dependencies, you can just make a quick .ts file, write your webscraper or proof of concept, and run it in the terminal. At least personally, that replaces my primary use case for Python, and I get to take advantage of TS's amazing type system, which really comes in handy when dealing with external APIs.
I really don't see a place for Python
ml --> Julia, C++, R
dataviz --> R, JS/TS
beginner-friendly --> JS
web --> JS
want a backend framework but wanna choose a language that's easy to hire for --> TS, Ruby
huge, stable community --> JS/TS
quick scripts --> Deno
systems --> Rust, C++, Java, Go
I'd say it still dominates in scripting/webscraping and creating shareable work (e.g. Jupyter), but I'm just pointing out that at this point it's replaceable in those areas
I agree. Around 2008, I was drunk on "the zen of Python", considering myself a Pythonista, writing elegant solutions, making unmaintainable one-liners, etc. But really, Lisp was so much better for that power/speed of development. Eventually (for me) Go was so much more powerful and easier to maintain, with actual industry usage (seemingly instantly) and without any loss in productivity.
Python solutions seem no better than maligned JS solutions nowadays, but at least JS has TS and efforts to improve it where they can. Indeed, most Python code doesn't seem to be in proper production; it's just one-off exploratory scripts and pieces strewn about in notebooks. It seems like it could easily lose market share quite quickly.
The moat for JavaScript is the web frontend; otherwise it would not be used (note the current rewriting of most of its tooling into non-JS languages). I would not call it beginner-friendly, stable, or good for quick scripts (does Node come with SQLite?). I've yet to see a stable equivalent to django/rails. JavaScript's data visualisation comes from it being in the browser, which has severe limitations (e.g. the need for WASM), and it's usually much easier if you don't need two languages to do analysis (as the JavaScript numerical and data-format ecosystem is a mess).
Python's advantage is that, apart from the web frontend (I have no expectation PyScript will catch on), it is everywhere, even when you don't expect it to be (e.g. Bank Python, VFX, Postgres, build systems, scripting interfaces in apps). Are there better options for specific use cases? Yes, but it's when these use cases intersect that Python wins out.
> I've yet to see a stable equivalent to django/rails
... express? Has better benchmarks than both those frameworks, is probably more commonly used today (more employable), and has the benefit of TypeScript which fixes the biggest issues with both Python and Ruby (lack of a good type system)
> Python's advantage is [...] it is everywhere
Yes, that's my point. That's it's only advantage. But JS is better setup to become the "just use it because everyone knows it" language imo.
Having used express, and being shocked at how bad it is: no, express is nowhere near an equivalent of django/rails (flask, the go-to microframework for Python, is more complete than express).
TypeScript has two major issues (which it shares with Python's type annotations): it has to interact with untyped code (so what you think is fine is not, especially when key libraries are not written in TypeScript), and it tends to increase barriers to modifying the code (unlike more functional languages, whose type systems tend to be both complete and have fewer footguns).
As to JavaScript being more commonly used and known, that's dependent on where you're working (especially when it can be sandboxed off so that the majority of users don't need to deal with it; see plotly dash and R-shiny). Having seen multiple examples of systems where JavaScript was used when it was entirely the wrong tool (and where, for various reasons, Python was the correct one, partially due to the difference in the library ecosystems), I suspect we'll see a slow retreat of JavaScript over time to its core areas (the front-end and server-side rendering of the front-end), as more people find that having two languages is a better choice and rewrite/replace existing JavaScript services.
The key advantage of Python is that it's good enough at many things and that there is a giant infrastructure of libraries and a huge community (and thus tutorials, howtos, stackoverflow answers etc.).
You mention Julia/C++/R for ML. My counterargument would be that things like fast.ai exist, use Python, and make the whole topic approachable. I also find it a bit hard to believe many people that don't come from a stats/other academic field background would readily pick R over Python for anything ML related. I personally never thought JS was very beginner-friendly because setup was traditionally clunky. That has thankfully changed a lot.
You mention many things that are better in theory but in practice I need to get work done now. I'd probably rather write quick scripts in Python (or bash) than Deno for example.
It's probably also easier to hire for Python than Ruby; at least, I'm not convinced Ruby has a giant edge. A huge, stable community is also exactly what Python has.
Python also has mindshare in many niches; for example, in security, many things are quickly scripted in Python (notoriously Python 2 in some cases). Go and Ruby are also used for a lot of tooling there.
At the end of the day, if someone from a non-programming background asks me what language to learn, I still recommend Python (because usually they either want to pivot into ML somehow, want to get some quick scripting done to automate something, or want to scrape some data). If I were to teach my child, I'd probably pick what I consider the most fun and completely ignore employability (Elixir), and if I were to build a CS program from scratch I'd have to think long and hard. I might end up on Rust or Go, but JS would be in the running (depends on whether there's a big focus on embedded/osdev etc.).
At the end of the day, "pick whatever language gets the job done" is still the best recommendation in my opinion :)
> The key advantage of Python is that it's good enough at many things and that there is a giant infrastructure of libraries and a huge community
Right, that's exactly my point. But once that's gone, there's not really anything left for it. JS is also good enough at everything and probably has a larger community and more stability, given that it's the language of the web. Def not there in terms of ML yet, but it can get there.
> I'd probably rather write quick scripts in Python
Because you're used to it.
> It's probably also easier to hire for Python than Ruby
If we're talking backend, an even easier thing to hire for than both of them is JS/TS.
> Python also has mindshare in many niches for example in security many things are quickly scripted in Python
Yup. Again, this is the momentum of its community, but not anything inherent to it
> "pick whatever language gets the job done"
I don't disagree that that language is often Python today. I'm just saying all of its success is due to the community and ecosystem, so I don't see any guarantee that this will still be the case many years from now.
> I also find it a bit hard to believe many people that don't come from a stats/other academic field background would readily pick R over Python for anything ML related
I agree with this, but actually suspect that it's a good thing. People without some experience of stats probably shouldn't be using ML, and if R use helps to identify these people, I'm all for it.
That being said, basically every company has standardised on Python, some for good reasons and others because of the PR Python has gotten through DL stuff.
JS suffers way too much from a lack of a good comprehensive standard library and builtins. It doesn't even have list comprehensions!
I get thinking Python and JS are "kind of the same", but there are so many details that Python gets right and JS kinda falls apart on. Python's pervasive dunder methods mean that you can look at objects in a REPL and get a good representation (try looking at any non-trivial JavaScript object; it's a major pain). You can define equality. Operations _don't_ stringify by default (so dictionaries work like you think they should). And Python has blocking methods, which means that you can write code in a straightforward way.
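A quick sketch of what those dunder methods buy you (the `Point` class here is made up purely for illustration):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # What the REPL (and print) shows instead of an opaque object dump.
        return f"Point({self.x}, {self.y})"

    def __eq__(self, other):
        # Value equality rather than identity.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Needed so equal Points also work as the same dict key.
        return hash((self.x, self.y))

print(repr(Point(1, 2)))            # Point(1, 2)
print(Point(1, 2) == Point(1, 2))   # True (in JS, {x: 1} !== {x: 1})
d = {Point(1, 2): "origin-ish"}
print(d[Point(1, 2)])               # origin-ish; lookup by value works
```

The dict-key behavior is the "dictionaries work like you think they should" point: keys are compared by the equality you define, not stringified.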
There are a lot of things playing in JS's favor, but it's hard to discount Python's ease of use relative to JavaScript IMO, especially as its packaging story gets better. And all of this is kind of ignoring the fact that most people are _actually_ talking about TypeScript when they talk about JS nowadays, because writing "raw" JS is so error prone that it would be very difficult to actually get serious work done in it without an extra lint phase (I say this as a huge TS proponent).
JS actually did have list comprehensions at one point, but they were deprecated because nobody used them. There's no need imo: .map, .filter, .reduce, and other array methods achieve the same thing with a nicer syntax.
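For what it's worth, the two styles express the same thing; here's a quick Python sketch putting a comprehension next to its map/filter equivalent (the latter being roughly the shape of JS's chained array methods):

```python
xs = [1, -2, 3, -4, 5]

# List comprehension, Python's idiom:
doubled_positives = [x * 2 for x in xs if x > 0]

# map/filter chain, roughly the shape of JS's xs.filter(...).map(...):
doubled_positives_mf = list(map(lambda x: x * 2, filter(lambda x: x > 0, xs)))

print(doubled_positives)  # [2, 6, 10]
assert doubled_positives == doubled_positives_mf
```

So the disagreement is really about which surface syntax reads better, not about expressive power.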
Also, JS undoubtedly has one of the nicest anonymous function syntaxes with arrow functions. It's something I constantly find myself missing when I'm in a Python file.
> And Python has blocking methods
What does this mean? Isn't JS blocking by default? Both languages have async support that you don't have to use...
> it's hard to discount Python's ease of use relative to Javascript
Hmm I guess we just have different experiences with this. I can open up a quick text file anywhere on my computer, save it as a .ts file, import whatever library I want, use TS if I want it, etc and then just run it in the CLI with `deno run test_run.ts`
Feels pretty easy to me. And then you have _anything_ related to UI. Maybe it's just me but I never quite felt comfortable with GUI libraries in Python. With JS I can just make an HTML file and open it up in any browser and I have access to the most cross-platform UI API in the world: the WWW
> "raw" JS is so error prone that it would be very difficult to actually get serious work done in it
If you're trying to get "serious" work done you probably shouldn't be using either of these languages. Or if you are, you'd definitely have some linter set up for either one. Also, Python's typehints are nice, but it can't compare to TS. And if you don't wanna go all in on TS, you could always stick with JSDoc comments which get you 90% of the way there for small projects and have really great support in any modern IDE.
> What does this mean? Isn't JS blocking by default? Both languages have async support that you don't have to use...
To be clear on this: because JS APIs prioritize non-blocking mechanisms at the expense of everything else (related, IMO, to the original sin of how JS interacts with the browser), everything has to become async, even though in practice a lot of this stuff happens sequentially anyway.
See this example Playwright test [0]. awaits everywhere. Because of how await works as well (Rust got this one right with postfix await), suddenly fluent APIs become awkward. You no longer have foo.bar().baz() but (await foo.bar()).baz(). (await getResources()).map(f) is not beautiful in my eyes.
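Python's asyncio has the same prefix-await shape, so the awkwardness is easy to demo there too (the `fetch`/`Response` names below are made up for illustration, not a real HTTP client):

```python
import asyncio

class Response:
    """Toy stand-in for an HTTP response with an async body parser."""
    def __init__(self, payload):
        self._payload = payload

    async def json(self):
        return self._payload

async def fetch():
    # Pretend network call.
    return Response({"ok": True})

async def main():
    # Fluent chaining breaks: both steps are async, so the inner await
    # must be parenthesized instead of reading left-to-right.
    data = await (await fetch()).json()
    print(data)  # {'ok': True}

asyncio.run(main())
```

With a postfix await (as in Rust's `fetch().await.json().await`), the chain would keep reading left-to-right, which is the point being made here.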
And the core thing is that almost no code actually relies on concurrency in practice! So much of this is a consequence of the problem of JS not having good yielding semantics, and its interaction with the DOM in general.
Is this the biggest issue in the world? I guess not, but I very much dislike the aesthetics of this for smaller chunks of code. I would really like a model with implicit awaiting in ~all the places it currently exists.
I do agree that JS's anonymous function syntax is nice, and I miss it dearly in Python. Though here as well, I think Python comes out on top with all of its argument/keyword-argument strategies. I get the options-dictionary pattern, but dislike it.
And UI stuff is of course ... well it's easier in Javascript (though it requires its own infrastructure). There's no ways around that.
Loads of serious work is done in both languages; I just think that JS is missing a lot of table-stakes features given that there's so much attention on it.
Except that you can use Python in most of the use cases you mention:
ml --> Python
dataviz --> Python
beginner-friendly --> Python
web --> JS
want a backend framework but wanna choose a language that's easy to hire for --> Python
huge, stable community --> Python
quick scripts --> Python
high-level systems --> Python
low level systems --> Rust, C/C++
Yeah sorry if I wasn't clear. That was my point. These are all the areas where Python is currently used, but there are better options for all of them now
More stable than Python, that's for sure. You could just go with React or Node (depending on your use case), pick a build tool, and be set for half a decade. That's absolutely not the case with Python, and I'm not just talking about the breaking changes that happen in some releases. You also don't get a mess of venvs, conda environments, incompatible packages, having to use Docker to deploy your app, etc. I use Python a lot, and I like it, but JavaScript is much more stable as a platform for quite a lot of what Python is generally used for.
Hopefully, but I doubt it'll do so any time soon. Even if it does, JS will still have momentum because of the lack of tooling and setup needed to use it. It's still gonna be in every "intro to web dev" tutorial ever for many more years to come
> I don't understand why we don't just teach JS or TS as a starting language.
IMO the main problem is pedagogical. JS is a vast sea of conveniences that contradict conventional programming practices. This is also the reason I hesitate to recommend R as a starting language, for all that it's the de facto language of statistical analysis. Both languages are hard to teach in a coherent way, because both have lots of gotchas and exceptions that aren't bad, per se, but in the aggregate impede understanding of fundamentals.
Just yesterday I was working with someone on fundamentals in R, distinguishing between literals, variables, and functions. He got the idea, and then we got to model specification (`lm(y ~ x)`) and I had to shrug and say, those "identifiers" are different. I shudder to think of trying to teach types with a beginning student in JS.
> I'm sure people will become more critical of Julia if it successfully becomes the de facto for those things but there's a big difference in that Julia is very focused on those things whereas Python is trying to be as general purpose as possible
I see that as more of a point in Python's favour than Julia's. IME languages that are designed as general-purpose languages tend to be better than languages that are designed for niche purposes, even within those niches. The things that make a language good require consistent, general thinking.
> IME languages that are designed as general-purpose languages tend to be better than languages that are designed for niche purposes, even within those niches. The things that make a language good require consistent, general thinking.
Julia is a general purpose language, while being the best tool for scientific computing.
Right but Julia is specifically built against the criticisms made about Python in those spaces. All I'm saying is within that space there's gonna be less to critique
> For many years, it has been popular to predict that FORTRAN soon will be a dead language, to be replaced by a "better" language. Despite the undeniable fact that there are more eloquent, rich and powerful languages available, FORTRAN continues to be used widely and, indeed, is still the most popular programming language for scientific and engineering applications.
> Every time Julia is mentioned on HN, I see a surprising number of people who dislike the language for whatever reason.
I am mostly indifferent to the language (love the Jupyter equivalent tool Pluto, dislike Pkg, love the simplicity of matrix operations, dislike being pushed towards the REPL...) - I am increasingly irked by the observation that in topics like this the Julia community comes out in force to downvote anyone who expresses a negative opinion.
I still prefer Matlab/Octave for numerical exploration, after initially being excited for Julia but later becoming disillusioned with it.
I’ve always found Matlab the language to be small and simple, with just a few oddities. The library is where most of the warts accumulated, but I’ve used it long enough to learn them. Performance is much better than python - very fast matrix operations and Java-like speeds for everything else, with comparable startup times.
Octave is almost the perfect tool for numerical computing. Only problem is that loops are slow. Some millionaire out here should pay a "stallion" to write a good JIT octave interpreter and the world would be such a better place!
Re Fortran: it should be a really sensible choice for writing floating-point-themed Python extensions. It has a usable standardised interface to C (as of... F2003, I think), so generating a stub that glues a Fortran module onto Python's C API would be straightforward. In ye olden days I wrote the numeric kernels in Fortran and distributed them across machines in C++; I wouldn't be averse to doing the same thing today.
The main argument against that is probably that it's possible to get the same sort of codegen out of C++, and native Fortran programmers are somewhat scarce, though I'd bet on it being easier to teach to domain people than Julia or C++.
For me it's the fact that for probably over 10 years now I've been reading about how amazing Julia is and how it's going to become much bigger than Python in data science, yet I've never seen any evidence of that. It gets tiring after some time, and yet the Julia community keeps making these claims.
Nim is more of a Golang-Python hybrid; people like me who learned Python first find the Go syntax offensive and the focus on error handling to be very pessimistic and boring to write, but creating something with Python which you can just compile and distribute is a pain in the ass. Nim solves both problems nicely. Sadly it is too much of a minnow in terms of available packages, e.g. for things like mature cloud SDKs.
It's catering to Fortran developers. Matlab was also catering to Fortran developers (you can read up the history of Matlab). And I wouldn't discount HPC guys as non-developers.
I thought so too but it bothers a lot less now that the default style of iteration is "for x in y" instead of looping over indices. In fact I barely think about it.
And, yes, for math-y stuff that you are going to be reading off a book, it helps keep things more compatible.
I have confidence in Fortran continuing to exist in a useful state for my lifetime and beyond.
If Julia had evolved into a community-managed language in the same way that Python has, I would have a positive outlook on it - but I am nervous about the efforts to commercialise it (which mostly just seems to be a case of skimming value off of financial institutions). It is totally understandable why the creators went down this route, but I just wish they had been more idealistic rather than getting on the "everything is a VC-backed company" bandwagon. While I hope Julia continues to grow, I don't trust its longevity enough to put serious effort into using it for myself.
Julia is community managed. JuliaHub (previously known as JuliaComputing) does do a bunch of work on Julia, but doesn't "own" the language the way Google owns Go. There are a number of Julia developers at various colleges and companies with exactly the same amount of control over Julia as anyone at JuliaHub.
Sorry, not buying it. Entirely possible it is just a perception problem and not a real one, but the situation is just not comparable to a language like Fortran.
Google owning Go is a big downside, but I think people know that going in - also, Google makes no attempt to monetize the language.
The situation is very different from Fortran, but mostly in good ways. The Fortran standards committee's meetings aren't open to the public, and they publish a new version of the standard every 5 years that takes another 5 years for the compilers to catch up to (if they ever support the new version). In the Julia world, you just make a pull request on github, and if it's something that needs a discussion, it gets discussed in a meeting that happens every other week and is open to and takes input from anyone.
As someone who has used and contributed to Julia, I find Yuri Vishnevsky's arguments about correctness totally fatal [0]. And Yuri's examples are not obscure, they are totally realistic scenarios where Julia silently returns wrong results in ways that may be hard to detect. I do mostly scientific computing, and the idea that I might have to retract a paper because of a bug in the programming language is intolerable. It doesn't matter how beautiful and fun to write and develop the language is if it can't be trusted to return correct results.
Software has bugs. That's the way it is. You may think that Julia (though I suppose this is mostly about the ecosystem of packages around Julia) has too many bugs. Then you can use something else, like Python. If you move from Julia to Python, you may want to use NumPy? Pretty cool project. It currently has 1.9k open issues on GitHub, and if you filter by bugs, it has 599 such labeled issues. How many of those are issues like in the post? I don't know. The same applies to SciPy. For example, the Gaussian hypergeometric function returns wrong results for some input values: https://github.com/scipy/scipy/issues/3479. This issue was filed in 2014. You can find similar old issues in Julia packages. That's how these things go. Luckily, many of the issues listed in the blog post are fixed.
If you think that picking any language and library combination, with a semi-high requirement for the number of features you want already implemented, will be able to fulfill the "this has to be completely correct or I won't use it for my research" requirement, you will have a hard time.
The last part of the post seems to be about OffsetArrays.jl. Many people who have implemented libraries and who care about composability and generic input also agree that the Base AbstractArray interface is not perfect or complete, and sometimes the issue is that the interface that does exist is not followed well enough for composability to work. A more complete, agreed-upon, and generally adhered-to interface for an "AbstractArray" would be nice and has been, and is being, worked on and discussed by many people in the community.
SciPy (I believe NumPy's parent project) just straight up has wrong formulas in some of its methods, because they are legacy and changing them would change the results of people's code.
Hamming distance was one, if I recall correctly, that wasn't correct, as well as a few others in its parent module.
I still use the package, of course, because it's great. But given the disconnect I saw, I'm still careful when documenting which method is used (and use a comment to clarify). Most of the time it isn't a huge deal.
> is if it can't be trusted to return correct results
If you need correct results, the trick is to not trust anything. Any results that come out need to be validated. This can be done by running the equations backwards to see if the initial conditions and/or boundary conditions hold. For example, when I wrote matrix inversion routines long ago, I'd check that multiplying the original by its inverse produced the identity matrix.
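A toy version of that matrix-inversion check in Python (2x2 only, just to show the shape of the validation):

```python
# "Run the equations backwards": invert the matrix, then check that
# M times its inverse reproduces the identity within float tolerance.

def inv2x2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det],
            [-c / det, a / det]]

def matmul2x2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

m = [[4.0, 7.0], [2.0, 6.0]]
prod = matmul2x2(m, inv2x2(m))

# Validation step: compare against the identity with a tolerance,
# since exact equality fails under floating point.
ok = all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
         for i in range(2) for j in range(2))
print(ok)  # True
```

The same pattern generalizes: whatever the forward computation is, find a cheap inverse or residual check and run it on every result rather than trusting the library.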
This undated blog post (published in about May this year) is not the last word on Julia. There have been fixes made to issues brought up.
Julia's authors get credit for trying to rewrite many core compute/LA libraries from scratch in a composable manner rather than using ancient Fortran code nobody dares touch. That said, the article was a good wake-up call to the community that testing and correctness are vital for trust.
Edit: First paragraph removed as the person I referred to disagrees with me and asked me to make an edit.
> Julia's authors get credit for trying to rewrite many core compute/LA libraries from scratch in a composable manner rather than using ancient Fortran code nobody dares touch.
You can go even further than that. OpenBLAS, the BLAS linear algebra library that things like Julia and NumPy use by default now, was an MIT Julia Lab project. The developer was a research software engineer in CSAIL under Alan Edelman! And then the openlibm open-source math library? Also made by the Julia developers. Julia developers like Steven Johnson and Doug Bates are also the ones behind some of the ancient classic codes like FFTW and nlopt. So in many cases they aren't just rewrites but rewrites by the authors of some (but not all, of course) of the ancient codes!
Yes, specifically to shift focus from academic work to being "full time" on OpenBLAS for a bit and clean it up as it started getting a lot more traction. Shortly after completing that, he passed on the maintainership of OpenBLAS, and that was how it sat until someone came along late in 2017 to pick it up from there.
This ends up being why a lot of people in the Julia Lab learned the internals and started creating pure Julia BLAS implementations, which continues to this day with things like LoopVectorization.jl and Octavian.jl (though Chris Elrod was not around during that time, so he comes with his own influences)
It was cobbled together from msun and other sources to make a complete, portable, open source, liberally licensed libm, which—shockingly enough—did not exist in 2010 when we put it all together.
> This undated blog post (published in about May this year) is not the last word on Julia. There have been fixes made to issues brought up.
And yet this is an anticipated reaction:
> Whenever a post critiquing Julia makes the rounds, people from the community are often quick to respond that, while there have historically been some legitimate issues, things have improved substantially and most of the issues are now fixed.
> For example:
> [...]
> These responses often look reasonable in their narrow contexts, but the net effect is that people’s legitimate experiences feel diminished or downplayed, and the deeper issues go unacknowledged and unaddressed.
> My experience with the language and community over the past ten years strongly suggests that, at least in terms of basic correctness, Julia is not currently reliable or on the path to becoming reliable. For the majority of use cases the Julia team wants to service, the risks are simply not worth the rewards.
If I recall, the same argument was used to justify Octave's Fortran coded libraries for stats. Primarily, it was an argument about historical consistency in the assumptions people agreed upon at NASA.
Julia's library compatibility is still an issue, but it's hardly related to the language itself. Unlike many languages, Julia has regression testing for its graphical plotting output. This means it can, and usually does, check that a given output matches a previously known function's output across each release iteration. So unlike many languages that fix/ignore various external library versions, Julia actually checks that the same data gives the exact same results every time.
The caveat is that people must report issues to the project, or add regression tests to ensure correctness persists throughout the ecosystem. =)
I just read the blog post. The first issue was acknowledged the same day and fixed in a commit two days later. All issues were noticed quickly and corrected in the following months.
Now, there are of course much more mature options if that makes you comfortable, which is totally respectable, but saying there are bugs in the programming language borders on spreading FUD.
> I do mostly scientific computing, and the idea that I might have to retract a paper because of a bug in the programming language is intolerable
How do you feel about bugs in the number system? Because let me tell you, IEEE 754 doesn't give correct results.
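To pick a concrete instance of what's meant above (nothing Julia-specific; any IEEE 754 implementation behaves this way):

```julia
# 0.1 and 0.2 have no exact binary representation, so their
# double-precision sum is not exactly 0.3.
0.1 + 0.2 == 0.3          # → false
0.1 + 0.2                 # → 0.30000000000000004
isapprox(0.1 + 0.2, 0.3)  # → true; approximate comparison is how you
                          #   actually compare floats

# Floating-point addition isn't even associative:
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)   # → false
```

None of this is a bug in any language; it's the cost of fixed-width binary floating point, and every numerical program lives with it.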
Hate to say this, but checking for correctness is your job. There is no programming system that will be 100% correct. Yes, it sucks when a language breaks your mental model of how it works (and I'm not speaking for Julia; I don't use it anymore, and it's possible the correctness bugs are egregious), but you should be writing extensive tests if you're truly worried about having to retract a paper. If you, say, use Python, you'd better triple-check that no dict is getting mutated under the hood when you pass it into a function, etc.
> The Julia correctness issues are things like basic math functions, called in simple normal ways, returning wrong numbers.
If you have an example of this, I'd be interested in tracking it down. I didn't see this in Yuri's blog post, nor am I aware of any egregious examples of this right now.
> Hate to say this but checking for correctness is your job.
Yes. But also No.
The "No" case being if you run your data through a well established program or library already known in academia to have been scrutinized to give accurate results to the 20th decimal place of precision.
The distinction is pointless when you care about writing code that gives the right answer. What are you going to do - write Julia code without using the standard library?
Not particularly. Unless the bug is in some complicated compiler inference code, the Julia stdlib is just like any other Julia package. If you can fix it in a library, you can fix it in Base Julia.
Let's at least be clear what happened in the examples of the post. Julia has correctness checking, and it's turned on by default. The only way to turn this kind of stuff off in normal usage is locally, i.e. putting `@inbounds` and such on a block of code. That blog post is about some cases where, in some libraries (not even standard libraries really, some statistics libraries), people put `@inbounds` in code to explicitly turn off safeguards but did not do the due diligence to ensure that the code they'd made unsafe was actually correct.
Note a few things here. One, this is some user deciding to turn off the safeguards, not necessarily Julia being unsafe but instead Julia being flexible and some users abusing that without checking it well. Two, you can run Julia in a way where this feature is disabled: just running `julia --check-bounds=yes` and all `@inbounds` markings are ignored. So in theory this entire blog post could've just been "please run Julia via `julia --check-bounds=yes` and you get better error messages". Note this isn't something that's obscure: literally every time you run `]test` or use package CI it puts things into this mode, so it's used daily by almost all developers. Third, you say you do scientific computing but this post was about statistics libraries.
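To make the mechanics described above concrete, here's a minimal sketch of what locally opting out looks like (the function name is made up):

```julia
# `@inbounds` locally disables Julia's default bounds checking inside the
# annotated block; the author takes responsibility for every index used.
function unsafe_sum(v::Vector{Float64})
    s = 0.0
    @inbounds for i in 1:length(v)   # the claim being made: i stays in bounds
        s += v[i]
    end
    return s
end

unsafe_sum([1.0, 2.0, 3.0])   # → 6.0
```

Launching with `julia --check-bounds=yes` turns every `@inbounds` in the program into a no-op, restoring the checked behavior everywhere; that's the mode `]test` and package CI run in.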
Look, it's not perfect. What I'm hoping for is that Julia's compiler soon improves its tracking of certain effects like size and uses that to emit code without bounds checking when it proves the code stays inbounds. If that happens, then `@inbounds` wouldn't be necessary, and this would go away completely. But for now, we have a system where, when you need to, you can opt out, and users can explicitly override all opt-outs with just a single command line argument. I think that's at least a better solution than, say, MATLAB, where out-of-bounds assignment resizes the array without an error, or Python and R, where out-of-bounds indexing can just wrap around and give you a value without erroring. At least Julia chooses safety by default, and we need to bonk a few people on the head to stop turning off the defaults before checking correctness (in the name of "more performance", which could soon be replaced by the compiler simply doing this as an optimization, further reducing the need for people to do it).
And another change that can be done to Julia here is that all of the features that can be unsafe, like `@inbounds` and `@fastmath`, can be moved to an Unsafe.jl standard library, so that packages have to explicitly import Unsafe.jl and you can check the Project.toml to see if a package uses it. That paired with command line overrides to disable any of these features would give very clear warnings and workarounds for users. What you'll find with this is that there is a very small set of packages using it. And unlike some other languages, Julia makes these unsafe triggers local, so hopefully we can further localize and cage any usage of this to be a very small surface area. To me, that seems like something that is at least solvable, in comparison to other language designs where such unsafe behavior is global to the whole language (I'm looking at you Python, https://moyix.blogspot.com/2022/09/someones-been-messing-wit... bites so hard every time I try something...)
This reminded me of how excited I was by Godot game engine, then after only three months or so I lost all faith in the developers and moved on.
If you are building a tool you expect other people to use, "correctness" needs to be priority number one. After that you can think about speed and aesthetics.
Practically speaking, what's the difference between your program giving incorrect results because you wrote a bug in your code, and it doing so because of a bug in a third-party library, or in the language itself?
I mean, the end result is the same. In all cases it's debuggable - the language itself is also just a large piece of software. And if these things are practically the same, and 90% of your bugs come from your own code, and 90% of the remaining ones from third-party libraries, how much do the last 1% bugs really matter?
I used to work at an HPC research institute. While I agree with your sentiment, most statistical software has bugs that affect results.
The pandas bug tracker stressed me out on a daily basis. I've also found bugs that affected results due to missing instructions between processor architectures, erasure coding in wire protocols for storage systems, and even a kernel bug.
But by far the most common was by the author of the model themselves.
Indeed. Though for Julia, the issue is that there is a way to locally disable correctness checking (`@inbounds`) that some developers have gone too far with. But there is also a global way of re-enabling it (`julia --check-bounds=yes`). This is at least a better position than, say, the Python ecosystem, where there are global correctness overrides that can be disabled in a way that affects every single library (https://moyix.blogspot.com/2022/09/someones-been-messing-wit...). Julia needs to improve this more; I posted in detail about two features that would go a long way toward improving it. But singling out Julia seems to really miss the context of where these features come from and how they relate to other languages.
Great writeup. Minor comment about the portion of the post mentioning Conda being glacially slow: Mamba [1] is a much better drop-in replacement written in C++. Not only is it significantly faster, but error messages are much more sane and helpful.
That being said, I do agree that Pkg.jl is much more sleek and modern than Conda/Mamba.
Conda should have set the default for channel_priority to strict a while ago; it significantly reduces the search space [0], and it reduces the chances of the solver finding a poorly-specified-yet-nominally-compatible package build on some less rigorous channel. With strict channel_priority and conda-forge as my priority channel, it hasn't taken more than maybe a minute to solve dependencies (assuming the target packages are on the priority channel) for any of my own envs, although when I try to recreate envs that installed packages from many channels, I have experienced some long waits.
Agree with all of these points. Multiple Dispatch takes a bit of getting used to, but once you get it it just makes a lot of sense and allows for an amazing amount of code-reuse. And, yes, Julia's REPL is sooo much nicer to use than Python's - I always kind of cringe when I've gotta go back and use the Python REPL as it's so primitive by comparison. You can even tab-complete filenames in the current directory in the Julia REPL. Say you want to open a file called "example.log":
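A plausible reconstruction of the example this is leading up to (illustrative; `<TAB>` marks the keypress):

```
julia> readlines("ex<TAB>        # pressing Tab inside the string literal...
julia> readlines("example.log")  # ...completes the filename from the
                                 #    current directory
```

Path completion works right inside string literals in the Julia REPL, no shelling out required.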
Yea. The comparison with the Python REPL makes me distrust this post a little.
Any Python user who’d be interested in using Julia is or should be familiar with ipython. (If you’re not, it’s a better REPL, originally made to ease iterative, scientific-style programming in Python, made by the same people behind Jupyter, which was originally called the IPython notebook.)
Yes, Python prides itself on its standard library, and it’d be nice to have a better standard REPL. If that were the point being made, I’d appreciate the post more.
But, Python is a large and diverse ecosystem, so having different tools coming out of different domains all sharing the same language is part of the value, so it’s not the most hard hitting critique of Python.
That the author made such a point of this, in my eyes, makes them seem a little too keen to dunk on Julia’s major “big-brother” competitor rather than being as fair and accurate as they can. Though I haven’t read their previous Julia-critical post, so I may very well be overreacting. Still, use ipython.
> Any Python user who’d be interested in using Julia is or should be familiar with ipython.
I would have assumed so too, but the number of (reasonably competent) Python programmers who are shocked to learn there are alternate REPL interfaces would suggest otherwise. I was lucky enough to get into Python when these things were still new and shiny, so I came across ipython/bpython pretty early on. But nowadays the very fact that ipython is well-established means that people don't talk about it as much, assuming everyone knows about it, so it's very much possible that someone comes across news about Julia (which is still "new and shiny" enough to be in the news) and doesn't know about ipython.
Defaults matter, and are the only thing used by a large segment of users, even in today's information-rich and one-tap-install world.
I remember "upgrading" from ipython as a terminal REPL to ipython notebooks when it was a shiny new thing.
Since then, though, I've more or less joined the anti-notebook camp and use the ipython REPL almost exclusively. My workflow, interestingly, is a callback to how MATLAB, RStudio, Spyder (and probably other IDEs, especially those catering to data or scientific work) traditionally facilitate work: editing source alongside a REPL, with some interaction between the source code and the REPL for a sort of quasi-REPL-driven development. With ipython as a REPL, and the ability to just dump code from source into it, I find notebooks way too bloated for most needs, while my editor provides the niceties of an actual text editor or IDE with all the normal linting etc. that Jupyter was still struggling to provide last I checked.
As you imply, it's a bit of a shame. Ipython (REPL) and the kernel interface are probably the best parts of the jupyter ecosystem.
Thanks for sharing. Being a developer of a lot of libraries, it's easy to spend every day just focused on what's bad right now. Indeed, that's what you do, because that's what needs to get fixed! But it's nice to take a step back every once in a while and see what has been accomplished. I think there's a long way to go still, but there is a lot of good in there. Thanks for taking the time to highlight it.
I wanted to like Julia, but have been unable to. If I want fast numerical/scientific code, I have Rust. If I want to prototype numerical algorithms, plot things, prototype, REPL etc, I have Python. I was hoping Julia could replace the latter, but it's too slow-to-compile; need fast iteration.
Julia's numerical syntax is unbeaten. I wish other languages would adopt it.
If compile time is a major bottleneck and you want to use Julia, I highly recommend using -O1. Julia uses -O2 by default which is faster, but has a quite notable compile time impact (often 2x or so).
Latency / "ttfx" for all sorts of things has also dropped dramatically over the past ~year, and will get another big boost with the native code caching PR
It is already doing what V8 does: it only compiles when needed. The problem is when you start putting too many things in the startup script, which compiles everything it says it "needs", even though you may never use them when running.
Julia actually already has an interpreter that it uses for some stuff (like code in the global scope). Adding more of a tiered JIT approach is on the to-do list, but it's a very large project.
> Rust has the most beautiful runtime/toolchain but a doubtful syntax.
When you want to use it for the type of number crunching programs that Julia is made for.
Rust is made for making programs that currently C and C++ are used for: every tick counts, close to the metal kind of code (kernels, drivers, hi-perf, soft-realtime, etc). Julia is not even considered in that space.
What if you want both though? IMO projects where Python wraps a compiled language is likely to get you the worst of both worlds.
The latency of Julia does really suck though. I would wait it out if I were you, and check in in two to three years to see if it's better (which it very much should be!)
Do you use the REPL-driven workflow in Julia? With Revise.jl you only really have to pay the startup cost once; subsequent changes you make are usually fast enough to feel instant.
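Roughly, the workflow looks like this (Revise.jl and its `includet` are real; `myscript.jl` and `main` are placeholders):

```julia
using Revise               # load Revise before the code you want to track
includet("myscript.jl")    # "tracked include": edits to the file are watched

main()                     # first call pays the compile cost
# ...edit and save myscript.jl in your editor...
main()                     # Revise re-evaluates only the changed methods,
                           # so this call reflects the edit almost instantly
```

The session stays alive for hours or days, so the startup cost amortizes to nearly nothing.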
Curious bystander here. Is there a cult follower on HN who can speak to the benefits of multiple dispatch, practically and compared to other languages? I know the sales pitch but I’m curious in particular about which practical coding problems and abstractions become obsolete with MD.
It looks really promising, like “the way things were always supposed to be”, but I haven’t yet developed a full and intuitive understanding of what it means.
For me it solved many recurring problems. One issue that kept popping up for me was the need for a clunky visitor pattern. Say you do collision detection between multiple shapes of different types, or you want to transform a serialized representation of a GUI into GUI elements.
In my mind it also removes a lot of cases where I need major restructuring and refactoring in typical OOP languages. Getting type hierarchies and code organization wrong is so much easier with OOP, and much harder to untangle.
Not always easy to explain with examples. Once you've used MD for a while and try single dispatch (SD) again, you really start noticing how clunky SD makes writing code.
The problems it solves depend a bit on where you're coming from. If you're coming from a CBOO (class based object oriented e.g. Java/C#), the biggest notable thing is that a lot of the Gang of 4 design patterns go away and are replaced with "call/pass a function".
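As a small illustration of "replaced with call/pass a function" (the names here are invented, not from any library): the Strategy pattern, which in a class-based OO language needs an interface plus one class per behavior, reduces to passing a function:

```julia
# The whole "Strategy pattern": the varying behavior is just an argument.
apply_discount(prices, strategy) = map(strategy, prices)

half_off = p -> p / 2
flat_ten = p -> max(p - 10, 0)

apply_discount([100.0, 30.0], half_off)  # → [50.0, 15.0]
apply_discount([100.0, 30.0], flat_ten)  # → [90.0, 20.0]
```

No interface declarations, no classes, and the compiler still specializes `apply_discount` for each strategy it's called with.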
If you're coming from a functional language, the main difference is that Julia will compile specialized versions of functions for your types, and makes it a lot easier to write runtime efficient and generic code.
For a simple example of this in action, consider the following
julia> a = 1//3 # a is the rational number 1/3
1//3
julia> b = 2+3im # b is a Gaussian integer (complex number with integer coefficients)
2 + 3im
julia> a+b # even though these types don't know about each other, appropriate methods get called to get a complex rational result.
7//3 + 3//1*im
This might seem really basic, but you'll be hard pressed to make a system like this in another language without hard coding a list of dispatches manually.
Wrote a fairly complex optimization-based translational trajectory generator for a robot arm. Needed orientation as well. Implemented a simple quaternion type, implemented multiplication and addition for it, and fed it to the translational code. Worked off the bat.
If I wanted to use, e.g., Euler angles or modified Rodrigues parameters or some other representation of rotation, the identical code pattern would work.
That's one of the cool things about Julia: if it can prove the types of the arguments to a function, or at least reduce them to a small set, it can do the dispatch at compile time.
Well, Julia has better than semi-decent support for generics, since every function is generic by definition.
That's an advantage versus other dynamic languages without multiple dispatch, but statically typed languages do not have this issue in the first place. The only other dynamic language used for numerical stuff is Python, but its issues go well beyond lack of MD.
MD can be useful for static languages of course, but it comes up significantly less often and static MD is simply overloading/ad-hoc polymorphism.
To me julia is the perfect mathematical toy language with great fundamentals and prospects of becoming a serious language once the compile time issues are fixed.
It does so many things very right and always from first principles, not ad hoc (ad hacks).
Yes, but I should probably watch it again. Unfortunately it seems most people come from a mathematical angle, so most examples and references are not intuitive to grok for me. I am mostly curious about the implications for general purpose software development.
I like it a lot even though almost all of my functions have no dispatch (only one implementation) or single dispatch (implementation differs based on the type of just one of the arguments).
Firstly I like the aesthetic and implications of f(x, y, z) over x.f(y, z). I like that f is its own thing and not a property of x.
Secondly, it's really nice to be able to reach however far down the callstack you need to go to patch some function "owned" by someone else to act correctly or more efficiently with whatever your type is. You can get correct code most of the time with single dispatch and well-thought-out interfaces, but getting interfaces right in advance is really tricky and requires coordination with other users of the interface whereas multiple dispatch can often work well without much coordination and can also handle cases where you want the implementation to vary dramatically based on the type (the OneHotVector in Karpinski's "Unreasonable effectiveness of multiple dispatch" talk is an example of this).
With multiple dispatch you can extend the types of your data and functions that process it independently of each other. For example, some people can extend floats to intervals (to keep track of precision) and some other people can write any code that operates on abstract floats. They don't have to communicate and agree on things and conventions, the composite code just works.
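A compressed sketch of exactly that scenario (the `Interval` type here is made up for illustration; real packages like IntervalArithmetic.jl do this far more carefully):

```julia
# Party A: a new number type, knowing nothing about anyone's algorithms.
struct Interval
    lo::Float64
    hi::Float64
end

Base.:+(a::Interval, b::Interval) = Interval(a.lo + b.lo, a.hi + b.hi)

function Base.:*(a::Interval, b::Interval)
    # The product interval is bounded by the extrema of endpoint products.
    p = (a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi)
    Interval(minimum(p), maximum(p))
end

# Party B: generic code, written with no knowledge of Interval.
poly(x) = x * x + x

# The composition just works, propagating the uncertainty:
poly(Interval(1.0, 1.1))   # → roughly Interval(2.0, 2.31)
```

Neither party imported the other's code or agreed on an interface beyond `+` and `*`; dispatch wires them together.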
For a new programming language XXX I look for a book like the "K&R for XXX". I see some older ones from Packt, but nothing like the official Rust Book, for example. There is "Think Julia" (O'Reilly), but it's older.
0. Watching this [0] Fireship video to get the “big picture” on the language.
1. Reading the “Noteworthy Differences from other Languages” [1] section of the manual, if your language is on the list.
2. Then, if you're hurried, read Learn Julia in Y Minutes [2], or otherwise read the official manual [3] in order, skipping sections you're not interested in.
In case you're interested in scientific computing specifically, you can start from Chris Rackauckas' course on Parallel Computing and Scientific Machine Learning [4]. And if you don't have a straightforward application to test Julia on while learning, I would recommend doing the exercises available in Exercism [5], which are well-made and provide access to awesome mentors.
As for books, Think Julia is pretty good imho, even if it's a little bit outdated.
I'm hoping the new SciML docs can become a good enough source for beginners looking to do scientific computing (https://docs.sciml.ai/Overview/stable/). It's not there yet (we literally started redirecting links to the new docs on Monday, that's how new it is), but it's already moving in the direction of having a lot of materials for new users (in scientific computing specifically; this is not and will not be a general Julia resource) before ever hitting deeper features.
It is not exactly a K&R style book but I did write it to appeal more to regular programmers than academics. The Julia world is very dominated by researchers which is reflected in how it is often presented.
It isn’t meant to show directly applicable code examples, but focuses more on things which could be fun and interesting to implement, such as simulating rockets.
3. Look at docs/source code to get what I'm missing
4. Repeat
Julia docs are good for step 3. I don't think there are many super technical books like the one you're looking for, since Julia doesn't really have an industry that would produce them, and since the core is still being developed and all that.
I like to start off hands-on, and I found this appendix [1] a quick way to dive into the language. It's not comprehensive, but a great pragmatic gateway into the language for the experienced and impatient programmer.
I'd like to point out two things I _don't_ like about Julia that seem to be idiosyncratic but nevertheless matter to me:
1. Rather than use something like reStructuredText or Texinfo they used Markdown, and then of course had to define a bunch of custom additional stuff to make it work for documentation, plus a custom program to publish the documentation to a limited number of formats. Unlike the two documentation languages I mentioned, the Julia "customized Markdown" isn't publishable to Info, so it can't be used in Emacs: rather, it's basically just a bunch of web pages.
2. Julia is completely dependent on LLVM. Even Rust (which was initially like that) now has a GCC-based implementation.
It's funny, Julia used to use reStructuredText and I could _never_ get the syntax right when doing doc changes. Markdown is significantly more accessible in the common case of just writing prose and code examples.
Interesting. I've heard that complaint before and I'm not sure how to address it since I have only used reStructuredText in what are likely smaller projects.
I think it's strange for a language to absolutely rely on the architecture of one compiler. Ideally it should have at least two that are both under FLO (free/libre/open) licenses.
IMO chasing multiple compilers is a bad idea for almost any young language. It more than doubles the amount of time it takes to improve anything and while there are benefits, they are fairly minor. You essentially are asking a language to not fix anything in the compiler for a year for the advantage of making it harder to fix things in the future.
Sorry I wasn't clear enough before. I should have said "I wish Julia worked on a compiler licensed under the GPL." The only major one I know of is GCC. This is not just a "minor" benefit. To quote from the post I linked to:
"The nonfree compilers that are now based on LLVM prove that I was right -- that the danger was real. If I had 'opened' up GCC code for use in nonfree combinations, that would not have prevented a defeat; rather, it would have caused that defeat to occur very soon.
...The existence of LLVM is a terrible setback for our community precisely because it is not copylefted and can be used as the basis for nonfree compilers -- so that all contribution to LLVM directly helps proprietary software as much as it helps us."
Regarding 1., Markdown is much more accessible to the average person, especially the average programmer, than the formats you mention. Markdown is basically becoming a common-knowledge thing where if a person is at least somewhat tech-savvy, you can assume they know at least some basic Markdown.
My point was that the Markdown format itself is not complete enough for the task at hand, i.e. making documentation. So all the people involved have to 1) learn all the customized extensions specific to this one project (and remember to differentiate them from the custom versions in other projects, e.g. Rust's), and 2) learn a custom deployment system.
I'd rather just learn one thing and use it for many different projects.
The error messages you mentioned in here have been completely overhauled. In fact, most things in SciML are now caught and throw very high level error messages. We also revamped the whole documentation and added docstrings everywhere. See https://sciml.ai/news/2022/10/08/error_messages/ . We're also in the middle of rolling out a new documentation (https://docs.sciml.ai/Overview/stable/) that has a lot more of a split between tutorials and references. It's not complete, but the core push of this should be completed in about 2 weeks. As for loading times, we've transformed those as documented in https://sciml.ai/news/2022/09/21/compile_time/ (taking a core case from 30 seconds to 0.1 seconds), and Julia v1.9 is releasing a feature where package precompilation can store LLVM-compiled binaries.
So I think most of the blog post has already been addressed?
The one thing we haven't done is improved type printing. I am with you on that one, and actually opened a Base Julia issue about it way before your blog post: https://github.com/JuliaLang/julia/issues/36517 . It requires a Base Julia fix though, so that's a bit out of my hands. Also, I think it would be good for Base Julia to do a bit of the error message interpreting that SciML has done, specifically for broadcast (https://github.com/JuliaLang/julia/issues/45086). So there are some more improvements to be done, but I don't think the blog post is up-to-date given the overhauls that were done in the summer of 2022 (thanks to your feedback).
It would be nice to hear updated thoughts when you have a chance to try all of these improvements (especially when v1.9 comes out with the cached binaries)!
> It would be nice to hear updated thoughts when you have a chance to try all of these improvements (especially when v1.9 comes out with the cached binaries)!
I finished my Master's in physics. Now I'm working in IT, and even though physics will always have a special place in my heart, I probably will not return to it.
If I find myself doing numerics, I might give Julia a second try, but I don't think that will happen anytime soon.
Maybe if I buy an SDR and have some fun with it, I might use Julia.
For me, Julia lags in deep learning compared to Python and lacks many valuable packages like PyTorch and PyG. Another reason that stops me from adopting Julia is that it lacks good learning resources. I can grasp Python within an afternoon. But the same thing is pretty hard for Julia.
>I don't know much about the intricacies of multithreading, so I couldn't tell you if Julia's approach to multithreading is groundbreaking or well designed. What I can tell you is that in Julia it is so easy to use multiple threads effectively that, even though I don't feel that confident writing complex multithreaded code, most of my performance-sensitive scripts are multithreaded where it's obvious to do so.
Threading may be easy but the approach they are taking is not very performant for various hardware-related reasons.
when choosing a foreign language to learn most normal people would not examine which one has the most elegant grammar, the most performant representation of information or any such abstract design elements.
people ask instead: whom will I be able to communicate with. what kind of cultural repository (wealth) do I get access to. how will I benefit to justify spending the limited opportunities I have to reprogram my brain.
the programmer-to-programmer aspect of computer languages is not appreciated enough, especially not in the modern, highly networked and open source way of developing software. yes, the primary purpose is to marshal ones and zeros in silicon chips and it does matter if the language helps you do that in a sane way but this is just one of the many factors people consider.
I think, unfortunately for Julia and other aspiring challengers, right now the primary social network dynamic among developers is "winner takes all". e.g., it's not a deep secret that the meteoric rise of python is because it enabled countless people to aspire to be part of the perceived AI/ML/FAANG gold rush. Over time programmer eyeballs motivate investment in ecosystem evolution and a benign spiral is set in motion.
I don't have a crystal ball as to when/if/how the current pyfever might cool down. You can't predict those things; they are an emergent property of our very complex networks. Maybe some killer feature of julia or rust or go or dart or kotlin will trigger some regime change, but unless it does, any discussions of alternate greatness have a distinctly detached feeling to them.
> people ask instead: whom will I be able to communicate with. what kind of cultural repository (wealth) do I get access to. how will I benefit to justify spending the limited opportunities I have to reprogram my brain.
Choosing the lowest common denominator language to be able to communicate with the “most number of people” is a specific bias (nothing wrong with it). People pick languages with many different motivations — something might sound/read elegant, or there’s a specific group of people they want to engage with (Eg: anime enthusiasts, kernel developers, statisticians, etc.)
> I think, unfortunately for Julia and other aspiring challengers, right now the primary social network dynamics among developers is "winner takes all".
Only for certain use cases. Between Python and Julia, there’s exactly one language I want to program in, and will strongly prefer every time I can get away with it.
I don’t want to diss on Python — its ecosystem/workflow is a substantial improvement over the previous era, but it has so many warts (for certain fundamental needs, Eg generic programming) that you can never fix the problem “additively” by extending the ecosystem instead of radically redesigning the language ground up.
> People pick languages with many different motivations
that's definitely true, but my observation is that collective behaviors in the adoption of a communication channel have a network dynamic that can dominate rational factors. In previous eras people would learn German or French to get access to scientific/technical literature. At some point English simply became the de facto language for sharing such knowledge, irrespective of whether it was the most adapted for this. This does not mean that other languages go extinct; it just creates an allocation that does not seem to correspond to "objective" features.
> you can never fix the problem “additively” by extending the ecosystem instead of radically redesigning the language ground up.
that's an interesting view but there are important counterexamples :-). you can see, e.g., the rapid developments in the last few years in the C++ space. essentially what can happen in response to pressure (from other ecosystems) is the development of dialects and multi-paradigm languages (abandoning full backward compatibility and/or simplicity). similarly one would think that if python continues benefiting from lavish attention, at some point there will be a python 4 etc. you might argue that this is not really "fixing the problem", but in the context of delivering useful products here and now this becomes a bit academic
> have a network dynamic that can dominate rational factors.
Yes, but only if they are all equally expressive, or if you only care about the lowest common denominator. In a situation with an expressiveness gradient, the distinction will always survive. I think this insight is articulated well in PG's essay on the blub paradox.
Eg: English is a pathetic solution if you want precise/expressive communication for the purpose of engineering -- so no amount of network effect will enable English to trump equations even if it were some new idea (language) like calculus competing with a thousand year old language.
> similarly one would think that if python continues benefiting from lavish attention at some point there will be a python 4 etc. you might argue that this not really "fixing the problem" but in the context of delivering useful products here and now this becomes a bit academic
Probably the most fundamental tenet (self-professed) of Python is its "simplicity". Whatever that might mean, it seems highly incompatible with growing the language in the manner of C++, and even more so if the changes were backwards-incompatible (as the 2->3 transition demonstrated). The dynamics of language evolution might still bias towards that outcome as the needs of large corporate codebases dominate the beginner's need for simplicity, but at that point Python would have lost most of its soul.
But, regardless of all that, (as a concrete example of the kind of limitation I'm talking about) it seems impossible to support generic programming in Python without giving up on OOP / single dispatch. <Lipstick, pig, etc.>
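To make the single-dispatch limitation concrete, here is a minimal sketch (the `combine` function and its behavior are hypothetical, purely for illustration): Python's `functools.singledispatch` chooses an implementation from the type of the first argument only, so there is no built-in way to specialize on a combination of argument types.

```python
# Illustration of Python's single dispatch: functools.singledispatch
# selects an implementation based solely on the FIRST argument's type.
from functools import singledispatch

@singledispatch
def combine(a, b):
    raise NotImplementedError(f"no combine for {type(a).__name__}")

@combine.register
def _(a: int, b):
    return f"int-first: {a}, {b}"

@combine.register
def _(a: str, b):
    return f"str-first: {a}, {b}"

# The second argument's type is never consulted: (int, str) and
# (int, int) hit the same method, so specializing on both types
# requires manual isinstance checks inside the body.
```

In Julia, by contrast, `combine(a::Int, b::String)` and `combine(a::Int, b::Int)` would simply be two distinct methods.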
> in the context of delivering useful products here and now this becomes a bit academic
I disagree, vehemently :-) When there is an impedance mismatch between the problem domain and the abstractions fundamentally embedded into a language/system, you are setting yourself up for a world of pain -- needing an exponential amount of engineering effort to overcome the barrier, and the solution will still forever be doomed to mediocre quality (at best).
Calling Python (or pretty much anything) "general purpose" in an absolute sense is a bit of a misnomer -- it just means one is blind to the limitations/biases embedded in the technology, or is implicitly focusing on a limited set of problems.
> I think this insight is articulated well in PG's essay on the blub paradox.
From the essay: ".. You were also safe if they said they wanted C++ or Java developers. If they wanted Perl or Python programmers, that would be a bit frightening-- that's starting to sound like a company where the technical side, at least, is run by real hackers".
It's funny that twenty years ago Python could be at least "a bit frightening", versus its current lowest-common-denominator status :-). His argument is probably still valid in the same circumstances (a group of people aiming to solve a tough problem ab initio and in isolation). But the intervening two decades have delivered a lot of lessons about the evolution and adoption of technology stacks in a connected world. I am by far not qualified to summarize all the wonderful and bizarre things that have happened, so I will just mention something I believe is characteristic of that angle: think about JSON and its emergence as the now-ubiquitous data exchange format. It is arguably a regression relative to, e.g., XML or RDF, yet its simplicity has enabled entirely new layers of complexity and the "API" era.
Having said that, it's distinctly possible, even certain, that specific language characteristics will be selected for and shine in the near future. One elephant in the room is the end of the road for easy CPU optimizations and the tough problem of speeding up computation across multi-core, GPU, etc. Another goes back to the JSON effect and the gross inefficiencies of semantics-lite approaches.
Hah, that's funny. Actually, I had to abandon a Julia package of mine because I just couldn't figure out how to debug some threading issues.
What I meant was that, when multithreading is hard in Julia, it's not Julia that makes it hard. Julia gives you exactly all the tools you need, easily available. It's that the problem is hard, i.e. you fuck it up even with good tools available.
In my work though, problems tend to be simple to parallelize.
Especially for scientists, it often is. Embarrassingly parallel is the norm. Multithreading is only actually hard when you have significant interaction between the threads.
I loved the early iterations of Julia. It was everything I wanted… fast, simple, and readable. Somehow the language has become less simple and readable to the point I don’t recommend it anymore
> A better catchphrase for Julia might be "The best expressiveness / performance tradeoff you have ever seen".
Man, Jakob just has a way of writing that gets right to the point. Nice post.
My only complaint about the post is that in Python you have many different REPL options. For example, both ipython and bpython are miles ahead of the Julia REPL. I agree though that the Julia default REPL is better than the Python default REPL. I particularly like the shell mode in Julia. You can also make your own modes, like a sql mode or any custom DSL mode.
I have been very critical about Julia in the past, so I'll say some positive things before I complain more :)
- The package management in Julia is god-like. Specifically, there's a whole subset of packages that are just binaries and libraries cross-compiled on VMs for Windows / Mac / Linux + x86 / x64 / ARM, and it just works. There's no more trying to compile code on a user's computer when installing a package; it's beautiful. It is truly a phenomenal piece of engineering, and I cannot praise it enough. Hats off to the core team responsible for this.
- Multithreading is such a joy to use. Among garbage-collected languages, Julia and Go are pretty much at the top in terms of usability + features. So much better than Python or R. But while I prefer Julia over Go, I do prefer Rust over Julia.
Now for some of my grievances. I'm a nobody in the Julia community so take my word for what it is worth.
- Optimizing Julia code is a joy. However, learning how to optimize Julia code is not straightforward. If you are coming from Python and/or don't have experience thinking about memory, cache lines, references, mutability etc., you have your work cut out for you in terms of learning how to optimize Julia code. On top of that, there are Julia-specific things you need to learn. Do you know what the function barrier pattern is? No? Too bad, now your codebase is 10x slower and refactoring could take weeks. And that's just the theory of everything you need to learn. There are SO many subtle ways your Julia code can be slow because the compiler wasn't able to "see through your code". The tooling here is getting better but still has a LOONNNGGG way to go. Being able to statically check and assert for problematic inferences, à la JET.jl, will improve this for me, but right now JET.jl is too slow to run on a large codebase.
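For readers who haven't met it, here is a minimal sketch of the function barrier pattern mentioned above (function names are invented for the example): when data lives in a type-unstable container like `Vector{Any}`, looping over it directly forces dynamic dispatch on every iteration, while passing a concretely typed array through a function call lets the compiler specialize the hot loop.

```julia
# Type-unstable input: the compiler only knows eltype Any, so the
# loop body dispatches dynamically on every iteration.
function sum_slow(data::Vector{Any})
    s = 0.0
    for x in data
        s += x              # dynamic dispatch each time
    end
    return s
end

# The "barrier": concretize the data once, then hand it to a function
# the compiler can specialize for Vector{Float64}.
kernel(v::Vector{Float64}) = sum(v)

function sum_fast(data::Vector{Any})
    v = Float64.(data)      # one conversion to a concrete element type
    return kernel(v)        # all the hot work happens behind the barrier
end
```

Both functions compute the same sum; the difference is purely in how much the compiler can specialize.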
- Thinking with Multiple Dispatch (MD) is much like Thinking with Portals. Once you get it, you have this ah-ha moment. But the type system is overloaded in my opinion. You HAVE to use the type system for MD but people also use it for interfaces (AbstractArray for example). I think adding inheritance in the abstract type system was a mistake, and a trait or interface like approach would be way better for this. Maybe something like concrete types for dispatch and abstract types for interfaces? I don't think this will EVER change though, not in Julia 2.0 or 3.0 or later, because it is SO ingrained into the Julia community. I'm not explaining this well here but I've complained about it before in previous comments on HN and am too lazy to go find it and copy paste :)
- There's a number of minor syntax / usability gripes I have that I don't think will ever be fixed either. I generally think a programming language should incentivize you to "do the right thing", often by making it easier to type. In Julia this framework of thinking exists but isn't applied consistently. For example, it is easier to create an immutable struct than a mutable struct:
struct Immutable
x::Int
y::Int
end
# vs
mutable struct Mutable
x::Int
y::Int
end
However, if you want to use it to store user data and you choose immutable structs, your interface for users is EXTREMELY annoying. For example, if they want to update `x` from `1` to `2`:
im = Immutable(1, 3)
im = Immutable(2, im.y)
With mutable structs:
m = Mutable(1, 3)
m.x = 2
There are third-party packages that make this easier, but this should ABSOLUTELY be in Base.
I have similar complaints about type names in Julia. I'm incentivized to write `::String` instead of `::AbstractString`, and `::Int` instead of `::Integer`, even though in Julia using `AbstractString` is almost always preferred.
The naming is also quite annoying. Why does `AbstractString` have `Abstract` in the name but `Integer` not, when both of them are abstract types?
I've said this before and I'll say it again. I think Julia core devs should have a usability expert come in and review their whole codebase and workflows and make suggestions. I have no idea how Rust has nailed this so well. In Rust, so many things are just consistent. You can guess what the names or behavior of what you want so often, it's awesome.
TLDR:
If you are thinking of using Julia for a large production codebase, wait 5 more years. I've learnt that the hard way. For personal projects it is amazing though.
> if you choose immutable structs, your interface for users is EXTREMELY annoying. For example, if they want to update `x` from `1` to be `2`
That's the whole point of immutability: you can't "just update". I fail to see how obscure magical updates on immutable (?) structs, like those advertised in [1] or [2], e.g.
using Accessors
@set obj.a.b.c = d
are beneficial. Note that there is _zero_ explanation on the front page of what the above snippet does, how to use it and what actually happens under the hood. One example of why I personally don't have much trust in JuliaLand.
Scala has `x.copy(field=newval)` for its (immutable) case classes. Note how clear it is that one is making a copy. Lenses are also outside of the stdlib (e.g. [3]).
As I understand it, the idea of immutability, and the use of packages like Setfield.jl to change struct fields, is a safety feature for concurrency. When passing one in as an argument to a function, you can be sure it will not be changed during execution (which would otherwise introduce races) without needing to acquire a lock. Also, state machines become easier to reason about when immutables are used.
> I generally think a programming language should incentivize you to "do the right thing"
> It is easier to create a immutable struct than a mutable struct
This is exactly an implementation of that principle - since very often one of the goals when using Julia is performance, it makes sense to incentivize the more performant option. [1]
> I'm incentivized to write `::String` instead of `::AbstractString`, `::Int` instead of `::Integer`. In Julia using `AbstractString` is almost always preferred.
I'm on the fence about this one. I used to think this way, but now I think just slapping Abstract types everywhere in code leads only to fake-genericity. It's the sort of trap that leads to the kind of problems Yuri complained about in his (in)famous blog post.
I'd rather someone type ::String and be artificially limited in what they accept (at least for now, until someone complains), rather than be incentivized to say they accept an abstract type and then end up not actually supporting all of the abstract type interface's flexibility.
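As a small illustration of the trade-off (the function names are made up for the example): a `::String` annotation rejects other `AbstractString` subtypes, such as the `SubString`s produced by `split`, which is exactly where the artificial limitation tends to surface.

```julia
# ::String accepts only the concrete String type...
shout(s::String) = uppercase(s) * "!"

# ...so the SubString{String} values that split() returns don't match:
word = first(split("hi there"))   # a SubString{String}, not a String
# shout(word)                     # would throw a MethodError

# The generic signature accepts any AbstractString, including SubString:
shout_generic(s::AbstractString) = uppercase(s) * "!"
```

Whether you start with `::String` and widen on demand, or start generic and risk an untested interface, is precisely the judgment call being discussed here.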
The best part of Julia for me is just that it has numeric primitives built in, so the different ML etc. libraries don't have to implement their own (incompatible) versions. So Julia code can be 1/10th the length of the equivalent Python, because you literally just plug libraries together instead of having to pipe distinct data structures together.
It also means the libraries are short and easy to understand, instead of having tens of thousands of lines implementing said primitives etc.
I recently wanted to run some Python benchmarks, and was surprised to find that some major Python packages (e.g. Numba) didn't work on the latest Python release.
In Julia, all open source packages that pass tests on the latest release are confirmed to also pass on a release candidate before that candidate can become the next release.
A major package being incompatible with a new Julia version just doesn't happen.
Indeed. There's a nice example from yesterday where someone asked about whether anyone is planning to update ParallelDataTransfer.jl because it's so old (last update August 2018) that it doesn't have any of the modern package development features that became standard post v1.0. But I was like, that doesn't mean it's not working... Julia is stable. And indeed if you run it, it still passes tests on Julia v1.8, more than 4 years later. That's a testament to just how stable post-v1.0 has been. And mind you, this is a library of tooling for distributed computing, so it's touching what is considered a deeper part of the language. Source: https://discourse.julialang.org/t/discontinuation-of-paralle...
I used to really like Julia but over time have heavily moved to the view that my ideal general-purpose programming language would look a lot like Swift, perhaps with some bits of Julia, e.g. the REPL.
Julia doesn't have good support for OOP. I don't think FP is the optimal way to solve all possible programming tasks; usually a mix of both approaches leads to the most simple, maintainable and readable code.
And I'm really not sold on multiple dispatch. In Swift, I would simply implement the join function by adding an extension on `Array<String>`. And the + function would just be, for example, `static func + (left: Vector2D, right: Vector2D) -> Vector2D`. I've never come across a situation where I wished Swift supported multiple dispatch. On the other hand, the fact that Julia doesn't use the object.method() syntax has 2 very clear drawbacks: lack of completion in an IDE and poor readability (example: `object.doSomething(a, b, c).doSomethingElse(a, b, c)` -- imagine how this would look in Julia).
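For comparison, here is roughly how such a chain is spelled in Julia (the functions below are made-up stand-ins for doSomething / doSomethingElse): nested calls read inside-out, and the `|>` pipe threads only a single argument, so multi-argument stages need wrapper lambdas.

```julia
# Made-up stand-ins for the hypothetical chained methods:
add_scaled(x, a) = x + a
double(x) = 2x

# Nested-call style (reads inside-out):
y1 = double(add_scaled(3, 4))

# Pipe style: |> passes exactly one argument, so extra arguments
# require an anonymous function at each stage.
y2 = 3 |> (x -> add_scaled(x, 4)) |> double
```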
> implement the join function by adding an extension on `Array<String>`. And the + function would just be, for example, `static func + (left: Vector2D, right: Vector2D) -> Vector2D`.
1. We are talking about a function that takes two values and spits out a third. In what sense are we talking about calling a method (or passing a message) to an object (the first string) to mutate its state? How do we enable a compiler to reason about the side-effects (or lack) of this code?
2. In mathematics "+" is almost always used for a commutative operation on values, so defining it the way you have breaks an (implicit) interface that is not explicitly codified (in part because current languages do not give us a way to codify this). The ability to reason about such properties is crucial if you want the runtime to be able to automatically reorder/parallelize/etc. without breaking correctness.
3. While these details are minor annoyances in the case of string concatenation (fairly benign problem), they can become really painful when you are trying to leverage the power of generic programming to compose sophisticated algorithms. A classic poster example for this is when you want to add units or uncertainties/error-bars to your numbers and have all your numerical & plotting code generalize seamlessly -- Julia (for example) performs beautifully, while I don't see how any single dispatch (OOP) language can handle this challenge. More examples in this spirit: complex numbers, autodiff with dual numbers, etc.
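A toy version of the dual-number example (not the real ForwardDiff.jl implementation, just a sketch): once `+` and `*` are defined for a user type, generic code written with no knowledge of that type computes derivatives for free.

```julia
# Minimal forward-mode dual number: carries a value and a derivative.
struct Dual
    val::Float64
    der::Float64
end

import Base: +, *
+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)  # product rule

# Generic code that knows nothing about Dual:
f(x) = x * x + x

# Seeding der = 1 differentiates with respect to x:
# f(3) = 12 and f'(x) = 2x + 1, so f'(3) = 7.
y = f(Dual(3.0, 1.0))
```

A single-dispatch language would need `Dual` itself to own the arithmetic methods, and `f` would have to be written against an interface `Dual` implements; here, `f` needed no forethought at all.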
1. We're not talking about a method that mutates state because this method doesn't mutate state. We can add a `mutating` keyword to the method (this is in fact implemented for Swift structs, but not classes).
2. I don't get this. What's the difference from Julia? Whether the operation is commutative is defined by the implementation of +, like in Julia, AFAIK.
3. Multiple dispatch seems useful in these cases, although it's a fairly niche area. I don't know how doable would this be just with generic programming / protocols without MD, didn't think too much about it. I vaguely remember that I didn't like that extending code with MD in a way relies on implicit, undefined interfaces.
I don't understand what the Julia language offers over, for example, OCaml, which has a solid type system yet also has a REPL as well as a scientific computing library.