While all the points are valid, what scares me most about Julia is what is covered under the section "The core language is unstable".
Recently I tried the latest beta version and came across an issue: I just could not add packages (e.g. `] add JSON`). No matter what package I tried to add, it failed with a stacktrace pointing to libuv.
I tried downgrading to lower versions of Julia but no luck.
Turns out it is a regression introduced by a commit dated Sep 23 2020. In libuv.
As I tried to fix it, I realized Julia uses a version of libuv that has diverged significantly from the libuv main branch ("123 commits ahead, 144 commits behind libuv:v1.x" as I write this comment), and the bug was in Julia's modified code. Using the corresponding function from libuv upstream fixed the problem.
To summarize:
1. There's a bug introduced close to a year ago that breaks basic functionality on a mainstream operating system.
2. Julia has its own version of, of all things, libuv, that is significantly diverged from the original.
3. The bug is in one of the changes introduced.
While each of the above is defensible on its own, taken together they do scare me away from considering Julia for production use. I am hoping I am wrong somewhere.
I think it is a lovely language.
Do you have a link to an open issue tracking this problem?
Several Julia core devs are also libuv maintainers or contributors. The main reason for the divergence is that Julia’s libuv fork has a significantly more flexible event loop model that allows using libuv from multiple threads efficiently. The main libuv project has been reluctant to accept that change since it’s a quite advanced capability that Node doesn’t need.
> The main reason for the divergence is that Julia’s libuv fork has a significantly more flexible event loop model that allows using libuv from multiple threads efficiently.
Thanks for the bug report (assuming that’s you that filed it). The change that broke this was made first in Julia’s libuv fork but is being upstreamed into libuv main. It simulates UNIX chmod functionality on Windows, which is (apparently) tricky to get right across all versions and all corner cases. Supporting that correctly has been an outstanding TODO in libuv for a long time — what you’re seeing is Julia driving the development of libuv and therefore hitting the bugs first.
This issue was reported seven days ago and is now slated to be fixed in the next release of all affected versions.
I don't buy the argument that not fastidiously following libuv upstream is an example of Julia not being stable. It's the opposite: that's an indicator of some measure of stability.
The situation could easily be the opposite: suppose libuv does stupid things and breaks; then if Julia tracked libuv's daily builds, Julia would break too.
"unstable" is not a word which means "staying on top of the development of every dependency". That could literally be a clause in a working definition of "unstable"!
If you lag behind in updating dependencies, then there will be situations where you don't get some bugfix for a while.
Maintaining your own fork is a good idea, because sometimes fixes are security issues from advisories. You don't necessarily want to jump to the latest and greatest libxy.so, picking up 75 other changes to fix one security item: those items are risky, because they can contain undiscovered bugs. You can carry the security patch for now and then drop it when you update.
Indeed. And in this case, the #3 contributor (and #1 in recent times, really) to Julia is also the current primary maintainer of libuv so it’s very unlikely that there would be any critical fixes that would be missed.
This particular example turns out to be a case where a feature was developed in Julia first and there’s some corner case on this poster’s Windows 10 setup that triggers a bug in the new feature.
I mean, this just looks like your run-of-the-mill bug, no? If you filter on the bug label on GitHub you will find many more. It would be interesting to try to figure out what is special about your setup, since no one has reported this bug since it got introduced (and it doesn't show up on any of the CI).
I was curious about that too. Many other people use Windows 10 and this is the first time this issue has come up, so there must be something else unusual going on here. If no one could install packages on Windows 10, we'd have heard about it.
I mean, that's bad and scary, but you did mention that you were using a beta version of julia. Finding this sort of thing is exactly why the beta versions exist.
If the beta was stable and ready for production, it wouldn't be called beta...
Oh, the problem is present in 1.6.2 (the current stable release) as well as v1.5.4 (released March '21, available under old releases). I could not find any other 1.5+ version to try.
Oh I see I did misunderstand. Yes, that is unfortunate. I'd guess that since this made it into a stable release, it's likely a pretty tricky bug to encounter, since the failure is so spectacular.
I will say though that bugs appearing in stable releases of any software are bad, but practically unavoidable. It sucks that this bites people, but it's not what I'd really call instability.
OP, thank you for this post. I use Julia daily, and it's likewise my favorite language. So I went into this post prepared to be...defensive. But several times in the post I found myself thinking "no! Wait, well, maybe, hmm, urgh, yeah that's not great."
I'd like to read an epilogue or follow-up "so what" post. If Julia is your favorite language but many of these problems seem like some combination of a) unsolvable given the design priorities of the language; b) unsolvable until a hypothetical future 2.0; c) solvable but only with huge resource increases, how do you manage these problems day to day? What do you see as the moderate term future?
That sounds like a good idea. The problem is that I just don't know enough about compilers, Julia's internals, and how the core dev team works to really forecast the future and know what's possible. How hard would it be to implement caching of native code? A fast interpreter that runs side by side with the compiler? No idea.
Maybe it would still be useful to make a post explaining how I cope with the limitations of Julia. I'll think about it.
Great article, thanks for writing this, I’ve had a lot of frustration with the Julia community being unwilling to see faults in the language.
I really want to like Julia, and I think in theory it has a lot to offer, but due to much of what you listed I find it hard to develop in.
One thing I would add is that package management is also odd to me. It seems to have a lot of overhead and behave in unexpected ways.
In general I think Julia could learn a lot from Go. When I develop in Go, most everything is simple. The type system is easy and just works, packaging is simple and usable, it compiles quickly. I find writing the language seamless and productive.
Unfortunately Go also isn’t good for data science due to the memory model so Julia has a lot of room to step in there.
> I’ve had a lot of frustration with the Julia community being unwilling to see faults in the language.
What? It's basically a meme at this point for major contributors to write a list of what they think is bad. The language's creator did a nice Youtube video "What's bad about Julia?" at JuliaCon 2019 (https://www.youtube.com/watch?v=TPuJsgyu87U), one of my most popular blog posts of all time is "7 Julia Gotchas and How To Handle Them" (https://www.stochasticlifestyle.com/7-julia-gotchas-handle/), etc. What the community pushes back about is things that are factually incorrect or just silly and repetitive, like the n+1th discussion about 1-based indexing. It's stuff below like, "I want to do an interactive workflow but not use the REPL, so it's not working well"... well the REPL is literally the interactive piece that caches the compilation so just use it if that's what you're trying to do? Yes, nobody has time for that nonsense, but everyone chimes in for real substantive discussions like changing the Base iterator API, programming patterns that would reduce recompilation, etc.
I don't think that's fair. Your blog post is not about what you think is bad - at least, it doesn't read like it. It reads like a list of stumbling blocks and how to get around them. This is not the same as listing the major problems with the language.
Jeff's talk is indeed great, but... it's not mostly about what he thinks is bad, either. That part is quickly swept aside with "just so you know we know". The talk is really about some very specific problems in the type system that are so obscure that I don't even consider them worth mentioning in a blogpost about Julia's weaknesses.
> It reads like a list of stumbling blocks and how to get around them. This is not the same as listing the major problems with the language.
I mean, if by stumbling blocks you mean all of the things I thought were bad and all of the workarounds you have to know to get work done? I think the difference is that normally when I gripe, I like to then elucidate why the bad things exist. To me that's more productive, because digging into "why is it like that" starts to point towards "how do you fix it". But yes, giving the full detailed backstory behind the gripe naturally softens the blow, because the reader then understands its purpose, but that's just reality. Very few things are truly and purely awful when you really see the full reason for them. Usually it's just an engineering trade-off.
Any language has stumbling blocks. That's _really_ different from fundamental issues with the language. There is no working around most of the issues mentioned here.
I think a million and one posts on 1-based indexing and "why no OOP" or some perceived weakness of the modules system and other silly things like that have led to a base defensive attitude.
But this post is criticism from someone actively using the language, well thought out, and things that you can not work around. It includes real currently unresolvable pain points.
And it doesn't sound like the Julia lang team has a plan for how to address them. Or it's not communicating that plan. After all, the talk you link by Jeff Bezanson name checks many of the issues raised here two years ago, to then continue to not talk about them ever again.
This feels like classic feedback from the Julia community. It really stifles any hope I have of the language recovering. I know there are reasons behind things, but they also seem blind to fixing the obvious stuff because of them
And yet, would you not agree that the most important pain points do improve over time? Most of the things I mention in the post: Latency, stability, ecosystem maturity, static analysis and the IDE experience is notably better now than just 1 year ago.
I agree progress on the deeper problems is slow, and I also think that it is slow because 90%+ of the issues people have with Julia are "user experience" things (see any of the help forums as well as all of the previous HN discussions). Given a limited amount of resources, it's difficult to justify a deep dive into revamping the type system (AIUI multiple dispatch with complex types + traits is mostly unexplored territory) over improving latency and tooling. That's not to say it shouldn't be done, but in the absence of some significant (financial and development) support behind it I don't think the type system work will be prioritized.
In any large system, there are many problems that could have technical solutions but they're missed for one reason or another. Design mistakes occur from time to time and we can choose how we respond when they do. I want experts from other communities to join us, and they need to see that Julia will take their feedback seriously. With all due respect to your major contributions, I think we'll be better off if people in leadership roles take a more critical view of Julia, especially in public.
The thing is no one is saying that starting a process every time isn't a valid way to work — it works great in most scripting languages. What people are saying is that since it's slow and annoying in Julia currently, you might want to consider a REPL-based workflow with Revise. If the response to that is "But I don't wanna use the REPL!!" then what can one do? Ok, don't use the REPL. But as everyone has agreed, it's going to be a bit annoying. If that's a dealbreaker for you then you may not want to use Julia just yet — which is totally ok!
That interaction unfortunately seems to often get characterized as "The Julia devs don't care about startup time! They insist on REPL-driven development and refuse to work on it." Never mind that improving startup time and package loading has been the number one priority of the compiler team since Julia 1.0 and that it's gotten about 10x faster. There's also ongoing work to make it even faster using system images.
As someone who has used Julia for production use, I basically agree with everything you've said.
IMO:
1) Julia rushed to v1.0, but it's understandable - it is sort of a catch-22: more developers won't join unless it is stable (v1.0), but it won't get stable unless a lot of developers use it and their feedback is heard.
2) I personally found that the Julia community is slightly hostile to feedback and negativity. Maybe it is just me, but it has way too much hype-driven positivity that leads to delusion.
3) A lot happened between v0.2 and v1.0 which IMO should have been done more carefully and slowly.
4) Developer experience should have been one of the major focuses. Stack traces should be beautiful and absolutely transparent. Debuggers are clunky - I used the Atom (Juno) debugger and wanted to toss the laptop out the window, which just shows how frustrating it was. Compile times and smoothness of dev experience pay dividends, and they were largely ignored.
5) I really like official libraries, not a SomeBasicFunctionality.jl dependency. I am spoiled by Python.
6) Marketing around speed is misplaced IMO. It lures people like me into fanboys of Julia. There is so much more to a programming language than speed.
I don't want to be overly negative. The language and the original paper is beautiful - and this is all hard stuff. Kudos for the progress made so far.
> 2) I personally found that the Julia community is slightly hostile to feedback and negativity. Maybe it is just me, but it has way too much hype-driven positivity that leads to delusion.
I want to push back on this a bit, which I acknowledge is very ironic. In the past few years people consistently post on Discourse asking for fundamental changes to the language to make it more resemble python, C++, or whatever their preferred language is.
People often say Go is great because there is "only one way of doing things", yet people are very resistant to being told "the way" to do something in Julia. This has happened enough that it's prompted a pinned PSA on discourse: https://discourse.julialang.org/t/psa-julia-is-not-at-that-s...
It gets tiring! And I'm not sure how the community should handle these requests, but I don't think it's fair to blame all of the negativity on the Julia community when these somewhat misinformed, or even bad-faith, posts are so frequent.
Completely agree with 4 and 5. I use Julia regularly, and while it is great for scripts and proofs of concept of mathematical stuff that might be too complicated to implement in Python or C++, it is tremendously frustrating once one starts to grow an application.
I have clear and satisfying workflows in C (vim + make + gdb + valgrind) and Python (vim/VS Code, pdb), but I cannot say that debugging and tracing errors is a comfortable experience in Julia. Essential development tools are relegated to third-party libraries (e.g. Revise.jl and Debugger.jl) and are often clunky or limited in responsiveness (e.g. Revise.jl won't work when you change a struct).
I believe the problem is that, because Julia is marketed towards scientific computing and HPC, it draws mostly from an academic user base which is not that interested in software development.
> I believe the problem is that, because Julia is marketed towards scientific computing and HPC, it draws mostly from an academic user base which is not that interested in software development.
Although you have a point regarding Revise and Debugger, the last paragraph makes no sense at all. As a counterexample, Haskell drew mostly from an academic user base and many Haskell idioms have ended up in other real world *production* languages like Python (e.g. map, filter, reduce).
> As a counterexample, Haskell drew mostly from an academic user base
May I infer from this statement that you agree with me that Julia draws mostly from an academic user base? In that case, I do not see how your counterexample invalidates what I said. Perhaps some constructs from Julia end up in "real world production languages" but what I said is that the development experience in Julia is subpar and it shows when one is trying to develop larger codebases where software engineering practices become more important.
> I personally found that the Julia community is slightly hostile to feedback and negativity. Maybe it is just me, but it has way too much hype-driven positivity that leads to delusion.
Part of the problem, at least from what I observed here and in other dev communities, was that Julia had a large number of bad-faith critics (probably larger than the number of active users, at the time) from about 0.2 to 0.5 or so. I'm talking about everyone who compared the runtime of hello world in the REPL, including startup time, to that of a compiled executable. There were piles and piles of fake benchmarks that did little more than compare BLAS bindings, as though the differences were intrinsic to the language.
Sure, the Julia community could have reacted better to such criticism (and a few people did: BenchmarkTools.jl was an early win in this area), but few people react that way in practice.
Yeah, that might be true. I've read very poor criticisms of Julia other places on the Internet. Most famously perhaps the "giving up on Julia" blog. In fact, the lack of good criticism was what prompted me to write this post.
> 5) I really like official libraries, not a SomeBasicFunctionality.jl dependency. I am spoiled by Python.
Python's standard library is full of bad code that nobody has time to improve [1]. Over time, Julia packages will surpass the standard library too, but we'll still be hauling it around and spending core-dev time on it for backward-compatibility reasons. We should have a much, much smaller standard library.
I think the only difference is whether something is officially supported or not. When it is officially supported, it tends not to break, and it has to run through a bunch of tests that ensure the stdlib doesn't break with every language release; it tightly couples them. With off-the-shelf libraries, there is always a delay.
I also have an ideological take - I think of programming languages as a toolbox, and like a car mechanic, you want to have dependable tools that are robust, don't break, and have good support. Basic algorithms and data structures should always be included in the programming language - this is subjective, but I firmly believe it.
Others have commented on stacktraces, so I wanted to mention that the debugging experience in VS Code (the replacement for Atom-based Juno) is much improved as of the past couple of months. Worth a revisit if you're using it already.
Regarding stdlib support, I feel like it's a toss-up. For numeric and data processing code, Julia has a much richer stdlib whereas I need at least Numpy in Python (and even then, working with anything not array-shaped is a pain). For more systems-y stuff though, Python has a much fuller stdlib (e.g. reading/writing a bunch of archive formats). The end result for me is that scripting with Julia feels high friction, and doing anything which requires any kind of throughput with Python feels like pulling teeth.
I like that they are colored now, but what really needs to be added is type parameter collapsing. In most cases, you want to see `::Dual{...}`, i.e. "it's a dual number", not `::Dual{typeof(ODESolution{sfjeoisjfsfsjslikj},sfsef,sefs}` (these can literally get to 3000 characters long). As an example of this, see the stacktraces in something like https://github.com/SciML/DiffEqOperators.jl/issues/419. The thing is that this prints more type information than the strictest dispatch uses: no function is dispatching off of that first 3000-character type parameter, so printing that chunk of information is not informative to any method decision. Automated type abbreviations could take that heuristic and chop out a lot of the cruft.
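As a toy illustration of how quickly parametric types nest (the type name here is made up for the example, not the actual SciML one):

```julia
# Toy example: printed type forms grow with every level of nesting.
struct Dual{T}
    x::T
end

d = Dual(Dual(Dual(Dual(1.0))))
println(typeof(d))  # Dual{Dual{Dual{Dual{Float64}}}}
```

Four levels in and the printed form is already unwieldy; real solver types nest far deeper.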
> I’ve had a lot of frustration with the Julia community being unwilling to see faults in the language.
A related thing that really turned me off in the early days was one of the main developers constantly criticizing other languages. That wouldn't be so bad by itself, but much of it was uninformed, and it told me at least one of the core devs was not going to learn from other languages. There was always an attitude that "I know what to do, unlike everyone else".
The reason I never moved heavily into Julia was that it was too easy to write slow code (back then, things may have changed) relative to compiled languages like Fortran. Maybe I'd have stuck with it longer if I found the community to be pleasant, but an unstable and often slow language plus an unpleasant core developer was more than I was willing to put up with.
Opposite experience here, it was crazy to me how fast I could prototype my number crunching on price series, how trivial it was to distribute it across multiple machines in my house and finally how easy it was to flip to gpu computation. On every step I was surprised how fast it was.
Community is great, met only kind people, an exceptionally knowledgeable bunch.
A couple of very vocal individuals have an attitude, which reflects badly. But the community is getting bigger, so it's becoming less of an issue as most interactions don't include those people.
It's still easy to write slow code. But writing fast code is often easier than it would be in C++ or whatever.
There is a certain set of languages that the Julia team is experts in. But the world of languages is very diverse and big parts of it are lacking representation, so we miss out on their insights. Eg I think most core devs are experts in C, C++, Matlab, Python, but not as many experts in Haskell, OCaml, Rust, Racket, Common Lisp, Clojure.
I'd like to find ways of attracting those other experts into our community, and making sure we use their insights whenever we can.
Always good to attract experts from other languages. I was mainly addressing the claim that Julia core devs like to crap on other languages, not trying to claim that we're experts in every language — obviously we're not. But saying that we're uninterested in other languages or unwilling to learn from them is simply untrue — we're a bunch of huge programming language nerds and we love programming languages, even some of the notoriously flawed ones like C++, Matlab and Perl.
It’s easy to write slow code in any language. I wrote some slow Fortran before I learned how not to. Generally you need to know a good amount about not only the language, but about the machine you’re targeting, to write really efficient code.
I don't think Julia is trying to be anything like Go and it shouldn't be.
Probably Julia's best feature is the powerful metaprogramming support that allows you to, e.g., create array packages that automatically run code on GPU. You can't have such features with a simple language.
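As a hedged sketch of what that looks like with the third-party CUDA.jl package (the array size and the expression are arbitrary):

```julia
# Sketch with the third-party CUDA.jl package: the same generic broadcast
# syntax that works on CPU arrays compiles a fused kernel for GPU arrays.
using CUDA

x = CUDA.rand(Float32, 1024)   # an array living in GPU memory
y = sin.(x) .^ 2 .+ 1f0        # one fused GPU kernel, no GPU code written by hand
```

The array package authors get this by hooking into Julia's generic broadcasting machinery rather than through compiler-level special cases.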
That doesn’t mean it can’t learn from it. Yeah they are solving somewhat different problems but in sheer usability of a language Julia could take a lot from Go.
> powerful metaprogramming support ... can't have such features with a simple language
Various Scheme implementations are among the simplest languages I've ever used, yet possess some of the most powerful metaprogramming I know of. Perhaps I misunderstood you somehow?
All languages have problems. Don't take my post to mean "don't use Julia" - the very first words of the post are that it's my favorite language. Also do note that the webpage is built with Julia. Not a coincidence.
But even good systems need critique, otherwise there can be no improvement or growth.
Yeah, I guess only seeing one side of the argument can be a bit misleading...
It would probably take another 1,000 word blogpost to go through why Julia is amazing, despite all these problems. I'm sure I will make that post someday.
While I do like Rust, the basic design of Julia and Rust are just very different. They're intended to solve different problems, and the problems I have are the problems Julia is designed to solve. The love comes from just how well it solves those problems compared to every other programming language out there.
Compared to Rust specifically, Rust is a complete nonstarter for interactive coding, since there isn't a REPL. Even if there was, it's way too complicated to use interactively - a garbage collector would be a must, and a lot of these compile time checks that I praise would be a major pain in the ass when coding casually. It's also way too slow to write: Even after having written Rust for a few months, I estimate I can solve a problem using Julia 4 times faster than using Rust, using maybe 2/3 the number of lines, or less.
Things can have faults and be worth investing in, it doesn't need to be an all or nothing false dilemma. Nothing would get fixed if we all just self selected in this way.
That was explicitly not the OPs point. His point was that there are warts on every tool and he wanted to clearly bring them up in a way that is productive and helpful.
He says julia is his favourite language in the very first sentence.
jefft meant GP, not OP. When people complain about julia, they often get "maybe julia isn't for your use case" instead of "good point, we should fix that".
Julia is incredibly fascinating for me as a language to observe, although I haven't got the time/patience to try it out yet. Its language design seems incredibly progressive and yet pragmatic: they've repopularized multiple dispatch (which makes perfect sense in a mathematics-oriented language), and it has a unique mix of dynamic and static typing that seems perfect on paper.
In practice, Julia seems to suffer from the initial architectural decisions the creators made when the language was young - the biggest of which is its heavy dependence on LLVM. This is a problem since LLVM is mainly used for static compilation of code for languages like C++/Rust/Zig, not for dynamic execution - its included JIT is famous for being unreasonably slow. Even the devs seem to know this, but they don't seem to have much choice, since too much of the codebase depends on it. (For example: https://discourse.julialang.org/t/jeff-bezanson-remarks-on-l... - LLVM codegen keeps getting slower, which in turn cancels out all the little optimizations they can do for the language.) It's a language that seems so perfect on paper but suffers from its real-world implementation - and that captivates me in a special way. Some major architectural overhaul might be a much better move (such as a fallback bytecode interpreter independent from LLVM) - but it's definitely going to be a major effort.
Using LLVM is not exactly an indelible architectural decision — switching to a bespoke JIT would be entirely possible, it’s just not worth it yet. Moreover, if you want to generate world-class high speed code, you really need something like LLVM. So yes, it’s a bit of a drag but it’s also a massive enabler of critical performance. That’s the real reason the project continues to use it.
There are a number of approaches to reducing latency further, which are being pursued. The most promising current avenue is better tooling around system images (compiled executables with additional functionality beyond the core language pre-loaded): generating them faster, generating them incrementally (load one, add a few things and produce another), and generating them automatically for the set of packages in the current active project (not feasible until the first two are done). That will give an experience comparable to compiled languages.
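As a rough sketch of what the current system image tooling looks like, using the third-party PackageCompiler.jl package (the package choice and file name are just examples):

```julia
# Sketch with the third-party PackageCompiler.jl; package list and path are examples.
using PackageCompiler

# Bake Plots into a custom system image so that `using Plots` costs almost nothing:
create_sysimage([:Plots]; sysimage_path="sys_plots.so")
```

You would then start Julia with `julia --sysimage sys_plots.so`. The tooling described above would automate exactly this for the active project.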
How much communication is there between the Julia compiler team and the LLVM team? I don't get the impression LLVM has been very sensitive to Julia's needs so far.
Quite a bit: several of the Julia compiler team are LLVM maintainers and contributors. LLVM has started to track compile time, so that's gotten a bit better with recent releases. Making noise about compile time was mainly to get the LLVM project to start paying attention to compile time, which they now have, so mission accomplished — as long as they stop completely allowing latency to degenerate, it's ok. LLVM is never going to be fast at compiling code (and if we used something else that was as good at producing fast code, it would almost certainly also be slow), so avoiding putting LLVM in the fast path is always going to be necessary to make more latency improvements.
You may be interested in the current initiative around compiler plugins [1]. One of the explicit design goals is offload to alternative backends like GPUs or Wasm, but you could see how that would extend to something like Cranelift as well.
That said, LLVM codegen performance is very much a tail wagging the dog thing, and I wish upstream put a significantly larger focus on it [2]
In my opinion the most important positive thing Julia has is the Compsci community it attracted.
There has never been so much enthusiasm in software for doing physical simulations and correct and stable arithmetic. It made a sizeable dent in the Fortran/C++/Python/Matlab environment physicists were in.
That said, even after 10 years, Julia still has so many rough edges that I wouldn't employ it for anything outside the aforementioned use case.
Just getting it to plot something in a headless environment was a pain. Essential utilities like debuggers (!) are external dependencies. Matrix/Vector/Array confusion permeates the air. Ditto for a syntax annoyingly in the middle between Matlab, Python and Fortran.
Given it is already not that young I predict it will never really attract the masses beyond its niche, but it can still kill the remaining Fortran codebases.
It's a little hard to say "after 10 years" when Julia has been at 1.0 for 3 years only. And remember, that's not with Rust's niche of low-level developers, but with a niche of people who are generally not programmers, and therefore much more conservative about their languages!
It's surprisingly hard to get good data about how quickly Julia grows, but it certainly doesn't look like it's plateauing yet. It's just too early to know how far it'll go and where it will end up in 10 years.
If anything, Julia becoming popular with academics makes me extremely skeptical about its missing guardrails. Obviously anecdotal, but I have many years of experience working with academics and statisticians and I know to be wary when they suddenly like a new technology. Usually it's because it makes it easier to swallow errors somehow, or because their friend wrote it and sent them an email about it.
Ha! That's hilarious. Yeah, I've seen some real horrorshows of programming in my field.
I think it's important to realize, though, that academics do have a special use-case when programming. That is, academics don't pick e.g. Python because they are bad programmers and don't know what they are doing, but because Python is a good fit for their needs. I see Julia as an excellent - near-perfect, in fact - academic programming language. That Julia excels at this use case does not mean it's bad at everything else.
Just giving you a hard time... I do think the performance improvements alone make Julia a serious contender. I've seen a lot of valuable engineering time spent rewriting algorithms in C++ for scientists who know Python. To speed up their iteration loop without bringing in performance experts would be great.
I like Julia because of the trivial 10x performance improvements I get over writing similar code in Python/numpy.
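For the flavor of where those wins come from, here is a hypothetical example (the function and names are mine): loops with sequential dependencies don't map onto single vectorized numpy operations, but in Julia a plain loop compiles to tight machine code.

```julia
# Hypothetical example: an exponential moving average has a sequential
# dependency between iterations, so numpy can't express it as one
# vectorized operation, while this loop compiles to fast native code.
function ema(xs::Vector{Float64}, α::Float64)
    out = similar(xs)
    out[1] = xs[1]
    @inbounds for i in 2:length(xs)
        out[i] = α * xs[i] + (1 - α) * out[i - 1]
    end
    return out
end
```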
What I don't like is how Julia attempts to force you into a REPL workflow. Everything is optimized for REPL work. I just want to write a script in a text editor and run it from the command line. Please make this easy for me.
> In programming I tend to think the whole computer and its OS and so on as the programming tool. My impression is that for Julia folks Julia is something totally separate from anything else in the computer. But I don't understand why. Is it that its felt that changing syntax on the fly is too cognitively demanding or something? This is something that one gets used to very fast.
This comment really does not resonate with me. It sounds like a criticism of Smalltalk, not Julia.
Not sure what it means to change "syntax on the fly", but that does sound like the sort of design that is maximally flexible, at a considerable cost to readability. Macros already allow you to embed DSLs (as long as they parse), and almost everyone agree that they should be used sparingly.
> And also tends to ease the IMHO weird hung-ups on syntax. In fact, I think Julia has made some (IMHO misguided) syntactic decisions because they just want to do something differently from Python just for the sake of being different. I find this totally senseless.
I am not aware of any decision that is done just for the sake of being different. If there are any, then yes that's senseless.
> Not sure what it means to change "syntax on the fly"
GP meant that Julia enthusiasts seem to dislike the cognitive overhead of switching between languages (e.g. Bash scripts, SQL, C code, Python scripts, and as GP wants, Julia scripts as well) while doing multi-language development using the whole Unix computer as a programming tool.
It's really hard to do, because having a command-line centered development experience with Julia is a complete nonstarter until compile time latency is near zero. And that's just really, really hard, and unlikely to happen, at least any time soon.
That comment you linked is amazing - it steps back a little and gets our heads out of the dirt to see the horizon. As a bunch of programming ostriches muck about with this or that, the author instructively points out that general-purpose programming languages are part of the toolbox to get the job done... on an operating system of choice.
We need to listen to people who have gone through the experience of putting complex systems together; their learnings are immeasurably valuable. This is the eternal balance of theoreticians and practitioners.
It also further cements something that I always felt but that can be off-putting to some people: Julia is an academic language used by academicians. From a software engineering perspective (not computer science, mind you), there is much to be desired.
I really like Julia, and also get the sense that the community is a little young and unsettled in some ways. The biggest way I've seen this manifest is a total lack of decent documentation for reputable, well-known projects. For example, there's this trend in the Julia community of putting little clever puns in the GitHub package description. Okay, I'm not trying to rain on anyone's parade, and I appreciate humor, but when I'm looking for packages, that little GitHub description area is critical. Maybe save the puns for the readme doc? Beyond that, it's frustrating to find core libraries with absolutely zero API documentation, or with totally outdated/incomplete/misleading docs.
I'm not all that mad about it or anything so I won't point fingers. I just find it to be out of touch and mostly just a buzzkill. In the past when considering investing my time and energy in building a project in Julia, I've found it frustrating to have to do all this guesswork to get a sense for where critical packages are at and what their limitations are.
So from an optimistic angle, I think this probably reflects the fact that there are quite a few researchers writing Julia packages, which is really cool. There are some awesome cutting edge techniques implemented as Julia packages. I don't think researchers should be off the hook for maintaining crappy packages, but I do think it takes some higher level thinking to cultivate good tools and collective habits around this stuff. For whatever reason, this is evidently challenging for the Julia community.
I hope that some of the culture and tooling around documentation and package maintenance gets more mature over the next few years. If anyone knows about efforts to improve documentation in the Julia ecosystem, I'd love to hear about them.
Not to call anyone out but examples? Every time I look at packages I’m blown away by how much effort has gone into their docs. But people keep saying this kind of thing so there must be some packages they’re hitting that are under-documented.
I find that "online" docs are good, but "in-repl" docs are usually poor to inexistent (in packages, that is).
Base Julia has improved on this a lot, though there's still room for improvement (especially in terms of useful examples or linking to related functions).
This also came up during JuliaCon: there was a suggestion to add module docstrings, which would be a good start, and to consider distributing docs via the packaging ecosystem so they can be made available in VS Code.
Edit: tldr, Julia could use something with the ease of use and functionality of docs.rs, but built especially with Julia's typesystem in mind.
So after thinking about this, I think there are two things going on here: (1) it's actually the lack of browsable autogenerated API docs I'm frustrated by... I'm spoiled by the most excellent Rust docs.rs API docs, which give a great, quick, readable, comprehensive overview of what's in a package even if the maintainer hasn't made any docstrings; and (2) Rust has had this emphasis on usable documentation as a seamless part of the development experience for a while, but Julia is definitely catching up.
So I went back and checked and was happy to find that the JuliaGPU packages that I previously couldn't find docs for definitely have some docs now! In particular, GPUArrays.jl. There were also some astronomy packages I looked at that had been rewritten, with docs left hanging for like a year.
That said, in the autogenerated API docs for GPUArrays.jl, if there's a function with no docstrings, on JuliaHub it just shows a big yellow warning to the developer. I'd prefer if it showed some useful information about the types the function is defined over and its return types. I'd also love if there was some quick way to see a list of included types and functions, along with their type signature and even a way to view the code.
Really I think I'm just spoiled by the Rust community's amazing auto-generated API docs on docs.rs, which seamlessly integrate with examples and readme style docs. If there's a rust package I wanna use, docs.rs will give me a nice consistent, browsable overview of the code and I can usually figure out what's in there just from that, even if the package maintainer hasn't actually written any example docs or docstrings, just using info from the typesystem. It's so nice to be able to go to one place and see what's in a package, the traits, structs and function signatures, all alongside docs generated from docstrings and handwritten docs. Did I mention that this information is always in the same place on docs.rs? These aren't just "filler" docs, they're super usable.
Most Julia package docs are more freeform and I have to click around to find the API docs, and honestly I'm not sure if every package even has these. Whereas on docs.rs they're right there immediately with no cognitive overhead. Freeform docs are awesome, and I'm always excited when a package has lots of well thought out documentation, but it's no substitute for up to date, informative API docs that give you a solid window into the fundamentals and interfaces.
Julia has such a cool typesystem, I could imagine there are some interesting opportunities to use that to make the autogenerated Julia docs much more usable and informative.
As far as the comprehensive approach is concerned, I honestly don't think Julia is that far off, it's really a shift in emphasis and a streamlining that I'm wishing for. I can see that a lot of progress has been made.
Maybe it'd be worthwhile to consider hiring some of those ex-Mozilla people to help the Julia community get to the next level on this?
The compile time latency is the biggest showstopper for me.
Yes, you can cope with it by staying in the REPL, and they are actively working on it, but it is very unfortunate that the "slower" competitors like Python and R feel so much more responsive while developing.
Other than that, Julia is an absolutely beautiful language. Sure, it does not offer the same static safety as Rust or Haskell, but I would not hold that against it. There are always trade-offs to be made on the dynamic vs static spectrum, and the dynamic aspects serve Julia well for the typical use cases.
The compile time latency is really bad for some libraries. I was recently testing out computing class groups using https://github.com/oscar-system/Oscar.jl and was surprised to hit *several minutes* of latency due to LLVM compilation. For me, the most striking things are that (1) there is no indication in the REPL that a call is slow because of LLVM compilation rather than because the function is actually running, and (2) it is difficult to avoid paying the compilation penalty every single time you restart Julia, since caching the JIT in general is a technically very, very difficult problem. I wonder if there is an easy fix for (1) that I don't know about: some way of showing a spinner or something saying "I'm compiling code right now".
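One partial mitigation I know of (not a spinner, but related): since Julia 1.6, `@time` reports how much of a call was compilation, so you can at least confirm after the fact where the minutes went. The output below is invented for illustration:

```julia
julia> @time first_call(args)   # hypothetical slow first call
 95.1 seconds (12.3 M allocations: 800 MiB, 99.2% compilation time)
```

That still doesn't help while the call is running, which is what a spinner would address.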
I love Julia and choose to work in it almost exclusively, but I agree with the points in the article. I've run into a lot of issues just writing numerical linear algebra type algorithms.
Even core libraries, and not-quite-core libraries maintained by core devs, like Distributed.jl and IterativeSolvers.jl, can feel pretty rough. For example, IterativeSolvers has had strange type issues and has not allowed multiple right-hand sides for linear solves, for years, afaik due to some aspects of the type system and some indecision in the linalg interface. DistributedArrays is still very poorly documented and looks like it hasn't been touched in 3 years.
I've run into problems when I need more explicit memory management, for example none of the BLAS/LAPACK routines have interfaces for the work arrays, so you either get reallocation or have to rewrite the ccall wrapper yourself. It can also be hard to tell where the memory allocation is happening.
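For what it's worth, the usual partial workaround (which still doesn't expose the LAPACK work arrays) is to lean on the in-place routines and spot-check allocations:

```julia
# Minimal sketch: mul! writes into a preallocated buffer, and @allocated
# helps find hidden allocations (warm up first so compilation isn't counted).
using LinearAlgebra

A = rand(100, 100); x = rand(100); y = similar(x)
mul!(y, A, x)                # warm-up call compiles the method
@allocated mul!(y, A, x)     # 0: the product is written into y, no new array
```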
My most recent problem had been with Distributed and DistributedArrays, where everything is fine if you just want a basic parallel mapreduce, but has been a huge pain past that. It's not even clear to me if Distributed/DistributedArrays has been more or less abandoned in favor of MPI.jl, which for me removes most of the benefit of writing in julia, since you then have to run it through MPI. There is an MPI sort of interface for DistributedArrays but that part is not well documented and looks like more of an afterthought.
My use case isn't even that complex: I just want to persistently store some matrices across the nodes, run some linear algebra routines on them every iteration, send an update across the nodes, then collect at the end. If anyone has any idea how to do this correctly in Distributed or DistributedArrays, or can point me to some examples, that would be amazing, because it has been taking me forever to piece it together.
Not going to stop using Julia but there are many basic things even just in a scientific computing workflow that still feel like they were rushed and they can really take the wind out of your sails.
Agreed, but it's funny that criticism of Julia broadly falls into two categories:
1) Julia doesn't have X. X is critical for modern programming languages, and without X, we should not even entertain the idea that Julia may be usable
2) Julia's feature X is too unstable. It's like they tried to implement too many things in Julia 1.0, and developer time stretched thin. They should have just not implemented all this stuff!
I mean yes, we all would like a programming language that materializes with 1,000,000 developer hours already poured into it, great editor support out of the box, and which is somehow born with 10 years of usage. It's similar to wanting an employee who enters the work force with 10 years of industry experience. Nice, but it's not very realistic.
Re point 1 (JIT latency): This has been done to death, but truly fixing it would make Julia part of my toolkit. For example, right now I'm experimenting with numerical diffeq solvers for chemistry. It involves experimenting with parameters, and plotting. AFAIK, this isn't possible with Julia unless I switch to a REPL or notebook workflow.
This brings up a key question: What is the technical limitation behind compiling dependencies every time you run the program? Is non-REPL workflow an afterthought? I'd rather use Rust for something like this! `cargo run` works faster than Julia's JIT since it doesn't need to recompile dependencies like a plotting or numerical lib each time I change a parameter.
Out there idea: What about a new language that takes advantage of Julia's strengths, like clean mathematical syntax, 1st-class array support, and being a fast scripting/numerical language, without some of its problematic design decisions? Julia's been the language I've been wanting to love for years.
I think the REPL workflow is a huge advantage for experimentation and explorative programming. I assume you are supposed to and are able to have a file buffer open and attach your REPL to that, avoiding restarts. Is that assumption correct? In this case I cannot imagine how a compile-run workflow has any advantage for the type of work you are describing.
The problem with the REPL workflow is that it forces the user to track a lot of state. You can find yourself constantly trying to remember if you remembered to re-evaluate an important line. I've found that -- for anything but the simplest problems -- it's better to write nicely parametrized scripts and run from the command line.
Pluto (reactive execution) and Revise (automatic re-evaluation) are the standard solution for this problem in the Julia ecosystem. Not that your workflow should not be supported too (people are working on it), but alternative solutions preferred by much of the current Julia community do exist.
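For reference, the Revise half of that looks roughly like this (the file and function names are made up):

```julia
# Sketch of a Revise.jl session; "analysis.jl" and run_experiment are made up.
using Revise

includet("analysis.jl")   # tracked include: later edits to the file are picked up
run_experiment(0.1)       # edit analysis.jl, call again, no restart required
```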
No - I'm not familiar with a file buffer. That sounds like a neat concept I'll look into for Python too. In general, I've been moving away from REPL workflows due to hidden state and lack of reproducibility.
The hard part is making it produce the correct results even if the user does something like change the definition of +. It's not un-solvable, but it needs a couple hundred hours of work by people who are very busy.
It's a fundamental problem of having a dynamic, compiled language. It's not possible to actually solve. Think of it similar to how C++ and Rust needs to compile its source code before running it. No-one talks about this as a problem to solve.
The problem with simply compiling dependencies to static binaries is that all Julia code is allowed to redefine other Julia code. So package X can load package Y, then define a method that causes package Y to change behaviour, and thus needs to be recompiled.
This is not unintentional, by the way. Having packages able to use each other's code is critical for having the ecosystem be "composable", that is, being able to make two different packages work effectively together. For example, I might take a type from package X and throw it into a function from package Y, and it works.
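A contrived sketch of what that means in practice: code loaded later can add a method that changes what already-compiled code in another module does, forcing recompilation.

```julia
# Contrived sketch: a method added from outside module Y changes the result
# of Y.total, so its cached compiled specialization must be thrown away.
module Y
double(x) = 2x
total(v) = sum(double, v)
end

Y.total([1, 2, 3])         # 12; compiles a specialization for Vector{Int}

Y.double(x::Int) = 2x + 1  # the kind of method "package X" might define
Y.total([1, 2, 3])         # 15; the old compiled code was invalidated
```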
I don't think this is an inherent problem of having a compiled dynamic (in other words, JITted) language.
Things like JavaScript (V8) and Lua (LuaJIT) manage to have fast startup times while having exceptional performance in hot paths - this is because they have a fast bytecode interpreter that executes the script first while the actual compilation takes place. Unfortunately, Julia in its current state doesn't have a fallback interpreter for its bytecode/IR (which is similar to LLVM IR). And LLVM IR isn't really suited for fast execution in interpreters - it's designed more as an intermediate language for a heavyweight compiler than as a bytecode for a dynamic language.
Maybe some heroic figure will come out of the fog and bring a whole new bytecode representation and a fast interpreter for Julia, but that would be quite a project..
Julia does have a minimal compilation path with an interpreter. You can even configure this on a per-module basis, which I believe some of the plotting packages do to reduce latency. There is even a JIT-style dynamic compiler which works similarly to the VMs you listed: https://github.com/tisztamo/Catwalk.jl/.
IMO, the bigger issue is one of predictability and control. Some users may not care about latency at all, whereas others have it as a primary concern. JS and related runtimes don't give you much control over when optimization and are thus black boxes, whereas Julia has known semantics around it. I think fine-grained tools to externally control optimization behaviour for certain modules (in addition to the current global CLI options and per-package opt-ins) would go a long way towards addressing this.
You're right; I've done some more research and there seems to be an interpreter in the compiler: https://github.com/JuliaDebug/JuliaInterpreter.jl. It's only enabled by explicitly adding annotations in your code, and is mainly used for the internal debugger, but it's still there.
Still, it seems to try executing the internal SSA IR in its raw form (which is geared more towards compiling than dynamic execution in a VM). I was talking more about a conventional bytecode interpreter (which you can optimize the hell out of, like LuaJIT did). A bytecode format carefully designed for fast execution (in either a stack-based or register-based VM) would be much better for interpreters, but I'm not sure if Julia's language semantics / object model allows it. Maybe some intelligent people out there can make the whole thing work, is what I was trying to say.
The naming is unfortunate, but JuliaInterpreter is not the built-in one but a separate package for use in external tooling. The built-in one can be run globally via the --compile=min CLI flag. Likewise, you can also pass -O0 to -O3 to configure the level of optimization (which, predictably, affects latency).
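Concretely, with flags that exist in current releases:

```
$ julia --compile=min -O0 script.jl   # interpret where possible, minimal optimization
$ julia -O3 script.jl                 # spend more LLVM time for faster generated code
```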
As for IR representation, I'm not aware of any limitations of the IR (remember, there are multiple levels) over a LuaJIT-style bytecode for interpretation performance. After all, the Futamura projections tell us that a compiler is really an interpreter that's undergone some partial application. Of course, that's a theoretical correspondence that has little bearing on real-world performance, but I don't think you can confidently say that Julia's lowered or unlowered IR forms are fundamentally bad for fast interpretation.
You can set optimization per module with `Base.Experimental.@optlevel`, though I'm not finding any documentation for it. Could swear it was in a release note.
```
help?> Base.Experimental.@optlevel
  Experimental.@optlevel n::Int

  Set the optimization level (equivalent to the -O command line argument) for code in
  the current module. Submodules inherit the setting of their parent module.

  Supported values are 0, 1, 2, and 3.

  The effective optimization level is the minimum of that specified on the command line
  and in per-module settings.
```
Inside the Julia compiler/runtime there is an interpreter, because Julia uses a heuristic to determine whether to compile or interpret a function. There is also interpreter code in the Julia debugger. I don't know how full-featured they are, but one does not have to start from scratch.
On the other hand, implementing a tracing JIT for Julia is going to be such a big task, I am not sure how much help existing interpreters are going to be. At the very least there needs to be a new GC, which necessitates changes everywhere except the parser. LLVM integration may also prove awkward for a tracing JIT.
> It's a fundamental problem of having a dynamic, compiled language. It's not possible to actually solve.
There are plenty of Common Lisp implementations that allow you to redefine functions or add methods to multi-dispatch functions without requiring you to recompile every use-site. There are other trade-offs here, but it’s not “impossible” to precompile and optimize code in a dynamically-typed language.
It’s not really a problem with dynamism, it’s more about the fact that Julia’s compilation model is similar to C++ with templates but at runtime (and with first class parametric types instead of textual substitution, but the compilation and specialization story is similar). Just as separate compilation isn’t possible for C++ template code, it’s a challenge for Julia as well.
Maybe you just don't read the right places, because people definitely talk about this as a problem to solve. For example, Julia already stores e.g. inferred code in the precompile files. It isn't a huge leap from that to also storing actual compiled code. Just because it is possible to invalidate code and cause recompilation doesn't mean that it isn't worth storing ahead-of-time compiled code in the 99.9% of cases where invalidation doesn't happen.
But this is not solving the problem, it's just reducing it. My analogy with compilation of C++ is that no one SHOULD expect it to be "solved", because it's a natural consequence of how the compilation works. They may improve e.g. incremental compilation to reduce compile times, but it's just not a problem that can be "solved". Similarly with Julia and latency.
I disagree that it cannot be obviated in Julia. It can be solved with one or more of: static tooling (i.e. proving there's no overwriting), tiered compilation with a faster interpreter, and sealed modules. All things Jeff et al. have been discussing.
The things that stick out to me as unwieldy in Julia:
* Not embracing functional-style idioms. There's no TCO, so writing recursive algorithms from old FP papers is hard. Not an issue per se, just a gripe.
* Not having explicit interfaces. I'm sure there's some advanced Julia-fu that I could do to introspect my way to knowing what methods I need, but the very concept of them not being explicit irks me (see the sketch after this list). Again, not an issue per se, but it annoys me.
* There's a complete and total disregard for any notion of a smaller Julia, or a Julia more suited for general purpose. Julia seems ripe to benefit from a Ruby/mruby sort of situation, but alas. We need this gigantic blob of dependencies and toolchains and compilers and this and that to operate this programming language. I don't get it. For the last time, it's not an issue per se, and the language has achieved incredible engagement in many demanding niches, so what do I know anyhow?
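On the interfaces point, a minimal sketch of what an informal interface amounts to: the iteration protocol is just a set of methods you are expected to define, with nothing in the language declaring that expectation.

```julia
# Julia's iteration "interface" is informal: you satisfy it by defining
# methods, and nothing ties them together as a declared contract.
struct Countdown
    n::Int
end

Base.iterate(c::Countdown, state = c.n) =
    state <= 0 ? nothing : (state, state - 1)
Base.length(c::Countdown) = c.n

collect(Countdown(3))  # [3, 2, 1]
```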
yeah, if anything that's what I dislike about julia, I wish it didn't claim to be a GP programming language, and instead worked more on figuring out how to glue it to languages that aren't C.
For context, I have basically stopped using julia (not for any taste-related reason except I started doing jobs that were more suited for elixir and haven't had to do numerical computation since), and elixir doesn't claim to be a GP-PL and it's perfectly fine and doing quite well.
Is there anything you think that prevents Julia from being general purpose? The main one I run into is that the garbage collector needs work, but fundamentally I don't think there's much that keeps Julia from being good as a general purpose language.
Python is a GP PL (not a very good one, IMO, but it is one). For a long time, some distros' Linux CLI and even GUI tools were written in Python. Can you imagine such an ecosystem in Julia? I can't, nor would I want Julia to make the sacrifices it would need to fulfill such a role. There are so many things that Python is "mediocre, but good enough" at that Julia is quite frankly terrible at, because of the (good for what it does) choices that Julia made.
I wouldn't write a quick CLI using Julia (versus a long-running or test-runner CLI - I have done this professionally [0]). I wouldn't write a web server that is expected to take a load (a personal website is probably fine, though probably still quite painful). I wouldn't write anything embedded.
[0] Wrote a containerized storage block device performance measurement tool. Julia was useful to generate and track statistical distributions of random reads and writes with very clear and concise code.
I also think Julia's concurrency story is not that great. But I'm biased against every PL since I have spent a ton of time in erlang/elixir-land which currently has the only (high level) concurrency story that makes any sense IMO.
I liked when Julia's concurrency was done per-process (still lives on in Distributed.jl, iirc). I get that this has performance implications, especially around spawning and expensive communications, even with MPI, but I kind of wanted some sort of non-locality of data to be acknowledged, and maybe I would have preferred the new threading concurrency to work with the Distributed.jl abstraction. I think it would have been even more awesome if Julia GPU treated computation against gpu arrays in the same fashion as a distributed job... But as of the last time I used julia (it's been a while) that wasn't the case.
I do wonder if you'd see similar reactionary claims about Elixir being a general-purpose language if the community experienced constant shallow and misinformed claims to the contrary (which Julia has in abundance, for some reason). Imagine if every 10th post about Elixir on HN was "I will never use it because I work in embedded and it has zero support there" when Nerves is a google search away.
From another perspective, if we consider "general purpose" to mean "focuses on most of the things I care about", then you could argue (and I'd agree) that Elixir appears more general purpose for a larger number of people. Specifically, most devs work on networked services, and networked services are bread and butter for BEAM languages.
Lastly, there is also a strong tendency to dismiss anything with a whiff of academic attachment as unfit for "real" industry work. This affects not just Julia, but also Scala, OCaml, Haskell, TLA+, and more. Though I do empathize with the perspective, it is often taken to the point of caricature: you can show someone a dozen companies using some tech and it'll be as if you said nothing at all!
There's no TCO in many languages, even proudly functional ones like Clojure, so that doesn't seem to be as make-or-break as you and I would have expected. (Clojure does have recur though, but it doesn't help for mutual recursion.)
The lazy pattern for evaluating iterators often has bad "mechanical sympathy": the strategy of "evaluate step 1, write it into a list" and so forth often beats the "step 1 calls step 2 calls step 3..." pattern by an order of magnitude in performance.
You just win so much by keeping the inner loop in the I-cache, having the branch predictor completely focused on one inner loop at a time, etc. Writing/reading the array burns memory bandwidth but the CPU usually likes the access pattern.
True, the eager approach can fail terribly if the array is huge, and it's particularly obnoxious if you are running 10^6 calculations and just one of them makes a big array. But the lazy approach is one of those lower teachings that gets passed off as a higher teaching.
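To make the tradeoff concrete, here's a toy sketch in Julia (assuming Julia ≥ 1.6 for `Iterators.map`; the names and cutoff are arbitrary):

```julia
xs = rand(10^6)

# Eager: each step materializes a fresh array. This burns memory bandwidth,
# but gives the CPU a predictable streaming access pattern at every step.
eager(xs) = sum(filter(>(0.5), map(x -> 3x, xs)))

# Lazy: no intermediate arrays, but every element threads through a chain
# of calls, which can work against the I-cache and branch predictor.
lazy(xs) = sum(Iterators.filter(>(0.5), Iterators.map(x -> 3x, xs)))
```

Which one wins depends on the array sizes and the work per element.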
I think what you really want is a JIT compiler that decides what to evaluate on a case-by-case basis. Julia has nearly all the inputs it needs to do this well… it just doesn't.
It is a lot safer than `from pkgfoo import *` because you get a warning if there is a collision.
$ julia --banner=no
julia> module A
export foo
foo(x) = x
end
Main.A
julia> module B
export foo
foo(x) = x + 1
end
Main.B
julia> using .A
julia> foo(1)
1
julia> using .B
WARNING: using B.foo in module Main conflicts with an existing identifier.
julia> foo(1)
1
julia> B.foo(1)
2
Yes, I should have added that. It intersects with the type system in a complicated manner, since two functions can co-exist in the same namespace as long as their signatures are incompatible.
Which of course means that, if I'm correct that the community is moving away from type signatures as much as possible (they will still be needed to some extent to control dispatch), the problem of namespace collisions will increase over time.
I think you're completely right with this observation. In practice much of Julia is duck typed. It uses types to pass information from the outermost caller to the compiler of the inner functions, not to express invariants.
Human annotations get in the way of that, and thus get in the way of composability.
You want to annotate Interfaces in Julia, not types. But we don't have interfaces.
Aren't namespace collisions the feature that makes multiple dispatch so composable? I think this is what allows one to replace the built-in Arrays with GPUArrays, and most code still works, because all the major operators and functions have GPU implementations.
They aren't collisions so much as overloads. E.g. GPUArrays implements the same base AbstractArray methods. I'm not a big fan of `using` either (in any language) because it obscures where imports come from, and AFAICT overloads are created even with a plain import.
> since two functions can co-exist in the same namespace as long as their signatures are incompatible.
This doesn't really make any sense. You can only have one function in each namespace; that function can have multiple methods extending it. As long as you are not doing type piracy, you will not implement the same method in e.g. different packages.
The problem with `using` is mostly that it's unclear which namespace a symbol comes from, since that isn't visible lexically.
I agree with much of what the article gripes about. To me it's not prohibitive, and obviously brilliant stuff is being built in Julia as it is right now. But I just feel that things could be that much better.
That said, I do feel some of the defensiveness around these topics from years past is fading somewhat. Still, while some of the things are fixable in Julia 1.X, others likely will require a 2.0 release...
The REPL-based workflow and the compile time latency just need to go.
Stop defending it. It's a problem. It harms adoption. Fix it and move on, and you'll stop hearing endless complaints about it. Sometimes the endless complaints have a point and you need to listen.
Coming from a decade-long history with ruby I do think that dynamic, duck-typed languages are inherently flawed, but c'est la vie and Julia is miles better than Matlab. I'm not looking at Julia to be exactly like go or rust, I want automatic differentiation. Don't particularly care about the memory bloat either.
Newbie documentation does need to be a whole lot better. I suspect that would probably get fixed if the compilation and REPL-centric issues got fixed and it wasn't so immediately hostile to your time on day 1.
Package management also needs to get better. Rubygems gets the shape mostly right if you don't look too close at the details; cargo seems to be excellent. Languages created in the past 10 years really should have strong package management from the start, not treat it as an afterthought, and not try to reinvent the wheel without understanding the successful systems that came before them.
But really it's about the compilation and REPL-centric issues. That should have been a requirement for 1.0, and I can't believe that a language that is nearly 10 years old could have that bad of an initial workflow for users.
I don't think comments like these are helpful, or get us anywhere. The latency is not simply a bug to fix - it is a consequence of the compilation model, which is also what makes Julia so great.
It's like wanting to use Rust, but demanding that the Rust devs fix the annoying problem that you have to compile your code before running it. After all, Python doesn't have this problem, so why don't the Rust devs just stop being lazy and fix it like Python did?
Dynamic scripting languages like Python and Ruby all have JIT compilers; that isn't particularly unique to Julia.
In comparison, though, Julia's startup costs are quite painful.
> The latency is not simply a bug to fix - it is a consequence of the compilation model
This statement would require some actual formal proof.
And I don't need to recompile glibc or openssl every time I make a binary on Linux, and I don't feel like I should have to recompile Plots.jl and all its dependencies in Julia every time I want to make a change in my script that uses it.
Yes, Python and Ruby both have JIT compilers, but they do not operate in the same way that Julia's JIT operates.
Numba is a tracing JIT -- Julia does not do that. It precompiles a static CFG (as much of the CFG as inference can concretize) before runtime.
For the parts of the CFG where inference cannot explore call edges, special calls are inserted which allows the runtime to return back to inference when the types are known.
Julia does not have a fallback "tracing interpreter" at all, it's all compilation. When compilation occurs and how it occurs for any specific user program depends greatly on how abstract interpretation learns about the CFG.
As to your latter comments, they are all false as well. Julia does not recompile Plots.jl every time you make a change to your script -- Plots.jl precompiles once, and only recompiles if a method definition invalidates something which has already been precompiled. The specific mechanism/relationship which Julia uses to detect an invalidation is called a call backedge -- you can think of it as a relationship between callers and callees, but designed to handle multiple dispatch and the specialization that that entails.
The first time, precompilation is slow -- because Julia is literally running type inference and then caching all parts of the CFG which could be inferred. But unless you're doing things which would (in general) not be performant (like invalidating a ton of cached method instances) -- the full precompilation stage should never occur again.
The only quibble I have is that Numba is actually an AOT, method-at-a-time JIT, last I checked in 2018... just much more rudimentary and limited than Julia.
Most compiled languages allow for incremental build. I can understand that a clean compile needs 15 minutes or whatever. But why does it take long when you compile a small change? (or maybe this has been fixed?)
It doesn't take long to compile a small change. But the way compilation works is different than other languages. Julia doesn't cache compiled code (well, it caches some of it), because that's fiendishly hard to do in a dynamic language that allows dynamically redefining what functions mean. So whenever you start a new Julia process, it has to compile tons of things from scratch.
You can use the Revise.jl package, however. It keeps track of the files you've imported and updates the code whenever a file is changed. This causes only minor latency, similar to an incremental build.
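A minimal sketch of that workflow (the file and function names here are placeholders):

```julia
using Revise

includet("analysis.jl")   # track the file; edits are re-evaluated on save

run_analysis()            # first call pays compilation; subsequent edits
                          # only recompile the methods that changed
```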
Julia isn't that unique, most dynamic languages compile code on startup into bytecode and execute it on a JIT VM, and they all allow dynamically redefining the world at runtime as well.
Revise.jl inside the REPL is annoying as hell to use when you start messing with structs.
It isn't a big ask to just have `julia whatever.jl` not take minutes to compile the entire world. Stop defending it like every Apple fanboy on mac forums trying to convince me that the touchbar is great.
And Revise.jl kind of proves the fact that incremental compilation is entirely possible. If it wasn't possible at all due to $maximum_dynamic_insanity that wouldn't even work.
> Revise.jl inside the REPL is annoying as hell to use when you start messing with structs.
> And Revise.jl kind of proves the fact that incremental compilation is entirely possible. If it wasn't possible at all due to $maximum_dynamic_insanity that wouldn't even work.
I agree it's disingenuous to say that Revise is the answer to all problems when struct redefinition is still not possible. Yes, you can toss them in a module, but then why allow defining them in global scope at all? Thus far I haven't seen any theoretical limitations either, since Common Lisp variants seem to be able to do something similar.
That said, I would not look at most other dynamic languages for this. Having a mandatory VM is an express non-goal in Julia, and if you want one then you'll have to face off against all the other people asking for better static compilation. Again, compiled lisps show us that a VM isn't even necessary to get this level of dynamism or low latency incremental compilation.
The claim around having to recompile the entire world has already been addressed in a sibling thread, so I'll not rehash it here.
I've never seen anyone say that latency is a good thing or that it's not a priority. It's a hard problem and it's being worked on and it's getting significantly better — particularly over the past three releases. It's disingenuous to suggest the latency gripes aren't being heard or respected.
Now what I see frequently is that folks *do* regularly suggest workflows that work for them. Sometimes too zealously, yes. And indeed REPL-based workflows aren't for everyone.
On documentation I whole-heartedly agree. The manual hasn't seen a major re-structuring since its inception AFAIK — and it's long overdue.
All languages make compromises, and Julia has decided not to compromise on execution speed. I say this not because I don't think compilation time sucks, but because 10 years is not that much time for solving a complex problem. How many years has Python gone without proper parallelism (I don't think the answer is the multiprocessing lib)?
I think Julia's package system is good, but since I come from Python maybe that doesn't mean much.
You do know that it is actually a top priority, right? And you have seen the measurable (order of magnitude) progress done since v1 was released, right?
This is not even getting into the interesting reasons for why this problem exists and what amazing features are enabled after this tradeoff was taken.
I'd buy that argument if it was still around 2017.
I get to own my own reaction to the language, and after I ran "julia mytest.jl" on a script that plotted a differential equation, I thought to myself "oh well, it's just a new language". Then I looked up that it was nearly 10 years old and I thought it was a bad joke.
"WTF does `activate .` even do?" If you then read it, it says "If you are ever stuck, you can ask Pkg for help:".
(@v1.6) pkg> ?activate
activate
activate [--shared|--temp] [path]
Activate the environment at the given path, or the home project environment
if no path is specified. The active environment is the environment that is
modified by executing package commands. [...]
Also the docs tell you in a highlighted blue bubble that if you don't want to use the "REPL-centric workflow" there is also the API:
> This guide relies on the Pkg REPL to execute Pkg commands. For non-interactive use, we recommend the Pkg API. The Pkg API is fully documented in the API Reference section of the Pkg documentation.
Most other languages have CLI tools like `gem` or `cargo`, and if you're in a directory with a Project.toml, why do you even need to remember to do that all the time?
I love julia and wouldn't mention this if I hadn't written a big chunk of data-analysis code with it...
It's a small gripe, but one thing that really bothers me is that they gave up on libreadline for the REPL (for what seem like fairly superficial build management issues, looking at the original email threads) in favor of a custom solution, which
* doesn't support vim-like or emacs-like keybindings, and
* won't read my .inputrc file and my customizations
It's just maddening using julia's REPL without this. It's easy to dismiss user complaints like this, but I can go back and forth between bash and into the python REPL and all of my line discipline actually makes sense, but then the julia REPL is broken and clumsy.
Also, pdb.set_trace() is a lot easier and more productive than what Debugger.jl has to offer.
Have you tried Infiltrator.jl? It's great for breaking out into an interactive REPL from some inner scope, allowing you to examine the local variables and other program state interactively. This covers the part of the functionality of pdb.set_trace() which I care about.
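A minimal sketch of how it's used (the function and data here are made up):

```julia
using Infiltrator

function fit_model(xs)
    total = sum(xs)
    @infiltrate            # pauses here and opens a REPL with xs and total in scope
    return total / length(xs)
end

fit_model([1.0, 2.0, 3.0])
```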
You might have some sort of dangling config issue somewhere? I use the emacs keybindings in the Julia REPL all the time. They work fine out of the box (at least here).
Based on https://docs.julialang.org/en/v1/manual/embedding/, this seemed fairly simple, and as though I could avoid repeatedly paying the compile time latency cost by just calling my embedded functions at least once before the performance-critical loop (to ensure that they're compiled). Is that not the case? There's definitely still the high memory overhead to consider, but (admittedly without having tried it yet), embedding Julia doesn't seem too terrible to do. Worse than e.g. compiling a shared object library or something, but not at all unreasonably hard.
A rule of thumb is that 90% of unhappy customers don't complain. So it's likely that a lot of people are experiencing these same problems, but not voicing them. "Thank you" seems to be missing from a lot of comments.
A good response might entail: what the core team is currently prioritizing, why, and how this feedback will help guide those efforts.
Also, take heed of the number of occurrences of the following statement, which I support:
> "community is hostile to feedback"
Except an MIT licensed language and compiler implementation isn't a business.
It's open source software. And despite the fact that other users have referenced the recent valuation of Julia Computing, that doesn't change the fact that Julia is open source software.
Of all the complaints I've seen in all the Julia posts in HN -- how many complainers want to pick up the compiler and help fix it? Reminds me of a Linus Torvalds quote.
Upset at compile-time latency, REPL-based workflows, lack of static compilation, etc? -- please come join us and make it better. In fact, I guarantee that working in the guts of a system like this will endow you with an incredible set of skills -- and it is possible with Julia to do serious work with the compiler, and learn things that will make your head spin (abstractly).
Yes, we want the best language and compiler possible -- yes, there are issues right now. Yes, there are plans to fix these issues. But ... there are not enough talented, smart people working on this stuff. And there is enough low hanging fruit that a bit of effort can have a huge impact on the community.
Does this 'MIT licensed language and compiler implementation' face competition? Does it receive and act upon customer feedback?
Try replacing the word 'customer' with either 'user' or 'developer.'
A programming language is a product that is packaged, promoted, and distributed by means of open source to a target market of users for a price of $0. All the same rules apply whether we like it or not.
Short list of current priorities (in no particular order):
* Making it easier to interface with the compiler (this is step 1 toward better static compilation and debugging)
* Better garbage collection
* More performance
* Reducing the memory footprint of small strings
* Lowering the overhead of multithreading
It's worth noting that Julia doesn't really have a core team. A lot of the development is done by a pretty large community of developers without any central management.
The short answer is that changing the type system to allow multiple inheritance (traits) involves solving some really hard and open-ended problems (e.g. method specificity). Solving this is probably breaking, and requires a ton of work on a problem that might not even be solvable. If I had to guess, it will be solved by the language that replaces Julia in 20 years.
Thanks for the info. What worries me is that you can replace 'Julia' with 'Scala' in that sentence and a data scientist wouldn't know the difference. It's already fast. If Julia wants to win in data science then they need to poach users away from other languages.
As much as many of us would like it to be, the kind of data science work you see Scala used for is a pretty small part of what Julia is used for.
I think a big part of that is because DS rarely involves writing fast numeric kernels or hot inner loops, i.e. user code that needs to do numeric stuff quickly. This is in large part because very large organizations have poured untold millions into libraries that already handle this (e.g. Spark).
In domains where this has not happened or that have more bespoke requirements (e.g. modelling and simulation), something like Julia is far more compelling. That's not to say it's not viable, but unless more practitioners start feeling stuck in a rut [1] I don't see the mindshare changing dramatically.
> In e.g. Python, you are not going to run into types you want to subclass, but can't.
In python, you can only really subclass things that were designed to be subclassed. Subclassing arrays, dataframes, or anything from xarray or dask is explicitly recommended against by those packages. In practice I only subclass the base scikit-learn transformers, Abstract Base Classes, or classes I've written.
This article is painfully true. Julia is full of gems, but also features so many warts everywhere that I'm not sure whether it will be able to overcome the latter – I hope it will.
I would add to the author's list the subtle differences between REPL and scripts, Julia's weird obsession with embedding slightly different and out-of-date versions of its dependencies (BLAS, uv, LLVM, ...), which makes it a nightmare to package, the painfully slow documentation website,
the sometimes surprising function names, and I wholeheartedly agree on the failure that is functional programming handling, both in syntax and implementation – which is surprising for a language stemming from MIT.
Given that R and Python are so mature and active, this really leaves little space for Julia. Yes, Julia is faster than R and Python (just the language, not the packages), but R and Python can easily integrate with C/C++, and many R and Python packages are in fact internally C/C++. This leaves little advantage for Julia over R and Python.
If Julia had appeared earlier, it might have had a better position in the data analytics programming ecosystem, and gained more attention and opportunity to grow. Now the whole data industry is booming, but people would rather contribute to the mature R and Python for real work, instead of waiting for Julia to grow.
I don't think that's so convincing, actually. When I learned programming, Python was _just_ replacing Perl as the "common language" in bioinformatics. Until then, there were lots of people who claimed Perl would essentially never be dethroned because it had become the standard.
The issue is that while _some_ users only experience the surface level of packages like PyTorch, Numpy, SciPy, Pandas, and so on, people who actually have to develop these packages must work with the C++ and C underneath. To the developers, the problem of Python's performance is front and center, and a real issue they face every day.
So, the developer folks will make the switch first, as they realize it's much nicer to develop Julia packages in Julia than it is to develop Python packages in C. Over time, this will mean the development power shifts to Julia - and as soon as these developers have built tools as good as or better than the existing ones, the majority of users will slowly migrate, too.
> So, the developer folks will make the switch first, as they realize it's much nicer to develop Julia packages in Julia than it is to develop Python packages in C.
Not so fast. You should also consider that writing the core in C/C++ allows some of those packages to offer bindings for several languages other than python (for example, both tensorflow and torch have bindings for Go, .NET, Rust, Haskell, R, Java, JavaScript).
This significantly expands the use-cases that these frameworks can cover without requiring a complete rewrite for every platform. For example, it is relatively easy to have data scientists train models with Python and deploy such models to mobile/edge devices and web apps.
That's true, but for most packages, that's not really an issue. I.e. there might be a tensorflow library, but there is no libnumpy or libpandas, as far as I know.
As I also mention in the blog post, if you set out to create fundamental library software like FFTW, you should indeed use a static language, and Julia is no good.
What is your point? (I honestly do not understand.)
Blackbear's comment was that writing libraries in C allows those libraries to be deployed broadly in many compute environments.
Jakob's reply (as I understood it) was that outside of the big Deep Learning libraries, this has not really happened. There is no C implementation of Pandas that allows for redeployment in other non-python compute contexts.
My point was that, with Arrow, this type of cross platform compatibility is coming to python dataframe libraries. You can prototype Dask code that runs on your laptop, then deploy it to a production Spark cluster, knowing the same Arrow engine is underpinning both. Or at least that's the vision. Obviously Arrow is still relatively young. But the point is, it's far from certain that the long-term global optimum for the ecosystem isn't sticking with "all libraries are written in C".
In response to "rise of," I too was excited about Arrow until I played with it and realized it didn't even provide a shape attribute. Anyways, people shouldn't be dependent on a low level lang like C to write fast code.
Fair. I agree Arrow is still more of a vision than anything else.
> it didn't even provide a shape attribute
I suspect this has to do with the project's focus. I think they aspire to be a back-end to DataFrame libraries, which are generally 2d. I think they (correctly) are ceding the "n-dimensional tensor computation" space to the current incumbents.
Arrow is getting support for N-d arrays, so if anything they're expanding in that area (which is exciting). I don't think they're interested in creating a universal libarrow though, the point of the data format and C data interface is to have languages define their own implementations.
For sure, I didn't mean to imply they weren't looking at compute too! https://github.com/apache/arrow-datafusion is another example of the shared compute vision. What I was trying to point out is that (at least for Arrow core) they seem to eschew FFI and generating shared libraries in favour of from scratch implementations in other compiled languages and direct bindings in interpreted ones.
This is doable in Julia too. DifferentialEquations.jl has python and R bindings. I expect more to follow as Julia starts getting more best in class packages.
I mostly live in the Rust world, not the Python world, but I worked on two Python packages wrapping over my own Rust libraries and it's been pretty great.
> This really leaves not much advantage for Julia against R and Python.
It's a bit of a chicken-and-egg problem. Really the argument is that the problems with the dual-language model will never go away, so Python & R face an ongoing cost disadvantage in the long run. However, they are also much more mature.
So, assuming the previous assertion is correct, you need to get enough people interested in Julia to reach the tipping point.
I'm not really sure it will work, but I can see the argument. After all, technically superior platforms for this existed when Python got started (heck, early Python was next to useless for data work beyond file parsing), but that didn't dissuade its use. The non-technical arguments made it a good-enough language, and once it had momentum it was compelling.
On the third hand, what Julia really needs to attract right now is mostly library/package writers - that's a tiny fraction of the users of Python, but basically steers "what's next". However, for library writers, how many people might use it can be a deciding factor in investing the time...
> for library writers, how many people might use it can be a deciding factor in investing the time
Yes. Especially given that a high percentage of library development these days is being done by industry, not just academia. (At least in machine learning.) They're not writing packages for fun or for academic citations, they want users.
Julia does appear to have a higher than usual proportion of package authors to total users, FWIW. I think it's both a blessing and a curse, because feedback around gaps in the ecosystem is sometimes brushed away with "write your own package". That is, because a large part of the community writes packages to scratch their own itch (I've yet to see one written for citations), they expect others to do the same (which may not be fair).
I think Julia has a pretty good path forward, since it is often faster than C++ (especially once you consider language interop overhead), and it is much more pleasant to write than C++. Since Julia and most of its packages are written in Julia (for the most part), Julia makes it really easy for users to become developers. Python and R have much bigger barriers to entry, since users will usually hit a binary blob, which makes it much harder to interact with library implementations.
I dunno. I have written FFIs in many languages, and by far Julia was the easiest to call a C ABI in.
I would say if anything what holds people back in Julia is a drive to RIIJ (Rewrite It In Julia) instead of calling the C ABI (despite the fact that Julia's linalg basically does exactly that with BLAS).
> Python is plenty fast if you use NumPy / CuPy / SciPi.
I think this misses the point. You are basically saying that Python is fast enough in cases where someone has already written a fast package in not-Python (plus bindings), whereas the value proposition for Julia is basically: writing such packages in the future will be a lot better.
Obviously the average user doesn't care, but if enough whiz-bang stuff starts to come out in Julia but not python, they'll move.
I think it will be interesting if enough clever new whiz-bang stuff comes out particularly based on Julia's metaprogramming powers.
However, when talking about the "two language problem", the main issue traditionally was prototyping computational code in Python / Matlab / whatever and then having to drop into Fortran / C / Cuda for optimizing it. That's pretty much a solved problem in Python now - thanks to the well developed ecosystem, you don't really have to leave Python to write fast numerical code.
The first problems I came to Julia to solve were ODE simulations in which it's just as critical that the stuff I write and rewrite over and over again (the actual differential equations of my model and associated simulated controller callbacks) runs fast as it is that the pre-written stuff of the library (the solvers or integrators) runs fast. To the best of my knowledge I can't do that in python.
This example (DifferentialEquations.jl in particular and SciML.ai in general) is one of the places where Julia leads the pack.
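To make that concrete, a minimal sketch of the pattern (the classic Lorenz system; the solver choice here is arbitrary):

```julia
using DifferentialEquations

# The user-written right-hand side is compiled and specialized just like
# the library's solver internals, so the hot loop stays fast end to end.
function lorenz!(du, u, p, t)
    σ, ρ, β = p
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] * (ρ - u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end

prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 100.0), (10.0, 28.0, 8 / 3))
sol = solve(prob, Tsit5())
```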
When I first used Julia it felt rather strange in a lot of ways (I had used python for about 20 years), but after 2 years when I go back to python (and of course there are places where it makes more sense to use python) it feels primitive as there are a lot of very nice features of Julia that I suddenly miss.
> That's pretty much a solved problem in Python now - thanks to the well developed ecosystem,
I don't understand this argument. The same C and Fortran libraries that make numeric Python fast can be called just as easily from any other programming language, just like in Python.
> you don't really have to leave Python to write fast numerical code.
Well, you need to, for example, if you want to write a new numeric algorithm to call from python. As in, not use a numerical solver, but write a numerical solver from first principles. For that, Python is mostly useless and you'd need to use C or Fortran (or Julia).
It should go without saying that Julia also has had a C FFI for years. And a Python FFI. And an R FFI.
The C++ one was dodgy last I looked (which was several years ago; I'm sure it's better now). In practice I typically wrote explicit C wrappers, restoring genericity on either side of the wrapper. Faintly tedious, but nothing a couple macros couldn't solve.
Yes, Julia has them, but if you use them, it basically renders one of the promised advantages of Julia ("it does away with the two-language problem") moot, as you now have two languages anyway. And if you have two languages anyway, why not use one of the more established ones?
----
I personally don't subscribe to the idea that the "two-language problem" is a problem that needs fixing. In the end your stack will have multiple languages anyway, unless you rewrite everything in Julia, which is utopian, so having one more isn't really a problem (and it might be one you are already using).
> Yes, Julia is faster than R and Python (just the language, not packages), but R and Python can easily integrate with C/C++ and many R and Python packages are in fact internally C/C++.
This is what I was responding to. The grandparent argued that Julia had less library support than R and Python because, so they implied, those languages could interop with C/C++ in a way Julia can't.
The two-language problem refers to code in user space, anyway. It's not like Rust, where there are so many wheel-reinventing projects in progress. Peruse the main Julia package registry and you'll see plenty of library-wrapping modules.
This is probably a "me problem," but I have had a hard time moving beyond the tutorials to actually getting work done. For example, I am working on a project right now where I would typically use a dict of dicts in python and then use map functions so that I can process the keys in parallel. When coming to python from perl, ruby, and c++ I found that I already had a pretty good intuition about how this would work.
I read the documentation, but could not make as much sense of it as I should have. I searched online for examples and found a few hits on the question, but not nearly as many as I would in other languages, and when there were answers they were often marked "wrong," or I could not get them to work with my code.
After a lot of searching and trial and error I finally got that working, then started looking into how to run this on multiple nodes and processors and got completely lost.
I am sure the documentation is there and that I could do a better job figuring it out. I am also sure Julia would be a great match for the type of work that I do, and I am going to keep trying. But from what I have seen so far it feels like the pain of trying to migrate from Python to Julia is pretty difficult to justify.
This is definitely not just a you problem. There are resources out there like [1] and [2], but much less in the way of curation and a dearth of end-to-end tutorials/walkthroughs.
I'm not sure how best to improve the situation, but the current state of things leaves much to be desired. If you're willing, don't hesitate to post about your experience and any feedback you have on the community forums.
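For what it's worth, a minimal sketch of the dict-of-dicts pattern described above (`process` and the data are placeholders):

```julia
using Distributed
addprocs(4)                          # four local worker processes

# The per-key work must be defined on every worker.
@everywhere process(inner::Dict) = sum(values(inner))

data = Dict("a" => Dict("x" => 1, "y" => 2),
            "b" => Dict("z" => 3))

# pmap spreads the keys across workers and gathers the results;
# the captured `data` is serialized to the workers automatically.
results = Dict(pmap(collect(keys(data))) do k
    k => process(data[k])
end)
```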
It is difficult to chain a value through a series of function calls, meaning you can't do something like `a.b().c().d().e()`. You either have to store intermediate results in variables, call it like `e(d(c(b(a))))`, which IMO is less readable, or do something like this:
a |> x -> b(x) |> x -> c(x) |> x -> d(x) |> x -> e(x)
There is some hope of this improving in the future, but in the present, it bugs me.
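(Worth noting: when every step takes a single argument, plain `|>` already chains cleanly; the pain is multi-argument steps. Macro packages such as Chain.jl add sugar for that case; a small sketch, not from the comment above:)

```julia
# Single-argument steps need no lambdas:
1:10 |> sum |> sqrt

# Chain.jl threads the value through each line; `_` marks where it goes,
# and a bare function name means calling it on the threaded value.
using Chain
@chain 1:10 begin
    filter(iseven, _)
    sum
    sqrt
end
```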
One thing that I miss coming from python to Julia, is typing-as-documentation. In python I often write type annotations that are much more restrictive than they need to be. They work as checkable documentation of how a function is expected to be used (and under what circumstances do I guarantee it will work), without restricting how it can be used.
I think this is a real conflict. In my opinion (not just mine), the only reason to write type constraints on a method definition in Julia is to control dispatch. Adding types to method arguments for the purposes of documentation is counterproductive to generic programming.
I used to think this was true (as a developer of a lot of generic Julia code and small data analysis applications).
But now as a developer of larger amounts of "application style" code, I'm not so sure. In an application, you've got control of the whole stack of libraries and a fairly precise knowledge of which types will flow through the more "business logic" parts of the system. Moreover, you'd really like static type checking to discover bugs early and this is starting to be possible with the likes of the amazing JET.jl. However, in the presence of a lot of duck typing and dynamic dispatch I expect static type inference to fail in important cases.
Static type checking brings so much value for larger scale application work that I'm expecting precise type constraints to become popular for this kind of non-generic code as the tooling matures.
You're not wrong. I guess what I'd like is the ability to apply (up to) two type annotations, with the second one a subtype of the first, and use the first for dispatch and the second for documentation/testing/static analysis...
... which actually seems like it might be doable with some macros? Those are beyond my ken right now, but the goal would be
@doubly_typed f(x::Number|Integer) = x
f(6.0) # Same as g(x::Number) = x; g(6.0)
@strictify f(6.0) # Same as g(x::Integer) = x; g(6.0)
Then you would use @strictify when running tests to ensure that the stricter types in your codebase are all compatible. But you'd still need to figure out what to do about return types and the help command...
As an R user, my issue with Julia: what niche is it filling? R is established in statistics and data analysis, Python is established in general-purpose and data analysis and ML, so why should someone learn Julia?
The "main advantage" of Julia being faster than R seems irrelevant to me, R is plenty fast enough for my one-off statistical analyses. Further, I can't justify spending my attention on learning yet another language for analyzing data, given I already learned SAS, SPSS, R, STATA, Excel, and Python.
That's probably because you didn't try to write anything significant in R besides running an analysis.
R is an interesting language which adapts "ok" to statistics and related fields; however, it's very limited by today's standards, completely REPL-oriented, and very, _very_ slow. In fact, the first thing one learns in R is to never loop over any set larger than a few hundred elements if you want your script to execute at decent speed.
All the R speed is actually backed by fortran and C/C++ extensions. This goes from the R core to all the scientific and biosciences packages.
R as a language has an extremely primitive interpreter and GC runtime. It's often much slower than python or perl. Although given the primitive GC and runtime, it's quite straightforward to write extensions for it. But *nobody* really wants to change language (whichever it is) to write extensions just to get some speed back. Especially when writing these extensions bars you from the entirety of the R ecosystem itself.
Mmm, very good points. I used R for quite a few high-level statistical analyses; I found it handy for munging data, and .rmds are nice for organizing a systematic set of chunks to keep the code logical. Sometimes I ran into speed issues, at which point I would reach for data.table, which gives a huge speed boost for some of R's slowest operations.
I had to learn the basics of STATA for a modeling project, because the other researcher knew STATA (these network effects of collaborators knowing different things). Turns out fixed-effects multivariate logistic regression modeling is WAY faster in STATA than R, to the point where our infra just couldn't complete the R code in any reasonable timeframe so STATA was just the better pick.
My main gripe: especially in healthcare, so many workers are trained on something in their 20s, then use it for decades. There are thousands out there still using SAS every day, for instance, because it's what they and their collaborators know. I suppose Julia and Python will grow in healthcare over time, and SAS should in theory decline in usage... who knows!
People who are tired of wrapping C++ or Fortran libraries. It's not exactly a drop-in replacement, but it's almost there. If speed is not relevant to you, then it may not be a good choice, unless you are into language learning for its own sake.
Nice article! I agree with many of the points. If I know the Julia community well enough, criticism like this leads to constructive discussion and eventual improvement.
One section I don’t get is that about the iteration protocol. It seems like the author is unhappy about the state being passed around, and the “implicit” trait methods one should implement.
The former point appears to be a consequence of the lack-of-object-orientation (which is by design) while keeping object attributes vs state separate wherever possible (which is arguably a good thing in general); I don’t see how else this can be done in a language that only has structs and functions, honestly. Also, the fact that iterators can be themselves stateful is something that can happen in Python too (one is just “expected” not to hold state information inside Iterable types, but there’s no guarantee of that by just looking at the type). This seems hard to avoid. Maybe a trait could be used in Julia to provide this information about an iterator type?
The latter is the same problem as “implicit traits” highlighted in other sections of the article, so not really anything specific to the iteration protocol.
Indeed, the latter problem is the same as implicit traits.
For the former, you can absolutely do it without having object-orientation or anything like that. Simply lower the iterator protocol to
```
itr = iterator(x)
while (i = next(itr)) !== nothing
    # stuff
end
```
Then `itr` can be mutable, even when `x` isn't. This is what most modern languages do, and it solves all the problems I mentioned, since the state is stored in the mutable iterator while `x` remains unaffected.
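(For contrast, Julia's current protocol threads the state explicitly; this is essentially the documented lowering of a `for` loop:)

```julia
function mysum(xs)
    total = 0
    it = iterate(xs)               # returns (element, state) or nothing
    while it !== nothing
        (x, state) = it
        total += x
        it = iterate(xs, state)    # state must be passed back in each time
    end
    return total
end
```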
Well, to be honest I still don’t understand the difference: they seem to me like two different styles of structuring the same kind of logic. In fact, the linked list example in the referenced article from Mike Innes I don’t find very convincing, stated as it is: I would like to see that example concretely, to understand how the current Julia protocol would fall short there (“leak”).
Same for the `zip` example in the OP here: I’m not sure the Python `itr = iterate(xs)` kind of protocol makes it any simpler. But I’m on holiday and cannot try it out right now—I owe you one here :-)
julia> open(read, "/home/jakob/Documents") # yes, a directory
UInt8[]
I'm not sure if this is true in other languages but it is also true in Java when you attempt to read a directory as if it were a file. I ended up adding code to test if it was a directory to avoid the issue. Does this happen at a lower level?
It's built into the semantics of the OS. Unix directories are ordinary files, treated specially. They can be opened and read (and written to) like any file. If you have a more primitive pager than less, you can sometimes see this by running it on a directory; you will see binary garbage with file names embedded in it, like whole pieces of corn in a turd.
That said, if your HLL doesn't provide a better abstraction when reading a directory with the standard file I/O primitives, it is indeed a serious bug in the language. Don't do that, language designers and standard library authors!
It originally threw an error in Julia, but it turned out that checking if a file is a directory is a surprisingly expensive system call, so it was removed.
It seems the issue is stuck: most people agree the current behaviour is bad, but forcing every file read to do a heavy system call is not a great solution either.
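(If you want the check in your own code, it's a one-line guard at the cost of that extra stat call; `read_file` is a made-up helper:)

```julia
function read_file(path::AbstractString)
    isdir(path) && error("cannot read a directory: $path")
    return read(path)
end
```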
@jakobnissen
you have a typo in the first section:
"Among Julians, latiency is"
- Sorry, I would have nudged elsewhere but could not find an alternative venue to contact you.
One major gripe I have: while the infamous "TTFP" (Time To First Plot) problem is more or less solved, these improvements haven't manifested in any other areas.
Here is the time-to-first-read-CSV problem. I am on the newest Intel MacBook Pro:
$ cat read_csv.jl
using CSV
using DataFrames
df = CSV.read("hello.csv", DataFrame)
println(df)
$ cat hello.csv
col1,col2,col3
1,2,3
4,5,6
$ time julia read_csv.jl
julia read_csv.jl  10.48s user 0.37s system 102% cpu 10.609 total
Referring to start-up latency, the author asserts:
> But the problem is fundamentally unsolvable, because it's built into Julia on a basic design level.
Citation needed. Today, the Julia compiler runs in JIT mode. But there is nothing preventing it from being ahead-of-time compiled (indeed, some work is going on behind the scenes to make this happen). There is also an interpreter for Julia. I don't see latency or large memory footprints as being "fundamentally unsolvable".
Yeah, though the solution for it basically seems to be "throw out all the previous code that was highly coupled with LLVM and actually write our own fast JIT compiler and interpreter from the ground up". It's possible, but probably takes an incredible amount of work and would maybe involve cloning a copy of Mike Pall (the creator of LuaJIT) to help on the project.
My response's more about the dynamic execution side of things (latency & memory bloat of Julia). Static compilation is a whole different topic, but I'm confident that it can be solved with existing LLVM infrastructure (because what's LLVM good at? static compilation!) The hardest part may be how to modify Julia's semantics to fit that paradigm (for instance, the need to introduce "header files" in the language, and all the gnarly details coming with it)
Wow this is helpful, I’ve had a slowly growing interest in julia due to all the positive threads in HN, but that latency forcing a repl dev or notebook dev is a nonstarter. That’s a big limitation.
And from what I’ve seen of dev done in python notebooks for data analytics, now I’m concerned about the code quality of Julia projects. Does Julia have good linters for notebooks if people are pushed to use them due to the latency?
I don't think anyone is developing packages in Pluto/Jupyter, so I wouldn't worry about that. The most common method is an editor like VSCode (which has some linting capabilities) with an open REPL and Revise [1]. Every time you save one of your files (with some known restrictions), Revise automatically and incrementally updates the state of your application in the REPL, letting you probe your code whenever you want (with tons of introspection methods, up to interactively inspecting the native code being generated); and since you never leave the session, you pay the compile latency only once. I end up preferring this workflow for experimenting and data science, since it retains the structure and tooling of an editor along with the ability to interact with my application (and I really miss it when writing Python applications), but of course everyone has a preferred workflow, and it'd be nice if Julia supported more of them as well.
>The experience was not that my program became more safe in the sense that I could ship it without sweat on my brow. No, it was that it just worked, and I could completely skip the entire debugging process that is core to the development experience of Julia, because I had gotten all the errors at compile time.
>And this was for small scripts. I can only imagine the productivity boost that static analysis gives you for larger projects when you can safely refactor, because you know immediately if you do something wrong.
Getting the program right the first time can't _possibly_ apply to programs larger than a small script. Type errors are only 2% of all bugs in large programs with companies behind them. When a big Rust program fails, you have to contend with the fact that you can't even set a debugger checkpoint where an error occurs:
Obviously people wouldn't be asking for that if the type and borrow checkers caught all the bugs at compile time. Not even close!
And not only can't you debug errors, you also can't change the program at all. All you can do is terminate the program, generate a new one from modified source code, and run the new program. I hope that bug you were working on is simple enough that you can reproduce it with a unit test!
LOL Rust problems!
Now if only Rust had resumable exception handling, could be incrementally compiled, allowed function and type redefinitions at runtime, and wasn't designed as if we were still in the punched card era but with files instead of cards. That would give you far more of a productivity boost than a type checker.
Why is this an issue? It's generally quite rare to leverage laziness (like with infinite lists and generators or something). And it causes errors to emerge away from the function callsite.
I haven't thought about this issue much.. are there other issues with eagerness?
In a functional language `map` and `filter` are nice to chain together. For instance in Haskell:
[1, 2, 3] & map (* 3) & filter (> 4) & sum
This takes the array, maps it, filters it, and sums it in one go. If the array were, say, 1,000,000 elements and you used multiple `map`, `filter`, and similar functions, you'd soon run out of memory (or use much more than needed) if they were eager and allocated a million-element array at each step, as opposed to running through the array once.
The alternative is to use imperative style (for loops and such), but the functional style is often more declarative and easier to read and write (think SQL).
I guess even after all this time using these functions I'm still not entirely comfortable with the lazyness. I think a more explicit method in Clojure is to use `eduction` (if I'm not misunderstanding!)
https://clojuredocs.org/clojure.core/eduction
This allows you to be more explicit and say you want to chain these transducers together.
The lazy variant makes it more difficult to reason about performance (and this being a "math" language - that seems a priority) b/c you need to read the tea leaves to see where the sequence gets realized. It really requires a gear shift in reasoning about your code b/c you need to think "okay, I'm writing this huge reduction, but I don't care about its size b/c it won't all get realized b/c I did this other thing later on". My gut feeling is it's a bit hard to keep track of everything :) But I'd be curious to hear an alternative..
I remember writing some math code and finding it was really challenging to get the lazy operations to chain together well. It was over a year ago, so the details are a bit hazy.. but I needed it to run down an array of numbers and do some operations and this lazy method ended up being incredibly slow. Unfortunately I don't remember the ultimate cause, but I'm guessing it thrashed the cache. It could also just be a limitation in Clojure internals when it comes to compositing lazy seq's - but again, I'm not quite sure.
The issues with the type system and weak static analysis are the biggest gripes for me. The other features of Julia still make using it worthwhile, but it's rather frustrating to code compared to any language with good static type inference and a powerful type annotation system.
That's one of those things that is never going to change. Julia's very specific type system is one of the major features, and so much of the language and the ecosystem is built around it.
Can someone from the core team chime in on the general thinking around the type system/trait stuff? Most of the rest of the complaints are actively being addressed or have been thoroughly discussed, which leaves me unconcerned.
Haven't heard much about the type system stuff in years, or it's just been "duck type and that works ok". Is that the consensus? Honestly, that's a pretty valid perspective too. Julia works amazingly well and there's no other contender for the same combination of speed and composability. But I feel like things could be easier/smoother.
Is there a general design plan, or maybe there's a desire/intent to do something but it's going to require iteration down the line?
Not a core team member -- but this comment confuses me.
What exactly do you mean by "duck type and that works ok" (?)
Also, people are constantly asking for traits. But I don't think I've seen a compelling example where: this is something that you can only do with traits, but you can't do with multiple dispatch.
Perhaps what you're arguing for is the ergonomics of multiple abstract inheritance (?) So a concrete struct can inherit from multiple abstract types (?) So maybe it's not a comment on utility, but more ergonomics.
Can you elaborate on where the hole is? Jeff Bezanson's thesis work on Julia was undoubtedly PLT, as were some of the other folks at the Julia Lab now (e.g. Shuhei Kadowaki). Keno Fischer is very much into all things PLT, as can be evidenced by his work on Diffractor (see this ACM SIGPLAN presentation [1]). Ditto for James Fairbanks and all the people working on Catlab.jl's [2] ecosystem.
Where I do agree with you is that PLT awareness has not permeated the entire community as much as I'd like to see. Certainly one feels a need sometimes to push back against increased "Matlab-ification" of the language. However, it is also nice to see a direct line between rather theoretical concepts like abstract interpretation and very practical use cases like GPU computation.
I agree there are a number of interesting packages in this space (Metatheory is another).
What I'd like to see more of is things like arrows, higher-kinded data types, etc, that work better with support from the language's type system, and things like parametrized modules and contracts, that just need someone with the right background to show Julia how to do it well.
I'd like to see those as well, with the caveat that I've yet to see a language implement some of them performantly (e.g. arguments against higher-kinded types in Rust).
One thing I found about Julia which turned me off was that the library used for distributions used 64-bit floats while the one used for neural nets used less precision, so it's hard to connect the dots there. Also, there wasn't a lot of great tooling for RL.
If you specify a distribution using, say, Float32s, you'll get a distribution in Float32s. I'm guessing the machine learning example used Float32s so they would better take advantage of a GPU. It's almost always true in Julia that you can combine your choice of numeric type with other neat packages without the author of either one being aware of the other. Want to simulate ODEs in Float32 for speed on GPU, in Float64 for GPU, or if you have an exotic need for high accuracy use something bigger but slower, you can do it.
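A small sketch with Distributions.jl (whether every sampler preserves the element type has varied historically, so it's worth checking for the distributions you use):

```julia
using Distributions

d = Normal(0.0f0, 1.0f0)   # parameters given as Float32 literals
typeof(d)                   # Normal{Float32}
partype(d)                  # Float32: the parameter type travels with the value
```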
> Here is a fun challenge for anyone who thinks "it can't be that bad": Try to implement a TwoWayDict, an AbstractDict where if d[a] = b, then d[b] = a
I've done the exercise, let me walk you through it.
First, there is no inheritance of concrete types, only composition, so I wrote a wrapper:
struct TwoWayDict{K, V, D <: AbstractDict{K,V}} <: AbstractDict{K,V}
d::D
end
import Base: setindex!
function setindex!(d::TwoWayDict, value, key)
    d.d[key] = value
    d.d[value] = key
end
julia> d = TwoWayDict(Dict{String, String}());

julia> d["a"] = "b"
"b"
julia> d
Error showing value of type TwoWayDict{String, String, Dict{String, String}}:
ERROR: MethodError: no method matching length(::TwoWayDict{String, String, Dict{String, String}})
Hm, so length is missing. Let's forward it to the wrapped dict (along with iterate, which display also needs):
julia> import Base: length, iterate

julia> length(x::TwoWayDict) = length(x.d);

julia> iterate(x::TwoWayDict, args...) = iterate(x.d, args...);
julia> d
TwoWayDict{String, String, Dict{String, String}} with 2 entries:
"b" => "a"
"a" => "b"
It should be documented better, but without leaving the REPL it's not that difficult to implement the basics - just some function forwarding, really.
Now if you ask: did you implement the full interface? I don't know [1], but I do like that julia doesn't force me to implement things I don't need in my code. By not being strict about interfaces, julia solves the expression problem [2], and in some cases this is very useful. E.g. if I want to create a number type and I only want to implement +, -, *, /, and potentially conversion and promotion, then I can do that, but I don't have to worry about exponentiation, trig functions, or whatever people think is encapsulated by the notion of number.
[1] Well, I forgot to write a constructor to enforce the d[a] = b and d[b] = a invariant on instantiation, and there's no support for deletion by key.
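For completeness, here is one way those two gaps might be patched, along with iterate forwarding so generic code can walk the entries. This is a sketch of my own, not the canonical interface:

import Base: iterate, delete!

# Forward iteration to the wrapped Dict so collect, keys, etc. work.
iterate(t::TwoWayDict) = iterate(t.d)
iterate(t::TwoWayDict, state) = iterate(t.d, state)

# Convenience constructor that establishes the invariant up front.
function TwoWayDict(pairs::Pair...)
    inner = Dict(pairs..., (last(p) => first(p) for p in pairs)...)
    return TwoWayDict(inner)
end

# Deletion by key removes the mirrored entry as well.
function delete!(t::TwoWayDict, key)
    if haskey(t.d, key)
        delete!(t.d, t.d[key])  # drop the mirror entry
        delete!(t.d, key)       # no-op if key mapped to itself
    end
    return t
end

And to make the number-type point concrete, a toy type (made up purely for illustration) that implements only + and * and nothing else:

import Base: +, *

# Integers mod 7; deliberately implements only the operations it needs.
struct Mod7 <: Number
    x::Int
    Mod7(x::Integer) = new(mod(x, 7))
end

+(a::Mod7, b::Mod7) = Mod7(a.x + b.x)
*(a::Mod7, b::Mod7) = Mod7(a.x * b.x)

Mod7(5) + Mod7(4)   # Mod7(2)
# sin(Mod7(1)) would throw a MethodError, and that's fine:
# nothing forced us to implement it.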
This is one of those comments that’s so instructive I’m going to save it, because I’ll probably want to refer to it in the future. Thanks for taking the trouble.
Ok... I read that. I have to say it's not very intuitive. I like Julia overall, but this is an area where I scratch my head and say "Hmmm... why would they do that?" (performance, I'd guess, but it's not very user friendly). Is there a set of guidelines for translating Python/numpy code to Julia? Indexing in Python/numpy works a lot differently (and seems more intuitive).
I will also say, a number of times I've thought something like "why would they do that?", and some months later (I'm kinda slow) I would realize that it now makes sense.
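For what it's worth, the index-translation rules that tripped me up fit in a few lines (informal notes, with the numpy equivalents in comments):

v = [10, 20, 30]

v[1]           # first element     (numpy: v[0])
v[end]         # last element      (numpy: v[-1])
v[1:2]         # [10, 20]; ranges include both endpoints (numpy: v[0:2])
v[end-1:end]   # last two elements (numpy: v[-2:])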
Sorry, slightly off topic, but as someone not really versed in Julia, who sees it frequently on HN (and, honestly, nowhere else, though that could just be the industry I'm in): what would you say the closest "competitor" to Julia is?
Or rather, if I am currently using programming language X to do Y, what would be the values for X and Y that should make me really want to consider Julia instead.
Using Python for any kind of high-performance computing: array-based operations (especially sequences of operations over one array, which Julia's broadcasting fuses into a single pass), distributing work over several processors, moving data to the GPU, or even just wanting your for-loops to be fast. All of these are easier in Julia.
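As a small, hypothetical example, a naive loop like this compiles to tight native code in Julia, while the equivalent pure-Python loop would crawl:

# Sum of squares with an explicit loop; no vectorization tricks needed.
function sumsq(xs)
    s = zero(eltype(xs))
    for x in xs
        s += x * x
    end
    return s
end

sumsq(rand(10^6))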
Matlab, Octave, R, Perl, Python, in order. Especially consider it if you're a scientist, engineer or mathematician - while it's general-purpose, it's really good at these domains.
The only other language I'm familiar with that has multiple dispatch is Common Lisp, and it has the same problem of unenforceable and undiscoverable interfaces.
I'd be very interested to see a good solution for interfaces with multiple dispatch. Perhaps the popularity of Julia will lead to a good solution.
I have to admit I never really dug into Julia.
When it was new I read about it while looking for a replacement for Python/NumPy.
But I was really repulsed by the fact that array indices don't start at 0.
I plan to train an NLP bot from Hacker News 1-based indexing discussions to automatically generate the mandatory thread on it so that nobody ever has to participate in it again. It's so unoriginal that it's a better fit for ML automation than a human brain.
Also these comments just tell us about the background of the commenter, and nothing else. Anyone who has had to translate numerical or scientific algorithms into both C and pre-array syntax Fortran finds Fortran’s 1-based indexing to be a more natural fit, I would suspect.
In practice, indexing is the least of anyone's problems. I switch between 0- and 1-indexed languages a lot and it's never an issue, I don't even have to think about it.
I disagree. When immersed in a language it doesn't tend to be an issue, but switching between them absolutely is. Over my career I've had to deal with a ton of off-by-one bugs in code that was prototyped in Matlab then ported to C++. My hope with Julia was that the language was fast enough that you wouldn't have this divide between prototype and production, but given the overhead in using it anywhere but a REPL, it doesn't look like things are going to work out that way.
On its own, though, 1-based indexing certainly isn't enough to keep me from using a language.
This really isn't a big deal. I switch between C/C++/R all the time, and mixing zero/one indexing isn't some huge mental burden or a source of intractable bugs; you get used to it after a few days.
It's probably just a huge bias, but I feel that as a programmer matures, gains experience, and broadens his or her horizons regarding programming languages, he or she travels a road from dynamically typed languages to statically typed ones, often passing through Lisp as an intermediate step.
That's what a programmer mid-way to maturity would do.
When the programmer finally reaches enlightenment, they see that static types have their own burden, and Lisp or Smalltalk or Forth or Erlang can be the final step, not some gateway drug to static types. Heck, C and C++ ain't bad themselves.
I don't think this is universally true, no. Historically there have been large migrations of developers from traditional statically typed languages towards Ruby/Python/JS/Groovy/Clojure. In fact, Clojure was designed by, and is used by, highly experienced developers fed up with the added complexity of type systems.
By "traditional statically typed languages" I assume you mean things like C or Java, whereas I'm talking more about Haskell-like type systems. Perhaps that's really an optional first step in the journey: from [Java to] Ruby/Python/JS/Groovy/Lua [to Lisp] to a language with a real type system. The further along you go, the more mathematification your programming undergoes.
I think this is a bias. There has been a move from static Java/C++/C# to dynamic Python/Ruby/PHP/JS, but now people are adding types back to Python, Ruby, PHP, and JS. There is also a move from Java-style static typing to SML-like static typing (Rust, for example). Static vs. dynamic is not the only way to categorize languages, and it hides some important things.
But doesn't the fact that people are adding types to dynamic languages, and the fact that TypeScript is a very popular way to write Javascript, and, as you say, that people are moving from terrible static type systems (Java) to better ones (Rust) prove my point?
I think some people could (and will) argue that you haven't tested real dynamism until you've used Smalltalk or Common Lisp and that there could be a shift from Typescript/Rust to these. That's not my opinion, but it seems fair. There are often cycles like that, with each side getting better at each cycle.