Why I still recommend Julia (huijzer.xyz)
219 points by amkkma on June 26, 2022 | 204 comments



It seems like the author isn't really defending against the bulk of the criticisms, which revolve around correctness and reliability, and instead highlights 'cool' language features. If you can't trust gradients or probability distributions / sampling to be correct, that's a pretty damning verdict for a language aimed at numerical calculations. The fact that it has some traction in academia (where the link from incorrect computation results to real-world consequences is extremely long-winded and hard to attribute) is worrying from that perspective. Imagine Boeing were to use Julia for its computations (not unlikely given that it has a great reputation for differential equation solvers).


> If you can't trust gradients or probability distributions / sampling to be correct, that's a pretty damning verdict for a language aimed at numerical calculations.

If you draw the distinction between Julia the language and Julia the ecosystem, then that's not a damning verdict for the language, even if we accept the premise that you can't trust the computations you mentioned.

The way that Julia composes is unprecedented. Typical examples include computing with measurement uncertainty, or big floats through functions that were not explicitly designed to support those types. This is great, but the correctness of this composition cannot be assumed.
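A minimal illustration of that kind of composition, using only Base Julia (the function name here is hypothetical): a routine written with ordinary `Float64` in mind accepts `BigFloat` arguments with no glue code, because nothing in its definition restricts the types.

```julia
# Hypothetical routine written with Float64 in mind; note that the
# signature does not restrict the argument types.
relative_error(approx, exact) = abs(approx - exact) / abs(exact)

relative_error(3.14, pi)            # the intended Float64 use
relative_error(big"3.14", big(pi))  # BigFloat flows straight through
```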

One good proxy for how this composition occurs is the number of type combinations a function is called with. Here is a concrete, made-up example. There is a function you've been using without any problem that takes an argument of type `DenseArray{Float64, 2}`. (Meaning, the function gets called with an argument of that type. The function definition is not constrained to work on that type.)

If you've never focused on the correctness (e.g. reviewing the code, unit tests) of this function to see what it does on `SparseArray{BigFloat, 2}`, I would not take it for granted that it works.
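To make the made-up example concrete (`colmeans` is a hypothetical stand-in; in Julia the concrete sparse type is `SparseMatrixCSC`): dispatch happily routes the untested combination through the same method.

```julia
using SparseArrays

# Hypothetical function: only ever called with dense Float64 matrices so far,
# but the signature accepts any AbstractMatrix.
colmeans(A::AbstractMatrix) = sum(A; dims=1) ./ size(A, 1)

colmeans(rand(3, 3))                 # the exercised, trusted path
colmeans(sparse(big.(rand(3, 3))))   # sparse BigFloat: compiles and runs, but untested
```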

This is perhaps too big a burden on a user. Some might say they'd rather not have the compositionality unless it's guaranteed to be correct. But, if you're actually asking me to imagine using Julia in a situation where I really want to be sure about the correctness of a Julia program, I would at the very least attempt to characterize all the expected types that go into any function. If at runtime a function is called with a new type, I would log that, and possibly error out.

Note this is related to, but not equivalent to what's necessary to do "static" compilation in Julia.


> This is perhaps too big a burden on a user. Some might say they'd rather not have the compositionality unless it's guaranteed to be correct.

I think this is it. Or rather, I think to some people, myself included, compositionality should imply correctness. If you /can/ compose things, but the result is wrong, can you really say that you've got composition?

How useful is it really to be able to plug your own types into other people's libraries if you have to trace through the execution of that library, and figure out which libraries they transitively use, to ensure that all of your instantiations are sound? How do you even test this properly?

It's a really hard problem, and from what I can tell, Julia gives you no tools to deal with this.

[0] is probably relevant here, although I'm not sure I share the positive outlook.

[0]: http://www.jerf.org/iri/post/2954


We need to think about alternatives. It's easy to find issues and weaknesses in what Julia does, but we should consider what we would do with a different language to make a fair comparison.

If you don't want composition, then there's no issue and Julia can be as weak/strong as other languages.

If you do want composition, then I see two ways (but I'm sure there are others): you do the "typical" thing with glue code or you use the more "automatic" way that Julia provides. Which one is better? If this is too subjective: Which one is more correct?

Yes, Julia can propagate errors in unexpected ways, but how would you implement this in another language? You'd probably have to spend X hours writing glue code and Z hours writing tests to make sure your glue code is working. This also raises issues with maintainability when one of the two packages you're connecting/composing changes.

Julia offers a reduction in the X hours of glue code (sometimes with X = 0) and perhaps a similar reduction in the Z hours of test code. The maintainability, I'd argue, becomes easier.

The cost is the unknown unknowns that might creep up when doing this composition. My (extremely) subjective sense is that this doesn't happen that often to me (I don't usually pass approximation intervals through sparse matrices through auto-differentiation into neural networks), which means the benefits outweigh the cost in this regard. YMMV

Edited a few things for clarity


'Pro: saves X amount of coding hours. Con: may silently return wrong results.' is a terrible philosophy. It's like saying 'well, the surgeon messed up, but at least the surgery was cheap'.


There's nothing more silent about it than any other bug that arises in a similarly dynamic language. People use Python over C++ in large part because it saves X amount of coding hours, and comes with different kinds of bugs that can "silently return wrong results".


It seems you've picked the weakest version of what I said (which means communication becomes less likely)

Two points

1) The glue code you HAVE to write to make the composition work in other languages, and the maintainability issues that come with it, may also introduce wrong results.

2) MAY doesn't mean it will, and we trade off speed/convenience against risk in many other areas. Surgeons (or the health system in general) do trade off % of success for cost/speed, simply because they don't have the resources to do everything / spend 100 hours on everyone.


I don't have experience with using Julia in really large code bases but my intuition has always been that the combination of these design characteristics in one language:

- easy unrestricted composability

- lack of well-defined interfaces

- lack of an effective way to use the strong typing system to validate correctness

is not a very good idea.


I think Julia (mostly meaning pervasive multiple dispatch, which is unprecedented at the scale at which Julia uses it) unlocks a new way to organize programs.

Some formalized notion of "interface" is certainly important, but it seems no clear formulation has emerged.

I think it's fairly consistent with the dynamic nature of Julia and fairly inconsistent with the static nature of Julia. I don't know what a good solution to the interface problem is.

I think one can get pretty far with writing tests to check interfaces. If you are a library that expects user-defined types, you can expose an interface to these tests so that a user can check if they've implemented everything.

This is a very generic approach, and aside from the key limitation of only giving results at runtime, is much more powerful.
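As a sketch of that idea (the function name and the checked properties are made up for illustration), a library could export something like:

```julia
using Test

# Hypothetical test suite a library might export so that users can check
# that their custom type satisfies the iteration-style interface it relies on.
function test_iteration_interface(x)
    @testset "iteration interface for $(typeof(x))" begin
        @test applicable(iterate, x)          # can we start iterating?
        st = iterate(x)
        @test st === nothing || st isa Tuple  # the iterate contract
        @test applicable(length, x)           # assumes a sized collection
    end
end

test_iteration_interface([1, 2, 3])
```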


Testing is certainly general but it's high-effort and it's easy to miss corner cases. I think type checking + some testing is going to beat testing alone in most scenarios.


> If you draw the distinction between Julia the language and Julia the ecosystem

Language and ecosystem are so intertwined that I almost never do this. Maybe when there are multiple well developed ecosystems, but when there’s only one, then it and the language are inseparable to me (for any language, not just Julia).


I wonder if you feel that way as equally about C and Lisp as you do about Julia.


I remember Julia being pitched as a competitor especially for Matlab, but also R, ... much more so than for C or Lisp. So it seems like a fair comparison, and those are definitely to some extent seen as identical with their package ecosystems (in Matlab, most packages actually come from the same vendor, and base R utility is a tiny fraction of the utility of its packages).


Thanks for the explanation. Yeah, that is an interesting feature. Sounds like you'll have to do a lot of reading documentation and testing other packages yourself then. A typing system that allows defining accepted, rejected and 'other' data types for functions (and a compiler flag for either warning about the latter or not) would be interesting.

But the original critique also mentioned several correctness issues in the core language itself, so the issue does seem to go deeper.


There are several correctness issues lurking about in virtually all software. That part of the critique is uninteresting to me. If I want to pull up a blog post about several correctness issues in any programming language, I bet I could.

Also depends what you count as correct, see undefined behavior.


> Also depends what you count as correct

I feel that should be the job of the authors to think about what correctness means for their application, and under what circumstances they can give a correct answer. Ideally, also warn the user when the inputs don't fall within that domain or give them options to handle such cases.

Let's talk about an example: say you have a neat algorithm for fast multiplication of sparse matrices, approximately correct up to a defined Frobenius error. Say we have some reasonable treatment of dimension mismatch and NaN, inf,... with option flags already. Now if I wanted to know whether your algorithm has the same numerical guarantee for a dense matrix, I'd have to read your paper, run some tests, ask around, and I might never be quite sure.

It seems much more sane to put the onus on the authors to be explicit about the behaviour in that case. Your and others' comments seem to put the onus mostly on the user for thinking about the behaviour of others' code. That does not seem like a good practice, especially in the context of complex math and nested composition.
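One hedged sketch of what "being explicit" could look like in Julia (the function name is hypothetical): restrict the fast method to the types it was analyzed for, and fail loudly rather than silently on everything else.

```julia
using SparseArrays

# Hypothetical: the Frobenius error bound was only established for sparse
# inputs, so the author states that in the dispatch itself.
fast_mul(A::SparseMatrixCSC, B::SparseMatrixCSC) = A * B
fast_mul(A::AbstractMatrix, B::AbstractMatrix) =
    error("fast_mul: no accuracy guarantee established for $(typeof(A)) and $(typeof(B))")
```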


You could do something like this:

  plus_one(x::Number) = x + 1
  function plus_one(x)
    @warn "untested type usage of plus_one"
    x + 1
  end
That's cumbersome, but a macro could make it less annoying.


That is not a full solution to the problem of detecting untested types. What if "plus_one" works for Int, Float32, etc. but not with MyNumber, all of which would dispatch to "plus_one(::Number)".
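Concretely (with a hypothetical `MyNumber`): any subtype of `Number` dispatches straight to the trusted-looking method, so the warning fallback never fires.

```julia
plus_one(x::Number) = x + 1
function plus_one(x)
    @warn "untested type usage of plus_one"
    x + 1
end

# Hypothetical user type; its arithmetic may or may not behave as plus_one assumes.
struct MyNumber <: Number
    val::Float64
end
Base.:+(a::MyNumber, b::Integer) = MyNumber(a.val + b)

plus_one(MyNumber(1.0))  # hits plus_one(::Number); no warning is emitted
```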


A detailed discussion is on Julia discourse: https://discourse.julialang.org/t/discussion-on-why-i-no-lon...

Bugs are everywhere - including Python and R libraries. You would not design a safety critical system in Python using libraries and codepaths that are not heavily tested. Even then, you still have to carefully design testsuites that give you confidence that what you are computing is what is expected.

See these talks on how a collision avoidance system was designed with Julia. https://www.youtube.com/watch?v=rj-WhTL_VXE https://www.youtube.com/watch?v=19zm1Fn0S9M


I'm sure Boeing uses Julia somewhere for its computations... and they have extensive testing to support that. I recall when the PR folks at SAS, a statistical tool, said that nobody would trust a Boeing plane that had been developed using R (an open source alternative to SAS), and Boeing replied that they used R to design airplanes.


Besides that, the SAS vs R debate is still ongoing. I heard the same point made in favour of SAS in public health, clinical trials analysis and setup, and various public administrations. They trust the algorithms provided by a closed-source certified vendor over an open source alternative.

At the same time, correct implementation of statistics is hard and testing it to be correct is harder.


The story you heard about health is exactly backwards. I had also heard that FDA approvals depended on SAS, but I spoke more recently with somebody who does this, and it turns out both the submitters and the FDA use R, not SAS. There have definitely been communities of stats people who "trust" only certain codes or routines, but generally the trend has been towards open implementations of the code, as it allows more experts to inspect the workings when they detect problems.


There are working groups under the R Consortium umbrella that work on FDA-related tasks with a focus on R [0]. Folks from the pharma industry and FDA are involved.

One of the working groups is dedicated toward submissions: R Submissions Working Group [1]. They recently announced: "The R Consortium is happy to announce that on Nov 22nd, 2021, the R Submissions Working Group successfully submitted an R-based test submission package through the FDA eCTD gateway! The submission package has been received by the FDA staff who were able to reproduce the numerical results." [2]

[0] https://www.r-consortium.org/all-projects/isc-working-groups

[1] https://rconsortium.github.io/submissions-wg/

[2] https://www.r-consortium.org/blog/2021/12/08/successful-r-ba...


Preach, but I should have added that my experience was not in the US but in Europe.


Well, it's also not like Boeing didn't have some major engineering f*ups in recent years, but we're going on a tangent here.


Can you point to a concrete example of one that someone would run into when using the differential equation solvers with the default and recommended Enzyme AD for vector-Jacobian products? I'd be happy to look into it, but there do not currently seem to be any open correctness issues in the Enzyme issue tracker (3 issues are open but they all seem to be fixed, other than https://github.com/EnzymeAD/Enzyme.jl/issues/278 which is actually an activity analysis bug in LLVM). So please be more specific. The issue with Enzyme right now seems to be more about finding functional forms that compile, and it throws compile-time errors in the event that it cannot fully analyze the program or if it has too much dynamic behavior (example: https://github.com/EnzymeAD/Enzyme.jl/issues/368).

Additional note, we recently did an overhaul of SciMLSensitivity (https://sensitivity.sciml.ai/dev/, as part of the new https://docs.sciml.ai/dev/) and set up a system which amounts to 15 hours of direct unit tests doing a combinatoric check of arguments with 4 hours of downstream testing (https://github.com/SciML/SciMLSensitivity.jl/actions/runs/25...). What that identified is that any remaining issues that can arise are due to the implicit parameters mechanism in Zygote (Zygote.params). To counteract this upstream issue, we (a) try to never default to Zygote VJPs whenever we can avoid it (hence defaulting to Enzyme and ReverseDiff first as previously mentioned), and (b) put in a mechanism for early error throwing if Zygote hits any not-implemented derivative case, with an explicit error message (https://github.com/SciML/SciMLSensitivity.jl/blob/v7.0.1/src...).

We have alerted the devs of the machine learning libraries, and from this there has been a lot of movement. In particular, a globals-free machine learning library, Lux.jl, was created with fully explicit parameters https://lux.csail.mit.edu/dev/, and thus by design it cannot have this issue. In addition, the Flux.jl library itself is looking to do a redesign that eliminates implicit parameters (https://github.com/FluxML/Flux.jl/issues/1986). Which design will win in the end is uncertain right now, but it's clear that no matter what, the future designs of the deep learning libraries will fully cut out that part of Zygote.jl. Additionally, the other AD libraries (Enzyme and Diffractor for example) do not have this "feature", so it's an issue that can only arise from a specific (not recommended) way of using Zygote (which now throws explicit error messages early and often if used anywhere near SciML because I don't tolerate it).

tl;dr: don't use Zygote.params; the ML AD devs know this, and its usage is being removed ASAP.

So from this, SciML should be rather safe and if not, please share some details and I'd be happy to dig in.


Just wondering...If gradients or probability distributions are an issue for Julia, can't they just be fixed some day in the future?


Yes, those fixes don't inherently require the Julia language to change at all.


I really like Julia, and I think this article does a great job of defending its design decisions. But I'm a bit disappointed that it doesn't address Yuri's main criticism, which is that there's some worrying correctness bugs in Julia that aren't addressed as well as they should be.

I'm no language expert, so I can't tell if Yuri's criticisms are just a function of the smaller number of people working on Julia vs a very popular language like Python, or whether the failings he discusses stand out because so many things Just Work in Julia, or whether it indicates some much deeper problem, so I've gone back to look at the original thread - https://news.ycombinator.com/item?id=31396861

I'm interested to hear what others think of Julia's longevity prospects. Its popularity in climate modeling and other scientific contexts suggest it will be around for the long haul, and that Julia projects needn't end up as abandonware for lack of footing. Is that accurate?


Yuri's criticism was not that Julia has correctness bugs as a language, but that certain libraries when composed with common operations had bugs (many of which are now addressed).

I will recommend following the discussion on the Julia discourse here that is focussed on productively and constructively addressing the issues (while also discussing them in the right context): https://discourse.julialang.org/t/discussion-on-why-i-no-lon...

Just like any other open source project, Julia has packages that get well adopted, well maintained, and packages that get abandoned, picked up later, alternatives sprout and so on. The widely used packages usually do not have this problem. Overall the trend is that the user base and contributor base are growing.


I think you must be misremembering. There were several issues linked in his post that were bugs in the core language itself.

I can find you several more, if you want. In the last 2 years I've myself filed something like 6 or 7 correctness bugs in Julia itself (not libraries), and hit at least 2 dozen, whereas I've never found a correctness bug in Python despite using it daily for 5 years.

Right now, you can go to the CodeCov of Julia and find entire functions that are simply not tested. There are many of those, in plain sight. And it would take less than an hour to find a dozen correctness bugs that are filed, known about, agreed to be bugs, tractable, yet still not put on the milestone for the next Julia release, which means the next Julia release will knowingly include these bugs.

I just don't know how people can see these facts and still claim Julia cares a lot about correctness. It's just not true.

If you want something actionable, here are three suggestions:

1) Do not release Julia 1.9 until codecov is at 100% (minus OS-specific branches etc.)

2) Solicit a list of tractable correctness bugs from the community and put all the ones that are agreed to be bugs and that are solvable on the 1.9 milestone.

3) Thoroughly document the interface of every exported abstract type, the print/show/display system, and other prominent interfaces, do not release 1.9 before this is done.

Edit: I apologize for implying you were not being genuine. That was uncalled for.


I carefully mentioned that the issues (with the exception of a type intersection bug which is in the language, but was characterized as a control flow bug) are not core language issues. Julia ships with a very large standard library, and people often lump all issues in base Julia as "language issues".

I know you have yourself filed dozens of issues, many of which have been fixed. I feel it is unfair to characterize years of work by a community of people as: "Julia does not care about correctness". There's an open triage meeting that happens every other week, where all new issues are discussed and triaged. There is a fairly detailed and well-defined release process. I don't believe people are holding back on filing bugs, because they are waiting for us to solicit.


I would argue #39460, #39385 and #39183 mentioned in the post are all correctness bugs in Julia itself.

I feel there is this weird disconnect between what is being said by e.g. Yuri, me or Dan Luu, and what is being heard by some of the core devs, and I don't understand where this disconnect is happening precisely. That is very frustrating.

I think delving into the issues with communication will only turn sour with no benefit, so let's not go there. Instead, let me be much more concrete. When I say that Julia does not put correctness as a high priority, what I mean is:

* Julia is not well tested, as can plainly be seen from the code coverage. Having all functions covered by the test suite is the absolute minimal standard of testing - I would argue that is not sufficient to consider something well tested considering Julia's generic methods. But even covering all methods with tests is still not done for Base Julia.

* When I file a bug that is eminently fixable e.g. #43235 or #43245 (or several others), it is not being fixed after months, it is not being milestoned, and new releases are being pushed out that contains the bug.

Do these two points not illustrate that more could be done to reduce the bugginess of Julia? I legitimately don't understand how one can hold the view that these two issues are not a reflection of correctness not being prioritized in Julia.

There is a broader point here about how Julia's language design and lack of safety features or tooling makes it very difficult to write correct code and enforce the correctness of it. But I feel if we can't even agree that Julia ought to have all its exported methods covered by tests, and all its bug reports fixed, then I can't see how we can have a discussion about the more complex and nuanced topics like how to enforce interfaces or contracts.


My comments have been mainly to address the nature of the conversation here and to provide some balance. Specifically, most readers of HN are not deeply steeped in the nature of issues, and the overly broad language in some of the comments can easily give the wrong impression.

It is easy to pick a subset of bugs and weave a particular narrative. You have filed several issues, many of which are still open, and many have been addressed. Thousands of bugs are fixed for every release, including many you have filed - and Julia is better as a result.

Should Julia be better tested, yes it should be. Should we have better tooling, of course we should. Can the triage process be improved, yes. Should code coverage get to 100% - it has steadily increased over time. None of Julia's dependent libraries have 100% code coverage. Many of those projects are even larger than Julia itself. Can every possible bug identified be fixed - we would like to - but eventually there is limited developer time and everything has to be prioritized.

I will once again mention that the triage process is not a secret process. I welcome you to join some of the triage calls to give higher visibility to issues that you feel should be fixed (but are unable to provide PRs yourself for).


Correction: It is not that every new issue is triaged on the triage call. Issues are triaged by a group of people with triage access - and the triage call focusses on issues that have the `triage` label.


also, if you want to be able to mark issues for triage, ask. we are fairly liberal in who we give tagging permissions to.


All other languages and language implementations have bugs, too.


...and they correctly treat them as release-critical.


No. Take a look at the GCC or Clang bug trackers, for example.


It’s not just bugs that are the issue, it’s correctness bugs.


If I may Viral, I suspect one takeaway from Yuri's criticism (at least it was a takeaway for me) is that with multiple dispatch correctness bugs like the ones listed are hard to find (impossible even). How would you respond to that criticism?

In my opinion, better tooling to assist with such cases would help tremendously. Adding support for interfaces to `Base` would be a great start. What are your thoughts about this?

Also, it's been a while since we've seen a roadmap on what the core team is working on? What are the next big features we can expect from the language and what is the approximate timeline for that? Having answers to these questions would be extremely helpful.


The issue is not that the bugs are with correctness of multiple dispatch, but that multiple dispatch allows you to combine generic programming with abstract data types. Thus, one can have a generic implementation in base Julia, and someone can pass a new user data type - a combination that can easily not work. Some of the frustration also arises from types such as OffsetArrays that are included in the base distribution, but not as well supported and tested as the regular Arrays type. Thus, the discussion here tends to focus on defining interfaces, and of course on better testing of uncommonly used data types.

In general, we've not had a formal roadmap - but we present a "State of Julia" talk at JuliaCon every year. Very broadly, the list (off the top of my head) includes: improving a lot of the underlying compiler infrastructure overall, improving support for differentiable programming, improving garbage collection, support for GPUs from multiple vendors (too many of those now), supporting Apple silicon, and type system support for tools like JET.jl.

NEWS.md is generally updated during the course of a release cycle, which eventually becomes release notes, and then post release, we put together a highlights blog post. https://github.com/JuliaLang/julia/blob/master/NEWS.md


Thanks.

regarding "improving a lot of the underlying compiler infrastructure overall"

Is the compiler plugin project still active/ planned?


It is still planned, but I'll defer to Keno and others to chime in on the details.


>If I may Viral, I suspect one takeaway from Yuri's criticism (at least it was a takeaway for me) is that with multiple dispatch correctness bugs like the ones listed are hard to find (impossible even).

Maybe one approach would be for the community to create some certification tests. Although this wouldn't help with corner cases, it would allow packages to run some testing that (if the tests are good) might throw up problems with composability. Also, if there were bugs from composition, the certificate could be suspended until they were fixed.


I may be missing some context, but what correctness bugs exactly are you referring to that are caused by multiple dispatch?


Dependently typed arrays/lists/vectors with explicit lower/upper bounds would've prevented the erroneous assumption that you can iterate over any axis from 1 to length inclusive.
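Even without dependent types, this bug class is visible in plain Julia once offset axes enter the picture (OffsetArrays is the package discussed in the original post); `eachindex` is the generic fix:

```julia
using OffsetArrays

function unsafe_sum(v)   # assumes indices run from 1 to length(v)
    s = zero(eltype(v))
    for i in 1:length(v)
        s += v[i]
    end
    return s
end

function safe_sum(v)     # uses the array's actual indices
    s = zero(eltype(v))
    for i in eachindex(v)
        s += v[i]
    end
    return s
end

v = OffsetArray([1, 2, 3], -1:1)  # indices are -1, 0, 1
safe_sum(v)    # correct
unsafe_sum(v)  # throws a BoundsError here (or silently misbehaves under @inbounds)
```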


True in principle, but that is neither related to multiple dispatch nor a common feature in other languages. Dependent typing is very much active research in PL academia. The @inbounds kerfuffle was all about incorrectly using a tool that's explicitly documented to take off the existing bounds checking safety, similar to `unsafe` blocks in Rust. So even with dependent types, if you explicitly opt out of those checks, they wouldn't have helped.


@blindseer I recommend you start writing tests for your code. Tests are important; you should not skip them, Julia or no Julia.


Did you mean to reply to me? Or perhaps another comment of mine? If not, I don't follow.

Edit:

It looks like abhimanyuaryan is blindly spamming suggestions to write tests in this thread (???). This is a case where I agree in principle but their comments are completely beside the point and not relevant to the context of the comment or the post.


Sorry for the confusion; I was just pointing to one of the recommendations from the blog post that I thought you might want to reconsider to overcome "multiple dispatch correctness bugs", i.e.: "I'd say that there is a huge combination of packages that can be used together providing the language with an enormous amount of possibilities. It is up to the programmer to verify the interaction before use and, preferably, add tests to one of the packages." Yuri's criticism of composability came from packages, as Viral already mentioned.


> Yuri's criticism was not that Julia has correctness bugs as a language

Are you sure? Here are some issues from the post:

"Wrong if-else control flow" seems like a language issue? bug is still open [0]

"Wrong results since some copyto! methods don’t check for aliasing" seems like a bug in a core library. The bug, which is filed against Julia, not some third-party library, is still open [1]

"The product function can produce incorrect results for 8-bit, 16-bit, and 32-bit integers" was a bug in a core library, which was fixed [2]

"Base functions sum!, prod!, any!, and all! may silently return incorrect results" seems like a bug in a core library and is still open [3]

"Off-by-one error in dayofquarter() in leap years" seems like a bug in a core library which was fixed [4]

"Pipeline with stdout=IOStream writes out of order" seems like a bug in a core library and is still open [5]

I've been deliberately conservative here and only posted the issues from Yuri's post that are in the JuliaLang/julia repository. The other issues are filed against JuliaStats/Distributions.jl, JuliaStats/StatsBase.jl, JuliaCollections/OrderedCollections.jl, and JuliaPhysics/Measurements.jl. Since I have not used Julia very much, I don't know whether these are commonly used libraries or obscure libraries nobody uses, but they seem pretty close to the core use-cases of the language. Maybe someone who uses the language a lot more can shed some light on this issue.

Some commenters seem exhausted by what they perceive as a continual stream of lies about these topics, which has left them less inclined to post about them.

[0]: https://github.com/JuliaLang/julia/issues/41096

[1]: https://github.com/JuliaLang/julia/issues/39460

[2]: https://github.com/JuliaLang/julia/issues/39183

[3]: https://github.com/JuliaLang/julia/issues/39385

[4]: https://github.com/JuliaLang/julia/pull/36543

[5]: https://github.com/JuliaLang/julia/issues/36069


At its core, yes, Yuri wanted to highlight the fact that the “power” of the language created more or less “a fractal” of type/function composition cases that made it very difficult to guarantee the correctness of a given call site. This is inherent in the language, but causes potential issues across the ecosystem and he felt that the community did not take it as seriously as he would have hoped. At least that is the takeaway I got from both talking to him and reading what he wrote.

To me, this is a long and complex discussion to be had by those that understand general programming language design and the case for Julia itself and frankly statements like “Some commenters seem exhausted by what they perceive as a continual stream of lies about these topics, which has left them less inclined to post about them.” are bloody cheap, unfalsifiable, and adds little to nothing to the discussion.


> At its core, yes, Yuri wanted to highlight the fact that the “power” of the language created more or less “a fractal” of type/function composition cases that made it very difficult to guarantee the correctness of a given call site. This is inherent in the language, but causes potential issues across the ecosystem and he felt that the community did not take it as seriously as he would have hoped. At least that is the takeaway I got from both talking to him and reading what he wrote.

For things like "The majority of sampling methods are unsafe and incorrect in the presence of offset axes" I agree that this is just some unfortunate combination of library code and concerns that the library authors had not considered, but the numerous correctness issues in the language and stdlib seem like they would often make it hard to figure out what exactly the problem is.

> bloody cheap, unfalsifiable, and adds little to nothing to the discussion.

Sorry about that. I'm not sure how to highlight the recurring nature of these less-than-factual posts and their effect on some members of the community while respecting the wishes of those people.


> For things like "The majority of sampling methods are unsafe and incorrect in the presence of offset axes" I agree that this is just some unfortunate combination of library code and concerns that the library authors had not considered, but the numerous correctness issues in the language and stdlib seem like they would often make it hard to figure out what exactly the problem is.

I am not sure what we are arguing here. You asked what Yuri originally wanted to highlight and I answered based on my own insights. I have re-read what he wrote and I still stand by my conclusion and I see no point in arguing back and forth as I have stated here and in a comment when Yuri’s post was originally submitted [1] that all the issues are derived from how Julia as a language handles types and dispatch. Everything beyond this is arguing minute semantics and frankly very uninteresting compared to a discussion as to whether it can be resolved and how.

[1]: https://news.ycombinator.com/item?id=31398040

> Sorry about that. I'm not sure how to highlight the recurring nature of these less-than-factual posts and their effect on some members of the community while respecting the wishes of those people.

Not sure if I can consider this an apology when you in your very next breath make exactly the same kind of vague statement that I called out in the first place. Since you apparently know these people, how about collecting their opinions and presenting them anonymised? Likewise, if there are lies spread across so many posts, why not simply collect and refute them? This all seems obvious to me, but maybe I am missing something? In fact, I find posts such as yours in very stark contrast to what Yuri wrote and accomplished with his post, so perhaps you can find inspiration there?


If the project has cultural issues that would cause severe correctness issues regardless of the language's semantics, it may not be that useful to talk about technical solutions that would let other projects without these cultural issues make a language with similar semantics and fewer correctness issues.

> Since you apparently know these people, how about collecting their opinions and presenting them anonymised

This sounds like the sort of claim that would get called out as unfalsifiable online.

> Likewise, if there are lies spread across so many posts, why not simply collect and refute them? This all seems obvious to me, but maybe I am missing something?

Is the question really "Why don't you just do a lot of work for me, for free, since I will not use a search engine?"


> If the project has cultural issues that would cause severe correctness issues regardless of the language's semantics, it may not be that useful to talk about technical solutions that would let other projects without these cultural issues make a language with similar semantics and fewer correctness issues.

How can solving – or failing to solve – a technical issue that could then lead to insights shared between communities ever be a bad thing? Maybe I am just too thick to see it? But I am ending my efforts at this point as I am not getting any closer to understanding what you are trying to convey.

As for your “comebacks”, we clearly have very different standards and expectations when it comes to discourse. Where I come from, those that bring claims are expected to also bring adequate evidence – or at the very least attempt to do so and not spread claims just to later throw their hands into the air when called out for it.


[0] is already fixed on master but can't be backported to LTS due to the fix introducing other problems specific to LTS. [2] is closed. [3] should be fixed, but doesn't seem trivial to do. [4] is a regular ol' bug (again, not a language but a library bug). Same goes for [5].

From the linked stuff, only [0] is a true correctness problem of the language and not a bug in a library.

> Some commenters seem exhausted by what they perceive as a continual stream of lies about these topics, which has left them less inclined to post about them.

How is differentiating bugs in the language (parser, compiler, type system, ...) from bugs in libraries implemented in the language "lying about these topics"? It's not like Julia is claiming to prevent logic bugs, or am I missing something?


"Yuri's criticism was not that Julia has correctness bugs as a language" would be an appropriate thing to say if none of the bugs in the post were "Wrong if-else control flow."

Edit: "certain libraries when composed with common operations had bugs" also does not seem like a completely on-the-level way to describe bugs in the stdlib that do not require any particular composition to encounter.


If you look at the linked issue, you will see that it is a type intersection issue (a real language issue) mischaracterized as a control-flow bug. Many readers here seem to see that and believe that if statements in Julia are broken - no that is not the case.


As an end user it may be hard for me to accept that there are no control-flow bugs involved when this code

    println(flag)
    if flag
        println("flow for true.")
    else
        println("flow for false.")
    end
prints

  false
  flow for true.


I pasted that code into a REPL (including setting flag=false, which seems to be implied) and got:

  false
  flow for false.
How did you get your output? Your post implies that if statements are broken in Julia. Untrue.


It is from the top comment in https://github.com/JuliaLang/julia/issues/41096


That's valuable context, thanks for taking time to answer. Looking forward to JuliaCon 2022!


I think the language has amassed a nearly unassailable lead in differential equations, scientific machine learning, and (to a lesser degree) optimization. That lead hinges on advantages of the language, so I think others are unlikely to catch up. The story for working with GPUs is also really good compared to e.g. Python. I think those things alone are enough to keep the language around for a while.


What is your evidence? Especially machine learning is done a lot in industry and I think nobody has even heard of Julia outside academic circles.


I guess when GP says "scientific machine learning" he means "research into machine learning algorithms", which is typically done in academic circles, no?

Then when someone's discovered something new and useful, the focus turns towards making fast and readily available implementations in commonly used frameworks. Like Torch, which you can use both from PyTorch and from Julia.


No, they mean traditional scientific computing (differential equations and the like) merged with machine learning techniques. It's extremely useful in e.g. climate modeling. It lets you encode domain knowledge with additional fudge terms approximated by neural networks.

sciml.ai/


Yeah, this is what I meant. It’s virtually impossible for Python-based libraries to catch up to the huge number of differential equation solvers that are available in Julia, and that work on a GPU, and that can be automatically differentiated through, and that all work through a common interface, and that all work within a well-defined ecosystem.

More broadly, I’d say Julia has more potential than Python when it comes to automatic differentiation. JAX is really cool, but Julia’s automatic differentiation tools are targeting the whole language without modification and restriction (outside those imposed by calculus). Potentially some really fancy stuff could get embedded in neural nets beyond differential equations and with relatively little effort.
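As a concrete illustration of language-wide automatic differentiation, here is a minimal sketch using the ForwardDiff.jl package (the function `f` is made up for the example):

```julia
# Differentiating an ordinary Julia function with ForwardDiff.jl.
# No framework-specific tensor types or graph tracing are required;
# dual numbers propagate through plain Julia code.
using ForwardDiff

f(x) = sin(x[1]) * x[2]^2 + exp(x[1] * x[2])   # any plain Julia function of a vector

g = ForwardDiff.gradient(f, [1.0, 2.0])        # exact gradient via dual numbers, not finite differences
```

The same mechanism works through code that was never written with differentiation in mind, which is the composability point being made above.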


I find that claim highly dubious, solving differential equations rarely yields to completely generic methods and there are plenty of solver suites in other languages, among them C++ and wrappers for Python. To use scientific machine learning you don't necessarily implement all methods under the sun, just a method appropriate for the domain at hand.


Implementing automatic differentiation through C++ solvers or Python wrappers is going to be exceptionally difficult. If you want a generic toolbox like torch for scientific machine learning you would want every method under the sun.

That’s the moat Julia has here. A full rewrite of the differential equation ecosystem wasn’t necessary in order to support automatic differentiation.


I do not know of another differential equations framework nearly as polished, performant, and feature-rich as the one in Julia, in any language. I actually know of no other framework that has both modern solvers and autodiff working together. The only other autodiff solvers I know of are ancient RK4-like solvers implemented in things like Tensorflow. It is hard to overstate how monumentally enabling such tools are for old-school science and they currently exist only in Julia.


Sundials (https://computing.llnl.gov/projects/sundials) is arguably more feature complete, has many optimised implementations scaling to super computers and supports both adjoint and forward sensitivity analysis. Btw. several "research" papers at machine learning conferences on "Neural Differential Equations" presented "discoveries" covered in the documentation of Sundials https://sundials.readthedocs.io/en/latest/cvodes/Mathematics....


Let's talk details then. On the small stiff ODE and non-stiff end SUNDIALS is bested quite easily (for example https://benchmarks.sciml.ai/html/StiffODE/Hires.html), but that's pretty well known so not much to say there. For the larger ODE end, it's missing Runge-Kutta-Chebyshev methods, exponential integrators, etc. which do amazingly well on a lot of equations, and in direct head to head FBDF has been outperforming CVODE BDF for over a year now on most equations (as demonstrated in the online benchmarks, and now there are some applications papers starting to show that). Where SUNDIALS does do well is distributed computing: its NVector is really good, along with its heterogeneous compute. Its new multirate methods are good too. But then it's lacking methods for SDEs, DDEs, RODEs, higher precision, probabilistic numerics, etc. so it's hard to call it feature-filled. It's also lacking a complete symbolic front end like ModelingToolkit for handling higher index DAEs, so while IDA currently outperforms DFBDF head-to-head, the optimal thing to do is still to use a Julia solver just post MTK optimized (for example see https://benchmarks.sciml.ai/html/DAE/ChemicalAkzoNobel.html and for the large equation side see some of the stuff on HVAC models). Sundials has deep customizability in the linear and nonlinear solvers but that's also seen on the Julia side.

And while SUNDIALS does have an adjoint implementation, it does not construct the vjp through reverse mode so direct (naive) usage is really lacking in performance. Yes I know that there is a way to define the vjp function in the docs, but it's buried enough in there that all of the examples I've looked at of SUNDIALS in applications in the wild (like in AMIGO2) don't make use of it so there's a real performance loss seen by end users for many cases. DiffEq of course automates this.

Don't get me wrong, SUNDIALS is really good, still bests the Julia solvers on the distributed case, and there is a small performance advantage to CVODE in the one doc example on the 2D Brusselator after adding iLU preconditioning (https://diffeq.sciml.ai/dev/tutorials/advanced_ode_example/) and the low tolerance case of the Filament benchmark (https://benchmarks.sciml.ai/html/MOLPDE/Filament.html, though the medium tolerance case is bested by ROCK4). For years DifferentialEquations.jl used wrappers to SUNDIALS as a crutch for the cases where SUNDIALS still outperformed it, so nowadays it has both a fairly complete wrapper of SUNDIALS and benchmarks of different algorithms outperforming it.

We've been benchmarking against it weekly for years now, and within the last two years we've really seen a reversal so now it's a rare sight to see SUNDIALS perform the best. And there's a lot of momentum too: new parallel solvers, new GPU solvers, etc. are coming out from both crews so there's a fun arm's race going on right now, so we will see how it evolves (I definitely want to wrap those multirate methods ASAP and play with the theory a bit)

> Btw. several "research" papers at machine learning conferences on "Neural Differential Equations" presented "discoveries" covered in the documentation of Sundials https://sundials.readthedocs.io/en/latest/cvodes/Mathematics....

Of course the Julia folks know this because we were the ones mentioning this at conferences, since we would routinely choose to benchmark against SUNDIALS implementations instead of random unstable ML ones.


Independently of our views on the comparison of Sundials and DifferentialEquations.jl, I want to wholeheartedly agree with you that modern research in Neural Differential Equations and ML in general is rediscovering a ton of stuff that is obvious to everyone that has done serious numerical work.


While there are C++ solvers, they are missing the flexibility needed to solve many types of problems efficiently. They almost all use finite differencing, which is inaccurate and slow, and they rarely let users customize the linear solve and nonlinear solve algorithms. What the Julia diffeq suite is working towards is complete control over solver options, while picking good defaults when not customized. They let you pick forward and backward mode AD (or a combination of the two), problem specific preconditioning methods, custom root finding algorithms (including GPU-based linear solvers), arbitrary precision methods for verification (or if you're just paranoid), interval arithmetic, linear uncertainty propagation, and more.


As far as I'm aware there is considerable prior art, since solving this problem has been relevant since at least the 70s. Granted the barrier to entry was far higher then than now, but we are talking about scientific / engineering applications and there the prior investment in specialised tools has always been justified. A contemporary example is http://www.dolfin-adjoint.org/en/latest/, but there are others. Sundials, for example, also supports completely custom linear and non-linear solvers in conjunction with adjoint sensitivity analysis.


There is lots of prior art, but it's still not ideal. The one you posted, for example, only has reverse mode AD, which is sometimes good, but forward mode is sometimes asymptotically faster (and has lower overhead and avoids some numerical stability issues).


Sundials does have forward sensitivities, though the implementation does not SIMD the calculations of the primal with the dual parts like direct AD of the solver does. This makes it a bit worse on forward mode than direct differentiation of a solver (with an optimizing compiler) would do, which was something that took us a few years to really pin down (we had to implement our own to really benchmark it, and got the same phenomena demonstrated in https://arxiv.org/abs/1812.01892). The nice thing about the direct forward mode is that it is easier to opt the extra equations out of the linear solver, something that AD systems won't do by default (but it can be done)


> One such operation can be to sell. The sell function can do things like marking the book object as sold

Maybe this isn’t a good example of OOP?

Does it suggest the author perhaps doesn't fully understand principles behind loosely coupled OO code?

If is_sold is an attribute of book, the class shouldn't have a sell method, but a mark_as_sold.

The action of selling is not the responsibility of a Book. Baking it into Book will create tight coupling between areas of the code (product and commerce) that should be loosely coupled. The "commerce" side may change for reasons unrelated to the "product" side. Keep'em separated.

And if we think for a minute, is_sold should not be a Book attribute in the first place. It seems to me this piece of information should be handled somewhere else related to inventory, not by the Book itself.

There should be a ProductEntry class, for instance. It could have a product_type and product_sku attributes, for example, pointing to a Book.

Even here, is_sold shouldn't be an attribute, but a method. It would have an availability_status attribute.

    func is_sold(self) { return self.status == ProductStatus.SOLD }

Ultimately, the inventory code should not care that it's a Book or whatever.

The logic of whether it's available or not doesn't belong to the product itself. What if someone bought it, but returned it? Maybe it will be available for shipping tomorrow and the store wants to start reselling it already? Is it the Book's responsibility to track that? No way.


Seems to me that you are describing something that should be a relational database, and where object orientation doesn't seem to bring anything of value.


The OP just pointed out how the provided example would actually work in an OOP project; it's not their fault they have to work with an example that is indeed an inventory management problem.


Isn't the point that in OOP you need to think about the design of your objects but in Julia this isn't necessary?

Your critique seems to prove the author's claim.


To phrase succinctly the (direct) counter to the previous blog post: Julia makes possible the kind of combinatorial composition (and therefore, radical modularity & reuse) that is simply not possible in most languages. This will lead to some friction as the community figures out the right design patterns in these uncharted waters, but on the flip side one already gets many superpowers, provided one is careful about testing the composition works as expected, rather than just blindly assuming it does.


Can you give some examples?


The simplest I know of is Complex{Rational{Int}}: works in Julia, breaks in Python.
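A minimal sketch of that composition, using only the standard library:

```julia
# Complex and Rational are independent parametric types, yet they nest freely;
# neither was written with the other in mind.
z = 1//3 + (2//3) * im      # typeof(z) == Complex{Rational{Int64}}
w = z * conj(z)             # exact arithmetic: 5//9, no floating-point rounding
```

Every operation stays exact because the complex arithmetic dispatches down to rational arithmetic automatically.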


And that works and optimizes in the differential equation solvers.


> Finally, it is also likely that CPUs will become faster over time and that tooling around reducing TTFX will improve.

I don't agree with this as a mitigation. This line of reasoning is why, despite steady hardware improvements over the past few decades, responsiveness of (PC) programs and websites has stagnated or regressed.

To the main TTFX issue - I won't consider Julia until this is taken seriously.


Many of us in the Julia community (myself included) take it very seriously and spend a substantial amount of our time working to mitigate it.


The load times on some core packages were reduced by an order of magnitude this month. For example, RecursiveArrayTools went from 6228.5 ms to 292.7 ms. This was due to the new `@time_imports` in the Julia v1.8-beta helping to isolate load time issues. See https://github.com/SciML/RecursiveArrayTools.jl/pull/217 . And that's just one of many packages which have seen this change just this month. This of course doesn't mean load times have been solved everywhere, but we now have the tooling to identify the root causes and it's actively being worked on from multiple directions.
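For those who want to try this locally, a sketch of the tooling mentioned above (requires Julia 1.8 or later; the package being timed is just an example):

```julia
# `@time_imports` prints a per-dependency breakdown of load time,
# which is how load-time regressions get isolated to a specific package.
using InteractiveUtils   # provides @time_imports on Julia >= 1.8

@time_imports using LinearAlgebra   # substitute the package you want to profile
```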


I totally agree, leaning on Moore's Law to solve performance issues doesn't work. When you have performance issues in the language itself, you're essentially paying a performance tax on every piece of code that executes, in a compounding way. So if Julia became a widespread programming language for general applications, you'd have those small delays leaking into all sorts of operations, and before you know it your whole system feels slow and bogged down.

I have been testing some performance oriented software recently, and it's amazing how much more productive I feel when every keypress is immediately reflected on screen with no delay. We can adapt to poor performance, but it adds a constant cognitive load to deal with it.

In my mind Julia is suited for limited applications, like doing work in a jupyter notebook, but is not suited for general applications unless the TTFX issue can be fixed.


In my opinion, there is no such thing as a "general application". What about generating static websites? Is that a "general application"? Is TTFX a big issue there? Currently, the Julia package Franklin.jl takes around 15 seconds to create the website the first time (changes in the source code update the page in less than one second). I'd argue no.

What about Julia to produce generative art? Is TTFX an issue there? Etc. If "general applications" means _all possible applications_, then I'm not sure any language is suitable for general applications.

Julia has weaknesses which limit what it's good at. Right now I think it has more weaknesses than other languages. Some of these limits will hopefully be removed or reduced in the future. But fundamentally, I don't see Julia as some kind of domain specific or niche language. Just like any language, it just has its own set of tradeoffs.


Yeah I would agree the use cases you mentioned are probably fine for Julia as well. I could be more precise and say that Julia is not relevant for performance-relevant applications.

The problem with ignoring performance is, many applications can become a dependency of a performance relevant application even if that was not the original intent.


We recently used Julia to solve (correctly) a super complex scheduling problem for a major airline. Our budget was 5min; Python would not even get close to a solution in that time. The tool is being actively used in production.

Julia is not only fine, it's my go-to language for performance problems.


Ok but comparing with Python is like a worst-case scenario, and long running tasks would amortize Julia’s performance issues.

There are a great many use-cases/language comparisons you could find where Julia would be the worse performing language.


You seem to confuse performance and latency. High performance is Julia's main selling point. Latency is its Achilles heel.

It's being very actively worked on however.


Latency is a type of performance. For instance, even a few milliseconds of latency is relevant in terms of UI performance.


Then you should specify. You made a blanket statement about performance, while using the least commonly considered type. If you just say 'performance', most people will assume you mean throughput.


It seems very similar to what's going on with Java these days.

If your plan is to do bytecode verification and class-loading only once requests are coming in, and to handle the first however many requests with interpretive execution (prior to JIT), you shouldn't be surprised to see poor performance for those first requests.

Now, ahead-of-time compilation in the JVM is becoming more mainstream, and brings the expected benefits to, for instance, start-up time. [0]

[0] https://spring.io/blog/2021/12/09/new-aot-engine-brings-spri...


The quote provided explicitly suggests that tooling around reducing TTFX should also improve. It’s not suggesting improved CPU power as the sole mitigation.


That seems more like a little aside on the compilation times front, rather than the main thrust of the author's argument.


> Based on the amount and the ferocity of the comments, it is natural to conclude that Julia as a whole must produce incorrect results and therefore cannot be a productive environment. However, the scope of the blog post and the discussions are narrow. In general, I still recommend Julia because it is fundamentally productive and, with care, correct.

If TFA really means "Most of the time if statements take the correct branch, and it's unusual for them to take the incorrect branch, and you should use tests to detect whether the if statements in your program are working" then I would appreciate it if TFA would come out and say that. I am sort of used to programming in environments in which if statements work 100% of the time.


Aside from me disagreeing with this article here in that I don't think "all of julia is incorrect" is a valid take from Yuri's original article, I'm not sure how you arrive at "if branches don't do what I think they do" from that sentence. Are you referring to one (already fixed) bug[0]?

[0]: https://github.com/JuliaLang/julia/issues/41096


I am also thinking about an issue where catch randomly does not catch exceptions. I have not found the right issue in the issue tracker, so I don't know whether this is fixed.


Can you provide some more detail on this issue? That sounds… not good.


I don't see how it addresses the original complaint. Vishnevsky basically stated that if you are trying to run a scientific experiment on a supercomputer, maybe it's a risky idea to use a new programming language with a new stdlib and a bunch of OSS libraries vs using an old language like C with very stable set of existing code because new things tend to have unknown bugs? Vishnevsky has a point, but unless you are running some critical computations on supercomputers, maybe it doesn't apply to you?

To be clear, in supercomputing environments people still use old versions of CentOS just to make sure that library version updates do not change their computation results. I don't think many people here would say "I am sticking to Ubuntu 16.04 because I am afraid that the updates to some library like gmplib will slightly change my computation results in a way that is hard for me to detect".

Also, just staying with the old doesn't mean it's correct. You can also introduce bugs to your libs. I think NASA thought this through a long time ago and solved it by making sure critical parts of the code are implemented twice using different stacks with different programmers.

If you are NASA, CERN, LLNL, or a bank, maybe it's a good idea to implement your computations once in Python and once in Julia (by at least two different programmers) and compare the outputs. And I don't think in this situation Julia is any different from other languages (other than you may put too much trust into it and skip this dual implementation). Case in point: https://github.com/scipy/scipy/issues?q=is%3Aissue+is%3Aclos...


> If you are NASA, CERN, LLNL, or a bank, maybe it's a good idea to implement your computations once in Python and once in Julia

Doesn’t this negate one of Julia’s main selling points? That it has “solved the two-language problem”. Ironic for them to solve that in the performance domain only to then need a second language to prove correctness.


Note that with the Julia differential equation solvers, you can, without rewriting your code, test it with SciPy's solvers, MATLAB's solvers, C solvers (CVODE), Fortran solvers (Hairer's, lsoda, etc.), and the pure Julia methods (of which there are many, and many different libraries). https://diffeq.sciml.ai/stable/solvers/ode_solve/ Even a few methods for quantum computers. This also includes methods with uncertainty quantification (Taylor methods with interval setups, probabilistic numerics). So no, you can run these kinds of checks without rewriting your model. (Of course, some of the solvers to check against will be really slow, but this is about correctness and the ability to easily check correctness)
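A sketch of what that common interface looks like in practice (assuming OrdinaryDiffEq.jl and Sundials.jl are installed; the ODE here is a toy example):

```julia
using OrdinaryDiffEq, Sundials

# du/dt = 1.01u, u(0) = 0.5, on t in [0, 1]
f(u, p, t) = 1.01 * u
prob = ODEProblem(f, 0.5, (0.0, 1.0))

# Same problem object, different backends: a pure-Julia method
# and the wrapped C solver CVODE from SUNDIALS.
sol_julia = solve(prob, Tsit5())
sol_cvode = solve(prob, CVODE_BDF())
```

Cross-checking a model against an independent solver implementation then amounts to changing the algorithm argument, not rewriting the model.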


I interpreted it more that, in domains where correctness is vital (rather than "good enough"), you want more than one implementation no matter what languages you use.

Maybe that's not what parent was going for, but I think it's like the reproducible/replicable difference in research... can you use the author's code and data, getting the same result... can you use the author's algorithm/pseudocode and data, and get the same result... can you use the author's algorithm/code and different data, and get an _equivalent_ result?


> new things tend to have unknown bugs

Guess what also has unknown bugs? Old C code.


The more code is used, the more bugs are found. So new code is expected to contain more unknown bugs than old, well-used code. But this is a function of usage, not of time itself.


I just came across Yuri's criticism for the first time, but it makes sense to me. I am not a user of Julia, but have followed it with interest since they published their first paper about it. With hindsight, it is clear that they would run into correctness issues due to their powerful composability features. The solution is obvious and hard at the same time: there must be a way to PROVE correctness. Of course, to incorporate a prover into Julia will be pretty hard, it is probably much easier to incorporate (some of) the ideas of Julia into a new shiny prover.


any language with generic programming has the same issues.

This is in no way specific to Julia.


I think it is specific, because a) Julia is a dynamic language, and b) it uses dynamic multiple dispatch.

I think these features are great, but on their own they lead to exactly the situation as described.


so you get type errors at runtime rather than compile-time.

That's a known problem with dynamic typing.


As far as I understand the situation, the problem is that you do NOT always get type errors during runtime. Instead, you just get a wrong result, because the combination is legally allowed (that is, the types are accepted), but has not been foreseen and is handled wrongly.

For example, Python also has dynamic typing, but not dynamic multiple dispatch. Python does not seem to have correctness problems of the same sort as Julia. So dynamic multiple dispatch, or maybe just Julia's version of it, seems to be the culprit. But an in-depth analysis of this is needed before making a final verdict.

But as I said before, it seems inevitable: dynamic multiple dispatch leads to an explosion of possible combinations. How do you make sure that all of these combinations work as expected? And what is "expected" in the first place?


No, you do get type errors during runtime. The most common one is a MethodError, which corresponds to a dispatch not being found. This is the one that people then complain about for long stacktraces and as being hard to read (and that's a valid criticism). The reason for it is that if you do `x*y` with a type combination that does not have a corresponding dispatch, i.e. `*(x::T1,y::T2)` not defined anywhere, then Julia looks through the method table of the function, does not find one, and throws this MethodError. You will only get no error if a method is found. Now what can happen is that you can have a method on an abstract type, `*(x::T1,y::AbstractArray)`, but `y` does not "actually" act like an AbstractArray in some way. If the way that it's "not an AbstractArray" is that it's missing some method overloads of the AbstractArray interface (https://docs.julialang.org/en/v1/manual/interfaces/#man-inte...), you will get a MethodError thrown on that interface function. Thus you will only not get an error if someone has declared `typeof(y) <: AbstractArray` and implemented the AbstractArray interface.

However, what Yuri pointed out is that there are some packages (specifically in the statistics area) which implemented functions like `f(A::AbstractArray)` but used `for i in 1:length(A)` to iterate through A's values. Notice that the AbstractArray interface has interface functions for "non-traditional indices", including `axes(A)`, which is a function to call to get "a tuple of AbstractUnitRange{<:Integer} of valid indices". Thus these codes are incorrect, because by the definition of the interface you should be doing `for i in eachindex(A)` if you want to support an AbstractArray, because there is no guarantee that its indices go from `1:length(A)`. Note that this was added to the `AbstractArray` interface in the v1.0 change, which is notably after the codes he referenced were written, and thus it's more that they were not updated to handle this expanded interface when the v1.0 transition occurred.

This is important to understand because the criticisms and proposed "solutions" don't actually match the case... at all. This is not a case of Julia just letting anything through: someone had to purposefully define these functions for them to exist. And interfaces are not a solution here because there is an interface here; its rules were just not followed. I don't know of an interface system which would actually throw an error if someone does a loop `for i in 1:length(A)` in a code where `A` is then indexed by the element. That analysis is rather difficult at the compiler level because it's non-local: `length(A)` is valid since querying for the length is part of the AbstractArray interface (for good reasons), so then `1:length(A)` is valid since that's just range construction on integers, so the for loop construction itself is valid, and it's only invalid because of some other knowledge about how `A[i]` should work (this loop structure could be correct if it's not used to index `A[i]` but rather to do something like `sum(i)` without indexing). If you want this to throw an error, the only real thing you could do is remove indexing from the AbstractArray interface and solely rely on iteration, which I'm not opposed to (given the relationship to GPUs, of course). In any case, you can see the question to solving this is "what is the right interface?" not "are there even interfaces?" (of which the answer is: yes, but the errors are thrown at runtime as MethodNotFound instead of at compile time as MethodNotImplemented for undefined things; the latter would be cool for better debugging and stacktraces but isn't a solution).

This is why the real discussions are not about interfaces as a solution, they don't solve this issue, and even further languages with interfaces also have this issue. It's about tools for helping code style. You probably should just never do `for i in 1:length(A)`, probably you should always do `for i in eachindex(A)` or `for i in axes(A)` because those iteration styles work for `Array` but also work for any `AbstractArray` and thus it's just a safer way to code. That is why there are specific mentions to not do this in style guides (for example, https://github.com/SciML/SciMLStyle#generic-code-is-preferre...), and things like JuliaFormatter automatically flag it as a style break (which would cause CI failures in organizations like SciML which enforce SciML Style formatting as a CI run with Github Actions https://github.com/SciML/ModelingToolkit.jl/blob/v8.14.1/.gi...). There's a call to add linting support for this as well, flagging it any time someone writes this code. If everyone is told to not assume 1-based indexing, formatting CI fails if it is assumed, and the linter underlines every piece of code that does it as red, (along with many other measures, which includes extensive downstream testing, fuzzing against other array types, etc.) then we're at least pretty well guarded against it. And many Julia organizations, like SciML, have these practices in place to guard against it. Yuri's specific discussion is more that JuliaStats does not.
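As a concrete sketch of that `1:length(A)` pitfall (this example is mine, not from the thread), a tiny 0-based vector type is enough to show why `eachindex(A)` is the safe generic pattern; real code would use the OffsetArrays.jl package instead of this toy type:

```julia
# A minimal 0-based AbstractVector for illustration.
struct ZeroBasedVector{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(v::ZeroBasedVector) = size(v.data)
Base.axes(v::ZeroBasedVector) = (0:length(v.data) - 1,)
function Base.getindex(v::ZeroBasedVector, i::Int)
    checkbounds(v, i)     # validates i against the 0-based axes
    return v.data[i + 1]
end

# The unsafe pattern: assumes the indices are 1:length(A).
function bad_sum(A)
    s = zero(eltype(A))
    for i in 1:length(A)
        s += A[i]
    end
    return s
end

# The safe pattern: iterates whatever indices A actually has.
function good_sum(A)
    s = zero(eltype(A))
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

v = ZeroBasedVector([10, 20, 30])
good_sum(v)   # 60: iterates the actual indices 0:2
bad_sum(v)    # silently skips index 0, then throws a BoundsError at i = 3
```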


Thank you for this explanation!

In a way that is what I said: The problem is not that the types did not fit; the problem is that the code just did not behave as expected according to the interface specification. And combining many different implementations that interact with each other through multiple dispatch increases the chances that the misbehaviour of one of them impacts overall correctness.

But my emphasis on "dynamic" seems to be wrong. As you describe, this would happen in a static language as well.

Nevertheless, I don't believe the solution is linting. That's just a bandaid. The solution is to prove your code correct. That way you make sure that the code implementing an interface behaves as demanded by the interface. There are areas of computing where that does not make much sense. Numerical computing isn't one of them.


I agree that one thing that should also be done is to change the AbstractArray interface. I believe that "being an array" and having indexing are two distinct properties, so SimpleTraits.jl-style dispatching on "allows_linear_indexing" would be a way to slim down what's assumed when writing a function to more specific pieces (similar to Haskell). But as far as I am aware, even Haskell won't prove that code is incorrect if it uses hardcoded 1:n indexing in a function that dispatches on "allows_linear_indexing" (and thus this kind of issue would get by even Haskell, and not even throw a runtime error in cases where an array assumes -1:n indexing). So I'm curious, what's your idea for an interface that can prove correctness here?


You will notice that you use the array the wrong way when you try to prove the correctness of the client of the array. Somewhere in the specification of the client it will be required to, let's say, sum up all of the elements of the array. If the array index range is [-1, 4[, but the client sums over [1, 4[, then this is wrong.

The client will have its own spec, and when it uses the array, you need to prove that it does so in a way that its own spec is fulfilled.

You need to know the entire semantics of the interaction. You need to know what the array represents for the client, and that may depend on the client of the client.

But with multiple dispatch you are constantly tempted to pretend that correctness just depends on the type, because that's what you dispatch on. So that's the problem.

In general, I like types for organising things. I don't like them for correctness.

PS: Edited the above a few times to make my point more clearly.


> If the array index range is [-1, 4[, but the client sums over [1, 4[, then this is wrong.

So you cannot loop over a subset of the indices? That seems restrictive.


You can do whatever you want. You just need to prove that it is correct. I guess I wasn't as clear as I hoped I would be.

I don't think the Array indexing issue can be dealt with by (just) improving its interface specification. I think the proper fix for this is beyond what can reasonably be done in any language that doesn't come with a notion of correctness and proof.


Dex proves indexing correctness without a full dependent type system, including loops.

See: https://github.com/google-research/dex-lang/pull/969 and https://github.com/google-research/dex-lang/blob/5cbbdc50ce0... for examples


Yes, but that doesn't handle this case. I discussed a case of indexing within the bounds which is still incorrect.


How can the author make the argument that OOP makes things too tightly coupled, then claim that it's up to the developer to check if every single coupling in their code is correct? The coupling is still tight, it's just not enforced by the compiler. There's a reason for tight coupling in OOP: it's better to let the compiler check for correctness rather than every single developer out there.


> to check if every single coupling in their code is correct?

The author is not addressing the couplings in their code itself, but the code couplings between their code and all libraries in the Julia ecosystem, which is an altogether more audacious goal.


I do think that Julia in general isn't the best for interacting with C (based on some experimenting 2 years ago). I have tried using it with a decent number of odd C libraries used in Scientific Computing (with an actual capital S and C, which always involved binary-blob libraries). I think if you control more of the stack it is a better language. But when using it with vendor-provided binaries I ran into many, many issues.

> "with actual capital S and C." This part was referencing the industry, not me being egotistical.


I find `ccall` and especially `@ccall` easy to use, but thankfully haven't spent much time wrapping C libraries so I'd consider myself far from an expert in the matter.
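For reference (my sketch, not the parent's code), the `@ccall` syntax annotates argument and return types inline; e.g. calling libc's `strlen`:

```julia
# Call strlen from the C standard library; types are declared inline.
len = @ccall strlen("hello"::Cstring)::Csize_t
len == 5   # true
```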

Creating mutable structs and `GC.@preserve`ing them is an effective means of getting stack allocated memory, so long as the structs do not escape.

E.g., I occasionally follow this approach:

  mem = Ref{NTuple{32,Float64}}()
  GC.@preserve mem begin
    p = Base.unsafe_convert(Ptr{Float64}, mem)
    # do things with p, e.g. pass it to a C library
  end


But the whole point of using Julia is that you don't need C (it's roughly the same speed). Why would you interface it with a Scientific Computing library written in C? Also, C doesn't really sound like a good language for Scientific Computing. People do SC in Python, and that is why Julia exists, i.e. to solve the two-language problem: mind-boggling speeds like C with syntax like Python.


C is an excellent language for scientific computing. A lot of low-level and high-level libraries for scientific computing are written in C. Toolchain support is excellent. It is maybe less good at putting some scripts together quickly.


What are some of these libraries that are written in C but don't have Python bindings, if I may ask? Genuine question; the ones I have seen have Python bindings, and that's why they are super popular and widely used.


Various people have expressed skepticism about needing language-level traits, but it's in vogue to be writing interface packages with no top-level abstract types (see Tables.jl) because we don't have some way to extensibly express being part of multiple type trees.

It's challenging enough that Abstract Types have no interface specs (it's claimed that this is a feature, since it allows interfaces to develop over time), but it's even tougher when things get even looser with this new trend towards having untyped functions... even various interface package maintainers themselves have said the current state of affairs is difficult.

And if we do want to opt into abstract typing, you're going to be inheriting lots of functionality that could be better broken into smaller typeclass like components (see AbstractArray).

I think Julia is screaming for some capability to break up Abstract types and being able to inherit from multiple kinds of things, along with specified interfaces with these things.

The problem is that this is very difficult, given the complex subtyping system and multiple dispatch. Adding more complexity exacerbates already present method and typing decidability issues. People are working on this on the periphery, but I hope it can become a priority of the core team and that it's solvable.


My major problem with Julia is that writing performant code is a bit like writing Haskell.

You need deep understanding of the compiler to know what constructs will mess up performance. Especially how typeinfo is propagated.


> You need deep understanding of the compiler to know what constructs will mess up performance. Especially how typeinfo is propagated.

Could you expand on this? It's not clear whether you're just talking about issues that would trip up a beginner who hasn't read the relevant sections in the manual yet.


At least in Julia you have `@code_warntype`, JET.jl, and `@benchmark`, and the language being dynamic also helps a lot.
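For instance, here is a hypothetical type-unstable function of the kind `@code_warntype` flags (example mine):

```julia
# `x` starts as an Int and becomes a Float64 inside the loop, so the
# compiler can only infer Union{Float64, Int64} for it.
function unstable(n)
    x = 1          # Int
    for _ in 1:n
        x /= 2     # `/` returns Float64, changing x's type
    end
    return x
end

# `@code_warntype unstable(10)` highlights the Union in the REPL;
# initializing with `x = 1.0` fixes the instability.
unstable(10)   # 0.0009765625
```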


That is a major issue, and I'm not sure if there's a good solution. The one mitigating factor is that the high-level languages Julia competes with (Matlab, R, and Python) have a similar ease of writing much slower code.


In practice it’s not so major. Following a few simple guidelines from the performance section of the official manual is enough for most people most of the time. And the most important of these guidelines are practices that an experienced programmer would probably already be following, such as not using global variables.
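As a sketch of the global-variable guideline (names here are made up): an untyped global defeats type inference, while a `const` global does not.

```julia
gscale = 2.0                      # untyped global: its type may change at any time
scale_slow(v) = [x * gscale for x in v]

const GSCALE = 2.0                # const global: type known to the compiler
scale_fast(v) = [x * GSCALE for x in v]

# Both return the same result; only the generated code differs.
scale_fast([1.0, 2.0])   # [2.0, 4.0]
```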


Maybe Julia should have an easier way to know if you're using a combination that's been tried before and is supported?


this is actually interesting but sounds very complex. How about writing tests?


> How about writing more code?

FTFY.

Tests are code. This code needs to be maintained, since it's coupled to your architecture. This code has bugs of its own. Etc.

Writing tests is a fine choice if you have the resources, and also isn't a valid answer to a criticism of the language.


I've noticed that any time you need to use the word "still" to argue for something, you've usually already lost the debate or the larger point.

I can't imagine anyone writing a post along the lines of, "Why I still recommend SQL for querying databases", even though SQL was conceived back in the 70's. "Still", in this context more often than not suggests that a grasping-at-straws is taking place.

Julia has a lot of neat features, but at the end of the day, it's a research language rather than a getting things done language.


This seems like some sort of magical thinking. Some random person uses a particular word in a personal blog post (there's nothing official about this), and somehow this influences the technical merits of the subject of the blog post.

Can you explain the causal connection here?


> The idea is that when you are, for example, writing business logic with books, then you want to neatly put the operations on the book near the definition of the book.

That's not the idea. You put the code for the "internals" of a book near to its definition. In other words: The code that needs to see the private/protected members of a class. It is not good OOP - if I may be so bold as to use that phrase - to stick as much functionality as you can involving some class, into that class.


The example using Measurements.jl only works as a good example for composability if you use `Number` as the type for the argument. If you happen to use Int then it doesn't work:

  using Measurements
  plus_one(x::Int) = x + 1

  plus_one(measurement(1))
  ERROR: MethodError: no method matching plus_one(::Measurement{Float64})
  Closest candidates are:
    plus_one(::Int64) at REPL[2]:1
  Stacktrace:
   [1] top-level scope
     @ REPL[3]:1
This is perhaps my biggest annoyance with Julia. Concrete types cannot be subtyped, and hence cannot be extended by other packages. And explaining to researchers and scientists that they need to use abstract types to make their code more composable is an exercise in frustration. I think computer scientists with experience programming in C++ may have good intuition for when to use concrete or abstract types in the function signatures, but research scientists in ML and optimization (in my experience) just don't do a good job of that. They just end up with awful Julia code to work with that isn't really extendable. In my opinion, readability suffers greatly when you have to understand the type hierarchy in order to find out if your code will MethodError or not. For example, if the author used `plus_one(x::Rational) = x + 1` instead of `plus_one(x::Number) = x + 1`, would the code that uses Measurements.jl have worked? Who knows just by looking at the code. It turns out it doesn't work and there's a MethodError. The built-in type hierarchy is great, but third-party packages are hit and miss.
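For completeness (my sketch): annotating with the abstract type keeps the function composable with any number-like type, which is what the post's Measurements.jl example relies on. Built-in types alone show the effect:

```julia
plus_one(x::Number) = x + 1   # accepts Int, Rational, BigFloat, Measurement, ...

plus_one(1)          # 2
plus_one(3//4)       # 7//4
plus_one(big"1.5")   # 2.5, as a BigFloat
```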

Honestly a simple solution to this would be to allow concrete types to be supertypes of other types. My understanding is that there's no compiler related reason for this, and this is just for "good practices" but for the life of me I don't understand why this limitation was made. I've seen Julia code that creates a supertype abstract type for every concrete type they create, and it is just awful to deal with. Fortunately, it is somewhat easy to write a macro to make this happen, but still very annoying that one has to do this at all.

Combine that with the lack of file-system-based package modules (like in Python or Rust), the lack of any kind of doctests, and the built-in testing being so lackluster, and you end up with really poorly organized code.

TSCoding has a 1 min video on why he doesn't program in Haskell anymore [1] and I can't help but feel his reasons directly apply to Julia too.

For example, in order to run that example for this comment, I had to run `add Measurements` and it took 7 minutes to download the package. Why is the "registry" for packages in Julia a giant git repository? It takes SO long to update every time. This is an example of a poor software engineering decision in an otherwise elegant language.

[1] https://www.youtube.com/watch?v=SPwnfSmyAGI


There actually is a really good compiler reason to not allow subtyping concrete types, and it's one of the main reasons Julia code is faster than languages like Java. If you allow concrete types to have subtypes, when you see `Vector{Int}` (List<Integer> in Java), the compiler can't tell what the memory layout of the element type will be. As a result, pretty much all objects end up boxed, which destroys locality and removes all possibility for vectorization. In Julia, the compiler knows the memory layout of any instantiable type at compile time, so it can much more aggressively remove pointers (or even just decide that an object will exist in registers rather than allocating memory for it).


That's not a result of `int` not being subtyped. It's a result of forcing reference semantics on everything (which is something most languages unfortunately do).

C++ doesn't need this kind of restriction because it makes the difference between references and values explicit, so it's actually possible to explicitly ask for a vector of values, rather than only being able to ask for a vector of references and artificially restricting the language so that it can be optimised to a vector of values.

(Yeah I know you can't derive from int in C++ but that's for other reasons.)


Julia also has reference semantics for everything. That said, it turns out that if you make it idiomatic to create immutable structs, the difference between reference semantics and copy semantics disappears, which lets the compiler pick the faster one (and fixes a common performance bug in C++ where users accidentally end up copying a ton of data around).
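A small illustration of that point (example mine): immutable structs have no identity, so `===` compares them by value, and the compiler is free to copy them or pass references as it sees fit.

```julia
struct Point           # immutable
    x::Float64
    y::Float64
end

mutable struct MPoint  # mutable: has observable identity
    x::Float64
    y::Float64
end

Point(1.0, 2.0) === Point(1.0, 2.0)     # true: just a value
MPoint(1.0, 2.0) === MPoint(1.0, 2.0)   # false: two distinct objects
```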


Yeah I always thought that "make everything immutable!" was partly - maybe even mostly - a workaround for the fact that it's so hard to copy data in most languages.

I agree C++ makes it too easy to accidentally copy data, but at least you can copy data!


But the only reason you want to copy data is that it makes manual memory management easier; otherwise passing a reference is almost always cheaper.


I don’t see how it is inherent to subtyping rather than to language semantics. For example, Java will likely add 2 new kinds of class that will lose identity - at that point Integer (and other struct-like data) will be optimizable by inlining into the surrounding data structure.


The point is not limited to Int though. Say you have this struct:

    struct MyStruct
        a::Int64
        b::Float64
    end
Since the size of each field is known (8 bytes), the size of the whole struct, and thus the size of each instance of that struct, is known - 16 bytes. This is crucial information for inlining, loop unrolling, copy elision, deciding which register(s) to place the object in, and getting rid of indirections through pointers. If you allowed MyStruct to be subtyped, the compiler wouldn't be able to know ahead of time what the size of a MyStruct object is and would either have to box every access (introducing pointers which have to be chased on access) or defer A LOT of work to runtime. With boxing, you almost immediately lose lots of opportunities to SIMD or otherwise optimize your code, because you have to keep a whole lot of extra type information around at runtime that's just not necessary when all your objects in e.g. a Vector{MyStruct} have the same size anyway.
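To make that concrete (my additions to the parent's example): because `MyStruct` cannot be subtyped, its 16-byte layout is statically known and vectors of it store elements inline.

```julia
struct MyStruct
    a::Int64
    b::Float64
end

isbitstype(MyStruct)   # true: plain bits, no pointers
sizeof(MyStruct)       # 16 bytes, known at compile time

v = [MyStruct(1, 2.0), MyStruct(3, 4.0)]
sizeof(v)              # 32: two structs stored inline, no boxing
```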


Oh I see what you mean. Though that’s why Java has the `final` keyword to forbid subtyping for a given type.


right. The Julia design is to say that `final` was a good idea and that everything should have it (Especially since OOP people now say to prefer composition to inheritance anyway).


you misunderstand both Java generics and Java primitive types.

1. the JVM does not support generics as a language construct; they are simply a compiler feature of Java

2. the JVM and Java differentiates primitive and reference types, Integer being a reference (boxing) type of the primitive type int. And, generics do not support primitive types anyway.

The JVM implementations are incredible pieces of machinery, and even if they did not pack or stride their primitive arrays -- which they absolutely do -- the JVM & compilers will handle an incredible amount of runtime memory layout and packing.


Thanks for that explanation!


If you drop the type annotation in your example then it works fine. There's not really a good reason to provide a type annotation to that function.

I just installed and used Measurements.jl in <10 seconds. Newish versions of Julia (1.6? the LTS?) don't distribute the package registry as a git repo.


My approach has been by default to never use types in functions, and if I use types, use the most abstract type possible, only in order to trigger dispatch.

1) It is just hard to explain this to people that work under me. 2) Types in functions are no longer useful for understanding the code or improving readability, since the purpose of types is then only dispatch.

I guess my dream request would be an alternative syntax for typing for concrete and abstract types. Abstract types would be used for dispatch and composibility. Concrete types should be used because the function is being specialized for that concrete type.

There's lots of nuance here though, and I've really struggled to communicate this to colleagues and peers.
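A sketch of that convention (names made up): annotate only where dispatch needs it, with the widest abstract type that keeps the method correct.

```julia
describe(x::Integer)       = "integer: $x"
describe(x::AbstractFloat) = "float: $x"
describe(x)                = "something else"   # untyped fallback

describe(3)     # "integer: 3"
describe(3.0)   # "float: 3.0"
describe("hi")  # "something else"
```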


Fair enough. I'm not sure what fueled this idea that marking variables with types will make the code faster. Cython? I think part of the issue, too, is that in Python type annotations are really just fancy comments. If you mark a function as taking ints in Python and stick a double in, at most your editor or mypy will complain. If you do it in Julia, it's a compilation error. Tangent: JET.jl might catch these sorts of issues for you and I think has linter support in VS Code now or imminently.

I'm not sure being able to subtype concrete types will help with the specific issue you're showing here. Really the right thing to do is make the function take Numbers. If you allowed Int to be subtyped, things would still be semantically wrong, and you would have introduced a parallel inheritance tree that will break a lot of the interoperability multiple dispatch enables.


Exactly. There’s no reason to write the function that way unless you’re trying to break things, or to write a function that can’t be used with other functions.


In Julia 1.7 and onward, the registry download is a single tarball, and the packages are read directly from it. In the future, it's likely only the diff is going to be downloaded. These things are being improved, but it takes time - and it takes longer than in other languages, since Julia has relatively less devpower than e.g. Rust.


[Off topic]

> TSCoding has a 1 min video on why he doesn't program in Haskell anymore

This is very interesting, and as someone who is spending a lot of time trying to bring Haskell to a wider audience it is missing a key piece of information that would help me: what is the link between Haskell being a beautiful and elegant language and it not being engineered properly? I can see (at least) two distinct possibilities:

1. Beautiful and elegant languages can simply never been engineered properly (or it's too hard to be worth trying). There is an irreconcilable tension between beauty and utility.

2. Haskell people are too interested in making something beautiful that they don't bother trying to make it useful as well, but they (or others) could.

[For the record, I don't agree with the presenter's point of view: I find Haskell both the most beautiful and the most useful language I know.]


> explaining to researchers and scientists [...] is exercise in frustration

A person unwilling to learn how to use their tools won't be able to use them with optimal efficiency. Does this have anything to do with Julia, specifically?


I like Julia a lot. I think it needs two things (for me):

1. better database support; it also needs to support BI cubes like MS SSAS

2. fix scoping rules

I created a small feature request on GitHub, where I suggest a fix for their scoping (inspired a lot by Perl), and I would like to take this opportunity to promote my suggestion. I think it's a good one that will make Julia nicer:

https://github.com/JuliaLang/julia/issues/37187


Btw, you can totally have classes in Julia. And methods will be compiled like all other functions. It's just not considered particularly idiomatic.


I wonder who needs the introduction to objects to parse the rest of the post… it seems challenging


[flagged]


> niche programming language

I mean `niche` is a little weird here since Julia is very specifically a general purpose programming language. It just happens to be really really good at math.

[1] https://julialang.org/#tab-general


> Julia is very specifically a general purpose programming language

I’ve been trying to figure this out recently - because I love Julia’s features. Readable like Python, but with more ability to optimize performance, and also lispy with macros and generic functions. I’m personally interested in it as a general purpose language.

But when I search around about it, most folks to seem to relegate it to the data science realm only. Everyone seems to be saying: well it’s certainly general purpose capable, but its designers are focused on data science, and that will continue to be the primary goal. As such, don’t expect to see it widely adopted as outside of data science anytime soon.

I don’t want that to be the case, but it seems harder to build broader excitement about the language if it’s going to continue to be perceived as niche.


Bringing up data science as its niche is somewhat funny, because the most mature libraries in the language (where it is light years ahead of other ecosystems) are in the general sciences (e.g. differential equations, math optimization, etc). It is true that the Julia ecosystem as a whole is mature only in a few niches, but focusing specifically on narrow data science claims would make me doubt the knowledge of the person making that claim.


R has multiple dispatch right with s4, I don't recall running into huge library compatibility issues with R? Is the only difference between R and Julia maturity where R has ironed out the sorts of bugs Yuri was running into?


I think the difference is that in Julia multiple dispatch is the main paradigm to structure code (together with very aggressive devirtualization/specialization/compilation). That enables quite amazing things. Other languages have multiple dispatch as well, but it is not foundational to the ecosystem in them. They lack the "magic" but they also have lower propensity for the interface mismatch bugs described through these two threads.


Julia was always gonna be the RC Cola of data science languages. Python and R were too far ahead and had too much mind share


Why are you restricting the discussion just to data science? In "general" science there are way more devs than just in data science / statistics, and Julia absolutely shines there. Don't get me wrong, the language is general purpose, the ecosystem is a bit niche for now, but still, it seems wild to restrict comments to such a small field as data science.


> Julia absolutely shines there

Do you have evidence for that?

I'm a mathematician at a research university, and maybe two of my colleagues are using Julia. Despite their proselytism, everyone else is using Python, C, or math-specific software such as GAP, Matlab, or Mathematica.


Depends on what exactly you would like evidence for. Your dissenting comment is that Julia is not popular yet. With that I can easily agree, but that is also not directly related to whether it is an amazing tool, which was my claim.

In terms of examples of hard sciences where it shines: It is the only tool in existence that has at the same time high-quality differential equation solvers and autodiff on them. Compare DifferentialEquations.jl to any other package in any other language. The rich capabilities of the aforementioned package depend on the multiple dispatch + aggressive devirtualization used in Julia. Python/Jax/Tensorflow/Pytorch while wonderful on their own, are nowhere near these capabilities. Matlab/Mathematica do not have these capabilities. The famous C/Fortran/C++ libraries are also far less capable in comparison.


Once again, I'm looking for evidence, not your say-so...


Quoting from my reply so the evidence is easier for you to notice:

> Compare DifferentialEquations.jl to any other package in any other language

If you do not know how to do such a comparison for yourself, this thread has details https://news.ycombinator.com/item?id=31883793


Am I the only one who actually likes the taste of RC the most? That aside, Julia is relatively young for a programming language, 10 years old. It's still possible for it to find a niche, even if it is not the one it aimed for. It took until the late 00s for Python to enter the data science niche in the first place.


Sure, RC is fine, but if you 1-base your arrays then you chose to be the pariah. I have no patience for technology that just chooses to be special for the sake of it.


Let’s be serious: I have yet to see a convincing argument for 0-based being better than 1-based, or the other way around. And designing a language based on “what everybody else does” is definitely not the right approach.


The argument for 0-based indexing with exclusive upper bounds is that hi - lo == len, that 0-length intervals can be expressed without hi < lo, that if you have found the first index at which a predicate is true and the first index at which it is no longer true, those indices themselves are usable as upper and lower bounds, so you may write my_slice[first_true_idx..first_subsequent_false_idx] to get the sub-slice in which the predicate is true continually, that using mod to restrict indices to the valid range is idx % n rather than (idx - 1) % n + 1.

The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way. Edit: And that people who are not primarily programmers may find 0-based indexing weird.


> Edit: And that people who are not primarily programmers may find 0-based indexing weird

Alas, if only our forefathers had called it "offset" instead of "index".


> The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way.

Nope. The argument for 1-based indexing with inclusive upper bounds is that, to express ranges, exclusive indexes require you to have a bogus just-one-beyond-the-largest-valid-index element and a way to get the successor for every index. This is a giant PITA for anything but integers (MAX_INT, and x+1), which is why every single programming language I can think of also has end-inclusive ranges in some contexts. If you don't believe me try writing regexps with end-exclusive ranges.

On the other hand, there is no natural way to express empty ranges with 1-based indexing.


I'd love to learn more about writing regexps with various kinds of ranges. I don't see the issue, but it's very likely that I'm missing something.


Here's what a regexp for a C-style identifier looks like (ranges are inclusive):

    [a-zA-Z_][a-zA-Z0-9_]*
Here's what the same regexp would look like if regexp ranges were end-exclusive:

    [a-{A-[_][a-{A-[0-:_]*
Do you see the issue now? Of course if regexps really worked that way, you'd be better off doing this instead:

    [a-zzA-ZZ_][a-zzA-ZZ0-99_]*
But that specific trick only works because regexp ranges occur in a set union context.


I see, the `a-z` range has an inclusive upper bound. That makes sense and it really should be like that.


If there are no convincing arguments one way or the other, then "what everybody else does" becomes the convincing argument. Why change established conventions for no good reason? There's no reason for curly braces to indicate control blocks and square braces to indicate indexing, but if a language swapped the two, what would you say?


Indeed, and if you add up all the users of 1-based languages (Fortran, Matlab, R, Excel, etc.) and 0-based languages (C, C++, Java, Python, etc.), I think you’ll find that the 1-based languages have vastly more programmers.

Going by popularity, 1-based indexing is the established convention.


> I think you’ll find that the 1-based languages have vastly more programmers.

Any shred of evidence for that? Listing four languages for each won't cut it.


Excel alone has more users than all other languages combined, so it's not even close.


And how many of those users index anything? Heck given the languages you'd write macros for excel, they really screwed up by picking 1-based.


Indexing is the primary operation in Excel and one of the first things you learn how to do in the language (selecting a range of cells). It’s how you refer to anything, by indexing into the global cell space.

Given Excel’s massive success and nontechnical user base, which again is larger than all languages combined, it’s hard for me to see 1-based indexing as a mistake. I have experience teaching novices how to program, and 0-based indexing is always a sticking point of confusion. So from my perspective, 1-based indexing is the right choice for Excel given the user base and programming style.


Yes, but not so much in scientific computing, where scientists and mathematicians do a lot of the coding.


Because 1-based languages like Fortran, Matlab, R or Mathematica are less used in scientific computing than elsewhere?


More used in scientific computing, which was my point, because only programmers think of indexing based on offset instead of the first positive integer. Nobody outside of programming in C/Unix-inspired languages starts counting at 0.


I may have misinterpreted your comment.

When someone wrote “Going by popularity, 1-based indexing is the established convention” and you replied “Yes, but not so much in scientific computing” I understood that as “1-based indexing is less used in scientific computing compared to general computing”.


Oh, either I replied to the wrong person or misread the parent.


Fortran, R and Matlab have 1 based arrays. That's pretty common with scientific computing languages. I'd hardly call those pariahs, and Fortran's older than C.


More programming in the world is done in a 1-based language, Excel, which has more users than all other languages combined.

0-based languages are in the minority as far as installed base and usage goes. Being 1-based is a decision dependent on target audience and application.

Fortran, one of the earliest languages, is 1-based, so there’s plenty of historical precedent.

Matlab, one of the most commercially successful languages in history, is 1-based, so it’s not exactly a barrier to adoption.


If you're counting the number of people who use excel spreadsheets as "programming in excel" then that's great. I no longer need to take you seriously.


What else would it be? Excel is a programming language. Using it is using a programming language. Am I missing something?


This argument could be made about any language, though. Python's indentation significance? Rust's match (vs switch)? R's arrow assignment?


both R and Matlab also use 1-based indexing, if anything among languages used for data science and scientific computing, it's Python that is special.


Julia was originally pitched as 'C-like speed with Matlab-like syntax', which explains 1-based arrays.


That must be why Fortran never caught on.



