Why Julia (ucidatascienceinitiative.github.io)
303 points by Tomte 4 months ago | 249 comments

I recently began learning Julia, and initially everything was amazing, except for 1-based indexing, but I could overlook that given everything else. Then I attempted to build something medium-sized and it all fell apart. I feel like it needs some serious work on tooling, the module system, packages, etc.

Has anyone built something medium-large sized in Julia? Maybe I'm missing something. When I was trying to use modules to organize my code the response I got was that modules are more trouble than they are worth so just don't use them. So why do they even exist? That really put me off from Julia despite really liking it otherwise.

I would want to build things with it not just play in a REPL and notebooks.

Yeah, I built several libraries that replace IEEE floating points with alternatives. You should really be using modules to organize your code. I don't know who told you that they are more trouble than they are worth.

I would also suggest aggressively unit testing all the parts. Numerical developers are not often in that habit, which is a shame.



> Numerical developers are not often in that habit, which is a shame.

For certain! I don't know how anyone can do any serious numerical development without a whole lot of test cases, including esoteric PhD-level numerical analysis stuff.

I remember working with the IMSL libraries back in the 1970's and being in awe of the huge set of numerical tests they had in the test suite which gave them, from my point of view, an unbeatable lead in development. These were tests that were often about highly esoteric aspects of numerical stability in floating-point algorithms and were written by extremely well educated mathematicians with tons of numerical experience.

For those who aren't greybeards IMSL (the International Mathematical Subroutine [now Statistical] Library) was one of the first uses of re-usable software in application programming. It started as decks of punch cards that you would put in a request for and the computing facility would punch a copy of the appropriate routines for you to include in your program deck.

> When I was trying to use modules to organize my code the response I got was that modules are more trouble than they are worth so just don't use them.

Really? I normally break up my code across modules, but also across libraries. I'd say a few of my projects are at least medium sized. I discuss an example here, where much of the code is split across many separate modules: https://bayeswatch.org/2019/01/29/optimizing-a-gibbs-sampler... I achieve roughly 1700x better performance in an example than a JAGS model. A C++ version does a bit better at 2000x when compiled with Clang.

Having all these dependencies checked out for development is not best practice, so I wouldn't recommend strictly following my example.

Agreed, that's my experience as well. Julia is amazing for scripts and smaller projects, but I wish there was something like Swift (which I'm increasingly convinced is closest to the ultimate general-purpose language) with all the nice things that Julia has.

Specifically, these things make Julia less suitable for larger projects:

- Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.

- Dynamic typing.

- Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).

- Module system doesn't seem to be designed well.

- You can't use a function before it's defined.

Sorry, why do you need OOP? I haven't coded OO in about 10 years now (mostly Julia, Elixir, and functional JavaScript). It's great. Would never go back.

I even had to write Python once, so I built a class where none of the functions were passed self (so it was basically a functional module).

In Python you'd normally just use module-level functions for that. Using a class with no instances and all static functions doesn't really buy you anything.
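To make the comparison concrete, here's a quick sketch (the names are made up); both styles are called identically, so the class adds nothing but a namespace:

```python
# Style 1: a class used only as a namespace (what the commenter built).
class MathUtils:
    @staticmethod
    def square(x):
        return x * x

# Style 2: plain module-level functions, the idiomatic choice; in a real
# project these would simply live in a module such as mathutils.py.
def square(x):
    return x * x

# Both are called the same way, with no instance in sight:
assert MathUtils.square(3) == square(3) == 9
```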

IIRC, modules have a dependency on bareword paths that can get awkward. I'm not a professional Python programmer, so the fastest way for me to deliver something I could be sure would work was that way. Maybe I'm wrong, but that was my logic.

The only problem I found with this approach is managing state and dependencies: say, if the functions need to work with a database, it is not convenient to pass it as a parameter each time. A class instance initialised with a database connection seems more workable, in my opinion.

That's a pattern I use a lot. Passing services and connections around everywhere is a pain, but I like to stick to a more functional style. I generally use classes as a place to hold context (e.g. a DB connection) and then use a functional style within the class. It works as a fairly happy medium for me.
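A minimal sketch of that pattern, with invented names (UserRepo and the schema are purely illustrative): the class holds the connection once, and the methods stay close to plain functions of their inputs:

```python
# UserRepo and the schema are invented for the sketch; the point is the
# shape of the pattern, not the specific API.
import sqlite3

class UserRepo:
    def __init__(self, conn):
        self.conn = conn           # shared context held once

    def add(self, name):           # methods stay close to plain functions
        self.conn.execute("INSERT INTO users(name) VALUES (?)", (name,))

    def names(self):
        return [row[0] for row in self.conn.execute("SELECT name FROM users")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
repo = UserRepo(conn)
repo.add("ada")
assert repo.names() == ["ada"]
```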

Your alternative is dependency injection. What you seem to be describing is class factories (i.e. the use of metaclasses), which /is/ OOP.

That said, I think OOP is fine. And so is FP. They’re just tools, and I would prefer to model nouns in the former, and model verbs with the latter.

I would be more interested in why needing dynamic typing is such a deal breaker

Because sometimes it's better for readability and conceptual simplicity to structure the code as objects connected to each other. Most large-scale successful projects, often written by better developers than me or you, use OOP.

And 80% of the internet's switches (Cisco and Juniper) are controlled by Erlang. So?

I think you're looking at this wrong. Part of the reason why so many large-scale products use OO is that they use Java, and Java is awesome if you're working at such a large organization that your productivity is most easily measured by pointy-haired managers who count how many LOC you've written ;)

As for readability and conceptual simplicity, may I introduce you to something called the Factory pattern?

Erlang is arguably the most object-oriented language in common usage. Actor concurrency is closer to the original OO model than everyone else's "struct with methods". I'm more inclined to agree with you overall, but Erlang might not be your best example. :)

Edit: never mind, you're obviously aware...

I might be wrong, but my understanding is that things happened backwards from that. OOP tried to solve the problem of how to get tons of devs working on huge projects at the same time; Java came along to try to be an awesome OOP environment, so Java became the standard.

> OOP tried to solve the problem of how to get tons of devs working on huge projects at the same time

I am skeptical of that history. Simula and Smalltalk were academic endeavors long before lots of devs were working on huge shared projects. Java became popular because it was a portable C++ with no raw pointers and with garbage collection.

Smalltalk was being adopted across the industry when Java came around; the big difference was that the JDK was free as in beer, and one of the big Smalltalkers (IBM) rebooted their environment into Eclipse while jumping on the Java bandwagon.

Was it VisualAge for Smalltalk, which became VisualAge for Java, which later became Eclipse? I vaguely remember reading something like that.

Yes. With Gamma, one of the pattern book authors, as architect on Eclipse team.

Interesting, thanks.

That is seriously rewriting history. Smalltalk was not big in any shape or form in business in the mid '90s.

Like Java ended up being afterwards no.

It surely had lots of consulting opportunities, though.

What is or is not OOP is actually fuzzy. When it boils down to it, Erlang's actor model is basically an implementation of Smalltalk objects (but don't tell Joe Armstrong that). Elixir's module/struct/protocol setup looks awfully like OO in the post-Bjarne Stroustrup era.

Conversely, Python's objects feel anti-object, since you're passing self around all the time (aka dot notation is weaksauce syntactic sugar)... You could easily model this behavior in Julia if you really wanted, for example with a custom operator (not that you should):

    function Base.:^(o::Object, fn)
        (a...) -> fn(o, a...)
    end

Joe already knows. He and Alan Kay had a talk about it. It was great.

Erlang is more purely object-oriented than Java. These are the words of its creator himself.


I don't think that's a big part of the reason to be honest. Also, modern Java has a reasonable support for FP.

I'm talking about projects like Mapbox, Blender, Firefox, Android.

> Most of large-scale successful projects, often written by better developers than me or you, use OOP.

That's an appeal to authority if I ever saw one.

> - Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).

What do you mean by that? It absolutely is compiled to e.g. x86_64 instructions by default, going through LLVM. Do you mean no standalone binaries? If so, that's actively being worked on by e.g. PackageCompiler.


> - You can't use a function before it's defined.

I'm not sure what you mean by this. You obviously can't call a function before it is defined, but you can use a function in another function without any problems:

    julia> foo(x) = bar(x)
    foo (generic function with 1 method)
    julia> bar(x) = x+2
    bar (generic function with 1 method)
    julia> foo(3)
    5

> - Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.

Julia doesn't rely on functional programming and there are also structs and operator overloading. What feature do you think is missing specifically?

> Dynamic typing.

Julia is not dynamically typed, though if you write a type unstable function, you will get back an Any type.

> - Module system doesn't seem to be designed well.

What specifically is problematic?

Julia is dynamically typed: https://stackoverflow.com/a/28096079.

> Julia doesn't rely on functional programming and there are also structs and operator overloading. What feature do you think is missing specifically?

Interfaces, access modifiers, calling methods via object.method() instead of method(object), you can't define methods inside classes (structs) with an implicit "this" parameter.

> Module system doesn't seem to be designed well.

Working with a multiple-file project feels weird... I have to first include() the source code and then use "import" or "using". The module systems in Java, JavaScript or Swift seem more straightforward and sensible.

> - Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.

...and neither is OOP? For doing data things I greatly prefer functional programming for ease of understanding how the data flows and gets changed.

> - Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).

Neither does Python, neither does R. AOT compilation is being worked on, I expect it to get pretty great. C and FORTRAN do, sure, but most people are not writing C and FORTRAN in production.

> - Module system doesn't seem to be designed well.

It’s maybe not perfect, but it’s honestly miles better than Python.

> - You can't use a function before it's defined.

This feels like an incredibly unfair and unreasonable complaint given not many languages that aren’t AOT compiled actually support this and Python and R certainly do not.

> ...and neither is OOP? For doing data things I greatly prefer functional programming for ease of understanding how the data flows and gets changed.

OOP support doesn't imply not having support for functional programming. Swift, Scala, and Rust are good examples of languages that support both.

The parent seems to like Julia but laments that it is missing several features which they have come to appreciate and depend on in Swift. Your response points out that Python and R don't have those features either, which is true, but somewhat beside the point of the parent comment.

> Lack of support for OOP.

Please. No. Languages that try to do everything are crap. If you want to do something OOP, why don't you grab a language built for it?

I do. Many languages have excellent support for both OOP and FP. I think Swift is currently the best-designed general-purpose language and wish that it would improve in areas where Julia is better, e.g. standard library and available packages or REPL.

I'd much rather have a couple of languages that each specialize in their own paradigm than a single one that tries to do everything, to be honest.

What's convenient about Swift as a language which "tries to do everything" is that you can mix and match. Sometimes it's convenient to use different programming paradigms for different components.

It also makes the code a pain to read and maintain.

Difference of opinion. Swift is the most self-documenting language I have worked with between the named function parameters and the power of the type system.

Swift does have a REPL.

Yeah, but Julia's REPL is better in several ways.

Which ways?

Faster startup, better printing of values, integrated package manager, integrated help system, filesystem autocompletion, etc.

Swift actually has faster startup, a better package manager by far, and better custom printing support. It also has Playgrounds and Jupyter notebook integration. Swift is a solid piece of engineering; Julia is a hacky toy.

I'm talking about the REPL. I've tested all points I mentioned before submitting the comment.

honest question, not a serious SDE here. In what ways is julia not supportive of OOP?

It doesn't have interfaces for example. It doesn't have access modifiers. Also, not sure if this belongs to OOP but working with optionals is cumbersome.

Python also doesn’t have interfaces or access modifiers. Would you consider Python to have bad support for OOP by those criteria?
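For reference, Python's closest stand-ins are conventions rather than enforced language features; here's a small sketch (Shape and Square are invented names): abc gives quasi-interfaces, while a leading underscore signals privacy without enforcing it:

```python
# Python's closest stand-ins are conventions rather than enforced
# features; Shape and Square are invented names for illustration.
from abc import ABC, abstractmethod

class Shape(ABC):                  # quasi-interface via an abstract base class
    @abstractmethod
    def area(self):
        ...

class Square(Shape):
    def __init__(self, side):
        self._side = side          # "private" by convention only

    def area(self):
        return self._side ** 2

sq = Square(3)
assert sq.area() == 9
assert sq._side == 3               # nothing actually stops outside access
try:
    Shape()                        # abstract classes can't be instantiated
except TypeError:
    pass
```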

Yes, Python definitely has bad support for OOP.

Smalltalk doesn't have them, either.

Indeed. Alan Kay might take some issue with this notion of OOP.

Only if you define OOP as what Java does.

Smalltalk might be a better example.

Yes, although Python feels more suitable for OOP for some reason, not sure why... perhaps it's the object.method() notation. It's often more natural and readable and allows better autocompletion in an IDE. Also, I prefer that methods are defined inside a class; it's immediately clear what is a top-level function and what is a method belonging to a class or struct.

Well, Julia sort of does have interfaces; it's just that those interfaces are protocols and only conventions (i.e. there's no way to say "my type will implement the Foo interface"). There is an interface for iteration: if you want to implement it, you just implement the iterate function for your type, with and without a state argument. Similarly, the way to say that your type has a length isn't to implement a HasLength interface but to just implement the length function.
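Python's dunder protocols illustrate the same convention-over-declaration idea; here's a small sketch with an invented Countdown type: implement the protocol functions and everything that consumes iterables accepts it, with no interface declared anywhere:

```python
# Countdown is an invented type: no interface is declared, only the
# protocol functions are implemented, mirroring Julia's iterate/length.
class Countdown:
    def __init__(self, start):
        self.start = start

    def __len__(self):             # the "HasLength" convention
        return self.start

    def __iter__(self):            # the iteration convention
        return iter(range(self.start, 0, -1))

c = Countdown(3)
assert list(c) == [3, 2, 1]        # any iterable-consuming code accepts it
assert len(c) == 3
```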

Again sorry for being dense but:

- aren't access modifiers the way to define scope? That is possible in Julia.

- getters and setters: wouldn't those be defined on a struct in Julia?

- inheritance: pretty sure this doesn't exist in Julia.

I seriously barely understand; I'm reading from the docs here:


You can change the indexing method if you want but that can introduce its own issues


I actually like 1-based for numerical work. A lot of great languages (Smalltalk, APL, Lua...etc) use it too.

It makes sense with matrices.

This is something I don't fundamentally understand. As someone who works a lot with MATLAB I'm very used to and like 1-based indexing. But when I use C or Python, 0-based indexing is not something I complain about or hold against the language. It's just the way things are.

Maybe if you don't think of it in terms of a different index basis and instead you think of it as indexing vs. offsets then it becomes easier to switch between the two?

If that's someone's biggest complaint against the language, then I'd say the language must be pretty awesome ;-)

It's basically bikeshedding. "But I look at the shed all day..." "But I'm used to looking at reddish colors..." "Red is more correct than green because <yadda yadda>..." Let's all move on to more important things :-)

It's not bikeshedding, because: 1) it is pervasive and fairly important; 2) it's kind of a deep indication that something is amiss. It's like a language written in 2019 that defaults to dynamic rather than lexical binding: not comparable in importance, but it's an argument that was had and decided 20 years ago. If this is wrong, then you can probably assume that lots of other things which are important (decisions which can't be "proven" correct but have both convincing arguments and weighty empirical evidence behind them) are also wrong. (I'm not having that argument here, but google "why 0-based makes sense" by Dijkstra.)

I might have to learn Julia at some point but 1-based indexing was a fairly heavy indicator to me to hold off.

You’re repeating yourself without defending your point. Beating us over the head with how obvious something is to you doesn’t make your argument any more compelling.

> it is pervasive and fairly important


> it's kind of a deep indication that something is amiss.


> it's an argument that was had and decided 20 years ago.

Evidently not, if we’re having this debate.

> If this is wrong then you can probably assume that lots of other things which are important...are wrong


> google why 0-based makes sense by Dijkstra

Is Dijkstra's opinion so infallible that it supersedes the design choices of several successful languages?

I'm not convinced. I've written a great deal of software in C++ and MATLAB. I don't see what the big deal is either way. You say this thing has been "decided 20 years ago" (by whom?), but that's clearly not the case, as myriad languages that use 1-based indexing are in existence, continue to come into existence, and are used to great effect. You cite Dijkstra's argument (well, you didn't actually cite it [1]), but his reasoning isn't exactly airtight; he appeals to subjective ideas of "ugliness", "unnaturalness", and "preferences", cites a single anecdote about Mesa as proof of his correctness, then dismisses FORTRAN, ALGOL, and PASCAL out of hand as a "pity". Again, I'm not convinced.

[1] https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E...

Much like Turing completeness, you can build systems with any choice that "work", so an argument always has to be from aesthetics, which can include simplicity and "ugliness", but these aren't totally individual (you can get 99/100 people to agree on which of two schemes is "simpler"). Simplicity and beauty aren't something that even mathematicians dismiss, as they can be trying to tell you something.

There's another argument, I can't remember if I originated it or read it but it's the one I find most convincing.

Consider that you have finite units of equal length 1 stretching away from you in a line. How would you specify to someone (using real-number measurements) to retrieve the first N=6, or describe the interval they occupy? You would say: take everything between [0.0, 6.0). If you use zero-based indexing, the numbering scheme specifying the interval is identical across both systems: [0.0, 6.0) => (Python) [0:6] == 0 <= x < 6. The real-number interval could be open or closed for this argument to work, but for there to be any elegant translation the integer upper bound has to be open; that's the only way you keep the elegant transformation with 0 and 6. Caveat: this works because you're numbering the real-number origin as 0, but try telling physicists that they're not to do that! O = (1,1,1) is an even more indefensible position.

So there's another mathematical argument that I either came up with or read elsewhere that feels deeply convincing to me. If there's 2 there's probably more, and here's the list of languages on wikipedia, you tell me which you'd rather work in. https://en.wikipedia.org/wiki/Comparison_of_programming_lang...

Sorry for the irate tone but I really think that on reflection and fair consideration this is an obvious one.

> an argument always has to be from aesthetics, which can include simplicity and "ugliness", but these aren't totally individual (you can get 99/100 people to agree on which of two schemes is "simpler").

Putting aside the implicit claim that something as subjective as aesthetics will be agreeable by 99% of the population...is 0-based indexing simpler? Why?

Your next example is...a little all over the place mathematically, if I’m being honest. I’m having trouble following it. But let me try...

So you’re saying I have a 1-dimensional, ordered set S, whose elements are real numbers. It can’t be an interval, because S needs to be discrete for this to be possible computationally or even theoretically (S is uncountable if it's continuous, and thus cannot be indexed). It needs to have an order to have a sense of position in the first place, so map S to the natural numbers in whatever way you choose.

Now choose an element s in S. You want me to pick the first n elements from s that are “next”, for whatever your order relation is. We’ll call that relation <. In that case I would respond by saying, “Let s be an element of S, and choose an n-tuple N of elements

    (x_1, x_2, x_3, x_4, x_5, x_6) in S
such that s < x_1 < x_2 < x_3 < x_4 < x_5 < x_6, and there exists no element k of S with s < k < x_6 and k not in N.

Haven’t I accomplished your exercise just fine without using 0 indexing? If I were programming this, I could index the n-tuple starting with x_0 or x_1. I don’t understand the issue - what are you trying to convey here? Personally, I’d program this by defining S to be my vector or list or whatever, k to be the index of s in S, then defining my n-tuple with:

    N = []
    for i in range(1, n + 1):
        N.append(S[k + i])
I feel as though you only believe your example is compelling because you’re overlapping the origin with the initial index. But when you abstract your problem to the general case (as I did), then it doesn’t really seem to matter much. Your example is actually kind of a narrow edge case.

Hey, thanks for trying to reply substantively. You haven't understood what I was saying, which is fair enough because I probably didn't present it very clearly. The line where you start talking about something else is this: "So you're saying I have a 1-dimensional, ordered set S, whose elements are real numbers". No, I'm not saying that. First, to define the question as Dijkstra does: we're arguing about, if I have a list L and I write L[a, b] into my computer, which elements of L are returned? My argument is a kind of physical metaphor: take the real number line as a physical axis (like you're doing physics), take a load of toy trucks that each have length 1 (I guess you choose your coordinate system to make the length 1), and put them down end to end. If I were a physicist using real numbers and intervals, and I asked you to pick up 3 trucks starting at 0, I'd say it like this: pick up the trucks in the range [0, 3) on my axis.

Now if you choose 0-indexing + (closed, open) for your slicing, then a (Python) programmer would say to another programmer: take the slice [0:3] to get the 3 trucks. So the slice numbers correspond perfectly to the interval description of the space that the items occupy. That's why, in this slicing system, taking L[a:b] gives you b-a items: it's matched perfectly with a real-number description of the space the items occupy on the number line. All the other advantages, like being able to describe an empty interval ([0:0]) or taking the last element in a way that feels right ([-1:0], read modularly), follow from this physical-space homomorphism. That last one: in a circular (modular) space, walk left 1 and then come back right 1, picking up the element you pass over. In 1-based, [0:1] = the last element? I don't know if any languages do this, but either you can't, which is sad, or you can, and it's incredibly counterintuitive.
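For anyone who wants to poke at these claims, they can be checked directly in Python, whose slices are exactly 0-based and half-open:

```python
# Six unit-length "trucks" laid end to end from the origin.
L = ["t0", "t1", "t2", "t3", "t4", "t5"]

# "The first three trucks" is the slice covering the real interval [0, 3):
assert L[0:3] == ["t0", "t1", "t2"]

# b - a is always the number of items returned:
a, b = 1, 4
assert len(L[a:b]) == b - a

# The empty interval needs no special case:
assert L[0:0] == []

# Adjacent slices tile the list with no overlap and no gap:
assert L[0:2] + L[2:6] == L
```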

It sounds like you haven't considered any of the multiple reasons or situations where 1-based indexing works out better.

If you're going to give a counterexample, please give it.

I thought about my post after I read it and came to this: given Dijkstra's argument (his is the only system that can describe an empty interval, and an interval containing the first element, without "unnaturals") and (?)'s argument that it creates a nice homomorphism between real-number intervals and integer slices, it's inarguable to me that 0-based is more mathematically elegant.

Where I guess there's space to disagree is this, I believe (to almost a point of faith I guess) that mathematical/logical elegance is important, and moving away from it is normally a mistake which leads to more pain in the long run.

Why? You've already come to the conclusion that anybody that disagrees with you is an idiot, and you think your argument is fully reasoned and covers all bases. Why should I believe that you would change your mind given new information?

I guess, because often you're not arguing in order to convince your opponent, you're arguing to sway the crowd (and sometimes, decide yourself).

And I don't think they're idiots, I just think that they're wrong. Not many people have read Dijkstra and (?) and have deep faith in mathematical beauty, or have even thought about this much; that doesn't make them idiots.

The most prominent programming languages designed specifically for mathematical purposes (Mathematica, MATLAB, Julia, R, Fortran) all have 1-based indexing. That should be a sign to you that 1-based indexing has a legitimate mathematical rationale. Dijkstra wasn't speaking on behalf of the mathematics community. The idea that if you have deep faith in mathematical beauty you'll come to the same conclusion is absurd.

I'm pretty sure that they just copied Fortran, which was written most of a century ago.

That's the second time you've ridiculously misrepresented my statements: "anyone disagreeing with you is an idiot" "deep faith will lead you to the conclusion" (and not deep faith + thinking about the actual problem, several convincing arguments and some ability)

One based indexing is used in tons of domains within mathematics.

Fortran chose 1-based indexing for a very obvious reason: it was the best translation from the mathematics literature it was trying to implement, because matrix notation uses 1-based indexing! MATLAB, a language designed specifically as a high-level language for matrix mathematics, chose it for the same reason. R, a language for statistics, chose 1-based indexing because counting is one of the most fundamental operations in statistics, and 1-based indexing is the form used for counting.

Mathematicians obviously have no problem switching back and forth between 0-based and 1-based indexing for different domains, so it boggles my mind that computer scientists have turned it into such a huge holy war, and even more mind-boggling that 0-based zealots claim to have mathematics on their side.

Maths on your side != mathematical convention on your side.

Since @zimablue seems serious about this, and for those who care as much about the array indexing question and why Julia's "arbitrary" indexing might be useful, here are links to a couple of my previous comments [1,2] on HN:

[1]: https://news.ycombinator.com/item?id=15473169

[2]: https://news.ycombinator.com/item?id=15472933


The idea is that the interface to the data structure should ideally closely match the semantic meaning of the data (timestamps, frequencies, etc.) rather than memory addresses/pointers. That lets you program at a higher level of abstraction. More generally, both zero-based and one-based indexing have their natural uses, depending on what you're referring to. Eg: consider floors in a building. If you wish to index the horizontal surfaces separating the spaces, it's more natural to start with zero for the "ground level". If you wish to number the spaces between the surfaces, then it's more natural to start with one for the "first floor". Which one is more convenient depends on the semantics of the problem domain and what ideas you wish to communicate. Being overly attached to one perspective is unhelpful.

To quote myself from those links:

> The way I think of it is that different indexing schemes suit different problems. I want to think carefully about the problem domain and use the most convenient convention. For example, when my array stores a time series, I would like the index to correspond to timestamps (and still be performant, so long as my timestamps can be efficiently mapped to memory locations, which is true for affine transformations, for example). When another array stores the Fourier transform of that time series, I would like to access elements by the frequency they correspond to. That stops me from making annoying indexing errors (eg: off-by-1), because the data structure's interface maps nicely to the problem domain. I find that much easier than the cognitive cost that comes with trying to shoehorn a single convention on every situation. But it's difficult to appreciate that when thinking of language constructs divorced from specific problem domains, as one tends to do when typically studying data structures and/or algorithms.

> [Regarding offset array indexing...] Think of this feature as blurring the distinction between data (accessing an array) and computation (calling a function). The fact is that arrays as they are used (contiguous memory location collection) often carry more information than being just a dumb list of arbitrary data, and it's very convenient to expose that in their interface.

Now, for example, an array can more closely resemble a cached function computation, because the interface to both carry the same semantic meaning.

Thanks for a serious reply. The cached function computation argument is an interesting one, which resembles the answer by evanb here: https://mathematica.stackexchange.com/questions/86189/why-do...

I hadn't seen it before. Basically, if I understand it right, it treats all vectors as Lisp s-exps, which are themselves 0-indexed (the 0th element is 'list'), because you're doing a lot of symbolic manipulation. It's the best argument I've seen for 1-indexing. I'd make two criticisms. 1: I don't know how homoiconic Julia is; if it's not very, then the actual use of this seems very low relative to slicing. 2: It seems to mix up indexing a vector itself and indexing the expression that the vector is (like macro-quoted vs unquoted). I can see how it would smooth things in some way, but it feels like a fudge that might cause more pain somehow later.

No, IIUC, my explanation is nothing like what's mentioned in that comment. I'm trying to argue that it behooves us to use indices which conveniently model the problem domain, not some holy system where the head always starts with zero. I gave several examples to that effect, where zero-based, one-based, or different offset indexing makes sense. If you're really curious to continue the discussion, feel free to reach out via email (see my profile). This thread is becoming unwieldy.

> Eg: consider floors in a building. If you wish to index the horizontal surfaces separating the spaces, it's more natural to start with zero for the "ground level".

It depends. In the USA, 1-indexed floor numbers are the norm (the ground floor is the first floor). Elsewhere (e.g. Australia), 0-indexed floor numbers are the norm (ground floor, then level 1).

Indeed it's starting to sound rather more like a church than a bikeshed!

I am curious what people write all day for this to be such a big issue. It's not like you need to re-invent N-dimensional array indexing every morning -- we have abstractions. If translating code then it could be from either convention... or from a mathematics book. (Where you may note that mathematicians are unconvinced by the CS arguments... admittedly not a group of people known for taste in all matters, but taste in notation they have thought about quite a bit.)

I don't hold it against any language, but it certainly breaks with common convention. The number of simple off by one errors I had when I started writing Lua was a bit irritating.

> but it certainly breaks with common convention

That depends on which circles you run in. If you're a programmer, yeah, it breaks with common convention. If you're a scientist or engineer, not so much. When writing a programming language for scientists and engineers you have a choice: do I stick with programmers' conventions or do I stick with engineers' conventions? I've had the pleasure of teaching both MATLAB and C++ to freshmen engineers. For them, they get off-by-one errors in C++, not in MATLAB.

For "2D indexing" into a 1D array it's actually a little awkward. With an n×m matrix,

- zero-indexed: i×m+j

- one-indexed: (i-1)×m+j

OTOH one-based is slightly better for trees stored in 1D arrays:

- zero-indexed: parent=(child-1)/2; children=2×parent+(1, 2).

- one-indexed: parent=child/2; children=2×parent+(0, 1).
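A sketch of the arithmetic above in Julia (the helper names here are mine, not from any library):

```julia
# 2D index (i, j) into a flat 1-based, row-major array with m columns:
flat_index(i, j, m) = (i - 1) * m + j

# 1-based binary heap stored in a flat array:
heap_parent(c)   = c ÷ 2          # ÷ is integer division
heap_children(p) = (2p, 2p + 1)

flat_index(2, 3, 4)   # 7: row 2, col 3 of an n×4 matrix
heap_parent(5)        # 2
heap_children(2)      # (4, 5)
```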

My favourite fact about this stuff: in VB (or was it VBA?) when you asked for an array of size n, you actually got an array of size n+1. So people could do 0-based or 1-based indexing and be none the wiser...

All fair enough but not particularly relevant here (in the context of "why Julia"). For doing numeric work in a language actually defined for that purpose, you will have proper 2D (and hopefully n-D) arrays, and hopefully slicing operations on them. So you'll never do this.

In this context doing 2D indexing in 1D arrays is a code smell. It does come up in the case of writing libraries for general purpose language with poor support for numerics and linear algebra, but then you should be abstracting this away from your callers.

Well, the nice thing about doing 2d indexing in 1d arrays is the 1d array sits in one contiguous block of memory, which can be important for performance, and also makes it a lot easier to pass the array to C libraries (which I think you alluded to). Whereas in a lot of languages, int[3][3] might be stored as 3 pointers to 3 arrays. I imagine Julia optimizes this?

Yes, in julia A = rand(2,2) is a matrix, and reshape(A,:) is a vector view of the same continuous block of memory. You may also index A[i] directly, as if it were a vector; eachindex(A) is an iterator which uses whichever style of indexing is going to be quickest (and in cache-friendly order).
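Concretely, a small sketch of the behaviour described above:

```julia
A = [1 2; 3 4]        # 2×2 matrix, one contiguous column-major block
v = reshape(A, :)     # vector view of the same memory: [1, 3, 2, 4]
A[4]                  # 4 -- linear indexing into the matrix works too
v[2] = 99             # writing through the view...
A[2, 1]               # 99 -- ...is visible in the matrix
```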

Many of the low level libraries for numerical computation are coded in Fortran, which is 1-based. So it is a wash in this aspect.

That is one of the things I meant by “proper”, meaning an efficient layout and alignment.

Forgive my ignorance, but are you saying you use A[(i-1)×m+j] to get the element in the i-th row and j-th column? Why not use A[i, j]?

GP wrote (emphasis mine) 'For "2D indexing" into a 1D array'. I presume it's some graphics-related optimization or some pointer arithmetics-related work, but I'd also be curious about some concrete use-cases too.

Julia n-D arrays are stored contiguously in memory. So, you can access the memory by reshaping the array into whatever dimension you want and then access it using the A[i,j,k] syntax.

What does reshape actually do? Create a 1-D array from a 2-D one or vice versa, or just make it appear to be so?

It alters the descriptor of the data layout, not how the data is actually stored.
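In other words, a minimal sketch:

```julia
v = collect(1:6)       # six Int64s in one contiguous block
M = reshape(v, 2, 3)   # a 2×3 descriptor over the same block; no copy
M[1, 1] = 99           # mutate through the reshaped view...
v[1]                   # 99 -- both names point at the same memory
```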

Got it, thanks.

APL let you choose. It was only trouble and then Iverson made J 0-based.

Yeah you can change when needed of course (forgot about that).

Iverson's J did a lot of things (chief among them removing the awesome symbols) to make it more appealing. A lot of folks understand the cause, but also realize it takes away one of the best reasons to use APL. So moving to 0-based indexing wasn't necessarily because he thought it was better.

Out of curiosity, could you give more details about the module problems?

Unfortunately I do not recall the exact issue anymore; I abandoned the project because of it. But the general sentiment on the Julia Discourse seemed to be to just avoid them. This blog post seems to sum up my issues with modules and namespaces pretty well though:


I think that these issues are generally acknowledged although I don't know if they will be addressed. Seems like major pain points for library development should have been addressed before 1.0.

A lot of this blog post is addressed by just using `import` instead of `using`... so it was actually just an issue of not reading the manual. It's like using `from package import *` and then complaining Python doesn't namespace properly.

> It's like using `from package import *` and then complaining Python doesn't namespace properly.

It really is not:

* `from package import *` is more work than `import package`, whereas `using` is shorter than `import`

* the official Python documentation starts with qualified imports, then introduces local bindings, and finally unqualified imports; open a few doc pages and the imports are either fully qualified or explicitly bound. Meanwhile, the first occurrence of using a non-prelude package in Julia's tutorial is `using`[0], and the modules documentation explains `using` first

The official Julia documentation very specifically steers the reader towards unqualified imports as the default & proper way to do things, Python's does the opposite.

[0] https://docs.julialang.org/en/v1/manual/functions/#Optional-...

edit: oh for fuck's sake I hate hn's shitty brain-dead pseudo-markup.

> `using` is shorter than `import`

By one letter!

And in the right direction: for trying things out interactively, `using ThePackage` and then having everything available is great.

Then once you know what you want to do, in more careful code you can switch to `import` and qualify more things, and your future self will thank you.

But serving both of these needs seems like a valid design goal. I'm not sure the manual does a great job of explaining this right now; import vs. using also changes the rules around adding methods to functions, and it is perhaps more confusing than it has to be.
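For anyone skimming, a minimal sketch of the difference, using the stdlib Statistics module (behaviour as of Julia 1.x):

```julia
import Statistics              # binds only the module name
Statistics.mean([1, 2, 3])     # 2.0 -- qualified access required

using Statistics               # also brings exported names into scope
mean([1, 2, 3])                # 2.0 -- unqualified access now works

# Method extension differs too: `Statistics.mean(x::Char) = ...` works
# either way, but an unqualified `mean(x::Char) = ...` after `using`
# is an error unless the name was explicitly imported.
```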

> By one letter!

Yes, by one letter. Out of 6. "using" is 16% shorter than "import". And it uses better keyboard alternation (the first 4 letters of `import` are on the right side of a qwerty keyboard and the last 2 on the left; for `using`, only "i" and "n" are consecutive letters on the same half of the keyboard).

So "using" is 1. introduced first 2. the primarily documented import mechanism 3. significantly shorter and 4. significantly more comfortable to type.

If Julia's community doesn't want people to use it everywhere, they're doing a very, very good job of fucking with and victimising their users by way over-incentivising the use of `using` over `import`.

> And in the right direction: for trying things out interactively, `using ThePackage` and then having everything available is great.

And for writing maintainable code it's terrible, despite being by far the easiest and most convenient solution.

ChrisRackauckas complains that people use `using` over `import`, but literally everything in the language and documentation pushes them towards it.

Putting the convenience of interactive sessions way, way over that of proper programs does not seem like "the right direction" to me, especially not when people then jump on their high horses and chide users for doing what the language unambiguously pushes them towards.

> But serving both of these needs seems like a valid design goal.

Python does that just fine: it provides convenience for interactive use without pushing users towards the least maintainable and desirable option.

But you have to type `python3` which is 7 letters to 5, if we're counting :)

Seriously though, it's hard to write one manual which gives everyone the perfect on-ramp for their needs. I actually think it's fair to aim it low, make it easy for matlab refugees to get started, interactively. People who know how namespacing works in several other languages (and intend to write big projects) are the ones well-equipped to know how to dig deeper. Nobody has been victimised!

> But you have to type `python3` which is 7 letters to 5, if we're counting :)

Of course, that was actually a pretty big consideration during the "vcs wars", both the length and the alternation of VCS commands.

> I actually think it's fair to aim it low, make it easy for matlab refugees to get started, interactively.

Then don't whine that people use that, and police your community.

> Nobody has been victimised!

Well, how's that for dishonesty. ChrisRackauckas pretty much goes "this article is invalid because you're doing exactly what the language suggests and incentivises" half a dozen comments above.

My point was that counting letters in one short command seems myopic here, it's not something you type in isolation. Julia package names tend to be much longer and more explicit than python/R ones. But julia code (at least mine) tends to be more compact, partly because I can write things like A ⊗ B instead of np.kron(A,B).

As far as I'm aware there is no budget for a police force. There is, however, a double-your-money-back guarantee on all advice from blog posts!

I do think the manual could be clearer about using/import. Perhaps this should be emphasised under workflow [1], and packaging differences from Python included in the list [2]. The experts have forgotten what the pain-points were; contributions from those who have not are thus welcomed.

[1] https://docs.julialang.org/en/v1/manual/workflow-tips/ [2] https://docs.julialang.org/en/v1/manual/noteworthy-differenc...

>Yes, by one letter. Out of 6. "using" is 16% shorter than "import". And it uses better keyboard alternation (the first 4 letters of `import` are on the right side of a qwerty keyboard and the last 2 on the left; for `using`, only "i" and "n" are consecutive letters on the same half of the keyboard).

Counterpoint: I am going to attempt an autocomplete after two characters by pressing tab. Typing 'us' leaves my left hand, the tab hand, as the last used hand before the tab. This prevents me from buffering the tab by moving my hand into position while typing the first two characters. 'im' is typed very easily with my right hand and while I'm typing that, my left hand is free to move into position over the tab key.

Right, so this is a documentation thing, and I agree the docs should probably be changed a bit. But it's not a language thing.

Although I read the module documentation I did indeed miss that; I was learning mostly from the Julia tutorial notebooks and am definitely a Julia noob. Julia's `using`, `import`, and `include` do seem like they could have a better design. In Rust or Python, for example, anyone could tell the difference without looking at the docs. I assume `using` was kept for backwards compatibility?

I'll need to take another look at the code I was working on. I was planning on doing so after a break anyway.

fair enough, the article is from 2015 though; these weren't open research problems, I'd hope they've been fixed since.

out of interest what would you consider medium sized?

The main issue I encountered as a Julia user is that multiple dispatch doesn't scale very well.

When you start building out a project, it's easy to keep track and debug if multiple dispatch starts failing (i.e. the `Any` type starts spreading everywhere and Julia slows to Python-like speeds).

In medium-to-large projects, it becomes extremely cumbersome to manage this. It's doable, but adds a layer of complexity management to projects that simply doesn't exist in strictly typed or pure scripting languages.

Of course, you can just decide to explicitly type everything - but the issue here again is the lack of enforcement.

In a nutshell: Julia is great when you're a grad student working mostly by yourself on small scale projects! But not so great in prod.

And there's really no problem with that; that's who the language was designed for!

> In a nutshell: Julia is great when you're a grad student working mostly by yourself on small scale projects! But not so great in prod.

Some people would disagree with that


Sorry - I didn't mean to sound so negative! I'm very well aware of all the large Julia use cases and they're often great applications of the language.

I would also argue that the large open source Julia packages are also great examples of Julia "in prod".

Just highlighting what I think is a significant con in a language with many pros!

This was published on their own website, so it is biased toward a good perception of them. It doesn't seem to be a fair and independent review to help build an opinion.

And these were not published on their website:




I had to go so far as to read the project Github page to find them.

In all those papers, there is at least one member of the team in the author list, or a member of a partner organization (Intel, Lawrence Berkeley National Laboratory, etc.). I wouldn't call them fair, independent reviews of the language.

Are you really complaining that the people writing about the tool are the ones developing it?

Whose opinion do you want, the Pope? Would a divine sanction be enough for you?

Yeah, my biggest want for Julia 2.0 would be a built-in static type analyzer.

That is neither a breaking change (and thus could come in 1.3) nor something that needs to be baked in (and thus could live in a package).

I think multiple dispatch will scale perfectly fine for large projects when it has better IDE support.

There's no reason Juno or any other IDE couldn't display the output of a static analyser inline, or allow you to command-click a function call to go to the site of the exact function being called, or show a list of alternatives, and so on.

Give Julia and its IDEs a few years to improve and you might find it much better suited to large projects. I wouldn't consider Java any good for large projects either, if it didn't have the excellent IDE support it now enjoys.

The optional typing seems like the perfect solution to this... skip explicit typing for small scale projects, but make sure you add it for production...

Both Cassette-based performance linting and static type checking will solve this problem. The beginnings of the tooling are already forming.

> it's faster than other scripting languages

That certainly depends on your use case. For instance, the launch time is ridiculously slow, so that you cannot realistically run a small matrix computation in julia from within a shell loop. It is better to use octave for that, where the startup time is almost negligible (just a bit slower than starting a subshell).

I agree. I metaprogrammed (enormous) Julia expressions from analytical expressions exported from Mathematica, because I found Julia to be ~3000 times faster than Mathematica when it comes to calculating eigenvalues. Using BigFloat for higher precision, my matrix function in Julia took ~20 minutes to compile on the first run and ~20 GB of RAM. Smooth once compiled, but I was the only one of my collaborators who had the capacity to run it.

Note that I still like using Julia. I'm a physicist and need to do a lot of computations in a hassle-free way, and jupyter+Julia (+SymPy) is the best available toolset.

The above may only be an issue of BigFloat, to be fair, since Float64 compiled in an instant (never measured, and the time never bothered me).

So Julia has solved a lot of problems for me, and I see great potential for it in the future.

I didn't find it slow, but:

    $ time julia -e 'print(1)'
    real    0m0.438s
    user    0m0.300s
    sys     0m0.118s
    $ time python -c 'print(1)'
    real    0m0.040s
    user    0m0.036s
    sys     0m0.003s
it is slower..

That said, instantiating Julia at every step of a bash loop... I think it requires a JVM mindset: warm up once and iterate inside rather than outside.

> I think it requires a jvm mindset, warmup once and iterate inside rather than outside.

I do not have this mindset then. I prefer tools that are mindset-oblivious. They are really useful!

For example, imagine I have a collection of a few hundred images with their projection matrices (in text files). I want to crop them and apply a simple imagemagick operation (which is not available from inside julia). The elementary solution is to run a shell loop to apply the crop and call julia to perform a simple adaptation of each projection matrix. This is impossible today: most of the running time of such a loop is spent on julia initialization. Half a second to do nothing is simply unacceptable in a serious scripting language.

>I do not have this mindset then. I prefer tools who are mindset oblivious. They are really useful!

Yet you picked a specialized tool "imagemagick" (made for working with images), and want to drive it by another specialized tool (the shell, made for file and process management).

Since you want "mindset oblivious" tools why not write everything from scratch, in C?

> Since you want "mindset oblivious" tools why not write everything from scratch, in C?

Because imagemagick, julia and other specialized programs already provide the advanced algorithms that I need. Unfortunately, julia is not a good team player.

The Julia tooling is not currently designed for repeatedly creating new REPL/interpreter instances to quickly solve one small problem.

What it excels at is rapidly solving the same problem many times within one REPL/interpreter (since after the first time the code will be compiled), or for longer computations where the overhead of the initial compilation does not matter.

It would be great if it could also get fast at the use case you are interested in, but this isn’t currently what it excels at.

>Unfortunately, julia is not a good team player.

Or, you know, Julia is not specialized for your use case.

I guessed that these weren't your habits, but if you step outside that box a bit you'll realize that lifting things up or down inside loops is the most natural thing to do. And for your example (which may not be your real use or workflow) you could have two loops, independent or coupled through a queue, so that the julia process is only started once.

Of course! Once I realize that after a few seconds only a dozen images have been processed, I cut the loop, remember that the julia repl is dog slow, and then rewrite the task in a different way. But I would prefer not to have to do that. Moreover, for more complicated examples, there may be data dependencies that make the loop commutation non-trivial.

The slow startup time may be a minor inconvenience, I agree. But nonetheless it seems to be a case of sloppy engineering, as other scripting languages do not have this egregious problem. It sets a bad tone, and makes me wonder if there are other hidden monsters inside the interpreter, that may be solved by "you are holding it wrong" like this one.

Yeah, I get it, it's inconvenient. But taking the JVM comparison again, I think julia's value lies outside this way of doing things.

Out of curiosity I timed ocaml

    $ time echo 'let x = "1" in print_endline(x)' |  ocaml -stdin - 
    real    0m0.045s
    user    0m0.039s
    sys     0m0.007s
I thought julia's init time was due to a type-checking phase, but ocaml seems to have no issue, even with non-trivial code:

     $ time echo 'let l = [1;2;3;4] in let m = List.map (fun x -> x*x) l in print_int(List.length(m))' |  ocaml -stdin - 
     real    0m0.050s
     user    0m0.040s
     sys     0m0.011s
Sad, I liked a lot of Julia features..

ps: this hints at the init time being constant recompilation of its core https://www.reddit.com/r/Julia/comments/4c09m1/julia_045_sta... and they were thinking (2 years ago) of caching..

What were you planning to use Julia for where its init time matters?

Nothing, i'm just curious [1] about Julia. I never used it.

[1] It's just because I find the scientific Python stack a bit absurd.

Right now, using it as a "quick" command line utility for solving small problems is probably not a good Julia workflow (unless the overhead you mention doesn't matter to your application). Where it shines is for longer-running or repeated computations where the initial compilation cost of starting a new REPL doesn't matter.

Preaching to the converted

You have to use it inside a REPL, like R or Matlab. Julia using LLVM and compiling on-the-fly is not sloppy design, it's a design tradeoff (and a great one, if you really need speed).

Julia is not a scripting language, it's a language for mathematical analysis.

Also, going back to your example, the solution would be to use ImageMagick.jl from Julia, extend the library (4 extra lines with ccall) if the operation is not available, and send a pull request to help the community.

> Julia is not a scripting language, it's a language for mathematical analysis.

Ok, this clarifies the matter a lot. So julia is not intended to be a general-purpose programming language. Are you involved in julia development?

(Besides, I strongly dislike the design tradeoff of not being a good unix citizen.)

Julia is a general-purpose programming language. The language itself is actually really good as a scripting language, including integrating well with shell, C, Fortran, R and Python, for example:


The only trade-off here is the restricted resources the Julia devs can afford. Focusing on the scientific computation niche, the Julia compiler was built to compile code just ahead of time, since this way it can create code as fast as compiled languages while still being dynamic like any interpreted language (such as Matlab and Python). But there is nothing in the language that prevents it from being fully interpreted (and therefore having minimal compile time overhead, but generating poorly optimized code for long running processes), or alternatively pre-compiling and caching to have both (which is being worked on: https://github.com/JuliaLang/PackageCompiler.jl )

Not Scripting != Not general purpose

I'm not a Julia developer, but the great thing with Julia is that the libraries are exceptionally easy to read and extend, because the core developers make a strong effort to write clean code. Just look at ImageMagick.jl and you'll see, but I have the same experience with other libraries as well.

Also the language itself supports extending code (multiple dispatch really helps it to stay clean).

Why can you not call your imagemagick operation from julia? Can't you just `run` it as a backtick command?
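For what it's worth, a hedged sketch of what that looks like (file names are hypothetical, and it assumes ImageMagick's `convert` binary is on the PATH):

```julia
img = "photo.png"
# Backtick syntax builds a Cmd object; interpolation quotes arguments
# safely, and run() executes it without an intermediate shell.
run(`convert $img -crop 100x100+10+10 cropped.png`)
```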

This is a part of a quite large system that does a lot of things. I do not want to wrap everything inside julia, that would force me to rewrite everything in julia. I just need to invert a few 4x4 linear systems from inside a shell script. I agree that this is not a very representative use case, and maybe it is irrelevant for most people. Yet, if I try to evangelize julia in my lab this problem blocks it, and it is better to stick to octave.

To demonstrate how slow the startup time is, here’s the respective times for executing ‘print(1)’ from Bash and from REPL on my machine:


    julia> @time print(1)
    1  0.000031 seconds (7 allocations: 272 bytes)

    $ time ./julia -e 'print(1)'
    real    0m0.184s
    user    0m0.156s
    sys     0m0.168s
So yes, that’s a slow launch time. If you use Julia in a shell script and it’s starting up Julia on each calculation it will be brutally slow. Nearly four orders of magnitude slower, in this case.

Honestly, parent had a use case in mind, but it's ultra-orthogonal to julia's goals. They're aiming at large numerical problems. As I said above, if you need julia for a quick one-shot computation in a bash script loop, by all means use something else. But my bet is that anybody reading about julia is planning to do the heavy work inside of it, in which case init time will probably be insignificant.

It's a bit annoying, it's true: a well-featured language with dynamic/REPL appeal that doesn't have millisecond launch times. But alas..

Totally agreed, I'm a fan of Julia. I just wanted to comment with the delta between startup time and actual execution for passerby.

First of all that is ridiculously slow. Secondly, unlike Matlab you don't have everything you need available after starting the REPL. For instance if you want to plot something you might run `using Gadfly`. How long does that take?

16 seconds. Sixteen seconds. For real. This is not usable.

Hold on, how long is MATLAB's boot?

Maybe he was thinking about Octave, which is an interpreter for the same language as Matlab and starts much faster.

Yeah but heaven help you if you need nested for loops in octave.

Much less than 16 seconds. And that gives you a whole IDE.

Agreed Octave works excellently as a linear algebra/numerical subshell for Shell scripting, better than Python IMO.

I like Julia a lot, and thought I understood some of it, but this article puzzles me:

> This output is saying that a floating point multiplication operation is performed and the answer is returned.

But "this output" is:

  %2 = mul i64 %1, %0
  ret i64 %2
which looks very much like an int64 operation and return.


> Here we get an error. In order to guarantee to the compiler that ^ will give an Int64 back, it has to throw an error. If you do this in MATLAB, Python, or R, it will not throw an error.

while the quoted input and output is

  In [6]: 2^-5
  Out[6]: 0.03125
(ie, no error, but the correct (floating point) result.)

This was written back in Julia v0.5, and the compiler got smarter, so I need to update my examples :). Here, Julia now specializes on the fact that `-5` is a literal, inlines the literal, and corrects the output type using that value. If you define it as a variable and stop constant propagation, it'll error. Stuff like this is making it harder to write tutorials to show what's actually going on, because literals and constants are all getting optimized on now!

Another confounding fact is that Julia optimizes on small unions, so the generated code isn't that bad anymore. Now it just creates a branch. It used to have to do all of inference and dynamic dispatching on the fly, which is what it has to do in fully uninferrable code of course, but a union of two things just does a type check and splits. So now... that example is not as bad as it used to be...
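A quick illustration of the literal-vs-variable distinction described above (a sketch; behaviour as of recent Julia 1.x):

```julia
2^-5       # 0.03125 -- the literal -5 is specialized at compile time,
           # so the compiler can pick a Float64 return type

x = -5
2^x        # throws DomainError: with a runtime exponent, an Int64
           # result cannot be guaranteed for a negative power
```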

Ah, well, I guess that's good news then!

As a rule of thumb, it is probably a good idea to ignore Julia articles that are more than a year or so old. Anything that is discussing pre-1.0 may be completely irrelevant at this point. The language has evolved a lot, even the past few years, and it is only with the 1.0 release last fall that the devs promised to start maintaining backwards compatibility.

So there are lots of older articles discussing drawbacks to Julia, or ways to do things in Julia, that simply aren’t relevant anymore. Unfortunately they still often are the first links that come up when searching google for Julia questions.

  >  In [6]: 2^-5
  >  Out[6]: 0.03125
  > (ie, no error, but the correct (floating point) result.)
On the other hand:

  julia> 16^17

That's just integer overflow.

I know. But I don't think that's the correct (floating point) result most people would expect, so it's worth signaling.

  julia> 2^62

  julia> 2^63

  julia> 2^64

To be fair, though, this behaviour (namely, staying close to machine arithmetic) has been discussed and decided upon, and advertised. It's what enables the speed.

C/C++ makes the same choices, by and large.

I don't think C/C++ have integer exponentiation so even C/C++ programmers may be surprised here. I don't know if the behaviour of the power operator when both numbers are integers (and depending on whether the exponent is negative or positive) is mentioned in the documentation.

This is something that was just heavily discussed on the Julia Discourse, with a big back and forth between individuals used to the computer integer arithmetic rules vs people wanting Python like functionality.

Based on the confusion there, it definitely seems that it could be made clearer in the docs that if one wants floating point results one needs to use 10.0^17 instead of 10^17.
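A sketch of that rule of thumb (Int64 arithmetic wraps silently; Float64 and BigInt do not):

```julia
10^17        # 100000000000000000 -- still fits in Int64
10^19        # wraps around silently: Int64 overflow, no error
10.0^17      # 1.0e17 -- a Float64 base gives floating-point results
big(10)^19   # 10000000000000000000 -- BigInt for exact large integers
```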

I did a 5000-line dissertation project in Octave after rejecting Julia. Reason: I had derived the math in linear algebra, including Kronecker products; the math mapped to Octave pretty directly, but Julia required me to translate all the Kronecker products to loops (yuck!). kron(A, B) would become 12 lines of weird indices and for loops. On the listserv I was told that Julia was great because it didn't require vectorization for performance, but I only wanted vectorization for graceful expression.

Plus I got annoyed with extra weird syntax, but I can't remember the specifics.

Basically, Julia required more lines and characters and wasn't as close to the math.

Aside: I think Matlab/Octave is a lot like SQL and Tcl: lots of haters, unfashionable, but usually the most elegant solution.

Surprises me. Julia can usually stay much closer to mathematical notation than, say, Python or C++.

You can even use nice Unicode notation such as A ⊗ B ⊗ C.

So, `kron` is actually provided by the standard library [1], are you saying that this kronecker product didn't do the job?

[1] search in this file: https://github.com/JuliaLang/julia/blob/master/stdlib/Linear...

It was 5 years ago, maybe it's better now.

I wasn't comparing Julia to Python or C++, but to Matlab/Octave.

Thanks for that clarification. 5 years is a lot of evolution for a language that was roughly 2 years old when you used it! Might warrant a re-evaluation, or at least always state the timeline along with your opinions :-)

As notation for array based algorithms, Octave/Matlab is vastly better than anything else I've found. Some guy did PRML in Matlab, others have done it in Python. The Matlab version is like reading a book; clear, concise, correct.


IMO, for most people doing array-based algorithms, Matlab hits Iverson's "notation as a tool of thought" in a way that APL never quite did.

For such things, Julia is often very close to Matlab indeed, or at least it can be used that way.

You could translate most of those files line-for-line, with quite a few lines identical or trivially changed (bracket shape, or max -> maximum). This is often a useful thing to do, get a transliterated version running, and then re-write bits of it more idiomatically, while checking that the output is identical.

I'd be happy to look at the output of such an exercise. If I were transcribing it, it would either be to J or C. Julia doesn't solve any problems for me.

Julia has had a kron function in Base since at least 0.5.

I haven't dipped into Julia's macro side, but I wonder how much work it would be to just create macros to create syntactic sugar that maps infix Kronecker products to the Kronecker function.

There are so many Julia packages that do similar stuff that I imagine it can't be all that hard for people who have become fluent with the macro system.

Why metaprogram? Just define an operator using the built in kron function. Example:

    const ⊗ = kron

    A = rand(5,5)
    B = rand(3,3)

    A ⊗ B

Tada! I'm not sure how MATLAB/Octave's kron(A,B) looks more like math than A⊗B, but everyone can have their own opinion.

Oh. For the benefit of others reading this comment thread, this is why that works: https://docs.julialang.org/en/v1/manual/variables/#Allowed-V...

This is the second time I've learned about time-saving functionality related to infix that I didn't know about. (The previous was the symbol for integer division, which had been left out of the documentation until version 0.7). In this second case, interpretation of Unicode symbols as infix operators should be better surfaced in Julia documentation and tutorials - it's a really useful feature.

`kron` is in the standard library, at least in the first stable version of the language https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/index....

I want to like Julia but after a decade of Python, every time I try it, it’s a death by a thousand cuts (and outdated Google results). I just can’t afford to have productivity drop to near zero for the learning curve plus reimplement everything.

Also I have found multiple dispatch to be harder to locate than regular OO methods (for IDEs, but also for grep).

When I search Julia questions on Google I always restrict the search to the past year or six months. You are right that there is an enormous amount of irrelevant info out there from earlier versions of the language.

Have you tried methods(some_function) or the @which macro?

Yes - @edit some_function(arg1, arg2, ...) opens the file at the relevant method definition (i.e. the version matching the given argument types) in my editor - whether in the Julia source code or in my own modules.
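For reference, a quick sketch of those introspection tools applied to a Base function:

```julia
# methods lists every method attached to a generic function:
mlist = methods(sort)
@assert length(mlist) >= 1

# @which reports the specific method that would handle these arguments:
m = @which sort([3, 1, 2])
@assert m isa Method
```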

You don’t have to reimplement all your python work from the get go. You can call python from Julia.

Honestly my biggest bug with Julia was the lack of good programming environment. I detest MATLAB, but I can be very productive in that IDE. PyCharm is pretty good too when using NumPy. For Julia though, the tooling just didn't seem there yet.

Juno is actually quite nice. I find it far more productive than Jupyter and superior to Hydrogen

I didn't know about the Unicode tab completion. I gave it a whirl on the cli and loved it. Then I tried it out in julia-mode in Emacs, and it works there too! What other editors support this?

vim: https://github.com/JuliaEditorSupport/julia-vim I have an F-key shortcut in my .vimrc to turn this on mainly for non-Julia files, so that I can type (maths-y) unicode in Markdown / text files.

Certainly Atom (and Juno), Sublime Text, and Jupyter notebooks.

Works in VSCode with the Julia extension too.

I tried to like Julia but after a while I realized there was absolutely no use case for me to choose Julia over other languages. It might be useful for new grads looking to learn a language though.

I'm moving my code that uses statistics from Ruby to Julia, and I love the similarities.

Of course 1 based indexing makes the move harder, but it's worth the effort anyways.

I'm trying to embrace 1-line functions as much as I can, and Julia helps me with its special notation for short function definitions.
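A small sketch of that notation, for anyone unfamiliar:

```julia
# "Assignment form" defines a function in one line:
square(x) = x^2
@assert square(4) == 16

# Anonymous functions are similarly terse:
@assert map(x -> 2x, [1, 2, 3]) == [2, 4, 6]
```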

I just hope they remove the awful @threads macro with https://github.com/JuliaLang/julia/pull/22631

Developing anything multithreaded was a major pain.

As someone who has never run into performance problems with R, and also knows how to use Python - is there a good reason to learn Julia?

Multiple dispatch and meta programming are both wonderful, irrespective of speed benefits.

For example, compare linear algebra syntax in Julia with those in R and Python.

Plus, the flexibility of multiple dispatch applying equally well to any of your own types really makes it feel like you can do anything in Julia.
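As a tiny sketch of this point (the `Point` type and `scale` function here are made up for illustration), a method is selected by the types of all arguments, including your own types:

```julia
struct Point
    x::Float64
    y::Float64
end

# Two methods of the same generic function, selected by argument types:
scale(p::Point, k::Real) = Point(k * p.x, k * p.y)
scale(k::Real, p::Point) = scale(p, k)    # either argument order works

@assert scale(2, Point(1.0, 2.0)) == Point(2.0, 4.0)
```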

Metaprogramming, writing code that writes code, can take a while to get used to. But is extremely powerful.

A final point I will make is the composability of Julia code. If you want to fit a Bayesian model or solve a differential equation in Julia, you can mix your own types and code with any generically written library code at will, and it will often just work. In R, you're not likely to find that; your Stan code can't call R functions you've written or those from your favorite libraries.

Multiple dispatch and metaprogramming are available in Python as well. Although duck typing is usually a better solution, and it's not really a killer feature for data analysts anyway.

Compared to python, i'd say julia has the reputation for being generally faster for things you can't use numpy with, and you can more easily scale multiple cpu or machines.

By metaprogramming in Python, do you mean using decorators ? If so that's a far cry from metaprogramming

Import hooks with access to the AST, full object introspection and metaclasses are not too shabby.

They are not that used though, too much magic and implicit processing are not promoted in the community.

Not too shabby indeed! Now I have to check what's up with Hy.

Multiple dispatch has to be coded by hand in Python, it isn’t a feature of the language.

Python 3.7 added the ability to use type annotations with the @functools.singledispatch decorator which lets you write functions in a multi-dispatch manner.

As the name suggests, that lets you write functions in a single-dispatch manner. That is different from multiple-dispatch.

Sorry you are right. We only have single dispatch built in. I never had the use case for more so I confused them.

Ideally, Julia should give you the best of R (good, integrated data-types, -structures and functionality for statistics etc) and python (a sane, real programming language) - with the benefit of speed - and no/hardly any need to drop to/link to C libraries - "everything" is in Julia, and you can inspect the code "all the way down" (well, to a point - there's of course llvm at the bottom).

Whether Julia is that for you right now probably depends a bit on your needs/use cases.

Until Julia gains some form of AOT compiler for static binaries, it's not quite an alternative to Python, in my humble opinion - you can't expect to run Julia scripts everywhere you can run Python. Though this is similar to R. And there are of course much bigger ecosystems of Python and R code/libraries than currently exist for Julia.

Personally I think they've done a great job on the design/syntax of Julia - it's a fun language - and that might be the best reason to play with it?

> and python (a sane, real programming language)

So... R is an unreal language then?

Nah, it's just a re-implementation of S, which was really, really old and different from what people expect.

R is wonderful (and has multiple dispatch, thank you very much), but it's definitely not as familiar to developer types as Python is. R has lots of weird quirks (a[1] vs a[[1]] vs a[1,]) and uses function based generic programming rather than object based generic programming.

Because of the re-implementation of S, it has lots of odd quirks and multiple ways to do things, which Python tends to avoid. Python is definitely more consistent from a programming point of view, but R is a more cohesive system from an analytics point of view.

But yeah, lots of people are hateRs (unfortunately).

Wise words here.

In my humble opinion, applying S's syntactic sugar on top of Scheme-forged core did more bad than good to R. Most of R's quirks stem from the authors' desire to keep it as compatible with S as possible.

As for the two approaches to solving problems (TIMTOWTDI vs Python's "one true way"), I think that's a matter of personal preference; I, personally, appreciate the freedom that R gives me. Also, it might be anecdotal, but I believe that - to some extent - it promotes thinking "outside the box", especially when you work in a team and review others' code on a daily basis.

> Most of R's quirks stem from the authors' desire to keep it as compatible with S as possible.

On the other hand, if it was not for that compatibility with S we would not be having this conversation: R would be just a footnote in the history of statistical computing like Lisp-Stat.

Exactly! I probably didn't make the trade-offs clear in my original post. I believe that the massive price increases in S-plus licensing may have driven lots of people to start using R, which would definitely not have happened without the S-compatibility.

Your very last point should be heavily against R, not in favour. Having multiple ways to do the same thing is fine when you work by yourself, but in a large team it makes the style of the codebase vary a lot based on who wrote each block of code and makes reviewing others' code more difficult. If you want to get anything done, reading code really shouldn't be your daily challenge of how "outside the box" you can think just so you can understand what is going on.

That's true if we're talking about the overall architecture of the project. And, since R permits so many different coding styles (and paradigms: native functional and several OO), the team should have someone (usually the most senior software engineer) play the role of an architect and design the structure of the code, data structures, API, and so on.

Perhaps having a fixed (and commonly agreed on) set of design patterns - like in Java - makes development easier and smoother, but most of the projects I've worked on have proven successful, so maybe it's not that bad either... :-)

> Nah, it's just a re-implementation of S, which was really, really old and different from what people expect.

Was S "really, really old" when it was re-implemented in R? Gentleman and Ihaka first announced R in 1993 (26 years ago!).

The development of S started in 1976 (i.e. 17 years before R was first released). R was an implementation of S3, "the New S language", which was a major rewrite done in 1988 (it was just five years old when R was announced!).

I use R daily at work, and often reach for it before Python for non-statistical tasks out of comfort.

But R's functions for non-statistical tasks often break the language's idioms. Those dealing with files and connections are especially ugly.

I wouldn't call it insane or unreal, but it's definitely not intended for general scripting.

I don't argue that R is not insane ;-)

As for (a bunch of) functions in the basic library (which, essentially, is a set of packages that you can discard or simply not use), I dare say it has little to do with the language itself. I don't like Python's regular expressions library, for example, but it's only a set of functions (or methods) that you can write on your own! The same goes for R's data-wrangling routines; there's this tidyverse, and you can write good, reliable, production-grade code without using a single function from the base library. Heck, you can even write your own DSL (think: Grammar of Graphics) if you think it could improve your code!

There are several quirks built so deeply into the core of the language that you cannot overcome them ("why can't I overload `+` for character strings??"), but they have nothing to do with the rather poor and unintuitive basic library.

> there's this tidyverse, and you can write good, reliable, production-grade code without using a single function from the base library

A different take on the subject (https://r4stats.com/2017/03/23/the-tidyverse-curse/):

> On the other hand - I have met a guy using R in production. And he told me that he needs code that stays the same for 5+ years. That is why he does not use dplyr or any tidyverse; it is still changing too much.

That's right! And I always keep telling my peers that they should think twice before they decide to use tidyverse packages in the production code. What I meant is that in R you always have choice - and even if you feel that the existing solutions don't meet your expectations, you can always write your own (even if it means writing a DSL like Hadley et al. did in ggplot2 or rlang).

Last week I compared a standard regression command across several statistical programs (R, Py, Stata, Matlab, Julia, etc) on a stupidly simple 10x5 matrix, and Julia took ages to run (this is on windows). Even loading the CSV file took seconds in Julia, hundreds of times slower than all of the other packages put together.

I'm not sure if they just suck at Windows, but I was extremely disappointed... :/

The first time you run code in a REPL Julia will have to compile it. So starting a new REPL, and trying to solve a very small problem one time will currently be a terrible workflow for Julia (unless you don’t care that it may take a few seconds).

If you had run your code again within the same REPL you’d see it runs substantially faster the second time. Likewise, if you had tried to process a large enough matrix or CSV that compilation time should be a minimal subset of the total computing time, and probably not noticeable.

It isn’t a language limitation, just a tooling limitation. The startup times have been getting better as Julia develops, but there is still noticeable overhead the first time code gets run in a new REPL...

Probably it took ages for the first invocation. Because that is when the compiler has to produce native code. The compiled codes then is super fast. Just invoke it a second time (in the REPL) to see the difference.
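A minimal way to see this in a fresh REPL (the function `f` is just an arbitrary example):

```julia
f(v) = sum(v .^ 2)

@time f(rand(1000))   # first call: time includes JIT compilation
@time f(rand(1000))   # second call: just the already-compiled code

@assert f(ones(4)) == 4.0
```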

Obviously that question depends on your motivation, but one good reason is that it is a great way to learn how software actually works.

R makes it very easy to see the underlying R code (you just type the function name), until you get to a ".Call" or ".Primitive": from that point, it is effectively a black box.

But as most of Julia is written in Julia, you can easily inspect and understand how functions work, all the way down. Moreover, by using the @code_* macros, you can also inspect the various stages by which the code is transformed from high level Julia code down to the actual machine code which is running on your computer.
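A quick sketch of those macros on a trivial function:

```julia
g(x) = 2x + 1

@code_lowered g(1)   # the desugared form of the source
@code_typed g(1)     # after type inference
@code_llvm g(1)      # LLVM IR (printed to stdout)
@code_native g(1)    # machine code for your CPU (printed to stdout)

@assert g(1) == 3
```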

Think about it a moment, which answer would you prefer?

Regardless of the answer, I have plans to learn it simply out of curiosity. I'm just not yet convinced that there are any practical benefits to using Julia over the other languages.

As you said yourself, you have no performance problems with R and Python, so I do not think you need to learn Julia.

However, in fields dealing with larger datasets (e.g. genetics, astrophysics, ...), Python and R are real bottlenecks in data processing pipelines, and are thus often replaced by programs in C/C++/Java/etc. Having a language with the expressiveness and dynamism of Julia on the one hand, and performance within the same order of magnitude as C++ on the other, is a huge plus.

Do they use Fortran in those fields?

In astrophysics yes (although I only follow the fields through friends), quite a lot of linear algebra/physics reference libraries are in Fortran, and new programs are still being developed in the more modern versions of it.

In Bioinformatics, not at all. Or at least in no direct way that I'm aware of. Sure, R packages indirectly call upon Fortran libraries (LAPACK et al.), but user facing programs are mostly, from my experience, written in C, C++, Java, Python & Perl.

Great, the answer is then: you should give it a try; to a certain extent, "why Julia is nice" is easier experienced than explained.

Why not Julia : https://juliacomputing.com/about-us.html

27 men and 1 woman, which is a bad parity score, even for a tech company.

That is a terrible argument on the usability of a tool.

Julia looks to me more like an academic exercise showing that you can come up with a "scripting-like" language which is also fast.

It looks like Julia's developers implemented nice-looking features which can cause problems:

   pi = 3

works, but

   println(pi)
   pi = 3

does not (using pi first triggers an implicit import of the Base constant, which can then no longer be reassigned).

Same for omitting multiplication "*":

   3(x + 1)
is ok, but

   x(x + 1)
not (because julia assumes a function call here).

These add lots of problems for larger programs and maintainability.

Whenever I see people claiming that Julia is fast, I feel like they are cheating.

It is only fast if you don't count the "compilation" times.

For example, every time you try to plot something via `using Plots`, it will take over 40 seconds to compile the Plots package.

Then, you have the large memory usage. In my experience, plotting in Julia requires approximately 14x more RAM than in python or Matlab.

Julia has an awful experience, really.

That's the trade off I guess, because of the JIT you have a slow initialization followed by very fast code.

That's not really good for a lot of workflows though (if you're running a smallish script over and over for example).

If you are running a script over and over (or using small modules to organize things) it will be very fast and easy to work with. The compilation is just for the first time a script is run, not every time.

It really isn't a barrier any more than, say, waiting for `library` commands in R.

I don't think I've ever seen an R library() take more than about a second, and usually the action is complete as my finger is starting to raise from the 'return' key. When I tried "using Plots" in julia, it took several tens of seconds the first time, and several seconds in subsequent sessions. So, slower than R, but not terribly so. I suspect the real advantage of Julia is that it lets the analyst stick to a single language, without (as in the R case) having to write time-consuming components in C, C++, or Fortran.

For me the appeal is 3-fold

1) I can contribute to widely used packages like DataFrames and HypothesisTests. I had never made a git commit before this and my only "real" programming was CS 101 in Java. The fact that I could get up and running so easily contributing is a testament to the language's ease of use

2) I think its tough to predict your computational needs at the start of a project. Sure everything can be done in `lme` in R at the outset, but if you need some new bootstrapping procedure that a reviewer wants you might be left connecting some high performance code to an existing, large, R-based codebase. That's tough. I think Julia makes that "refactoring" (if you can call it that) easy.

3) Hopefully Julia will open a lot of doors for me in the future in my research career. I will be able to write interesting simulation procedures that are otherwise too unwieldy for an economist working in R or Stata.

I tried an earlier version of Julia (0.7). Then there was an update and all packages started breaking. There was no backwards compatibility. We had to rewrite a lot of code and completely change DB layer. I hope it's more stable now

There was a big transition to 1.0 last year, for which 0.7 was a transitional release (1.0 but with deprecation warnings for old syntax). The package ecosystem took a few months to catch up, which may have been a confusing time to start.

The promise now is that things should be more stable.

I wish Julia would be more strict wrt type coercion of integer to float values.

I've once spent a day debugging the issue caused by the following line of code, where t1, t2 are floats and v is an array:

d = (t2 - t1) * length(v)

It should've been LinearAlgebra.norm() (returning a Float64) instead of length() (returning an Int). Had Julia been stricter, the code would have failed to run, saving me much time.
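The trap is easy to reproduce; both functions accept a vector but mean very different things:

```julia
using LinearAlgebra

v = [3.0, 4.0]
@assert length(v) == 2   # element count, an Int
@assert norm(v) ≈ 5.0    # Euclidean norm, a Float64
```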

I actually can't think of a single language that doesn't allow you to multiply an integer and a floating point value, yielding a floating point result.


In Rust, all type coercion must be explicit; you can't even add different integer types:

    5i64 + 5i32          // won't compile
    5i64 + 5i32 as i64   // will work

OCaml has distinct multiplication operators for ints and floats. Haskell's multiplication operator requires its arguments to be of the same numeric type, and returns a product of the same type. That's two, off the top of my head...

JavaScript? 23n + 5 causes a TypeError.

Go ?


So... The answer is fundamentally "statically typed", right?

I think the answer is fundamentally: giving the compiler a lot of information at compile time so it can optimize the generated code for the runtime, be it explicitly by type annotations or implicitly by type inference. If the compiler is sure a variable will be a 32-bit int, it can translate to the same instructions that any non-dynamic language could.

What Julia language does is give programmers powerful tools to encode types, behaviors and even code manipulation/generation tools that can be fully resolved in compile time, while still having a fully dynamic runtime.

Julia is not statically typed, at least not in the common sense of the term.

Yes, I got that. But the "good parts" according to this article derive from where it can work like a statically typed language.

Yeah, but it's not statically typed. Let's say you have a machine learning algorithm and you'd like to change it to have complex-valued weights instead of float-valued weights. How do you do that in a statically typed language?

Let's say you want to test the numerical performance of the fast Fourier transform using an alternative datatype to IEEE 754 (for example bfloat8). How do you do that?

Another application: I wanted to play with Galois fields for Reed-Solomon encoding. One key step is LU decomposition to recover missing data. As it turns out, the built-in matrix-solving algorithm in Julia is generic (though it does kick over to BLAS for floats), so my Galois field type was plug and play; I didn't have to write a custom solver.

> How do you do that in a statically typed language?

You can do that only if the interface for both types is identical or compatible. They should offer at least a common subset of the same operations. Hence, "generics" is the answer to your question.

I understand your feeling. But my impression is that this kind of "black magic" isn't always the best. With generics and static typing, if it compiles, you'll know it will work. With "type inference" you need to test it at runtime. It may compile with near-native performance, or it may not. And maybe a small change in your source code will turn near-native performance to dismal performance.

Not really. Values have types, unsurprisingly, but variables don't.

However, if the compiler can deduce the type at compile time, it will dispatch to the accordingly optimised function. (And type stable code helps doing that).
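Type stability is what makes that deduction possible; `@code_warntype` flags the failure case. A minimal sketch:

```julia
# Unstable: the return type depends on the runtime value of x,
# so the inferred type is Union{Int64, Float64}.
unstable(x) = x > 0 ? 1 : 1.0

# Stable: always Float64, so the compiler can emit specialized code.
stable(x) = x > 0 ? 1.0 : 0.0

@assert stable(1) === 1.0
@assert unstable(-1) === 1.0
```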
