Has anyone built something medium-large sized in Julia? Maybe I'm missing something. When I was trying to use modules to organize my code, the response I got was that modules are more trouble than they're worth, so just don't use them. So why do they even exist? That really put me off Julia, despite really liking it otherwise.
I would want to build things with it not just play in a REPL and notebooks.
I would also suggest aggressively unit testing all the parts. Numerical developers are not often in that habit, which is a shame.
For certain! I don't know how anyone can do any serious numerical development without a whole lot of test cases, including esoteric PhD-level numerical analysis stuff.
I remember working with the IMSL libraries back in the 1970's and being in awe of the huge set of numerical tests they had in the test suite which gave them, from my point of view, an unbeatable lead in development. These were tests that were often about highly esoteric aspects of numerical stability in floating-point algorithms and were written by extremely well educated mathematicians with tons of numerical experience.
For those who aren't greybeards: IMSL (the International Mathematical Subroutine [now Statistical] Library) was one of the first examples of re-usable software in application programming. It started as decks of punch cards that you would submit a request for, and the computing facility would punch a copy of the appropriate routines for you to include in your program deck.
I normally break up my code across modules, but also across libraries.
I'd say a few of my projects are at least medium sized. I discuss an example here, where much of the code is split across many separate modules: https://bayeswatch.org/2019/01/29/optimizing-a-gibbs-sampler...
I achieve roughly 1700x better performance in an example than a JAGS model. A C++ version does a bit better at 2000x when compiled with Clang.
Having all these dependencies checked out for development is not best practice, so I wouldn't recommend strictly following my example.
Specifically, these things make Julia less suitable for larger projects:
- Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.
- Dynamic typing.
- Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).
- Module system doesn't seem to be designed well.
- You can't use a function before it's defined.
That said, I think OOP is fine. And so is FP. They’re just tools, and I would prefer to model nouns in the former, and model verbs with the latter.
I think you're looking at this wrong. Part of the reason so many large-scale products use OO is that they use Java, and Java is awesome if you're working at such a large organization that your productivity is most easily measured by pointy-haired managers who count how many LOC you've written ;)
As for readability and conceptual simplicity, may I introduce you to something called the Factory pattern?
Edit: never mind, you're obviously aware...
I am skeptical of that history. Simula and Smalltalk were academic endeavors long before lots of devs were working on projects; Java became popular because it was a portable C++ that had garbage collection and no raw pointers.
It surely had lots of consulting opportunities, though.
Conversely, Python's objects feel anti-object, since you're passing self all the time (aka dot notation is weaksauce syntactic sugar)... You could easily model this behavior in Julia if you really wanted to, for example with a custom operator (not that you should):
function Base.:^(o::Object, fn::Function)
    (a...) -> fn(o, a...)
end
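To fill in the sketch (Object and getx are hypothetical, and the struct has to exist before the method above is defined):

struct Object
    x::Int
end
getx(o::Object) = o.x

o = Object(42)
(o ^ getx)()   # 42: `o ^ f` yields a closure with `o` bound as the first argument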
I'm talking about projects like Mapbox, Blender, Firefox, Android.
Ad hominem if I ever saw one.
What do you mean by that? It absolutely is compiled to e.g. x86_64 instructions by default, going through LLVM. Do you mean no standalone binaries? If so, that's actively being worked on by e.g. PackageCompiler.
I'm not sure what you mean by this. You obviously can't call a function before it is defined, but you can use a function in another function without any problems:
julia> foo(x) = bar(x)
foo (generic function with 1 method)
julia> bar(x) = x+2
bar (generic function with 1 method)
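Once both are defined, calling works as expected:

julia> foo(2)
4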
Julia doesn't rely on functional programming and there are also structs and operator overloading. What feature do you think is missing specifically?
> Dynamic typing.
Julia is not dynamically typed, though if you write a type unstable function, you will get back an Any type.
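For example, a minimal sketch of what that looks like: a function whose return type depends on a runtime value is type unstable, and @code_warntype flags the widened inferred type:

julia> unstable(x) = x > 0 ? 1 : 1.0   # Int or Float64, depending on the value
unstable (generic function with 1 method)

julia> @code_warntype unstable(1)      # inferred return: Union{Float64, Int64}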
What specifically is problematic?
Interfaces, access modifiers, calling methods via object.method() instead of method(object); and you can't define methods inside classes (structs) with an implicit "this" parameter.
> Module system doesn't seem to be designed well.
...and neither is OOP? For doing data things I greatly prefer functional programming for ease of understanding how the data flows and gets changed.
> - Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).
Neither does Python, neither does R. AOT compilation is being worked on, I expect it to get pretty great. C and FORTRAN do, sure, but most people are not writing C and FORTRAN in production.
> - Module system doesn't seem to be designed well.
It’s maybe not perfect, but it’s honestly miles better than Python.
> - You can't use a function before it's defined.
This feels like an incredibly unfair and unreasonable complaint, given that not many languages that aren't AOT compiled actually support this, and Python and R certainly do not.
OOP support doesn't imply not having support for functional programming. Swift, Scala, and Rust are good examples of languages that support both.
The parent seems to like Julia, but laments that it is missing several features which they have come to appreciate and depend on from Swift. Your response points out that Python and R don't have those features either, which is true, but somewhat beside the point of the parent comment.
Please. No. Languages that try to do everything are crap. If you want to do something OOP, why don't you grab a language built for it?
- aren't access modifiers the way to define scope? That is possible in Julia.
- getters and setters: wouldn't those be defined on a struct in Julia? (see the sketch after this list)
- inheritance: pretty sure this doesn’t exist in julia.
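On the first two points, a hedged sketch of getter/setter-style encapsulation via property overloading (Account is made up; the underscore is privacy by convention only):

mutable struct Account
    _balance::Float64
end

Base.getproperty(a::Account, s::Symbol) =
    s === :balance ? getfield(a, :_balance) : getfield(a, s)

function Base.setproperty!(a::Account, s::Symbol, v)
    if s === :balance
        v < 0 && error("balance cannot be negative")
        setfield!(a, :_balance, Float64(v))
    else
        setfield!(a, s, v)
    end
end

acct = Account(10.0)
acct.balance = 25.0   # goes through the "setter" check
acct.balance          # 25.0, via the "getter"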
I seriously barely understand; I'm reading from the docs here:
It makes sense with matrices.
Maybe if you don't think of it in terms of a different index basis and instead you think of it as indexing vs. offsets then it becomes easier to switch between the two?
It's basically bikeshedding.
"But I look at the shed all day..." "But I'm used to looking at reddish colors..." "Red is more correct than green because <yadda yadda>..."
Let's all move on to more important things :-)
I might have to learn Julia at some point but 1-based indexing was a fairly heavy indicator to me to hold off.
> it is pervasive and fairly important
> it's kind of a deep indication that something is amiss.
> it's an argument that was had and decided 20 years ago.
Evidently not, if we’re having this debate.
> If this is wrong then you can probably assume that lots of other things which are important...are wrong
> google why 0-based makes sense by Dijkstra
Is Dijkstra's opinion so infallible that it supersedes the design choices of several successful languages?
There's another argument, I can't remember if I originated it or read it but it's the one I find most convincing.
Consider that you have finite units of equal length 1 stretching away from you in a line. How would you specify to someone (using real-number measurements) to retrieve the first N=6, or describe the vector they occupy? You would say: take everything in [0.0, 6.0). If you use zero-based indexing, the numbering scheme specifying the interval is identical across both systems: [0.0, 6.0) => (python) [0:6] == 0<=x<6. On the real-number side the interval could be open or closed at 6.0; what matters is that the integer upper bound is half-open, since that's the only way to keep the elegant transformation with 0 and 6. Caveat that this works because you're numbering the real-number origin as 0, but try telling physicists that they're not to do that! O=(1,1,1) is an even more indefensible position.
So there's another mathematical argument that I either came up with or read elsewhere that feels deeply convincing to me. If there's 2 there's probably more, and here's the list of languages on wikipedia, you tell me which you'd rather work in.
Sorry for the irate tone but I really think that on reflection and fair consideration this is an obvious one.
Putting aside the implicit claim that something as subjective as aesthetics will be agreeable by 99% of the population...is 0-based indexing simpler? Why?
Your next example is...a little all over the place mathematically, if I’m being honest. I’m having trouble following it. But let me try...
So you're saying I have a 1-dimensional, ordered set S, whose elements are real numbers. It can't be an interval, because S needs to be discrete for this to be possible computationally or even theoretically (S is uncountable if it's continuous, and thus cannot be indexed). It needs to have an order to have a sense of position in the first place, so map S to the natural numbers in whatever way you choose.
Now choose an element s in S. You want me to pick the first n elements from s that are “next”, for whatever your order relation is. We’ll call that relation <. In that case I would respond by saying, “Let s be an element of S, and choose an n-tuple N of elements
(x_1, x_2, x_3, x_4, x_5, x_6) in S."
Haven’t I accomplished your exercise just fine without using 0 indexing? If I were programming this, I could index the n-tuple starting with x_0 or x_1. I don’t understand the issue - what are you trying to convey here? Personally, I’d program this by defining S to be my vector or list or whatever, k to be the index of s in S, then defining my n-tuple with:
N = []
for i in range(1, n + 1):
    N.append(S[k + i])
Now if you choose 0 indexing + (closed, open) for your slicing, then a (python) programmer would say to another programmer: take the slice [0:3] to get the 3 trucks. So the slice numbers correspond perfectly to the vector description of the space that the items occupy. That's why in this slicing system, taking L[a:b] gives you b-a items: it's matched perfectly with a real-number vector description of the space the items occupy on a real number line. All the other advantages, like being able to describe an empty interval ([0:0]), or describing taking the last element in a way that feels right ([-1:0]), follow from this physical-space homomorphism. That last one: in a circular (modular) space, walk left 1 and then come back right 1 and pick up the element you pass over. In 1-based: [0:1] = the last element? I don't know if any languages do this, but either you can't and that's sad, or you can and it is incredibly counterintuitive.
I thought about my post after I read it and came to this: given Dijkstra's argument (the only system which can describe an empty interval and an interval including the first element without "unnaturals") and (?)'s argument that it creates a nice homomorphism between real-number intervals and integer slices, it's inarguable to me that 0-based is more mathematically elegant.
Where I guess there's space to disagree is this, I believe (to almost a point of faith I guess) that mathematical/logical elegance is important, and moving away from it is normally a mistake which leads to more pain in the long run.
And I don't think they're idiots, I just think that they're wrong. Not many people have read Dijkstra and (?) and have deep faith in mathematical beauty, or have even thought about this much; that doesn't make them idiots.
That's the second time you've ridiculously misrepresented my statements: "anyone disagreeing with you is an idiot" "deep faith will lead you to the conclusion" (and not deep faith + thinking about the actual problem, several convincing arguments and some ability)
Fortran chose 1-based indexing for a very obvious reason... it was the best translation from the mathematics literature it was trying to implement, because matrix notation uses 1-based indexing! MATLAB, a language designed specifically as a high-level language for matrix mathematics, chose it for the same reason. R, a language for statistics, chose 1-based indexing because counting is one of the most fundamental operations in statistics, and 1-based indexing is the form used for counting.
Mathematicians obviously have no problem switching back and forth between 0-based and 1-based indexing for different domains, so it boggles my mind that computer scientists have turned it into such a huge holy war, and even more mind-boggling that 0-based zealots claim to have mathematics on their side.
The idea is that the interface to the data structure should ideally closely match the semantic meaning of the data (timestamps, frequencies, etc.) rather than memory addresses/pointers. That lets you program at a higher level of abstraction. More generally, both zero-based and one-based indexing have their natural uses, depending on what you're referring to. Eg: consider floors in a building. If you wish to index the horizontal surfaces separating the spaces, it's more natural to start with zero for the "ground level". If you wish to number the spaces between the surfaces, then it's more natural to start with one for the "first floor". Which one is more convenient depends on the semantics of the problem domain and what ideas you wish to communicate. Being overly attached to one perspective is unhelpful.
To quote myself from those links:
> The way I think of it is that different indexing schemes suit different problems. I want to think carefully about the problem domain and use the most convenient convention. For example, when my array stores a time series, I would like the index to correspond to timestamps (and still be performant, so long as my timestamps can be efficiently mapped to memory locations, which is true for affine transformations, for example). When another array stores the Fourier transform of that time series, I would like to access elements by the frequency they correspond to. That stops me from making annoying indexing errors (eg: off-by-1), because the data structure's interface maps nicely to the problem domain. I find that much easier than the cognitive cost that comes with trying to shoehorn a single convention on every situation. But it's difficult to appreciate that when thinking of language constructs divorced from specific problem domains, as one tends to do when typically studying data structures and/or algorithms.
> [Regarding offset array indexing...] Think of this feature as blurring the distinction between data (accessing an array) and computation (calling a function). The fact is that arrays as they are used (contiguous memory location collection) often carry more information than being just a dumb list of arbitrary data, and it's very convenient to expose that in their interface.
Now, for example, an array can more closely resemble a cached function computation, because the interface to both carry the same semantic meaning.
I hadn't seen it before. Basically, if I understand it right, it treats all vectors as lisp s-exps, which are themselves 0-indexed (the 0th element is 'list'), because you're doing a lot of symbolic manipulation. It's the best argument I've seen for 1-indexing. I'd say two criticisms. 1: I don't know how homoiconic Julia is; if it's not very, then the actual use of this seems very low relative to slicing. 2: It seems to mix up indexing a vector itself and indexing the expression that the vector is (like macro-quoted vs unquoted). I can see how it would smooth things in some way, but it feels like a fudge that might cause more pain somehow later.
It depends. In the USA, 0-indexed floor numbers are the norm. Elsewhere (eg, Australia), 1-indexed floor numbers are the norm.
I am curious what people write all day for this to be such a big issue. It's not like you need to re-invent N-dimensional array indexing every morning -- we have abstractions. If translating code then it could be from either convention... or from a mathematics book. (Where you may note that mathematicians are unconvinced by the CS arguments... admittedly not a group of people known for taste in all matters, but taste in notation they have thought about quite a bit.)
That depends on which circles you run in. If you're a programmer, yeah, it breaks with common convention. If you're a scientist or engineer, not so much. When writing a programming language for scientists and engineers you have a choice: do I stick with programmers' conventions or do I stick with engineers' conventions? I've had the pleasure of teaching both MATLAB and C++ to freshmen engineers. For them, they get off-by-one errors in C++, not in MATLAB.
Zero-based is slightly nicer for 2D indexing in 1D arrays (row i, column j, m columns per row):
- zero-indexed: i×m+j
- one-indexed: (i-1)×m+j
OTOH one-based is slightly better for trees stored in 1D arrays:
- zero-indexed: parent=(child-1)/2; children=2×parent+(1, 2).
- one-indexed: parent=child/2; children=2×parent+(0, 1).
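In Julia the 1-based version reads directly (a small sketch; names chosen to avoid clashing with Base.parent):

heapparent(i) = i ÷ 2
heapchildren(i) = (2i, 2i + 1)

heapchildren(1)   # (2, 3): the root's children
heapparent(3)     # 1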
My favourite fact about this stuff: in VB (or was it VBA?) when you asked for an array of size n, you actually got an array of size n+1. So people could do 0-based or 1-based indexing and be none the wiser...
In this context doing 2D indexing in 1D arrays is a code smell. It does come up in the case of writing libraries for general purpose language with poor support for numerics and linear algebra, but then you should be abstracting this away from your callers.
Iverson's J did a lot of things (chief among them removing the awesome symbols) to make it more appealing. A lot of folks understand the cause, but also realize it takes away one of the best reasons to use APL. So moving to 0-base wasn't necessarily because he thought it was better.
I think that these issues are generally acknowledged although I don't know if they will be addressed. Seems like major pain points for library development should have been addressed before 1.0.
It really is not:
* `from package import *` is more work than `import package`; `using` is shorter than `import`
* the official Python documentation starts with qualified imports, then introduces local bindings, and finally unqualified imports; open a few doc pages and imports are either fully qualified or explicitly bound. The first occurrence of using a non-prelude package in Julia's tutorial is `using`, and the modules documentation explains `using` first
The official Julia documentation very specifically steers the reader towards unqualified imports as the default & proper way to do things; Python's does the opposite.
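For reference, the difference in one place (Statistics is just a stdlib example):

import Statistics    # names stay qualified: Statistics.mean([1, 2, 3])
using Statistics     # exported names come into scope: mean([1, 2, 3])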
By one letter!
And in the right direction: for trying things out interactively, `using ThePackage` and then having everything available is great.
Then once you know what you want to do, in more careful code you can switch to `import` and qualify more things, and your future self with thank you.
But serving both of these needs seems like a valid design goal. I'm not sure the manual does a great job of explaining this right now; import vs. using also changes the rules around adding methods to functions, and it is perhaps more confusing than it has to be.
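A sketch of that difference (MyNum is hypothetical): `import` lets you add methods to a function by its bare name, while otherwise you must qualify it.

struct MyNum
    x::Float64
end

import Base: sqrt
sqrt(m::MyNum) = MyNum(sqrt(m.x))   # fine after the import

# without the import, you'd have to write:
# Base.sqrt(m::MyNum) = MyNum(sqrt(m.x))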
Yes, by one letter. Out of 6. "using" is 16% shorter than "import". And it uses better keyboard alternation (the first 4 letters of `import` are on the right side of a qwerty keyboard, the last 2 on the other, for using the only "i" and "n" are consecutive letters on the same half of the keyboard).
So "using" is 1. introduced first 2. the primarily documented import mechanism 3. significantly shorter and 4. significantly more comfortable to type.
If Julia's community doesn't want people to use it everywhere, they're doing a very, very good job of fucking with and victimising their users by way over-incentivising the use of `using` over `import`.
> And in the right direction: for trying things out interactively, `using ThePackage` and then having everything available is great.
And for writing maintainable code it's terrible, despite being by far the easiest and most convenient solution.
ChrisRackauckas complains that people use `using` over `import`, yet literally everything in the language and documentation pushes them towards it.
Putting the convenience of interactive sessions way, way over that of proper programs does not seem like "the right direction" to me, especially not when people then jump on their high horses and chide users for doing what the language unambiguously pushes them towards.
> But serving both of these needs seems like a valid design goal.
Python does that just fine: it provides convenience for interactive use without pushing users towards the least maintainable and desirable option.
Seriously though, it's hard to write one manual which gives everyone the perfect on-ramp for their needs. I actually think it's fair to aim it low, make it easy for matlab refugees to get started, interactively. People who know how namespacing works in several other languages (and intend to write big projects) are the ones well-equipped to know how to dig deeper. Nobody has been victimised!
Of course, that was actually a pretty big consideration during the "vcs wars", both the length and the alternation of VCS commands.
> I actually think it's fair to aim it low, make it easy for matlab refugees to get started, interactively.
Then don't whine that people use that, and police your community.
> Nobody has been victimised!
Well how's that for dishonesty. ChrisRackauckas pretty much goes "this article is invalid because you're doing exactly what the languages suggests and incentivises" half a dozen comments above.
As far as I'm aware there is no budget for a police force. There is, however, a double-your-money-back guarantee on all advice from blog posts!
I do think the manual could be clearer about using/import. Perhaps this should be emphasised under workflow, and packaging differences from Python included in the list. The experts have forgotten what the pain-points were; contributions from those who have not are thus welcomed.
Counterpoint: I am going to attempt an autocomplete after two characters by pressing tab. Typing 'us' leaves my left hand, the tab hand, as the last used hand before the tab. This prevents me from buffering the tab by moving my hand into position while typing the first two characters. 'im' is typed very easily with my right hand and while I'm typing that, my left hand is free to move into position over the tab key.
I'll need to take another look at the code I was working on. I was planning on doing so after a break anyway.
When you start building out a project, it's easy to keep track of things and debug when multiple dispatch starts failing (i.e. the `Any` type starts spreading everywhere and Julia slows to Python-like speeds).
In medium-to-large projects, it becomes extremely cumbersome to manage this. It's doable, but adds a layer of complexity management to projects that simply doesn't exist in strictly typed or pure scripting languages.
Of course, you can just decide to explicitly type everything - but the issue here again is the lack of enforcement.
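A small illustration of how `Any` creeps in (a sketch):

xs = []           # untyped literal: eltype is Any
push!(xs, 1.0)
eltype(xs)        # Any: downstream code falls back to dynamic dispatch

ys = Float64[]    # typed literal: concrete eltype
push!(ys, 1.0)
eltype(ys)        # Float64: the compiler can emit specialized code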
In a nutshell: Julia is great when you're a grad student working mostly by yourself on small scale projects! But not so great in prod.
And there's really no problem with that; that's who the language was designed for!
Some people would disagree with that
I would also argue that the large open source Julia packages are also great examples of Julia "in prod".
Just highlighting what I think is a significant con in a language with many pros!
I had to go so far as to read the project Github page to find them.
Whose opinion do you want, the Pope? Would a divine sanction be enough for you?
There's no reason Juno or any other IDE couldn't display the output of a static analyser inline, or allow you to command-click a function call to go to the site of the exact function being called, or show a list of alternatives, and so on.
Give Julia and its IDEs a few years to improve and you might find it much better suited to large projects. I wouldn't consider Java any good for large projects either, if it didn't have the excellent IDE support it now enjoys.
That certainly depends on your use case. For instance, the launch time is ridiculously slow, so that you cannot realistically run a small matrix computation in julia from within a shell loop. It is better to use octave for that, where the startup time is almost negligible (just a bit slower than starting a subshell).
The above may only be an issue with BigFloat, to be fair, since the Float64 version compiled in an instant (I never measured it, and the time never bothered me).
So Julia has solved a lot of problems for me, and I see great potential for it in the future.
$ time julia -e 'print(1)'
$ time python -c 'print(1)'
That said, instantiating Julia at every step of a bash loop... I think it requires a JVM mindset: warm up once and iterate inside rather than outside.
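In Julia terms, that means one script that loops over the files itself (a sketch; `process` and the directory are made up):

process(s) = uppercase(s)   # stand-in for the real per-file work

for name in readdir("inputs")
    path = joinpath("inputs", name)
    write(path, process(read(path, String)))
end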
I do not have this mindset, then. I prefer tools that are mindset-oblivious. They are really useful!
For example, imagine I have a collection of a few hundred images with their projection matrices (in text files). I want to crop them and apply a simple imagemagick operation (which is not available from inside Julia). The elementary solution is to run a shell loop to apply the crop and call julia to perform a simple adaptation of each projection matrix. This is impossible today: most of the running time of such a loop is spent on Julia initialization. Half a second to do nothing is simply unacceptable in a serious scripting language.
Yet you picked a specialized tool "imagemagick" (made for working with images), and want to drive it by another specialized tool (the shell, made for file and process management).
Since you want "mindset oblivious" tools why not write everything from scratch, in C?
Because imagemagick, julia and other specialized programs already provide the advanced algorithms that I need. Unfortunately, julia is not a good team player.
What it excels at is rapidly solving the same problem many times within one REPL/interpreter (since after the first time the code will be compiled), or for longer computations where the overhead of the initial compilation does not matter.
It would be great if it could also get fast at the use case you are interested in, but this isn’t currently what it excels at.
Or, you know, Julia is not specialized for your use case.
The slow startup time may be a minor inconvenience, I agree. But nonetheless it seems to be a case of sloppy engineering, as other scripting languages do not have this egregious problem. It sets a bad tone, and makes me wonder if there are other hidden monsters inside the interpreter, that may be solved by "you are holding it wrong" like this one.
Out of curiosity I timed ocaml
$ time echo 'let x = "1" in print_endline(x)' | ocaml -stdin -
$ time echo 'let l = [1;2;3;4] in let m = List.map (fun x -> x*x) l in print_int(List.length(m))' | ocaml -stdin -
PS: this hints at the init time being constant recompilation of its core https://www.reddit.com/r/Julia/comments/4c09m1/julia_045_sta... and they were thinking (2 years ago) of caching...
It's just because I find the scientific Python stack a bit absurd.
Julia is not a scripting language, it's a language for mathematical analysis.
Also, going back to your example, the solution would be to use ImageMagick.jl from Julia, extend the library (4 extra lines with ccall) if the operation is not available, and send a pull request to help the community.
Ok, this clarifies the matter a lot. So julia is not intended to be a general-purpose programming language. Are you involved in julia development?
(Besides, I strongly dislike the design tradeoff of not being a good unix citizen.)
The only trade-off here is the restricted resources the Julia devs can afford. Focusing on the scientific computation niche, the Julia compiler was built to compile code just ahead of time, since this way it can create code as fast as compiled languages while still being dynamic like any interpreted language (such as Matlab and Python). But there is nothing in the language that prevents it from being fully interpreted (and therefore having minimal compile time overhead, but generating poorly optimized code for long running processes), or alternatively pre-compiling and caching to have both (which is being worked on: https://github.com/JuliaLang/PackageCompiler.jl )
Also the language itself supports extending code (multiple dispatch really helps it to stay clean).
julia> @time print(1)
1 0.000031 seconds (7 allocations: 272 bytes)
$ time ./julia -e 'print(1)'
It's a bit annoying it's true, a well featured language with dynamic/repl appeal that is not in the ms launch time. But alas..
16 seconds. Sixteen seconds. For real. This is not usable.
> This output is saying that a floating point multiplication operation is performed and the answer is returned.
But "this output" is:
%2 = mul i64 %1, %0
ret i64 %2
> Here we get an error. In order to guarantee to the compiler that ^ will give an Int64 back, it has to throw an error. If you do this in MATLAB, Python, or R, it will not throw an error.
while the quoted input and output is
In : 2^-5
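For comparison, a sketch of what Julia itself does with the two spellings (exact error text elided):

julia> 2^-5      # Int base, negative literal power: throws a DomainError

julia> 2.0^-5    # Float64 base: no error
0.03125

julia> 1 / 2^5   # or divide instead
0.03125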
Another confounding fact is that Julia optimizes on small unions, so the generated code isn't that bad anymore. Now it just creates a branch. It used to have to do all of inference and dynamic dispatching on the fly, which is what it has to do in fully uninferrable code of course, but a union of two things just does a type check and splits. So now... that example is not as bad as it used to be...
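A sketch of what that union splitting buys you:

f(flag::Bool) = flag ? 1 : 2.0   # inferred return: Union{Int64, Float64}
g(flag) = f(flag) + 1            # compiles to a type test plus two fast paths,
                                 # not a full dynamic dispatch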
So there are lots of older articles discussing drawbacks to Julia, or ways to do things in Julia, that simply aren’t relevant anymore. Unfortunately they still often are the first links that come up when searching google for Julia questions.
> In : 2^-5
> Out: 0.03125
> (ie, no error, but the correct (floating point) result.)
C/C++ makes the same choices, by and large.
Based on the confusion there, it definitely seems that it could be made clearer in the docs that if one wants floating point results one needs to use 10.0^17 instead of 10^17.
Plus I got annoyed with extra weird syntax, but I can't remember the specifics.
Basically, Julia required more lines and characters and wasn't as close to the math.
Aside: I think Matlab / Octave is a lot like SQL and Tcl: lots of haters, unfashionable, but usually the most elegant solution.
You can even use nice Unicode notation such as A ⊗ B ⊗ C.
So, `kron` is actually provided by the standard library; are you saying that this Kronecker product didn't do the job? (Search in this file: https://github.com/JuliaLang/julia/blob/master/stdlib/Linear...)
I wasn't comparing Julia to Python or C++, but to Matlab/Octave.
IMO, for most people doing array-based algos, Matlab hits Iverson's "notation as a tool of thought" ideal in a way that APL didn't quite manage.
You could translate most of those files line-for-line, with quite a few lines identical or trivially changed (bracket shape, or max -> maximum). This is often a useful thing to do, get a transliterated version running, and then re-write bits of it more idiomatically, while checking that the output is identical.
There are so many Julia packages that do similar stuff that I imagine it can't be all that hard for people who have become fluent with the macro system.
using LinearAlgebra   # kron lives in the LinearAlgebra stdlib
const ⊗ = kron
A = rand(5,5)
B = rand(3,3)
A ⊗ B
Tada! I'm not sure how MATLAB/Octave's kron(A,B) looks more like math than A⊗B, but everyone can have their own opinion.
This is the second time I've learned about time-saving functionality related to infix that I didn't know about. (The previous was the symbol for integer division, which had been left out of the documentation until version 0.7). In this second case, interpretation of Unicode symbols as infix operators should be better surfaced in Julia documentation and tutorials - it's a really useful feature.
Also, I have found multiple-dispatch methods to be harder to locate than regular OO methods (for IDEs, but also grep).
Of course 1 based indexing makes the move harder, but it's worth the effort anyways.
I'm trying to embrace 1-line functions as much as I can, and Julia helps me with its special notation for short function definitions.
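That short form reads almost like the math:

f(x) = 2x^2 + 3x + 1             # one-line definition, no function/end ceremony
hypot2(x, y) = sqrt(x^2 + y^2)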
Developing anything multithreaded was a major pain.
For example, compare linear algebra syntax in Julia with that of R and Python.
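For instance, a rough side-by-side (the R and Python equivalents are in the comments):

A = rand(3, 3); B = rand(3, 3); b = rand(3)

x = A \ b    # solve A*x = b            R: solve(A, b)   Python: np.linalg.solve(A, b)
C = A * B'   # product with transpose   R: A %*% t(B)    Python: A @ B.T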
Plus, the flexibility of multiple dispatch applying equally well to any of your own types really makes it feel like you can do anything in Julia.
Metaprogramming, writing code that writes code, can take a while to get used to. But is extremely powerful.
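A tiny taste of the flavor (a sketch):

macro logged(ex)   # a macro receives the expression itself, unevaluated
    quote
        println("evaluating: ", $(string(ex)))
        $(esc(ex))
    end
end

@logged 1 + 2   # prints "evaluating: 1 + 2", then returns 3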
A final point I will make is the composability of Julia code. If you want to fit a Bayesian model or solve a differential equation in Julia, you can mix your own types and code and any generically written library code at will, and it will often just work.
In R, you're not likely to find that; your Stan code can't call R functions you've written or those from your favorite libraries.
Compared to Python, I'd say Julia has the reputation of being generally faster for things you can't use numpy with, and you can more easily scale to multiple CPUs or machines.
They are not used that much, though; too much magic and implicit processing are not promoted in the community.
Whether Julia is that for you right now probably depends a bit on your needs/use cases.
Until Julia gains some form of AOT compiler for static binaries, it's not quite an alternative to Python IMNHO: you can't expect to run Julia scripts everywhere you can run Python. Though this is similar to R. And there are of course much bigger ecosystems of Python and R code/libraries than currently for Julia.
Personally I think they've done a great job on the design/syntax of Julia - it's a fun language - and that might be the best reason to play with it?
So... R is an unreal language then?
R is wonderful (and has multiple dispatch, thank you very much), but it's definitely not as familiar to developer types as Python is. R has lots of weird quirks (a vs a[] vs a[1,]) and uses function based generic programming rather than object based generic programming.
Because of the re-implementation of S, it has lots of odd quirks and multiple ways to do things, which Python tends to avoid. Python is definitely more consistent from a programming point of view, but R is a more cohesive system from an analytics point of view.
But yeah, lots of people are hateRs (unfortunately).
In my humble opinion, applying S's syntactic sugar on top of Scheme-forged core did more bad than good to R. Most of R's quirks stem from the authors' desire to keep it as compatible with S as possible.
As for the two approaches to solving problems (TIMTOWTDI vs Python's "one true way") I think that's a matter of personal preference; I, personally, appreciate the freedom that R gives me.
Also, it might be anecdotal, but I believe that, to some extent, it promotes thinking "outside the box", especially when you work in a team and review others' code on a daily basis.
On the other hand, if it was not for that compatibility with S we would not be having this conversation: R would be just a footnote in the history of statistical computing like Lisp-Stat.
Perhaps having a fixed (and commonly agreed on) set of design patterns - like in Java - makes development easier and smoother, but most of the projects I've worked on have proven successful, so maybe it's not that bad either... :-)
Was S "really, really old" when it was re-implemented in R? Gentleman and Ihaka first announced R in 1993 (26 years ago!).
The development of S started in 1976 (i.e. 17 years before R was first released). R was an implementation of S3, "the New S language", which was a major rewrite done in 1988 (it was just five years old when R was announced!).
But R's functions for non-statistical tasks often break the language's idioms. Those dealing with files and connections are especially ugly.
I wouldn't call it insane or unreal, but it's definitely not intended for general scripting.
As for (a bunch of) functions in the basic library (which, essentially, is a set of packages that you can discard or simply not use), I dare say it has little to do with the language itself. I don't like Python's regular expressions library, for example, but it's only a set of functions (or methods) that you can write on your own!
The same goes for R's data-wrangling routines; there's this tidyverse, and you can write good, reliable, production-grade code without using a single function from the base library. Heck, you can even write your own DSL (think: Grammar of Graphics) if you think it could improve your code!
There are several quirks built so deeply into the core of the language that you cannot overcome them ("why can't I overload `+` for character strings??"), but they have nothing to do with the rather poor and unintuitive basic library.
A different take on the subject (https://r4stats.com/2017/03/23/the-tidyverse-curse/):
> On the other hand side – I have met a guy using R in production. And he told me that he needs code that stayed the same for 5+ years. That is why he does not use dplyr or any tidyverse, it is still changing too much.
I'm not sure if they just suck at Windows, but I was extremely disappointed... :/
If you had run your code again within the same REPL you’d see it runs substantially faster the second time. Likewise, if you had tried to process a large enough matrix or CSV that compilation time should be a minimal subset of the total computing time, and probably not noticeable.
It isn’t a language limitation, just a tooling limitation. The startup times have been getting better as Julia develops, but there is still noticeable overhead the first time code gets run in a new REPL...
R makes it very easy to see the underlying R code (you just type the function name), until you get to a ".Call" or ".Primitive": from that point, it is effectively a black box.
But as most of Julia is written in Julia, you can easily inspect and understand how functions work, all the way down. Moreover, by using the @code_* macros, you can also inspect the various stages by which the code is transformed from high level Julia code down to the actual machine code which is running on your computer.
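The ladder looks like this:

@code_lowered 1.0 * 2.0   # desugared Julia IR
@code_typed   1.0 * 2.0   # after type inference
@code_llvm    1.0 * 2.0   # LLVM IR
@code_native  1.0 * 2.0   # assembly for the host CPU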
However, in fields dealing with larger datasets (e.g. genetics, astrophysics, ...), Python and R are real bottlenecks in data processing pipelines, and are thus often replaced by programs in C/C++/Java/etc. Having a language with the expressiveness and dynamism of Julia on the one hand, and performance within the same order of magnitude as C++ on the other hand, is a huge plus.
In Bioinformatics, not at all. Or at least in no direct way that I'm aware of. Sure, R packages indirectly call upon Fortran libraries (LAPACK et al.), but user facing programs are mostly, from my experience, written in C, C++, Java, Python & Perl.
27 men and 1 woman, which is a bad parity score, even for a tech company.
It looks like Julia's developers implemented nice-looking features which can cause problems:
pi = 3   # silently shadows the built-in constant π
Same for omitting the multiplication "*":
3(x + 1)   # a numeric literal coefficient: this is multiplication
x(x + 1)   # looks the same, but this is a function call!
These add lots of problems for larger programs and maintainability.
It is only fast if you don't count the "compilation" times.
For example, every time you load `using Plots` in a fresh session and try to plot something, it will take over 40 seconds to compile the Plots package.
Then, you have the large memory usage. In my experience, plotting in Julia requires approximately 14x more RAM than in Python or Matlab.
Julia has an awful experience, really.
That's not really good for a lot of workflows though (if you're running a smallish script over and over for example).
It really isn't a barrier any more than, say, waiting for `library` commands in R.
1) I can contribute to widely used packages like DataFrames and HypothesisTests. I had never made a git commit before this and my only "real" programming was CS 101 in Java. The fact that I could get up and running so easily contributing is a testament to the language's ease of use
2) I think its tough to predict your computational needs at the start of a project. Sure everything can be done in `lme` in R at the outset, but if you need some new bootstrapping procedure that a reviewer wants you might be left connecting some high performance code to an existing, large, R-based codebase. That's tough. I think Julia makes that "refactoring" (if you can call it that) easy.
3) Hopefully Julia will open a lot of doors for me in the future in my research career. I will be able to write interesting simulation procedures that are otherwise too unwieldy for a comparable R or Stata economist.
The promise now is that things should be more stable.
I once spent a day debugging an issue caused by the following line of code, where t1, t2 are floats and v is an array:
d = (t2 - t1) * length(v)   # bug: length(v) is the element count, not the norm
It should have been LinearAlgebra.norm(v)::Float64 instead of length(v)::Int. Had Julia been stricter, the code would have failed to run, saving me much time.
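The footgun in miniature:

using LinearAlgebra
v = [3.0, 4.0]
length(v)   # 2   (Int: the number of elements)
norm(v)     # 5.0 (Float64: the Euclidean norm)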
In Rust, all type coercion must be explicit; you can't even add different integer types:
5i64 + 5i32          // won't compile: mismatched integer types
5i64 + 5i32 as i64   // works: explicit coercion
What Julia language does is give programmers powerful tools to encode types, behaviors and even code manipulation/generation tools that can be fully resolved in compile time, while still having a fully dynamic runtime.
Let's say you want to test the numerical performance of the fast Fourier transform using an alternative datatype to IEEE 754 Float64 (for example bfloat16). How do you do that?
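One answer: write (or reuse) a generic implementation and hand it your type. A minimal sketch, using a textbook radix-2 FFT rather than any library routine (BFloat16s.jl is one package providing such a type, assuming it defines the needed math functions; BigFloat works out of the box):

function fft_radix2(x::Vector{Complex{T}}) where {T}
    n = length(x)   # assumed to be a power of 2
    n == 1 && return x
    even = fft_radix2(x[1:2:end])
    odd  = fft_radix2(x[2:2:end])
    tw   = [exp(-2im * T(pi) * k / n) for k in 0:(n ÷ 2 - 1)]   # twiddle factors
    return vcat(even .+ tw .* odd, even .- tw .* odd)
end

x    = rand(Complex{Float64}, 8)
xbig = Complex{BigFloat}.(x)                      # the same data, higher precision
maximum(abs, fft_radix2(xbig) .- fft_radix2(x))   # measures Float64's numerical error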
Another application: I wanted to play with Galois fields for Reed-Solomon encoding. One key step is LU decomposition to recover missing data. Well, as it turns out, the built-in matrix-solving algorithm in Julia is generic (though it does kick over to BLAS for floats), so my Galois field type was plug and play; I didn't have to write a custom solver.
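The same plug-and-play genericity shows up with a simpler exotic type: exact rationals flow straight through the built-in solver (not a Galois field, but the identical mechanism):

A = Rational{BigInt}[2 1; 1 3]
b = Rational{BigInt}[3, 5]
A \ b   # exact Rational answers, no floating-point roundoff anywhere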
You can do that only if the interface for both types is identical or compatible. They should offer at least a common subset of the same operations. Hence, "generics" is the answer to your question.
I understand your feeling. But my impression is that this kind of "black magic" isn't always the best. With generics and static typing, if it compiles, you know it will work. With type inference, you need to test it at runtime: it may compile to near-native performance, or it may not, and a small change in your source code can turn near-native performance into dismal performance.
However, if the compiler can deduce the type at compile time, it will dispatch to the correspondingly optimised function. (And type-stable code helps it do that.)