Basically, I can summarise the argument as: for complex, real-world projects involving multidimensional arrays, having to remember which dimension means which thing, and track these through various tensor transformations, is the 21st-century equivalent of how, prior to high-level programming languages, we had to manually track which registers we were using to store which quantities, and how this changed in different parts of a program.
Those were the real dark ages of programming, and we're in that same place now with a lot of complex deep learning models.
So my question is: has this been discussed at all in the Julia community? Is this on anyone's radar? Since one of the core competencies of Julia is array processing and elegant metaprogramming, I would hope that it can gracefully absorb the idea of named array axes in a way that feels natural and complete, and could start showing up in ML frameworks like Zygote and Flux down the line...
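For what it's worth, the building blocks already exist in the language: a tuple of symbols can live in a type parameter. Here's a minimal sketch of the idea (NamedAxes and dimnum are made-up names for illustration, not a real package):

```julia
struct NamedAxes{names, T, N, A<:AbstractArray{T, N}} <: AbstractArray{T, N}
    data::A
end

# Store the axis names in the type itself, so they cost nothing at runtime.
NamedAxes(data::AbstractArray, names::NTuple{N, Symbol}) where {N} =
    NamedAxes{names, eltype(data), ndims(data), typeof(data)}(data)

Base.size(x::NamedAxes) = size(x.data)
Base.getindex(x::NamedAxes, i::Int...) = x.data[i...]

# Recover a dimension's position from its name.
dimnum(::NamedAxes{names}, name::Symbol) where {names} = findfirst(==(name), names)

img = NamedAxes(rand(28, 28, 3), (:height, :width, :channel))
sum(img.data; dims = dimnum(img, :channel))  # no need to remember that channel is dim 3
```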
While these don't pass dimension labels along in the type, there are also a number of Einsum-like packages which should work on any storage order, and could be taught to interface with such types:
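For instance, here's a quick sketch with Einsum.jl's @einsum macro (B and C are just placeholder arrays):

```julia
using Einsum

B = rand(3, 4)
C = rand(4, 5)

# `:=` allocates the output; the repeated index k is summed over.
@einsum A[i, j] := B[i, k] * C[k, j]

A ≈ B * C  # true: matrix multiplication written in index notation
```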
In Julia, as long as a custom type implements the relevant Base methods (through overloading), it will work with everything at no performance cost (since most base Julia types are written in Julia anyway), including every ML library. For example, Unitful.jl keeps track of physical dimensions and integrates perfectly with the DifferentialEquations.jl library:
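Here's a minimal sketch of what that looks like with Unitful's u"..." string macro (the values are arbitrary):

```julia
using Unitful

v = 9.81u"m/s^2" * 2.0u"s"  # 19.62 m s^-1: units propagate through arithmetic
d = v * 3.0u"s"             # 58.86 m
d + 5.0u"m"                 # fine: dimensions match
# v + d                     # would throw a Unitful.DimensionError
```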
My understanding (please correct me if I'm wrong somewhere) is that in Julia, for releases which are supposed to be non-breaking, such as 1.2, the Julia devs run the test suites of every registered Julia package on the stable version and then on their release candidate, go through the regressions, and identify any unforeseen breakages. They then either fix those in the Julia release candidate or, if the package relies on an internal implementation detail which is not guaranteed to be stable, they will often try to help the package owners fix the regression. That's seriously impressive, and while I'm sure it happens in other languages, I hadn't heard of a programme like that and thought it was worth mentioning here as the happy reason why this release was delayed.
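(If I have the details right, the ecosystem-wide run is automated by a tool called PkgEval; per package, the underlying operation is just that package's own test suite, the same thing anyone can run locally. A minimal sketch:)

```julia
using Pkg

# What the ecosystem-wide run does at registry scale, for one package, locally:
Pkg.add("Example")   # "Example" is the canonical demo package in the General registry
Pkg.test("Example")  # runs that package's own test/runtests.jl
```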
The Rust project has a tool called `crater` that can build all crates on crates.io and run their tests, a process that takes a couple of days. It also has a GitHub bot that allows running crater on PRs on demand.
This is super super useful and something that all programming languages should have. It allows you to:
* make sure that a new Rust release doesn't break code that compiled with old releases.
* assess the impact of a PR implementing a breaking change (e.g. due to a bug fix).
* obtain quantitative data about language usage for language evolution.
This last point is probably the most impactful. When I was working on C++, a question like "How many people are doing X?" would often come up.
What did we do? Stakeholders in C++ meetings would travel back home, grep their codebases for the X pattern, and report back 6 months later. And that's it. That's all the evidence we could hope to get. And it was often super biased due to the different coding guidelines and styles at the different companies.
In Rust, you open a PR that makes doing X trigger a compilation error, schedule a crater run, and get a list of all places in all crates on crates.io where the error is triggered. This gives you quantitative information about how often X is done in the wild, and you can then go and analyze the code doing X to obtain even more information.
It's hard to convey how important this is. You'll never have to read discussions about "I've never seen X in the wild" vs "I see X in the wild all the time". Those discussions make no sense when you have a tool that can actually tell you the answer. Every programming language being evolved today should have something like this.
Here's an example of a CPAN module (Moose) which is tested against every release of Perl (going back to 5.8.3 from 2004) on multiple OSes/platforms - http://matrix.cpantesters.org/?dist=Moose
A nice language for scientific computing after Fortran (some might hate it, but I liked it during my CS studies).
I hope Julia succeeds like Python. I have been using Python for quite a while for scientific computing, machine learning, and as a general-purpose language for web services development. I wanted to use Julia for a machine learning problem of predicting prices (time series forecasts), but the familiarity and tool support of Python always held me back, and with Facebook's Prophet (fbprophet) library I could do it easily.
I will definitely give Julia a try in coming months. I like this competition and innovation in scientific computing.
Isn't a lot of the frustration regarding Fortran aimed at libraries written by PhDs who don't know a thing about writing maintainable code (because that's not their specialty - not their fault, really), and who on top of that are so intelligent and specialized in their own field that you end up with even more impenetrable code? All while you're dependent on said library, because nobody else understands the problem domain in question to the same degree as those PhDs did (emphasis on did, to make things worse) when they wrote the library.
That sounds like a frustration layer cake to me, but it's not exactly the language's fault.
His initial code's interface was "?", and you entered integer numbers to execute various functions; there were multiple levels here, so you had to memorise the numbers and where you were in the program :-)
I added prompts (with mixed-case Hollerith statements) that showed the options at each level and what they did.
This was to reduce the risk of a botched run, which when we scaled up to 1:1 in an 8-1 m dia tank could cost £20k - the cost of a small flat at the time.
I hope to see it succeed in the server-side web development area.
At least in the cases I've seen, the first authors proclaimed it as done ("just need to port it for production"), only to find out about pesky things like massive amounts of bugs (no testing in the originals) or unusable performance (it turns out what you can do on a 64-core cluster overnight is not suitable for doing in real time).
The idea for exploratory programs is just to see if the core idea is actually viable in the first place, before spending time on things like making it robust/fast.
Of course, if you could make it robust/fast for free or cheap, then it makes sense to call it an anti-pattern (you made it fragile for no good reason), but I'm not clear on how to obtain robustness/speed without quite a bit of work.
I've heard all kinds of arguments against it, but they mostly boil down to "data scientists/analysts/whatever can't be arsed to bring it to production" (or "don't have the right skill set"), and vice versa for software developers.
I've seen much more success with a developer and less-technical person working closely in a team and bringing a project from prototype to production together.
Also, the quant often isn’t good enough at computer science to actually develop a production system himself. And even if he was good enough, it isn’t his comparative advantage so he probably should be focusing on research.
After all, the kind of performance-sensitive libraries that get written in C/C++/etc. for Matlab/Python are, in Julia, written in Julia itself.
It really depends what you mean by production.
Personally, what I'm really looking for when I ask if something is ready for production is, first and foremost: is it buggy? Second: has it stabilized, or are breaking changes still common? Third: is package/dependency management something that ops can work with without too much hassle, preferably without just relying on Docker to keep it manageable, like a parent closing the door to their teenager's pigsty of a room.
Also, you don't have to be faster than the bear. Julia's chief competitor, Python, is often used in production even though it fails the 3rd test and only semi-succeeds at the first.
I certainly used to write real-time code in FORTRAN IV on DEC RT-11 - thank goodness for the DECUS tapes of example code.
The difference between exploratory code and production code, in my opinion, is that production code must be correctly structured (complying with an architecture defined by the developers, properly separated into functions, readable), fully documented, unit tested, integration tested, and properly tested in staging, with logging, metrics, health checks, and extensive error/exception handling.
So in general, during the exploration phase you can use the already-tested production code/libraries and extend them without much care for the above, until you're satisfied with the approach/results (usually in a fast REPL/Jupyter/Revise.jl cycle). Then you start refactoring and implementing all of the above to prepare for deployment (and during testing you'll probably find errors in your exploratory code, which may lead to another exploration phase). Eventually you end up with solid production-level code.
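For the "unit tested" part, Julia's Test stdlib keeps the barrier low. A minimal sketch (clip is just a stand-in for whatever came out of the exploratory phase):

```julia
using Test

# Hypothetical function under test from the exploratory phase.
clip(x, lo, hi) = min(max(x, lo), hi)

@testset "clip" begin
    @test clip(5, 0, 10) == 5    # inside the range: unchanged
    @test clip(-1, 0, 10) == 0   # below: clamped to lo
    @test clip(99, 0, 10) == 10  # above: clamped to hi
end
```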
Also, it’s pretty clear from the existing Julia libraries and homepage that serving requests isn’t what’s important to them.
Can someone point me to excellent resources/tutorials for learning Julia?
I mean, I can search the Internet, but I'm not sure I'll stumble upon really good tutorials (I don't want to go through Medium- or TutorialsPoint-style tutorials).
Excluding the university course listings, I think most of the resources there should be up-to-date (post Julia 1.0).
Once you have a basic feel for the language, the forum (discourse.julialang.org) and Slack are both very welcoming.
Also, there's a lot of stuff that Julia has which Crystal doesn't (and vice versa, since they focus on very different domains).
The Crystal guys should have put their effort into Nim or something like it, imo.