I’ve only been doing analysis work so far, but I’ve got a few machine learning/NLP projects coming up that I’m super excited to use Julia and Flux for!
Package compilation was painless, code compilation isn't really noticeable once things are running, and the benefits of native speed are evident. I recently learnt that the IO functions are async as well (powered by libuv!), and the parallelism is easily an order of magnitude nicer to work with than Python's (and it's only going to get better).
* Working in a rapid application development (RAD) fashion by operating on vectors using a language like Julia/MATLAB/Octave/Scilab which allows focusing on abstractions instead of implementation details and other distractions.
* Running code optimized automagically on GPU/TPU/etc.
* Sharing work over the web in a standard fashion (Jupyter Notebook on colab.research.google.com)
It's not clear to me where in this process the code is actually run on TPU (maybe someone has a tutorial?) but that doesn't really matter. The specific machine learning algorithm used is also not really that important.
The important part is that this enables amateurs to tinker with machine learning, see results quickly, and share their work, which means we'll finally see an accelerated evolution of machine learning.
Any of these blockers alone hindered the evolution of AI for decades, but seeing all three knocked down in one fell swoop is pretty astonishing, at least for me. I favorited it as a watershed moment in the history of AI! Congrats to him.
You know what doesn't throw errors whenever I just try to look at it? Products like Jupyter notebooks, i.e. stuff not made by Google. I believe Colab is using dark patterns to discourage blocking 3rd-party cookies by subtly breaking what should render as a simple, non-interactive webpage for passive viewers.
To me, at least, Google and their dark patterns are considered harmful.
* raw.githubusercontent.com — it's loading the notebook directly from GitHub, so yeah, this one is fairly essential. Just need one XHR here.
* googleusercontent.com — no cookies, but it does load scripts, css, a frame and an XHR here.
That's it. There are some other domains it hits (like fonts.google.com and gstatic.com), but they're not needed to view the file.
A Markov chain generator wouldn't:
* Capitalize the first word of each line
* Make lines of approximately the right length
* Mark text with who is to speak it
While this is just a toy example, it's powerful enough to start showing the ways RNNs can produce text that looks superficially correct.
(Generating Shakespeare is actually one of the examples given in the classic http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
1. Yes it would. It would learn that capitalized words have high probability as the first word on a line or after a full stop.
2. Obviously it could, depending on stop condition. Especially if you include line length.
3. If trained on a corpus of plays, for sure it would.
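The three points above can be checked directly. Here is a minimal word-level first-order Markov chain over a tiny hypothetical Shakespeare-style snippet (the corpus and token choices are illustrative, not from the article): speaker markers like "HAMLET." and capitalized line-starts are just ordinary high-probability states, so the chain reproduces them for free.

```python
import random

random.seed(0)

corpus = """HAMLET. To be, or not to be, that is the question.
OPHELIA. Good my lord, how does your honour for this many a day?
HAMLET. I humbly thank you; well, well, well."""

# Build the transition table: each word maps to the list of words
# that followed it in the corpus (duplicates encode frequency).
words = corpus.split()
chain = {}
for a, b in zip(words, words[1:]):
    chain.setdefault(a, []).append(b)

# Generate from a speaker marker; capitalization and "NAME." tokens
# come out naturally because they were learned as states.
token = "HAMLET."
out = [token]
for _ in range(10):
    token = random.choice(chain.get(token, words))
    out.append(token)
print(" ".join(out))
```

What it won't do, of course, is keep long-range context consistent, which is where the RNN's memory is supposed to pull ahead.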
The strength of an RNN is supposed to be in context and memory... perhaps handling of grammar.
There are advanced hierarchical grammars, related to Markov random field models, that are about on par with RNNs on many text- and music-analysis workloads.
(In fact probabilistic math is often used to describe results and workings of a deep NN anyway.)
1. To use a small dataset that can be fed from the relatively small Colab VM (so people can play with it themselves)
2. To use a well known model (so people can focus on the implementation parts of it)
Also, this is very early stage software, so things are a bit more verbose than they should be :).
By analogy, and contrast, no one says Python runs on GPUs, just because TensorFlow allows describing models in Python that then get run on the GPU, or Numba rewrites Python loops to CUDA PTX.
It looks like marketing tbh, despite knowing that Julia is a very solid language technically that shouldn't need these kinds of rhetorical tricks.
- Does what you have to write to run on X follow the semantics of the language?
- Can you use data structures/code defined in libraries that don't know about your thing?
That approach is very different from something like TensorFlow, where you're essentially metaprogramming an expression graph. Numba probably counts for Python (yes, you have to put an annotation on things, but if the Python people really wanted, they could probably import Numba into core CPython and make it work more smoothly). Of course, in Python you have the additional complication that most of the core implementation is not in Python itself, so even if you satisfy my two criteria above for the core language, you're still gonna have to rewrite the whole standard library.
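For reference, the Numba annotation mentioned above is just a decorator on an ordinary Python function; the loop below keeps Python semantics and gets compiled to native code when Numba is installed. This sketch falls back to a no-op decorator when Numba is absent, so the behaviour is identical either way (the fallback shim is my addition, not part of Numba).

```python
# Sketch of the Numba pattern: annotate a function and its plain
# Python loop gets jitted. Falls back gracefully if numba is missing.
try:
    from numba import njit
except ImportError:
    def njit(func=None, **kwargs):
        # no-op stand-in so the example runs without numba installed
        if callable(func):
            return func
        return lambda f: f

@njit
def sum_of_squares(n):
    total = 0
    for i in range(n):  # an ordinary Python loop, compiled when numba is present
        total += i * i
    return total

print(sum_of_squares(10))  # 285
```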
It seems like for non-demo usage, you would want upstream maintainers to agree that their code should be TensorFlow-compatible, and have tests keeping it working.
Aside: this is not TensorFlow, but XLA, which are two very different things. It's also possible to try this kind of thing and generate a TF graph, but TF is a much less nice compilation target.
The set of functions that some generic code calls could be considered similar to an interface or trait in other languages. If they expand the interface (by calling new functions) then you'd need to make sure the new functions they call have the appropriate implementations needed as well.
In Go, for example, the interface definition would be explicit. You'd add a method to the interface (or perhaps define a new interface) and update all the implementations you know about. If there is any outside code calling it with their own implementation, they'd get a compile error.
It does sound rather convenient if essentially every function call allows for new implementations, though.
"Julia runs on GPUs" actually means that you write Julia code and it is compiled to run natively on GPUs. Similarly, this post is about writing Julia code and compiling it to run natively on TPUs, not calling some predefined TPU library. Yes, of course you _can_ call libraries that are compiled for GPUs and TPUs—as you can in Python or any other language—but in Julia, nearly arbitrary user code can be compiled and run on GPUs and TPUs. The major restriction on GPUs is that you cannot do dynamic allocation, so you need to write non-allocating Julia code, but that's a common requirement for high-performance code anyway.
You can write fully C-level functions with no Python interaction, all the way up to straight CPython dynamically typed code relying on the GIL and reference counting, etc.
Sure, Cython is great if you decide to rewrite the library you want to use in Cython. Same with scipy.optimize, for instance: it can be super fast for everything but ordinary Python cost functions.
Here is an easy way to prove me wrong: Find the minimum of a pure python function quickly, with whatever python tool you want, without having to deal with the enormous python function call overhead, without rewriting the function. Imagine this function is deep down in another library and it uses a bunch of 3rd party libraries.
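To make the overhead concrete, here is a minimal pure-Python minimizer (a golden-section search, chosen as a stand-in for whatever scipy.optimize would do internally; the function and bounds are made up for illustration). Every one of the few dozen evaluations pays CPython's full function-call cost, which is exactly the overhead being argued about when the cost function is cheap.

```python
import math

def golden_minimize(f, a, b, tol=1e-6):
    """Golden-section search on [a, b]; returns (x_min, n_calls).
    Each iteration makes one full Python call into f, which is the
    per-call overhead under discussion."""
    invphi = (math.sqrt(5) - 1) / 2
    calls = 0
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    calls += 2
    while abs(b - a) > tol:
        if fc < fd:
            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
        calls += 1
    return (a + b) / 2, calls

x_min, n_calls = golden_minimize(lambda x: (x - 3.0) ** 2 + 1.0, 0.0, 10.0)
print(x_min, n_calls)  # x_min ≈ 3.0 after a few dozen evaluations
```

The challenge, as posed, is to make those `f(x)` calls cheap *without* rewriting `f`, which is where the disagreement below picks up.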
When you ask for a pure Python loop, for example, you are directly saying you want something like the iterator protocol of Python, inclusive of its overhead, because your use case needs the dynamic behaviors, overloading via custom iterators, whatever. You’re directly saying the performance trade-off is worth it for you personally because a low-overhead C-style loop won’t give you the extra features of e.g. the iterator protocol that matter more to you.
When you say, but my use case doesn’t need the iterator protocol, then you just write that piece of code in Cython or use a nopython numba jit call, etc., because you want to make your trade-off differently in that single case.
You’re essentially saying, “how do I make a triangle with 5 sides,” by requesting a pure Python section of code to not be pure Python.
On a side note though, you can take pure Python code and compile it directly with Cython, with no modifications, and in many cases it will still be quite faster because it can compile away some types of attribute accesses, overhead of boolean logical checks as function calls, and also reduce function call overhead for many standard library functions or data structures that are already implemented as C extension modules (by bypassing some of the method lookup logic to avoid checks on PyFunction objects in CPython).
Usually this is a recommended first step before ever adding a type annotation or doing anything more difficult.
Finally, just to be clear, numba (and tools using llvmlite more generally) don’t worry about this, since they perform static type inference and have other rules about enforcing static types when jitting a Python code segment.
Have you tried that yet?
If you’re writing a general purpose function minimizer, you have to make assumptions about the functions you will minimize, for example that they are bounded below or that minimization is restricted to a closed subset of the domain.
If they rely on third party libraries, you have to make even more assumptions, that those third party function calls don’t require (for legit reasons) some Python-specific dynamic typing feature of the language that would imply that jit compiling them is impossible.
If you are willing to make these assumptions, then solving this is trivial: just monkeypatch all the relevant functions with jitted versions of the functions. You could even write a tool to walk the relevant packages with pkgutil and monkeypatch everything. Zero rewrites, and you wouldn’t even have to manually indicate what to monkeypatch.
For example the library gevent does this “monkeypatch everything” pattern to change all functions in the requests package into non-blocking async requests automatically.
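The mechanics of the monkeypatch pattern look like the sketch below. The `slowlib` module and its `norm` function are hypothetical stand-ins (in the real scenario you'd import the actual third-party package, and the replacement would be a `numba.njit`-compiled version rather than the hand-rewritten one used here so the example runs anywhere).

```python
import math
import types

# Hypothetical stand-in for a third-party module; in practice you'd
# import the real package and patch its attributes the same way.
slowlib = types.ModuleType("slowlib")

def _norm(xs):
    total = 0.0
    for x in xs:
        total += x * x
    return math.sqrt(total)

slowlib.norm = _norm

# The replacement; with numba this would be njit(_norm). A plain
# rewritten function stands in here to show the mechanics.
def fast_norm(xs):
    return math.sqrt(math.fsum(x * x for x in xs))

# The monkeypatch: every later lookup of slowlib.norm gets the fast one,
# with zero changes to any code that calls it.
slowlib.norm = fast_norm
print(slowlib.norm([3.0, 4.0]))  # 5.0
```

gevent's `monkey.patch_all()` applies the same attribute-swapping idea across the standard library's socket and threading modules.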
If you are asking instead to preserve pure Python features that these third party libraries are using, like reliance on iterator protocol, descriptor protocol, metaclasses, context managers, Python data model special methods, dynamic attribute lookups, etc., then the question is once again not logically consistent. It’s asking for a triangle with 5 sides.
Julia can JIT code that relies on vastly improved versions of all those features (minus some setattr etc.).
Can you use them on a custom array type in any case, let alone in scipy? No, and that's the point. In Julia, the Optim package, for example, takes in abstract arrays with all those features. Mathematical assumptions are categorically different from unavailable or slow PL semantics.
Julia preserves full general programming language semantics including zero cost abstractions, higher order functions, closures, zero cost differentiation and extending custom types, while still being fast enough for numerical computing (minus exceptions).
It seems to me that Python is stuck with one-sided figures, and that's assuming your monkeypatch idea even works. That also ignores the ecosystem cost, and the fact that monkeypatching in Julia is part of normal code design (through multimethods), whereas in Python it's a code smell.
I think you are really grasping at straws here.
This basically summarizes the problem with most Julia upvote party posts like this one on Hacker News. Your comment is totally one-sided, Julia is better at every possible thing, so much so that you are noseblind to it and can’t get an outside perspective that no, in fact, Julia’s language features do not fully dominate Python’s, feature for feature.
Every time it’s just an agonizing dragged out comment thread full of this type of overly one-sided thinking. Usually I just ignore all Julia posts for exactly this reason, and probably should have this time too, but seeing Python jit options and Cython options misrepresented so badly just got the better of me.
Yes, we all know that if you program in a very particular way (basically by not using any of the great dynamic or introspective features) you get fast Python. How is it not objectively better to have a language that is fast independently of whether you use its dynamic/introspective/metaprogramming features?
"Python is fast as long as I program in this very particular and very constrained way" is a silly way to defend Python (which is nonetheless an amazing language).
Julia has a ton of "zero cost abstractions". Python, as great as it is, simply does not.
The same applies to those OpenACC pragmas that can offload a butt-ugly Fortran loop to GPU: no one says Fortran is running on the GPU, rather the compiler is doing code gen and runtime calls to make the user's life easy.
It thus smells like marketing rhetoric.
This doesn't generate novel, meaningful text content without a template, but I'm unaware of any machine learning model that does this well (OpenAI's included). This model does absorb the "rules" of the text, both grammar and structure (character names and dialogue, scene introductions, etc.).
In my own project, I made use of this to find spelling and grammatical errors on Esperanto Wikipedia: https://medium.com/@mapmeld/esperanto-nlp-part-3-correcting-...