Careful, folks. S4TF is pretty much dead on arrival. It was pushed aggressively by Chris Lattner (for obvious reasons), but he left Google a while ago, and since then most internal users have lost interest. There's nothing in Swift that's inherently well suited to ML, and building out the ecosystem is a ton of work; without all the political pushing, it went nowhere and is now close to a "semi-abandoned research project" phase.
Speed, too. For PyTorch to train models and run inference quickly, your Python code gets translated to C++/CUDA. Part of the idea with S4TF is to be able to write ML code in a single, fast language.
S4TF still requires either C CUDA kernels or XLA. Julia, on the other hand, has JIT GPU codegen, and its CPU codegen has been benchmarked beating OpenBLAS.
He probably means Tullio.jl (which also seems to integrate with Zygote.jl, Julia's source-to-source differentiation library and the main competitor to Swift for TensorFlow):
Regardless of whether it can consistently beat Fortran/BLAS in every area, JIT languages in general have more opportunities for optimization than AoT languages, so it's interesting to see what comes out of a language that focuses on leveraging this to get the most performance.
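As a toy illustration of that point (purely my own sketch, not anything Julia or S4TF actually does): a JIT can treat a value that only becomes known at runtime as a compile-time constant and fold it into the generated code, which an AoT compiler cannot.

```python
def specialize_scale(factor):
    """Generate a function with `factor` baked in as a literal,
    the way a JIT can constant-fold values known only at runtime."""
    src = f"def scaled(xs):\n    return [x * {factor} for x in xs]"
    namespace = {}
    exec(src, namespace)  # "compile" the specialized version on demand
    return namespace["scaled"]

# `factor` might come from a config file or user input at runtime;
# an AoT compiler would have to keep it as a variable load instead.
scale_by_3 = specialize_scale(3)
print(scale_by_3([1, 2, 4]))  # [3, 6, 12]
```

The function name and structure here are hypothetical; the point is only that the specialized version is generated after the runtime value is known.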
I'm surprised re: BLAS; that is closer than I thought! There's still a huge gap on the GPU, but it's impressive nonetheless.
> in general JIT languages have more opportunities for optimizations than AoT languages
I'm not sure I agree with this. I would say the opportunities for any mature non-GC AoT language (C++, Rust, etc.) are going to be pretty much the same, since you can just attach a JIT to most AoT languages.
The GPU gap only shows up if the code is written in the high-level index or loop style. There is little to no gap if it is written either with array abstractions (broadcast, map, etc.) or at a level similar to CUDA C (though with nicer Julia abstractions and syntax): https://juliagpu.org/cuda/
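For readers unfamiliar with the array-abstraction style: the idea is to write whole-array expressions instead of explicit index loops, so the compiler sees a single operation it can fuse into one GPU kernel. A rough CPU-side NumPy analogy (not Julia's actual API, just the shape of the idea):

```python
import numpy as np

x = np.arange(1024, dtype=np.float32)

# Index/loop style: the compiler sees scalar loads and stores,
# which is the case where GPU codegen lags.
y_loop = np.empty_like(x)
for i in range(x.shape[0]):
    y_loop[i] = 2.0 * x[i] + 1.0

# Array-abstraction style: one whole-array expression, which a
# GPU-aware compiler (e.g. broadcasting in Julia's CUDA.jl) can
# lower to a single fused kernel.
y_bcast = 2.0 * x + 1.0

print(np.allclose(y_loop, y_bcast))  # True
```

Both forms compute the same thing; the difference is only in how much structure the compiler gets to see.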
The Julia Lab at MIT is working on making the higher-level codegen faster.
I guess that makes sense to me... you can just automatically convert the C in BLAS to Julia, and if they're both being compiled to LLVM IR by Clang anyway, then I guess it'll be about as fast!
That's not at all what Julia is doing. It's much more sophisticated: it has very low-level intrinsic primitives that compose, and it optimizes the IR to make it fast before compiling it to CUDA. These all map to Julia constructs.
Sure, if you give a static language a JIT it will get the advantages of having a JIT, though language semantics still matter. A language built for a JIT, like Julia or Common Lisp, has native ways of interfacing with the compiler, and programs are written without worrying about an exponential explosion of implementations during method monomorphization: you only compile the specialized versions you actually use, based on runtime information, and you don't have to be pessimistic, since any over-specialization can be fixed on demand. An AoT language would probably need a compiler pragma, or a type similar to a dynamic box but for delayed monomorphization/compilation, for methods where you want to avoid compiling all paths ahead of time (which might, for example, allow tensor specialization on sizes, similar to StaticArrays in Julia).
I am not too experienced with Julia, but my understanding was that it uses LLVM to JIT itself. Since the LLVM JIT compiler is also available as an API from C++, anything that can be done in Julia can be done in C++ with the LLVM JIT API.
Then you just compile the methods that you'll actually use with LLVM right before using them.
Sorry, you're right that since Julia is written in C/C++, everything Julia does could be done in those languages by writing a language (like Julia itself, and not unlike TensorFlow's original interface), compiling it on demand, and finding a way to eval the new code and recover the results. I was talking about how to make it somewhat convenient (at least viable to implement, unlike the former): an extension to the C++ compiler itself where you can tell the compiler what stays AoT and what is JIT'd, but otherwise keep the same C++ syntax.
Not to mention that if you want to reimplement Julia's logic in C++, you'll have to develop its sophisticated type inference: the Julia compiler is so aggressive that it will compile entire blocks of a program at once (the entire program, if it can) as long as it can infer which types are used downstream, which is why it can compete with AoT-compiled languages (it's basically a "just-ahead-of-time" compiler).
The crux is that "the methods that you'll actually use" is very difficult to determine. A lot of effort goes into this in the Julia "PackageCompiler" and "Compiler" projects.
If true, it does not surprise me. While there is a lot of language warring in the ML ecosystem, I never really heard of anyone using, planning to use, or waiting for Swift for TensorFlow.
It might be a nice language (I've never used it), but there are other contenders with more engineer/scientist support, like Rust and Julia, whose advantages were clearer.
Finally, the whole ordeal got a very bad look from having its main proponent be the creator of Swift, rather than the actual community pushing for it.
Generally, your metaphors should not rely on comparison to a pretty traumatic event that has happened to quite a few people, many of whom might be around you without you knowing.
Just letting you know the reality that a lot of people view "stillborn" events as particularly traumatic. I'm not going to spend a lot of time arguing about this because it's a pretty simple point.
It's up to you to choose what kind of person you want to be; I don't have control over what you say.
I think the difference is that the term "dead" is used in a lot of different contexts, e.g. "the battery is dead." "Stillborn" is not; it is mainly used in one very traumatic context.
Anyone who is avoiding "master, black list or sanity check" probably thinks abortion is super awesome and that the term "abort" should never be stigmatized.
Best not to worry about such silly things and keep writing code.