Why Swift for TensorFlow? (github.com)
212 points by nnd 5 days ago | 143 comments





It's interesting how it's going to play out. On the one hand, Swift is a pleasant language to work with (despite its infancy). But on the other, having a Tensorflow API doesn't suddenly give it a bunch of libraries for statistics, comp. vision, modeling, visualisation, etc. that Python/R/Julia coughMATLABcough have.

Nowadays, it's difficult enough to convince people to drop e.g. MATLAB for R or Python for Julia (let's assume that there's some merit to it), despite them having excellent counterparts for almost everything. Swift's success in this domain depends solely on the adoption by developers/researchers/engineers. Unless they're just going to mostly use it internally (as Google is known to).

Which brings me to the last point - why on Earth would they pick Swift (apart from Chris Lattner being involved) when Julia was on the table? It ticks all their boxes and has a more mature ecosystem for all things "data". The provided rationale is hardly convincing.


Yeah, I don't buy the justification versus Julia because of community size either, given most of Swift's community has little to do with data science. The document even says as much, contradicting that rationale, later on.

As someone who uses TF heavily, I would be much more excited about this project if they'd chosen Julia. Swift's tooling isn't great, and I already have a foot in one language with an immature data science ecosystem (Rust).


How is Rust-Julia interop?

Julia's `ccall` is great in terms of overhead[0], so calling Rust shared libraries is not a problem. On the Rust side, it took me a while to figure out passing in pointers, and then constructing slices via from_raw_parts[_mut], so that I can transition to safe Rust. Perhaps that is obvious to more experienced Rust programmers, but I was left with the impression that receiving pointers and crunching numbers is not yet a common application for Rust (unlike C, C++, or Fortran). Meaning there is not a lot of introductory material coming from that angle at the moment. Additionally, to get good vectorization seems to require nightly and a fastfloat[1] library. In particular, you'd want associative math / fp-contract for fma and SIMD instructions, and perhaps fno-math-errno to turn off branches in functions like sqrt.

I imagine calling Rust from Julia will be much more common than calling Julia from Rust. I know approximately nothing about this, but there are plenty of questions about embedding Julia into C/C++[2][3]. May be similar for Rust.

[0] https://github.com/dyu/ffi-overhead [1] https://github.com/robsmith11/fastfloat [2] https://discourse.julialang.org/t/support-with-embedding-jul... [3] https://discourse.julialang.org/t/api-reference-for-julia-em...
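
For readers unfamiliar with the pattern described above, here is a minimal sketch in Rust of receiving a pointer and a length over the C ABI, building a slice with `std::slice::from_raw_parts`, and moving straight into safe code. The function names, the library name `libdemo`, and the Julia call shown in the comment are illustrative assumptions, not something taken from the linked material.

```rust
use std::slice;

/// Safe core: ordinary Rust operating on a borrowed slice.
fn sum_of_squares_safe(xs: &[f64]) -> f64 {
    xs.iter().map(|x| x * x).sum()
}

/// C-ABI entry point (hypothetical name). Assuming the crate is built as a
/// `cdylib` called `libdemo`, a Julia call could look like:
/// `ccall((:sum_of_squares, "libdemo"), Float64, (Ptr{Float64}, Csize_t), x, length(x))`
///
/// # Safety
/// `ptr` must point to `len` valid, initialized `f64` values that stay alive
/// and unmodified for the duration of the call.
#[no_mangle]
pub unsafe extern "C" fn sum_of_squares(ptr: *const f64, len: usize) -> f64 {
    if ptr.is_null() || len == 0 {
        return 0.0;
    }
    // The only unsafe step: reinterpret the raw pointer as a slice.
    let xs = unsafe { slice::from_raw_parts(ptr, len) };
    sum_of_squares_safe(xs)
}
```

The unsafe surface is a single line; everything after the slice is constructed is ordinary safe Rust. Flags such as fast-math are a separate concern and are not shown here.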


I will say this at the risk of talking out of my ass, as I have no experience in either language :). Having a statically typed language greatly simplifies the tooling because static analysis is much easier; graph program extraction involves one such analysis. When you have to deploy the trained model in production one would hope not to use Python or Julia.

I'd like to add that, with my limited experience in prototyping some of my ML models, having a static checker to check that your tensors have the right shape is much better than having to run your code.
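
As a rough illustration of what a static shape check can look like (sketched in Rust with const generics; this is not how Swift for TensorFlow or any TensorFlow API does it), the dimensions can be carried in the type so that a mismatched product fails to compile. The `Matrix` type and its methods are hypothetical.

```rust
/// A dense matrix whose shape is part of its type.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    data: [[f64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    pub fn new(data: [[f64; C]; R]) -> Self {
        Matrix { data }
    }

    /// Returns the shape, recovered from the type parameters.
    pub fn shape(&self) -> (usize, usize) {
        (R, C)
    }

    /// Matrix product. The right-hand side must have exactly `C` rows,
    /// so multiplying a `Matrix<2, 3>` by a `Matrix<4, 2>` is a compile error.
    pub fn matmul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0.0; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}
```

The trade-off is that shapes must be known at compile time; shapes that depend on the input data would need a different, dynamically checked type.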


what's the problem with deploying julia in production in inference? Some occasional piece of data that looks wrong in an unanticipated way causes a runtime type fault? People deploy high uptime websites with django - how do they do it? Well you use kubernetes (or, gasp, systemd) and have restart and load balancing logic. Even if you were typecheck-compiled, you can't guarantee some other developer logic or system error, or an errant bit flip from a cosmic ray, won't take your setup down. Static checker doesn't really matter. If you're at the point where you're ready to deploy, you're probably good for at least 95-99% of the data you'll ingest. The rest of the gap can be closed using rolling update.

On the other hand, Julia can do the right thing dynamically. Your matrix happens to be symmetric? Julia will choose an appropriate factorisation and will propagate that knowledge through dynamic dispatch.

"Automagical" solutions tend to work until they don't. I would much rather have my solution be provably correct at compilation time than to depend on high-overhead runtime systems.

You don't know what you're talking about.

Tell me where I'm wrong

You are not wrong. For some "highly dynamic" applications, say, optimizing compiler IR where there are many different subclasses of IR nodes, dynamic dispatch is nice. But when you are running an ML model or a scientific application where you already know which sparse matrix format you need, etc., you can do all of that statically with less overhead and more predictable performance. This is in no way an argument against Julia; the point is that you don't need dynamic dispatch if you can statically determine what needs to happen.
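
To make the distinction concrete, here is a small sketch in Rust (chosen only because it exposes both forms of dispatch explicitly; it says nothing about how Julia or Swift implement theirs). The `MatVec` trait and the two storage formats are hypothetical names for illustration.

```rust
/// Anything that can multiply itself by a vector.
pub trait MatVec {
    fn matvec(&self, x: &[f64]) -> Vec<f64>;
}

/// Diagonal storage: only the diagonal entries are kept.
pub struct Diagonal(pub Vec<f64>);

/// Dense row-major storage.
pub struct Dense(pub Vec<Vec<f64>>);

impl MatVec for Diagonal {
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        self.0.iter().zip(x).map(|(d, xi)| d * xi).collect()
    }
}

impl MatVec for Dense {
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        self.0
            .iter()
            .map(|row| row.iter().zip(x).map(|(a, xi)| a * xi).sum::<f64>())
            .collect()
    }
}

/// Static dispatch: the compiler generates one copy per concrete `M`,
/// so the format is fixed at compile time.
pub fn apply_static<M: MatVec>(m: &M, x: &[f64]) -> Vec<f64> {
    m.matvec(x)
}

/// Dynamic dispatch: the format is looked up at run time through a vtable.
pub fn apply_dynamic(m: &dyn MatVec, x: &[f64]) -> Vec<f64> {
    m.matvec(x)
}
```

Both functions return the same result; the difference is only in when the choice of `matvec` implementation is resolved.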

That is basically the core insight behind julia. The really performance sensitive parts of your application are already static just by the nature of that code, so we can extract that static information to make it really fast. We can also make use of the same static information for static error messages or static compilation and get the best of both worlds (dynamic during development, static when you're done), but the tooling for that is a bit less developed at the moment.

Julia can and does make 'provably correct' decisions at compile time; it's just that the default type-system settings are not quite correct for machine learning apps.

Also it's not a high overhead runtime. The runtime itself is compiled to highly optimized machine code (it can even compile, say the derivative of f(x) = 5x+3 down to the machine immediate "5" at compile time).

There is a lot of lifting to get that compilation framework into place, so there is a load-time overhead.
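
For anyone wondering how the derivative of f(x) = 5x+3 can fall out of ordinary code, one common technique is forward-mode automatic differentiation with dual numbers. The sketch below is in Rust and is only an illustration of that technique; it is not a description of Julia's implementation, and whether a given compiler folds the result down to the constant 5 depends on its optimizer. The `Dual` type and function names are hypothetical.

```rust
use std::ops::{Add, Mul};

/// A dual number `value + deriv * eps`, where `eps * eps = 0`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dual {
    pub value: f64,
    pub deriv: f64,
}

impl Dual {
    /// A constant has derivative 0.
    pub fn constant(value: f64) -> Self {
        Dual { value, deriv: 0.0 }
    }

    /// The variable we differentiate with respect to has derivative 1.
    pub fn variable(value: f64) -> Self {
        Dual { value, deriv: 1.0 }
    }
}

impl Add for Dual {
    type Output = Dual;
    fn add(self, rhs: Dual) -> Dual {
        Dual { value: self.value + rhs.value, deriv: self.deriv + rhs.deriv }
    }
}

impl Mul for Dual {
    type Output = Dual;
    fn mul(self, rhs: Dual) -> Dual {
        // Product rule: (uv)' = u'v + uv'.
        Dual {
            value: self.value * rhs.value,
            deriv: self.deriv * rhs.value + self.value * rhs.deriv,
        }
    }
}

/// f(x) = 5x + 3, the function from the comment above.
pub fn f(x: Dual) -> Dual {
    Dual::constant(5.0) * x + Dual::constant(3.0)
}

/// Derivative of `f` at `x`, read off the dual part.
pub fn df(x: f64) -> f64 {
    f(Dual::variable(x)).deriv
}
```

Because every operation is a plain arithmetic expression on two floats, an optimizing compiler has a fair chance of inlining and simplifying the whole thing.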


Which you don't need most of the time in production because you can statically determine which operation needs to happen without the overhead of dynamically choosing the operation.

Just because you can deploy production code without static checking does not make it a good idea. And to me if you need to rely on an external process to restart your application for NPE's etc. it is a sign your application is not that robust.

It's possible to write production code in brainfuck if that's what you really want to do. Statically checked code is easier to implement correctly, easier to modify, and easier to maintain.


> Statically checked code is easier to implement correctly, easier to modify, and easier to maintain.

If that were true literally no one would program in python. Static checking is not the end all to uptime and stability. I wrote an elixir program in three days that served as a testbench for a senior's go program (which took him six months to write). This senior believes in static typechecking for everything and doesn't write unit tests. The testbench handles thousands of parallel async requests without a hiccup and even survives operating system resource exhaustion, where the go program falls over and panics.

Erlang is not statically checked (there is a static typechecker, but it's not fully typed). I promise you a well written erlang program has much higher stability than a well written go program. There is a reason why kubernetes exists, after all.


> If that were true literally no one would program in python.

There are many cases in which people don't choose the optimal language. But I would say the size of the python community has more to do with inertia, the breadth of libraries available, and a relatively shallow learning curve than it says about its strengths as a tool for writing good software.

I actually find that Python has some rather serious warts: the whole story around environment/version management is a mess, and the less I have to work with Python in a serious capacity the better.

> I wrote an elixir program ... The testbench handles thousands of parallel async requests without a hiccup and even survives operating system resource exhaustion, where the go program falls over and panics.

Well Erlang is specifically designed for concurrency and stability: if you want to judge your result on those two metrics I hope it is going to perform well.

I never made the claim that static typing is the "end all to uptime and stability" - static typing makes it easier to reason about your code, and to provably eliminate many issues. It's very nice that you implemented a test harness quickly, but come back to me when you've worked on a complex codebase with several other people over an extended period of time.


The problem comes down to how you use your model, which is all that matters. If there is a mismatch between the language in which your model is embedded and the language in which your main application/interface is written, then ideally you want a way to extract the model statically from the dev environment. It's not about type-checking per se, which is nice, but about how you get the model out without having to, say, start a python interpreter in your server process just to do the inference (which is why in some respects Tensorflow is more convenient).

I'd argue in general that outside of python, there's not really much of a focus from google itself. They are largely leaving other language bindings to other people (see what's currently going on with tensorflow 2.0).

Their focus is more on the c bindings and allowing other people to build what they want on top of that.

Other language bindings aren't generally going to be used for anything more than inference. First class actual data science work isn't going to happen in other languages anytime soon (at least outside of julia and R which are at least trying to compete in this niche).


Julia doesn't tick the "compile to .o/.h" box. As far as I can tell, the use case for AOT Julia is avoiding package compilation overhead, not delivery of standalone code objects.

_edit_ seems JuliaC does support this sort of thing:

https://juliacomputing.com/blog/2016/02/09/static-julia.html


Julia does already support this kind of thing. Moreover with a minuscule fraction of the money that’s being poured into making Swift usable for data science and machine learning, truly top notch support for generating standalone binaries from Julia could readily be developed. Which is kind of frustrating but what can you do?

If you'd like to track the latest developments:

https://github.com/JuliaLang/PackageCompiler.jl


Hah...it's funny how they pretend to give objective rationales for choosing Swift when it's pretty clear the decision was made long before.

Swift is a nice language but its reliance on reference counting means you have to work a lot harder to avoid retain cycles than you do in a garbage collected language.

That might have been the right choice for Apple’s uses of Swift where GC pauses affect the user experience but for most other use cases it’s too much of a cognitive burden IMO.
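
For readers who have not run into retain cycles, here is a sketch of the usual fix, a weak back-reference. It is written in Rust, whose `Rc`/`Weak` pair is reference counted in a way that is loosely analogous to Swift's strong and `weak` references; it is an analogy, not Swift code. The `Parent` and `Child` types are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A parent that owns its children through strong references.
pub struct Parent {
    pub children: RefCell<Vec<Rc<Child>>>,
}

/// A child that points back at its parent through a weak reference,
/// so the back-pointer does not keep the parent alive.
pub struct Child {
    pub parent: RefCell<Weak<Parent>>,
}

/// Builds a parent with one child and returns both handles.
pub fn build_family() -> (Rc<Parent>, Rc<Child>) {
    let parent = Rc::new(Parent {
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Child {
        parent: RefCell::new(Rc::downgrade(&parent)),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}
```

Had the child held a strong `Rc<Parent>` instead, parent and child would keep each other alive and neither would ever be freed, which is the cycle being described.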


Personally, I find that this only really comes up rarely. Most of the time strong references are fine.

My iOS code is loaded with weak/strong ref handling logic. It comes up all the time when using closures in UIKit.

UIKit is kind of annoying because it’s really not “idiomatic” Swift: it’s a wrapper around Objective-C (albeit, a very nice one) that happens to bring along with it a bunch of decisions that lead to having to deal with reference lifetimes.

That could be. I'll admit I haven't used Swift at all outside of coding Appkit/UIKit apps.

iOS Swift is usually the front-end code which means that you're doing a lot of connections to data sources elsewhere. Non-blocking connections always need a weak/strong dance. If you're using Swift for logic most of that is gone.

> Tensorflow API doesn't suddenly give it a bunch of libraries for statistics, comp. vision, modeling, visualisation, etc. that Python/R/Julia coughMATLABcough have.

Actually in this case it does. Swift for Tensorflow includes python interop out of the box: https://www.tensorflow.org/swift/api_docs/Global-Variables#/...

The supported use-case would be to do your ML work in Swift, and then call numpy etc. from Python.


I feel like GraalVM has a chance to solve some of this at least. I wonder if anyone will make an Octave GraalVM frontend, they already have one for R.

Isn't Graal an Oracle thing? I don't understand why anyone would want to touch that even with a 10-foot pole.

Yeah, I would tend to agree. If Google successfully appeals Oracle's suit against them for implementing Java, I might consider using it in a product, but Oracle makes it legally risky to use any of their products.

> why on Earth would they pick Swift

Because of iOS?


iOS isn't really relevant for this. You would certainly deploy a model into an app, but that is likely to be using ONNX: https://medium.com/@alexiscreuzot/building-a-neural-style-tr...

I also wonder how much of this coincidentally lines up with Chris Lattner landing at Google. As Chris will admit, and as was left out of this analysis, Swift has also been given the humble goal of achieving world domination. All joking aside I'm very thrilled about this and have enjoyed tremendously watching the Swift language mature since its launch due in large part to the open source community and the Swift team's admirable commitment. Onward!

http://nondot.org/sabre/Resume.html

"Swift for TensorFlow rethinks machine learning development ... I imagined, advocated for, coded the initial prototype and many of the subsystems after that; recruited, hired and trained an exceptional engineering team; we drove it to an open source launch and are continuing to build out and iterate on infrastructure."


What do you mean?

This was not in process before Chris came, it was a project he suggested and started pushing on?

What is there to be coincidental?


I think they mean the exact opposite of "coincidental", in the sense of "hmm... is this a coincidence" (no, it isn't)

I get that, but i guess i don't understand what there is to be guessing at at all.

I think it's been pretty straightforward that Chris joined Google and started this project. You don't have to wonder if that's a coincidence, that's what he's happy to say happened.


It's not a coincidence at all

Strange to see that a requirement for choosing one language over another is supposed ease of adoption, and then they choose the one not easily adopted across every platform. It is unfortunate that easy-to-write syntax takes precedence over being generally easy to write and run.

Fail to see the point of this project.

Swift for Tensorflow might work if the scope is to create a client side model definition loader natively for various TF models.

Nobody uses Swift seriously for server side training, there is no point in doing so except to add swift to the list of languages that claim to do deep learning but in reality nobody will consider them.


Well written explanation! I really enjoy Swift but it's not as accessible as some of the other languages mentioned. I have a 2011 Macbook Pro and wanted to use the latest and greatest new Swift features. Unfortunately, my machine is too old to upgrade to Mojave which means I can't download the latest version of xcode, which means no new version of Swift. I'm not mad at Apple in the least bit. I just wish I could use Swift 4 on my machine.

If you don't mind straying from the 100% stable roads, you can install Mojave on your macbook using this patcher: http://dosdude1.com/mojave/

Personally I am running Mojave on a late 2009 macbook pro and it still works amazingly well. Transition from Mojave and especially the new XCode are also way faster than previous iterations. There are caveats though, as the processor in my computer is too old, I had to hack homebrew to compile everything from source.

(Also, using the patcher does not hinder my ability to push updates to the App Store or use iMessage, if that is a concern)


I'm gonna try this, thank you!

FWIW, you can install Xcode 10.1 on High Sierra, and use the Swift.org 5.0 toolchain (https://swift.org/download/#snapshots) or build swift from source. You can't ship App Store apps this way, but it works great for experiments.

My machine is too old to upgrade to High Sierra.

https://support.apple.com/kb/SP765?locale=en_US

> MacBook Pro (Mid 2010 or newer)

So your 2011 should be supported. If it's actually an older machine, compiling from source is an option, if slightly inconvenient.


You're right. Downloading now. Thank you!

You might be interested in checking out Swift on Google Colab (e.g., https://colab.research.google.com/github/tensorflow/swift-tu...)

Have you considered trying to build Swift from source? It's a bit time-consuming the first time, but subsequent updates less so - and you'd have Swift 4 at your disposal.

Isn't Swift open source and available to build anyway, regardless of Xcode and OS X version? Even on Linux etc?

Julia would have been a much better and more cost effective choice in my opinion.

It's a superior platform on which to develop this sort of thing, and further along at that. Also easier to use.


Julia doesn't have a debugger... They specifically claimed this is a very important thing.

In my experience, Julia also has inscrutable scoping rules, a slow REPL, and it's only fast if you don't count the "startup time" of having to precompile everything.


Re: Debugger, fair enough but it's in the works

Re: Scoping rules, these are being evaled.

Re: startup time, already better in 1.1, and will soon be marginalized from two ends: Better static compilation and better tiered compilation.


Is there a "standard" way of running Swift on Ubuntu LTS nowadays? A while back I looked into it, and ran into some hokey and unsatisfying solutions. I used Swift on iOS, and I like it a lot, but if they care about adoption, someone needs to reduce friction of getting up and running to approximately zero. A snap package a-la Go or per-user script based installation a-la Rust would be quite OK, as long as it's just one, easy to discover command.

It looks pretty straightforward, see the Linux section of https://swift.org/download/#using-downloads

Yeah, going through two screenfuls of text every time I want to upgrade is not "straightforward".

Sorry, I just didn't want to give people the impression that it's more difficult than other languages. To upgrade you'd just have to remove the original install directory and untar the new release.

I'm not familiar with Snap, but I did find https://snapcraft.io/swift

Also, upgrading will be less common than Rust since almost everyone uses the latest release/toolchain. There's not really a reason to use the daily builds unless you're contributing to the Swift project.


For comparison, for Go, it's "sudo snap refresh go". For Rust it's "rustup update stable". I mean, how hard would it be to properly package this stuff, and why should tens of thousands of users deal with all this manual downloading and unpacking? Assuming, of course, that Swift folks don't deliberately want to make the language unpopular, like Haskell.

I agree they should put the effort into having an easy "apt-get" solution. On the other hand, your original question was whether there is a standard way of installing Swift on Ubuntu. The answer is a very clear YES: download the latest tarball of the binaries from the Download page and unzip.

I’m kinda annoyed there’s no PPA for this, but I suspect they won’t bother shipping this in the standard repo until there’s a stable ABI

And that's fine: ship an official snap or use the Rust solution. Not doing this very directly impacts adoption. Most people won't even try to set it up.

Swift is also now supported in Colab (Google-hosted Jupyter notebooks) and there's a nice tutorial on some of the features of Swift for Tensorflow at https://colab.research.google.com/github/tensorflow/swift-tu...

I have no idea what TensorFlow is (other than the basics) but I enjoyed reading that entire document because it did such a wonderful job of explaining a complex and potentially contentious decision. It’s fascinating to see Swift feature so strongly in a pragmatic analysis that doesn’t explicitly favour Apple platform interop.

I am a bit ignorant on the topic, but is swift available for Windows/Ubuntu? Most of the deep learning scientists I know and work with use either of the two setups. I know there technically exists CUDA GPU support for Apple, but I have frankly never even attempted to mess with it.


Ubuntu is "supported". The compiler might be available, but there are hardly many libraries available that would compile outside Apple platforms.

That's no longer true. The Foundation framework is basically complete on Linux, and the vast majority of 3rd party libraries which are not iOS specific will work on Ubuntu. Even many of Apple's own libraries (e.g. SwiftNIO, a low-level, high performance networking library for things like implementing web-servers) are cross-platform.

Last time I checked, the "vast majority of 3rd party libraries which are not iOS specific" was actually quite tiny.

You'd be surprised. There are a few reasonably well developed server-side frameworks, some of which are already used in production in various places.

Also a lot of the libraries which are mostly used in iOS don't have any dependencies on the iOS platform: for instance promise or event emitter implementations etc.

IBM is actually supporting a number of open-source swift projects as well: https://github.com/IBM-Swift.

Between that and painless interop with C/C++, Swift does not feel under-supported on Linux in the least.


I am surprised Dart is not mentioned at all (maybe implied under the OOP languages?). While Flutter and Tensorflow are very different usecases, I am surprised there is nothing in the document on why Dart specifically would be a good choice. I believe if they used Dart for Tensorflow as well, the community would be able to get behind the idea that it will not be an abandoned language.

I vaguely know TensorFlow as the most(?) popular lib of its kind, but I wonder what the history of Swift on non-Apple platforms is and how it affects actual users.

Is TensorFlow "huge" on Linux, Windows, Android? I ask because I also evaluated Swift for my use case (https://www.reddit.com/r/swift/comments/8zb9y1/state_of_swif...) and decided to use Rust instead, mainly because of the lack of solid support on non-Apple platforms. However, after using Rust for months now I still consider Swift a better balance of performance and ergonomics than Rust (Rust is damn hard sometimes, and you can suddenly hit a problem with above-average complications. I don't see how putting this burden on a library meant for more "regular" folks could work)


The story of Swift on Linux is now quite good.

Windows is less far along, but recently a contributor got nightly builds started on Azure, and it appears there is serious work on this front.

In any case, it's already possible to run Swift for TensorFlow on Windows using WSL and Docker.


Ok, that is for swift...

But isn't TensorFlow popular on Windows? Because then building on top of Swift will mean either:

- Put swift on a fast track to be decent on windows, linux, android(?)

- Ignore the windows users and let them battle a bad dependency?


Setting aside for a moment the appropriateness of Swift for TensorFlow, this is a very impressive example of using an embedded DSL to work with a component that is a full programming system in its own right.

On the one hand, we do want full access to the programming model exposed by the component -- its control structures, abstractions, everything else. On the other hand, these are mostly duplicated by our host programming language: it's going to have variable bindings, operators, iteration, conditionals and everything else. Doing an embedding like this is a way to expose most of the component's facilities without introducing a ton of "new syntax" in the form of combinators or having programs where a lot of the critical code is escaped in strings.

This same problem shows up in programming interfaces to RDBMSes. LINQ is a good example of the same embedding technique.


You might agree/disagree with their decision but this is one of the most honest and comprehensive evaluations of a language choice I've read.

Forgive me for my ignorance, but does swift have any good plotting and interactive "notebook" ability? Specifically the ability to plot images such as matplotlib.

I ask this because the number 1 reason my deep learning research group chose python was because of the extensive and interactive scientific plotting ability that's built into python jupyter notebooks. While our volume of analysis isn't on the scale of say a google/fb (primarily biomedical image analysis), the ability to easily visually debug the results is much more important for developing robust models.


Yes! Swift is supported in Google Colab, and as a Jupyter kernel: https://github.com/google/swift-jupyter.

What is the plotting experience like though? As I previously mentioned, plotting is one of the main reasons our group uses python.

Another reason now that I think about it, is the number of scientific libraries that I can just "pip install" without much thought (such as scipy/opencv).


You can call out to matplotlib (or any other python libraries installed on your system), using the python interop feature (https://github.com/tensorflow/swift/blob/master/docs/PythonI...)!

https://github.com/google/swift-jupyter#rich-output has an example with screenshots.


Oof

Interactive plotting and “notebook” capability isn’t a property of a language so it’s fallacious to ask if Swift has it. (Or Python, or Julia, or Wolfram, etc.)

Reading the document really gives a feeling the author is not being honest on why they chose Swift.

The lack of windows support is addressed in just two lines. Julia being an already established language in the domain of data science does not seem to be especially important to them.

I think the most honest part of the document is:

> because we were more familiar with its [Swift's] internal implementation details



why? because i made the language, thats why...no real good reason

And here's 3 pages of very vague, half-justifications as to why we didn't choose anything else to head off any complaints.

As a python machine learning practitioner and previous iOS engineer, I have for a while missed using swift, and type safety for that matter. I really like the language and wish great success for the TF team with swift.

Side note, does anyone know the effort required to get various python based libraries running on swift? i.e. numpy, scipy, pandas and so on?


The Swift for Tensorflow team has added some python interop to Swift. So you’ll be able to, for instance, do an “import numpy” in your Swift code.

I wonder what this means for iOS apps themselves

At the moment not much. Swift for TensorFlow is a fork of the language, with language-level support for some features which are useful for data science, for instance automatic differentiation and dynamically-callable objects.

Some of those features are making their way into the main branch, but at the moment you could not import the TensorFlow library into an iOS project and use it. Swift for TensorFlow needs to be built using a separate toolchain.


I can imagine swift really taking off in this space. It’s going to be a battle between Julia and Swift for who does the best automatic differentiation.

I’m happy for Swift, but I really, really want Julia to win out here.

There’s some pretty impressive ML frameworks in Julia and the language can do some really cool things, so I’m hoping that gives it the edge.

Plus, I found tensorflow exceedingly painful to use, so hopefully something else prevails.


Google would need to make Swift a first class citizen on Windows, currently Julia is winning.

That's one of the outcomes I am hoping for in this. I would love for Swift to be a first-class language.

I believe the Swift for TensorFlow team is currently hiring for this.


how can you not mention Python when it's currently what is used 99% of the time, the other 1% being R

Is there an official release roadmap?

Last I heard the goal for "initial adoption" is set for Spring 2019.

I'll wait for the PyTorch version.

Haven't we seen this before?

It's been in development for about a year, and I have seen several posts about it here. It's still not quite feature-complete or ready for real use.


This can be used as an argument against just about anything.

noop, some solution will dominate. Which will make 14 => 1 or 2.

[flagged]


So far I can easily use Julia on Windows, Swift on the other hand....

I don't think it's over yet really. Tensorflow might be moving their code base to Swift, but there's other frameworks and TF isn't the be-all-and-end-all of ML frameworks. Having used it, I'd really hope it isn't, because it's incredibly painful to use.

I'm personally excited for the likes of Julia's Flux framework to get a bit more production ready, I think that's got serious legs.


I particularly enjoy that Flux.jl is an AD framework and a couple lines of Julia code defining convenience functions for ML, and it is all written in Julia so it invites exploration.

Tim Besard gave a talk at a Tensorflow meetup recently that walked through the Julia stack (GPU&ML) that I would highly recommend: https://docs.google.com/presentation/d/1y93Kg8ZizvabKAGs-zmA...


Why not Rust?

Edit: I wonder if Swift could be replaced with Rust for iOS development?


From the article:

We believe that Rust supports all the ingredients necessary to implement the techniques in this paper: it has a strong static side, and its traits system supports zero-cost abstractions which can be provably eliminated by the compiler. It has a great pointer aliasing model, a suitable mid-level IR, a vibrant and engaging community, and a great open language evolution process.

A concern with using Rust is that a strong goal of this project is to appeal to the entire TensorFlow community, which is currently pervasively Python based. We love Rust, but it has a steep learning curve that may exclude data scientists and other non-expert programmers who frequently use TensorFlow. The ownership model is really great, but mostly irrelevant to the problems faced by today’s machine learning code implemented in Python.


As I pointed out in two lengthy comments on day one[1][2], that reasoning is nonsense. If Chris wants to use the language he created in this new endeavor for machine learning simply because he made it, that's totally fine and completely his prerogative, but he should just say so, rather than trying (and failing) to convince people that other languages aren't better suited for this task.

From my point of view, a weak justification is worse than no justification in cases like this.

Rust is much better suited to this task than Swift from a technical point of view. The far superior platform support for Windows and Linux is ample reasoning to say Rust is better suited for this task, since very few data scientists will be training models on macOS. However, that's only one of several areas where Swift has shortcomings for a project like this. Swift is great for iOS and macOS development, of course, since it was designed for that. I don't think Swift is a bad language by any means, and with enough effort, it can be reshaped to be good for Tensorflow... the GitHub document just provides zero useful justification for the work required to make it good for Tensorflow.

EDIT: to some of the replies talking about Rust's learning curve, that mostly applies when you start trying to design efficient, interlinked data structures involving ownership. For most applications of machine learning, this simply wouldn't be a problem. The library would provide the data structures, you just have to use them. Rust can provide simple interfaces to complicated things.[3] The compiler's error messages are usually incredibly helpful.

The learning curve of Rust should not be relevant here, compared to Swift, which is also full of idiosyncrasies. Swift and Rust both have a large learning curve for someone coming from Python. This is because they're statically typed languages that are just different from a scripting language. For an application like this, I would say those learning curves are roughly equal at the language level, but as I pointed out in my comments, Swift adds an enormous extra hurdle: many data scientists would have to either install and learn Linux, or throw out their current computer, buy a Mac, and learn macOS.

My point here is not that Rust is the most suitable language for Tensorflow (although it could be), but rather I'm making the point that Rust is more suitable than Swift for a project like this, and therefore this document is just annoying. It would be better for them to delete this document and just say "we're using Swift because our team has a lot of experience with it and because the creator of Swift is leading this project, so we would lack enthusiasm and momentum if we were using something else, even if it were more suitable."

Julia would be really interesting to see explored further, since it would appeal much better to many existing data scientists who would be transitioning from Python. The times that I've played with Julia, I was amazed at how slow the JIT is for even tiny scripts. LLVM is powerful stuff, but it is painfully slow at everything. It would be nice if Julia offered an alternative backend for rapid development.

[1]: https://github.com/tensorflow/swift/issues/3#issuecomment-38...

[2]: https://github.com/tensorflow/swift/issues/3#issuecomment-38...

[3]: http://kiss3d.org/


I personally find Rust to have quite a learning curve (which I guess is also an opinion shared by others). The language is great though.

I do agree with your criticism of the document here, though. It feels very much like Swift happens to check many boxes, but the lack of Windows support is baffling. It's simply table stakes to be able to run, fully supported, on Windows, macOS, and major Linux distributions. That should be the very first thing anyone considers.

But beyond that, I think even with Rust's macro system it could be difficult to make it work for Tensorflow in a way that feels appropriate for Rust programmers _and_ for TensorFlow. This was explored in F# for Tensorflow research[0] and a completely different approach[1] was taken because making a type system suitable for tensorflow got too unwieldy.

[0]: https://github.com/fsprojects/TensorFlow.FSharp

[1]: https://github.com/fsprojects/TensorFlow.FSharp#live-checkin...


> But beyond that, I think even with Rust's macro system it could be difficult to make it work for Tensorflow in a way that feels appropriate for Rust programmers _and_ for TensorFlow.

If you're talking about matrix shape compatibility (matching up rows from one with columns from another) I'm hopeful about const generics here: https://github.com/rust-lang/rfcs/blob/master/text/2000-cons...
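To make the shape-compatibility idea concrete, here is a minimal sketch of what const generics allow. The `Matrix` type and its `matmul` method are hypothetical names made up for illustration, not part of any TensorFlow binding, and the sketch assumes a compiler with stable const generics:

```rust
/// A dense row-major matrix whose shape is part of its type.
/// `Matrix` is a hypothetical illustration type, not a real library API.
#[derive(Debug, Clone, PartialEq)]
struct Matrix<const R: usize, const C: usize> {
    data: [[f64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    fn new(data: [[f64; C]; R]) -> Self {
        Matrix { data }
    }

    /// Multiply an R x C matrix by a C x K matrix, giving an R x K matrix.
    /// The shared dimension C is checked by the type system, so a
    /// mismatched call does not compile.
    fn matmul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0.0; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}

/// Example use: a 2 x 3 matrix times a 3 x 2 matrix.
fn shape_checked_product() -> Matrix<2, 2> {
    let a = Matrix::new([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]); // 2 x 3
    let b = Matrix::new([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]); // 3 x 2
    // `a.matmul(&a)` would be rejected at compile time, because the
    // right-hand side would need 3 rows and `a` has 2.
    a.matmul(&b)
}
```

This only covers shapes known at compile time; dimensions that depend on runtime data (batch sizes read from a file, say) would still need runtime checks.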


It seems likely that the justification was retrofitted and that the team's familiarity with Swift was the bigger driver. I am surprised they didn't find Scala to be a good fit, given that it has already been used with great success in Spark, which I presume has similar technical requirements. Can anyone shed light on the short explanation below? Does it really apply to Scala?

"Java / C# / Scala (and other OOP languages with pervasive dynamic dispatch): These languages share most of the static analysis problems as Python: their primary abstraction features (classes and interfaces) are built on highly dynamic constructs, which means that static analysis of Tensor operations depends on "best effort" techniques like alias analysis and class hierarchy analysis. Further, because they are pervasively reference-based, it is difficult to reliably disambiguate pointer aliases."


I agree with you regarding lack of Windows support, however I would rather see Julia as a better alternative than Rust, given the language ergonomics.

More to the point, static typing is just not that important for data scientists. Arguably it's not that important for backend devs either (e.g. Lisp, Erlang).

Should be prefaced with, "I think".

Having done user research on this by speaking to data scientists, I can say that static typing is desired by a nonzero number of those who practice what we would consider to be data science and machine learning. Much like how TypeScript is seen as a revelation by hordes of JavaScript programmers who have never used static types before, the ability to get some level of correctness verification at design-time matters.


The more time I spend with strongly typed languages the more I am convinced it is the right way to go. For modern languages with good type inference, and good tools for protocols/interfaces not tied to an inheritance hierarchy, it is at worst a minor inconvenience for a huge benefit.

> I can say that static typing is desired by a nonzero number of those who practice what we would consider to be data science and machine learning

Who would trade static typing for fast prototyping any time.

Data science is a really nebulous term covering many drastically different domains of CS. Many data scientists I've talked with don't really produce code; they write code to produce analysis, which is the actual deliverable. For them, code is ad hoc and disposable, created on demand and left in the dust until it is rediscovered when the next task comes along.

Some of the code does survive and enter the production stage, and I guess that is where they would seek some assurance from static typing. But I do think they could learn to mitigate most of the pain if they committed themselves to writing some unit/functional tests, yet such awareness is rare among the data scientists I know and have worked with.

So all in all, yes static typing MIGHT help, in some way, but I don't think it addresses the underlying pain point as much.


> Who would trade static typing for fast prototyping any time.

These need not be at odds. Many ML languages like F# or OCaml, by use of type inference, get you type safety without having to type a bunch of stuff or sacrifice fast prototyping. And certainly in F# there is a history of having productive tooling that lets you prototype easily. Simply writing some F# code in an F# script in an IDE, hitting alt+Enter, and letting it execute in an interactive shell is hugely productive for exploratory tasks. And features like Type Providers build out types for an arbitrary data set that let you guarantee your code is actually correct for the data.

What I've mentioned isn't without its flaws, and eventually someone is going to reach head-scratching problems just as they would in any other environment. I don't think there's an objective way to measure productivity across a wide range of professionals, but I do believe that some subset of them would prefer static types for their work. This is backed by conversations with some of them about problems they encounter.


Although I am a big fan of a couple of dynamic languages, when it scales we really need static types to make any sense of it, even to our older selves a couple of months down the line.

So gradual typing like in Julia is already a good thing for having the best of both worlds.


Correctness verification at the level that data scientists need can generally be achieved with optional typing (presuming a well designed type system)

Perhaps! I personally think it's still a very young field, and there's likely a spectrum of professionals who prefer some strong degree of typechecking.

This is being explored with "Live Checking" in F#[0], which offers a form of static typing over TensorFlow without actually forcing you to express every complex interaction with data in types.

[0]: https://github.com/fsprojects/TensorFlow.FSharp#live-checkin...


> achieved with optional typing (presuming a well designed type system)

Enter stage left: Julia

Julia is already pretty great, I'd really love to see what cool stuff we could have with a swell in community size and investment!


Yeah that's kind of what I'm referring to but the default array typing in flux.ml doesn't encode tensor dimensionality in the type system. If it did (which it very easily could in julia) you wouldn't wind up with a situation where your learning task halts in the middle of a training run, which can happen in flux.ml

Due to the way that code composition works in Julia, there is no real “default” array for Flux. Rather, you can lift in any array type that you like. The GPU arrays are an excellent example of this, Flux “knows” nearly nothing about GPUs (apart from a few convenience functions), yet works perfectly when using a GPU array type. So there is nothing stopping you from lifting in say StaticArrays [1] which carries the sizes in the type or NamedArrays [2] where dimensions have explicit names – the latter being superior in practice to the former in my opinion, or perhaps someone is up for marrying the two?

[1]: https://github.com/JuliaArrays/StaticArrays.jl

[2]: https://github.com/davidavdav/NamedArrays.jl

In brief, it is not the duty of the automatic differentiation package to favour a specific array type – it just works for all of them, which is something that I find fairly magical with Julia.


1) It is not the duty of AD to favor an array type, but Flux is an ML library. When you do something like Chain() or Dense() or LSTM() in Flux, which is very obviously an ML tensor operation, it SHOULD pick a reasonable, fixed (or variable!) tensor dimension. This is maybe not so easy, but it should be doable. Likewise, I wish Flux had "batch" and "minibatch" types with specifiable dimensions, so that if you try to hook up data to layers of the wrong shape it gives an early warning.

2) StaticArrays would be a good starting point, but the point of it is to optimize Arrays by unrolling for loops and triggering SIMD (IIRC) and there are performance penalties when your arrays get really large, which they do, in ML. Something LIKE the staticarrays typesystem but without the overoptimization would be welcome.

3) (kind of tangential) I have beef with how GPU is handled as GPUArray in julia. It really should be handled as a worker node using the ClusterManagers-type semantic; you should be async sending tasks to the GPU as if it were a remote agent (which it kind of is, due to PCI bus bandwidth and latency bottlenecks) and waiting for the result to come back as a Future.
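As a rough illustration of the semantic described in point 3, the sketch below treats a device as a remote worker: tasks are sent asynchronously and the caller waits on a handle for the result. It is written in Rust with only the standard library; `Worker`, `Task`, and `Pending` are hypothetical names, and an ordinary thread stands in for the GPU, so this shows the calling pattern only, not real device code:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// A unit of work for the stand-in device, plus a channel for the reply.
struct Task {
    input: Vec<f32>,
    reply: Sender<Vec<f32>>,
}

/// Handle to a result that may not be ready yet (the "Future").
struct Pending {
    rx: Receiver<Vec<f32>>,
}

impl Pending {
    /// Block until the worker has sent the result back.
    fn wait(self) -> Vec<f32> {
        self.rx.recv().expect("worker dropped the reply channel")
    }
}

/// Stand-in for a device: a thread that doubles every element it receives.
struct Worker {
    tx: Sender<Task>,
}

impl Worker {
    fn spawn() -> Worker {
        let (tx, rx) = channel::<Task>();
        thread::spawn(move || {
            // Process tasks in arrival order until every sender is dropped.
            for task in rx {
                let out: Vec<f32> = task.input.iter().map(|x| x * 2.0).collect();
                // Ignore callers that stopped waiting for their result.
                let _ = task.reply.send(out);
            }
        });
        Worker { tx }
    }

    /// Queue work without blocking; the result is collected later via `wait`.
    fn submit(&self, input: Vec<f32>) -> Pending {
        let (reply, rx) = channel();
        self.tx
            .send(Task { input, reply })
            .expect("worker thread exited");
        Pending { rx }
    }
}
```

A caller would do something like `let p = worker.submit(data);`, carry on with other work, and call `p.wait()` only when the result is needed, which makes the transfer latency explicit in the code.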


Regarding 3, can you make an issue or discourse post for discussion?

In that regard Julia is hardly any different to TypeScript.

Julia compilation time is much improved in 1.1 and they are working on tiered compilation to make it better.

It's fairly well accepted that Rust has a steep learning curve, and their targeted users are not software engineers, so I wouldn't say their point is nonsense.

> If Chris wants to use the language he created in this new endeavor for machine learning simply because he made it, that's totally fine and completely his prerogative, but he should just say so, rather than trying (and failing) to convince people that other languages aren't better suited for this task.

Do you have any insider knowledge that Chris Lattner had the unilateral power to choose Swift for this project? I would imagine with the importance of TensorFlow at Google, the decision to go in this direction had to be agreed on by a number of people.

> The learning curve of Rust should not be relevant here, compared to Swift, which is also full of idiosyncrasies. Swift and Rust both have a large learning curve for someone coming from Python.

How exactly would Rust-Python interoperability work? Swift for TensorFlow allows any Python library to be called like a native library in Swift. Could you do that in Rust?


> Could you do that in Rust?

Yes, and companies are even doing it in production, Sentry probably being the best known.


> I wonder if Swift could be replaced with Rust for iOS development?

If you like the pain of using an unsupported language without all the Xcode, UIBuilder, CoreData, Instruments, Metal Shaders debugging,... goodies then yes.


Chris Lattner is the driving technical force behind the project and he wrote Swift. So they were able to fix any issues with Swift, which makes the trade study “unfair” in that regard.

I still don't understand why they would choose Swift over C#?

They complain about C#/Java having "highly dynamic constructs", but correct me if I'm wrong: isn't Swift also a garbage-collected OOP language like Java and C#?

I don't think Swift has any inherent objective advantages over C#.

I think it would have been a better decision to go with C# over Swift, as Microsoft has a clear roadmap for the language and it is already supported on Linux/Mac/Windows.


I would rather claw my eyes out than write any ML stuff in C#.

It's a fine enterprise language, but good lord writing data science and machine learning stuff in it would be a right pain. It's also not super high performance, and when you're doing a lot of maths heavy operations, high performance is absolutely crucial. I had great difficulty establishing whether SIMD/vectorisation was even supported, and then even more difficulty getting it to work.

Julia would have been a far, far superior choice than Swift.


For data science and anything with a demanding domain model I find F# streets ahead of C#.

For a project like this, though, the F# type providers are a bit of a game changer that opens a lot of roads to create a 'best of both worlds' experience. For example, offloading heavy maths to other runtimes while providing a mature stack for everything outside of ML. The F# Type Provider for R (http://bluemountaincapital.github.io/FSharpRProvider/) is an example of this hybrid approach.

I believe Julia looks to be the better choice over Swift, tho.


SIMD has already been supported on RyuJIT for quite some time. It is quite easy to find out by searching the MSDN .NET Blog.

Its performance is good enough for doing medical digital imaging, as presented by Siemens at FOSDEM 2019.

It is a matter of properly using the features that the language gives us.


Yeah I found the blog posts, but then had the problem of “what compiler am I using now?” Was it the Roslyn one or RyuJIT? Does RyuJIT support .NET Core, or is it in Standard or Framework or one of the other seemingly limitless versions of .NET that exist for some reason?

Apparently I could use a library called Vectors, buried deep inside some numerical library, but then the runtime wouldn’t recognise the library’s existence despite it being a dependency and installed (and linked and every other thing you have to do to get .NET to do anything). After I fixed that issue it wouldn’t let me construct any arrays or anything.

Suffice to say, on top of C#/F# being painful to deal with at the best of times, attempting to do anything numerical was an absolute shit fight. I’m sure if you’ve got a whole team, you can make anything work, but for me it was not at all worth the effort.

When you consider I can get fully guaranteed SIMD (not just hoping the compiler chooses to optimise it right) in Julia practically for free, along with nicer syntax, 100% less namespacing hell, equal or greater performance, and far more data science and numerical packages, it’s hard to see what the draw of C# would be.


It appears to me that the issue was not being comfortable with the .NET ecosystem.

Roslyn and RyuJIT aren't the same thing. Roslyn is the new compiler infrastructure for MSIL generation, where the original C++ compiler got replaced by bootstrapped VB.NET and C# compilers.

RyuJIT is the new JIT compiler introduced in .NET 4.6, replacing the former JIT64.

I don't disagree that Julia is better suited for data science given the ecosystem, as shown by my other posts in this thread, just that the performance is also there when one wants it.


While I do think the choice of Swift is kind of weird, you are wrong about Swift being garbage collected (it uses Automatic Reference Counting). It also compiles to actual machine code (rather than an intermediate representation for use in a VM).

Reference counting is a garbage collection algorithm as per CS literature, you are mixing it up with tracing garbage collection algorithms.

Swift makes use of SIL and LLVM bitcode before the final binary is produced.

Likewise C# can be AOT compiled to actual machine code via NGEN, .NET Native, CoreRT and Mono/Xamarin.
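On the reference counting point above, a small sketch may help show what distinguishes it from tracing collection: the value is reclaimed at the exact moment the last reference goes away, with no separate collector pass. The example uses Rust's `Rc` as a stand-in for Swift's ARC, which is an analogy and an assumption on my part rather than a description of Swift's implementation; `Tracked` and `refcount_demo` are hypothetical names:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// A value that records the moment it is reclaimed.
struct Tracked<'a> {
    freed: &'a Cell<bool>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.freed.set(true);
    }
}

/// Returns (count while shared, freed after first drop, freed after last drop).
fn refcount_demo() -> (usize, bool, bool) {
    let freed = Cell::new(false);
    let first = Rc::new(Tracked { freed: &freed });
    let second = Rc::clone(&first); // reference count goes to 2
    let shared = Rc::strong_count(&first);

    drop(first); // count goes to 1, the value stays alive
    let after_first = freed.get();

    drop(second); // count goes to 0, the value is reclaimed right here
    let after_last = freed.get();

    (shared, after_first, after_last)
}
```

The trade-off, as usual with plain reference counting, is that cycles of strong references are never reclaimed, which is why both Rust and Swift offer weak references.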



