Maybe I'll give TF another try, but right now I'm really liking PyTorch. With TensorFlow I always felt like my models were buried deep in the machine and it was very hard to inspect and change them, and if I wanted to do something non-standard (which for me is most of the time) it was difficult even with Keras. With PyTorch, though, I connect things however I want, write whatever training logic I want, and I feel like my model is right in my hands. It's great for research and proofs-of-concept. Maybe for production too.
TF's deprecation velocity was way too high for my taste. Things we wrote would stop working randomly with their updates. I feel much the same as you about the models being "buried too deep" in their (ever-changing) machine. I much preferred how easy it was to hack Caffe V1 (once you got past the funky names, etc).
These days, I really like MXNet. Torch was a disaster, but PyTorch is much better. It's not bad in production, definitely my #2.
Maybe I'm not up to speed with the latest PyTorch, but to me Keras feels much more natural. In Keras, if you want to define a deep learning network, you just do that: you specify the first layer, the second layer, etc., then you calibrate over some test and validation samples, using a certain flavor of gradient descent, for a given loss function. In PyTorch, you have to define a class, with a constructor, some method called "forward", I don't know, maybe if I follow an example to the end I'll get the hang of it. My problem is that I don't want to write object-oriented programming, I want to do machine learning. Keras doesn't force me to know what a class is, or what it means to inherit from nn.Module, or that a method in Python needs 'self' as its first parameter. PyTorch, at least in the examples I saw online, wants me to do just that, and that's a turnoff.
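To show what I mean, here's the same tiny two-layer classifier in both APIs (a minimal sketch; the layer sizes are made up):

    # Keras: declare the layers in order, then compile.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

    # PyTorch: subclass nn.Module, create layers in __init__,
    # then wire them together in forward.
    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(10, 64)
            self.fc2 = nn.Linear(64, 2)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))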
On the other hand, in Keras I can't (easily) change the architecture of a learner after I've defined it. I can't prune some nodes and split others; maybe that's easy in PyTorch. If that's the case, I'll take a second look. Until then, I'm really tempted to invest some time in MXNet when I have a chance, as the book "Dive into Deep Learning" appears to be quite good.
in Keras I can't (easily) change the architecture of a learner after I've defined it
I'm not sure what you mean here, because it's precisely PT that lets you change the architecture after you define it, while TF/Keras uses a static precompiled graph. That's now changing with eager mode, but it used to be the main advantage of PT.
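For example, in PyTorch the graph is rebuilt on every forward pass, so it can depend on the data (a minimal sketch; the random depth is just for illustration):

    # A dynamic network: the graph depth varies from one forward pass to
    # the next, which a static precompiled graph can't express directly.
    import random
    import torch
    import torch.nn as nn

    class DynamicNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(8, 8)

        def forward(self, x):
            for _ in range(random.randint(1, 4)):  # depth chosen at run time
                x = torch.relu(self.layer(x))
            return x

    DynamicNet()(torch.randn(8))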
That was what I heard too, that in PyTorch you can adaptively build your graph. In Keras you could in principle emulate this in a brute-force way: you define the graph, train it, save the weights, analyze them, decide which nodes to remove and which to add, and then rebuild the graph from scratch, reusing the adjusted weights from the previous round (roughly the cycle sketched below). I say in principle; personally I never did it. Some people hint that in PT this is a breeze, and I'd be curious to see some example. If you have any links, that would be much, much appreciated.
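For what it's worth, the brute-force Keras cycle I mean might look roughly like this (a sketch, assuming a single hidden layer; the pruning threshold is made up):

    # Train, find near-dead hidden units, rebuild a smaller model
    # from scratch, and carry the surviving weights over.
    import numpy as np
    from tensorflow import keras

    def build(hidden):
        return keras.Sequential([
            keras.layers.Dense(hidden, activation="relu", input_shape=(10,)),
            keras.layers.Dense(1),
        ])

    model = build(32)
    model.compile(optimizer="adam", loss="mse")
    # ... model.fit(...) here ...

    w1, b1 = model.layers[0].get_weights()
    w2, b2 = model.layers[1].get_weights()
    keep = np.abs(w2).sum(axis=1) > 1e-3      # hidden units that still matter

    model2 = build(int(keep.sum()))           # rebuild with fewer units
    model2.layers[0].set_weights([w1[:, keep], b1[keep]])
    model2.layers[1].set_weights([w2[keep, :], b2])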
In a large NN a lot of nodes end up being useless. You can remove them without degrading the performance of the NN.
To be more concrete, here's a link [1] to Google's neural network playground. I built a network with 5 layers and 37 hidden nodes. It trains quite well, but the last layer has 2 nodes that contribute very little weight to the final output. The app allows you to change their weight (you click on the corresponding line and edit). If you change the weight to zero (effectively dropping the node), the classifier, if anything, gets better. My guess is that you can easily remove about half of the nodes. Conversely, if you look at the nodes with the highest outgoing weights, you can in principle clone them and halve the outgoing weight for both the original and the clone. With this configuration the network output is exactly the same, but if you continue training it allows more flexibility, as the original and the clone are allowed to diverge.
These kinds of operations are not possible in Keras. Are they in PyTorch? If not, then what kind of dynamic graphs are possible? What can one do with PyTorch that one can't do with Keras?
Your first example is commonly referred to as network pruning, and it's typically used to compress a model (the nodes are still there, but a network with sparse weights can be stored in compressed form). It's also possible to remove the nodes themselves, rather than individual weights. This is typically done at the filter level (for convnets), so that entire filters are removed.
The second example (cloning the nodes) is typically done to improve network robustness, by preventing important nodes from becoming a single point of failure.
To do either one during training you need dynamic graphs, so either PyTorch or TF eager mode. Here's one filter pruning implementation: https://github.com/jacobgil/pytorch-pruning
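And the two playground operations from upthread, done by hand in PyTorch (a minimal sketch; the layer sizes are made up):

    import torch
    import torch.nn as nn

    fc1 = nn.Linear(10, 5)   # hidden layer
    fc2 = nn.Linear(5, 1)    # output layer

    with torch.no_grad():
        # "Prune" hidden unit 3: zero its incoming and outgoing weights.
        fc1.weight[3].zero_(); fc1.bias[3].zero_()
        fc2.weight[:, 3].zero_()

        # "Clone" hidden unit 0: widen both layers by one unit, copy the
        # unit, and halve its outgoing weight for original and clone, so
        # the network output stays exactly the same.
        new_fc1 = nn.Linear(10, 6)
        new_fc1.weight[:5] = fc1.weight; new_fc1.bias[:5] = fc1.bias
        new_fc1.weight[5] = fc1.weight[0]; new_fc1.bias[5] = fc1.bias[0]

        new_fc2 = nn.Linear(6, 1)
        new_fc2.bias.copy_(fc2.bias)
        new_fc2.weight[:, :5] = fc2.weight
        new_fc2.weight[:, 0] = fc2.weight[:, 0] / 2
        new_fc2.weight[:, 5] = fc2.weight[:, 0] / 2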
Might give it another try, but my latest incursion into the TensorFlow universe did not end pleasantly. I ended up recoding everything in PyTorch; it took me less than a day to do the stuff that had taken me more than a week in TF. One problem is that there are too many ways to do the same thing in TF, and it's hard to transition from one to the other.
Yeah, the only reason to use TF is really its deployment friendliness. If PyTorch addressed that more comprehensively, there'd be no good reason to use TF at all. For research PyTorch blows TF out of the water completely, and it's been that way for years, ever since it came out.
For me it's deploying to mobile, mostly. There's ONNX, but it doesn't seem terribly mature, it doesn't support some of the common ops, and e.g. FB's own Caffe2 doesn't run it natively. There's also no mature tooling to produce quantized models. TF remains the only real option for quantization-aware training, or even for easy post-training quantization.
Specifically, my life would be a lot easier if I could save a mobilenet-style model to e.g. ONNX or some other static graph format that does not require model code in order to load the weights. I'd then like to be able to load this saved model directly into something on Android and iOS that can use the GPU and DSP present on the chip, with minimal extra futzing.
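The export half of that is already straightforward, at least for standard ops (a sketch using a stock torchvision MobileNet; it's the on-device loading and quantization side that's missing):

    # Trace a MobileNet and save it as a static ONNX graph that can be
    # loaded without the original model code.
    import torch
    import torchvision

    model = torchvision.models.mobilenet_v2(pretrained=True).eval()
    dummy = torch.randn(1, 3, 224, 224)   # example input for tracing
    torch.onnx.export(model, dummy, "mobilenet.onnx",
                      input_names=["image"], output_names=["logits"])

By contrast, TF's post-training quantization path really is a couple of lines (again a sketch, with a toy stand-in model):

    # Weight quantization via the TFLite converter.
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(10,))])
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    open("model.tflite", "wb").write(converter.convert())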
One of the major benefits of TF 2.0 is apparently the capability to quickly deploy to TPUs with a single parameter change. (I haven't tried it; I've just followed the marketing.)
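The marketed path looks roughly like this (a sketch I haven't run; the TPU address is a placeholder):

    # TF 2.x distribution-strategy route: the model code stays the same,
    # only the strategy (and the cluster address) changes.
    import tensorflow as tf

    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="grpc://...")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.experimental.TPUStrategy(resolver)

    with strategy.scope():
        model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(10,))])
        model.compile(optimizer="sgd", loss="mse")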
AFAIK, this is still being worked on for PyTorch via XLA, but it's not quite there yet.
I've found that with TF in general you can only go "quickly" if everything works. If anything is busted you're more or less screwed, because it's so opaque. In contrast, PyTorch lets you inspect whatever you want by setting a pdb breakpoint, and when it gives you errors you can usually figure out what's wrong without debugging. The importance of this cannot be overstated.
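That workflow is literally just a breakpoint in forward (a minimal sketch):

    # Pause mid-forward and poke at the live tensors.
    import pdb
    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 2)

        def forward(self, x):
            h = self.fc(x)
            pdb.set_trace()   # inspect h.shape, h.mean(), x, self.fc.weight, ...
            return h

    Net()(torch.randn(4, 10))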
This is one thing that confuses me. Why is Keras still a separate brand? Why isn't everything under the plain tensorflow namespace, instead of having to do tf.keras all the time? I really wish TF just had one API and just one thing to learn.
I doubt it; it's more likely that they're creating higher-level abstractions atop the lower-level ones and advertising/documenting the higher-level ones.
Keras is part of the problem for me. It's rather rigid and hard to get around. It works super well for the regular use case; on the other hand, when you want to start doing custom stuff, it's hell.
I am very happy that Google has realized the importance of usability. Hopefully that comes with concomitant improvements in the TF documentation, which, while thorough, is completely unusable and lacks good examples for complex things.
I have spent a few evenings playing with early releases. I like the ‘turtles all the way down’ idea, but I am waiting to see more mature releases. I have spent much more time with TensorFlow.js, which works well and has many great examples.
From what I understand this is mostly because they hired the Swift guy.
I understand the benefits compared to Python (although I would have preferred Go or Kotlin). But what happens when the guy eventually moves on in a year or two?
That is not what their employees talk about at CppCon, LLVM conferences, ISO C++ meetings, or the Java Language Summit (yes, they come around in spite of Android).
Nice to see the project moving along. I'm just getting started with the basics for a wayfinding application, so I'll probably start off with version 2 then.
Hopefully by the time stable comes around I'll be near production ready as well.
A bit off-topic, but does TF or PyTorch work nicely with AMD GPUs?
I'd rather not have to deal with Nvidia's blob drivers if at all possible.