Hacker News
Tensorflow 2.0 Beta 0 (github.com/tensorflow)
109 points by Gimpei on June 7, 2019 | 43 comments



Maybe I'll give TF another try, but right now I'm really liking PyTorch. With TensorFlow I always felt like my models were buried deep in the machine and it was very hard to inspect and change them, and if I wanted to do something non-standard (which for me is most of the time) it was difficult even with Keras. With PyTorch, though, I connect things however I want, write whatever training logic I want, and I feel like my model is right in my hands. It's great for research and proofs-of-concept. Maybe for production too.


TF's deprecation velocity was way too high for my taste. Things we wrote would randomly stop working with their updates. I feel much the same as you about the models being "buried too deep" in their (ever-changing) machine. I much preferred how easy it was to hack Caffe V1 (once you got past the funky names, etc.).

These days, I really like mxnet. Torch was a disaster, but Pytorch is much better. It's not bad in production, definitely my #2.


> TF's deprecation velocity

That's Google in a nutshell. In fact, they may drop TF altogether next month. You never know ...


I am curious, what do you like about mxnet?


Can't explain it, but for some reason TensorFlow never felt "right" to me, even with Keras.

Pytorch on the other hand feels so much more natural...


Maybe I'm not up to speed with the latest PyTorch, but to me Keras feels much more natural. In Keras, if you want to define a deep learning network, you just do that: you specify the first layer, the second layer, etc., then you calibrate over some test and validation samples, using a certain flavor of gradient descent, for a given loss function. In PyTorch, you have to define a class, with a constructor, some method called "forward"; I don't know, maybe if I followed an example to the end I'd get the hang of it. My problem is that I don't want to write object-oriented programming, I want to do machine learning. Keras doesn't force me to know what a class is, or what it means to inherit from nn.Module, or that a constructor in Python needs to take 'self' as an argument. PyTorch, at least in the examples I saw online, wants me to do just that, and that's a turnoff.

On the other hand, in Keras I can't (easily) change the architecture of a learner after I defined it. I can't prune some nodes and split others; maybe that's easy in PyTorch. If that's the case, I'll take a second look. Until then, when I have some time, I'm really tempted to invest in MXNet, as the book "Dive into Deep Learning" appears to be quite good.
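
To make the style difference described above concrete, here's a rough side-by-side of the same tiny network in both frameworks (layer sizes are arbitrary; this is only a sketch). For what it's worth, PyTorch also has nn.Sequential for the plain layer-stacking case, so a class isn't strictly required:

    import tensorflow as tf
    import torch
    import torch.nn as nn

    # Keras: declare the layers in order, then compile and fit.
    keras_model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
        tf.keras.layers.Dense(1),
    ])
    keras_model.compile(optimizer='sgd', loss='mse')

    # PyTorch, class style: subclass nn.Module and write the forward pass yourself.
    class TorchModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(32, 64)
            self.fc2 = nn.Linear(64, 1)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))

    # PyTorch, sequential style: no class needed for a simple stack of layers.
    torch_model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))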


> in Keras I can't (easily) change the architecture of a learner after I defined it

I'm not sure what you mean here, because only PT lets you change architecture after you define it, while TF/Keras uses a static precompiled graph. Now that's changing with eager mode, but that used to be the main advantage of PT.
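
To make "dynamic graph" concrete, here's a minimal PyTorch sketch (toy model invented for illustration) where the forward pass is ordinary Python, so the graph is rebuilt on every call and can depend on the data:

    import torch
    import torch.nn as nn

    class DynamicNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(16, 16)
            self.head = nn.Linear(16, 1)

        def forward(self, x):
            # Apply the shared layer a data-dependent number of times;
            # plain Python control flow decides the graph structure.
            steps = int(x.abs().mean() * 3) + 1
            for _ in range(steps):
                x = torch.relu(self.layer(x))
            return self.head(x)

    model = DynamicNet()
    out = model(torch.randn(8, 16))  # a fresh graph is recorded by autograd on each call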


That was what I heard too, that in PyTorch you can build your graph adaptively. In Keras you could in principle emulate this in a brute-force way: you define the graph, train it, save the weights, analyze them, decide which nodes to remove and which to add, and then rebuild the graph from scratch and use the adjusted weights from the previous round. I say in principle; personally I never did it. Some people hint that in PT this is a breeze, and I'd be curious to see some example. If you have any links, that would be much, much appreciated.


Again, not sure I understand your example. What do you want to do?


In a large NN a lot of nodes end up being useless. You can remove them without degrading the performance of the NN.

To be more concrete, here's a link [1] to Google's neural network playground. I built a network with 5 layers and 37 hidden nodes. It trains quite well, but the last layer has 2 nodes that contribute very little weight to the final output. The app allows you to change their weight (you click on the corresponding line and edit). If you change the weight to zero (effectively dropping the node), the classifier, if anything, gets better. My guess is that you can easily remove about half of the nodes. Conversely, if you look at the nodes with the highest outgoing weights, you can in principle clone them and halve the outgoing weight for both the original and the clone. With this configuration, the network output is exactly the same, but if you continue training, it allows more flexibility, as the original and the clone are allowed to diverge.

These kinds of operations are not possible in Keras. Are they in PyTorch? If not, then what kind of dynamic graphs are possible? What can one do with PyTorch that one can't do with Keras?

[1] https://playground.tensorflow.org/#activation=relu&regulariz...


Your first example is commonly referred to as network pruning, and is typically used to compress a model (the nodes are still there, but a sparse weights network can be stored in compressed form). It's also possible to remove nodes themselves, rather than individual weights. This is typically done on a filter level (for convnets), so that entire filters are removed.

The second example (cloning the nodes) is typically performed to improve network robustness (by preventing important nodes from becoming a single point of failure).

To do either one during training you need dynamic graphs, so either PyTorch or TF eager mode. Here's one filter pruning implementation: https://github.com/jacobgil/pytorch-pruning
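
The linked repo prunes whole filters; for the simpler weight-level case, a minimal manual sketch in PyTorch (toy model, arbitrary threshold) looks roughly like this:

    import torch
    import torch.nn as nn

    # Toy model purely for illustration.
    model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

    with torch.no_grad():
        w = model[0].weight                    # weights of the first layer
        k = w.numel() // 2                     # drop the smaller half of the weights
        threshold = w.abs().flatten().kthvalue(k).values
        mask = (w.abs() > threshold).float()
        w.mul_(mask)                           # pruned weights become exactly zero

    # Training can simply continue afterwards; to keep pruned weights at zero,
    # re-apply the mask after each optimizer step (or use a gradient hook).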


I just hope they don't screw it up in the process of integrating PT with Caffe2.


Might give it another try, but my latest incursion into the TensorFlow universe did not end pleasantly. I ended up recoding everything in PyTorch; it took me less than a day to do the stuff that had taken me more than a week in TF. One problem is that there are too many ways to do the same thing in TF, and it's hard to transition from one to the other.


Yeah, the only reason to use TF is really its deployment friendliness. If PyTorch addressed that more comprehensively, there'd be no good reason to use TF at all. For research PyTorch blows TF out of the water completely, and it's been that way for years, ever since it came out.


What are you looking for in deployment friendliness? There's TorchScript to run your code faster (which is a work in progress)
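
In case it helps, the TorchScript export step looks roughly like this (toy model; models with data-dependent control flow need torch.jit.script rather than tracing):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
    example_input = torch.randn(1, 32)

    traced = torch.jit.trace(model, example_input)  # record the ops for this example input
    traced.save("model.pt")                         # self-contained archive, no Python class needed

    loaded = torch.jit.load("model.pt")             # can also be loaded from C++ via libtorch
    print(loaded(example_input).shape)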


For me it's deploying to mobile, mostly. There's ONNX but it doesn't seem to be terribly mature and it doesn't support some of the common ops, and e.g. FB's own Caffe2 doesn't run it natively. There's also no mature tooling to produce quantized models. TF remains the only real option to do quantization aware training or even easy post-training quantization.

Specifically, my life would be a lot easier if I could save a mobilenet-style model to e.g. ONNX or some other static graph format that does not require model code in order to load weights. I would like then to be able to load this saved model directly into something on Android and iOS that can use GPU and DSP present on the chip, with minimal extra futzing.
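
For what it's worth, the export call itself is short; it's the downstream op coverage and mobile runtimes that are the pain point described above. A rough sketch, using torchvision's mobilenet_v2 purely as an example:

    import torch
    import torchvision

    model = torchvision.models.mobilenet_v2(pretrained=True).eval()
    dummy_input = torch.randn(1, 3, 224, 224)

    # Writes a static ONNX graph that can be loaded without the model code,
    # e.g. by ONNX Runtime or a mobile inference engine, subject to op support.
    torch.onnx.export(
        model,
        dummy_input,
        "mobilenet_v2.onnx",
        input_names=["input"],
        output_names=["output"],
    )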


Seriously, I've been waiting for a long time now for this to come about. This would make pytorch a much more powerful platform.


One of the major benefits of TF 2.0 is apparently the capability to quickly deploy to TPUs with a single parameter change. (I haven't tried it, just followed the marketing.)

AFAIK, this is still being worked on for PyTorch via XLA, but it's not quite there yet.
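
Presumably the "single parameter change" refers to TF 2.x's distribution strategies: the model code inside the scope stays the same and only the strategy changes. A hedged sketch (the TPU address is hypothetical, and the exact setup calls have shifted between TF releases):

    import tensorflow as tf

    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://10.0.0.1:8470')
    # Depending on the release you may also need
    # tf.config.experimental_connect_to_cluster(resolver) here.
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.experimental.TPUStrategy(resolver)

    # Swapping in MirroredStrategy (multi-GPU) or the default strategy
    # leaves everything inside the scope unchanged.
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer='adam', loss='mse')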


I've found that with TF in general you can only go "quickly" if everything works. If anything is busted you're more or less screwed, because it's so opaque. In contrast, PyTorch lets you inspect whatever you want by setting a pdb breakpoint, and when it gives you errors you can usually figure out what's wrong without debugging. The importance of this cannot be overstated.
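
A tiny sketch of that debugging workflow (hypothetical toy model): drop a breakpoint anywhere in the forward pass and poke at the live tensors interactively.

    import pdb
    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 2)

        def forward(self, x):
            h = self.fc(x)
            pdb.set_trace()   # inspect h.shape, h.mean(), self.fc.weight, etc.
            return torch.relu(h)

    Net()(torch.randn(4, 10))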


What do you think of Keras in this space? Because TF 2.0 is entirely Keras-based.

https://medium.com/tensorflow/standardizing-on-keras-guidanc...


This is one thing that confuses me. Why is Keras still a separate brand? Why isn't everything under just the tensorflow namespace, instead of having to do tf.keras all the time? I really wish TF just had one API and just one thing to learn.


Keras is a high level API that can use multiple backends. So it makes sense for them to remain separated.


I haven't used TF much lately, but the last time I looked at TF2 it felt like they were making it harder to build low-level API models.


I doubt it; it's more likely that they're creating higher-level abstractions atop the lower-level ones and advertising/documenting the higher-level ones.


Keras is part of the problem for me. It's rather rigid and hard to get around. It works super well for the regular use case. On the other hand, when you want to start doing custom stuff, it's hell.


I am very happy that Google has realized the importance of usability. Hopefully that comes with concomitant improvements in the TF documentation, which, while thorough, is completely unusable and lacks good examples for complex things.


Any news on Swift for Tensorflow?

I’m skeptical of how much practical benefit it will provide but still willing to take a look at it.

There doesn’t seem to be any mention of it here.


I have spent a few evenings playing with early releases. I like the 'turtles all the way down' idea, but I am waiting to see more mature releases. I have spent much more time with TensorFlow.js, which works well and has many great examples.


From what I understand this is mostly because they hired the Swift guy.

I understand the benefits compared to Python (although I would have preferred Go or Kotlin). But what happens when the guy eventually moves on in a year or two?


Google's making a big investment in Swift, so if Chris left and they were interested in continuing to support it, they shouldn't have a problem.

I've gone to Swift on the Server conferences hosted/sponsored by Google; their (non-TF) Swift teams are building some cool Swift tools, etc.


Google is a Go shop, especially when it comes to servers.

Swift is not properly supported on Linux, which is Google's main platform.


This is incorrect; Swift is open source and has Linux deployments: https://swift.org/download/


Linux support is still WIP. Some parts are missing.


Google is not at all a Go shop. Google is a C++ shop, especially when it comes to servers.


(that contradicts your earlier comment)

Google has been gradually moving away from C++ and Java since 2012. See this Quora post with multiple references from Google employees.

https://www.quora.com/How-is-Go-used-at-Google-What-could-be...


That is not what their employees talk about at CppCon, LLVM conferences, ISO C++ meetings, or the Java Language Summit (yes, they come around in spite of Android).

Go is mostly a Docker/Kubernetes thing.


The fact that most Google code is C++ does not at all contradict that they are invested in Swift.


Google is more of a C++/LLVM, Java/Kotlin, and Python shop than a Go one.

You will even notice that it is seldom supported when they announce SDKs for new server products.


Nice to see the project moving along. I'm just getting started with the basics for a wayfinding application and will probably start off with version 2 then.

Hopefully by the time stable comes around I'll be near production ready as well.

A bit off-topic, but does TF or pyTorch work nicely with AMD GPUs?

I'd rather not have to deal with Nvidia's blob drivers if at all possible.


> does TF or pyTorch work nicely with AMD GPUs?

No.


Does anyone know when there'll be updated Coursera (or other sites) courses with TF 2.0?


They have one on Udacity.



