
Eager Execution: An imperative, define-by-run interface to TensorFlow - alextp
https://research.googleblog.com/2017/10/eager-execution-imperative-define-by.html
======
alextp
You can read more about it in the blog post (
[https://research.googleblog.com/2017/10/eager-execution-
impe...](https://research.googleblog.com/2017/10/eager-execution-imperative-
define-by.html) ) or the README (
[https://github.com/tensorflow/tensorflow/tree/master/tensorf...](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/README.md)
). This is still a preview release, so you may hit some rough edges.

Looking forward to your feedback as you try it out.
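For readers unfamiliar with the terminology, here is a toy sketch in plain Python (no TensorFlow required; the `Node` class and `eager_model` are invented for illustration, not real APIs) of the difference between define-then-run graphs and the define-by-run style eager execution enables:

```python
# Define-then-run: build a graph of operations first, execute later.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def run(self, feed):
        if self.op == "input":
            return feed[self.inputs[0]]
        vals = [n.run(feed) for n in self.inputs]
        return vals[0] + vals[1] if self.op == "add" else vals[0] * vals[1]

x = Node("input", "x")
graph = Node("mul", Node("add", x, x), x)  # (x + x) * x -- nothing computed yet
print(graph.run({"x": 3.0}))               # values only exist at run time -> 18.0

# Define-by-run (eager): operations execute immediately, so ordinary
# Python control flow, print(), and pdb just work on intermediate values.
def eager_model(x):
    y = x + x      # computed right here
    return y * x

print(eager_model(3.0))                    # -> 18.0
```

The point of eager execution is that model code behaves like the second style while still dispatching to the same TensorFlow kernels.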

~~~
josh11b
I'm on the team that worked on this -- happy to answer questions!

~~~
gormanc
Hot damn this has got me all giddy. How will this work on single node multi-
GPU systems? For example, with PyTorch you have to either use threading,
multiprocessing, or even MPI. Can you think of a not-too-scary way to use
eager execution with multiple GPUs?

~~~
alextp
We're still fairly early in the project, so for now threading is the only
supported way.

We can do better, however, and we're working on ways to leverage the hardware
better (for example, if you have no data-dependent choices in your model we
can enqueue kernels in parallel on all GPUs in your machine at once from a
single python thread, which will perform much better than explicit python
multithreading).

Stay on the lookout as we release new experimental APIs to leverage multiple
GPUs and multiple machines.

------
chrisprobert
Announcing TensorFlow's new development roadmap mandate: copy everything
PyTorch is doing :-)

~~~
ychujo2
I think you mean Google is following the leadership of Chainer, like Facebook
already does? PyTorch started as a Chainer fork. Its dynamic graph internals
are all from Chainer.

~~~
bradleyjg
This isn't art. There are no points for originality. If open source projects
borrow the best parts from each other, that's a good thing.

~~~
ychujo2
It's not a bad thing. It's good for users. But give credit to the leaders in
the field. If you make an iPod clone, you call it an iPod clone, not a clone
of the Zune HD.

Chainer started it, was around years earlier, and it still has more users. So
Google is not copying PyTorch, it's copying Chainer.

~~~
arbie
Do you have a source for Chainer currently having more users than TensorFlow?

~~~
Narew
Not more users than Tensorflow. But maybe more users than other dynamic
deep learning frameworks (PyTorch, Gluon, DyNet...)

------
congerous
TensorFlow: everything to all people.

Eager is actually not as innocent as "open-source projects borrowing the best
parts from each other", as some commenters here suggest.

Google is attempting to dominate the machine-learning API and the Python
ecosystem for scientific computing.

The company that controls the API influences which apps are built on it and
how. Think about how Google bundled Android services on top of Android, and
how that posed an existential threat to other companies. That's what's coming
for TensorFlow. Many developers are too naive to realize it, or too short-
sighted to care.

~~~
tree_of_item
Huh? They're attempting to dominate the machine learning ecosystem by writing
a bunch of free and high quality machine learning libraries? What exactly are
they doing wrong?

I wouldn't compare a permissively licensed library to Android services at all.

~~~
congerous
I'm surprised I have to write this, but Google is not a charity. They are
pouring commercial resources into Tensorflow for a reason. That reason is
Google Cloud. Tensorflow is a Trojan horse to get people to use Google Cloud
and other paid Google products. How do I know this? Because Tensorflow works
better on Google Cloud than anywhere else, and Google is making a concerted
effort to catch up with AWS in cloud, mostly through machine learning.

I didn't compare Tensorflow to Android services. I said that Tensorflow would
serve as the basis of a service bundle, much like Android did. Let's come back
in a couple years and I'll tell you I told you so.

~~~
matt4077
> I'm surprised I have to write this,

Insulting the reader

> but Google is not a charity

truism

> They are pouring commercial resources...

As opposed to "non-commercial resources"?

> ... for a reason.

Everything happens for a reason.

> That reason is Google Cloud.

> How do I know this?

Pray tell!

> Because Tensorflow works better on Google Cloud than anywhere else.

This is the only real argument in this conspiracy. And if "anywhere" includes
the users' hardware, it's wrong: tensorflow runs flawlessly on any
Linux/NVIDIA hardware. Maybe it works better with GCE than AWS, but that would
once again fall into that "rather unsurprising" category of factoids.

> Google is making a concerted effort to catch up with AWS in cloud, mostly
> through machine learning.

This can be re-written as "Google has a cloud offering, which it tries to
sell. And right now, machine learning is pretty hot". Throwing a "concerted
effort" in there is just trying to jazz it up to something ominous. Which it
isn't.

> I didn't compare Tensorflow to Android services. I said that Tensorflow
> would serve as the basis of a service bundle, much like Android did.

"The basis of a service bundle" actually doesn't sound that scary. Nobody is
disputing that Google offers services built on tensorflow. It just isn't any
sort of "Trojan horse" conspiracy, and it is somewhat limited by the fact that
tensorflow is OSS licensed and could be forked by anybody if people suddenly
find out it's full of geek soldiers.

~~~
congerous
> truism

Maybe, but people in this thread treat Tensorflow's creation as an act of
simple altruism.

> And if "anywhere" includes the users' hardware, it's wrong: tensorflow runs
> flawlessly on any Linux/NVIDIA hardware. Maybe it works better with GCE than
> AWS, but that would once again fall into that "rather unsurprising" category
> of factoids.

Sorry, Tensorflow is slow on GPUs compared to other frameworks. This is not
just an early blip, it's a consistent pattern that has been repeatedly
demonstrated. Why is Tensorflow slow on commodity hardware? Why isn't Google,
with its infinite resources, making Tensorflow run as fast as other frameworks
on GPUs? Because it needs to demonstrate an advantage on the Google Cloud with
TPUs.

On that cloud, it surrounds Tensorflow with other functionality that makes it
easy to build AI, functionality which isn't part of the Tensorflow project.
Tensorflow is hard and inefficient to serve for inference, for example.

Machine learning is Google cloud's only hope to salvage Diane Greene's efforts
and extend their dominance to a new sector. They're running a distant fourth.

> actually doesn't sound that scary.

It sounds scary to a lot of companies that don't want to be controlled or
destroyed by Google. But by all means, lend them a hand, geek soldier.

------
sandGorgon
Hey guys, if I could request... Please fix the serialization story for
tensorflow. There are 6 googleable methods to export from tensorflow, and
nobody knows what will work on the cloud, what can be exported from cloudml,
and what can be loaded on Android.

It has to be consistent and there has to be one way to do it.

I personally have a 10 message thread with Google cloud support on exporting a
Cloud trained model to tensorflow and nobody could figure it out [Case
#13619720].

~~~
alextp
Did you try using SavedModel? It should be seamless to use downstream with
tensorflow serving and it's not that hard to get estimators to spit those out.

~~~
sandGorgon
I really wish.
[https://github.com/tensorflow/tensorflow/issues/12750](https://github.com/tensorflow/tensorflow/issues/12750)

In fact if you dig up the case, then even official support told me that
savedmodel needs some freezing using bazel otherwise it doesn't work.

The github page and stackoverflow are full of these. If you can, please take
the message to the other side :(

I don't think the cloud guys (where training will happen in distributed mode)
talk to the android guys (where models will be used after quantization). There
is a huge serialization problem that all of us are currently struggling with.

~~~
alextp
Ah, I didn't know SavedModel didn't work on Android. I think freezing is still
the way to go there? I'm sorry, I don't personally work on the mobile side of
things.

~~~
sandGorgon
I should apologize for hijacking this thread(and i'll stop here). But
Tensorflow is getting to be unusable because of the serialization story. We
don't have such issues on Caffe2 or anywhere else. It essentially means
different parts of the tensorflow ecosystem are unable to talk to each other.

I really pray the tensorflow teams give it due importance.

~~~
petewarden
I'm the original author of the freeze_graph script, so I'm to blame for a lot
of the ongoing mess here. For what it's worth, I'm actively working on
cleaning this up, since I know what a painful experience it is. Apologies to
everyone who's struggled with this, and I will take a look at the case number
mentioned above and follow up internally to see if there's anything I can help
with.
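For anyone following along who hasn't hit this yet, here is a toy plain-Python sketch of what "freezing" a graph means conceptually (this is not the real freeze_graph tool, and the graph/checkpoint structures are invented for the sketch): variable nodes are replaced by constants holding their checkpointed values, so the result is self-contained for deployment.

```python
def freeze(graph, variable_values):
    """Replace ("var", name) nodes with ("const", value) nodes."""
    frozen = {}
    for name, (kind, payload) in graph.items():
        if kind == "var":
            # Fold the variable's current value into the graph itself.
            frozen[name] = ("const", variable_values[payload])
        else:
            frozen[name] = (kind, payload)
    return frozen

# A tiny "graph" with two trainable variables and one op.
graph = {"w": ("var", "weights"), "b": ("var", "bias"), "y": ("op", "w*x+b")}
checkpoint = {"weights": 0.5, "bias": 1.0}

frozen = freeze(graph, checkpoint)
print(frozen["w"])  # ('const', 0.5) -- no separate checkpoint needed at inference
```

The pain being discussed in this thread is that this folding step historically required a separate Bazel-built tool rather than being a first-class part of the export API.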

~~~
sandGorgon
Thanks for this! I would like to bring two things to your attention :

1\. We don't know what to use and it's very confusing. For example, now there
is [https://stackoverflow.com/questions/42216208/should-
tensorfl...](https://stackoverflow.com/questions/42216208/should-tensorflow-
users-prefer-savedmodel-over-checkpoint-or-graphdef). Will freeze_graph become
canonical and we forget about SavedModel? And everything else deprecated? It
should be part of the core API and workable on CloudML, where we don't have a
lot of control over running scripts and certainly not Bazel builds.

2\. The Android/iOS story. Now you have the Pixel Visual Core as well... Please
make it seamless all the way to Android or iOS or Raspberry Pi (whatever you
guys support).

