
Tensorflow 2.0: models migration and new design - formalsystem
https://pgaleone.eu/tensorflow/gan/2018/11/04/tensorflow-2-models-migration-and-new-design/
======
CretinDesAlpes
I used TensorFlow from 2016 to 2018 and switched to PyTorch a few months ago.
My theory is that TF is used mainly because it is supported by Google, even
though it is really badly designed for practitioners. In fact, TF was not
meant to be public; it was at first an internal tool at Google. This would
explain why people at Google itself started to develop Keras or Sonnet
(DeepMind) at some point.

For anyone interested in deep learning research in a Python/NumPy-like
environment, I can only recommend switching.

~~~
kajecounterhack
Keras wasn't developed by Google, it was adopted by TF.

> My theory is that TF is used mainly because it is supported by Google, even
> if it is really badly designed for practitioners.

This is not correct. TF is mainly used because it's designed for industrial-
strength machine learning. It provides primitives with an eye to scale from
the outset (because it's Google).

It's probably true that prototyping / research was not the main audience.
That's exactly why Keras was adopted, as well as features like tf.eager, to
abstract away the underlying computation graphs and make it easy for people to
try different things.

Well-designed primitives / abstractions are important; Tensorflow does this
well.

~~~
buboard
for something to be "industrial" it has to be at least stable and well
documented. tf doesn't feel like that.

~~~
kajecounterhack
Care to point out some examples? (I use TF every day and it feels fine to me,
but I'm sure the TF team is curious where others see areas for improvement.)

------
LeanderK
> Tensorflow 2.0 will be a major milestone for the most popular machine
> learning framework: lots of changes are coming, and all with the aim of
> making ML accessible to everyone. These changes, however, require old users
> to completely re-learn how to use the framework

It never felt like I _wasn't_ re-learning TensorFlow! A constant series of
breakages, deprecations, new APIs, etc.

~~~
FridgeSeal
Not to mention terrible documentation.

I think that’s a property of Google though, because there’s never been a
google service for which I’ve read the documentation and not been like “I do
not understand what’s going on at all” for a non-trivial amount of time.

And all their examples for Python are just like “run this magic-code-ridden
python file, congrats you did the tutorial”.

~~~
luckydata
Ok, this is very useful to read. I've also felt like that lately and just
thought I was an idiot. Now maybe we are just two idiots, or maybe the
documentation could use work.

~~~
p1esk
Well, you should not start with the TF documentation if you don't know how
basic deep learning models work. Start with online courses.

~~~
luckydata
I wasn't talking about deep learning at all. I think most Google technical
docs suffer from a bit of the "and now draw the rest of the owl" style.

------
maaaats
Not trying to start a language war, but I would like either the old or the new
API better if it had static types. Right now it's thousands of functions that
can take thousands of different objects, and I would never know what to use
except for the few I happened to learn in a tutorial.

~~~
me2too
But in reality the API is strictly statically typed. The core is written in
C++, and every operation requires inputs of a well-known type, as you can see
from the error raised in this example:
[https://pgaleone.eu/tensorflow/go/2017/05/29/understanding-t...](https://pgaleone.eu/tensorflow/go/2017/05/29/understanding-tensorflow-using-go/#question-time-1)
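
The point can be illustrated without TensorFlow itself. Here is a toy,
pure-Python sketch (all names are made up; this is not the TF API) of the
idea that each op kernel declares the input dtypes it accepts and rejects
mismatches before anything runs, much like the C++ core does when it raises
the error in the linked post:

```python
# Toy analogy, NOT TensorFlow code: an "op kernel" declares the dtype it
# accepts, and mismatched inputs fail with a type error before any
# computation happens -- much like TF's statically typed C++ core.

def matmul_kernel(a, b, dtype="float32"):
    """Hypothetical kernel: both inputs must carry the declared dtype."""
    for name, tensor in (("a", a), ("b", b)):
        if tensor["dtype"] != dtype:
            raise TypeError(
                f"input {name!r} has dtype {tensor['dtype']}, expected {dtype}"
            )
    # The actual multiply is elided; we only model the type check.
    return {"dtype": dtype, "shape": (a["shape"][0], b["shape"][1])}

x = {"dtype": "float32", "shape": (2, 3)}
y = {"dtype": "int32", "shape": (3, 4)}   # wrong dtype on purpose

try:
    matmul_kernel(x, y)
except TypeError as err:
    print(err)  # the op refuses to run on mismatched dtypes
```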

------
friendshaver
This is a great development for both new and experienced users. Most
importantly, it seems to be an effort to introduce a canonical way to build a
given network in Tensorflow, as opposed to the previous era where any given TF
repo might implement the same network in one of several completely different
ways (tf.contrib.slim, tf.layers, etc). Hopefully, defining a standard way to
build models will also help accelerate the standardization of train/test
scripts and data pipelines, putting to rest the era of models being deeply
infected with the execution structure built around them.

Overall, it looks like Pytorch is forcing TF devs to focus more on users and
usability, and I'm excited to see how they continue to spur each other's
growth.

~~~
p1esk
_effort to introduce a canonical way_

I'm pretty sure there will be plenty of incompatible API changes in every 2.*
release, and a completely new "canonical" way introduced in TF 3.0.

------
kodablah
> Support for more platforms and languages, and improved compatibility and
> parity between these components via standardization on exchange formats and
> alignment of APIs.

Does this mean we can get a pure C API for training and running? Right now,
IIRC, you can only train in Python and C++, which ignores a large part of the
programming community. If not a pure C FFI API, what approach is being taken?

~~~
danieldk
The C API permits training and running. But you have to define and serialize
the graph in Python (or C++).

(Of course, you could also write Tensorflow protobuf directly, but that would
be tedious.)

~~~
kodablah
Pardon my naivete: so when not reusing others' graphs/models, to get the most
out of TF you are basically forced to use one of those two languages? With TF
2 claiming more platforms/languages, will this no longer be the case?

------
nshm
How about making it more stable? Kind of tired of closing 10 issues a day on
GitHub because the TensorFlow developers decided to rename a variable.
Seriously looking at PyTorch.

~~~
juanuys
Stable, and complete:

[https://github.com/tensorflow/tensorflow/issues/14248](https://github.com/tensorflow/tensorflow/issues/14248)

------
beefsack
Tangential, but as someone in the southern hemisphere it's frustrating when
people use seasons for timelines. "Q2 2019" is immediately understandable, and
saves the mental gymnastics of converting a season.

------
qnsi
Feels bad man. All tutorials will be outdated.

------
sytelus
This:
[https://twitter.com/fchollet/status/1052228463300493312](https://twitter.com/fchollet/status/1052228463300493312)

------
davmar
as if it wasn't already confusing enough. should i use tensorflow hub? TF
slim? keras applications that import tensorflow?

should i still use those outdated TF slim models that are for some reason kept
in models/inception?

please, tensorflow team, make this stuff easier for us non-PhDs.

~~~
etaioinshrdlu
PhDs have as much trouble as you do.

------
amelius
It still uses dataflow graphs :(

As a programmer, I don't want to think in terms of dataflow graphs. That is
what compilers are for!

~~~
me2too
On the contrary, I love the graph.

That said, if you migrate to a Keras-like approach you can think in terms of
"objects with variables inside", and the graph will be built for you by the
abstraction that `tf.keras.Model` introduces.

Note, however, that a graph is always required for automatic differentiation
(as you can see from the example that uses eager execution).

~~~
amelius
> On the contrary, I love the graph.

Can you explain what you like so much about manipulating graphs, over just
writing statements like z = x⋆y, where ⋆ is some tensor operation?

Really, the graph makes me feel like I'm stacking Lego bricks with chopsticks,
rather than with my bare hands ;)

~~~
me2too
The fact that I have all the computation described in a coherent manner, in
something that's agnostic to the language I'm using.

The same description (the graph) can be taken and used from any other
language. I can move a trained model to production just by picking up a single
file (a `tf.train.SavedModel`, IIRC), giving it a tag, and I'm ready to go,
with native support for tagging different models (and hence for model
versioning).

~~~
amelius
Whether that's an advantage depends on your perspective, because what has
really happened is that you have now created a new language inside the
original language.

------
breatheoften
I think they should really consider making a clean JavaScript/TypeScript API
first, use and test that for a while, then port that design to Python for a
2.0 of the Python API...

The worst thing about TensorFlow is being locked into using Python for graph
description and training (in my opinion)...

~~~
mark_l_watson
Try TensorFlow.js - very nice, lots of examples included, easy to set up.

------
ericfriday
This reminds me that Keras's creator wrote a blog post on how to design APIs
for the best user experience:
[https://blog.keras.io/user-experience-design-for-apis.html](https://blog.keras.io/user-experience-design-for-apis.html)

------
adev_
For the sake of god, please also try to add a friendlier build system. Bazel
is a disaster that almost makes Maven look lightweight, npm reliable, and
autotools user-friendly.

~~~
Judgmentality
What do you dislike about bazel? Not championing it, just curious.

~~~
adev_
The list would be too long, but in no particular order:

\- It uses a JVM to compile mainly C++ and Python, which means deploying a
JDK/JRE just for that.

\- Awful RAM consumption: compiling TensorFlow takes 16 GB of RAM just to
create a Python wheel.

\- It tries to download the world and compile everything, without letting you
specify external dependencies.

\- Compile times that are crazy long, as a consequence of the previous point.

\- Invasive: it makes it very hard to integrate with anything that does not
build with Bazel.

\- It makes it very hard, or sometimes impossible, to tune compiler flags.

\- Just not reproducible.

\- Unstable: try to build a recent Google package with a six-month-old Bazel
and good luck.

I strongly advise you to watch this talk from the last FOSDEM:
[https://archive.fosdem.org/2018/schedule/event/how_to_make_p...](https://archive.fosdem.org/2018/schedule/event/how_to_make_package_managers_cry/)

~~~
haberman
I strongly agree with you on the first one.

Many of your other ones are byproducts of the fact that Bazel is primarily a
build-from-source system. This has some benefits, particularly in a C++
ecosystem where binary compatibility across versions basically doesn't exist.
But it also has some big drawbacks when it comes to compile times.

I do see that Bazel seems to support depending on a prebuilt .so, though I
have not tried this:
[https://docs.bazel.build/versions/master/cpp-use-cases.html#...](https://docs.bazel.build/versions/master/cpp-use-cases.html#adding-dependencies-on-precompiled-libraries)

> Invasive: it makes it very hard to integrate with anything that does not
> build with Bazel.

I think your main options here are:

1\. prebuild the other projects, then depend on the .so from Bazel
([https://docs.bazel.build/versions/master/cpp-use-cases.html#...](https://docs.bazel.build/versions/master/cpp-use-cases.html#adding-dependencies-on-precompiled-libraries))

2\. write a BUILD file for the external project. Here is an example of a
project that builds a bunch of non-Bazel deps by writing Bazel BUILD files for
each of them:
[https://github.com/googlecartographer/cartographer/tree/mast...](https://github.com/googlecartographer/cartographer/tree/master/bazel/third_party)

> It makes it very hard, or sometimes impossible, to tune compiler flags.

"bazel --copt=<your options here> :target"?

> Just not reproducible.

What isn't reproducible? I tend to think of reproducibility as a strength of
Bazel. Because all of your dependencies are explicit and Bazel fetches a known
version, the build is less dependent on your system environment and whatever
you happen to have installed there.

Disclosure: I am a Googler. I have some gripes with Bazel, but overall I think
it gets some important ideas right. You have a BUILD file that is declarative,
then any imperative code you need goes into separate .bzl files to define the
rules you need.

~~~
adev_
> Many of your other ones are byproducts of the fact that Bazel is primarily a
> build-from-source system. This has some benefits, particularly in a C++
> ecosystem where binary compatibility across versions basically doesn't
> exist. But it also has some big drawbacks when it comes to compile times.

The Nix, Guix, and Spack package managers solved the C++ ABI issue a long time
ago, without Bazel's crazy demands in terms of resource consumption,
integration, and compile time. Some of them even support binary distributions.

> I think your main options here are:
>
> 1\. prebuild the other projects, then depend on the .so from Bazel
>
> 2\. write a BUILD file for the external project

I know that, but both of those are terrible options. I do not want to depend
on SQLite, OpenSSL, libxml, or whatever other system library compiled by
Bazel, nor do I want Bazel to take 45 minutes to recompile them. Additionally,
this causes diamond-dependency problems with other software that consumes
Bazel artifacts without building with Bazel.

> "bazel --copt=<your options here> :target"?

Can I use that to specify a flag for some targets and not others, without
having to build each of them sequentially?

Concrete example of the madness: SQLite will not compile if you enable some
options that would make TensorFlow faster, yet Bazel recursively compiles
both.

> What isn't reproducible? I tend to think of reproducibility as a strength of
> Bazel. Because all of your dependencies are explicit and Bazel fetches a
> known version, the build is less dependent on your system environment and
> whatever you happen to have installed there.

Bazel tries to build in an isolated environment but does only half the job. It
still depends on the system compiler, and does not chroot nor "compiler-wrap"
(cf. Spack), leaving the build very vulnerable to system side effects and
updates.

> Disclosure: I am a Googler. I have some gripes with Bazel, but overall I
> think it gets some important ideas right. You have a BUILD file that is
> declarative, then any imperative code you need goes into separate .bzl files
> to define the rules you need.

I can understand that Bazel is very convenient inside Google, with Google's
resources. But it's a nightmare for everyone I've talked to outside of Google.

~~~
haberman
> Nix, Guix and Spack packager managers solved the C++ ABI issue a long time
> ago

Yes, this is certainly also solvable at the package-manager level, and that
approach will certainly give shorter compile times.

I agree it would be nice if Bazel integrated with package managers like this
more easily. I hope Bazel adds support for this. There is a trade-off though:
with less control over the specific version of your dependencies, there is a
greater risk of build failure or bugs arising from an untested configuration.
Basically this approach outsources some of the testing and bugfixing from the
authors to the packagers. But it's a trade-off I know many people are willing
to make.

> Can I use that to specify a flag to some target and not some other ? Without
> having to build sequentially each of them ?

You can put copts=["<opt>"] in the cc_library() rules in the BUILD file. This
will give per-target granularity. You can add a select() based on
compilation_mode if you need to define opt-only flags:
[https://docs.bazel.build/versions/master/configurable-attrib...](https://docs.bazel.build/versions/master/configurable-attributes.html#configuration-conditions)
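
For concreteness, here is a sketch of what that looks like in a BUILD file
(Starlark; the target names and flags are made up, not from TensorFlow's
actual build):

```python
# Hypothetical BUILD file fragment. copts apply only to this cc_library
# target; the select() adds extra flags only in optimized (-c opt) builds.

config_setting(
    name = "opt_build",
    values = {"compilation_mode": "opt"},
)

cc_library(
    name = "fast_math",                      # made-up target
    srcs = ["fast_math.cc"],
    copts = ["-Wall"] + select({
        ":opt_build": ["-O3", "-ffast-math"],
        "//conditions:default": [],
    }),
)
```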

> Bazel tries to build in an isolated environment but does only half the job.
> It still depends on the system compiler, and does not chroot nor "compiler-
> wrap" (cf. Spack), leaving the build very vulnerable to system side effects
> and updates.

Bazel allows you to define your own toolchain. I believe this can support
cross-compiling and isolating the toolchain, though I don't have any direct
experience with it:
[https://docs.bazel.build/versions/master/toolchains.html](https://docs.bazel.build/versions/master/toolchains.html)

> I can understand that Bazel is very convenient in Google environment with
> Google resources. But it's a nightmare for everyone I talked to outside of
> Google.

I hear you and I hope that we see some improvements to integrate better with
package managers.

FWIW, I have been experimenting with auto-generating CMake from my Bazel BUILD
file for my project
[https://github.com/google/upb](https://github.com/google/upb). My plan is to
use Bazel for development but have the CMake build be a fully supported option
also for users who don't want to touch Bazel.

~~~
jingwen
Bazel integrates well with nixpkgs and npm with rules_nixpkgs [0] and
rules_nodejs [1] respectively.

Tangentially related to package manager integration: Bazel can now build with
CMake in-tree [2].

[0]
[https://github.com/bazelbuild/rules_nixpkgs/](https://github.com/bazelbuild/rules_nixpkgs/)

[1]
[https://github.com/bazelbuild/rules_nodejs/](https://github.com/bazelbuild/rules_nodejs/)

[2]
[https://github.com/bazelbuild/rules_foreign_cc/](https://github.com/bazelbuild/rules_foreign_cc/)

------
xvilka
Hopefully this time it will support AMD's open-source drivers. It's awful that
TF is currently tied to proprietary NVIDIA drivers.

------
tmulc18
2.5 years of learning tensorflow now gone to waste :(

