
TensorFlow Model Analysis – A library for evaluating TensorFlow models - wjarek
https://github.com/tensorflow/model-analysis
======
robkop
What's the advantage over just using TensorBoard and writing summaries of
whatever you want?

------
p1esk
I've read the README and I still have no clue what exactly this library is
supposed to "evaluate".

~~~
rasmi
It's meant to help evaluate the performance of a TensorFlow model after
training. In particular, you can define a series of metrics to measure, and
different subsets of the data to evaluate on. And it can scale up to evaluate
on large datasets since it uses Apache Beam for distributed data processing.
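
The core idea, computing the same metric separately over different slices of the data, can be sketched in plain Python (the field names here are made up for illustration; the real library expresses this as a Beam pipeline so it scales to large datasets):

```python
from collections import defaultdict

def sliced_accuracy(examples, slice_key):
    """Group examples by one feature and compute accuracy per slice.

    `examples` is a list of dicts with hypothetical keys
    'label', 'prediction', and the slicing feature.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        key = ex[slice_key]
        total[key] += 1
        if ex["label"] == ex["prediction"]:
            correct[key] += 1
    return {k: correct[k] / total[k] for k in total}

# Toy data: a model that does better in the afternoon than the morning.
data = [
    {"label": 1, "prediction": 1, "hour": "am"},
    {"label": 0, "prediction": 1, "hour": "am"},
    {"label": 1, "prediction": 1, "hour": "pm"},
]
print(sliced_accuracy(data, "hour"))  # {'am': 0.5, 'pm': 1.0}
```

An overall accuracy of 2/3 would hide the fact that the "am" slice performs much worse, which is exactly the kind of thing slice-based evaluation surfaces.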

This notebook shows some helpful cases:
[https://github.com/tensorflow/model-analysis/blob/master/examples/chicago_taxi/chicago_taxi_tfma_local_playground.ipynb](https://github.com/tensorflow/model-analysis/blob/master/examples/chicago_taxi/chicago_taxi_tfma_local_playground.ipynb)

~~~
p1esk
Thanks.

I still don't see the point, but maybe someone needs this. To me, it would be
a lot more useful to evaluate things like memory consumption and CPU/GPU
bottlenecks (for each node in the graph).

~~~
jamesblonde
They do show resource usage (GPU utilization/memory), but only for TPUs
(i.e., only if you use managed TensorBoard on GCE).

This library is about bringing Apache Beam into play. TensorFlow is not really
distributed out of the box (distributed TensorFlow requires you to start all
the servers and manage all the endpoints yourself). Right now, we use Spark
for pre-processing pipelines and for launching both parallel experiments in
TensorFlow and distributed training. Spark is a key part of TensorFlow for us
(www.hops.io). Google wants an open-source alternative (not just managed TF,
which is distributed) that enables ML pipelines. This is it. Spark is fighting
back, with image pre-processing now part of Spark 2.3. We recommend Spark for
pre-processing, spitting out .tfrecords, which are then used as training/test
data.

~~~
p1esk
What kind of models are you training? Why do you need distributed TF? I was
under the impression that it's pretty rare to need it for training (e.g.,
large-scale evolutionary hyperparam search on 256 GPUs, or things like that).

~~~
jamesblonde
Anything that's not MNIST. Even Fashion-MNIST benefits from hyperparam search.
Distributed TF is the standard inside Google, and it will be the standard for
everybody else within a couple of years. They actually just run Estimators,
and distribution is somewhat transparent to them.

When hyperparam search is this easy with PySpark, why wouldn't you do it?

  def model_fn(learning_rate, dropout):
      [TensorFlow Code here]

  args_dict = {'learning_rate': [0.001, 0.005, 0.01], 'dropout': [0.5, 0.6]}
  experiment.launch(spark, model_fn, args_dict)
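
The grid expansion behind a launcher like `experiment.launch` can be sketched in a few lines of plain Python (`expand_grid` is a hypothetical helper, not part of any library; the real launcher would map each resulting dict onto a Spark executor):

```python
from itertools import product

def expand_grid(args_dict):
    """Expand a dict of hyperparameter lists into one kwargs dict
    per combination, e.g. for mapping model_fn over a cluster."""
    keys = list(args_dict)
    return [dict(zip(keys, values))
            for values in product(*(args_dict[k] for k in keys))]

grid = expand_grid({'learning_rate': [0.001, 0.005, 0.01],
                    'dropout': [0.5, 0.6]})
print(len(grid))  # 6 -- the cross product of 3 learning rates x 2 dropouts
```

Each of the six dicts can then be passed to `model_fn(**params)` on a separate worker, which is what makes the search embarrassingly parallel.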

For distributed training, we favor Horovod. It requires minimal changes to
code, and scales linearly on our DeepLearning11 servers:
[https://www.oreilly.com/ideas/distributed-tensorflow](https://www.oreilly.com/ideas/distributed-tensorflow)

~~~
p1esk
_why wouldn't you do it?_

Perhaps because I don't have hundreds of GPUs lying around? I'm a DL
researcher fortunate enough to have a 4-Titan X workstation for my
experiments. Most of my peers at other labs only have shared access to such
workstations, and with the current trends in GPU pricing, that's not likely to
change in the near future.

More importantly, lack of compute power is rarely the bottleneck in my
research. Reading papers, brainstorming new ideas, and writing/debugging code
are the most time-consuming parts; it's not like I'm sitting idle while
waiting for my models to train.

 _Distributed TF is the standard in Google, will be the standard amongst
everybody else within a couple of years._

By "everyone else" you mean a few dozen corporations, which can justify
spending millions of dollars on hardware to speed up their mission critical ML
research?

~~~
jamesblonde
Things can change quickly. When Ethereum goes to proof-of-stake, the market
will be flooded with cheap 1080Tis. A moderately well-funded lab should (by
the end of 2018) be able to buy 100 1080Tis in DeepLearning11 servers for 100K
Euro and do cool work in distributed learning and model architecture search.

SoTA on CIFAR-10 and ImageNet is now model-architecture search. This
paper/blog was an eye-opener for me:
[https://blog.acolyer.org/2018/03/28/deep-learning-scaling-is-predictable-empirically/](https://blog.acolyer.org/2018/03/28/deep-learning-scaling-is-predictable-empirically/)

~~~
tasuki
What's the ETA on that? I'd love me some cheap 1080Ti!

~~~
jamesblonde
Prices are already down. Keep an eye out here:
[https://www.reddit.com/r/nvidia/](https://www.reddit.com/r/nvidia/)

Current estimates are that 75% of Ethereum mining is done on 1080Tis. So, if
that stops overnight, will the miners sell them or switch to some other coin?
My guess is that they will flood the market. The cards are mostly undervolted
when mining, so they won't burn out within a few weeks if you buy used.

