
Tensorflow sucks - nicodjimenez
http://nicodjimenez.github.io/2017/10/08/tensorflow.html
======
CompleteSkeptic
An appropriate quote: "If you can't intelligently argue for both sides of an
issue, you don't understand the issue well enough to argue for either."

There are many people for whom the declarative paradigm is a huge plus. I
would say there are at least 2 major approaches to running fast neural
networks:

1\. Figure out the common big components and make fast versions of those.

2\. Figure out the common small components and how to make those run fast
together.

Different libraries have different strengths and weaknesses that match the
abstraction level that they work at. For example, Caffe is the canonical
example of approach 1, which makes writing new kinds of layers much harder
than with other libraries, but makes connecting those layers quite easy as
well as enabling new techniques that work layer-wise (such as new kinds of
initialization). Approach 2 (TensorFlow's approach) introduces a lot of
complexity, but it allows for different kinds of research. For example,
because how you combine the low-level operations is decoupled from how those
operations are optimized together, you can more easily create efficient
versions of new layers without resorting to native code.
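
For illustration, a minimal sketch (mine, not from the comment above): a
gated linear unit composed purely from low-level TF ops. The graph runtime is
free to optimize the composed expression, so no custom native code is needed.

    import tensorflow as tf

    def gated_linear_unit(x, units):
        # A "new layer" built only from primitive ops (matmul, sigmoid, *).
        in_dim = x.get_shape()[-1].value
        w_a = tf.get_variable("w_a", [in_dim, units])
        w_b = tf.get_variable("w_b", [in_dim, units])
        return tf.matmul(x, w_a) * tf.sigmoid(tf.matmul(x, w_b))

    x = tf.placeholder(tf.float32, shape=[None, 8])
    y = gated_linear_unit(x, 4)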

~~~
yongjik
After being exposed to several declarative tools during my career, I must say
they age poorly: make, autoconf, Tensorflow, and so on. They may start out
being elegant, but every successful library is eventually (ab)used for
something the original authors didn't envision, and with declarative syntax it
descends into the madness of "So if I change A to B here, does it apply before
or after C becomes D?"

At least Tensorflow isn't at that level, because its "declarative" syntax is
just yet another imperative language living on top of Python. But it still
makes performance debugging really hard.

With PyTorch, I can just sprinkle torch.cuda.synchronize() liberally and the
code will tell me exactly which CUDA kernel calls are consuming how many
milliseconds. With Tensorflow, I have no idea why it is slow, or whether it
can be any faster at all.
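
For concreteness, a minimal sketch of that workflow (mine, with made-up
sizes): without the synchronize() calls, the timer would only measure the
asynchronous kernel launch, not the kernel itself.

    import time
    import torch

    x = torch.randn(4096, 4096).cuda()

    torch.cuda.synchronize()   # make sure prior GPU work has finished
    start = time.time()
    y = torch.mm(x, x)         # launches a CUDA kernel asynchronously
    torch.cuda.synchronize()   # block until the kernel completes
    print("matmul: %.1f ms" % ((time.time() - start) * 1000))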

~~~
rkangel
I believe that make's declarative nature is not the cause of its problems at
all - its poor syntax and lack of support for programming abstractions are
what make it clunky to use.

Something like rake, which operates on the same fundamental principles (i.e.
declarative dependency description) but uses Ruby syntax, has aged better.

~~~
eru
Indeed. Getting these text-based configuration tools to work requires a lot of
experience in language design.

Lots of tools become accidentally Turing complete, like Make. You need to plan
these things from the start: if you want to allow any computation at all
without sliding into Turing completeness, you need to be extremely vigilant
and base your language on firm foundations. See eg Dhall, a non-Turing-
complete configuration language ([http://www.haskellforall.com/2016/12/dhall-non-turing-comple...](http://www.haskellforall.com/2016/12/dhall-non-turing-complete-configuration.html)).

If you are happy to have Turing completeness, you might want to write your
tool as an embedded DSL and piggy-back on an existing language, declarative or
otherwise.

------
batmansmk
Despite its shortcomings, I share the same vision as this article. Here are my
reasons:

\- Tensorflow has way too large an API surface area: command-line argument
parsing, unit test runners, logging, help-string formatting... most of these
are not as good as the counterparts already available in python.

\- The C++ and Go versions are radically different from the Python version.
Limited code reuse, different APIs, not maintained or documented with the same
attention.

\- The technical debt in the source code is huge. For instance, there are 3
redundant implementations of a safe division (_safe_div) in the source code,
with slightly different interfaces (sometimes with default params, sometimes
not).

In every way, it reminds me of the Angular.io project: a failed promise to be
truly multi-language, failing to use the expressiveness of python, with a
super large API that tries to do things we didn't ask it to do, and a lack of
a sound overall architecture.

------
zo7
I think the author raises a good point about Google envy. TensorFlow is not
the most intuitive or flexible library out there, and it is very over-
engineered if you're not doing large-scale distributed training. The main
reason everyone talks it up so much is that Google heavily marketed it
from the outset, and everyone automatically assumes Google == Virtuoso
Software Design because they couldn't make it through the interview. Really
it's just modern enterprise software which has five different ways to
implement batch norm that they push on the community so they don't have to
train new hires on how to use it.

~~~
dsl
Or maybe it is built by a company that is doing large-scale distributed
training, and they open sourced it not to cater to every need, but to help
others trying to do the same thing they are. Companies are under no obligation
to make sure their open source is well suited for others' use cases.

~~~
zo7
That was kinda my point: it's not the be-all deep learning library, because
they made it for their own use case, but its towering popularity (as in 10x
the number of stars of other popular libraries) is not genuine.

Also I _highly_ doubt that the main reason Google open sourced it was to be
charitable.

------
j2kun
There is little analytical or "detailed" about this post. The most complex
model is y = 3*x, the author provides no evidence to back up any claims about
adoption, difficulty of use, etc., and most of the author's complaints boil
down to a lack of syntactic sugar.

I'm open to a discussion about the downsides of tensorflow, which is why I
read the article in the first place, but this post doesn't provide that.

------
sangnoir
I'm probably being overly cynical, but this is (indistinguishable from) a
"growth-hack" submarine article by the author to promote their tool. There is
hardly any substantiation to support the assertions. Tucked right at the end:

> If you want a beautiful monitoring solution for your machine learning
> project that includes advanced model comparison features, check out
> Losswise. I developed it to allow machine learning developers such as myself
> to decouple tracking their model’s performance from whatever machine
> learning library they use

~~~
chairmanwow
I'm actually pretty ok with these types of articles. They are generally
well-researched and well-written, giving technical introductions to important
concepts.

As always, it is important to be wary of the reasons that an author writes an
article. If there is an advertisement at the end, then the author motivations
(at least in part) are clear. But I often find that promoters of new systems
and tools are able to present excellent critiques of established tools and
practices. New things are USUALLY made to address the shortcomings of existing
things. You as a reader have to parse whether their arguments are sound and
maybe do some more research before you can make a sound judgement on the
matter.

------
awnihannun
There are a few categories that I think TensorFlow is notably strong in.
Namely:

1\. Deployment.

2\. Coverage of the library / built-in functionality.

3\. Device management.

For more details, I wrote a comparison of PyTorch and TensorFlow (mostly from
a programmability perspective) a couple months back. Interested readers may
find it helpful: [https://awni.github.io/pytorch-tensorflow/](https://awni.github.io/pytorch-tensorflow/)

------
brianchu
This article is not that detailed, but it's a sentiment I agree with, so I'll
add one major shortcoming of Tensorflow: _its memory usage is really bad._

The default behavior of TF is to allocate as much GPU memory as possible for
itself from the outset. There is an option (allow_growth) to only
incrementally allocate memory, but when I tried it recently it was broken.
This means there aren't easy ways to figure out exactly how much memory TF is
using (e.g. if you want to increase the batch size). I believe you can use
their undocumented profiler, but I ended up just tweaking batch sizes until TF
stopped crashing (yikes).
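
(For reference, a minimal sketch of how that option is set in the TF 1.x
Python API:)

    import tensorflow as tf

    # Ask TF to grow its GPU memory usage on demand instead of
    # grabbing everything up front.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)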

TF does not have in-place operation support for some common operations that
could use it, like dropout (other operations do have this support, I believe).
Even Caffe, which I used for my research in college, had this. This can double
your GPU RAM usage depending on your model, and GPU RAM is absolutely a
precious resource.

Finally, I've had issues where TF runs out of GPU RAM halfway through
training, which should _never_ happen - if there's enough memory for the first
epoch, there should be enough memory for every epoch. The last thing I want to
do is debug a memory leak / bad memory allocation ordering in TF.

~~~
danieldk
_The default behavior of TF is to allocate as much GPU memory as possible for
itself from the outset. There is an option (allow_growth) to only
incrementally allocate memory, but when I tried it recently it was broken.
This means there aren't easy ways to figure out exactly how much memory TF is
using (e.g. if you want to increase the batch size)._

There is also _per_process_gpu_memory_fraction_, which limits Tensorflow to
only allocate that fraction of each visible GPU's memory. It's still not
great, but it has been helpful in keeping resources free for models that do
not need all of a GPU's memory.
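
(Again a minimal sketch of setting it; the fraction is just an example:)

    import tensorflow as tf

    # Cap TF at 40% of each visible GPU's memory.
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.4
    sess = tf.Session(config=config)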

------
eob
It seems to me the reason for insisting on a verbose declarativization of
everything is obvious: it guarantees you can build run/train-time environments
which scale your model automatically.

Google’s mindset isn’t “train this model to multiply by three”. It’s “train
this model on a 1% sample of search traffic over the last year.” That’s
reflected in the design choices of tensorflow.

------
mattnewton
Wait, (s)he’s arguing that the code isn’t imperative enough, but the punchline
is they don’t like having to type session.run? I don’t understand their
vendetta against the graph, which is a powerful abstraction that lets you
choose different backends, and lets TensorBoard show you an awesome view of
your computation. Session.run isn’t hard to type, and it takes at most a few
days to grok that everything is lazily evaluated.

Would (s)he like eager APIs? Does (s)he want better C++ APIs? Or does the
author just want to hate on tensorflow because it’s been hyped so much?

~~~
ffriend
> I don’t understand their vendetta against the graph, which is a powerful
> abstraction that lets you choose different backend [...]

You don't really need a graph to support different backends. One popular
approach is to have different array implementations (e.g. CPU and GPU arrays).

> [...] and let’s tensorboard show you an awesome view of your computation

At the end of the post the author shows his API that lets you do the same
things as Tensorboard, but for whatever framework you like.

All in all, expression graphs like those used in TF and Theano are great for
symbolic differentiation of a loss function and further expression
optimization (e.g. simplification, operation fusion, etc.). But TF goes
further and makes everything a node in the graph, even things that are not
algebraic expressions, such as variable initialization or objective
optimization.

~~~
KirinDave
> You don't really need a graph to support different backends. One popular
> approach is to have different array implementations (e.g. CPU and GPU
> arrays).

And now you can ( _waves arms_ ) write it twice! Alternatively, you can make
the interfaces between various impls be exactly the same but rename them so
they're purpose-named. Then you've written Graph, for the most part.

~~~
candiodari
Except your graph is symbolic, so good luck getting a breakpoint to fire while
the calculation is happening... And if you don't like debuggers, the same
problem manifests itself when merely printing values, too.
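
A minimal sketch of the printing problem (mine, using the TF 1.x API): a plain
print shows only the symbolic tensor; to see a value you have to run a session
or splice a tf.Print node into the graph.

    import tensorflow as tf

    x = tf.constant(3.0)
    y = x * 2
    print(y)  # prints Tensor("mul:0", ...) -- the symbol, not the value

    # tf.Print injects a print *node*; it only fires when the graph runs.
    y = tf.Print(y, [y], "y = ")
    with tf.Session() as sess:
        sess.run(y)  # prints something like "y = [6]" to stderr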

------
foo101
I was reading your post and thinking, "Okay. Fair point. I don't face this
problem while using Tensorflow, but I can imagine it could be a problem for
some people", until I came across this sentence:

> Pytorch’s interface is objectively much better than Tensorflow’s

How does your subjective opinion about Tensorflow suddenly become objective? I
don't see any fact based or metrics based argument that establishes this
objectively. All I see is an argument based on your needs and preference. That
is far from objective.

------
7kmph
I thought I was alone.

I think the user-unfriendliness of tf mainly comes from these two aspects.

1\. The choice of python.

When I'm writing python, I spend more time debugging (compared to something
like OCaml), and sometimes it takes more time to get started because function
arguments are documented rather than enforced by a contract/compiler (and
that "if this, pass arg1, else pass arg2" style of documentation will never
give you the same kind of confidence as a more rigorous language), so you end
up just trying things. This isn't specific to tf, but tf makes it more obvious
because it's something like a language-in-a-language.

If tf had been made by someone else, I would understand the choice, given the
popularity of python and the tons of libraries available, but since Google has
unlimited resources, and AI is clearly the future, I really expected them to
have the courage to choose something else.

Sometimes I wonder whether AI may go rogue some day - not because we
deliberately make it so, but because of a bug somewhere in our code.

2\. Lack of maintenance.

We all like shiny new ideas and get excited implementing them, but once the
fun part is done, so goes the excitement. A good library needs to be tweaked
and re-tweaked; some of that takes boring hard work, and smart people don't
like that.

But hey, Google is doing this for free. As long as it's not deliberately made
so (to stall the community), we should be appreciative. It's an open source
project, and that doesn't just mean we can use it for free, but also that we
should do our part to make it better.

------
ninjakeyboard
I'm not a comp-sci grad and I worked at Google for a bit after a startup I was
at was acquired. I didn't proceed further after those 2 years due to the need
to commute and/or relocate, but I had a clear path into full-time work without
much else. Although I was granted a bit of a pass, I believe anyone can work
at Google given they are 1) slightly above average and have learned every base
that they run into in reasonable depth 2) motivated enough to try to interview
at least a couple times including extensive "refreshing." Not many people pass
the interview the first time so it's a +6 month play. Maybe longer. They look
at progress from one interview to the next. Google is a big organization full
of lots of people and not many of them are strictly that far above average.
Maybe the Dunning-Kruger effect is shifting what I think of myself and the
average developer there a bit, but it's not unrealistic for any developer to
think they can work there. It just takes interest and effort.

~~~
TheMagicHorsey
One should also ask the question: would you want to work at Google today. This
is not the Google of 2002 ... or even the Google of 2010. 2017 Google is like
1999 Microsoft.

Lots of brilliant people working on heavily resourced projects ... but also
significant bureaucracy and many political animals in what was formerly a
pristine engineering "garden of eden".

You can see a lot of Google projects struggling now, and many startups in the
same space as Google projects doing much better than more resourced teams
doing the same thing at Google.

Nobody ever got fired at Google for spending all day brilliantly arguing on
Google's internal newsgroups and not doing any real work. And it shows in
Google's work culture. Imagine a person doing that at a startup ... or Amazon
for that matter. They would not survive very long.

If you are young, and have many years of productive/earning years ahead of
you, it might make sense to turn down a Google offer to try something a bit
more "bloody" and hectic for a few years before you settle down in a comfy
Google job.

There are many brilliant people you can learn from at Google ... but very few
work very hard. And hard, productive work is a skill to learn too.

~~~
jacquesm
> Nobody ever got fired at Google for spending all day brilliantly arguing on
> Google's internal newsgroups

There was a pretty high profile example of this not all that long ago.

~~~
DanBC
That was decidedly not "brilliant" arguing.

------
krisives
As he tacks on at the end of the article, the reason most people care about
TensorFlow is TensorBoard, which makes most of this a moot point. People would
code while hung upside down as long as they still got TensorBoard.

------
sandGorgon
The serialisation story in Tensorflow is an obscene mess. There are bugs open
on keras and tensorflow asking how to export a model and run it on your laptop
and, even better, on Android. It simply is crazy bad and cannot be done
easily.

In fact, to do even a half-decent export of TF models, you have to switch to
keras to try and do any kind of export.

I have a 10-email conversation with enterprise Google Cloud support trying to
get an ML Engine output serialised to work on Android.

There are threads open all over the place on stackoverflow and elsewhere - and
yes, we have tried all SIX ways.

~~~
danieldk
Out of curiosity, what problems are you running into? I have never had serious
problems with the 'save parameters -> dump graph -> freeze graph -> load up
with C API' path with feed-forward networks or various RNNs. Either from Go or
from Rust.

Admittedly, the documentation in this area is extremely bad, and I basically
had to figure out how to do it myself, though this was long before 1.0.
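
For what it's worth, a minimal sketch of that path (mine; helper module names
may differ slightly across TF 1.x versions, and the "output" node name is just
an example):

    import tensorflow as tf

    # Toy graph with a single variable and a named output node.
    x = tf.placeholder(tf.float32, shape=[None, 1], name="input")
    w = tf.Variable([[3.0]])
    y = tf.matmul(x, w, name="output")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Freeze: fold variables into constants so the graph is
        # self-contained, then serialize it for the C API to load.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), ["output"])
        with tf.gfile.GFile("frozen.pb", "wb") as f:
            f.write(frozen.SerializeToString())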

~~~
sandGorgon
The issues with the high-level API go slightly deeper. It looks like some of
the graph operations are not available on Android (and equivalently on the
desktop) by default [1]. The motivation for this is that there are stricter
requirements on computation cost and application size on different platforms.
So there is an approach that allows compiling the minimal set of operations
required to run the graph [2][3]. However, it requires Bazel as the primary
build tool for the Android app as well. The good news is that the TensorFlow
team understands the issue and is working on improving the documentation and
tooling [4].

That said, would you be able to share any example snippets of how you are
persisting and loading these models in your code? That would be super helpful.

Also, I'm getting the feeling that you are using the deprecated method of
saving. I think they are shifting to MetaGraph now (not sure about this):
[https://www.tensorflow.org/versions/master/api_docs/python/t...](https://www.tensorflow.org/versions/master/api_docs/python/tf/train/export_meta_graph)

[1]
[https://github.com/tensorflow/tensorflow/issues/10254](https://github.com/tensorflow/tensorflow/issues/10254)
[2]
[https://github.com/tensorflow/tensorflow/blob/master/tensorf...](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/print_selective_registration_header.py)
[3]
[https://github.com/tensorflow/tensorflow/blob/master/tensorf...](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/selective_registration.h)
[4]
[https://github.com/tensorflow/tensorflow/issues/10299](https://github.com/tensorflow/tensorflow/issues/10299)

~~~
danieldk
_That said, would you be able to share any example snippets of how you are
persisting and loading these models in your code?_

The relevant code is (still) in private repositories, but I have a deck with
some examples using Rust:

[https://www.dropbox.com/s/t8r056f6wqlktqv/embedding-tensorfl...](https://www.dropbox.com/s/t8r056f6wqlktqv/embedding-tensorflow.pdf?dl=0)

------
pmarreck
I can find no good empirically-based argument here against the declarative
style, other than an appeal to taste/preference.

The only effective difference IMHO is that the declarative style does deferred
evaluation, which makes it more difficult to examine "intermediate states"
(aka "debugging"), but as the author stated themselves, this is simply a
matter of outputting those intermediate states/setting them as outputs... this
is no different from examining intermediate state in every programming
paradigm ever invented. Unit-testing every step is IMHO a far better way of
debugging/ensuring validity, but this is possibly much more difficult in the
"black box" environment of self-weighting neural networks than it is in your
more traditional programming environment.

------
batat
> First, let’s look at the Tensorflow example

> Now let’s look at a Pytorch example that does the same thing

Reminds me of Escobar's theorem (the Russian philosopher, not to be confused
with Lopez-Escobar), which approximately translates into English as "Given a
forced choice between two opposite entities, both will be exceptional
nonsense."

------
KirinDave
Usually I'm on the hook for being open minded about programming languages in
this forum, but I'm putting my foot down.

I really hope this entire universe gets liberated from Python at some point.
Even R would be more palatable. Both these examples are _awful_: error-prone,
obfuscated, and beholden to Python's difficulties with large sums of data.

~~~
walrus
Can you give a sketch of how you would like these examples to look? Any
language is fine; you can pretend that libraries for GPGPU, backprop, and
gradient-based optimization already exist.

I tried to do so myself and couldn't come up with anything significantly
better, but I've been writing Python for a long time and might just be stuck
in a local minimum :)

~~~
KirinDave
Sure.

Another comment has already pointed out the Julia framework Flux. I'll relink
it for completeness: [https://fluxml.github.io](https://fluxml.github.io)

You can also look at Haskell's Grenade examples:
[https://github.com/HuwCampbell/grenade](https://github.com/HuwCampbell/grenade)

These are fundamentally different because the notion of what the compiler
should be doing is fundamentally different, and the notion of how data should
be input is somewhat different.

------
akavel
Ummm; for me, as a total layman w.r.t. ML, the Tensorflow snippet actually
looks much more readable and understandable at a high level... I guess it
might be harder to evaluate what exactly is happening inside (what precise
algorithms are chosen, what the runtime costs are, etc.), but at least I have
some kind of idea what this code might be doing. Meanwhile, between MSELoss,
SGD, zero_grad and loss.backward(), all bets are off for me. So, what I want
to say here is that Tensorflow seems to at least win at readability, which I
find a non-negligible aspect.
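
For readers in the same spot, here is roughly what those names mean in
context: a minimal PyTorch loop for the article's y = 3x toy problem (my
sketch, using the 2017-era Variable API).

    import torch
    from torch.autograd import Variable

    model = torch.nn.Linear(1, 1, bias=False)  # the y = w*x model
    loss_fn = torch.nn.MSELoss()               # mean squared error loss
    opt = torch.optim.SGD(model.parameters(), lr=0.01)  # plain SGD

    for _ in range(10000):
        x = Variable(torch.rand(1, 1))
        y = x * 3
        opt.zero_grad()              # clear gradients from the last step
        loss = loss_fn(model(x), y)
        loss.backward()              # backprop: compute gradients
        opt.step()                   # apply the SGD update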

------
dna_polymerase
> Pytorch’s interface is objectively much better than Tensorflow’s

Ummm. No. 'Objectively' is utter nonsense. For an objective view we would need
to define "better" first and measure both interfaces' performance. I think it
is preference. I prefer the Tensorflow interface and don't mind its
declarative style.

However, if one wants to criticize something, one could start with the static
nature of Tensorflow (which you rightfully mentioned), which makes it hard to
do stuff like LSTMs and dynamic batching. That works better in Torch. To me
that is the only real attack point for Tensorflow. But keep in mind that
Tensorflow is still "1.x" software; stuff like Tf-Fold already addresses the
problem with static graphs, and Google plans for more dynamic graphs in Tf
2.0.

Also, it would have been nice of you to have measured the performance of both
frameworks on common problems. Looking at the interface and shouting "bad" at
Tensorflow is not really critique, but personal dissatisfaction with
Tensorflow.

I think that Tf currently aims more at production code than research stuff.
The claimed problem of being too low-level for simple stuff like layers is
also not right. Have a look at the shipped contrib modules; you'll find common
layers in there.

~~~
0xbear
In my experience (computer vision, deep learning) PyTorch is substantially
faster as well, especially in data augmentation where it’s not just a thin
layer over cudnn.

That said, you’re right. There’s no way I’d deploy it to production.

~~~
Marat_Dukhan
Have you looked at ONNX? It is a neural network exchange format that, in
particular, lets you deploy PyTorch models in production via Caffe2. Here is a
tutorial:
[http://pytorch.org/docs/master/onnx.html](http://pytorch.org/docs/master/onnx.html)

Disclaimer: I work on Caffe2 team (not on ONNX, though)
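
A minimal sketch of the export call (mine; it assumes a PyTorch build with
ONNX support and uses a toy model):

    import torch
    import torch.onnx
    from torch.autograd import Variable

    model = torch.nn.Linear(1, 1)  # any nn.Module works the same way
    dummy_input = Variable(torch.randn(1, 1))

    # Trace the model with a dummy input and write an ONNX file that
    # Caffe2 (among other backends) can load for serving.
    torch.onnx.export(model, dummy_input, "model.onnx")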

~~~
ffriend
If I want to export my own computational graph to ONNX, what is the first
place I should look at? Do you know about any documentation or reference
implementation of the format?

~~~
Marat_Dukhan
Reference specification and validator are hosted in
[https://github.com/onnx/onnx](https://github.com/onnx/onnx)

------
bitL
...and that's why you are using Keras instead.

~~~
cjalmeida
I like Keras, but I always find myself having to write TF code whenever I need
to implement something more interesting. And debugging, already hard in pure
TF, becomes more complex due to the extra layer.

IMO, learning TF or pytorch is more effective at least in the current state of
affairs.

~~~
fchollet
But that is precisely how you should be using Keras!

* If you are implementing a standard model (that's 90% of industry use cases, and a large fraction of research use cases as well), Keras primitives considerably simplify your workflow and make you a lot more productive.

* When you need to implement something highly customized or unusual, you can revert back to writing pure TensorFlow code, which will integrate seamlessly with your Keras workflow (via custom layers, functions etc).

Basically, Keras increases your productivity for common use cases, without any
flexibility cost for rare/custom use cases. It is meant to be used _together_
with TF, not as a replacement for TF.
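
A minimal sketch of that mix (mine, not from the Keras docs): standard Keras
layers for the common parts, with a raw TensorFlow op wrapped in a Lambda
layer for the custom part.

    import tensorflow as tf
    from keras.layers import Input, Dense, Lambda
    from keras.models import Model

    inputs = Input(shape=(16,))
    h = Dense(32, activation="relu")(inputs)  # standard Keras layer

    # Drop down to a raw TF op where Keras has no built-in equivalent.
    normalized = Lambda(lambda t: tf.nn.l2_normalize(t, dim=1))(h)

    model = Model(inputs, normalized)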

~~~
wdroz
You also created the backend abstraction that lets you customize a lot without
writing "raw TensorFlow". Thank you, Mr. Chollet.

------
dontreact
I think the tensorflow code could be improved a bit if the author knew about
the layers and losses modules.

    
    
      import tensorflow as tf
      import numpy as np

      X = tf.placeholder(tf.float32, shape=[None, 1])
      Y = tf.placeholder(tf.float32, shape=[None, 1])
      # tf.layers.dense needs `units`; an explicit input shape lets it
      # build the kernel.
      pred = tf.layers.dense(X, units=1, use_bias=False)
      cost = tf.losses.mean_squared_error(labels=Y, predictions=pred)
      optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
      with tf.Session() as sess:
          sess.run(tf.global_variables_initializer())
          for t in range(10000):
              x = np.random.random((1, 1))
              y = x * 3
              _, c = sess.run([optimizer, cost], feed_dict={X: x, Y: y})
              print(c)

------
_pmf_
Well, working with Android will tell you one thing: Google is abysmally bad at
designing APIs.

~~~
HillaryBriss
yeah. one of my favorite things is the collection of FLAG_* constants on
Intent: FLAG_ACTIVITY_CLEAR_TOP, FLAG_ACTIVITY_CLEAR_TASK,
FLAG_ACTIVITY_CLEAR_WHEN_TASK_RESET, FLAG_ACTIVITY_FORWARD_RESULT ...

the android SDK appears to lack a coherent, overarching concept or set of
guiding principles which the client programmer can internalize and rely upon.
there are many special cases and inscrutable behaviors.

like, sometimes you ask for a piece of work to be done by the SDK and it calls
you back on your implementation of a Listener class. but, other times, you
have to register a BroadcastReceiver. still other times, you have to override
OnActivityResult. but, then, there are these other times when you have to
create and provide a PendingIntent.

you go through enough of this stuff and you start to wonder: "Did there really
need to be soooo much variety in the way these SDK methods hand info back to
the client? Couldn't they have standardized this?"

it's almost like Google turns their (very smart) programmers loose on the SDK
and never looks back. google seems to just defer entirely to their opinions
and judgments. smart as these programmers are, they seem to have different
tastes, different approaches to API design.

and it all goes into the SDK.

------
guhcampos
I'm not an AI expert and barely understand how Deep Learning works. It was
always my impression that I am the target audience for Tensorflow: mainstream
tech workers from the corporate environment who want to add deep learning into
their toolbox, while people actually doing research on the matter seemed to
prefer Scikit. Wouldn't this justify the adoption of the declarative model and
the aggressive abstraction of internal workings?

~~~
chillee
Researchers definitely do not prefer Scikit. If anything, Scikit is a better
option than tensorflow for "mainstream tech workers from a corporate
environment".

------
amelius
Personally, I hate it that all these libraries are so much geared towards
neural networks.

Why can't we just have compute networks that can be used for anything, from
computational linear algebra to deep learning?

> Let’s be honest, when you have about half a dozen open source high-level
> libraries out there built on top of your already high-level library to make
> your library usable, you know something has gone terribly wrong

I consider that not a bug but a feature.

~~~
nnfy
Have you looked at the APIs for any of these libraries? I think there is a
perception that these are all NN-only because NN is the hot topic that
everyone is jumping on now; but tf, for example, is a general matrix
manipulation library with a bunch of amazing extra NN stuff shipped alongside
it.
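
A minimal sketch of that non-NN usage (mine): plain linear algebra with no
neural network anywhere in sight.

    import tensorflow as tf

    # Solve the linear system A x = b; TF runs it on CPU or GPU.
    A = tf.constant([[3.0, 1.0], [1.0, 2.0]])
    b = tf.constant([[9.0], [8.0]])
    x = tf.matrix_solve(A, b)

    with tf.Session() as sess:
        print(sess.run(x))  # -> [[2.], [3.]]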

~~~
homerowilson
This is a good observation! Tensorflow is a practical and flexible dataflow
implementation. See for instance Greta
([https://github.com/gavinsimpson/greta](https://github.com/gavinsimpson/greta)),
a framework for Bayesian stats built on tensorflow.

------
fujipadam
Don't flame me, but how does tensorflow compare with the Azure machine
learning platform? For me it provides a great platform for practitioners.

~~~
j2kun
I think these are not comparable. Tensorflow is a software library, Azure is a
compute environment which allows one to run, among many other libraries,
tensorflow implementations of ML models.

~~~
Arcsech
I think GP is referring to Azure Machine Learning Studio[1], which does seem
like it might be comparable to TF. That said, I don't know enough about either
to answer their question.

1: [https://studio.azureml.net](https://studio.azureml.net)

------
serveboy
Quick shout-out for DyNet! The API is so simple and intuitive even a
two-year-old (a prodigious one) can use it.

~~~
syllogism
DyNet is also _vastly_ faster than Tensorflow for realistic NLP models,
especially on CPU.

[http://dynet.readthedocs.io/en/latest/index.html](http://dynet.readthedocs.io/en/latest/index.html)

If you're doing the sort of work that would be published at ACL or EMNLP,
DyNet is a really good choice.

------
iluvmylife
Tensorflow does do really well when it comes to serving models in production.
Tensorflow Serving and TensorRT 3 are fairly throughput-efficient and
low-latency. PyTorch, for instance, does not have a good serving solution (I
guess that's where Caffe2 is useful).

------
gormanc
Hell, being able to effortlessly switch between PyTorch and
Numpy/SciPy/sklearn/skimage has been so helpful for the project I'm working
on. That and I have tensors in later layers whose shapes depend on the
training of the previous layers.

~~~
vaughngh
_I have tensors in later layers whose shapes depend on the training of the
previous layers._

Rad! Do you have any examples (or literature) that explains when this is
beneficial?

~~~
gormanc
Not yet! I'm not using convnets or backprop or anything so I don't think it
would be beneficial that way, but you could get something similar to what I'm
doing by looking at Fritzke's Growing Neural Gas[1]

[1] [http://papers.nips.cc/paper/893-a-growing-neural-gas-network...](http://papers.nips.cc/paper/893-a-growing-neural-gas-network-learns-topologies.pdf)

~~~
vaughngh
Neat, thanks for the link.

------
throw847333
A few more for your selection,

\- Bloated build system that is near impossible to get working - who even uses
maven?! Pytorch/Caffe are super simple to build in comparison; with Chainer,
it's even simpler: all you need is pip install (even on exotic ARM devices).

\- The benefits of all that static analysis simply aren't there. In addition,
PyTorch has a JIT compiler, which one can argue lets one have their cake and
eat it too.

\- Loops are extremely limited. Okay, we know RNNs/LSTMs aren't really TF's
thing, but if you venture out to do something out of the ordinary, even making
it batch-size invariant is difficult. There isn't even a map-reduce op that
works without knowing the dimension at compile time. You can hack something
together by fooling one of those low-level while_loop ops (see the sketch
below), but that just tells you how silly the whole thing is.
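
A minimal sketch of that kind of hack (mine): summing a vector of unknown
length by driving the low-level while_loop op by hand.

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None])  # length unknown until runtime
    n = tf.shape(x)[0]

    def body(i, acc):
        return i + 1, acc + x[i]

    # Explicit loop state (index, accumulator) instead of a simple reduce.
    _, total = tf.while_loop(lambda i, acc: i < n, body, [0, 0.0])

    with tf.Session() as sess:
        print(sess.run(total, feed_dict={x: [1.0, 2.0, 3.0]}))  # 6.0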

~~~
gormanc
I love that PyTorch kind of went all-in with anaconda. Building it is so much
easier than TF! I'm a recent convert but it's dang good.

~~~
orf
Why not pip?

~~~
gormanc
For installing it, yeah, pip is great too, but for building, conda includes
third-party tools and libraries and stuff. E.g., in order to use the MPI
backend for PyTorch's distributed processing you need to build it yourself,
and conda just makes that a bit easier. That, and I had a real bad experience
trying to build Tensorflow (and Bazel) to run on an HPC cluster.

------
jrs95
How do you even make a "Tensorflow Sucks" article and not mention CNTK?

------
bluetwo
So what would you include in the next-generation ML tool?

------
somesnm
For nice-looking basic statistics of your dataset, the pandas-profiling
library will do the trick:
[https://github.com/JosPolfliet/pandas-profiling](https://github.com/JosPolfliet/pandas-profiling)
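
A minimal usage sketch (mine; "dataset.csv" is a stand-in for your own data):

    import pandas as pd
    import pandas_profiling

    df = pd.read_csv("dataset.csv")
    # Builds an overview report (distributions, correlations, missing
    # values); it renders inline in a Jupyter notebook.
    report = pandas_profiling.ProfileReport(df)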

