
A promenade of PyTorch - astdb
http://www.goldsborough.me/ml/ai/python/2018/02/04/20-17-20-a_promenade_of_pytorch/
======
sytelus
To be fair, static graphs have their own advantages, specifically figuring
out how to distribute computation over heterogeneous units. I think TF will
still have the upper hand when distributing computation over 10K CPUs/GPUs,
but for everything else it's a major pain to tolerate. The major issue with
TF is its disappointing API design - especially the Estimator APIs and, in
many cases, even the lower-level APIs. I want to say WTF at every line.
Nothing is intuitive in that land. TF has jumped the shark here. In 1.5,
they replaced the old MNIST quick start in the documentation with a
dumbed-down Iris tutorial using the Estimator API. I can only say WTF?
Hopefully this disaster will divert enough newcomers to better things like
PyTorch or even CNTK.

~~~
cjalmeida
The Estimator and Dataset API docs are so bad I'm planning on starting a
blog with posts on running MNIST and a few other common datasets so others
don't have to suffer.

~~~
mrry
[Disclosure: I designed and wrote most of the docs for TensorFlow’s Dataset
API.]

I’m sorry to hear that you’ve not had a pleasant experience with tf.data.
One of the doc-related criticisms we’ve heard is that they aim for broad
coverage, rather than being examples you can drop into your project and run
straight away. We’re trying to address that with more tutorials and blog
posts, and it’d be great if you started a blog on that topic to help out the
community!

If there are other areas where we could improve, I’d be delighted to hear
suggestions (and accept PRs).

~~~
sytelus
No offense, but TensorFlow's Dataset API documentation also sucks. Combined
with bad API design (which could actually be used as a case study of bad
design in classrooms), it's a disaster in the making. For example, shuffle()
takes a mysterious argument. Why? It's nowhere to be explained in the docs
except for a note that it should be larger than the number of items in the
dataset. Why can't shuffle() just be shuffle(), and why do I now have to
remember to pass the correct parameter for the rest of my life? Whatever. I
still don't get what exactly repeat() does. Does it rewind back to the start
when you reach past the end? Why do you need it? Why not just stick to
epochs? Why make things complicated with steps vs epochs anyway? The docs
give zero clue. Then there is a whole bunch of mysteriously named,
unexplained methods like make_one_shot_iterator() or from_tensor_slices().
Why is make_one_shot_iterator() not just iterator()? Why do I have to
rebuild the dataset using from_tensor_slices()? The docs are designed from
the point of view of "take all this code calling mysteriously designed APIs,
copy-paste it, and don't bother too much about understanding what those APIs
really do". It really sucks.
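
For reference, here is roughly the incantation the TF 1.x docs expect you to
copy-paste, with comments answering my own questions (the data, buffer size,
and epoch count are made up for illustration):

    import tensorflow as tf

    features = [[1.0], [2.0], [3.0], [4.0]]   # toy in-memory data

    # from_tensor_slices() turns the outermost dimension into elements.
    dataset = tf.data.Dataset.from_tensor_slices(features)

    # shuffle()'s mysterious argument is a buffer size: it keeps that many
    # elements in memory and draws the next element at random from them.
    dataset = dataset.shuffle(buffer_size=4)

    # repeat() restarts the dataset when it is exhausted; repeat(3) is
    # roughly 3 epochs, and repeat() with no argument loops forever
    # (which is where "steps" instead of "epochs" comes from).
    dataset = dataset.repeat(3)

    # A one-shot iterator walks the dataset exactly once, no
    # re-initialization; it signals the end with OutOfRangeError.
    iterator = dataset.make_one_shot_iterator()
    next_element = iterator.get_next()

    with tf.Session() as sess:
        while True:
            try:
                print(sess.run(next_element))
            except tf.errors.OutOfRangeError:
                break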

~~~
cjalmeida
IMO, shuffle is something they did really well. Unlike PyTorch datasets, TF
allows streaming unbounded data. For something like this to work with
shuffle, it must cache some data before passing it down the pipeline. You
specify how much in the argument.

This may not seem useful for conventional training, where you usually work
with a fixed number of samples you know beforehand. But there are cases
where this is not true (for instance, in some special cases of
augmentation) - the streaming part is useful, but then you must use this
caching trick.

But I agree the API naming is not stellar, or at least should come with
better documentation.
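
A toy Python sketch of that caching trick - this is just the idea, not TF's
actual implementation:

    import random

    def buffered_shuffle(stream, buffer_size):
        # Keep at most buffer_size elements from a possibly unbounded
        # stream; once the buffer fills, emit a random buffered element
        # for each new element read.
        buf = []
        for item in stream:
            buf.append(item)
            if len(buf) >= buffer_size:
                yield buf.pop(random.randrange(len(buf)))
        random.shuffle(buf)   # drain whatever is left at end of stream
        yield from buf

A buffer smaller than the dataset only shuffles locally, which is why the
docs suggest a buffer size no smaller than the dataset when you want a
uniform shuffle.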

------
mlboss
Tensorflow Eager provides dynamic graph computation similar to PyTorch.
https://research.googleblog.com/2017/10/eager-execution-imperative-define-by.html
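
A minimal sketch of what eager mode looks like (the enable call lived under
tf.contrib.eager in the earliest releases, so the exact entry point below
may differ by version):

    import tensorflow as tf
    tf.enable_eager_execution()

    # Ops execute immediately and return concrete values -- no graph
    # construction, no Session.run().
    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y = tf.matmul(x, x)
    print(y)          # prints an actual tensor value
    print(y.numpy())  # convertible straight to a NumPy array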

~~~
cs702
Yes.

Yet in my experience PyTorch is much nicer to use. Its API feels very natural
and easy to extend -- it feels very Pythonic.

TensorFlow's API, on the other hand, seems to get in my way whenever I try to
do anything new that isn't already built into one of its higher-level APIs
(e.g., Keras). I frequently find myself _fighting_ with TensorFlow's API.

For iterative R&D/exploratory work, I find I'm more productive -- and happier
-- with PyTorch than TensorFlow.
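
As a toy illustration of what I mean by Pythonic (my own example, not from
the article) - the model is plain Python, and autograd traces whatever code
actually ran:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyNet(nn.Module):
        def __init__(self):
            super(TinyNet, self).__init__()
            self.fc1 = nn.Linear(4, 8)
            self.fc2 = nn.Linear(8, 2)

        def forward(self, x):
            # Ordinary Python control flow inside the model: the graph is
            # defined by running the code, so data-dependent branches work.
            h = F.relu(self.fc1(x))
            if h.norm() > 1.0:
                h = h * 0.5
            return self.fc2(h)

    net = TinyNet()
    out = net(torch.randn(1, 4))
    out.sum().backward()   # autograd differentiates whatever actually ran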

~~~
epberry
Agreed that PyTorch feels "pythonic". I think TensorFlow doesn't because
it's really a C++ API with an extensive set of wrappers, and it shows.
PyTorch feels like they started with Python and added the C extensions after
the fact.

------
hokkos
Maybe someone should extract the program flow from the AST to create the
compute graph, or create a restricted language that facilitates that. It
seems that in TF or PyTorch the syntactic sugar breaks down at some point
and you have to use the explicit API.

~~~
skierscott
Isn’t this what JITs do?

~~~
singhrac
There's a (very exciting) PyTorch JIT incoming sometime soon!
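
A sketch of what tracing-based JIT compilation looks like, using the
torch.jit.trace call from later PyTorch releases (so treat the exact API as
an assumption here):

    import torch

    def f(x):
        return x * 2 + 1

    # Tracing executes f once on example inputs and records the executed
    # ops into a static graph that can then be optimized or exported.
    traced = torch.jit.trace(f, torch.randn(3))
    print(traced(torch.randn(3)))  # runs the recorded graph
    print(traced.graph)            # the recovered compute graph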

------
yvsong
Need an official tool to convert to CoreML.

~~~
ipsum2
Check out https://attardi.org/pytorch-and-coreml for an example of writing
in PyTorch and using ONNX to convert to CoreML.
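
A rough sketch of that path - torch.onnx.export is PyTorch's exporter, while
the onnx-coreml package and its convert() call are assumptions about the
conversion step, not an official tool:

    import torch
    import torchvision

    # Export a PyTorch model to ONNX by tracing it with a dummy input.
    model = torchvision.models.squeezenet1_1(pretrained=True).eval()
    dummy = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy, "squeezenet.onnx")

    # Convert the ONNX file to CoreML (pip install onnx-coreml).
    from onnx_coreml import convert
    mlmodel = convert("squeezenet.onnx")
    mlmodel.save("squeezenet.mlmodel")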

------
stealthcat
Seeing that PyTorch started as a Chainer 'fork' (?) with a similar API but a
different backend shows how much a BigCo can do against a startup; it's
unsettling.

Chainer's autograd has been around for years.

~~~
agibsonccc
As a DL framework author myself, I can say for a fact we all steal good ideas
from each other. There's no conspiracy here.

A lot of folks not actually building the tools like to assume there's some big
war going on where we're trying to sabotage each other.

In reality, we're all just scratching our own itch. This goes for PyTorch
and TF as well as for us.

Yeah there's occasional public debate, but we're not out to go to "war" with
each other or anything. It's the end users that make this out to be something
it's not.

Just food for thought here.

~~~
thanatropism
TensorFlow is hardly a cat scratch fever; it's a major strategic effort by a
top-3 global tech conglomerate.

~~~
agibsonccc
Yes, and so are the other frameworks. When I said "scratching our own itch",
Google is building it for its own strategic initiatives, ranging from hiring
to Google Cloud. FB uses PyTorch for their research, Caffe2 for deployment.
CNTK is used by Microsoft. ONNX was a joint effort between FB and Microsoft
to compete against TF's file format.

I'm not sure what your point is. What I was attacking was this "conspiracy"
theory that startups and these companies are somehow out to get each other.
Of course there is competition, and there are various strategic reasons
folks implement their own frameworks. We ourselves implement our own
framework that imports all the Python frameworks and runs them in production
on the JVM and big data stack. Imagine that - I do it to sell licenses.

The "startup" the above was alluding to was PFN in Japan. FB supposedly
"stole" ideas from chainer. And yes they did, they even say so as such.

It doesn't mean there's a conspiracy; it's just the smart thing to do. If
something is working, adapt it for your use case. That's all that happens
across any of the major frameworks.

I'm not sure if I undersold myself a bit, but I just want to say this is all
I've been doing since 2013. My framework is widely used by a great portion
of the Fortune 500 and all over the globe, and is part of a major
foundation. It's not some toy; it's a commercial venture with millions in
funding, a decent-sized engineering team, and an open source
foundation/community behind it.

I'm more than familiar with the space and even compete with Google's
business model to a certain extent. I have plenty of incentive to care about
these things, but I'm still calling it out for what it is. I talk to other
framework authors and have nothing but good things to say about them. We're
all out there just building what we need to suit our purposes. Yes, those
things have incentives, but it doesn't mean there need to be conspiracies
and trash talk.

I'll call it out again: The users are the ones who blow this stuff way out of
proportion. I've seen this play out for years now.

