
Tensor Considered Harmful Part 2 - ghosthamlet
http://nlp.seas.harvard.edu/NamedTensor2
======
amelius
By the way, why don't we just call them "arrays"?

Perhaps I'm missing something, but (practical) deep learning seems to have
little tensor algebra going on, just operations on slices of arrays.

~~~
nabla9
Mathematically, tensors are multilinear maps, not data structures. Tensors in
deep learning typically represent multilinear mappings forwards and backwards.
Scalars, vectors and matrices can also be tensors when they represent
mappings.

If you think about how a normal fully connected feedforward network learns,
it's outer products of derivatives. Inference is a sequence of affine
transformations plus nonlinearities.
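
A minimal NumPy sketch of that view (shapes and names are my own, purely for
illustration): one dense layer's inference is an affine map followed by a
nonlinearity, and in learning the weight gradient is literally an outer
product.

    import numpy as np

    # Inference: one fully connected layer is an affine transformation
    # followed by a nonlinearity.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(4)        # input vector in R^4
    W = rng.standard_normal((3, 4))   # weights: a linear map R^4 -> R^3
    b = rng.standard_normal(3)        # bias

    z = W @ x + b                     # affine transformation
    y = np.tanh(z)                    # nonlinearity

    # Learning: given an upstream gradient dL/dy, the gradient with
    # respect to W is the outer product of the local error and the input.
    dL_dy = rng.standard_normal(3)    # stand-in for the loss gradient
    delta = dL_dy * (1 - y**2)        # backprop through tanh (tanh' = 1 - y^2)
    dL_dW = np.outer(delta, x)        # shape (3, 4), same as W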

~~~
thanatropism
We've got a winner. Tensors "do" something; they're functions.

In the rank-1 and rank-2 cases these functions can be represented as objects
familiar from linear algebra. Rank-3 tensors are not the same as stacked
rank-2 tensors, however.
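
To make that concrete, a toy NumPy sketch (my own notation): the array of
numbers only becomes a tensor once you treat it as a multilinear map with
fixed input slots.

    import numpy as np

    rng = np.random.default_rng(0)
    u, v, w = rng.standard_normal((3, 4))     # three vectors in R^4

    # A rank-2 tensor is a bilinear map: two vectors in, one scalar out.
    M = rng.standard_normal((4, 4))
    s2 = np.einsum('ij,i,j->', M, u, v)

    # A rank-3 tensor is a trilinear map: three vectors in, one scalar out.
    # The 4x4x4 array is just its coordinates; the function is the tensor.
    T = rng.standard_normal((4, 4, 4))
    s3 = np.einsum('ijk,i,j,k->', T, u, v, w)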

I wonder if "neural networks as ODEs" will catch on. I could use some of my
master's work and go get a PhD.

~~~
twtw
Is it accurate to say that a rank-n tensor is to an n-D array of numbers as a
linear transformation is to a matrix?

~~~
krastanov
For a mathematician or a physicist, yes. But the word ended up having new,
confusing meanings.

~~~
nabla9
It's the same from the ML perspective as well.

------
mcabbott
Index notation is often less confusing than numbered dimensions, without
requiring you to use an explicit NamedTensor.

Perhaps of interest, here's a quick prototype of how this could be useful. If
you use an index "n"-for-number on a dimension for which you previously used
an index "c"-for-colour (when performing any einsum, slicing or reshaping
operation), you will get a warning:

[https://github.com/mcabbott/TensorSlice.jl#checking](https://github.com/mcabbott/TensorSlice.jl#checking)
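
The linked prototype is in Julia, but the same index-notation idea is visible
in PyTorch's einsum, where letters name each operand's dimensions inline (a
rough analogue of the idea, not the linked package):

    import torch

    images = torch.randn(32, 3, 28, 28)   # batch, channel, height, width
    weights = torch.randn(10, 3)          # class, channel

    # The index letters document each dimension inline, so contracting
    # the wrong axis ('c' against batch, say) is visible at a glance.
    logits = torch.einsum('bchw,kc->bkhw', images, weights)  # (32, 10, 28, 28)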

------
sja
A couple critiques for this article’s argumentative approach, as well as the
proposed library. My expertise is not in Torch, but I believe these thoughts
are valid on a more general level.

Yes, the ability to add names to dimensions of Tensor objects would be useful.
Accessing components by named dimension instead of numeric indices (assuming
clean syntax) would reduce mental overload for both readers and writers of
code. However, the examples provided do a poor job of making the case that
people should use this library. I’ll try breaking the issues down into these
buckets:

* Awkward UX

* Muddied and uninteresting examples

* Missing the point

---

Awkward UX

First and foremost, it’s hard to ignore that this API is not fully fleshed
out. I appreciate that the author is explicitly seeking feedback and taking an
iterative approach, but it feels like more time needed to be spent thinking
about how best to fix the problem at hand from a user’s perspective.

Since this post explicitly attempts to take a pragmatic approach, it’s
important to understand that pragmatic users will only tolerate as much
user-unfriendliness from a library as the value it provides justifies.
Idealistic users will happily saddle themselves with difficult or painful
libraries, but pragmatic users demand value for pain. And unfortunately, while
it is useful to have named dimensions in a Tensor, it is not so useful that a
user should be expected to completely change the way they write their code.
This library asks for more than I’d expect a client to reasonably sign up for.
A couple of examples:

* Forcing users to use a vaguely named op() method to wrap functions, but only sometimes (for functions that “change dimensions”?); see the sketch after this list.

* Wrapping Torch library functions, but changing their signature along the way (e.g. the sample() method for distributions).

* Incompatibility between NamedTensor and base Tensor with raw Torch (i.e. if a client uses NamedTensor, they don’t get to use the rest of the Torch ecosystem as-is without manual access to the underlying Tensor object).
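
To illustrate the first point, here is a toy sketch of the wrapping pattern
being criticized. Everything here (the ToyNamed class, its constructor and its
op() method) is my own invention for illustration, not the library’s actual
API.

    import torch
    import torch.nn.functional as F

    class ToyNamed:
        """Toy stand-in for a named-dimension wrapper; NOT the real library."""
        def __init__(self, tensor, names):
            self.tensor, self.names = tensor, names

        def op(self, fn, **dim_kwargs):
            # Translate dimension names to positions, then call through.
            kwargs = {k: self.names.index(v) for k, v in dim_kwargs.items()}
            return ToyNamed(fn(self.tensor, **kwargs), self.names)

    x = ToyNamed(torch.randn(32, 10), names=("batch", "classes"))

    # Plain Torch: call the function directly on the tensor.
    y_plain = F.log_softmax(x.tensor, dim=1)

    # Named wrapper: the same call must go through op(), the extra
    # ceremony (and the "only sometimes" rule) the critique points at.
    y_named = x.op(F.log_softmax, dim="classes")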

Note that I’m not commenting on some of the controversial decisions here (such
as not allowing explicit dimension access by index). There’s discussion to be
had there, too, but I want to focus on the decisions that were made
implicitly, as that’s where many UX issues can arise.

The author seems open to taking advice and iterating from there. My
recommendation would be to iterate on the API even earlier: _before_ writing
implementation code. It is faster to type out some example client uses and see
what people think than to implement a working version and find out it fails in
some cases. You’ll also find it easier to take a step back, as you won’t have
spent as much time on code you’ve already written. This may allow for a fresh
perspective on what would make the most useful/usable API.

---

Muddied and uninteresting examples

Several of the examples given make it hard to compare the original version of
the code with the proposed version. This is most obvious in the MNIST example.
The default version uses a “reassign to the same variable” approach, while the
proposed version uses method chaining. Some of this may be due to the issues
mentioned above, but I want to note that it detracts from the author’s
argument, as it distracts a reader who may otherwise be on board.

The examples given aren’t particularly inspiring, either. The “Experiments on
Canonical Models” section includes toy examples, many of which are just as
readable (if not more so) in the original format as they are in the
NamedTensor version.

---

Missing the point

At the end of the day, there’s a large stack of reasons why working on adding
named dimensions to Tensors is useful. After the first article came out, it
seems the author received a fair amount of commentary on how pragmatic their
proposed approach was. Unfortunately, I feel the author took “pragmatic” to
mean “how would one literally use the library as it stands” rather than as a
challenge to consider the practicalities of the end user and the tradeoffs
that need to be made between the ideals of the author and the needs of the
client.

I think the main failing of this library/article is that instead of homing in
on a narrowly scoped thesis of “how Tensors should have their dimensions
named”, it ends up distracted by the quirks of the way the named tensor
library was implemented. I think the author may benefit from taking a step
back from what they have and thinking about how to achieve their goals in a
more cohesive way.

~~~
whyever
> Accessing components by named dimension instead of numeric indices (assuming
> clean syntax) would reduce mental overload for both readers and writers of
> code.

Couldn't this be solved (at least for the reader) by using descriptive
variable names when accessing the dimensions?

~~~
krastanov
This is exactly how I solve the problem for myself. However, if this were a
language feature it would be neater, more concise, automatically verifiable,
and universally legible to everyone.
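
For contrast, a small Python sketch of both styles. The first half is the
convention the parent describes; the second half assumes a PyTorch version
with the experimental named-tensor API, where the name travels with the tensor
and is checked:

    import torch

    BATCH, CHANNEL = 0, 1            # descriptive names, by convention only

    x = torch.randn(32, 3)
    s1 = x.sum(dim=CHANNEL)          # readable, but nothing verifies it

    # With named dimensions, the name is attached to the tensor itself
    # and misuse is caught automatically:
    y = torch.randn(32, 3, names=("batch", "channel"))
    s2 = y.sum("channel")            # checked against the tensor's own names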

