
Named Tensors - tosh
http://nlp.seas.harvard.edu/NamedTensor
======
andyljones
This is a good but old article, and PyTorch has in fact integrated a lot of
these suggestions over the past two years.

[https://pytorch.org/docs/stable/named_tensor.html](https://pytorch.org/docs/stable/named_tensor.html)

Integration's still ongoing, mind you, so it's not yet a complete replacement
for 'traditional' indexing.
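
For anyone who hasn't tried it, here is a minimal sketch of what the linked docs describe. It assumes a PyTorch build where the prototype named-tensor API is available; the dimension names and shapes are made up for illustration.

```python
import warnings

import torch

# Named tensors are still a prototype feature and warn on use.
warnings.filterwarnings("ignore", category=UserWarning)

# A batch of 2 images, 3 channels, 4x5 pixels, with a name on every dimension.
imgs = torch.zeros(2, 3, 4, 5, names=("N", "C", "H", "W"))

# Reduce by name instead of remembering that channels sit at position 1.
gray = imgs.sum("C")

# Reorder to channels-last by name rather than by a permutation of integers.
channels_last = imgs.align_to("N", "H", "W", "C")

print(gray.names)           # names of the dimensions that remain
print(channels_last.shape)  # sizes after reordering
```

Ops that don't support names yet are where you still have to fall back to positional indexing.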

------
npr11
Similar functionality is available in Julia via
[NamedDims.jl](https://github.com/invenia/NamedDims.jl/).

And in other packages, such as
[AxisArrays.jl](https://github.com/JuliaArrays/AxisArrays.jl/)
and
[AxisKeys.jl](https://github.com/mcabbott/AxisKeys.jl).
In fact, the AxisKeys README has [a nice overview of packages with similar
functionality](https://github.com/mcabbott/AxisKeys.jl#elsewhere).

~~~
mcabbott
Right, NamedDims.jl is the one closest to what the article describes. Very simple
and light-weight, plays well with others.

There is a small zoo of packages that also attach labels along the indices, à la
Python's xarray. Perhaps it's a little too easy to write such things... but
once we understand the design space the hope is to quietly take most of them
to the woodshed.

------
sgillen
I’ve been using Xarray recently (mentioned as an implementation of named
tensors in the article) and I have to say I like it a lot. It’s really a game
changer and makes my code a lot more readable.
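
A rough sketch of the readability difference, assuming `numpy` and `xarray` are installed. The dimension names, city labels, and values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Two time steps for three cities; every axis carries a name and labels.
temps = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("time", "city"),
    coords={"time": [0, 1], "city": ["Oslo", "Lima", "Pune"]},
)

# Reduce over a named dimension instead of writing axis=0.
mean_per_city = temps.mean(dim="time")

# Select by label instead of remembering that Lima is column 1.
lima = temps.sel(city="Lima")

print(mean_per_city.dims)
print(lima.values)
```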

~~~
apawloski
Just wanted to give a major +1 to xarray. It is an excellent approach to
labeled n-dimensional arrays, and has a good ecosystem/community in the Earth
Science domains.

One nice integration with xarray is with an optimized data store like Zarr
[1], TileDB [2] or Cloud-Optimized GeoTIFFs [3], where you can pull only the
pieces of the nd-array relevant to your problem, rather than downloading a
large monolithic netCDF4 file.

Another great integration in the xarray ecosystem is Dask [4], which is a very
thoughtful abstraction over parallelizing computations across your nd-array.

[1]
[https://zarr.readthedocs.io/en/stable/](https://zarr.readthedocs.io/en/stable/)

[2] [https://tiledb.com/](https://tiledb.com/)

[3] [https://www.cogeo.org/](https://www.cogeo.org/)

[4] [https://dask.org/](https://dask.org/)

------
alpineidyll3
I hate that this feature isn't stable yet. I actually ended up writing my own
just to guarantee a stable interface.

The idea of having every operation derive output names is probably more trouble
than it's worth. Random new ops will never support it, so results will never be
predictable.

Still this is all better than pandas. <Shudder> pandas.

~~~
rmrfstar
> I actually ended up writing my own just to guarantee a stable interface.

Yeah, I do the same. Just write a thin wrapper and hook torch in the
__init__.py.

But ragging on pandas isn't cool. Pandas is probably the most important factor
driving Python's popularity over the past decade.

~~~
alpineidyll3
Why is Python's popularity a good thing in itself? If all the people making
plots in pandas just used R, the world would be very much the same.

Pandas promotes bad software practices, in my experience, with its horrible,
bloated interface. It has cost me a lot of time, for no benefit whatsoever. I
am actually pretty sure it was released to slow the pace of competing
quantitative hedge funds.

------
dang
Discussed at the time:
[https://news.ycombinator.com/item?id=18823777](https://news.ycombinator.com/item?id=18823777)

------
amelius
Is this like how, in R, you can assign names to rows/columns?

------
noctune
TensorFlow's protobuf definitions actually support naming dimensions:
[https://github.com/tensorflow/tensorflow/blob/master/tensorf...](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor_shape.proto#L24)

Too bad nothing really uses or propagates those names though.

------
canjobear
I’d love for something like this to be more standard. Writing complex stuff in
PyTorch reminds me of writing assembly, except instead of constantly keeping
track of state you have to constantly keep track of what the axes mean. Ideas
like named tensors will have to become standard if “differentiable
programming” is going to be more like regular programming.

------
nyanpasu64
"named dimensions" feels like a dynamically typed version of "giving each
dimension index a distinct type", with the added ability to select nesting
levels by name.

------
CyberDildonics
Is this not just making a map from strings to the integer index of each
dimension? It seems like something that would be done with enums or hash maps.

~~~
canjobear
It would also change the broadcasting rules.
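
To make that concrete, here is a pure-Python sketch that only computes result shapes. `broadcast_by_position` follows the usual NumPy-style rule; `broadcast_by_name` is a hypothetical helper showing one possible name-based rule (roughly what xarray does). PyTorch's prototype takes a different route, as far as I understand it: it still aligns by position but checks that the names agree.

```python
def broadcast_by_position(shape_a, shape_b):
    """NumPy-style rule: align shapes from the right; size 1 stretches."""
    result = []
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"sizes {a} and {b} do not broadcast")
        result.append(b if a == 1 else a)
    return tuple(reversed(result))


def broadcast_by_name(dims_a, dims_b):
    """Name-based rule: match dimensions by name; unmatched names are kept."""
    result = dict(dims_a)
    for name, size in dims_b.items():
        if name in result and result[name] != size:
            raise ValueError(f"dimension {name!r} has conflicting sizes")
        result[name] = size
    return result


# Positional rule: the 10 "features" silently line up with the 10 "time" steps.
positional = broadcast_by_position((32, 10), (10,))

# Name-based rule: "feature" is a different axis, so it gets its own dimension.
named = broadcast_by_name({"batch": 32, "time": 10}, {"feature": 10})

print(positional)
print(named)
```

So a plain name-to-index map gets you nicer indexing, but the second behaviour needs the names to take part in the arithmetic itself.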

------
Konohamaru
A tensor is a multilinear map that is invariant under coordinate change? What
do those have to do with machine learning?

~~~
the_svd_doctor
It basically means a multidimensional array in ML.

~~~
Konohamaru
That's a severe abuse of terminology, almost to the point of ignorance. A
tensor is an algebraic object (not limited to linear algebra) with a very
specific definition.

~~~
7402
What's going on here is a collision between the way certain terms are used in
two different fields.

People who learn about tensors and vectors in the culture of Mechanical
Engineering, Applied Mathematics, or Physics are strenuously trained that a
"vector" is not just any list of numbers, but a specific mathematical object
with specific properties. (Similarly with tensors.) In that culture, the use
of the term "vector" the way it is used in Software Engineering looks quite
odd. An undergraduate might even get a problem set that shows various number
triples and asks which ones are vectors and which ones aren't.

I was educated under that system, but spent years as a software engineer, so I
am comfortable switching between the two contexts. When I wear my Software
Engineer hat, I am quite comfortable with the generalized use of vector and
tensor as done in software and ML, but for someone who has only grown up in
one culture, it looks very wrong.

