
How Transformers Work (2019) - bra-ket
https://towardsdatascience.com/transformers-141e32e69591
======
lowdose
Great visualization in this post.

Could neural networks in general be compared to the way information is stored
in DNA?

In this metaphor, the encoder/decoder is the process and the data loop is life
itself. The training conditions of the NN are constrained so that it can store
small variational changes, and success accumulates through survival over many
trial-and-error experiments. Just as animals do in nature, NNs can evolve into
sophisticated local-optimum solutions for specific game environments, as
DeepMind and OpenAI have shown.
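
That evolutionary picture maps fairly directly onto a simple (1+lambda)
evolution strategy. Here's a minimal sketch in plain numpy, with a made-up
quadratic fitness function standing in for a real game environment:

    import numpy as np

    # Toy stand-in for a game environment: fitness peaks when the "genome"
    # (a parameter vector) matches a hidden target. Purely illustrative.
    rng = np.random.default_rng(0)
    target = rng.normal(size=16)

    def fitness(genome):
        return -np.sum((genome - target) ** 2)

    # (1+lambda) evolution strategy: small variational changes (mutations),
    # and success accumulates by survival of the fittest candidate.
    parent = np.zeros(16)
    for generation in range(300):
        offspring = parent + 0.1 * rng.normal(size=(32, 16))  # 32 mutated copies
        best = max(offspring, key=fitness)
        if fitness(best) > fitness(parent):  # survival: keep only improvements
            parent = best

    print(fitness(parent))  # climbs toward 0 as the lineage nears the optimum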

~~~
1e
in my experience, it is more useful to view neural nets as geometric
transformations - via stateful functions - that map points in input space (eg
a sentence written in english) to points in some other space (eg the same
sentence written in french).

by viewing neural nets (and machine learning in general) from a mathematical
perspective, you can readily exploit an entire field of tools and techniques
(eg numerical optimization) and clearly define objective functions to train
against - benefits that you don't necessarily get by viewing ml from a
biological perspective.
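
to make that concrete, here's a minimal sketch in plain numpy (toy data
invented purely for illustration): a one-hidden-layer net is just a
parameterized map from input space to output space, and training is nothing
more than numerical optimization of a mean-squared-error objective over the
parameters.

    import numpy as np

    # toy regression: learn the map x -> sin(x) (made-up data, illustrative only)
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(256, 1))
    Y = np.sin(X)

    # a one-hidden-layer net: a geometric transformation with parameters W1, b1, W2, b2
    W1 = rng.normal(0, 0.5, size=(1, 32)); b1 = np.zeros(32)
    W2 = rng.normal(0, 0.5, size=(32, 1)); b2 = np.zeros(1)

    def forward(x):
        h = np.tanh(x @ W1 + b1)  # map input into a 32-d intermediate space
        return h @ W2 + b2        # map that into output space

    # objective function to train against: mean squared error
    def loss():
        return np.mean((forward(X) - Y) ** 2)

    # plain gradient descent with hand-derived gradients (backprop for this tiny net)
    lr = 0.1
    for step in range(2000):
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        err = 2 * (pred - Y) / len(X)     # dL/dpred
        gW2 = h.T @ err;  gb2 = err.sum(0)
        dh = (err @ W2.T) * (1 - h ** 2)  # backprop through tanh
        gW1 = X.T @ dh;   gb1 = dh.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2

    print(loss())  # should be small: the learned map now approximates sin on [-3, 3]

the point: everything here is plain geometry and calculus - no biological
machinery required.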

------
heinrichf
This looks like a dumbed-down/partly stolen version of
http://jalammar.github.io/illustrated-transformer/ (and his other posts).
towardsdatascience usually screams "low quality content" :/

~~~
livingmargot
I know this is a useless comment, but thank you for the link! It finally
helped me grasp this model a bit better; OP's link is kind of garbage...

~~~
heinrichf
Not useless at all! You're welcome, and I'm happy to hear that my opinion of
the above link is shared by someone ;)

------
pjmlp
And here I was thinking this would be a post about how physics applies to
Transformer robots' changes of state.

