How Transformers Work (2019)

lowdose · on Jan 5, 2020

Great visualization in this post.

Could neural networks in general be compared to the way information is stored in DNA?

In this metaphor, the encoder/decoder would be the process and the data loop would correspond to life. The training conditions of the NN are constrained so that it can store small variational changes, and success accumulates through survival over many trial-and-error experiments. Like animals in nature, NNs can evolve into sophisticated, locally optimal solutions for specific game environments, as DeepMind and OpenAI have shown.

1e · on Jan 6, 2020

in my experience, it is more useful to view neural nets as geometric transformations - via stateful functions - that map stuff in input space (eg a sentence written in english) to stuff in some other space (eg the same sentence written in french).

by viewing neural nets (and machine learning, in general) from a mathematical perspective, you can readily exploit an entire field of tools and techniques (eg numerical optimization) and clearly define objective functions to train against - benefits that you don't necessarily get by viewing ml from a biological perspective.
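
a minimal sketch of that view, not a transformer: the "network" here is a single 2x2 weight matrix (its state), the objective is mean squared error, and plain gradient descent recovers a geometric transformation from example pairs. the target map (a 90 degree rotation), the learning rate, the step count, and all function names are assumptions chosen for illustration.

```python
import random

def apply(W, x):
    """Apply the 2x2 linear map W to the 2-D point x."""
    return [W[0][0] * x[0] + W[0][1] * x[1],
            W[1][0] * x[0] + W[1][1] * x[1]]

def loss(W, data):
    """Objective: mean squared error between apply(W, x) and the target y."""
    total = 0.0
    for x, y in data:
        p = apply(W, x)
        total += (p[0] - y[0]) ** 2 + (p[1] - y[1]) ** 2
    return total / len(data)

def train(data, steps=500, lr=0.1):
    """Fit W by plain gradient descent on the mean squared error."""
    W = [[0.0, 0.0], [0.0, 0.0]]  # the learned state starts at zero
    n = len(data)
    for _ in range(steps):
        grad = [[0.0, 0.0], [0.0, 0.0]]
        for x, y in data:
            p = apply(W, x)
            for i in range(2):
                err = p[i] - y[i]
                for j in range(2):
                    # d/dW[i][j] of (p[i] - y[i])^2, averaged over the data
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(2):
            for j in range(2):
                W[i][j] -= lr * grad[i][j]
    return W

# Hypothetical target: rotate points in the plane by 90 degrees.
TARGET = [[0.0, -1.0], [1.0, 0.0]]

rng = random.Random(0)  # fixed seed so the run is repeatable
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
data = [(x, apply(TARGET, x)) for x in points]

W = train(data)
print("learned map:", [[round(w, 3) for w in row] for row in W])
print("final loss:", loss(W, data))
```

the same recipe (a parameterized map, an explicit objective, a numerical optimizer) is what carries over to larger models; only the map gets more elaborate.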

heinrichf · on Jan 5, 2020

This looks like a dumbed down/partly stolen version of http://jalammar.github.io/illustrated-transformer/ (and his other posts). towardsdatascience usually screams "low quality content" :/

livingmargot · on Jan 6, 2020

I know this is a useless comment, but thank you for the link! It finally helped me grasp this model a bit better. OP's link is kind of garbage...

heinrichf · on Jan 6, 2020

Not useless at all, you're welcome and I'm happy to hear that my opinion about the above link is shared by someone ;)

pjmlp · on Jan 5, 2020

And here I was thinking this would be a post about how physics applies to Transformer robots' changes of state.