

Aetherial Symbols - etiam
https://drive.google.com/file/d/0B8i61jl8OE3XdHRCSkV1VFNqTWc/edit?pli=1

======
araes
Such an important concept, with a title that's chosen for cleverness rather
than clarity. I'm glad somebody finally published this. Short version:

- words are the symbolic indicators of thought vectors

- each word carries a probabilistic stream of potential further thoughts and
links to past symbols.

- much like implicit CFD, they are backward-convolved with prior words to
determine the most likely hidden thought, and then forward-solved to
determine the next word.

- further, these streams are described by formal logic relationships based on
the identities of the included words, which can have levels of
"meta-identity" (i.e. I can't know some pair are brother and sister without
having been given the idea of bros/sis pairs or having seen others)

- knowledge of more and more varied relationships (and more logic paths)
provides more efficient / accurate ways to solve an optimized path through
the higher dimensions of word / symbol space.

- in a sense, I may never be given the idea of "bros / sis", but it is
probabilistically highly likely that a male and female with the same parents
are also bros / sis.
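A toy sketch of that backward/forward picture (the vocabulary, the averaging,
and the dot-product scoring are all invented for illustration; a real model
would use a learned recurrent network, not a plain average): combine the
context words into one "thought vector", then score candidate next words
against it.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["the", "dog", "chased", "cat", "ball"]
emb = {w: rng.normal(size=8) for w in vocab}   # toy word vectors

def thought_vector(context):
    """'Backward' step: squash the prior words into one hidden thought.
    (Stand-in for the learned recurrent combination described above.)"""
    return np.mean([emb[w] for w in context], axis=0)

def next_word_probs(context):
    """'Forward' step: turn the thought vector into a distribution over
    candidate next words via dot-product scores and a softmax."""
    h = thought_vector(context)
    scores = np.array([emb[w] @ h for w in vocab])
    e = np.exp(scores - scores.max())
    return dict(zip(vocab, e / e.sum()))

probs = next_word_probs(["the", "dog"])
```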

So close to the tech tree ding.

------
comex
Interesting. I'd love to see the output of this 25-way machine translation
neural network.

Still... I'm not very knowledgeable about AI, but the part of this that
describes biological mechanisms really feels handwavy to me.

 _What we have in our heads is not a cleaned up version of the input. There
are no pixels or symbol strings in the head._

 _All we have in our heads is big activity vectors that cause more big
activity vectors._

"Big activity vectors" means nothing by itself. What is the structure of these vectors?
Even if the brain's image processing turns out to have a lot of random
complications and exceptions that fit the data your eyes have seen before
(even if these innumerable complications are _necessary_ for it to work well),
surely some god with perfect understanding of it could still summarize it as
_generally_ following such-and-such algorithms. Even if you don't know what
the neurons in your model are for, I bet most of them are in fact "for"
something comprehensible by mere humans, and understanding that would make it
easier to design good starting conditions for models. Not that I can prove
it...

(Note: I suspect that the author of this presentation has a more nuanced view
than what I'm complaining about - or maybe I'm just misinterpreting it. This
comment is just intended to start discussion about what's written on the
slide.)

~~~
mdda
The 'big activity vector' language is partly a reference to the Vector Word
Embedding stuff (see [1] for an explanation). The surprising thing about that
is that it is possible to learn an embedding of individual words in a
multi-dimensional vector space, such that (for example) vector(Queen)-vector(Woman)
~= vector(King)-vector(Man). Which is to say that there's a general 'royalty'
direction within the vector space - and all this can be learned purely from
seeing large amounts of English text (no 'traditional' supervised training).
Perhaps a 'god' could identify the meaning of each direction in the space (or
region of words), but the big Machine Learning labs (Google, Baidu, Montreal,
Stanford, Facebook, etc) are proving that purely manipulating the vectors in
the abstract works really well.
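That analogy arithmetic can be demonstrated with hand-built toy vectors (the
3-d embeddings below are invented purely for illustration; real learned
embeddings have hundreds of dimensions):

```python
import numpy as np

# Hypothetical 3-d embeddings: dimension 0 acts as a "royalty" direction,
# dimension 1 as a "gender" direction, dimension 2 as filler.
emb = {
    "king":  np.array([0.9,  0.8, 0.1]),
    "queen": np.array([0.9, -0.8, 0.1]),
    "man":   np.array([0.1,  0.8, 0.3]),
    "woman": np.array([0.1, -0.8, 0.3]),
}

def analogy(a, b, c, emb):
    """Return the word whose vector is closest (by cosine) to a - b + c."""
    target = emb[a] - emb[b] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return max(emb, key=lambda w: cos(emb[w], target))

print(analogy("king", "man", "woman", emb))  # -> queen
```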

In addition to the 'word vectors' as inputs, the RNNs illustrated are also
iterating over an internal state (flowing from left to right through the same
network for each new word) - and this internal state is also an embedding of
some kind. But it's going to be very difficult to decipher what each dimension
here represents, as it's being built purely as a function of the input word
vectors, its own previous state and a NN with initially random weights.
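A minimal sketch of that iteration, assuming a plain vanilla RNN cell with
tanh activation and random stand-in word vectors (sizes and weights are
arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 8          # word-vector size, hidden-state size (arbitrary)

# Initially random weights, as described above.
W_xh = rng.normal(scale=0.1, size=(d_h, d_in))
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))
b_h = np.zeros(d_h)

def rnn_step(h_prev, x):
    """One step: the new state is a function of the input word vector
    and the network's own previous state."""
    return np.tanh(W_xh @ x + W_hh @ h_prev + b_h)

h = np.zeros(d_h)                      # state before any word is read
for x in rng.normal(size=(5, d_in)):   # five stand-in "word vectors"
    h = rnn_step(h, x)                 # same weights reused at every step
```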

Now, although actual 'brain experiments' have shown that individual neurons
(or local clusters) apparently light up when particular thoughts are had
(alternatively, cause thoughts to be had), each cluster seems likely to be
just one aspect of (say) 'dogginess'. So, one area will correspond to the
smell of dogs, others to wet noses, others to being outdoors (i.e. all aspects
of the overall 'dogginess' concept) - but these things will all overlap in
multiple ways with other concept 'vectors'. Which is how huge spaces of ideas
are searched in parallel, rather than sequentially (using, say, an
is_doggy_quality symbol).

There are also parallels here with the Numenta Sparse Distributed
Representations [2].
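As a rough illustration of the SDR idea (the sizes loosely follow Numenta's
examples; the "concepts" are invented), similarity between two concepts is
simply the overlap of their active bits:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 2048, 40            # vector size and active-bit count (~2% sparsity)

def random_sdr():
    """Sparse binary vector: exactly k of n bits active."""
    v = np.zeros(n, dtype=bool)
    v[rng.choice(n, size=k, replace=False)] = True
    return v

dog_smell = random_sdr()

# A related concept ("wet nose") shares half its active bits with dog_smell;
# the rest are drawn from bits inactive in both, so overlap is exactly k // 2.
wet_nose = np.zeros(n, dtype=bool)
wet_nose[rng.choice(np.flatnonzero(dog_smell), size=k // 2, replace=False)] = True
wet_nose[rng.choice(np.flatnonzero(~(dog_smell | wet_nose)),
                    size=k - k // 2, replace=False)] = True

unrelated = random_sdr()   # overlaps dog_smell only by chance (near zero)

def overlap(a, b):
    return int((a & b).sum())
```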

Overall, this presentation seems to be probing at the frontier of what works,
and how to leverage that up into something that's more about 'general
thinking' rather than pattern matching. It also appears to be a thought-piece,
rather than a conference presentation (though, of course, Hinton deserves to
be heard on just about anything in NNs, IMHO).

[1] [http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/](http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/)

[2] [https://github.com/numenta/nupic/wiki/Sparse-Distributed-Representations](https://github.com/numenta/nupic/wiki/Sparse-Distributed-Representations)

------
beefman
Posted yesterday:
[https://news.ycombinator.com/item?id=9427474](https://news.ycombinator.com/item?id=9427474)

~~~
mdda
Perhaps it's not getting votes because the title is so non-descriptive...

"Hinton's internal presentation on AI and Deep Learning" might attract the
attention it deserves.

(I'm not really advocating a title change/resubmission, but if someone's
reading the comments before going to Google Drive, it may be more of a hint
about value/bandwidth)

------
sbpayne
I feel his explanation of activity vectors closely resembles (and perhaps
indicates) Hebbian learning rules, given the dependence on co-occurrences.

Pierre Baldi from UCI has a paper coming soon about the theory behind
learning rules; I think it might offer some interesting ideas for this work
of Hinton's.

------
politician
I would love to see what would come out of such a NN trained on a set of chess
games. Would it "make the next valid move"? Would it appear to behave
tactically? Strategically?

