Sure, but all this is only within one space. The "internal language" is a formalism of that space, and not beyond.
What I think we're doing is, you know, forming concepts which are not constrained to spaces. I don't even really think our concepts, which are mostly just bundles-of-techniques, live within the space of the target object, nor its measurement space (i.e., that of experience).
E.g., my "pen" includes being able to grasp a pen (indeed, the relevant motor skills are prior-to and necessary-for the abstract concept formation).
What you're talking about is a pixel-space projection of "pen" being able to function like the (animal, body-first) concept "pen". It doesn't.
Rather, "pen" in pixel space is, sure, an "internal language" of a sampling of AllPixelPatterns concerned with pixel-space projections of "pen". This I'd call a "template", and I don't think it has much of anything to do with concepts/representations/etc.
What we are doing when we acquire motor techniques which produce "ways of sensing and moving" that eventually could become reified as the abstract "Pixel Pen" is really nothing at all like sampling PixelPenSpace and deriving a template.
To see the difference, consider that our concept allows us to resolve ambiguities -- e.g., if I think something might be a pen, I can go beyond one space of measurement (e.g., sight) -- say, move the pen, write with it, etc. -- and thus return to the target space "Is Pen?" with sufficient confidence.
Once such "mere templates" exist, nothing we do is required any more. Indeed, a calculator can take over at that point; put it in a cupboard forever to repeat whatever thin inferences pixel templates admit.
> Sure, but all this is only within one space. The "internal language" is a formalism of that space, and not beyond.
This is incorrect:
You can swap encoders and final fully connected sections, then mildly retrain the network to substantially save on training effort — “transfer learning”.
Further, you can compose networks, e.g. image recognition on top of something extracting a structural map.
This implies that the structures they're "learning" generalize to different contexts, given a bit of retraining, in much the way your knowledge of programming transfers between languages (swapping encoders) and tasks (swapping the final fully connected section).
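A minimal sketch of that swap, assuming PyTorch/torchvision with a pretrained ResNet standing in for the network (the batch below is random, just to make it runnable):

```python
# Transfer learning sketch: keep the pretrained encoder, swap the final
# fully connected layer, then mildly retrain. Data here is a random stand-in.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained encoder
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for a 10-class task

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # low LR = "mild" retraining
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)              # stand-in batch for the new task
labels = torch.randint(0, 10, (8,))

loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```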
Your point seems deeply based on this incorrect understanding of DNNs.
Sure, as set up by the way we have constructed the dataset. "Green" isn't near "Leaves" because the world placed them there.
All these properties follow from properties of the data, which we have structured to produce these solutions.
We set up the target domain of one dataset so that the target domain of another is aligned --- and in doing so, we solve the only problem which requires intelligence.
Yes, the character-space structure of Y-labelled stuff aligns with the pixel-space structure of Y-labelled stuff. And feature-space templates can likewise be associated.
This is just more of the same. King + Woman isn't Queen --- it isn't anything. This isn't reasoning. And likewise PicOfKing + PicOfWoman isn't PicOfQueen.
This is schizophrenic pseudoscience --- it's actually the mind of a schizophrenic who obtains sequential thoughts just by non-semantic associations.
The latent structure being learnt in both cases is just the coincidences we rig in the Y domain --- I have no doubt they 'transfer'; we wrote them in.
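(For reference, the vector arithmetic under dispute is the standard word2vec analogy query; a minimal sketch, assuming a pretrained embedding file, here a hypothetical vectors.bin, loadable with gensim:)

```python
# Word2vec analogy query: nearest neighbours to (king - man + woman) in the
# embedding space. The path "vectors.bin" is a hypothetical pretrained model.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```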
You can show a DNN a large sample of photos of animals, retain the lower sections of the network, and then show it a large sample of landscapes — and it will be able to recognize those faster precisely because a well-trained network picked up on the latent structure of shapes, and can reason about shapes in its internal language for describing pictures. It's acquiring genuine semantic information — and learning about new experiences relative to that previously acquired understanding.
We didn’t create that similarity of form in nature — and DNNs discover that in much the way we do: finding recurring, abstract patterns to utilize in our internal language.
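A rough sketch of "retaining the lower sections", assuming PyTorch/torchvision: freeze the convolutional backbone trained on the first domain and fit only a fresh classifier on the second (the landscape batch below is a random stand-in):

```python
# Feature reuse sketch: freeze the pretrained lower layers, train only a new
# head on the second domain. Tensors below are random stand-ins for landscapes.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for param in model.parameters():                 # keep the shape/texture features
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 5)    # new 5-class landscape head

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 5, (4,))

loss = criterion(model(images), labels)
loss.backward()                                  # gradients flow only into the new head
optimizer.step()
```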
We labelled them -- their pixel-space geometric structure has no inherent relevance to their character-space structure --- and the only inferential contexts this will work in are those where we rig the deployment to only require these sorts of coincidences.
Animals acquire robust techniques which are not brittle to these sorts of thin contexts.
We do so not by templating pregiven data -- but by coordinating our bodies in the world, and rearranging it, so as to determine how to regulate ourselves in response to it.
Only a teen reading Cosmo would think King-Man=Woman. We can only interpret this thin formalism at all because of our robust understanding of the concepts that these character symbols name.
With these concepts we don't cross domains; we regulate our own imagination, and so on, so as to create representations which are thick across an infinity of domains.
Talking in terms of feature domains has it backwards -- it assumes they're available for measurement. We don't transfer thin templates across domains. Our representations aren't within a feature space --- they're techniques for self-regulation.
With these we can construct templates in an infinite number of different feature spaces -- i.e., project them by imagination from our techniques.
The downvotes on this one show some blind allegiance to word2vec, it seems to me; I tend towards questioning the playing field, not clamping onto a (large) set of useful vectors as the new "hammer looking for nails".
Humans' ability for conceptual modeling is entirely constrained by their perception. Humans can't conceptualize a tesseract. All of their conceptualizations are in some form related to their perception.
When you're thinking of a pen, are you thinking of the atoms, molecular compositions and all the quantum effects going on? No. Your mind is, just like ML, working from a template constrained by your perception (visual, haptic, auditory).
Humans are more advanced than ML, but they are nothing special.
Humans can't visualize a tesseract, but they can conceptualize the Idea of a tesseract in the symbolic space, by math, physics, or in other words, by Reason.
The Symbolic is a radical simplification of all the complexities impossible to fully sense. Even though the simplification is always particular, contingent, and full of ambiguity (human languages) and often inaccuracy (Newton's laws vs relativity), without the simplification, without Reason, ML systems are probably like animals, eventually succumbing to the full force of the complexity of reality.
Perception in animals changes the structure of their bodies as they are coordinating with their environment, such that they acquire motor techniques and hence new ways of structuring their perceptions.
Perceiving the world isn't a passive activity in which facts strike your eyes. The world doesn't have these facts; there is no data, and nothing from which to simply "average to a template".
When light strikes your eye, there is no "keyboard" in it. Nothing in it from which to derive even a template of a keyboard.
There are only templates in datasets we prepare -- and we don't prepare them by actually "encountering datasets in reality". Rather, we arrange reality and measure it so-as-to-be-templateable.
What animals have is the ability, with effort, to engage in this dynamic -- to arrange the world to make it knowable. It is this process of arrangement which requires intelligence; not "taking the average after it has happened".
There is nothing in the world to be "perceived" in the ML sense, as in: ready for analysis. That's an illusion we construct: we make "perceivable data".
It is our bodies, and our concerns, which make the world itself perceivable. The world is infinitely dense with infinities stacked on infinities. There isn't "data" to be templated.
And this isn't theoretical: no ML system will ever work. All that works is our data preparation.