
From birth to 18, the brain stores 1.5 MB of information to master language - ClarendonDrive
https://news.berkeley.edu/2019/03/27/younglanguagelearners/
======
nostrademons
This is slightly ridiculous - if it were only 1.5MB of information, computers
would have a much easier time with it. Google trains its voice-recognition
and language models on petabytes of data and still has trouble matching an
average 10-year-old. Word2vec is significantly larger than 1.5MB, and that's
just the embedding.
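
For scale, a back-of-envelope sketch (using the published GoogleNews word2vec
release: roughly 3M vocabulary entries, 300 dimensions, float32 weights):

    # Rough size of the GoogleNews word2vec embedding matrix
    vocab = 3_000_000      # words and phrases in that release
    dims = 300             # embedding dimensions
    bytes_per_float = 4    # float32
    print(vocab * dims * bytes_per_float / 1e9, "GB")  # ~3.6 GB

That's three-plus orders of magnitude over 1.5MB before you even get to a
language model.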

Kids pick up way more than just yes-or-no questions like "Is a turkey a bird?"
They're also observing the world visually - when a parent says "Look, it's a
turkey" and there's a turkey in front of them, they make the association between
the visual image and the word. They infer information from sentence structure
as well - "it's a ..." is a very good clue that the following word is a noun,
and semantically, only certain words can go in certain places.
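
Here's a toy illustration of that structural cue (the mini-corpus is invented
and hand-tagged, purely for illustration): just counting what follows "it's a"
already gives you a strong prior that the slot holds a noun or adjective.

    from collections import Counter

    # Invented, hand-tagged mini-corpus - purely illustrative
    sentences = [
        [("it's", "X"), ("a", "DET"), ("turkey", "NOUN")],
        [("it's", "X"), ("a", "DET"), ("ball", "NOUN")],
        [("it's", "X"), ("a", "DET"), ("red", "ADJ"), ("ball", "NOUN")],
        [("the", "DET"), ("dog", "NOUN"), ("ran", "VERB")],
    ]

    # Tag distribution of the word right after the cue "it's a"
    after_cue = Counter()
    for sent in sentences:
        for i in range(len(sent) - 2):
            if sent[i][0] == "it's" and sent[i + 1][0] == "a":
                after_cue[sent[i + 2][1]] += 1

    total = sum(after_cue.values())
    for tag, n in after_cue.most_common():
        print(tag, n / total)   # NOUN ~0.67, ADJ ~0.33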

They need to make _negative_ associations as well. For example, to my
14-month-old everything that is round is a "ball": wheels are balls, painted
circles are balls, cylindrical rattles are balls, apples are balls, and actual
balls are balls. He'll eventually need to learn that we usually only use
"ball" to refer to 3-dimensional spherical objects used for recreational
purposes, but since everything serves a recreational purpose at the moment and
he doesn't really get the concept of "spherical", this comes later.
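
You can see why the negatives are load-bearing with a little version-space toy
(the features here are made up): with positive examples alone, "everything
round is a ball" survives as a hypothesis, and only the negative examples
eliminate it.

    # Toy version-space sketch with invented features
    candidates = [
        {"round"},                           # the 14-month-old's rule
        {"round", "spherical"},
        {"round", "spherical", "for-play"},  # the adult concept
    ]

    positives = [  # things correctly called "ball"
        {"round", "spherical", "for-play"},
        {"round", "spherical", "for-play", "bouncy"},
    ]
    negatives = [  # things that are not balls
        {"round", "spherical"},   # apple
        {"round"},                # painted circle
    ]

    def covers(h, example):
        return h <= example   # example has every feature h requires

    # Positives alone can't choose: all three hypotheses survive
    print([h for h in candidates
           if all(covers(h, p) for p in positives)])

    # Negatives kill the over-broad rules; only the adult concept remains
    print([h for h in candidates
           if all(covers(h, p) for p in positives)
           and not any(covers(h, n) for n in negatives)])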

On the other side, somehow he has managed to learn that the wide variety of
dog breeds are all "gǒu gǒu" (Mandarin for "doggy"), as are 2 of his stuffed
animals, but _not_ the bear. Chihuahuas, poodles, German shepherds, corgis,
dachshunds, Labradors, Newfies - somehow he gets that they're all gǒu gǒu
despite a wide variety of physical appearances. Deep learning models that
classify dogs vs. cats are also significantly bigger than 1.5MB, yet much
less accurate.
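
Again for scale, a rough sketch - taking ResNet-18, a small standard image
classifier, at its roughly 11.7M parameters:

    # Approximate on-disk size of ResNet-18 weights
    params = 11_700_000    # ~11.7M parameters
    bytes_per_float = 4    # float32
    print(params * bytes_per_float / 1e6, "MB")   # ~47 MB vs. the claimed 1.5 MB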

