
Not really. He says afterwards that he was mostly trying to inject some humility; he really doesn't think this is measuring anything of interest. For the birds result in particular, see https://twitter.com/BarneyFlames/status/1531736708903051265.



If I'm reading that tweet correctly, the system ended up outputting things that were almost scientific nomenclature for the general class of items it was being asked to draw. There are probably many examples of "bird is an instance of class X" in the text, but they are not consistent, so the resulting token vector is a point near the center of "birdspace".
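
To make the "center of birdspace" intuition concrete, here is a toy sketch. It is not DALL-E's actual mechanism: the 2-D "embedding", the BIRD_CENTER coordinates, and the Gaussian noise model are all made up for illustration. It only shows that averaging many inconsistent mentions of a concept lands much closer to the concept's center than any typical single mention does.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical center of "birdspace" in a made-up 2-D embedding.
BIRD_CENTER = (5.0, -2.0)


def noisy_mention(center, spread=1.0):
    """One inconsistent 'bird is an instance of class X' example:
    the true center plus independent Gaussian noise on each axis."""
    return tuple(c + random.gauss(0.0, spread) for c in center)


# Many inconsistent mentions scattered around the center.
mentions = [noisy_mention(BIRD_CENTER) for _ in range(1000)]

# The "token vector" here is simply the average of all mentions.
token_vector = tuple(sum(axis) / len(mentions) for axis in zip(*mentions))

# How far the average is from the center, versus a typical single mention.
centroid_error = math.dist(token_vector, BIRD_CENTER)
typical_error = sum(math.dist(m, BIRD_CENTER) for m in mentions) / len(mentions)

print(f"average of mentions is {centroid_error:.3f} from the center")
print(f"a typical mention is   {typical_error:.3f} from the center")
```

The individual examples disagree with each other, but their disagreements mostly cancel, which is one way a nonsense token could end up meaning "generic bird" rather than any particular species.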


Yes. Indeed, it seems to interpret a lot of nonsense tokens it doesn't recognize as though they were probably the Latin / scientific term for some sort of species it doesn't remember very well (keeping in mind that all these systems are attempting to compress a large corpus into a relatively small space). I think https://twitter.com/realmeatyhuman/status/153173904648934195... best illustrates this phenomenon.

So, it's certainly an "interesting" result in the sense that it shows how these kinds of systems work, but it's definitely not a language.


Why is it important if it's "a language" or not? What we're talking about are concept representations (nouns), not languages. But I think most people who read "DALL-E has a secret language" probably picked up on that because we're accustomed to the hype in machine learning naming things to sound like they are more profound and powerful than they really are.


It's important whether it's a "language" because the original thread claimed that it was one (and indeed, a number of comments in response to this article are still making that claim). You may argue that discovering how DALL-E tries to map nonsense words to nouns is independently interesting, and that's fine. I don't find it interesting personally, though: it has to pick something, and given the evidence that these spaces are not particularly robust when confronted with far out-of-sample input, I don't even think calling it a secret vocabulary would be accurate. Either way, the authors should reasonably expect some pushback if they argue that this is linguistics.


It didn't pick "something": it chose scientific nomenclature as a basis, and synthesized new classes from that basis.

They're not nonsense words, they're words with high probability which are not seen in the dataset.


When questioned about the change of tone, he answers "Well... a little bit of twitter hype makes a thread go a long way".

https://twitter.com/emnode/status/1531852124501553153



