I don't know that the physics analogy works so well, or at least, it's definitely missing something. What prevents the whole word "universe" from collapsing in on itself and forming a black hole? That is, if there are only attractive forces, the global optimum is to co-locate everything at the same point, which doesn't give you a useful model. There needs to be something in the model that keeps different words apart from each other.
This page has the clearest explanation of word embeddings I've seen, and of how the objective function explains why vector translation captures meaning.
It works because the gravity in word2vec isn't the gravity of real life.
Notice that I only pull the word "dog" toward the center of gravity of the rest of the words, instead of pulling all of them together. I think the full version even pushes the rest of the words away from the center of gravity.
But I need to double-check the math.
This is not just an analogy; it's what word2vec's math says.
The only analogy part is that word2vec lives in a high-dimensional space, while my version is in 3 dimensions.
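That attract/repel structure is exactly what the skip-gram-with-negative-sampling gradient does. A minimal numpy sketch (the toy vocabulary size, 3-D space, and `W_in`/`W_out` names are illustrative assumptions, not word2vec's real configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 10, 3                # a tiny 3-D "universe" to match the analogy
W_in = rng.normal(scale=0.1, size=(vocab, dim))   # center-word vectors
W_out = rng.normal(scale=0.1, size=(vocab, dim))  # context-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(center, word, label, lr=0.05):
    """One skip-gram-with-negative-sampling pair update.

    label=1 (observed context): the gradient g is negative, so the two
    vectors are pulled together. label=0 (negative sample): g is
    positive, so they are pushed apart.
    """
    v, u = W_in[center].copy(), W_out[word].copy()
    g = sigmoid(v @ u) - label    # g < 0 => attract, g > 0 => repel
    W_in[center] -= lr * g * u
    W_out[word] -= lr * g * v

# One "training step" for word 0 with one real context and one negative:
sgns_update(0, 1, 1)              # pull word 0 toward its observed context
sgns_update(0, 2, 0)              # push word 0 away from a sampled word
```

So each word is pulled toward the words it actually co-occurs with and pushed away from randomly sampled words; the negative samples are the repulsive force that keeps the universe from collapsing into a single point.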
It's a working-class Caribbean neighborhood right now, but it was originally built as housing for middle-class Victorian clerics, so there are a lot of beautiful but run-down buildings & great transport links (i.e. ripe for gentrification).
In the '90s it was extremely rough (race riots); now it's fairly safe, and the artisanal coffee shops and trendy restaurants are creeping southwards as the city's population continues to grow and get priced out of other areas.
Guessing a flophouse packed with affluent young tech bros is going to throw some gasoline on the Crown Heights hyper-gentrification fire, for better or worse.
This is probably very context dependent, because I've learned the opposite.
For example, I was rewriting/consolidating a corner of the local search logic for Google that was spread across multiple servers in the stack. Some of the implementation decisions were clearly made because they were convenient in a particular server. But when consolidating the code into a single server, the data structures and partial results available were not the same, so reproducing the exact same logic and behavior would have been hard. Realizing which parts of the initial implementation were there for convenience, and which were there for product concerns, let me implement something much simpler that still satisfied the product demands, even if the output was not bitwise identical.
I didn't read the parent comment as calling for reproducing the exact same logic perfectly, but more as defining the interface between the external code and the part to be replaced, and then matching that interface closely with the replacement.
This isn't always possible but seems like a reasonable objective given my experience.
Noisy environments are exactly when seeing the lips is a huge deal for me. I have a friend who has a tendency to absent-mindedly place his hand in front of his mouth. In a quiet office or home, no issue. In a bar? He's pressed mute, as far as I'm concerned.
> That said, merely being in the top 1% of intelligence doesn't seem to be worthy of the genius title.
Especially considering that the population average in many places and times is far below the current population averages as normed in the USA, UK, and Western Europe. If you asked how many people throughout history, or globally today, would be in the top 1% of the USA/UK/WE population, the answer would be much smaller than a billion. (Thin tails strike again.)
Sure, consider the $10k or whatever that you need for a couple of months after you lose your job. In an economic downturn, if your stocks/bonds lose half their value, covering it costs you $20k of pre-downturn investment dollars, rather than the $10k of pre-downturn dollars it would have cost if the money had been in a savings account.
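A quick sketch of that arithmetic (using the comment's hypothetical $10k figure and a 50% drop):

```python
emergency_need = 10_000   # cash needed to cover a couple of months
drop = 0.5                # stocks/bonds lose half their value in the downturn

# Savings account: you spend exactly the dollars you set aside.
cost_from_savings = emergency_need

# Investments: selling at half price burns through twice as many
# pre-downturn dollars to raise the same amount of cash.
cost_from_stocks = emergency_need / (1 - drop)

print(cost_from_savings, cost_from_stocks)  # 10000 20000.0
```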