Now it's `one_hot_focus x W1 x (one_hot_context x W2)^T`. So we still pick one r...

MichaelStaniek · on June 3, 2019

Yes, but that's also what happens in the normal formulation, no? So the second weight matrix actually holds our context embeddings?
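
A minimal sketch of that reading, assuming both `W1` and `W2` are stored as vocabulary-size by embedding-size matrices (the sizes, values, and helper names here are made up for illustration, not taken from any particular word2vec implementation). Multiplying by a one-hot vector just selects a row, so the score from the expression above reduces to a dot product between one row of `W1` and one row of `W2`:

```python
# Toy sizes and values, chosen only for illustration.
V, D = 4, 3  # vocabulary size, embedding dimension

# W1: focus (input) embeddings, W2: context (output) embeddings, both V x D.
W1 = [[0.1 * (i + 1) * (j + 1) for j in range(D)] for i in range(V)]
W2 = [[0.5 * (i + 1) - 0.2 * j for j in range(D)] for i in range(V)]


def one_hot(index, size):
    """Return a one-hot row vector of the given size."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def vec_mat(vec, mat):
    """Multiply a row vector (length V) by a V x D matrix."""
    return [
        sum(vec[i] * mat[i][j] for i in range(len(mat)))
        for j in range(len(mat[0]))
    ]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


focus, context = 2, 1  # hypothetical word indices

# one_hot_focus x W1 selects row `focus` of W1.
focus_vec = vec_mat(one_hot(focus, V), W1)
# one_hot_context x W2 selects row `context` of W2.
context_vec = vec_mat(one_hot(context, V), W2)

# one_hot_focus x W1 x (one_hot_context x W2)^T is a single scalar score.
score = dot(focus_vec, context_vec)
```

Under that assumption, row `c` of `W2` plays the role of the context embedding for word `c`, in the same way that row `f` of `W1` is the focus embedding for word `f`.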

avn2109 · on June 3, 2019

New user "MichaelStaniek" has a grammatical, relevant, good-faith sibling comment to this which is (imho) inexplicably banned, almost certainly due to a mistake, and I hope somebody will unban him.