Hacker News new | past | comments | ask | show | jobs | submit login

Now it's `one_hot_focus x W1 x (one_hot_context x W2)^T`. So we still pick one row of the matrix from the focus and context embeddings, but they're separate embeddings.



Yes, but thats also what happens in the normal formulation, no? So the second weight matrix actually are our context embeddings?


New user "MichaelStaniek" has a grammatical, relevant, good-faith sibling comment to this which is (imho) inexplicably banned, almost certainly due to a mistake, and I hope somebody will unban him.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: