I'm half-way through your excellent article. How do you produce such great artwork?

I believe I understand the concepts of CBOW and skip-gram, but I'm a little stuck: I don't really understand this figure [0]. In fact I understand it so poorly that I can't even formulate a question around it.

[0] https://skymind.ai/images/wiki/word2vec_diagrams.png

Edit: An attempt at formulating a question: is it the process of feeding the model with the [context][context][output] vector that you are depicting?

Thanks! Mostly Keynote, and lots of iteration.

I'll be honest, I personally found this figure puzzling. I'm still not 100% clear on it, but I don't believe it refers to the negative sampling approach. My best guess is that it refers to earlier word2vec variants, where the input in skip-gram (or the sum of the inputs in CBOW) is multiplied by a weight matrix that projects the input to an output vector.
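
To make that guess concrete, here is a rough sketch of the projection step in plain Python. The vocabulary, the weight values, and the function names are all made up for illustration; none of them come from the figure or from any word2vec implementation.

```python
# Toy sketch: the vocabulary, weights, and names below are illustrative assumptions.
vocab = ["the", "quick", "brown", "fox"]

# Hypothetical input weight matrix: one row per vocabulary word,
# one column per embedding dimension (2 here; real models use ~100-300).
W_in = [
    [0.5, 1.0],
    [1.5, 2.0],
    [2.5, 3.0],
    [3.5, 4.0],
]

def one_hot(word):
    """Return a vector with 1.0 at the word's vocabulary index, 0.0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]

def project(x, W):
    """Multiply a row vector x (length V) by a matrix W (V x D)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def skipgram_projection(center):
    """Skip-gram: project the single one-hot input word."""
    return project(one_hot(center), W_in)

def cbow_projection(context):
    """CBOW: sum the one-hot context vectors, then project the sum."""
    summed = [sum(column) for column in zip(*(one_hot(w) for w in context))]
    return project(summed, W_in)

print(skipgram_projection("brown"))
print(cbow_projection(["quick", "fox"]))
```

Because the input is one-hot, the multiplication just picks out a row of the matrix (or, for CBOW, adds up the rows of the context words), which is why implementations usually do a lookup instead of a real matrix product.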

It shows the input-output pairs you would use to train the network. The projection is simply your fully connected layer, whose dimension is the embedding size you want (e.g., something like 300). The output column is what the model predicts; you have the true data for it, so you calculate a loss and backprop as usual. In the CBOW case you take multiple context words and predict the middle word (as shown in your diagram); skip-gram is the opposite approach, taking the middle word and predicting the context words.
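
If it helps, this is roughly how those training pairs could be generated from a tokenized sentence. It's only a sketch: the function names, the `window` parameter, and the example sentence are my own assumptions, not something taken from the diagram.

```python
def context_of(tokens, i, window):
    """Words within `window` positions of index i, excluding the word at i."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

def cbow_pairs(tokens, window=2):
    """CBOW: (list of context words) -> middle word."""
    pairs = []
    for i, target in enumerate(tokens):
        context = context_of(tokens, i, window)
        if context:
            pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, window=2):
    """Skip-gram: middle word -> one context word per pair."""
    pairs = []
    for i, center in enumerate(tokens):
        for context_word in context_of(tokens, i, window):
            pairs.append((center, context_word))
    return pairs

tokens = ["the", "quick", "brown", "fox", "jumps"]
for pair in cbow_pairs(tokens, window=1):
    print("CBOW      ", pair)
for pair in skipgram_pairs(tokens, window=1):
    print("skip-gram ", pair)
```

The left element of each pair is what goes into the input column and the right element is what the output column is scored against.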

