In the "Doing bad digital humanities with color vectors", if you consider colors as 3D vectors, which they do, you'll see that summing enough uniformly sampled vectors always gives medium browns, because that's the color in the middle of the colorspace. Instead, you should model colors in a polar space and sum vectors in that space. This will prevent going inside the sphere and losing color saturation.
The original model comes from Kim 2014, https://arxiv.org/abs/1408.5882 It's a very neat use of CNNs for language processing, instead of the more popular RNNs/LSTMs. CNNs have the advantage of training much faster.
Yes, I've not looked at fasttext but word2vec is a simple 1 hidden-layer network to learn word embeddings, which can then be used as pre-trained word embeddings in other tasks.
CNNs and RNNs would be used in the next stage of the pipeline for whatever your task is (machine translation etc), probably as some way of combining the word vectors. RNNs are especially nice as they can deal with variable length sentences. Note also that it's possible for systems to learn their own systems as part of training, rather than using pre-trained ones from word2vec etc.
In the "Doing bad digital humanities with color vectors", if you consider colors as 3D vectors, which they do, you'll see that summing enough uniformly sampled vectors always gives medium browns, because that's the color in the middle of the colorspace. Instead, you should model colors in a polar space and sum vectors in that space. This will prevent going inside the sphere and losing color saturation.
It's explained quite well here in the Interpolation section: http://www.inference.vc/high-dimensional-gaussian-distributi...
If you want to understand contemporary use of words embeddings in ML, a nice simple model is explained, with full code, here: https://blog.keras.io/using-pre-trained-word-embeddings-in-a...
The original model comes from Kim 2014, https://arxiv.org/abs/1408.5882 It's a very neat use of CNNs for language processing, instead of the more popular RNNs/LSTMs. CNNs have the advantage of training much faster.