
Wikipedia Vandal Early Detection: From User Behavior to User Embedding [pdf] - lainon
https://pdfs.semanticscholar.org/83c5/279cff3efc3330fc7df2cc217b65e94ee2c4.pdf
======
b_tterc_p
I'm not convinced their user embedding is useful. I did not read the paper in
detail, but it seems to build the embedding from a user's list of edits,
similar to how one might create a paragraph vector as an average of word
vectors. If I had to guess, they're not really capturing more information than
they already had with a one-hot vector of whether or not a user had edited a
specific article. It would have been better if they had benchmarked against
this. I would wager that a simple random forest on the one-hot vector would do
just as well as, if not better than, their NN solution.
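
To make the comparison concrete, here is a minimal sketch of the two
representations, assuming the averaging reading above (which may not match what
the paper actually does). The article list, the 2-d article vectors, and the
function names edited_flags and mean_embedding are all made up for
illustration; nothing here comes from the paper.

```python
# Toy data: every name and number below is hypothetical.
ARTICLES = ["Python", "Paris", "Pizza", "Physics"]

# Made-up 2-d article embeddings; in practice these would be learned.
ARTICLE_VECS = {
    "Python": [1.0, 0.0],
    "Paris": [0.0, 1.0],
    "Pizza": [0.5, 0.5],
    "Physics": [1.0, 1.0],
}


def edited_flags(edits):
    """One slot per article: 1 if the user edited it at least once, else 0."""
    edited = set(edits)
    return [1 if article in edited else 0 for article in ARTICLES]


def mean_embedding(edits):
    """Average the article vectors over every edit in the user's edit list."""
    dim = len(next(iter(ARTICLE_VECS.values())))
    if not edits:
        return [0.0] * dim
    sums = [0.0] * dim
    for article in edits:
        for i, value in enumerate(ARTICLE_VECS[article]):
            sums[i] += value
    return [total / len(edits) for total in sums]


user_a = ["Python", "Paris"]   # edited two different articles
user_b = ["Pizza", "Pizza"]    # edited the same article twice

print(edited_flags(user_a), mean_embedding(user_a))
print(edited_flags(user_b), mean_embedding(user_b))
```

With these toy vectors the two users get different flag vectors but the same
averaged embedding, which is the kind of information loss I'd worry about.
Either vector could then be handed to a random forest or any other classifier
to run the benchmark.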

