
I don't understand the first two-layer RNN, which according to the author optimizes the word vectors.

It says:

During training, we can follow the gradient down into these word vectors and fine-tune the vector representations specifically for the task of generating clickbait, thus further improving the generalization accuracy of the complete model.

How do you follow the gradient down into these word vectors?

If the word vectors are the input of the network, don't we only train the weights of the network? How do the input vectors get optimized during the process?
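
One possible reading, which I am not sure matches what the author actually did: the word vectors live in a lookup table that is itself treated as a trainable parameter, so backpropagation gives a gradient with respect to the looked-up vector as well as the weights. Here is a toy sketch of that idea in plain Python. The one-weight-vector "network", the squared-error loss, and the names `E`, `w`, and `train_step` are all made up for illustration and are not from the article.

```python
# Toy sketch (hypothetical, not the article's model):
# prediction = dot(w, E[word_id]), loss = (prediction - target) ** 2.
# Both w (the "network weights") and E (the word-vector table) are updated.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_step(E, w, word_id, target, lr=0.1):
    x = E[word_id]                 # look up the word vector (the "input")
    err = dot(w, x) - target
    loss = err * err
    # Chain rule: d(loss)/d(prediction) = 2 * err
    grad_w = [2 * err * xi for xi in x]  # gradient w.r.t. the weights
    grad_x = [2 * err * wi for wi in w]  # gradient w.r.t. the input vector
    # Gradient descent on the weights AND on the row of the table that was used
    w[:] = [wi - lr * g for wi, g in zip(w, grad_w)]
    E[word_id] = [xi - lr * g for xi, g in zip(x, grad_x)]
    return loss

E = [[0.5, -0.2], [0.1, 0.3]]      # two made-up 2-d word vectors
w = [0.4, 0.7]                     # made-up weights
before = list(E[0])
losses = [train_step(E, w, word_id=0, target=1.0) for _ in range(20)]
```

In this sketch only the row of `E` for the word that was actually fed in gets changed, and the loss goes down over the steps. If the table were frozen instead, you would just skip the `E[word_id]` update and train `w` alone, which is the case I originally had in mind.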
