Anyway, to offer an opinion: I've read all of Goldberg's stuff and have always found it excellent. If you are into (statistical) NLP, his work certainly rates as a "sine qua non"...
EDIT: Oh, but yes, you can skip this and read the book if you are that interested in this stuff and can shell out the bucks - it's more complete and better edited. (EDIT2: By which I mean the book is more up-to-date, with refs from 2017, etc., not that the writing in the linked article is poor or anything!)
I think this is "just" a cleaned up version of the draft included in the previous discussion. I believe the expanded version mentioned in the aforementioned link is a longer book which grew out of this paper.
Are you sure you linked to the right article? The linked article is about neural translation using seq-to-seq, while TFA is about neural models for all kinds of language processing.