Speech and Language Processing

mdcurran · on Oct 17, 2021

I used the 2nd edition of this textbook extensively in my undergraduate studies (linguistics). Coming from a non-technical background and starting to take technical classes, I found certain chapters were wonderful ways to bridge that gap. Specifically, the second chapter on text normalisation helped me apply things I’d learned in 100- and 200-level classes and ultimately set me on the path to becoming an engineer. I still use that text processing knowledge a lot in my day-to-day work.

I’m forever grateful to the authors for making these drafts freely accessible (there weren’t many copies of the second edition in my university library!).

armcat · on Oct 17, 2021

Fantastic book. This and Manning's Information Retrieval book (https://nlp.stanford.edu/IR-book/) are some of the best resources on natural language processing, and they are both free. I can only echo what's already been said: thank you for making these high-quality resources available to everyone.

lgessler · on Oct 17, 2021

Interesting that the HMM chapter has been moved to the appendix in the 3rd edition. That's a consequence of how CRFs and deep neural nets have supplanted HMMs for most use cases.

armcat · on Oct 17, 2021

They still talk about Hidden Markov Models (HMMs) in quite a bit of detail in the sequence labelling chapter, but you are quite right: Conditional Random Fields (CRFs), and especially neural-network-based CRFs, are among the top-ranked approaches for named entity recognition (NER) and part-of-speech (POS) tagging, e.g. see https://github.com/jiesutd/NCRFpp.

tasubotadas · on Oct 17, 2021

I am glad that they've updated it. The previous edition was hopelessly outdated, as most if not all SOTA solutions now use deep learning.

wodenokoto · on Oct 17, 2021

Jurafsky’s introduction to regular expressions was the one that made it click for me, both in terms of use cases and syntax.