Hacker News

Apparently they don't discuss language models at all.



Which is a major omission, as transformer-based language models are the most powerful available form of "probabilistic artificial intelligence". They predict a probability distribution over the next token given the sequence of previous tokens.
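To make the "distribution over the next token" idea concrete, here is a minimal sketch using a bigram counter as a stand-in for a real language model; a transformer conditions on the whole prefix with attention, but the output object is the same: a probability distribution over possible next tokens.

```python
from collections import Counter, defaultdict

# Toy corpus; a real LM is trained on vastly more text.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigram transitions: previous token -> next token.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(prev):
    """Normalized distribution over tokens following `prev`."""
    c = counts[prev]
    total = sum(c.values())
    return {tok: n / total for tok, n in c.items()}

# After "the", the model assigns 2/3 to "cat" and 1/3 to "mat".
print(next_token_distribution("the"))
```

A transformer replaces the count table with a neural network, but sampling, greedy decoding, and perplexity all operate on exactly this kind of per-step distribution.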

My guess is that most of the content in the book is several years old (it's apparently based on an ETH Zurich class), despite the PDF being compiled this year, which would explain why it doesn't cover the state of the art.



