Haugeland is GOFAI/cognitive science, which isn't directly relevant to the modern machine learning family of models unless you are doing reinforcement learning or tree stuff (hey, poker/chess/Go bots are pretty cool!). Russell and Norvig is the typical introductory textbook for those. Marks and Haykin are both severely out of date (they have solid content, but they predate the scale of modern deep learning, which has many emergent properties).
You are approaching this like an established natural sciences field where old classics = good. This is not true for ML. ML is developing and evolving quickly.
I suggest taking a look at Kevin Murphy's series for the foundational knowledge, and Sutton and Barto for reinforcement learning. MacKay's book on information theory and learning algorithms is also excellent.
Kochenderfer's ML series is also excellent if you like control theory and cybernetics.
For applied deep learning texts beyond the basics, I recommend picking up some books/review papers on LLMs, Transformers, GANs. For classic NLP, Jurafsky is the go-to.
A quick point about the "tree stuff" and Russell & Norvig:
While it does cover minimax trees, alpha-beta, etc., it only provides a very brief overview. The book is more of a survey of the AI/ML fields as a whole. Game-playing AI is dense with game-specific heuristics that the book scarcely mentions.
Not sure about books, but the best resource I've found on at least chess AI is chessprogramming.org, then just ingesting the papers from the field.
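For anyone who hasn't seen it, the textbook-level idea is small enough to sketch. This is a minimal toy version of minimax with alpha-beta pruning, assuming the game tree is just nested Python lists with numeric leaf scores (the `alphabeta` function and the `tree` data are made up for illustration). Real engines add move ordering, evaluation heuristics, transposition tables and so on, which is exactly the part the book skims over.

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax value of a toy game tree: nested lists, leaves are numeric scores."""
    if not isinstance(node, list):
        return node  # leaf: static score from the maximizer's point of view
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer above already has a better option; prune
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer above already has a better option; prune
    return best

# Hypothetical depth-2 tree: the root maximizes, its children minimize.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree))  # prints 3
```

In this example the leaves 9 and 1 are never visited: once the second and third subtrees show a value below 3, the root cannot prefer them.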
To your second point I have a sneaking suspicion whatever is recommended in this very thread will suddenly jump in its estimation as a “classic.” History is made up as it goes along!
Well, GP's Neural Smithing is a solid example. There is nothing wrong with it; it is surprisingly well written and correct for something published before the millennium.
Take a look at the Google Books preview (click view sample). The basics are all there, intro to biological history of neural networks, backpropagation, gradient descent, and partial derivatives etc. It even hints at teacher-student methods!
The only issue is that it missed out on two decades of hardware development (and a bag of other optimization tricks). Modern deep learning implementations require machine sympathy at scale. It also doesn't have any literature on autoregressive networks like RNNs or image processing tricks like CNNs.
Appreciate the comment very much. I feel like I need to build a foundational context in order to appreciate the significance of the latest developments, but I agree that most of what I posted doesn't represent the state of the art.
Links for Kochenderfer's books: https://algorithmsbook.com/ https://mitpress.mit.edu/9780262039420/algorithms-for-optimi... https://mitpress.mit.edu/9780262029254/decision-making-under...
Seminal deep learning papers: https://github.com/anubhavshrimal/Machine-Learning-Research-...
Data engineering/science: https://github.com/eugeneyan/applied-ml
For speculation: https://en.m.wikipedia.org/wiki/Possible_Minds