I'm a software engineer interested in learning LLM fundamentals. What course of action would you recommend? Are there any courses or curricula you would suggest?
Papers would be a good next step. The concepts and algorithms do not seem very complex. Training and fine-tuning are what take effort and infrastructure.
Once again, these are not courses per se, but they do provide intuitive explanations of how transformers work. There is also the nanoGPT series of videos by Karpathy on YouTube. First video here: https://www.youtube.com/watch?v=kCc8FmEb1nY