
This is a very compressed work-through from perceptron to transformer.

When he works through the gradients of an LSTM, for example, it is to aid understanding, not to help you implement it in your favourite framework.

When he shows solutions in various frameworks, the purpose is to help you connect what the math looks like with what the code can look like.


