I'm testing it on a 3-layer perceptron, so memory is less of an issue, but __slots__ seems to speed up training by about 5%! Pushed the implementation to a branch: https://github.com/noway/yagrad/blob/slots/train.py
Unfortunately it pushes the line count past 100, so I'll keep it separate from `main`.
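Roughly what the change looks like (a hypothetical `Value` node with made-up field names, not the exact code on the branch):

```python
class Value:
    # __slots__ replaces the per-instance __dict__ with fixed attribute
    # storage: less memory per node and slightly faster attribute access,
    # which is presumably where the ~5% speedup comes from.
    __slots__ = ("data", "grad", "_backward", "_prev")

    def __init__(self, data, prev=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = prev
```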
I have my email address on my website (which is in my bio) - don't hesitate to reach out. Cheers!
Here complex numbers are used for an elegant gradient calculation - you can express all sorts of operations through just three functions: `exp`, `log` and `add` defined over the complex plane. Simplifies the code!
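To give a sense of that reduction, here's a tiny standalone sketch with plain `cmath` (my own illustration, not the actual yagrad code) of how multiplication and negation fall out of just `exp`, `log` and `add`:

```python
import cmath

def mul(a, b):
    # a * b == exp(log(a) + log(b)) for nonzero complex a, b
    return cmath.exp(cmath.log(a) + cmath.log(b))

def neg(a):
    # -a == exp(log(a) + i*pi), since exp(i*pi) == -1
    return cmath.exp(cmath.log(a) + 1j * cmath.pi)

print(mul(3, 4))     # (12+0j), up to floating-point noise
print(neg(2 + 1j))   # (-2-1j), up to floating-point noise
```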
The added benefit is that all the variables become complex. As long as your loss is real-valued, you should be able to backprop through your net and update the parameters.
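A toy illustration of that point (again my own sketch, with the gradient written out by hand rather than taken from yagrad's autograd): one complex-typed weight fitting y = w*x. Everything is stored as complex, the loss still comes out real, and plain gradient descent on the complex parameter behaves just like the real case.

```python
w = 0.5 + 0j                   # parameter, nominally real
x, y = 2.0 + 0j, 6.0 + 0j      # real data stored as complex
lr = 0.05

for _ in range(50):
    err = w * x - y            # complex arithmetic throughout
    loss = err * err           # (w*x - y)^2, real-valued here
    grad = 2 * err * x         # d(loss)/dw, hand-derived for this sketch
    w = w - lr * grad

print(w, loss)                 # w ≈ (3+0j), imaginary part stays at 0
```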