
A friendly Introduction to Backpropagation in Python - sushantc
https://sushant-choudhary.github.io/blog/2017/11/25/a-friendly-introduction-to-backrop-in-python.html
======
apetrov
I had a really good time adapting Karpathy's blog post to Python myself, but it
didn't give me sufficient understanding, so I continued with [1], then [2], and
finally deciphered [3].

[1] https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

[2] http://peterroelants.github.io/posts/neural_network_implementation_part04/

[3] https://iamtrask.github.io/2015/07/12/basic-python-network/

~~~
chewxy
I'd like your opinions on this that I wrote:
https://blog.chewxy.com/2016/12/06/a-direct-way-of-understanding-backpropagation-and-gradient-descent/

------
partycoder
Some suggestions:

1) I would make a better distinction between the function declarations and the
program output, e.g. format the output differently, such as in gray.

2) Capitalization: "InvalidWRTargError". It would help if you capitalized it
as "InvalidWrtArgError". This is a guideline in most coding standards.
[https://en.wikipedia.org/wiki/Camel_case#In_abbreviations](https://en.wikipedia.org/wiki/Camel_case#In_abbreviations)

3) Better naming:

- "getNumericalForwardGradient": Are there non-numerical gradients?

- "applyGradientOnce": A function is applied once per invocation by
convention.

It would also be good to format the code per PEP 8, as it is the standard in
Python.

~~~
sushantc
Thanks partycoder! Points taken; will make some changes.

Regarding numerical gradients: I named it that to differentiate it from
analytical gradients, which leverage formulas from calculus. The "numerical"
ones are calculated using (f(x+h) - f(x))/h each time.
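The finite-difference formula above can be sketched in a few lines. The function names here are illustrative (the post's actual helper is called getNumericalForwardGradient); the multiply gate is the standard running example from Karpathy's post:

```python
def forward_multiply_gate(x, y):
    # A simple gate: f(x, y) = x * y
    return x * y

def numerical_forward_gradient(f, x, y, h=1e-4):
    # Finite-difference approximation of df/dx: (f(x+h, y) - f(x, y)) / h
    return (f(x + h, y) - f(x, y)) / h

# For f(x, y) = x * y, calculus says df/dx = y, so at (3, -2) the
# numerical estimate should be close to -2.
grad_x = numerical_forward_gradient(forward_multiply_gate, 3.0, -2.0)
print(round(grad_x, 3))  # close to -2.0
```

The analytical gradient (here simply y) is exact and cheap; the numerical one only approximates it, but is handy for gradient-checking an implementation.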

~~~
ydidntithnkftht
Python community conventions are not camel case for functions...

forwardAddGate

would be

forward_add_gate

and return is not a function call...

return(max(x,y)) or return(x+y)

would be

return max(x, y) or return x + y

spaces around operators...

x + y not x+y

spaces around function args...

def foo(a, b) not def foo(a,b)

and when calling...

foo(1, 2) not foo(1,2)

[https://www.python.org/dev/peps/pep-0008/](https://www.python.org/dev/peps/pep-0008/)

Just things to think about when publishing python code for the greater
community.
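Putting the conventions above together, the post's add and multiply gates (names taken from the article, here converted to snake_case) would look like this in PEP 8 style:

```python
# snake_case function names, no parentheses after `return`,
# spaces around operators and after commas.

def forward_add_gate(x, y):
    return x + y

def forward_multiply_gate(x, y):
    return x * y

print(forward_add_gate(1, 2))       # 3
print(forward_multiply_gate(2, 3))  # 6
```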

~~~
partycoder
There's a package that checks PEP 8 compliance for you:
https://pypi.python.org/pypi/pep8

------
kamyarg
My eyes hurt.

[https://www.python.org/dev/peps/pep-0008/](https://www.python.org/dev/peps/pep-0008/)

~~~
gspetr
Sadly, this CamelCase style is too entrenched in academia and is very often
present in books that use Python but are written by academics rather than
professional Python developers.

------
amelius
Something I was wondering about lately: how can we back-propagate through a
max-pooling layer in a neural network?

https://datascience.stackexchange.com/questions/11699/backprop-through-max-pooling-layers

~~~
LolWolf
Yeah! Another way of seeing it is to consider the effect of a small
(infinitesimal) perturbation around the point of interest:

Any input that isn't maximal sits some finite distance below the maximum, so
any small enough perturbation won't change the max (thus it has zero
derivative). If we perturb the maximal entry, though, the maximum changes
proportionally to it (with proportionality constant 1), so the derivative is
one for the maximal entry [0] and zero for all the others.

---

[0] If there is more than one maximal entry, then any convex combination of
the entries that are maximal is a valid "derivative-like" operator (i.e. a
subgradient).
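The rule above amounts to routing the upstream gradient to the winning entry. A minimal sketch in plain Python (the function name is hypothetical, not from any framework):

```python
def maxpool_backward(x, grad_out):
    # Backprop through max over a list: the upstream gradient flows only
    # to the entry that achieved the max; every other entry gets zero.
    grad_in = [0.0] * len(x)
    # On ties, index() picks the first maximal entry -- one valid
    # subgradient choice, per footnote [0] above.
    grad_in[x.index(max(x))] = grad_out
    return grad_in

print(maxpool_backward([1.0, 5.0, 3.0], 2.0))  # [0.0, 2.0, 0.0]
```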

------
sushantc
Intuitive explanation of backpropagation from first principles with a simple
python implementation

~~~
partycoder
Superlatives in this phrase: "intuitive", "simple".

These should ideally be determined by the reader, not the author.

