
Implementing a Convolutional Neural Network from Scratch in Python - vzhou842
https://victorzhou.com/blog/intro-to-cnns-part-1/
======
tsumnia
Scrolling HN, I saw the two words I'm always wary of - "from scratch". Mostly
because I will click on the link, hoping to learn the mathematics behind a
particular algorithm, only to see they've imported sklearn and skipped over all
the explanation of how things are ACTUALLY getting done. Not that these types
of tutorials don't have their place, but it's irksome to see "from scratch"
with most of the hard part already done for you.

With that, thank you Victor. Specifically because you did not do this at all
and instead wrote a very easy-to-follow guide. I think this type of learning
material will be very useful for CS and mathematics. The idea that very
complicated algorithms must be explicitly implemented and then walked through,
rather than left as symbols in a white paper, will help make the mathematics
of CS more accessible to everyone.

And to anyone bringing up numpy, it is at a level of "prepackaged" I'm fine
with. I'm not going to raise my own pigs and chickens to make a breakfast
burrito, but saying I did it from scratch by microwaving a frozen one isn't
going to cut it either. Numpy is like the basic ingredients of the recipe.
While something like sklearn or tensorflow is perfectly acceptable, I
wouldn't say that's the best method for learning CNNs.

~~~
ericol
This story started really badly, but what a plot twist!

I saw this very same post here on HN or on Reddit last week, methinks, and one
of the top comments was complaining precisely about numpy.

You disarmed that point right from the start, so kudos.

~~~
ericol
I stand corrected, based on a comment from the author found below:
last week's post was the previous one in this series.

------
vzhou842
Hey, author here. Any/all feedback is welcome, and I'm happy to answer
questions.

Previous discussion on HN of the "introduction to Neural Networks" referenced
in this article:
[https://news.ycombinator.com/item?id=19320217](https://news.ycombinator.com/item?id=19320217)

Runnable code from the article: [https://repl.it/@vzhou842/A-CNN-from-scratch-Part-1](https://repl.it/@vzhou842/A-CNN-from-scratch-Part-1)

Github: [https://github.com/vzhou842/cnn-from-scratch](https://github.com/vzhou842/cnn-from-scratch)

~~~
parkaboy
Honestly: the "convolution" (cross-correlation) part in your article is the
clearest step-by-step explanation I've ever encountered. Well done!
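For anyone skimming the thread, the operation being praised is "valid"-mode cross-correlation: slide a small filter over the image, multiply elementwise, and sum. Here's a rough numpy sketch of that idea (the function name and shapes are mine, not the article's):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2D cross-correlation with no padding ('valid' mode)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise multiply the kernel with the image patch, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

A 3x3 kernel over a 4x4 image gives a 2x2 output, since the filter only fits in (4-3+1) positions along each axis.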

~~~
Ranlot
For those interested in a clear explanation of the backward pass (especially
for convolution layers), here's a good resource:

[https://arxiv.org/abs/1811.11987](https://arxiv.org/abs/1811.11987)

[https://github.com/Ranlot/backpropagation-CNNs](https://github.com/Ranlot/backpropagation-CNNs)

~~~
savant_penguin
came here exactly looking for the backward pass

~~~
vzhou842
I'll have a sequel explaining the backward pass with full code up by next
week!

------
duked
This is a great tutorial. However, every time I see RNNs/CNNs, they're always
applied to some video stream or set of images. I'd really like to find a
similar tutorial applied to event logs or other text-based input. Does anyone
have a good link for that?

~~~
siekmanj
Andrej Karpathy has a pretty good introductory article to RNNs here:
[http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

He has some code which is pretty easy to follow to go along with the article:
[https://gist.github.com/karpathy/d4dee566867f8291f086](https://gist.github.com/karpathy/d4dee566867f8291f086)
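The core of what that gist implements is a single recurrence, h_t = tanh(Wxh·x_t + Whh·h_{t-1} + bh), applied once per character. A toy sketch of just that step (sizes, names, and the random weights are illustrative, not Karpathy's actual code):

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, hidden = 5, 8                                 # toy sizes
Wxh = rng.standard_normal((hidden, vocab)) * 0.01    # input-to-hidden weights
Whh = rng.standard_normal((hidden, hidden)) * 0.01   # hidden-to-hidden weights
bh = np.zeros((hidden, 1))

def rnn_step(x_index, h_prev):
    """One step: one-hot encode the character, update the hidden state."""
    x = np.zeros((vocab, 1))
    x[x_index] = 1.0
    return np.tanh(Wxh @ x + Whh @ h_prev + bh)

h = np.zeros((hidden, 1))
for ch in [0, 3, 1]:          # a toy "character" sequence
    h = rnn_step(ch, h)
```

The same loop works for event logs: tokenize each log line, and the hidden state carries context from earlier tokens.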

------
amelius
> If you trained a network to detect dogs, you’d want it to be able to
> detect a dog regardless of where it appears in the image. Imagine training a
> network that works well on a certain dog image, but then feeding it a
> slightly shifted version of the same image. The dog would not activate the
> same neurons, so the network would react completely differently!

But CNNs only deal with translations. What if the image of the dog is rotated?

~~~
vzhou842
True! If dealing with stuff like rotations is a concern, you could augment the
training set by applying small random transforms to it (like rotations,
cropping, scaling, color adjustment, etc).

~~~
amelius
Ok, this makes me wonder how humans do it. Would a person who has never seen
an upside-down dog recognize one?

~~~
greiskul
Well, human children do train their vision in a lot of different ways while
playing, and probably develop a mechanism for dealing with all kinds of
rotated objects very early.

For some objects, though, even adults are not instantly good. Reading text
upside down is initially very hard, but with some practice it can be done;
I've met teachers who can do it from years of tutoring people from the other
side of a desk.

------
master_yoda_1
You copied code and figures from Karpathy's lecture notes and did not cite
him. It's called plagiarism.

------
windsignaling
Where's part 2? That's going to be the part that's actually non-trivial.

~~~
vzhou842
It's in the works! I needed a few more days to polish it up. It'll probably be
up by early next week.

