Hacker News
How to build a simple neural network in 9 lines of Python code (medium.com)
62 points by bryanrasmussen on June 29, 2017 | 14 comments

How to draw an owl in two lines of code:

    import owl

I chuckled, but I don't think the criticism is fair. The imports are for generic math tools that aren't specific to neural networks, the most complex one being a dot product, so I'd say there's no trickery here.
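For a sense of scale, here is a minimal sketch of a dot product feeding a sigmoid, written in plain Python with no third-party imports. The names `dot`, `sigmoid`, and `neuron` are illustrative and are not taken from the article.

```python
import math


def dot(xs, ws):
    """Dot product of two equal-length vectors."""
    return sum(x * w for x, w in zip(xs, ws))


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights):
    """A single neuron: weighted sum of the inputs, then a sigmoid."""
    return sigmoid(dot(inputs, weights))
```

With all-zero inputs the weighted sum is 0, so `neuron` returns exactly 0.5 whatever the weights are.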

This comment reminds me of a small meeting in 2006 where I was advocating RDDL (http://rddl.org/) as a way to find Danish Government XML Schemas. I demonstrated a bit of code I had written to enrich a webpage with RDDL data, and someone (I won't say who here) announced that he had the same thing working on his blog, that it only took one line of code, and that using the word "framework" for this was a little pretentious.

Clearly not an applicable analogy, in this case.

>The human brain consists of 100 billion cells called neurons, connected together by synapses. If sufficient synaptic inputs to a neuron fire, that neuron will also fire. We call this process “thinking”.

86 billion neurons, each with thousands of synapses.

If enough action potentials from presynaptic neurons arrive within a short enough window of time, the postsynaptic neuron will _probably_ also fire.

We do not call this process "thinking".

Yeah, in general I find that cleaning the data, doing exploratory analysis, doing sensible feature engineering, etc. are much more involved tasks than running ML algorithm XYZ.

I've seen many fixed-input posts like this. I've been looking for something that will let me train a model that takes a varying number of inputs and produces a single output.

So for instance:

      [list of 10 items] -> Some Number
      [list of 500 items] -> Some Number

These items are not reducible to a single scalar value. They have too many widely varying meanings.

Can anyone point me in the right direction?

TLDR: This is what LSTMs do.

E.g. http://machinelearningmastery.com/sequence-classification-ls...

More detailed answer: LSTMs have two or three modes: sequence-to-sequence (immediate), sequence-to-single-output, and sequence-to-sequence (delayed).

Sequence-to-sequence immediate is generally referred to as "seq2seq models", if you want to google it. This is used, for instance, in Deep Speech. Essentially the network takes in a sequence and immediately generates a new sequence from it.

Sequence-to-single-output is called "sequence classification", and is used in text sentiment analysis. The network takes in a sequence of items and comes up with some number.

Sequence-to-sequence (delayed) is called sequence generation. An example use would be translation. The network takes in a sequence, thinks about it a bit, and then outputs a new sequence.

There are other things that may work well. For instance I've found "windowed convolution" over a sequence to work well, even better than LSTMs (and certainly easier and quicker), for sentiment analysis. You essentially make a "window" of items, and have it output "bits" (1 = true, 0 = false). For every window frame you generate these bits and then you maxpool (or just add) them over the entire length of the sequence. Of course this will never detect anything longer than the window.
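A rough sketch of that windowed idea in plain Python, under some loud assumptions: items are plain numbers, each filter is a hand-set (weights, threshold) pair rather than something learned, and the name `window_features` is made up for illustration. In a real model the filters would be trained and the items would usually be vectors.

```python
def window_features(seq, filters, width):
    """Slide a window of `width` items over `seq`.

    Each filter is a (weights, threshold) pair that emits 1 when its
    weighted sum over the current window exceeds the threshold, else 0.
    The bits are max-pooled over all window positions, so the result
    always has len(filters) entries, however long `seq` is.
    """
    pooled = [0] * len(filters)
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        for i, (weights, threshold) in enumerate(filters):
            score = sum(w * x for w, x in zip(weights, window))
            bit = 1 if score > threshold else 0
            pooled[i] = max(pooled[i], bit)
    return pooled


# Hypothetical hand-set filters: "three high values in a row" and "a rise".
filters = [([1, 1, 1], 2.5), ([-1, 0, 1], 1.5)]
short_result = window_features([0, 1, 1, 1, 0], filters, 3)
long_result = window_features([0, 0, 2, 0, 0, 0, 0, 1], filters, 3)
```

Both results have two entries even though the inputs differ in length, and those fixed-size bits can then feed an ordinary fixed-input classifier. A sequence shorter than the window yields all zeros.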

I am not an expert, but maybe look into recurrent neural networks [0]? These things can turn a sequence of inputs of possibly arbitrary length into a fixed size internal representation, and can output a single value derived from this representation.

[0] https://en.wikipedia.org/wiki/Recurrent_neural_network
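To make the "fixed size internal representation" concrete, here is a minimal sketch of a plain (Elman-style) recurrent step in pure Python. It assumes each item is a small vector of numbers, the weights are fixed by hand rather than trained, and the names `rnn_encode` and `readout` are illustrative only.

```python
import math


def rnn_encode(sequence, w_in, w_rec, bias):
    """Fold a sequence of input vectors into one fixed-size hidden state.

    Per item: h_new[j] = tanh(w_in[j] . x + w_rec[j] . h_old + bias[j]).
    The hidden state has len(bias) entries regardless of sequence length.
    """
    hidden = [0.0] * len(bias)
    for x in sequence:
        # The comprehension reads the old `hidden` before it is rebound.
        hidden = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wr * h for wr, h in zip(w_rec[j], hidden))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden


def readout(hidden, w_out):
    """Map the final hidden state to a single number."""
    return sum(w * h for w, h in zip(w_out, hidden))


# Hypothetical fixed weights: 2 inputs per item, 2 hidden units.
w_in = [[0.5, -0.2], [0.1, 0.3]]
w_rec = [[0.0, 0.1], [-0.1, 0.0]]
bias = [0.0, 0.0]
w_out = [1.0, -1.0]

score_10 = readout(rnn_encode([[1.0, 0.0]] * 10, w_in, w_rec, bias), w_out)
score_500 = readout(rnn_encode([[0.0, 1.0]] * 500, w_in, w_rec, bias), w_out)
```

A list of 10 items and a list of 500 items both end up as one number. Training the weights is the hard part and is not shown here.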

Most implementations do not handle arbitrary lengths, but instead rely on padding to produce a fixed-length sequence.
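A small sketch of what that padding step can look like, in plain Python. The helper name `pad_to_length` and the mask output are assumptions for illustration, not any particular library's API.

```python
def pad_to_length(sequences, length, pad_value=0):
    """Truncate or right-pad every sequence to exactly `length` items.

    Also returns a mask per sequence (1 = real item, 0 = padding) so a
    model can ignore the padded positions.
    """
    padded, masks = [], []
    for seq in sequences:
        kept = list(seq[:length])
        n_pad = length - len(kept)
        padded.append(kept + [pad_value] * n_pad)
        masks.append([1] * len(kept) + [0] * n_pad)
    return padded, masks


batch, mask = pad_to_length([[1, 2, 3], [4], [5, 6, 7, 8, 9]], 4)
```

Note that the third sequence loses its last item: anything past the chosen length is cut off, which is the price of forcing a fixed shape.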

I've been meaning to do a similar exercise, and this one is very helpful. Thanks.

What he calls back propagation is actually gradient descent.

Backpropagation (as in, the training method for NNs) is an instance of gradient descent. There are many other instances (e.g. any EM algorithm does gradient descent, as does minimum search on a simple real function), so using the more specific term is appropriate.
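For reference, a minimal sketch of gradient descent in the "minimum search on a simple real function" sense. The function f(x) = (x - 3)^2, the learning rate, and the name `gradient_descent` are all chosen here for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient: x <- x - learning_rate * grad(x)."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
```

With these settings each step shrinks the distance to 3 by a factor of 0.8, so `minimum` ends up very close to 3. In a neural network the update rule is the same; only the gradient is harder to compute.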

for the 400th time.
