
Teaching a neural network to use a calculator - baylearn
https://reiinakano.com/2019/11/12/solving-probability.html
======
FraserGreenlee
Here the neural network was given examples of how to use the calculator for
each question, which means it wasn't generating its own abstractions.

If you wanted to use this to solve other (e.g. programming) problems you would
need examples of every step required for almost every problem.

Using neural networks in this way is akin to locality-sensitive hashing.
Instead, the network should understand what its lowest-level operators do and
discover useful combinations of them that can solve new problems.

------
fyp
I haven't been following this field, but anyone know what happened to Neural
Programmer Interpreters (2015)? It seemed like such a promising direction back
then. It showed that a neural network can learn to use arbitrary commands to
execute algorithms such as multidigit addition and bubble sort: [http://www-
personal.umich.edu/~reedscot/iclr_project.html](http://www-
personal.umich.edu/~reedscot/iclr_project.html)

That seems like a much better demo of using blackbox tools as substeps in
problem solving. Is there a reason why it shouldn't work when the blackbox is
a more complex function like sympy's eval?

------
JHonaker
> Something that intrigued me in Saxton et. al.’s paper was how high a
> baseline transformer scored on probability tasks (~0.77 and ~0.73), given
> that working these out is a multi-step process. How could basic pattern-
> matching score so highly on such a task? Is mere perception enough to figure
> out something like the probability product rule, on such a generic
> architecture without any prior knowledge of numbers or probability?

> To try and explain this, we point out that although questions are unique, a
> lot of them will share the same answers. For example, Calculate prob of
> sequence aad from abcda, Calculate prob of sequence bbz from zbbmn, and
> Calculate prob of sequence rpr from {r: 2, p: 1, x:2} all lead to the same
> answer, 1/30.

> Doing a bit of analysis on training set questions, we find that out of 1
> million samples each, swr_p_level_set and swr_p_sequence have 977179 and
> 978045 unique questions, respectively. This seems reasonable, as duplicates
> are limited to <3% of the training set and the distribution over questions
> appears fairly uniform.

> On the other hand, doing analysis on training set answers reveals that out
> of 1 million samples each, swr_p_level_set and swr_p_sequence have 1458 and
> 1865 unique answers, respectively.

> Counting the collective number of samples that share the top K most common
> answers reveals even more imbalance.

This is the real takeaway for me from the article.
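The shared-answer effect is easy to check yourself. Here's a short sketch (the function name is mine) that computes the probability of drawing a given sequence without replacement, confirming that all three example questions from the article collapse to 1/30:

```python
from collections import Counter
from fractions import Fraction

def prob_sequence(seq, pool):
    """Probability of drawing `seq` in order, without replacement, from `pool`."""
    counts = Counter(pool)
    total = sum(counts.values())
    p = Fraction(1)
    for ch in seq:
        p *= Fraction(counts[ch], total)  # chance of drawing this letter now
        counts[ch] -= 1                   # remove it from the pool
        total -= 1
    return p

print(prob_sequence("aad", "abcda"))  # 1/30
print(prob_sequence("bbz", "zbbmn"))  # 1/30
print(prob_sequence("rpr", "rrpxx"))  # {r: 2, p: 1, x: 2} -> 1/30
```

So a model that only learns the mapping from surface features to a small set of common answers can already score well without doing any multi-step reasoning.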

------
king07828
From the title, I was expecting the neural network to take an input (e.g.,
speech or a string "5+11+3=") and then control mouse movements to push the
keys on a calculator program (e.g., Windows Calculator). I.e., a neural
network driving an existing user interface based on commands from a user.

But the article is more about using neural network transformers to build steps
of a mathematical proof with each step checked by a symbolic "calculator".
I.e., transformers applied to mathematical proofs.

------
The_rationalist
The fact that a neural network isn't able to calculate, even when trained
specifically to do so, shows how limiting neural-network-only AGIs are.

~~~
j-pb
Of course you could train a NN to do arithmetic, but this is much more
impressive. Training a NN to solve problems with the tools available means
more abstraction, and is closer to AGI than just essentially learning a LUT.

~~~
Sean1708
> Of course you could train a NN to do arithmetic

Are we really capable of teaching a NN to parse and calculate an arbitrary
arithmetic expression? Because that sounds incredibly impressive...

~~~
laretluval
Yes, it can be done.

[https://openreview.net/pdf?id=S1eZYeHFDS](https://openreview.net/pdf?id=S1eZYeHFDS)

Natural language is harder.
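For what it's worth, the supervision for that kind of model is just (expression, value) string pairs, which is easy to sketch (a toy generator, not what the paper actually uses):

```python
import random

def random_expr(depth=2):
    """Recursively build a random fully-parenthesized arithmetic expression."""
    if depth == 0:
        return str(random.randint(0, 9))
    op = random.choice("+-*")
    return f"({random_expr(depth - 1)}{op}{random_expr(depth - 1)})"

# (input string, target value) pairs a seq2seq model would train on
random.seed(0)
pairs = [(e, eval(e)) for e in (random_expr() for _ in range(5))]
for expr, value in pairs:
    print(expr, "=", value)
```

The hard part is getting the network to generalize to deeper nesting and longer numbers than it saw in training, not generating the data.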

