
Checking Code Style with Neural Networks (2017) - dsr12
https://blog.prismatik.com.au/checking-code-style-with-neural-networks-f9e7a05553e7
======
andbberger
The mind boggles. This is just a bad idea; it's not even interesting...

> Code style is an inherently subjective and difficult thing to judge

wat.

Why would you want an opaque probabilistic style checker...

~~~
gentaro
Imagine if this were a real life practice. The thought of having to gradually
acclimate yourself to the whims of your nebulous NN style checker is pretty
hilarious.

------
aalleavitch
Isn’t this exactly the opposite of what you want when you’re checking style?
The entire point of a style is that it’s a list of strict, simple rules that
you want to follow without variance.

Unless this is trying to get deeper than merely style, and is supposed to
evaluate the design/structure of the code, which can be more nebulous. That’s
a lot to ask of a neural net, though, when most developers struggle with it.

------
shoo
It's easy to overfit models with huge numbers of parameters when there isn't
much data, which produces a model that's possibly amusing but not any more
useful than an overcomplicated key value store that memorises the training
data. I'm not familiar with RNN / LSTM models, or how torch-rnn encodes
features, so it's pretty opaque looking at the writeup and results to gauge if
there's a bunch of overfitting going on here. Is 2 megs of javascript
sufficient to train a model of this complexity? Or would 20kb suffice? Or is
2GB really the minimum needed? I have no idea, but it doesn't really seem like
this article knows either.
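
For a rough sense of scale, here is a back-of-envelope parameter count for a
small character-level LSTM. The hidden size, layer count and vocabulary below
are guesses on my part (torch-rnn-ish defaults), not numbers taken from the
article:

    # Rough parameter count for a small character-level LSTM, to compare
    # against the amount of training text. All hyperparameters here are
    # guesses, not numbers from the article.
    vocab_size = 96        # roughly the printable ASCII characters
    hidden_size = 128
    num_layers = 2

    def lstm_params(input_size, hidden_size):
        # 4 gates, each with input weights, recurrent weights and a bias
        return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)

    params = lstm_params(vocab_size, hidden_size)                        # first layer
    params += (num_layers - 1) * lstm_params(hidden_size, hidden_size)   # stacked layers
    params += hidden_size * vocab_size + vocab_size                      # output projection

    training_chars = 2 * 1024 * 1024    # "2 megs of javascript" ~= 2M characters

    print(f"parameters: {params:,}")                                       # ~260k here
    print(f"training chars per parameter: {training_chars / params:.1f}")  # ~8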

I believe in the olden days, when computational power was limited,
statisticians would hand-craft simple statistical models with small numbers of
parameters, and figure out how much reasonably distributed data they'd need to
collect and encode to constrain the parameters in their model (i.e. so the
resulting model fit wasn't complete nonsense and had some predictive accuracy
or explanatory power). These days, with much more computational power, it is
relatively easy to fit highly complex models with huge numbers of parameters,
but arguably not much easier (or perhaps harder?) to assess whether the model
fit / model fitting procedure is complete nonsense, and whether the resulting
model is any good. I guess k-fold cross validation and so on can help, but it
doesn't read like that was explicitly done here. By default the torch-rnn docs
claim to use 10% of the data for the test and validation sets. This sounds...
small?
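
For what it's worth, a k-fold loop is cheap to sketch. The data and model
below are throwaway stand-ins, nothing to do with the article's torch-rnn
setup:

    # Minimal k-fold cross-validation sketch. The data and the model are
    # stand-ins, not anything from the article's setup.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

    fold_losses = []
    for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        fold_losses.append(np.mean((pred - y[val_idx]) ** 2))

    # The spread across folds gives some sense of how stable the fit is,
    # which a single 10% validation split can't tell you.
    print(np.mean(fold_losses), np.std(fold_losses))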

The writeup gives results in terms of "Loss" for a number of configurations,
but doesn't define what the loss is (i.e. which subset of the data was used to
compute it) or really explain what the experimental protocol was for training,
validation, parameter tuning, etc.

There is a risk that the author has observed the loss measured on the test or
validation set for each combination of hyperparameters (network topology,
dropout, etc.) and then used this information to manually implement
hyperparameter optimisation (choosing the network topology and dropout
parameter that minimise loss on the test / validation set). This would leave
no "fresh" unobserved data to give an independent check of whether the
resulting model fitting strategy (including the manually-executed
hyperparameter tuning) has any predictive accuracy at all.
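
The protocol you'd want looks roughly like this: tune hyperparameters against
the validation loss only, and evaluate the test set exactly once at the very
end. The splits, the grid and the placeholder train/evaluate functions below
are all hypothetical, not the article's code:

    # Sketch of the protocol described above: hyperparameters are chosen
    # against validation loss only; the test set is looked at exactly once.
    import random

    data = list(range(1000))            # stand-in for the corpus
    random.seed(0)
    random.shuffle(data)
    train, val, test = data[:800], data[800:900], data[900:]

    def train_model(split, layers, dropout):
        return {"layers": layers, "dropout": dropout}     # placeholder "model"

    def evaluate(model, split):
        # placeholder loss; a real run would compute cross-entropy on `split`
        return abs(model["dropout"] - 0.3) + 0.01 * model["layers"]

    best_val, best_model = float("inf"), None
    for layers in (2, 3):
        for dropout in (0.0, 0.3, 0.5):
            model = train_model(train, layers, dropout)
            val_loss = evaluate(model, val)               # tuning sees val only
            if val_loss < best_val:
                best_val, best_model = val_loss, model

    print("test loss (reported once):", evaluate(best_model, test))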

------
snarfybarfy
My built-in neural network already does this for me.

And it only takes ONE neuron to do the job:

    function isGoodStyle() { if (true) return false; }

~~~
nayname
No neural network here, but a decision tree)))

------
nayname
A seq2seq translator (which is trained to translate from one human language to
another) is not a good architecture for learning computer code.

~~~
andyonthewings
What can be used instead?

~~~
nayname
Another thing I like is NPI -
[https://arxiv.org/pdf/1511.06279.pdf](https://arxiv.org/pdf/1511.06279.pdf).
At the end, you can find NPI vs. Seq2Seq comparison charts.

