All well and good, until the clock stopped working during rush hour, and people started asphyxiating.
BTW, did he stop releasing his writing as creative commons?
I don't see how this story gives a "misleading" view of deep learning. From my (admittedly limited) experience with self-driving RC cars, this type of mistake is quite easy for a neural net to make while being quite difficult to detect. In our case, after utilizing a visual back-prop method, we realized our car was using the lights above to direct itself rather than the lanes on the road.
Now, you can refute this and say "well, clearly your data wasn't extensive enough" or "your behavioral model is too simple for a complicated task like driving"; however, as these tools become easier to use, more and more organizations will put them into practice without as much care as the researchers behind most of the current production efforts.
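For anyone curious what that kind of debugging looks like in practice, a minimal gradient-saliency sketch in PyTorch captures the same idea as visual back-prop (asking which input pixels actually drive the output), though it is not necessarily the exact method referenced above; the network and camera frame below are stand-ins:

    import torch
    import torch.nn as nn

    # Stand-in network; in practice this would be the trained steering/lane model.
    model = nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=5, stride=2), nn.ReLU(),
        nn.Conv2d(8, 16, kernel_size=5, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, 1),  # e.g. a single steering-angle output
    )
    model.eval()

    frame = torch.rand(1, 3, 66, 200, requires_grad=True)  # dummy camera frame

    steering = model(frame)
    steering.sum().backward()  # gradient of the output w.r.t. every input pixel

    # High values = pixels the output actually depends on.
    saliency = frame.grad.abs().max(dim=1)[0]  # collapse the colour channels
    print(saliency.shape)  # (1, 66, 200) heatmap; overlay it on the frame to see
                           # whether it lights up on the lane markings or the ceiling lights

Overlaying that heatmap on the input is usually enough to reveal "the network is steering by the lights, not the lanes"-style failures.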
Contrary to this author's claims, despite using data augmentation and a fancy modern CNN, a neural network trained to identify whales hit a local optimum where it looked at patterns in waves on the water to identify the whale instead of distinctive markings on the whale's body.
I don't buy the "this isn't a problem in real world applications" argument being made in this article.
> This naive approach yielded a validation score of just ~5.8 (logloss, lower the better) which was barely better than a random guess.
which is different from the tank story. For the tanks, the neural network appeared to perform well, but was actually not looking at the tanks. Here, it never performed well, and when debugging why not, he found that it was not looking at the whales.
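For context on why ~5.8 is "barely better than a random guess": multiclass log-loss for a uniform guess over K classes is ln(K), so with a few hundred whale identities the chance baseline sits around 6. A quick sanity check (the K here is illustrative, not the competition's actual class count):

    import math

    K = 400  # hypothetical number of whale identities
    uniform_logloss = -math.log(1.0 / K)  # equals ln(K)
    print(round(uniform_logloss, 2))  # ~5.99, so ~5.8 is only marginally better than guessing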
Me neither. Especially considering that this story was already alive before the latest deep learning advances. It is totally believable.
And even with a modern CNN approach, you would expect a model to be able to learn a sunny/cloudy categorization much easier than the nationality of a tank.
This story was repeated by professionals for ages because it is totally believable.
More importantly, it's obvious that "it" could definitely happen and in fact happens a lot, "it" being overfitting to examples. Machine learning classifiers suffer from this a lot; it's the whole bias/variance tradeoff issue. Neural nets are not only not immune to overfitting, they are particularly vulnerable to it (especially the ones with millions of parameters). We've probably all read the adversarial examples papers, a clear case of overfitting to irrelevant features.

The story (apocryphal or not) seems like a cautionary tale against overfitting, or a not-so-innocent attempt to poke fun at machine learning researchers. One way or another, overfitting is no joke and it's definitely no urban legend.
? How is it strange? Presumably people are not, right now, still retelling the story because they are terribly afraid that there are perceptrons out there from the 1960s lurking, waiting to strike, or that anyone is going to go out and try to use 1960s style perceptrons. People are telling it as a cautionary story about current NNs, in the 2000s and 2010s and 2017. Which means... CNNs. So it's worth asking, can it happen with CNNs as trained by any reasonably standard workflow?
> More importantly, it's obvious that "it" could definitely happen and in fact happens a lot- "it" being overfitting to examples.
Overfitting is not dataset bias, as I note several times. For example, dropout or heldout datasets or crossvalidation are highly effective in fighting/detecting overfitting, but do nothing about dataset bias.
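For what it's worth, here's a toy numpy/sklearn sketch of that distinction (the "brightness" confound and all numbers are made up purely for illustration): cross-validation looks great because the bias is present in every fold, and only data collected without the confound reveals the problem.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 1000
    tank = rng.integers(0, 2, n)                    # 1 = tank present
    brightness = tank + rng.normal(0, 0.1, n)       # confound: tanks photographed on sunny days
    real_signal = rng.normal(0, 1, (n, 5)) + 0.1 * tank[:, None]  # weak genuine signal
    X = np.column_stack([brightness, real_signal])

    clf = LogisticRegression(max_iter=1000)
    print("cross-val accuracy:", cross_val_score(clf, X, tank, cv=5).mean())  # ~1.0

    # "Deployment" data: same weak real signal, but brightness no longer tracks tanks.
    tank2 = rng.integers(0, 2, n)
    X2 = np.column_stack([rng.normal(0, 0.1, n),
                          rng.normal(0, 1, (n, 5)) + 0.1 * tank2[:, None]])
    clf.fit(X, tank)
    print("accuracy without the confound:", clf.score(X2, tank2))  # near chance

Held-out folds, dropout, etc. never see data where the confound is absent, so they can't flag it; only different data collection does.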
Can you give an example of where overfitting happened and was successfully corrected for?
If an irrelevant feature is present in all class samples, then it is not the fault of the NN to use it as a class feature; it's bad data.
My question is: are people using overfitting as an excuse for what is instead a badly made NN?
If you are smart enough to create a NN that can tell if it's sunny or not, then detecting tanks would also be possible. But if your NN just sucks, then blaming overfitting is a convenient out.
Overfitting is a major issue in machine learning and it's an inherent characteristic of learning from examples, not the result of a mistake or of poor practice. There are special techniques developed explicitly to reduce overfitting: early stopping (what red75prime describes above), regularisation, bagging (in decision trees), etc. A lot of work also goes into ensuring measures of learning performance don't mistake overfitting for successful learning (e.g. k-fold cross-validation).

I'm sorry that I don't have time to track down a good source for a discussion of the bias-variance tradeoff and overfitting. You can start at the wikipedia page [https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff] and follow the links. In short: a model that learns to reproduce its example data with very high fidelity risks generalising poorly, whereas a model that generalises well may have high training error. Linear classifiers in particular are high-bias, whereas nonlinear learners, like multi-layered neural networks or decision trees, are high-variance.

The problem is real, it's a big bugbear and you won't find any specialist who dismisses it, or who considers it "not a real issue".
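As a concrete illustration of the early-stopping idea mentioned above, here is a minimal, self-contained sketch (train_one_epoch is a fake stand-in that just simulates a validation loss which improves for a while and then creeps back up as the model starts to overfit):

    import numpy as np

    rng = np.random.default_rng(1)

    def train_one_epoch(epoch):
        # Stand-in for a real training step: loss falls early, rises after ~epoch 10.
        return 1.0 / (1 + epoch) + 0.01 * max(0, epoch - 10) + rng.normal(0, 0.005)

    best_loss, best_epoch, patience, bad_epochs = float("inf"), 0, 5, 0
    for epoch in range(100):
        val_loss = train_one_epoch(epoch)
        if val_loss < best_loss:
            best_loss, best_epoch, bad_epochs = val_loss, epoch, 0
            # in a real setup: checkpoint the model weights here
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                print(f"stopping at epoch {epoch}, restoring epoch {best_epoch}")
                break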
Over-fitting is dismissed as an amateur mistake when it really is an endemic problem you are constantly battling no matter how good you are.
What the heck is going on there?
He's schizophrenic, famous for TempleOS, and infamous for the contents of his posts on the internet.
Rainer Kordmaa:
Kinda reminds me of a story of how a neural network was trained by the military to detect camouflaged tanks on terrain, except the pictures with tanks were taken on a nice sunny day and the pictures without on a cloudy day, and instead of a tank detector they ended up with a sunny day detector
In my opinion, it's far more dangerous to downplay the limitations of this technology and embolden snake-oil purveyors than it is to demand an inconvenient degree of rigor and caution in reporting results.
I am well-aware of adversarial examples, and they are not the same thing as dataset bias, and I am very troubled by them. If you look at the section on whether we should tell the tank story as a cautionary story, I already say:
> I also fear that telling the tank story tends to promote complacency and underestimation of the state of the art by implying that NNs and AI in general are toy systems which are far from practicality and cannot work in the real world (particularly the story variants which date the tank story recently), or that such systems will fail in easily diagnosed and visible ways, ways which can be diagnosed by a human just comparing the photos or applying some political reasoning to the outputs, when what we actually see with deep learning are failure modes like "adversarial examples" which are quite as inscrutable as the neural nets themselves (or AlphaGo's one misjudged move resulting in its only loss to Lee Sedol).
To expand a little: dataset bias at least has the tendency to expose itself as soon as you try to apply it. You waste your time, but that's generally the worst part. I'm more worried about stuff like adversarial examples, which will work great in the field right up until a hacker comes by with a custom adversarial example (eg the adversarial car sign work showing you can trick simple CNNs into misclassifying speed limits and stop signs using adversarial examples pasted onto walls or signs or streets). This is not dataset bias; you can collect images of every single stop sign in the world and that will not stop adversarial examples.
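For readers who haven't seen those papers: the core of the fast-gradient-sign method behind many adversarial-example demos fits in a few lines. This is just a sketch with an untrained stand-in model and a random "image", not the actual stop-sign attack:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Stand-in classifier; the sign attacks target real traffic-sign CNNs.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    model.eval()

    image = torch.rand(1, 3, 32, 32, requires_grad=True)
    true_label = torch.tensor([3])

    loss = F.cross_entropy(model(image), true_label)
    loss.backward()

    epsilon = 0.03  # perturbation budget, small enough to be near-invisible
    adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

    print(model(image).argmax().item(), model(adversarial).argmax().item())
    # With a trained network the prediction often flips even though the two
    # images look identical to a human.

The point being: no amount of extra stop-sign photos fixes this, because the failure is constructed from the model's own gradients, not from a gap in the data.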
> embolden snake-oil purveyors than it is to demand an inconvenient degree of rigor and caution in reporting results.
I think it's ironic to say that doing the very simplest level of fact-checking like 'did this story ever actually happen' is an 'inconvenient degree of rigor and caution' and 'emboldens snake-oil purveyors'.
Until today, I believed it was true. It was told to me as an undergrad, by a professor who believed it himself.
Many times it seems like people go into these things hoping that the machine learning part will figure things out for them and relieve them of the burden of thinking hard about the problem. It doesn't. It only moves your problem over a bit and increases the difficulty.
In fact this problem pops up even in pedagogy where the lessons people are taught actually train them to do the wrong thing (for example pilots responding to aircraft attitude upsets).
The parable's lesson is a simplistic one, basically: "stop and think about what you're doing". But like other simple lessons about crying wolf or a stitch in time, it bears repeating.
Well, that's not quite true. In robot sensing several things have recently moved from the nigh-on-impossible column to the holy-shit-that-actually-works-pretty-well column, thanks to ML.
But I agree with the rest of it.
He's not arguing against having cautionary tales, he's arguing that we should base them on actual problems instead of imaginary ones.
However, also consider that the tank parable has circulated in textbooks and undergraduate introductory lectures for several decades. The main lesson I learned from the story is that after training a model, one should validate it to see whether it really generalizes, e.g. to detecting tanks during both night and daytime. Is it really a surprise that it might be difficult to find egregiously naive mistakes?
1. Training on a biased data set leads to biased predictions. This is undoubtedly true.
2. Data sets can be biased in unexpected and unforeseen ways, and therefore predictions can also be biased in unexpected and unforeseen ways. The examples at the end of this article don't quite touch on that point. But examples of this abound in social science. E.g.: https://blog.conceptnet.io/2017/07/13/how-to-make-a-racist-a...
3. Deep and convolutional neural networks are susceptible to this phenomenon. This is the point that the article is debating.
And there is a big difference between something that happened and something that did not happen.
This is a hasty link; IIRC the error happened when someone read out instructions aloud to someone else who was taking notes.
Yes, that badly behaving neural network(s) was human, and therefore far more sophisticated than any we can build yet. Which makes the problem worse and more real, not better or less real.
From this viewpoint, I found the section where the author argues at length how this could not possibly happen with current state-of-the-art visual-task CNNs (especially because people apply preprocessing steps such as whitening and augmentation to get rid of exactly this kind of bias), let's say, weird. The parable is not about CNNs; it is about the importance of paying attention to what features your model will extract from the training dataset and whether your model is learning the right things.
There are even hierarchical models with an equation giving the probability that an item will be observed at all, conditioned on known features.
Those who don't know their statistical models are bound to reinvent statistical theory.
Is there anybody still doing this?
The Google query gwern cites is highly misleading because "normalize" in the context of neural nets for computer vision almost always means "subtract the average and then divide by the standard deviation."
If you want some better data than unreliable searches, go download pretrained models for popular architectures and popular frameworks and look at the input pipelines for them. You'll find that whitening is absolutely not common for image classification/detection today (yes, there are still some cases where it is used, but typically on smaller datasets where you can't get that invariance from data, which is the way you prefer it to be - if one class actually is more likely to be present in dark images, you don't want to kill that information).
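To make the terminology concrete: the "normalize" step in a typical modern image pipeline is per-channel mean/std normalization, not ZCA/PCA whitening. The standard torchvision ImageNet preprocessing, for instance, looks roughly like this (the constants are the commonly used ImageNet channel statistics):

    from torchvision import transforms

    # Usual preprocessing for ImageNet-trained models: resize/crop, convert to a
    # tensor in [0, 1], then subtract the per-channel mean and divide by the std.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    # No whitening anywhere in the pipeline; per-image brightness/contrast
    # information survives into the network's input.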
AlexNet is the name of a convolutional neural network, originally written with CUDA to run with GPU support, which competed in the ImageNet Large Scale Visual Recognition Challenge in 2012. The network achieved a top-5 error of 15.3%, more than 10.8 percentage points lower than that of the runner-up. AlexNet was designed by the SuperVision group, consisting of Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever.