By the way, this was written by the sixth-place winner of last year's Kaggle competition for detecting whales. [1]

I think another very interesting point brought up was that even such a well-ranked model on Kaggle did a poor job when applied to a different dataset and had to be retrained, which is a nice example of overfitting.

Excellent article and nice details, thanks!

[1] http://www.kaggle.com/c/whale-detection-challenge/leaderboar...

And third-place winner of the second whale challenge :-) https://www.kaggle.com/c/the-icml-2013-whale-challenge-right...

This second challenge actually featured a different dataset, recorded with different hydrophones etc. But even without retraining (which was rather trivial to do at that point; the hard work of finding the right hyperparameters had already been done), I would still have scored well above 90%. And I think Nick Kridler reported the same.

So overfitting yes, but not too much considering there was a different sensor.

Overfit may be too strong a criticism. If you spend your life only seeing 2's and one day I show you a 3 and you can't tell it's different (you think it's just a poorly written 2), are you overfitted to 2's or did you just not have enough varied experience?

I guess it is technically overfitting, but overfitting sounds wrong when you didn't have access to the extra data in the first place (or even realize you were being given pre-cleaned-up data).

Solution: the first de-noising done with actual noise?

The work is great though. More of this and less bitcoin, godaddy announcements, and SV gossip politics on the front page, please.

I know very little of this competition, but it's possible that's a case of a non-representative training set rather than overfitting.

Learning fluent English doesn't teach you Portuguese.
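
To make that distinction concrete, here is a toy sketch in Python. Everything in it is made up for illustration (the 1-D "readings", the `offset` that stands in for a different sensor, the threshold classifier); it has nothing to do with the actual whale data or the author's model. The idea: a model that is not overfit scores about the same on its training data and on fresh data from the same source, yet can still fall apart when the source itself changes.

```python
import random
from statistics import mean

def sample(n, offset, rng):
    """Draw n labelled readings; `offset` is a hypothetical per-sensor bias."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 readings sit 3 units above class 0; both get unit Gaussian noise.
        reading = rng.gauss(3.0 * label + offset, 1.0)
        data.append((reading, label))
    return data

def fit_threshold(data):
    """Very simple model: put the decision threshold midway between class means."""
    m0 = mean(x for x, y in data if y == 0)
    m1 = mean(x for x, y in data if y == 1)
    return (m0 + m1) / 2

def accuracy(threshold, data):
    """Fraction of readings where 'above threshold' agrees with label 1."""
    return mean(1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in data)

rng = random.Random(0)
train = sample(2000, 0.0, rng)        # the data the model is fitted on
same_sensor = sample(2000, 0.0, rng)  # fresh data, same distribution
new_sensor = sample(2000, 3.0, rng)   # fresh data, shifted by a different sensor

threshold = fit_threshold(train)
acc_train = accuracy(threshold, train)
acc_same = accuracy(threshold, same_sensor)
acc_new = accuracy(threshold, new_sensor)

print(f"train:       {acc_train:.3f}")
print(f"same sensor: {acc_same:.3f}")
print(f"new sensor:  {acc_new:.3f}")
```

The train and same-sensor scores should come out close to each other, so there is no overfitting in the usual sense (a one-parameter model can hardly memorize 2000 points). The new-sensor score drops sharply anyway, purely because the training set was not representative of where the model was later used.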