My point is that they all generalize better from larger datasets. Size is relative, and different techniques need different amounts of data. Linear regression, for instance, can work quite well with much less data than a neural net. It just depends on the complexity of the problem.
>> My point is that they all generalize better from larger datasets.
Like I say, this is not the case. There are learning algorithms that generalise so well from few data that their performance can improve only marginally with increasing amounts of data, or not at all.
I appreciate that you probably have no idea what I'm talking about. I certainly don't mean linear regression.
> Like I say, this is not the case. There are learning algorithms that generalise so well from few data that their performance can improve only marginally with increasing amounts of data, or not at all.
Erm, no. Not unless they are solving the problem perfectly.
> I appreciate that you probably have no idea what I'm talking about. I certainly don't mean linear regression.
I work in the field. I'm quite certain I'm familiar with whatever it is that you think you're talking about.
The category of algorithms that attempt to learn from few examples is called "one-shot learning". It usually comes up in the context of image classification, but it applies equally well elsewhere. These algorithms still learn better from more data.
Do feel free to share an example of an algorithm that generalizes better from less data. I'll wait.
>> Erm, no. Not unless they are solving the problem perfectly.
Well, yes, that's what I mean.
I gave an example here a while ago of how a Meta-Interpretive Learning (MIL) algorithm, Metagol, can learn the aⁿbⁿ grammar perfectly from 4 positive examples:
That's typical of Metagol, as well as other algorithms in Inductive Logic Programming, the broader sub-field of machine learning that MIL belongs to.
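To make that concrete, here's a Python sketch of the kind of two-clause recursive hypothesis a learner like Metagol induces for aⁿbⁿ. This is my own illustrative rendering of the learned grammar, not Metagol's actual output (which is a Prolog program):

```python
def anbn(s: str) -> bool:
    """Recognize the a^n b^n language (n >= 1), mirroring the
    two-clause recursive hypothesis:
        s -> a b        (base clause)
        s -> a s b      (recursive clause)
    """
    if s == "ab":  # base clause: s -> a b
        return True
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        return anbn(s[1:-1])  # recursive clause: s -> a s b
    return False
```

The point is that a hypothesis this small is fully determined by a handful of positive examples: once the recognizer accepts "ab" and "aabb", it accepts aⁿbⁿ for every n, so further examples add nothing.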
>> Do feel free to share an example of an algorithm that generalizes better from less data. I'll wait.
To clarify, my claim is that there are algorithms that learn adequately from few data and therefore don't "need" more data. Not that less data is better.

That said, there are theoretical results suggesting that a larger hypothesis space increases the chance of the learner overfitting to noise. So what is really needed to improve generalisation is not more data, but more relevant data. Then again, that is the subject of my current PhD, so I might just be interpreting everything through the lens of my research (as is typical for PhD students).
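For what it's worth, the hypothesis-space point is easy to demonstrate with a toy experiment (a sketch using NumPy's polyfit; the exact numbers depend on the noise seed): a degree-9 polynomial, drawn from a much larger hypothesis space than a straight line, fits 10 noisy samples of a linear target better on the training set but generalises worse:

```python
import warnings
import numpy as np

warnings.simplefilter("ignore")  # polyfit warns about the ill-conditioned degree-9 fit

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.3, 10)  # linear signal plus noise
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test                              # noise-free target

def errors(degree):
    """Train/test mean squared error of a degree-`degree` polynomial fit."""
    p = np.poly1d(np.polyfit(x_train, y_train, degree))
    return (np.mean((p(x_train) - y_train) ** 2),
            np.mean((p(x_test) - y_test) ** 2))

train_lin, test_lin = errors(1)   # small hypothesis space
train_big, test_big = errors(9)   # large hypothesis space: interpolates the noise
```

Here `train_big` is near zero (the degree-9 polynomial passes through every noisy training point) while `test_big` exceeds `test_lin`: the extra capacity was spent modelling noise, not signal.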
Not to my knowledge. What techniques did you have in mind that work like that?