When I first read this paper, I found it thought-provoking, and felt it captured the tension it describes pretty well.
Over time, I've come to see it as pretty dated and misleading.
The problem is that the methods of both "cultures" are pretty black box; it's just a matter of which black you want to dress your box in. Actually, it's all black boxes anyway, all the way down: epistemological matryoshki.
The real tension is between relatively more parametric and relatively nonparametric approaches, and how much you want to assume about your data. That, in turn, reduces to a bias-variance tradeoff: more parametric approaches produce less variance but more bias; less parametric approaches produce more variance but less bias. In some problem areas the structure of the problem pushes things in one direction or the other; e.g., in some fields you know a lot a priori, so just slapping a huge predictive net on x and y makes no sense, while in other fields you know almost nothing, so it makes a lot of sense.
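To make that tradeoff concrete, here's a toy sketch (Python, with numpy and scikit-learn assumed; the sine curve, noise level, and sample sizes are all made up for illustration). It fits a parametric model (linear regression) and a nonparametric one (k-nearest neighbors) to many resampled training sets and compares the squared bias and variance of their predictions:

```python
# Toy sketch: parametric (linear) vs nonparametric (k-NN) fits to a
# hypothetical noisy sine curve, compared on bias and variance across
# resampled training sets. All specifics here are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

def truth(x):
    return np.sin(3 * x)  # made-up ground truth

x_test = np.linspace(0, 2, 50).reshape(-1, 1)
preds = {"linear": [], "knn": []}

for _ in range(200):  # many resampled training sets
    x = rng.uniform(0, 2, 40).reshape(-1, 1)
    y = truth(x).ravel() + rng.normal(0, 0.3, 40)
    preds["linear"].append(LinearRegression().fit(x, y).predict(x_test))
    preds["knn"].append(KNeighborsRegressor(n_neighbors=3).fit(x, y).predict(x_test))

for name, p in preds.items():
    p = np.array(p)
    bias2 = np.mean((p.mean(axis=0) - truth(x_test).ravel()) ** 2)
    var = np.mean(p.var(axis=0))
    print(f"{name}: squared bias ~ {bias2:.3f}, variance ~ {var:.3f}")
```

The linear model assumes too much (high bias, low variance); the k-NN model assumes almost nothing (low bias, high variance). Which one you want depends on how much you actually know about the problem.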
Another tension being conflated a bit is between prediction and measurement (supervised and unsupervised classification, forward and inverse inference, etc.). Much of what is being hyped now is essentially prediction, but a huge class of problems exists that doesn't really fall into this category nicely.
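A toy contrast (Python with scikit-learn assumed; the two-Gaussian data is made up): the supervised task maps x to a known label and is scored by predictive accuracy, while the unsupervised task estimates structure in x with no labels at all, and "success" there is a question of inference, not prediction:

```python
# Toy contrast: prediction (supervised classifier) vs measurement
# (unsupervised estimation of latent structure). Made-up Gaussian data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)  # labels exist only for the supervised task

# Prediction: accuracy against labels is the natural score
# (held-out in practice; in-sample here for brevity).
clf = LogisticRegression().fit(x, y)
print("supervised accuracy:", clf.score(x, y))

# Measurement: no labels; we estimate component means and weights,
# and care whether those estimates are right, not whether they predict.
gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
print("estimated component means:\n", gmm.means_)
```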
I disagree that computational statistics was being neglected in statistics. What I have seen is an older class of methods (neural networks) get new life breathed into it and become extraordinarily successful in a very specific but important class of scenarios. Subsequently, the "AI/ML" label got expanded to include just about any relatively nonparametric, computational statistical method. Maybe computational multivariate predictive discrimination was neglected?
A lot of what AI/ML is starting to bump up against are problems that statistics and other quantitative fields have wrestled with for decades. How generalizable are the conclusions based on these giant datasets to other data? What do you do when you have a massive model fit to an idiosyncratic set of inputs? How do you determine whether your model is fitting to meaningful features? What is the meaning of those features? Why this model and not another one? There are really strong answers to many of these types of questions, and they often come from traditional areas of statistics.
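One of those old statistical answers to "how well does this generalize?" is resampling. A minimal sketch (Python with scikit-learn assumed; the data and feature count are made up): compare a flexible model's flattering in-sample fit to its cross-validated score:

```python
# Toy sketch: in-sample fit vs cross-validated fit for a flexible model
# trained on hypothetical idiosyncratic inputs where only one of twenty
# features actually matters. All specifics are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
x = rng.normal(size=(200, 20))        # 20 features, mostly noise
y = x[:, 0] + rng.normal(0, 1, 200)   # only the first feature matters

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(x, y)
print("in-sample R^2:", model.score(x, y))                         # flattering
print("cross-validated R^2:", cross_val_score(model, x, y, cv=5).mean())  # honest
```

The gap between the two numbers is exactly the kind of thing decades-old resampling methods were built to expose.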
Anyway, I see this paper as setting up a sort of artificial dichotomy around issues that have existed for a long, long time, and I see that artificial dichotomy as masking more fundamental issues that face anyone fitting any quantitative model to data. It's a misleading and maybe even harmful paper, in my opinion.