
Leo Breiman [0] (inventor of bagging and random forests) wrote a paper called "Statistical Modeling: The Two Cultures" [1], and since reading it, I see its thesis everywhere. The basic idea is that statisticians place(d) too much emphasis on model interpretability ("data modeling" in the paper) and, as a result, missed out on the machine learning revolution ("algorithmic modeling" in the paper). In the author's words (parenthetical added by me), "[T]he focus in the statistical community on data models (simple, interpretable models) has [l]ed to irrelevant theory and questionable scientific conclusions."

In this TDS post, the author says "Statisticians and Actuaries are at the bottom of the heap as a prior role for existing data scientists." Maybe this isn't a coincidence? Plenty of companies had statisticians on staff, but the explosion of data science happened anyway. Why? Because data scientists do the same types of tasks as statisticians, but while statisticians belong to the data modeling culture, data scientists are expected to belong to the algorithmic modeling culture. The market seems to be saying that the algorithmic modeling culture gets results.

The author references "Type A vs Type B Data scientists" [2], which seems to be getting at the same thing: "The Type A Data Scientist is very similar to a statistician... Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. The Type B Data Scientist is mainly interested in using data 'in production.' They build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results)." For whatever reason, there is a correlation between algorithmic modeling / Type B and "getting things done".

[0] https://en.wikipedia.org/wiki/Leo_Breiman

[1] https://projecteuclid.org/download/pdf_1/euclid.ss/100921372...

[2] https://www.quora.com/What-is-data-science/answer/Michael-Ho...

With the failure modes we’ve seen from deep learning and related AI, maybe the statisticians are on to something?
