
Why Are Data Science Leaders Running for the Exit? - gk1
https://www.linkedin.com/pulse/why-data-science-leaders-running-exit-edward-chenard
======
reureu
This is an article making baseless generalizations that all data science is
failing in Silicon Valley, and that the true need of a company is to hire a
"data strategist." Unsurprisingly, the article is written by a self-proclaimed
data strategist.

This blog post would be more compelling if there were citations or even
general context backing up the author's pretty wild claims ("the vast majority
(we are talking 80-90%) want to leave their current job", "failure rate of
data science teams is over 90% right now", "the top 5 data scientists I have
ever worked with, only 1 had a PhD").

As a data scientist working for a relatively well known startup in the Bay
Area, I don't dispute that there are problems in this industry. There are many
under-/un-qualified managers running around (I don't know if that's a data
science specific problem, or if it also applies to much of engineering as
well). There are ICs and teams that have trouble producing industry-applicable
work because they're too academically focused. There are also business-driven
teams that have trouble producing quality/reproducible work because they're
too business focused.

That said, the claim of 90% failure is ridiculous on its face. Having another
MBA put together a powerpoint presentation about a company's data strategy
isn't going to suddenly solve the actual issues in data science.

~~~
Amygaz
That was also my impression. It's kind of ironic that a Data
Science/Strategist writing would be so heavily opinionated and have so little
in terms of data.

~~~
zengid
Maybe that's because the strategy is to produce bullshit

~~~
setgree
Ha, I like that the grammar of this distinguishes between "there is no
strategy" and "the strategy is, actively, to produce bullshit" and lands on
the second option, which jibes with my experience :)

------
mizzao
Data science is a horribly overloaded term.

Ask a databases person and they will tell you about ETLs and relational joins.
Ask a statistician and you will hear about estimators and the bias-variance
tradeoff. Ask a social scientist and they will tell you about experiments and
causal inference. Ask a CS theoretician and they will mention streaming
algorithms and probabilistic guarantees. Ask a machine learning person and you
will probably hear a lot about prediction and stochastic gradient descent.

The approach that will breed the best practitioners is an interdisciplinary
one; a great example is UBC's program:
[https://masterdatascience.science.ubc.ca/program/courses](https://masterdatascience.science.ubc.ca/program/courses).
And also future positions that separate the roles of data janitor, data
analyst, machine learning engineer, and so on. There is a huge difference
between what Facebook's Core Data Science team does and what most companies
call "data science", which is sitting on a huge pile of poorly aggregated data
and having little idea what to do with it.

Most importantly, data science has to be thoughtful and part of the
engineering process, and can't be done after the fact with digital trace data.
It's a lot easier to run experiments than it is to do statistical gymnastics
with observational data.

------
dahart
There's some irony in an article about data science using only anecdote and
emotion to make its argument. :P

PhDs are notoriously (and understandably) finicky about doing non-research
work when the job is sold as research; that's why you have to be sure you
really need them. That issue has been around since long before data science in
SaaS companies was a thing; it's been around since before the internet was a
thing.

Assuming that there really is an exodus problem, PhDs being unprepared to lead
teams and do non-data-science work is one theory that seems reasonable, but it
definitely isn't covering all possibilities.

Having been in a couple of companies hiring data scientists recently, I
personally think the big problem is expectation for data science in general,
more than who's been hired to do the job. The couple of companies I've worked
with, anecdotally, seem to expect "data science" to uncover huge missing
profits, expose gaping inefficiencies, and reveal new revenue streams the
business people didn't see. And when there turn out to be only minor ways to
streamline things, the team of data scientists needs something else to do.

~~~
junkscience2017
I don't blame the practitioners, I blame the business people who foolishly
believed it was easy to shake billions out of their data if they applied
enough wizardry.

for most, Big Data has been a sham, but it was very profitable for cloud
vendors

------
eric_b
Maybe these people want to leave because they realize they aren't adding
value?

I've worked in many Fortune 500's with "data science" or "big data" teams.
These are well staffed, very expensive teams that have large budgets for
pricey hardware (sometimes on-prem, sometimes in the cloud)

I have never seen one of these teams produce insights or actionable
intelligence valued anywhere near their cost. I mean not even close. Usually
it is a tremendous money fire. (Also, before the pitchforks come out, I'm sure
there are places where the data science team is a profitable department, but
it's not the norm, not by any stretch)

Part of the problem is the business doesn't know what questions to ask. Part
of the problem is the technology itself. Spark streaming, Hadoop, and all the
other tools really aren't very good (very good being defined by helping
businesses answer burning questions in a reliable and timely manner)

The most valuable data insights I've seen come from purpose built analytics
tools using simple storage backends (RDBMS, elasticsearch etc) where the
person running the team is a domain expert, not a "data scientist".

~~~
zebrafish
I'm a firm believer that data science should be pulled out of IT and put into
the business. However, storage is so cheap that you should never NOT collect
data if you can. It's better to dump it into a data warehouse and wait for
somebody who can use it to come along than it is to never collect it.

Then, and only then, should you look at your service levels and determine if
there is a need for some sort of Hadoop or Spark infrastructure.

~~~
apohn
>I'm a firm believer that data science should be pulled out of IT and put into
the business.

I say this from personal experience. The major risk here is that Data
Scientists will constantly get pulled in as resources to support business
fires and high visibility projects that should be handled by other people.

Otherwise what happens is this: "I know you are working on that Data Science
project that is scoped for 4 weeks, but can you do this Business Analyst thing
the _insert exec name here_ asked for by this Friday?"

Data Scientists need to be shielded from that day to day analyst stuff,
otherwise they'll never be able to do their jobs. One of the ways to do this
is to put them in IT in a business facing consulting role. IT is typically
(not always) more focused on a proper solution, not day to day triage. So
they can filter out projects that aren't appropriate.

~~~
IgniteTheSun
This may end up repeating the mistakes of a decade or two ago: after a
business leader gets two requests rejected, they are very unlikely to go back
with a third request - they will simply outsource every future request. Soon
the IT departments were being downsized because they couldn't get business
leaders to use them instead of going outside.

------
shas3
The dissing of PhDs is unwarranted. Gross over-generalization. People being
skeptical of ideas in a peer-review sort of way can be both good and bad. For
example, the assumptions and claims of this article can do with a substantial
amount of skepticism. Perhaps someone should express skepticism and view this
article with the peer-review critical eye that he mentions.

I am reminded of climate-change skeptic articles that say “hey these
scientists know nothing. See today it’s 10 below zero. Common sense! No global
warming.”

~~~
deong
Also, the idea that peer review is rife with people killing articles to save
their "pet theory" is idiotic. Reviewers just want you to cite them.

~~~
taeric
Ideally, reviewers would just want your paper to be as good as it could be.

And yes, we should all appreciate not just the named author of the work, but
all of the work that goes into it. This is true of all works, not just
studies.

------
amrrs
Those are very well put reasons. But another significant reason is that very
often a highly accurate model is not the model an organization wants. Hence
when you take this modelling exercise to the next level, the whole effort
fails because you have spent your time on something that is not required.
Hence the accuracy-sophistication obsession of data scientists, irrespective
of whether they are leaders or just ICs, makes it tough to establish their
credibility and broader organizational vision.

And in many other cases, the media has created a myth of Data Scientists as
magicians, so a lot of people expect a lot of unknowns from Data Scientists,
while a lot of people become Data Scientists expecting to create unknowns and
magic, which is not going to yield anything better than a mere Exploratory
Data Analysis based insight from Microsoft Excel.

------
pseudonymGuy
As someone (a Director) running for the exit... lack of control over: taking
on new projects, staffing, infrastructure, meeting schedules, deadlines.
Complete lack of ability to say "no" to our biggest client. Add in conflicting
priorities from leadership, month long delays in compensation adjustment, lack
of clarity around valuation status, and having to focus primarily on new and
maintenance ETL work.

I got frustrated enough to where I almost quit on the spot a few weeks ago.
Frustrated enough to be on the verge of tears and punches because I felt I'd
wasted the past two years of my life. I worked to get myself transferred from
my leadership role to, ostensibly, a more thought leadership type role which
should happen in the next few weeks.

However, if I don't see significant changes in January, February, and March
then I'm leaving. Honestly, I'll probably leave anyway because I think I can
grow my skillset, work on better problems, and set my future up better
somewhere else.

What I hate about this situation is that I used to be the biggest supporter
this company had. But years of this difficulty is enough.

~~~
megaman22
I feel ya; this is almost exactly what I've felt this year. It's just
incompetent management and leadership, and it's all too common. But it's
incredibly demoralizing to be on the front lines of it. When you can't control
anything, and are reduced to reacting on ever more tactical issues, it burns
you out hard and fast. Indecisiveness is killer; better to pick something and
do it, than waffle around forever spinning the wheels.

------
michaelbuckbee
A working knowledge of some basic stats, access to some decent tooling, a good
working knowledge of what the organization is trying to accomplish and the
ability to persuade and lead based on findings will likely move the revenue
needle much more for most companies than a pure "data science" role.

The trouble is of course we don't have a good name for that role, so it's all
"data science".

~~~
kk58
That's just a smart manager with good communication skills

~~~
deong
Most managers fail in the "knowledge of basic stats" and "access to decent
tools" areas. And if they have the knowledge, most larger organizations don't
have a ton of managers sitting around writing python code during the day.
Their day is full of meetings.

------
kk58
Data science and machine learning are different beasts. CS based Machine
learning and AI seem useful in task automation. Its real value is in building
IP.

DATA science is more like quant consulting. It is actually a subdiscipline of
management science in my opinion.

The key role of a data scientist in a non-SV company is usually 6-fold. First,
understand the business problem thoroughly. Get the people, process and value
stack nailed.

Second, and this is the most important part: figure out how to frame the
business problem. Framing isn't as linear as it looks, but it saves a ton of
time and money if you front-load time in framing the problem. A substep here
is to frame it as an "information" or data problem.

Third, articulate a strategy to gather data or information. Is in-house data
available? Web scrape data, buy data, build generative models. You'd be
surprised how much ML work happens in this step.

Fourth, shape the information into a format that can be analysed.

Fifth, study the hell out of this dataset; the PhD angle comes in here. Very
few people are systematically trained to study data with rigor and be humble
and honest about their findings.

Lastly, connect the dots between the insights and the business problem at
hand. Too often this stops with bar graphs and some scatter plots. That's just
lazy. You need to really take ownership of the problem and educate the
business on why the solution will help, and be honest about what it will take
to get it to work. The people, process, value stack from the first step kicks
in here in recommendation and action.

A bonus point: put some KPIs in place to track how well your suggestions are
working.

Real data science is super super hard. It's like finding a real nugget of
diamond in dust.

Data science is a way of thinking. It's a business culture, to be honest.

------
ataturk
It was all hype. All of it. I enjoyed learning about data science, but in the
end, there were no jobs I could actually apply for and realistically get. I
also don't think they were providing the big wins for the company that would
justify what they were getting paid. Again, all hype.

~~~
autokad
same here. i work great in teams, masters in computer science, do really well
in kaggle competitions, I understand when to use the algorithms, how they
work, etc.

but in the very few ds interviews I had, I was tanked as soon as I got asked
questions like: what's the formula for a t-test. I know what the test is, and
when it's not appropriate to apply it, but I don't memorize those kinds of
formulas. the field is just too big to do that

~~~
em500
I'm sorry but a standard t-test is like the most basic thing in statistics
after means and variances. It's not like they want you to prove the CLT or
something. To me it's more like a statistics fizz-buzz.
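
For anyone following along who hasn't seen it written out, here is a minimal sketch of one common version, the two-sample Student's t statistic with pooled variance. Which variant an interviewer means by "a standard t-test" is an assumption on my part (one-sample and Welch's versions also exist), and the function name and toy numbers are made up for illustration.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic, assuming equal variances in both groups."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance uses the n - 1 (sample) denominator.
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Toy data, invented for this example.
t_stat, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, df)
```

The statistic is just the difference in means divided by its estimated standard error; turning it into a p-value needs the t distribution with `df` degrees of freedom, which this sketch leaves out.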

~~~
steve_gh
To which, of course, the correct answer is "Unless you have good reason to
believe the data is normally distributed, you should be using Mann-Whitney",
then explain that :-)

~~~
thousandautumns
That's not really true though. The vast majority of t-tests are done using a
sample average as a test statistic, and by the CLT, as the sample size goes
up, the distribution of a sample average becomes approximately normal. So
unless you have a really weak sample size, the t-test assumption holds even in
the absence of normally distributed data.
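
A rough way to see the CLT argument is to simulate it. The sketch below draws from an exponential distribution (my choice of a clearly skewed, non-normal example) and measures the skewness of the sample means at a small and a large sample size; the helper names, sample sizes, and seed are all arbitrary assumptions.

```python
import random
import statistics


def skewness(values):
    """Third standardized moment; close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sample_mean_skewness(n, replicates=2000, seed=0):
    """Skewness of the distribution of the mean of n exponential draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]
    return skewness(means)


skew_small = sample_mean_skewness(n=3)    # means of tiny samples stay skewed
skew_large = sample_mean_skewness(n=300)  # means of larger samples look far more symmetric
print(skew_small, skew_large)
```

This only illustrates that the sample mean gets closer to symmetric as n grows. How large n needs to be before a t-test is trustworthy depends on how heavy-tailed the underlying data is, so treat it as a demonstration rather than a rule.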

------
saosebastiao
Pardon the cynicism, but all I see when I read this is MBA turd-word lobbing.
It is meaningless drivel, bloviating self-promotion directed at people who
don't understand anything about how their world works while thinking they are
qualified to lead it because of X degree at Y institution, the admittance of
whom was due to Z (powerful rich person) that is friends with his dad.

Others have noted the lack of data coming from a data strategist. That is the
feature, not the flaw. Data isn't necessary in these people's worlds. They
spout platitudes about data, laud it when it supports their world view, and
ignore it when it doesn't. People that buy the world that this guy is selling
are suckers.

------
mykull
The author is way off base in saying software engineering is about assembly
rather than discovery. That's only true of the lowest-skill non-innovating
software shops, and even then, developers are constantly having to discover
how to use the new tool of the month.

On an unrelated note I don't really buy into the idea of data science as a
distinct thing from computer science.

~~~
sidlls
> The author is way off base in saying software engineering is about assembly
> rather than discovery.

Depends. The interview process that is the fad these days selects for assembly
workers. Organizations using that process are doing so to find people who
aren't curious about broadening their knowledge and experience, but to find
people highly competent at repeatedly doing their CS curriculum over and over.

> On an unrelated note I don't really buy into the idea of data science as a
> distinct thing from computer science

How is it not distinct? Data science is closer to a physical engineering
discipline than an applied math one. The tools of the trade might be enhanced
by application of CS, but the two are quite distinct.

~~~
js8
> Organizations using that process are doing so to find people who aren't
> curious about broadening their knowledge and experience, but to find people
> highly competent at repeatedly doing their CS curriculum over and over.

And why are they doing this? Why don't they just buy an existing solution from
another vendor, why build your own solution?

~~~
sidlls
Because such products typically don't exist (either they aren't appropriate
for the scale or the end use, or both).

There's a definite place for these workers.

------
chestervonwinch
As someone near graduation with a PhD in mathematics with research focusing on
machine learning / image processing and beginning the job search, this didn't
leave me feeling very optimistic. Is this a problem of not setting proper
expectations for a role on the company's part or not fully understanding the
expected duties of the position before being hired on the data scientist's part?

~~~
ska
It's a bit of both.

You are unlikely to ever achieve a position in industry that has the same
freedom of investigation and time that you experienced as a graduate student,
let alone the holy grail of a self-funded post-doc or the like. On the other
hand, you won't achieve that as faculty either. In industry you're going to
end up spending a lot of time on activities and meetings that don't relate to
what you now consider "what you do" (e.g. ETL, domain interview, co-ordination
etc.). Your success and happiness in this will depend a lot on whether you
consider those things a waste of time taking you away from what you "really
do", or an important part of how you do such activities in larger groups.

On the other hand, data science has a similar problem as software programming;
failing to know how to find or produce good managers for it, companies tend to
promote from the ranks of their best individual contributors. This can work
well with enough support and mentorship (assuming the candidate wants to do
it) but it can be a disaster without a plan and that support.

~~~
chestervonwinch
> You are unlikely to ever achieve a position in industry that has the same
> freedom of investigation and time that you experienced as a graduate
> student, let alone the holy grail of a self-funded post-doc or the like. On
> the other hand, you won't achieve that as faculty either.

which explains my advisor's "enjoy it while you still can!" attitude whenever
we talk of such things :)

~~~
ska
They speak truth :)

------
Maven911
Besides the demands of agile and corporate culture, are there other reasons why
data science folks are leaving (if true)? For instance lack of cooperation, or
not using the insights/recommendations?

~~~
felideon
Our data scientist left because instead of data science, he ended up doing
more ETL and data engineering support than what he originally had signed up
for.

A full-time data engineer should probably have been hired.

I wonder if this leads to an analogous discussion as
[http://wiki.c2.com/?ArchitectsDontCode](http://wiki.c2.com/?ArchitectsDontCode).

~~~
crescentfresh
Not too familiar with the data-this, data-that space. What does a data
engineer do? What does a data scientist do? Does either code?

~~~
ct0
They both code. The engineer would pull the data from various sources and
structure it in a way that the data scientist is happy with (tidy format). The
DS would then code the models based on the engineer's work, which takes
considerably less time.
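
To make "tidy format" a bit more concrete, here is a small sketch of the kind of reshaping involved: turning a wide table (one column per month) into long records with one measurement per row. The store/month data and the `to_tidy` helper are hypothetical, and real pipelines would more likely use a dataframe library than plain dicts.

```python
def to_tidy(wide_rows, id_column):
    """Melt wide rows into one (id, variable, value) record per measurement."""
    tidy = []
    for row in wide_rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id is carried onto every output record instead
            tidy.append(
                {id_column: row[id_column], "variable": column, "value": value}
            )
    return tidy


# Hypothetical export: one row per store, one column per month.
wide = [
    {"store": "A", "jan": 10, "feb": 12},
    {"store": "B", "jan": 7, "feb": 9},
]
tidy = to_tidy(wide, id_column="store")
for record in tidy:
    print(record)
```

Each output record holds a single observation, which is the shape most modelling and plotting tools expect as input.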

------
Xcelerate
Speak for yourself. I work as a data scientist at a large (non-tech) company.
Our team of data scientists is directly responsible for massive amounts of
profit that the company would not otherwise have. This article seems like a
baseless generalization.

~~~
sonabinu
A lot of value from Data Science comes in large non-tech companies. It's not
the super glamorous type of DS. It is about using data to drive optimizations,
process improvements, manage supply chains, instrumenting alert mechanisms to
monitor key mechanical and software components etc.

------
junkscience2017
probably some realization that 99% of their actual job is ORDER BY, GROUP BY,
SUM...not a bunch of cool math scrawled on a whiteboard

I tried warning people off of this field given that they would be doing sql
query monkey work and drawing simple line charts...most definitely not an
upgrade from software development like it was sold as

~~~
shas3
They’re probably in the wrong company. Are you suggesting there isn’t an
explosion of jobs and applications requiring advanced math and machine learning
or statistical inference? If so you are off the mark.

~~~
Asdfbla
You're right of course, but still there is a point: For 1 'intellectually'
interesting data science job created there's probably 10 that have more to do
with data massaging and all the relatively boring logistical stuff that comes
with data science.

Doesn't invalidate your point, but the majority of jobs under the very broad
'data science' label just aren't super interesting after all. Guess you just
have to be careful to examine exactly what a specific 'data science' position
entails.

~~~
wakkaflokka
I've found this to be very true. I've got friends who were hired as data
scientists, paid a great salary, and are finding themselves doing basic
reporting and SQL querying all day. No statistical modeling, and certainly no
building anything.

------
KirinDave
I really question this "90%" number for "failed" data science teams.

While I certainly can see overhype in the space leading to team failure (and I
have indeed seen that), I'm just surprised that the 10% of teams I've been on
were so successful by comparison.

But one of the nasty things I see happening in the space is that often times
really great data science folks, in studying a space, come up with a way to
invalidate a lot of the business models. If a major player adopted these,
there is a chance of success, but it's also a major risk (as it basically
unseats the rest of the business).

As such, I think there is a lot of pressure for good data scientists to end up
in small teams leading what we'd call 'highly disruptive' businesses. We see
this in the finance sector (where it's least likely to succeed) all the time.

------
PopsiclePete
I've never seen a "data scientist" produce anything meaningful, at least at my
company. They run some data through some Python framework and produce weird
meaningless numbers that I'm supposed to be impressed by or to trust. No
insight as to how that number came to be. "87.525%". Ok sweet. Could've come
from /dev/urandom for all I know. At least 3 of them have left in the 2 years
that I've been here.

It seems like 90% of these guys are buzzword wankers who jumped onto the "Big
$$$" bandwagon. Same type of person who put "PHP Wizard" on their resume in
2005.

Good riddance.

I'm sure, of course, that quite a few of them are decent, intelligent people
producing real value at real companies. But the ones I've dealt with? Nah.

~~~
RHSman2
You sound like a decent person. Perhaps they left because of you?

~~~
PopsiclePete
In a company that's thousands of people large, across 5 continents, I have the
power to make dozens of people, most of whom I've never communicated with
outside of the occasional email, leave? I basically have superpowers.

Like I said, I'm sure "data science" as a science is great, and I'm sure
there's plenty of "data scientists" that I would love to sit down and talk
with and learn from. I just have never met any.

------
Spooky23
Depends on what a data scientist is within an organization.

A lot of times, it means "the guy who does crap in SPSS or needs Tableau".

------
genzoman
Science progresses at the speed of a funeral.

