
Ask HN: Was data science just hype? - curiousk
I don't hear a lot about "data science" anymore. And judging from the shrinking number of job postings, I suspect it was a bit overhyped a few years ago. What do you think?
======
tictacttoe
I think what companies really want is smart generalists with advanced math,
programming, and modeling skills coupled with domain knowledge. That skill set
will always carry high value in technical companies.

The reason it carries value is the skills are difficult to acquire. I think
the recent decline in interest reflects the rise of new data science
candidates that are taking the path of least resistance to a career in data
science. Rather than pursuing problem solving, people are pursuing "data
science" which is a nebulous term in and of itself.

~~~
Ntrails
I am wary when people wax lyrical about all of the ways they love using
machine learning on data. It makes me nervous because I worry that they have a
hammer and can't wait to use it on anything vaguely nail shaped.

~~~
mr_toad
ML makes predictions; testable predictions.

Machine learning is an area where you need to be able to produce results. Fake
it ‘til you make it isn’t going to cut it for long. Either these people
produce something that works, or they don’t.

~~~
geezerjay
> Machine learning is an area where you need to be able to produce results.

Having to produce results is one thing. Mindlessly throwing tensorflow/pytorch
at problems is an entirely different problem.

It's like those front-end devs who mindlessly insist that they need to use
heavy javascript frameworks with convoluted build processes such as
React/Angular to churn out a static web page with a couple of paragraphs and
images.

------
lame88
There's a lot of hype in software. Just like you were hearing about Blockchain
startups only when Bitcoin started trading at 5 bazillion USD and now you
don't hear nearly as much about it. During 2008-2015ish we went through an
insane churn of new JavaScript frameworks and NoSQL database hype.

Yeah it seems to have calmed, but I don't think data science was _just_ hype
because it comes from (and somewhat is) probability and statistics, and the
rate of data/information that's being produced by and extracted from people
seems to be ever increasing. But it absolutely was prone to a hype cycle as
with almost anything else in tech. IMO this is a phenomenon exacerbated by
venture capital.

~~~
nemild
That hype in software is the reason I wrote a media literacy guide for
software engineers. Hype makes it easy to get free marketing and also drives
clicks for media and social networks.

[https://github.com/nemild/hack-the-
media/blob/master/softwar...](https://github.com/nemild/hack-the-
media/blob/master/software-engineers-media-guide.md)

~~~
lame88
Nice, at a glance this is pretty well-detailed. Bookmarked. Wait, I've been
hacked!

------
blihp
It wasn't 'just' hype, but it was over-hyped. There are companies that have
their act together from a data standpoint and can make use of data scientists,
whatever that term actually means in the context of their organization, but
most can't. So the companies who spun up a data science initiative but had no
business doing so are now likely saying things like 'what do you mean we don't
have the necessary data?' and 'what do you mean our data is a mess?' etc. and
will quietly back off over time. Likewise, the companies who can take
advantage of it will quietly do so. No different than the hype surrounding
every other buzzword in tech... there is no silver bullet.

~~~
kevinventullo
If the data is a mess, cleaning it up is very much legitimate data science
work in my opinion.

~~~
dual_basis
Sometimes it can't be cleaned up, or the process of cleaning it up takes too
long or is more expensive than the company wants to spend. Sometimes the clean
up process is error prone, or leaves you with too little useable data.
Sometimes the data really is too noisy and no amount of clean up is really
possible. It's definitely true that data clean up is a problem data science
can address, but it's not a magic wand.

------
OldHand2018
The hype was that you could take a huge pile of data and turn it into hugely
valuable insights.

The realization is that any random pile of data likely doesn't have anything
in it that is worth paying for: Here's our analysis! We already do/knew that.

Some people have really good, valuable data sets. Most people don't.

~~~
foolrush
It’s an epistemology / ontology question, as folks familiar with the
humanities would spot in little time. Aka it’s not “data” until something
gives the created metric a meaning.

The map is not the territory.

[https://www.amazon.com/Raw-Data-Oxymoron-
Infrastructures/dp/...](https://www.amazon.com/Raw-Data-Oxymoron-
Infrastructures/dp/0262518287)

~~~
bravura
I know what all those words mean, I've studied critical theory, and I have no
idea what your point is.

~~~
itronitron
whenever I hear the word ontology I reach for my dictionary

~~~
diehunde
It's starting to become a buzzword in data science too

------
bradleyjg
My guess is that data science roles will merge with business analyst roles.
Python and R will slowly join Excel as tools of choice for making tables and
charts to stick in PowerPoint slides and PDF reports. Meanwhile the machine
learning side of things will be the domain of _something_ engineers with
candidates more likely to come from the computer science/math/engineering world
rather than the sciences. (Other than those with degrees in physics, who seem
able to perpetually land wherever they feel like.)

~~~
darkhorn
> My guess is that data science roles will merge with business analyst roles.

Data Scientist is a buzzword for Statistician. Business Analyst is a buzzword
for Industrial Engineer. For example, 10 years ago at my university you would
have seen that some Statistics students were doing a second major, mostly in
Industrial Engineering, and vice versa. The two have been related for many
years, but the average Joe has no idea.

~~~
Gene_Parmesan
I still don't really know what a business analyst role truly entails. In my
office they seem more like mini project managers w/o the management. They talk
to internal stakeholders a bunch, handle a lot of our UAT, and I guess do some
reports stuff? One has moderate tech skills (but no programming) and the other
is just really good at Excel.

I'm still not sure what their actual formal responsibilities are.

~~~
kthejoker2
I mean, a business analyst _should_ do just that: analyze the business. It's
usually their job to translate strategic areas of improvement in the business
unit into specific outcomes.

They ask questions to identify areas of improvement; translate those into
functional (and sometimes technical) requirements for other areas (not just
IT) to fulfill; and then coordinate the efforts to implement those
requirements, potentially as PMs, product owners, Scrum masters, UAT leads, or
just a SME.

The best BAs (paraphrasing the data science JD) know more tech than the
business and more business than the tech.

------
joelschw
We're past the early peak of the hype cycle, but now that the marketers have
calmed down, the real-world applications are maturing.

If you think of Data Science as AI, sure, but if you frame it as applied
statistics + good software engineering practices + cloud scale, I think things
are in a good place.

~~~
sha_r_roh
This. Any good data science person needs to understand how distributed systems
work. They need to be decent at applied statistics. I think an average
engineering grad can easily work with the simple statistics they learn in
linear algebra and intro to ML. The good software engineering part is what has
been brought on in the last few years.

I once saw some code written by a "data scientist". The overall code was
non-complex, but the Java/Scala code was the worst of my nightmares. Additionally,
I think other engineers have also matured enough to understand that underneath
the veneer of data science, the fundamentals do not change much.

------
ddragon
"AI" (or whatever rebranding it gets) always works in cycles, with a phase of
excitement and overpromising followed by a phase of apparent underdelivering
and skepticism. But what actually happens is that the innovations just become
part of the normal tooling, and stop being called "AI".

At some point there is no need to hire a "data scientist", as any Python
programmer is already expected to know how to use numpy, pandas, sklearn and
keras, just like before it was already expected for them to do any kind of
data manipulation with SQL without requiring a dedicated database expert.

~~~
PLenz
The value in a "Data Scientist" isn't that they know how to use a tool - it's
that they know why to use _that_ tool (technique) and not this other one.

~~~
ddragon
I'm definitely not denying the worth of the particular skillset, just like that
of good DBAs. But as with any skillset, there are diminishing returns. You can
fairly easily get someone to the point where they can merge data from multiple
sources, create automatic reports with graphs, and do simple similarity
clustering, regressions and expert systems, even if in suboptimal ways, and most
companies don't really need more than that. They can even learn to integrate
cloud/black box solutions for image/speech recognition without having any idea
of how to write one from scratch.

Of course, if a company wants to truly innovate in the area it will need PhDs
or people with great dedicated knowledge in ML/Statistics/Particular Domain;
if it needs to scale it will need good data engineers to create the data
pipeline together with DBAs and experts in each tool (like Spark/Flink). But
for most companies the basics above are already a great improvement on what
they had before.

------
xs83
It wasn't hype; however, people got very confused about what they actually
needed vs what they thought they wanted.

When someone says they want a "Data Scientist" what they really mean is "I
want a Data Scientist who is also a Data Engineer".

I have seen so many companies spend a really decent chunk of money on a data
scientist and then are shocked to find that this data scientist doesn't know
how to deploy models, set up Spark clusters or know how many and what type of
GPU they need to use to get the job done.

After all - that is not their purpose.

We were in a similar situation, but what we needed was a Data Engineer - we
had a rough idea of where we wanted to go and what we wanted to achieve, and
the engineer we hired was doing a Masters in Data Science, so he had that
background as context.

We will look at adding a Data Scientist to our ranks in the future - but they
will be working side by side with a Data Engineer who can action their
requirements!

------
ficklepickle
The "data scientist" where I recently contracted struggles with generating
basic reports. I saw a full page SQL query with the caption "yeah, data
science is hard" posted to Slack. Terrifying.

I think the term "data science" is often misused. It seems to make management
feel like they are on the cutting edge. They were talking about AI and an R&D
department the other day. They aren't even making use of simple heuristics
yet! I guess that talk helps with fundraising though.

------
flensortow
Over the past several months I keep seeing people trying to equate data
science with machine learning, and it made me wonder if the people doing this
are trying to salvage (or perhaps enhance) the investment they made in data
science by trying to blur the lines between the two.

~~~
Bostonian
Isn't the line between the two indeed blurry? Maybe deep learning is machine
learning, but modern statistical methods such as elastic net, SVM, and random
forests are things data scientists should know about.

------
cedricd
At my previous company data science has become synonymous with data analysis,
to the point that the number of data scientists on staff is starting to
outnumber data analysts. I think it's more a sign of a maturing field than
anything else. The more narrow view of data science as big data, models, and
machine learning is probably less a thing now, but data analysis overall is
only getting bigger.

------
usgroup
No it isn’t a fad. Data collection by every man and his dog is really
happening. The need for people that can use the data in order to improve
business outcomes is the consequence of it.

Data collection will become more prominent IMO because:

1\. Data driven business preference, competitive advantage and FOMO. Already
dominates sales and marketing. Starting to dominate in product and dev.
Already dominates production.

2\. IoT, and more data marketplaces resulting from it.

3\. Extensions of the global SaaS value chains (usually connected by data).

Hence data science will thrive in the future.

------
redstone08
I think people are realizing that data scientists without domain knowledge
cannot create valuable insights.

Enterprises actually seem to hire fewer data scientists, but they are trying
to raise their employees' data skills.

I think that's the cause of the growth of self-service analytics tools. Below
are examples of them.

1\. Metatron Discovery : [https://metatron.app](https://metatron.app)

2\. Metabase : [https://metabase.com/](https://metabase.com/)

~~~
sweeeety
I would like to hear from my European fellows about how GDPR affected
their ability to muster domain knowledge.

I used to work for a small start-up and the CTO was very strict on data
access, making my life as a feature developer and "data scientist wannabe"
almost impossible.

He, on the other hand, not only had access to all data but also used the
product as a consumer (which didn't make sense for ICs so we ended up just
playing with sales demo accounts). I ended up leaving the company because of
that.

~~~
retiredcoder
I’ve been in a similar situation and it was really hard to be effective in
product / high level planning meetings because it is easy to be blindsided. At
the time, I was still a young engineer in a big co, so I just thought it was
my lack of technical experience, but in reality there was nothing technical to
it, just BS politics.

What it boils down to is that people who have any extra data access privilege
will have the lead.

Most of the insights will come from aggregate data, so I think companies could
work around privacy concerns but I am no GDPR expert.

Back in my days in academia, there was a saying “if you have the trace, you
have the paper”.

------
darkhorn
Was? As I have always said, it is Statistics. If you want a "data scientist",
employ statisticians.

~~~
natalyarostova
In my professional experience this is a little misguided. A pure PhD
statistician isn't going to be able to hack it working on fast-paced
production software environments and building end-to-end pipeline/software/ML
systems. I mean, no doubt a PhD statistician could learn and be good at it,
but the average statistician isn't geared up for this type of work.

On the other hand, your standard tech data scientist may find themselves out
of their element if needing to design a very rigorous randomized trial for
testing a new drug, and making careful inference (I mean I'm sure plenty
could, but I'm not going to trust a 25 year old with two years work experience
to do that).

------
sonnyblarney
With the cloud came a lot of data, so 'big data' was a wave in which Engineers
had to deal with it, and 'data science' was the wave in which we tried to
leverage it.

The reality is that most insights from 'big data' are optimizations. They're
not going to move the needle on the business as perhaps we would have hoped.

Data Science focused on ad targeting - now that might move the needle.

And of course, maybe some Data Science working alongside AI engineers will
make a breakthrough which could move the needle.

But from a high-level, CEO's view, all of these things have trendy
undercurrents, the trick is to figure out how much of it really matters to the
business.

The 'wins' for consumers will be slow: maybe better product search, better
ads. Maybe they figure out how to send flights around ugly weather or how to
slot landing times for an x% decrease in flight delays. Or slot road
fixing/lights for an x% decrease in traffic delays.

------
daturkel
I work on a data science team that's growing pretty aggressively. I also
frequently hear from recruiters hiring at companies big and small for analyst
and scientist positions.

I'm not sure what it would mean for data science to be "just hype." I see DS
work on this website alone all the time.

------
starpilot
It is "different now". The DS bubble popping was covered well by Vicki Boykis:

> Since academia is typically a lagging indicator in adoption to new trends in
> the work place, it’s been long enough that it’s truly worrying for junior
> data scientists, all of who are hoping to find data science positions. It
> can be very hard for someone with a new degree in data science to find a
> data science position, given how many new people they’re competing with in
> the market.

[http://veekaybee.github.io/2019/02/13/data-science-is-
differ...](http://veekaybee.github.io/2019/02/13/data-science-is-different/)

------
dagw
It wasn't 'just' hype, but the real problem was that very few companies had
any data that you could actually extract much value from. Garbage in, garbage
out is as true as ever.

That being said I've observed that the data science techniques and tools that
have developed over the past few years have been absorbed and adopted by a lot
of people that aren't "data scientists". So while companies are hiring a lot
fewer "data scientists", a lot of "data science" is now done by domain experts
and analysts as part of their work.

------
thedevindevops
The best quote I've heard for ML is "ML promises to deliver what Computer
Science promised to deliver in the decades before as AGI will promise to
deliver in the future"

------
RasputinsBro
No, science was not hype. It's not like the fashion is over and now we're
gonna go back to making worse decisions based on less data.

------
diehunde
I think they are evolving to more specific roles. A couple of years ago some
"Data Scientists" were actually doing Data Engineering. Now that distinction
is more clear. You also have Machine Learning Engineers, who can help with
deploying models or you can even see things like "Deep neural network
engineer" or NLP data scientists.

~~~
thibautg
What is the definition of “Engineer” in “Data Engineer” and “Machine Learning
Engineer”? And what is the difference from “Scientist”?

I’m not from the US and in my country, “(civil) Engineer” is a legally
protected title
[https://en.m.wikipedia.org/wiki/Civil_engineer#Belgium](https://en.m.wikipedia.org/wiki/Civil_engineer#Belgium)

------
rcarmo
It’s been subsumed inside traditional BI teams, who don’t interact with
“generalist” developer teams and are an entirely different audience.

So the field keeps growing, but in more specialized forums. Increases in
Python adoption, for instance, mostly come from re-skilling initiatives in BI
teams inside my customers.

------
nullwasamistake
A lot of it was. Like VR, blockchain. Companies didn't know whether they
needed this new tool. A lot of them didn't. Like VR, the long tail on this
might be huge. It will just be new technologies that weren't possible before,
not huge improvements on existing ones like some companies thought.

------
PLenz
Two thoughts:

1) Much of this is the bias of this community towards newer and more technical
companies. At many of our work places Data Science is just part of the
landscape and is now baked in. But there is a wide gap between that and legacy
companies where the idea of using these techniques this way isn't new at all.
In other words we've been through the initial wave of the hype curve and are
now into mainstream growth of the idea.

2) Branding matters. In many companies that I get to interact with (Fortune
100s) run-of-the-mill analysts exist for ass-covering the non-data-based
decisions of higher ups. Post-hoc decision justifying. But DS teams
(currently) actually get a seat at the table and get listened to. If for
nothing else that makes DS way more effective than just BI or whatever. And as
long as we get things right that should turn into a virtuous cycle of being
heard, being seen to contribute, and being asked to contribute more.

------
return1
I wonder what the role of hype is in scientific progress, or progress in
general. Is there evidence that it pushes it forward?

------
ianthor
I work for one of the major bootcamps and I can tell you that interest has
significantly declined over the past year. The market for data science has
indeed shrunk. The reason? I'm not sure, could be that it didn't live up to
the hype.

~~~
edoo
From my perspective it seems like even 5 years ago data scientists were more
about creating and validating ways to look at the data whereas today it is
more about processing the data with the standard tools and methodologies. Data
processing is as big as ever but takes a fraction of the effort to apply it.

