
Imposter Syndrome (in Data Science) - cdl
https://brohrer.github.io/imposter_syndrome.html
======
mykull
Data science is one of the most imposter-filled "professions". It's a
recently-established category of worker that falls across multiple disciplines
and is heavily affected by technological progress. I have met "data scientists"
who aren't really good at any aspect of it, but they still get by because of
the supply/demand and lack of any existing expertise to say "hey, you know,
this person we hired is barely competent and just googles everything we ask of
them."

~~~
gota
All this means is that we will inevitably reach a point where separate titles
are used, and 'data scientist' will mean about as much as 'engineer'.

We will distinguish 'machine learning specialist data scientist' from
'database specialist data scientist' just like we distinguish 'electrical
engineer' from 'lab systems engineer', for example.

Then, we might have a term for generalists like 'data science technician'. And
by then, the people who 'aren't really good at any aspect of it' and can't
really function as generalists will be naturally sorted out because they can't
really fit into any of those titles.

~~~
thousandautumns
The thing is that we already have/had that. Data science is really just a
blend of statistics, machine learning, data engineering, and software
development. I think the data science explosion is the result of more
people/companies wanting to hire a single person to do all of these jobs
rather than individuals for each.

If we ever go back to a world where we distinguish people by these specialties
we would basically just be going back a decade to where we had statisticians
focusing on statistics, data engineers maintaining databases, software
developers creating the product, etc. (which isn't necessarily a bad thing).

------
sanjha7
This describes my situation exactly. I have worked at big unicorns and have
deployed many ML-based models in production that moved the numbers
significantly, while many data scientists on our team just kept cribbing
about errors in the data or the scarcity of it.

I have no DS background and am a humble engineer, but I believe it's 10x
better to just work with whatever you have available and get shit done.

~~~
thousandautumns
Entirely depends. "Moving the numbers significantly" doesn't mean much if the
numbers aren't moving in the correct direction. Errors and scarcity in data
are real, significant problems. I'm a statistician working in data science,
and I can't count the number of times people complain about "why can't you
just work with what is available" while failing to understand that what is
available is total garbage.

You can dress up bad data in any number of ways to get results that sound and
look pretty. I see this all the time. Sometimes you get lucky and the model is
ok regardless. Lots of times the model performance isn't great, and it is
later assumed there are other outside issues to blame, or the project is
redone for the umpteenth time. Occasionally you will have colossal failures
that do real damage.

Keep in mind that when a poorly designed machine fails and kills dozens, or
the financial system of the world crumbles under the weight of terrible loans
and convoluted financial instruments, or millions of people's personal info
gets hacked due to terrible, antiquated security systems, and everyone starts
asking "How could people be so stupid to let something like this happen?", the
answer is almost always executives, management, or "humble engineers" sweeping
the problems they don't like under the rug because they believe "it's 10x
better to just work with whatever you have available and get shit done".

------
natalyarostova
As a data scientist I have wondered if this field is particularly suited to
imposter syndrome. My formal background is economics, and every once in a
while I become terrified at how little formal statistics I've studied, or at
the large gaps in my knowledge of data structures, etc. That said, I'm
similarly surprised at how far I've gone by just going home and studying the
basics when I run into something I don't know, and at the gaps in knowledge
some coworkers have in areas where I know more.

...although I have met a few genius data scientists who seemingly really can
do everything. Although I'm pretty sure they are paid upwards of 300k.

~~~
tinymollusk
I tend to agree. I think it's because the field uses statistics, which most
people have decided are incomprehensible and don't even try to understand.
Combine that with how bad our brains are at thinking statistically, and you
have a powerful desire for avoidance among non-statisticians.

Combine that with the value that good ones can provide, and you have a perfect
situation where a boss or peer just thinks "thar be dragons" in the work sphere
of the statistician.

~~~
thousandautumns
I think the issue lies in the fact that data science is a massive umbrella
term which really covers multiple large fields, namely statistics and computer
science, both of which by themselves are massive. It's very difficult if not
impossible for people to feel like they have enough expertise in all of the
subjects that fall under the umbrella of "data science". Which typically leads
to lots of self-teaching, learning on the fly, and hacking solutions together,
which is often the source of all impostor syndrome feelings.

------
celias
I enjoyed this Partially Derivative podcast episode, where two of the
podcasters discuss their experience with imposter syndrome:

[http://partiallyderivative.com/podcast/2017/03/06/badasses-f...](http://partiallyderivative.com/podcast/2017/03/06/badasses-feel-like-imposters)

