
Don't Assume Your Data Scientist Is a Software Engineer. A Thread: - dynamicwebpaige
https://twitter.com/DynamicWebPaige/status/936680687486238723
======
Xcelerate
I started working as a data scientist 9 months ago, coming primarily from a
research background. I had never heard of tools like Docker or Airflow in grad
school. After reading about them though, their value to a small but growing
team of data scientists was quite apparent, so our team took some time to
learn how to use them. We now have a reproducible, versioned workflow that
removes a lot of headaches that previously existed.

I don’t think it’s too much to expect data scientists to quickly learn some
DevOps skills, as long as you can motivate the value of using them.

------
bllguo
If you really need your data scientist to know these things, then just invest
some time in training. So many skills can be quickly picked up on the job, at
least to a "good enough" level, yet people insist on looking for unicorns that
appear to tick all the unrealistic checkboxes.

~~~
smallnamespace
This is one reason I ask brain teasers in interviews.

I don't just care about what you know now, I need to know your willingness and
ability to think on your feet when confronted with a seemingly random puzzle
and actually persevere towards an answer if necessary.

Not everything will fit neatly into the box of tools that were previously
learned.

~~~
yodon
Be careful about focusing on brain teasers during interview sessions. In the
early days at Microsoft they focused heavily on brain teaser interviews, and
hired what was quickly seen to be an incredibly smart bunch of devs BUT much
later seen to be a highly concentrated monoculture that over-indexed on one
set of skills at the expense of bringing in team members with other strengths.
It took a lot of work to break away from that monoculture-driving interview
style and get the full set of skills and capabilities the company actually
needed.

Not all smart people play chess, or know every line of Dr Who dialog, or are
Makers, or like brain teasers.

~~~
smallnamespace
Fully agree with that; brain teasers should definitely not be the dominant
component of any interview process.

That said, I've run into interviewees who have simply refused to fully engage
with a brain teaser -- like they just shut down and gave up, even after I
provided hints. For me, that seems to be a signal that they're not likely to
deal well with unexpected puzzles that appear in the normal course of
engineering work.

~~~
yodon
It may also be that they dislike the artificiality of the question style,
particularly if the questions are of the “find the twist hidden deep in the
description because this isn’t a normal real world problem” sort.

------
justherefortart
Lmfao, why would a data scientist need to know TCP/IP, Server Setup, SOAP/REST
Web Services, SDLC, etc?

Sounds like someone looked up a list of IT stuff you should know and applied
it to data scientists randomly. In fact, most of those things in the list may
or may not apply to a "software engineer".

~~~
clintonb
I agree that this knowledge is not necessary, but it could be useful for
certain scenarios.

TCP/IP: networking between cluster nodes
Server setup: deploy a map-reduce cluster
SOAP/REST: read/write data from services
Software development life cycle: plan/deploy a reporting system for end users

------
wohlergehen
I agree that there is a big issue in the field w.r.t. "unknown unknowns",
where more effort needs to be put into making useful knowledge available.
However, I do not think that many of these technologies are hard for someone
who understands data science, at least at the level necessary to use them.
Doing productive development in these more systems- or CS-focused topics is a
wholly different topic though...

~~~
ztjio
There is no such implication. In fact she specifically implies otherwise. It's
just a matter of not making assumptions of specific knowledge.

------
thisisit
The classic problem of software engineering. Talking about how your specialist
doesn't know other stuff. Then during interviews lamenting the fact that
while you are getting well-rounded generalists they are not up to par.

He/she knows SOAP/REST but is unaware of that NN model.

A human can only retain so much. Invest in a team which has its own
specializations.

------
cdancette
Data scientists have a variety of backgrounds: CS, applied mathematics, pure
mathematics...

I don't think a data scientist needs to know all that stuff to be good at his
job.

~~~
rcoveson
I imagine what prompted this thread was the growing tendency of software
companies to hire for "Data Scientist" positions and imagine that what they'll
be getting is analogous to a Database or Distributed Computing specialist--
someone who has a strong software engineering background plus deep knowledge
of their specialty.

------
jinonoel
After looking at the stuff listed, no worries. Most software engineers don’t
know all of these either.

------
kapauldo
Don't assume your data scientist is a scientist. It's a made-up,
non-credentialed title.

