

From PhD to Data Scientist: Tips for Making the Transition - jakek
http://insightdatascience.com/blog/from-phd-to-data-scientist.html

======
Blahah
Sweet, according to his list I'm over-qualified. Interesting to think it would
be so easy to make the transition to data science. Except I can't imagine
wanting to work on less important problems than the ones I work on now.

Global food security vs. social network analytics. Yeah, fuck the money.

edit: calling all data scientists - why not consider becoming a computational
biologist? We have hard problems, real outcomes that affect people's lives,
and not much money.

~~~
daemonk
I am a graduating PhD bioinformatician, most likely going to transition into
industry. It's very easy to be caught up with the self importance of academia
because you are essentially in a bubble. It's great to be passionate about
science, but I really dislike religifying academia. It's almost expected of
aspiring academics to live like monks and just to be okay with shitty pay and
long hours. That's bullshit and academics take it while constantly assuring
themselves that "it's important and they love it". I am sorry that I am coming
off as extremely cynical, but I really don't think propagating the idea that
pursuing pure science is somehow more virtuous than other professions helps
with the situation.

And in my opinion, as inexperienced as it might be compared to more
established scientists, computational biologists are ready for biology, but
biologists are not ready for computational biology.

~~~
_delirium
> shitty pay

I can see this complaint in _humanities_ academia, but pay in the sciences
past the PhD student level is pretty reasonable. You could probably make more
elsewhere, but it's not like you're scraping by on ramen noodles as a
bioinformatics professor or anything. Postdocs typically make $50-60k, and
professors start at something like $90k at the minimum, easily up to $120k,
$150k, or more after tenure, especially if you're in a hot area like
bioinformatics, have made a name for yourself, and can get a position at a
top-30ish place. Unlike in tech, those salaries often come in places with a
lower cost of living than SF, too (at least if you want them to). Six figures
goes pretty far in Atlanta, Austin, Urbana-Champaign, Ames, or Raleigh, for
example.

You could beat that in industry, but either way you're making solidly in the
top 10% of U.S. salaries. And if you really need more money, most universities
will let you do 20% consulting time, or do a spinoff startup. There are
admittedly other reasons not to go into science academia (the list is pretty
long, actually), but fear that you'll have to take a vow of poverty doesn't
seem like a strong one.

~~~
jurassic
This response tells me you've never been in the science trenches. I've never
heard of a postdoc making anywhere close to 60k, even at elite schools in high
COL areas. The financial opportunity cost is staggering.

~~~
_delirium
Well, you'd be incorrect, since I'm currently a science academic. Have you
checked what Stanford, or Georgia Tech, or UT-Austin pay postdocs with
machine-learning or data-mining experience, in the past 5 years? There are
definitely areas that pay less, but bioinformatics, if by that you mean people
with serious computational skills, pays above the norm.

~~~
rdouble
Getting a postdoc at Stanford has the same probability as playing in the NBA.
Only you get paid $60K instead of $60M.

~~~
_delirium
If you have strong machine-learning experience and a few good publications in
the current market, getting a postdoc at a top institution is nowhere near NBA
odds. I don't know what the odds are specifically for Stanford, but if you
apply to whoever has openings among top schools, there are many each year. If
you know _something_ about biology and _a lot_ about machine learning, labs
might even recruit you rather than vice versa.

~~~
aspivelox
This has certainly been my experience as a recent PhD in computational
biology. I pretty much have my pick of Post Doc positions - I was getting
offers before I even graduated. I was also able to negotiate 50k without much
trouble, but I'm definitely worried about the opportunity cost. Giving up 100+
k for more than a year or two seems like a poor decision.

Also a Post Doc from a top lab directly correlates with how much $$ you can
make in industry.

------
fitandfunction
Technical skills aside, the best piece of advice in the article is "show them
that you want it."

I've conducted countless interviews / hires where it basically went:
candidates P & Q are the best on paper and in person, but candidate P said x,
y, z or did a, b, c, and seems to really want this job and work in our company.

x, y, z was sometimes as simple as enthusiasm, and other times was in
describing what he/she did in their spare time. a, b, c was usually a project
for work, school or fun that was highly relevant.

Intellectually, I think I know that "enthusiasm" is a poor / weak predictor of
success. But, emotionally, it's a go-to tie-breaker.

~~~
rogerchucker
Should I start putting every substantial R/Python script I write, even ones
based on some tutorials, on GitHub or a personal website? Is that how I
"show"? I missed the GitHub bus for all my previous projects.

~~~
zhemao
What do you mean you "missed" the github bus? If you still have the code saved
somewhere, you can just create a new repo and put it up there.

------
tomrod
I'm currently finishing a PhD in economics and have spent a lot of time
learning the exact technologies he suggests (Python, SQL, a bit of R). Working
as a data scientist would be an awesome opportunity. But are most companies
_really_ in need of so many data scientists, or is it just a trend?

~~~
jkldotio
I think it's a new name for an old thing. Lots of jobs through the last
century had things like "analyst" attached to them. Business has been about
measuring things for a long time; look at Taylorism, or Gosset at Guinness in
1899, for example.[1][2]

A few little 1% gains from some A/B tests, or looking at geographic breakdowns
of customers from IPs or addresses add up.

[1] [https://en.wikipedia.org/wiki/Scientific_management](https://en.wikipedia.org/wiki/Scientific_management)

[2] [http://www.umass.edu/wsp/statistics/tales/gosset.html](http://www.umass.edu/wsp/statistics/tales/gosset.html)
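
To put a rough number on how small gains "add up": a minimal sketch, assuming each winning A/B test gives an independent lift that multiplies with the others (real tests often overlap or fade, so treat this as an upper-bound illustration). The function name `compounded_lift` is made up for this example.

```python
def compounded_lift(gain_per_test, n_tests):
    """Total relative lift after n_tests wins, assuming gains multiply."""
    return (1 + gain_per_test) ** n_tests - 1


# Print the cumulative lift for a few counts of 1% wins.
for n in (1, 5, 10, 20):
    print(f"{n:>2} wins of 1% -> {compounded_lift(0.01, n):.1%} total lift")
```

Under that assumption, ten 1% wins compound to roughly a 10.5% overall improvement, slightly more than the 10% you would get by simply adding them.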

------
gburt2
"Recursive programming"... as in, programming using recursion? Why would this
be important to "data science"? Surely loops are just as effective.

~~~
dudurocha
Are you serious? How would you iterate over a set of rules and a big data
volume, without using recursion?

~~~
aet
A loop? Can you please explain?

~~~
petegrif
'Explain'?

~~~
wavefunction
I assume the "explain" refers to the most important thing you can learn about
recursive programming: that it is often but not always the least efficient
strategy when compared with looping.

So the claim that recursive programming is the only or primary method of
iterating over large data sets requires some explanation...
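
For what it's worth, here is a small sketch of the loop-vs-recursion point in Python (one of the languages the article suggests). It assumes CPython's default recursion limit, which is far below the 100,000 elements used here; the function names are made up for illustration. Languages with tail-call optimization behave differently, so this only shows the Python case.

```python
def total_loop(values):
    """Sum values with a plain loop; stack depth stays constant."""
    total = 0
    for v in values:
        total += v
    return total


def total_recursive(values, i=0):
    """Sum values recursively; uses one stack frame per element."""
    if i == len(values):
        return 0
    return values[i] + total_recursive(values, i + 1)


data = list(range(100_000))
loop_result = total_loop(data)  # works at any size that fits in memory

try:
    total_recursive(data)
    recursion_ok = True
except RecursionError:
    # One frame per element exceeds the interpreter's recursion limit.
    recursion_ok = False

print(loop_result, recursion_ok)
```

Both functions agree on small inputs; only the recursive one falls over once the data gets large.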

------
lior60
Any possibility for a dev-minded MBA (finance) to make the data science
transition? I was pretty good with R back in the day.

~~~
eshvk
There is no magical set of qualifications to become a data scientist. Just
learn enough linear algebra and probability. Show people you can code. Maybe
set up some GitHub projects. It is not like people in tech are doing something
magical with all these fancy data scientists. A little bit of math, a slap and
dash of code.

~~~
lior60
Are data scientists who are in demand today dealing with neural networks and
machine learning, or are a large majority still working with large sets of
data and running correlation analyses/regressions? Your response above seems
to indicate that it's not overly complicated.

~~~
eshvk
It depends; I personally haven't used neural networks since graduating.
Standard machine learning algorithms get used. I work with large data sets all
the time. Sometimes correlations, regression things. The point is that 95% of
the stuff that gets used on a day to day basis is not hard to learn.
Especially if your background in linear algebra, probability and statistics is
good. The five percent that delves into more complicated things can be figured
out on the job.
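
As an illustration of the "not hard to learn" point, a minimal sketch of the correlation and simple regression mentioned above, written from the textbook formulas in plain Python. The data is made up and the helper names (`pearson_r`, `fit_line`) are hypothetical; in practice you would likely reach for a statistics library instead.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up example data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

slope, intercept = fit_line(xs, ys)
print(f"r = {pearson_r(xs, ys):.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```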

~~~
lior60
Thank you for your candid and thoughtful answer :)

------
rogerchucker
Should have studied harder in undergrad. All the current "Fellows" are from
top tier schools :(

------
flanger
For those looking to make the transition to data science, another option is
Zipfian Academy
([http://www.zipfianacademy.com/](http://www.zipfianacademy.com/)). No PhD
required.

~~~
shoyer
No PhD required, but you're expected to pay 14k in tuition. In contrast,
Insight pays you.

If you want to pay money for experience, why not get an actual degree from an
accredited institution?

A more realistic alternative to Insight is to do a (paid) internship at a tech
company. This is the path I took.

~~~
clearspandex
Often advanced degrees at universities are much more expensive
([http://datascience.berkeley.edu/admissions/tuition-and-finan...](http://datascience.berkeley.edu/admissions/tuition-and-financial-aid/))
than an intensive program such as Zipfian and take much longer. I hope
that private education institutions (such as GA, Hackbright, Dev Bootcamp,
etc.) can coexist happily with traditional universities, as they each fill a
different niche. Universities are in the business of training researchers and
professors (and do a great job at that) while alternative educational
companies aim to produce industry practitioners (similar to trade schools).

I highly recommend internships and they are wonderful if you can get one.
Unfortunately not everyone can be so lucky, either due to lack of
experience/technical abilities or an advanced degree (not everyone goes to
college). I believe these alternative educational routes are democratizing
such industries and many of them offer scholarships and tuition assistance
programs.

