

Ask HN: how to get into a data analysis / machine learning career? - ajdecon

Hi all,<p>I'm currently a PhD student in materials science working on a polymer microfabrication-oriented project. Initially I was very interested in this project but my enthusiasm level plummeted rapidly. I've stuck it out a few years partly through wanting to finish what I started and partly, I'm ashamed to say, through inertia.<p>However, over the past year I've also been involved in a side project which has involved some pretty interesting data analysis (some complex image analysis and analysis of fluorescence time series), and it's like night and day. Science is fun and my work is exciting again.<p>And it's not just this side project: I'm learning R in my spare time, and reading through course material from Andrew Ng's Stanford machine learning course. I also did some analytical work as an undergrad which I really enjoyed, but I felt like I was committed to the microfab path. I'm really regretting that decision now.<p>I'd love to be able to push my current research more in an analytical direction but that's...unlikely, to say the least. And while leaving the program would be a really bitter pill to swallow, so would spending three more years on a project I don't like, and feeling even more committed to the work after I got a damn PhD in it.<p>My question is, does anyone have any suggestions for how I might do a course correction towards analytical work after a BS in Physics and years in a microfab PhD program? Should I be trying to take classes in statistics or CS? Are there "entry-level" jobs in this kind of thing I could qualify for?<p>Note: I realize this is not necessarily an HN kind of thing, but I see machine learning and data-oriented discussion here fairly often and it seemed like a good place to ask. Pointers towards more information or other communities are welcome.
======
lukatmyshu
The company I work for is currently looking for people interested in machine
learning & data analysis and I've interviewed a fair number of people that
have a background/story similar to yours. The one piece of advice I have is to
spend time learning how to take raw logs and munge/join/split them into a
format that you can stick into R. Get good at a scripting language, understand
what map-reduce is, play with Pig or Hive if you can. Are you considering
leaving your PhD and going into industry? (It sounds like you are.)
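
To make the log-munging advice concrete, here is a minimal Python sketch of the kind of thing I mean. Everything in it is an assumption for illustration: the log format, the `USERS` lookup table, and the function names are all hypothetical. It parses raw lines, skips malformed ones, joins in a second table, does a small map-reduce-style count, and emits CSV that R's `read.csv` should be able to load.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical raw log format: "<timestamp> <user_id> <action> <duration_ms>"
RAW_LOG = """\
2010-03-12T10:00:01 u1 view 120
2010-03-12T10:00:05 u2 click 45
malformed line here
2010-03-12T10:01:10 u1 click 80
"""

# Hypothetical lookup table to join against (user_id -> plan).
USERS = {"u1": "free", "u2": "paid"}

LINE_RE = re.compile(r"^(\S+) (\S+) (\S+) (\d+)$")
FIELDS = ["timestamp", "user_id", "action", "duration_ms", "plan"]


def parse_log(text):
    """Split raw text into dict rows, silently dropping lines that don't parse."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # real logs are messy; decide explicitly what to do here
        timestamp, user_id, action, duration = match.groups()
        rows.append({
            "timestamp": timestamp,
            "user_id": user_id,
            "action": action,
            "duration_ms": int(duration),
        })
    return rows


def join_users(rows, users):
    """Left-join each log row with the user table on user_id."""
    return [dict(row, plan=users.get(row["user_id"], "unknown")) for row in rows]


def count_by_action(rows):
    """Map-reduce in miniature: emit (action, 1) per row, then sum per key."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["action"]] += 1
    return dict(counts)


def to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    joined = join_users(parse_log(RAW_LOG), USERS)
    print(count_by_action(joined))
    print(to_csv(joined), end="")
```

Pig and Hive do the same parse/join/group steps, just spread across many machines, so getting comfortable with them on a small file first is probably time well spent.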

~~~
ajdecon
Thanks for the advice. Any suggestions on the best scripting languages to play
with for this? I've got some experience with Python and I'm working on getting
better, but I don't know if there's a better choice in this domain.

Industry or a different academic program are both options in my head right
now. I don't love the idea of leaving my PhD but I also don't love the idea of
spending more years on something I find uninteresting.

------
sga
Unfortunately I don't have a solid recommendation for you. But I feel
similarly and would be interested to hear what you come up with. This past
January I completed a Ph.D. in physics in an interfacial science/polymer
physics/biophysics group. During my time I found that one of the aspects that
I really enjoyed was performing data analysis. A large part of my work
involved writing custom scripts in Matlab to model experiments or to analyze
experimental data. As I'm not particularly interested in continuing on in
academia, I'm really beginning to consider the idea that rebranding or
retraining myself in areas of data mining and analysis may be the way to go.
I'm not sure how best to accomplish this either. I let the thought of doing a
quick Masters enter my mind but that's not ideal for me. I think I can quickly
come up to speed on theoretical considerations I might be lacking so another
option may be to just convince a company to give me a shot.

------
silkodyssey
Bradford Cross, who does machine learning work at FlightCaster, addresses this
question in the following post:

[http://measuringmeasures.com/blog/2010/3/12/learning-
about-m...](http://measuringmeasures.com/blog/2010/3/12/learning-about-
machine-learning-2nd-ed.html)

~~~
ajdecon
Thanks, that's an excellent list of resources! (And also a good blog for me to
check out in general.)

But in addition to learning the topic I'm also looking for advice on handling
this kind of course correction.

