What are you basing this on? Senior software engineer jobs are a lot easier to come by than data scientist jobs and from what I've seen, pay better than the average data scientist job as well.
If it had been as difficult then as it is now I would probably have chosen software engineering.
Honestly, I feel software engineering might be better anyway as it's much easier to demonstrate value building features and shipping products rather than endless analysis and questionable models.
Current Software Engineer, for full disclosure, pondering how to move away from the prevalent web dev job market where I currently reside. Data science seems to pay on par.
There's nothing wrong with data science itself, just like there's nothing wrong with mortgages. But the current trend of software engineers and non-software engineers moving into data science is not sustainable. Things will break before they're fixed again. I've always considered myself a software engineer first and foremost, just with some extra machine learning/stats knowledge, and I'm glad to be out of that position now as it looks like we're in for a reckoning soon.
In my experience from my old employer...clients like Google get billed $120/hr for SQL analysts' services; ten years ago, staff earned $20/hr or a little more, and today, they've replaced almost everyone with offshore employees making $3-4/hr.
That’s interesting. I’m at one of F/G and a lot of the data scientists want to go the other direction to software engineering because we receive about 60% of the RSUs that they do. A few people on my team actually did switch; they said they found the data science work more interesting but an additional $40-100k per year can make a really big difference over the long run.
Facebook and Google "data scientists" (meaning those who hold the title) are really more like analysts -- they analyze data to inform decisions and use a lot of SQL. They make prototype models (usually based on less cutting-edge techniques) that get passed to engineering teams if they become worthwhile to scale/formalize. These folks get paid less than SDEs usually.
The other type of "data scientist" is basically an SDE (maybe SDE-lite) with research-level ML skills. These get paid similarly to SDEs (or more in some cases). I believe Facebook and Google call these SDEs. Sometimes the term "applied scientist" is used to describe these at other companies as well.
At my company, Type A are called "Data Analysts", but at Google and Facebook they're called "Data Scientists". Type B are "Data Scientists" at my company, but "Machine Learning Engineers" (or SWE-ML or some other combination) at Google and Facebook.
As a Type B, at my company, I'm on the same pay scale as the SWEs. The Type As are not.
What sorts of SDE positions do these data scientists go into? Are there any additional skills they pick up as part of the transition, or are strong Python/SQL skills enough?
If a job works people too hard for not enough pay, people here get another job. It seems harder to hire people with some experience at my company anyway. New college grads make 120k+ at top companies (we are a startup but not a unicorn, we pay a little more than that).
Data Scientists primarily function to tell a story (based upon data) that technicals and non-technicals alike will use in business decisions. It’s critical that a Data Scientist be perceived as trustworthy, since the decision-makers are unlikely to reproduce or even understand the Data Scientist’s full argument.
What signals trustworthiness? A graduate degree from Harvard, Yale, Princeton, Stanford (HYPS) or a similar university definitely speaks well for a candidate. Online programs like Coursera, Udacity, etc. won’t carry nearly the same weight until their alumni networks grow, and that will require growing into non-technical fields.
What signals untrustworthiness? Sadly, the “hacker” skills that are so very key in DS (e.g. for data cleaning) are completely at odds with traditional (and especially non-technical) assessments of trustworthiness. Many companies in the Bay Area will look past this issue, but it’s arguably a competitive advantage to simply be able to assess “hacker” skills effectively. That also entails making space for “hackers” at your company. Can’t take hackers? You probably will never hire a good Data Scientist.
I was called a few years ago by someone looking to hire a support person for scientists using supercomputers. I just couldn't believe my experience (SQL, ETL, data munging, etc.) was applicable, and I talked him out of it. But some people might have taken the approach of getting the job and then learning what it's about.
PhDs can't code, hackers don't know stats. A good team has both. If you're a one-man show, you need to do both yourself.
They also have quant guys/data scientists who use the data to help drive investment decisions.
In this TDS post, the author says "Statisticians and Actuaries are at the bottom of the heap as a prior role for existing data scientists." Maybe this isn't a coincidence? Plenty of companies had statisticians on staff, but the explosion of data science happened anyway. Why? Because data scientists do the same types of tasks as statisticians, but while statisticians are of the data modeling culture, data scientists are expected to be of the algorithmic modeling culture. It seems that the market is saying that the algorithmic modeling culture is getting results.
The author references "Type A vs Type B Data scientists", which seems to be getting at the same thing: "The Type A Data Scientist is very similar to a statistician... Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. The Type B Data Scientist is mainly interested in using data 'in production.' They build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results)." For whatever reason, there is a correlation between Algorithmic modeling / Type B and "getting things done".
1. I think it’s absolutely fair to criticize this aspect of the analysis: the relative frequencies of the backgrounds of data scientists have been presented as suggesting the success rate from each field. Many of the comments in the post itself made a similar critique. As I’ve acknowledged in my responses to these comments, what we need are the relative frequencies of applicants from the different backgrounds, not just hires. However, one can justify the inference about the success rate of, say, Statisticians and Actuaries if one has the prior belief that the relative frequency of statistician applicants to DS positions should be higher than the observed relative frequency of statistician hires (<1%!) to DS positions. I don’t think this is unreasonable.
2. I make a similar argument with regard to MOOCs/bootcamps: my prior belief is that the relative frequency of bootcamp-only applicants should be higher than the observed relative frequency of bootcamp-only hires. Hence my statement about necessity vs. sufficiency.
3. It’s somewhat more complicated for applicants with both degrees and MOOCs/bootcamps. I haven’t done this, but what I can do is look at the education distribution for hires with and without MOOCs. If the education distributions were similar, it would suggest that MOOCs have negligible impact. If, however, there is a higher relative frequency of, say, Bachelor’s degrees in the MOOC category, that would suggest that MOOCs/bootcamps have some value-added impact.
4. An ideal prospective study for the above would be to extract a sample of individuals from a precursor role, say, data analysts (hence naturally controlling for education). Note which of them have MOOCs or bootcamps, then follow them up in time to see how many end up as data scientists in each category.
5. I might actually change that profile picture. It’s 3 years old, in more innocent times.
6. As it happens I have landed a data scientist position in Singapore and will be starting in September.
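Points 1 and 3 above are easy to put into code. Here is a minimal sketch in Python, assuming made-up numbers and records purely for illustration. Nothing here comes from the LinkedIn data in the post, and `relative_success`, `degree_shares`, and the sample `hires` list are hypothetical names.

```python
from collections import Counter


def relative_success(hire_share, applicant_share):
    """Share of a group among hires divided by its share among applicants.

    A value below 1 means the group is under-represented among hires
    relative to how often it applies.
    """
    if applicant_share <= 0:
        raise ValueError("applicant_share must be positive")
    return hire_share / applicant_share


def degree_shares(hires):
    """Relative frequency of each degree within a list of hire records."""
    counts = Counter(h["degree"] for h in hires)
    total = sum(counts.values())
    return {degree: n / total for degree, n in counts.items()}


# Point 1: the 1% hire share echoes the "<1%" figure quoted above;
# the 5% applicant share is an ASSUMED prior belief, not a measurement.
ratio = relative_success(hire_share=0.01, applicant_share=0.05)

# Point 3: hypothetical hire records, invented for this example only.
hires = [
    {"degree": "Bachelor's", "mooc": True},
    {"degree": "Bachelor's", "mooc": True},
    {"degree": "Master's", "mooc": True},
    {"degree": "PhD", "mooc": True},
    {"degree": "Bachelor's", "mooc": False},
    {"degree": "Master's", "mooc": False},
    {"degree": "Master's", "mooc": False},
    {"degree": "PhD", "mooc": False},
]

# Compare the education distribution of hires with and without MOOCs.
with_mooc = degree_shares([h for h in hires if h["mooc"]])
without_mooc = degree_shares([h for h in hires if not h["mooc"]])

print("relative success:", ratio)
print("with MOOC:", with_mooc)
print("without MOOC:", without_mooc)
```

A ratio well below 1 would support the inference in point 1, but only as far as the assumed applicant share is believable. Likewise, a higher Bachelor's share in the MOOC group is only suggestive, since hires alone say nothing about who applied.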
...mentioned 2 hours as the limit on an edit. I don't know what it currently is. You're past the 2-hour mark, though. That might explain it.
This has a lot more to do with the relatively small number of statisticians and actuaries out there than it does the odds of people from various backgrounds transitioning into data science roles.
What an excellent analysis that applies far beyond Data Science.
Perhaps describing themself as "Statistician, Data Scientist, Software Developer" might have a better hit rate against the skimmers who pre-screen the resumes. An honest-to-ghu statistician who became a programmer is much more exciting than someone who looks like a programmer attempting to leverage themselves into a new hot sector.
In any case, how much merit someone has often isn't obvious. Pretty much anyone involved in hiring will tell you that internal referrals are one of the best ways (if not the best way) to at least figure out who to put into the hiring pipeline.
If you don't do any marketing of yourself, nobody is going to know about your merit. "Build it and they will come" is a Hollywood fantasy, not reality.
Whereas if I network with people the jobs they recommend are far more likely to be a fit because I've networked with them - they know my skills.
Fair or not, it's how most everything in life works. For good things to happen to you, you have to put yourself in a position where good things can find you. That means marketing yourself. It applies to getting your dream job just as much as it applies to getting your dream partner.
Even if you get your dream job, talent and hard work simply aren't enough. You'll need to be able to sell your ideas to others in the company.
I didn't realize you were such a philosopher.
That's my point!
"Why should we let you join our group?"
"Because I'm part of your group."
Also, frequently these issues become very Rashomon-like. Why did Bob not get along with anyone at site X? Is it because of Bob, or X, or because they were a poor fit for each other?
What's infuriating (or depressing) about this stuff from my perspective is that there's an implicit assumption always that the person complaining about not building networks has not built the networks because of poor social skills, rather than problems with the networks themselves.
I'm not naive about social connections, but in my experience the social skills stuff is vastly overrated. Serious problems get ignored when it's a friend, and molehills are made into mountains when it's not. It tends to devolve into petty gossip and junior high infighting.
The expected shredinger-distance of your cv is given by G * N^2, G being cv weight and N your networking coefficient.
I will vouch for someone who I know and is not terrible. I'm not going to vouch for someone who I don't know. So there is an element of having connections. There is also an element of social skills as in any endeavor that involves more than one person.
- "I am for the problems worth solving (..)"
- Use the default header on LinkedIn
- No sign of roles or additional engagement that indicate 'self driven problem solver'
I love the article, but I need actionable insights.
Lol are you serious?
I'm not exactly sure what you are responding to, or complaining about for that matter.
But IMO the field is getting to a point where the engineers are going into machine learning because it's a pay bump and the data analysts are realizing they can start calling themselves data scientists with some more experience under their belt.
Ultimately the field is saturating to where people are now going into this field as an easy way to hit six figures. You see that with the rise of the data science bootcamps but I digress as I wrote more about it here: https://www.interviewquery.com/blog/the-saturation-of-data-s...
That is, there might be more interesting distinguishing factors, but he was limited to education, position, and years of experience?
And with data scientists, I've found that it's a mix of a) people who did mathematics or physics degrees suddenly getting into computer science, b) senior analysts learning how to power up their analysis, and c) computer science graduates who went on to do a PhD in data science.
Having worked with them, my humble opinion is that a) you can't ignore the software aspect of your job, meaning that you need to understand basic database principles, parallel computing, SQL, etc., and b) you also need to understand that it's not just about how fancy your algorithm is, but also how it can be quantified, how you manage its life cycle, how you maintain it, etc.
I still find it amazing that so many data scientists I know do not understand basic data software principles. Stuff like distributed vs parallel, database types (NoSQL vs RDBMS), immutability etc.
It also shows what I think is the most vital skill of a Data Scientist: to go and find the data to support the question.
I wanted to be a data scientist at a top tech company, so I did as Hanif did, and went to the data on LinkedIn. My search was more specific (only data scientists at top tech firms), and it is also a very tiny sample.
But first, my situation at the time:
Master's in Computer Science at the University of Pennsylvania. Strong database, AWS, Spark, and Python skills. Worked in a social media research lab that looked at social media's impact on health, mostly doing NLP involving Twitter and health outcome data. Coauthored a paper that ended up in JAMA (Journal of the American Medical Association). Eventually I got what I wanted, but it wasn't easy.
- message recruiters directly
- find a way of showing them you are a good candidate beyond your resume - I found Kaggle was really helpful, I recommend it.
- be careful of getting pigeonholed out of DS positions by recruiters. your LinkedIn should speak 'I am a data scientist at heart'
- be prepared to fail interviews and learn from mistakes
- study stats (YouTube, books), coding (LeetCode), and SQL (LeetCode?)
Degree - Need a master's or PhD.
Major - Statistics or some version of it was most common, with MBAs from top MBA programs 2nd. Computer Science was very rare. Why MBAs? Probably because those programs had wonderful stats programs.
School - Top schools are very important.
Previous job - Intern at a top tech company. An internship as a data scientist is hugely beneficial. The next most important feature was whether the previous position was data scientist. Not data backed, but I would argue that becoming a data engineer to get adjacent skills is a bad strategy. DEs are highly needed, so a recruiter will put you on DE loops, not DS ones. I feel like data analysts also struggle to become DS.
I thought my situation was at least somewhat ideal, but I was not getting interviews. Zero. It's hard emotionally to not be able to get to where you want without being given a chance; you just have to keep trying. The findings (previous job in particular) helped me realize that I was going to need to go about getting interviews in a more efficient manner.
I needed a way of getting attention. The reply rate from websites seems to be about 1 in 50, which is problematic if you want to work for a specific set of companies. I think the best thing to do is to go on LinkedIn and search <company name> + recruiter. Message the recruiters directly: they have all the power in setting up phone screens, and they send batches of candidates to open positions hoping some of them will get a role. Now you've got their attention, so you also need a way of getting into those batches.
An important metric for success was already being what I wanted to be, so I had to find a way of saying 'I can do this'. I started spending most of my free time on Kaggle, in the Zillow home price prediction competition. I finished in the top 100, which I STRONGLY feel helped me get interviews. I recommend it. It's a free, zero-risk way to get experience and display your passion/skills.
Next, I got some phone screens but failed a few and failed an on-site. Technical phone screens are either stats, coding, or SQL; I have never been asked ML questions. Sometimes I failed coding questions, sometimes I failed stats questions. I addressed this by studying lots of stats (YouTube was very helpful) and coding (LeetCode). I already had years of SQL experience, so those questions were always easy for me, but be prepared to answer the histogram question. I had some recruiters tell me in the initial phone screen 'even though you applied to a data science position, we think you would be a better match for this data engineering position instead'. Ouch. I realized my LinkedIn and resume looked very DE-like because of my years as a database administrator, and because I had added lots of Spark/Hive to my resume after seeing it on most DS postings. It's important, but don't over-highlight the wrong things. I politely declined and kept trying.
Eventually I got exactly what I wanted, and I am very happy for it. It took me 2 years after graduation to get there, and I had failures at all parts of the process. I know it sounds cliche, but keep trying is my best advice.