
This highlights one of my main complaints about the DS role. You are expected to have strong business intuition, sufficient coding skills to hold down a SWE role, a strong background in stats/math, know all the ML/DS specific skills and lastly, have technical depth in the subdomain you are looking to solve. All of this, while being paid the exact same as someone on the SWE or PM track.

No one can do it all. DSs that do 70% of these are the best of the best.

Mature DS groups have figured out that you have to pick your poison, and focus on archetypes rather than a 'well rounded' DS. Here are a few DS archetypes that I've seen.

1. The NLP/Vision/RL domain expert: High depth, low breadth people. Not very concerned with business intuition. Strong grasp of math for their domain. Moderate coding abilities, but pipelining for their field is fairly well defined. What is SQL?

2. The Generalist : Comes close to the 'good data scientist' outlined here. Never publishes, solves DS problems, will probably struggle to reach principal IC level in any specific product group because they lack the prerequisite depth. Will often become a manager down the line though and can also become an excellent PM at some point. SQL is their life blood. The less business savvy people see them as MBA-adjacent. But, they are super important.

3. Mr Maths or the Statistician : Pairs excellently with #4

4. The MLE who doesn't want to be an MLE - Excellent coding skills. Sufficient ML/DS skills. Just hasn't found a way to get their foot in the door to transition to a DS role without taking a pay cut.

5. The Researcher : Hiring a researcher in the wrong team can lead to a completely ineffective team. Also, not having a researcher in a team that needs it can lead to everyone going around in circles.

Top DSs will manage to host a max of 2 archetypes in them. Trying to get your DS to host >2 archetypes is a losing battle. This is as good as it is going to get. Also, most teams don't need all archetypes.

Identify the archetypes you need. Get some coverage over them through your hired DSs and let them continue growing along their selected archetypes.




Sadly on point. Some additions to your list of skills, from my exp.:

- Sufficient engineering skills to hold down a Data Engineer role

- Excellent at explaining and presenting your results/work to all sorts of audiences (users, other DSs, management, etc).

- Very good at Data Viz


Learning all this is not really that difficult. No more difficult than a biochemist training in subjects as diverse as organic synthesis (making stuff in test tubes), Raman spectroscopy (prediction of chemical structures using vibrational signatures) and DNA sequencing (computational analysis).

It's only because data science is much newer than biochemistry as a field that it seems beyond the grasp of an individual. It's perfectly possible to learn (and to teach) all of the things you've mentioned.

And what has pay got to do with it? Since when is pay correlated to how much you need to study (see, for example, musicians)?


Data science is a role, not a field. It's similar to but wider than the applied statistician role that is well-established in many fields of research.

You have a background in one field, but you are working to solve problems in another field (e.g. biochemistry). To do that, you must understand biochemistry well enough to be able to contribute. You are probably far from the best biochemist in the team, as you were hired for your methodological skills. In order to solve the problems, you may need tools from a number of fields, including statistics, machine learning, software engineering, data engineering, mathematics, and theoretical computer science. No matter which field your original degree was in, it's insufficient in both depth and breadth. You must keep learning new things and rely on others with complementary skills.

I work in bioinformatics, which is basically a more established flavor of data science. I have worked with people from a variety of backgrounds from electrical engineering to genetics, and everyone has had obvious gaps in their skills. Except maybe one or two people, but they are world-famous experts who are unnaturally curious about everything.


Pay has a lot to do with it because if you can switch to an engineering role (SWE or Data Engineer) and have more focused responsibilities and a higher salary then that's what most of them will do.

Although given the demands made for a DS role are often unicorn-level I don't even think increasing pay would help.


The parent comment says ‘while being paid the exact same as someone on the SWE or PM track.’ Not ‘less than a SWE’, as you imply.

Why should a data scientist be paid more than a SWE? Because they have to learn several different topics? That is not such a big deal in my opinion (I work as a DS).

This language of ‘unicorns’ has been highly damaging to the field. There is nothing magical about a job which requires a lot of varied technical knowledge. Try looking at a syllabus for some other scientific subject. It’s fairly normal.


I work as a DS as well. I don't think there's such a thing as "should be paid more" - the market shows us that SWE's are more highly valued presumably because there is more demand for those skills.

However, this will lead to people migrating from DS to DE and SWE roles if the compensation is relatively better. Yet we see articles about a 'shortage' in DS when they just aren't paying as much as a similar skill-set can get in a different role.


> the market shows us that SWE's are more highly valued presumably because there is more demand for those skills.

I think it's that a tech company can more consistently make money from a SWE than any other role. You can always roll together an app and sell it. For every other role[0], you provide value to the organization, which eventually makes its way to the customers.

This is why the software bootcamp grads have fared better than the DS bootcamps (and ML bootcamps). A company can get a lot of value from a pretty crummy SWE and is willing to pay for it. A crummy Data Scientist, not so much.

[0] Sales is also similarly direct, depending on the industry. They enjoy a similar status.


> There is nothing magical about a job which requires a lot of varied technical knowledge

It's magical when the varied knowledge is orthogonal.

Having strong business sense AND being good at coding AND having stats/math skills is most definitely unicorn level.


This is an unserious comment and the worst kind of gatekeeping, as it only applies to the shallowest definition of "learn" - perhaps "remember" and "understand" on Bloom's taxonomy.

Most biochemists, like most professionals in any field, are dilettantes in 99% of the field - it's the difference between reading a French cookbook and being Paul Valéry. They specialize and the rest of what they have learned rusts and sprouts weeds and is useless when viewed under the lens of applied knowledge.

And as the OP noted, your "all this" to learn includes (no offense to biochemists) probably the most multidisciplinary set of skills in any field: communication, business analysis, psychology, statistics, computer programming, hardware and network topology, data engineering, domain knowledge, often a deep background in one of the hard sciences ..

Of course it's not beyond any individual - Renaissance Men and Women do exist, but to suggest it's "not that difficult" is an unhelpful myth.


I'm a SWE who moved from backend to a DS role and then became a DS manager at my company, and this is spot on. If I advertise a DS position I have to mix all these archetypes and get used to having, at best, a solid #4 who wants to pivot to DS, as this is the archetype that knows we are creating real-life data products, not just using the latest model or beating some metric.


> Top DSs will manage to host a max of 2 archetypes in them.

This ignores experience. Top DSs will manage to have maybe one archetype per some number of years on the job. You can find unicorns, but they all have many, many years of experience and you're going to have to pay for them.


>All of this, while being paid the exact same as someone on the SWE or PM track.

Looking at Glassdoor, the average data scientist salary is 10k higher than the average SWE salary and 4k higher than the average PM salary.

In my experience, data scientists are the best-paid non-C-level roles.


But the "average" data scientist is probably more senior than the "average" software engineer: the right comparison to make is between the same person, at the same point in time, in the two roles.

Companies that hire a "data scientist", as opposed to getting help from temporary consultants or assigning their engineers and mathematicians to data science tasks, are probably companies that value data science highly (because there is a perceived necessity and/or business value) and do enough of it to hire a full time specialist.

Moreover, a company that starts a data science team is likely to hire someone sufficiently senior to work more or less alone, not the more junior data scientists that would need the guidance of such a team leader.


>All of this, while being paid the exact same as someone on the SWE or PM track.

Why not pay top quality DS roles more than SWEs?


Often (usually?) DSs are paid less than SWEs of the same level!

I have plenty of cynical thoughts as to what drives that compensation gap. Maybe the simplest is just that there is high supply of people with these baseline skills and it isn't easy to distinguish if somebody is good or not.


I think there is just more demand for SWEs. Nearly every company will have software engineers, but not every company has data scientists and even the ones that do will almost certainly have more engineers than data scientists.

After all, you can't use data science to optimise your product or service if you don't have sufficient engineers to build it and maintain it in the first place.


As always, it's supply and demand. DS is often not needed as much as SWE and there is a lot of supply for DS, due to hype and ease of transition from people in other fields.


%s/not/don't companies/


Also be up-to-date on all the latest libs/frameworks/etc in one or several tech stacks.


I'm struggling to understand what people think is so difficult about all this data science stuff. The maths is very basic, even in "advanced" ml. Nor is it hard to learn backend software engineering for the purposes of 99% of companies.


It's all about epistemology. How do we know what we think we know? How do we come to know things we didn't know before? And how can we trust those conclusions?

Even if the math is basic, it's really, really easy to draw bad conclusions, look at the wrong problems, not realize that your data is more incomplete than you might think, etc etc etc. Guarding against these bad results - figuring out how to actually manufacture new knowledge - is the heart of the problem.


> How do we come to know things we didn't know before? And how can we trust those conclusions?

IMHO, this is the heart of what distinguishes scientists from engineers. Yes there is plenty of overlap, but to me this is the principal component. In engineering, the correct answer almost always exists. Enough eyeballs on the problem, sufficient double- and triple-checking converges on high confidence.

Scientific problems may not have a right answer, or gaining confidence has diminishing returns, and at some point you decide that's enough sigmas. You can scrutinize in so many ways but there are always blind spots.

(Data) scientists generally have to be way more comfortable with uncertainty. And as you mention, the easiest person to fool is yourself.


By the same logic what is so difficult about programming computers - it's just a bunch of zeroes and ones, very basic operations.


I spent 15 years of my damn life to become a dev and you don't know what it's like to be a beginner.

If you can re-read what you wrote with a beginner's mind, you will see how wrong you are.


99% of companies? Definitely not. The skills needed to do DS in business or healthcare are not very correlated with doing DS for the physical sciences. Which is the whole point of this comment thread: sure, you can understand DL, but you also have to understand the field to know what type of DL to use. For example, in my role, I came with knowledge of machine learning but had to learn complex fluid physics to know what type of DL techniques to apply or develop.


According to you, a dyslexic person can't become a DS person, not because they love data but because ...

You are expected to have a strong background in stats/math

But waaaait

How come you forget about Philosophy?

Math and stats are based on philosophy. You will have to learn philosophy to become a super DS person!



