
Ask HN: Data Analyst vs. Data Scientist - kreeWall
It sounds like these two terms are often used interchangeably. What are the differences between a data scientist and a data analyst? If you were hiring for a data-related position and knew only each candidate's title (not their resume), what skills would you assume each had?
======
nerdponx
Here is my stream-of-consciousness answer:

An analyst analyzes data. You have a question that needs to be answered with
data, so you go to the analyst. They gather data, they look at the data, and
they answer your question. Sometimes they find something in the data without
your prompting and make a report about it to you. They use straightforward
statistical analysis and basic modeling techniques such as linear regression,
decision trees, clustering, and exponential forecasting. They use tools with
lots of off-the-shelf functionality, such as SQL, Excel, SAS, QlikView,
Tableau, and domain-specific modeling tools like Emblem. They occasionally use
R and Python, in which they might have basic proficiency (especially for
cleaning messy data).
They might be familiar with the fundamentals of probability. They have a keen
nose for erroneous data points. They make use of clear data visualizations to
carry out their analysis and demonstrate results.
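
To make the analyst side concrete, here is a rough sketch of the kind of work
described above: fit a straight line to some numbers, then flag points that sit
far from it. Everything in it is an assumption for illustration: the function
names fit_line and flag_suspects, the 2-standard-deviation cutoff, and the
sales figures are all made up, and a real analyst would more likely do this in
Excel, SAS, or R than by hand.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def flag_suspects(xs, ys, threshold=2.0):
    """Indices whose residual exceeds `threshold` residual standard deviations.

    The cutoff is an arbitrary illustrative choice, not a standard rule.
    """
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every point lies exactly on the line
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Made-up monthly sales; month 6 looks like a data-entry error.
months = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 60.0, 22.0, 24.0]
suspects = flag_suspects(months, sales)
print(suspects)  # [5], i.e. month 6
```

Note that the bad point also drags the fitted line toward itself, which is one
reason a human still has to look at the flagged rows rather than trust the
cutoff.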

A data scientist carries out research using data. You have a business question
or a problem that needs to be solved, so you call a meeting with the data
scientist to see whether they can help. They determine whether and how data
can help,
and carry out research that addresses your question. They work with you to
help define what it is that you need. They have most or all of the abilities
of a data analyst. They have a wide variety of tools at their disposal for
collecting data, analyzing data, and developing powerful models. They have a
deep mathematical understanding of at least a few of those tools, and they
know what they don't know about the others. When familiar methods fail, they
are able to research new or different methods, which might only be available
in technical literature. When existing software does not implement a technique
or algorithm in the way the business requirements call for, they are able to
implement it themselves in a programming language. They are likewise able to
obtain data from a variety of
sources and implement data cleaning procedures that might not be available off
the shelf. They occasionally work with software engineers to implement their
solutions in production systems.

