
Ask HN: Data scientists, how are you being evaluated within your company? - loopasam
I've worked at places of different sizes as a data scientist (from start-up to mega-corp), and I've seen different ways to evaluate employees' performance (think yearly performance review). In my experience, it's ironically often difficult as a data scientist to demonstrate, in a quantitative fashion, what you have or have not achieved during the year. I would be curious to hear your input: what works and what doesn't, in your opinion. Evaluation methodologies I have often seen are:

- Demonstration of impact on business: In this case it's up to the data scientist to justify as best as possible what business decision (or internal milestone) was made because of an analysis they performed. In theory it makes sense (your focus should be on impacting the business); in practice I don't think I've ever seen a single analysis change the course of anything. Decisions are driven by many factors, your analysis being only one of them.

- Tool usage: I guess some programmers are evaluated the same way; basically, you develop a tool for co-workers to perform analyses with. The more the tool is used, the better for you (high usage is assumed to mean high business relevance). In this case usage is sometimes easier to track and more impartial, but it's often difficult to develop a data science tool covering many use cases, and one frequently ends up with a niche product with low usage.
======
shoo
> Tool usage: I guess some programmers are evaluated the same way; basically,
> you develop a tool for co-workers to perform analyses with

> often difficult to develop a data science tool covering many use-cases

Yes; yes. If you find the right niche, a tool can be very valuable, and it
should be possible to estimate the value the tool provides by comparing it
against the existing process that did not use the tool.

I worked in a domain where software was used to automate or optimise business
decisions as part of a large, expensive construction project. Some components
of the work that my colleagues & I did could arguably be framed as data-
science (more accurately operations research), although a lot of the work was
just software development. Occasionally there were small consulting projects
for clients where the output of the project was a report summarising some
modelling/simulation with recommendations.

The bulk of the work was building software tools used by the client to
automate and optimise business decisions. The value of such tools could be
evaluated in a few obvious ways:

* How much labour cost did the tool save the client by automating away previously manual processes?

* How much value did the tool provide the client by making better business decisions than the previous process?

* How much incidental value did the tool provide by forcing standardisation of previously ad-hoc processes (e.g. capturing the data required as inputs, data quality...)?

* How much did the client pay for the tool? (The above points would inform this one!)

The tools I worked on were used as part of the planning / design process. When
they were effective, these tools directly identified designs that would be
cheaper to construct than designs produced by the previous process. The value
of these construction savings could be estimated and was much larger than the
value from automating previous manual work. In at least one case, prior to a
sale to a very major client, there was a benchmark & comparison done between
the client's existing process and the new process using the proposed tool as
part of the business case to fund the sale of the product & related
integration work.

------
TadaScientist
I think this is a problem most data science teams face, due to the hype and
the pressure to demonstrate ROI.

DS teams might work on operational improvements or external customer problems.
The same DS team is unlikely to be tasked to do both.

Fairly and factually measuring the impact a team has is not a new problem.
Banks have used transfer pricing models to allocate revenue to non-front
office teams. This requires a lot of buy-in from higher-ups, and it is very
sensitive. Management is unlikely to be familiar or comfortable with the
notion that a model would calculate the implicit benefit each team brings to
the table.

Ideas I've seen attempted with various outcomes are:

* If your dashboard leads to hours saved by your internal users, focus on that, because it means the DS team's time investment translated into X workhours saved per week or month. Multiply by average salary and you get a cost-saving estimate.

* If your model predicts or calculates something, then it's even easier; the same goes if you are forecasting. It's difficult to measure ROI on non-financial investments, but it's feasible.

* If your solution does not address an existing modelling need, problem, or operational bottleneck, and simply modernizes or brings something to the table that was not around before, things are a bit trickier. You need to think about opportunity cost (what could the DS team have been doing instead of this solution?) but also about the company's strategic direction. You also need to address operational risk: if your tool helps minimize risk, that's worth something. It's measurable by comparing the data pre- and post-launch of the tool (maybe a six-month window of running both is sufficient to compare and contrast).
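
The arithmetic behind the first and third bullets can be sketched in a few lines of Python. All names and figures here are hypothetical illustrations, not anyone's real numbers:

```python
from statistics import mean

def annual_cost_saving(hours_saved_per_week: float,
                       num_users: int,
                       avg_hourly_rate: float,
                       weeks_per_year: int = 48) -> float:
    """Translate weekly workhours saved into an annual cost-saving estimate."""
    return hours_saved_per_week * num_users * avg_hourly_rate * weeks_per_year

def pre_post_delta(pre: list[float], post: list[float]) -> float:
    """Change in the mean of an operational metric (e.g. weekly incident
    count) between the pre-launch and post-launch observation windows."""
    return mean(post) - mean(pre)

# e.g. a dashboard saving 2 hours/week for each of 10 analysts at $60/hour:
print(annual_cost_saving(2.0, 10, 60.0))  # 2 * 10 * 60 * 48 = 57600.0
```

The point is not precision; it's that a defensible order-of-magnitude figure, with the assumptions stated, is usually enough for a performance review or a business case.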

If you are looking for early stage successes to build the DS team's goodwill -
just focus on the first two bullet points. If you already have buy-in, then
time is on your side as long as you are productive.

I would also advocate that in conservative companies it's best to go to your
internal clients and explicitly ask them to nominate projects, problems, or
issues they need help with. If you can help them with your solutions or data
pipelines, they will advocate for the DS team, doing your promotion for you.
Of course, there are a lot of companies where cliques, not merit alone, make
the decisions. Those companies are going to lose their best people and wither
away over time.

------
N_trglctc_joe
Question related to this:

I'm applying to jobs in data science (straight outta grad school), and I'm not
really sure how to market myself. The problem I keep running up against is
that data science is such a broad term that it's hard for me to express how I
can provide value to a company without speaking in empty generalities. From my
reading of job postings, it seems as though what's called "data science" at
one company is "software engineer" at another and "machine learning developer"
at a third. How do working data scientists view their role within a company,
and in particular, how do they differentiate their purpose from the tools they use?

~~~
ASpring
In my mind there are 2 defining axes of data science:

The first is Statistics <-> Machine Learning. On one hand, you can be a data
scientist that primarily uses statistics to model user behavior, create
metrics, test hypotheses that then inform product design. On the other hand,
you can prototype and develop machine learning systems (recommendation
engines, predictive analytics etc).

The second axis is whether you develop production code. Some data scientists
live in their notebooks and analyses and never write code that directly makes
it into production. Others are expected to sit alongside the SWEs and write
production-level code for the models they have created.

Examples: Data Scientists at Google are statistics-heavy and do not often
write production code (though this varies by team). Quora Data Scientists, on
the other hand, mostly develop machine learning models and are expected to
write production-level code to implement them.

