
Amazon data science interview questions - onerousraisin
https://mldatageek.herokuapp.com/questions/tags:amazon/
======
beebmam
As an interviewer, I don't ask questions like these.

By a very large margin, for most of the time you are actively developing as a
full time employee at a large tech company, you're working on integrating
systems, either through a build language or extending your software to use an
already existing API.

What are the most useful skills for these tasks? Being able to communicate
effectively, feeling comfortable asking questions, feeling comfortable
admitting when you don't understand something, and being kind and friendly
with those you interact with.

Therefore, when interviewing new hires, I do not ask many technical questions
beyond a general competency question.

~~~
zaptheimpaler
A company full of people great at communication but too little technical
expertise will fail. I think it takes both types - technical gurus who can
generate useful information and solve the harder problems, plus
connectors/communicators who gather information.

~~~
gaius
I get the feeling at many companies now, there are fewer truly interesting
technical problems than there are engineers enthusiastic or desperate to
exercise their hard-won CS knowledge. Most real-world work of commercial value
requires skill of course, and experience, but is not especially algorithmic. I
think the surge of interest in DS is that here people see an opportunity to do
something mathematical day-to-day (but 10% or less of DS is the "sexy" stuff).

~~~
metaobject
Much of the work I do that could be considered "data science" involves trying
to write generic visualization tools that allow us to plot various datasets
and the results from models we've built (a lot of 2D/3D/4D weather/atmospheric
data).

------
prestonh
Can't say I could answer all these questions, but I think they're really great
and motivate me to spend more time studying statistics. With the rise in data
science and machine learning I've noticed a lot of resources devoted to
teaching people how to create and train models for a given problem, but fewer
resources on how to interpret those models and on statistical inference in
general.

~~~
RA_Fisher
It's quite rare to find folks in the industry with formal education in
statistics. The folks that have it can run circles around everyone else,
though. Data Science and Analytics and Machine Learning all use statistical
theory when boiled down.

~~~
icelancer
When I ran a Data Science team we primarily hired Physicists and Mathematics
majors and not CS graduates for this reason. It _mostly_ worked as intended
(untapped source of excellent candidates), but some of them could not for the
life of them pick up software development / writing manageable code with a
team. They were so used to writing unmanageable scripts that didn't have long-
term time horizons (think: horrors of hundreds and thousands of lines of
R-stats code in a single file, uncommented, and brutal to look at, with the
expectation we'll just plug it into R-script on the command line) that
occasionally this became a very tough habit to break.

Still, besides that issue - which all novice programmers have, but Data
Scientists come in at a much more senior level - it was a great market
advantage, one that I think is largely gone today, as companies understand the
need for statisticians and economists who really and truly understand modeling
and math.

~~~
botnik
Physicists and Mathematics majors can only be better than a CS grad, if they
come prepped with Software Development experience. A CS grad can communicate
better with the code they write and, most of the time, assuming it's straight
CS, has knowledge and experience of ML that exceeds a Physics/Math major's.

Anecdotally, I've worked in places where this is a major source of contention
- The Data Scientists treated as talented individuals (who produce broken
solutions which work for cherry picked data sets), and anyone else is just a
monkey who maintains and fixes the broken code.

~~~
icelancer
>Physicists and Mathematics majors can only be better than a CS grad, if they
come prepped with Software Development experience

This is very much the opposite of my experience.

------
cabaalis
I appreciate the insight into the types of questions that might be asked.
Sadly, I would have to look the interviewer in the eyes and say "I can program
good"

------
aflam
Data science now has its own tree-sorting questions. Shameless plug, but since
the submission is popular, some could find this helpful:
[https://shapescience.xyz/blog/interview-questions-for-data-s...](https://shapescience.xyz/blog/interview-questions-for-data-scientists/)

------
ggggtez
These questions are largely unanswerable in their current form. Like "estimate
the probability of disease in a city given nationwide has a low probability".
Like, what the heck does that even mean? I could imagine a dozen answers. I
can only guess they are trying to get you to think about Bayes (the
probability an event occurs given X independent identical trials). But that's
ludicrous, because in what world has anyone proved disease occurs uniformly at
random?

~~~
nerdponx
The point isn't to answer the question _correctly_. It's to show how you think
through the problem. There is often a "right" way to approach the problem and
a "right" way to answer the question, even if there is no right answer _per
se_.

~~~
ben_jones
I've never understood how there is a "right way to think about something".
Isn't it an attribute of humanity that we all think and approach problems
_differently_? Are companies really looking for one single type of individual
to clone across all their engineering teams?

~~~
ImSkeptical
I believe it's not that there is a single right way to think about a problem,
but there probably are wrong ways to think about the problem.

For example, going through all the issues with the question and implicit
underlying assumptions, talking about what you'd need to give a good answer,
or the characteristics a good answer may have might not be exactly what the
interviewer expected or was looking for - but such an answer could demonstrate
your competence in the field and be a good answer.

On the other hand, trying to make up an answer, or brazen your way through
with jargon while hoping the interviewer doesn't notice would likely be a bad
answer.

It's not that these companies are looking for a single kind of thought. On the
contrary, I believe they value different kinds of thought. Instead, they
believe they can sort kinds of thought into desirable and not.

~~~
dapreja
The reality is, unless you're directly communicating with someone who
understands the problem/question, they'll be expecting the textbook definition
as is, or whatever the Stack Overflow response says.

~~~
nerdponx
In data science interviewing, that generally is the case. Technical people are
going to be asking these questions and evaluating you.

------
rafinha
I have mixed feelings concerning this methodology. While I feel I'm a bit
rusty on the details of the field, I still believe the best way to assess
academic background is to send a paper "offline" and ask the interviewee to
explain it. IMO all these quick-answer questions assess is how prepared the
interviewee is to answer quick-answer questions.

~~~
Xcelerate
Agreed. I have horrible short term memory but these questions are trivial as a
"homework" type assignment. Plus, for my current data scientist role, the
interview consisted of me bringing in a project I had worked on (actual code)
and demonstrating how it worked. I much prefer that tactic. I think it gives
you greater insight into the candidate.

~~~
gaius
_bringing in a project I had worked on (actual code)_

I don't think commercial confidentiality makes that a viable generic method...

~~~
Xcelerate
Should have clarified. This was right out of grad school, so code for my
research.

------
irishasaurus
I'm sorry if I'm ignorant of the platform, are the different posts the
interview questions? Also what am I to make of the platform itself? Is this an
app hosted on heroku I can access later on my phone? If that's the case that
would be awesome, like a targeted stackoverflow.

------
KirinDave
My interview was outrageously more math-y than this. Sadly, Amazon NDAs these
things pretty hard.

------
hackernewsacct
If you obtained a degree in Computer Science and specialized in Machine
Learning are you supposed to be able to answer these questions? What job
specialization is this aimed at? Almost strikes me more as a statistics-based
interview.

~~~
ju-st
I have the impression that at big universities maths is always the #1 topic in
CS/ML. So it's no surprise their graduates ask the same riddles as their profs.

~~~
screye
I personally know a few ML professors at my university who are not taking any
CS grads as PhDs. They only want people with maths degrees.

A lot of these CS professors are themselves maths grads.

------
cfusting
These are good questions. A good quantitative researcher can answer most of
them. They're also fairly open ended, allowing the candidate to show what he
or she knows.

------
manav
Interesting that it seems to be all on the stats side and less on engineering.

For anyone that wants to get the background for any of this I highly recommend
ISLR (and ESL as a reference).
[http://www-bcf.usc.edu/~gareth/ISL/](http://www-bcf.usc.edu/~gareth/ISL/)

~~~
RA_Fisher
I bet that's because statistics knowledge is much more scarce in the industry
right now.

------
soVeryTired
My quick and dirty answers are below. I'm thinking of moving jobs within the
next year so I could use the practice. Can anyone do better?

> How do you treat colinearity?

Throw away the redundant part of the data
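
One way to read "throw away the redundant part" is to drop any feature that is almost perfectly correlated with one already kept. The sketch below assumes that reading; the column names, the data, and the 0.95 threshold are made up for illustration, and it only catches pairwise linear redundancy.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_collinear(columns, threshold=0.95):
    """Keep a column only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Hypothetical data: the second column repeats the first in another unit.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [35.0, 22.0, 41.0, 29.0, 33.0],
}
reduced = drop_collinear(data)
print(sorted(reduced))  # ['age', 'height_cm']
```

Which column of a correlated pair survives depends on iteration order here, so in practice you would probably want to pick deliberately.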

> How will you deal with unbalanced data where the ratio of negative and
> positive is huge?

This is very problem-dependent, but it's got the potential to wreak havoc with
your learning algorithms. You might get seemingly good results by e.g. always
predicting positive. Think carefully about your loss function.
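
To make the loss-function point concrete, here is a minimal sketch of a class-weighted log loss. The 95/5 split, the two sets of predictions, and the weight of 19 (the class ratio) are assumptions chosen for illustration, not a recommendation.

```python
import math

def weighted_log_loss(y_true, p_pred, pos_weight=1.0, neg_weight=1.0):
    """Mean binary cross-entropy with a separate weight per class."""
    eps = 1e-12  # keeps log() finite for predictions of exactly 0 or 1
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -neg_weight * math.log(1.0 - p)
    return total / len(y_true)

# Hypothetical data: 95 negatives, 5 positives.
y = [0] * 95 + [1] * 5
lazy = [0.01] * 100              # constant guess favouring the majority class
useful = [0.1] * 95 + [0.8] * 5  # made-up model that scores positives higher

plain_lazy = weighted_log_loss(y, lazy)
plain_useful = weighted_log_loss(y, useful)
w_lazy = weighted_log_loss(y, lazy, pos_weight=19.0)
w_useful = weighted_log_loss(y, useful, pos_weight=19.0)

print(plain_lazy / plain_useful)  # gap between the two models, unweighted
print(w_lazy / w_useful)          # the gap is much wider once positives are up-weighted
```

With the unweighted loss the constant guess looks only moderately worse than the useful model; up-weighting the rare class makes the difference hard to miss.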

> How will you decide whether a customer will buy a product today or not given
> the income of the customer, location where the customer lives, profession
> and gender? Define a machine learning algorithm for this.

This is a messy combination of continuous, categorical, and binary data. I'd
encode the data in a vector [salary, x-y co-ords, one-hot encoding of
profession, and binary indicator for gender]. Something like a random forest
will probably get you most of the way there. Unusual professions could mess
with the algorithm, so consider grouping by industry or averaging with a model
that omits the profession data.
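
A rough sketch of that encoding, under some assumptions: the profession vocabulary is invented, and a shared "other" bucket stands in for the idea of grouping unusual professions (grouping by industry would need a lookup table that isn't available here).

```python
PROFESSIONS = ["engineer", "teacher", "nurse"]  # hypothetical vocabulary

def encode_customer(income, x, y, profession, gender_flag):
    """Build [income, x, y, one-hot profession..., other-bucket, gender indicator]."""
    one_hot = [1.0 if profession == p else 0.0 for p in PROFESSIONS]
    # Anything outside the vocabulary lands in a shared "other" bucket,
    # so rare professions do not each get their own sparse column.
    other = 0.0 if profession in PROFESSIONS else 1.0
    return [float(income), float(x), float(y)] + one_hot + [other, float(gender_flag)]

vec = encode_customer(52000, 47.6, -122.3, "nurse", 1)
rare = encode_customer(31000, 40.7, -74.0, "falconer", 0)
print(vec)   # [52000.0, 47.6, -122.3, 0.0, 0.0, 1.0, 0.0, 1.0]
print(rare)  # [31000.0, 40.7, -74.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```

Tree-based models such as a random forest are not sensitive to the scale of `income`, which is one reason the raw value is left unscaled in this sketch.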

> Is it useful to apply PCA to your data before SVM classification?

Probably, but it could be data dependent. Semi-supervised learning often
improves machine learning models, and it'll lower the dimensionality of your
inputs, which will make training / hyperparameter search far more efficient.

> How do you compare a neural network that has one layer, one input and output
> to a logistic regression model?

Not entirely sure I understand the question, but a NN is basically just nested
logistic regression, depending on the activation function.

> From a long sorted list and a short 4 element sorted list, which algorithm
> will you use to search the long sorted list for 4 elements.

To be honest, I'd use binary search and call it a day. I realise this isn't
the answer you're looking for.
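
For what it's worth, the "binary search and call it a day" approach is only a few lines with the standard library. The one refinement in this sketch, which is my addition, is that because the short list is also sorted, each search can start where the previous one stopped.

```python
from bisect import bisect_left

def find_all(long_sorted, short_sorted):
    """Return the index of each short-list element in long_sorted, or -1 if absent."""
    result = []
    lo = 0  # lower bound for the next search; valid because short_sorted is sorted
    for target in short_sorted:
        i = bisect_left(long_sorted, target, lo)
        found = i < len(long_sorted) and long_sorted[i] == target
        result.append(i if found else -1)
        lo = i
    return result

long_list = list(range(0, 1000, 3))  # 0, 3, 6, ..., 999
print(find_all(long_list, [9, 10, 300, 999]))  # prints [3, -1, 100, 333]
```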

> How will inspect missing data and when are they important for your analysis?

You might try to impute the missing data from similar datapoints. Maybe use
something like K nearest neighbours on the non-missing components to deduce
the missing values. Got to be careful doing this though, since it could
massively bias your analysis: consider using a special encoding for "missing
data" too.
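
A minimal sketch of that idea, assuming numeric rows with `None` for missing entries and at least one fully observed row to borrow from. The returned flags play the role of the "missing data" encoding; the toy rows are invented.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean over the k nearest complete rows.

    Distance is computed only on the components the incomplete row has.
    Returns (filled_rows, flags) where flags[i] is 1 if row i had gaps.
    """
    complete = [r for r in rows if None not in r]
    filled, flags = [], []
    for row in rows:
        missing = [i for i, v in enumerate(row) if v is None]
        known = [i for i, v in enumerate(row) if v is not None]
        new_row = list(row)
        if missing:
            nearest = sorted(
                complete,
                key=lambda other: math.sqrt(
                    sum((row[i] - other[i]) ** 2 for i in known)
                ),
            )[:k]
            for i in missing:
                new_row[i] = sum(n[i] for n in nearest) / len(nearest)
        filled.append(new_row)
        flags.append(1 if missing else 0)
    return filled, flags

rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [9.0, 90.0],
    [1.5, None],  # second component is missing
]
filled, flags = knn_impute(rows, k=2)
print(filled[3], flags)  # [1.5, 15.0] [0, 0, 0, 1]
```

Keeping the flags alongside the imputed values lets a downstream model learn whether missingness itself carries signal.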

> Estimate the probability of a disease in a particular city given that the
> probability of the disease on a national level is low.

This is a tough one, since there are so many ways to answer it. I think I'd
start by writing down a list of factors that might allow a high prevalence of
the disease locally despite a low prevalence nationally. Weather? Local
wildlife? Proximity to major transport hubs? Then I might suggest a simple
model like naive Bayes.
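
Before reaching for a full model, one simple baseline is to treat the national rate as a prior and update it with whatever counts the city has. The sketch below uses a Beta-binomial update rather than naive Bayes, and every number in it (the national rate, the counts, and the prior strength of 1000 pseudo-observations) is an assumption for illustration.

```python
def city_prevalence(national_rate, city_cases, city_tested, prior_strength=1000.0):
    """Posterior mean of a Beta prior centred on the national rate,
    updated with the city's observed cases out of people tested."""
    alpha = national_rate * prior_strength
    beta = (1.0 - national_rate) * prior_strength
    return (alpha + city_cases) / (alpha + beta + city_tested)

national = 0.001  # hypothetical low national prevalence

no_data = city_prevalence(national, 0, 0)             # falls back to the national rate
small_sample = city_prevalence(national, 3, 100)      # 3% observed, pulled toward the prior
large_sample = city_prevalence(national, 300, 10000)  # same 3%, much more evidence

print(no_data, small_sample, large_sample)
```

With no local data the estimate is just the national rate; as the city sample grows, the estimate moves toward the locally observed rate. Local factors like the ones listed above would enter as extra covariates in a richer model.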

~~~
screye
I will try.

> How do you treat colinearity?

Use LASSO as a feature selector or regularizer.

> How will you deal with unbalanced data where the ratio of negative and
> positive is huge?

Use F1 score instead of accuracy as a performance metric. Try using a cascade
classifier.
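
A small illustration of why F1 is the safer metric here, using an invented 95/5 class split and two made-up sets of predictions:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100
# Hypothetical detector: 6 false alarms, catches 4 of the 5 positives.
detector = [1] * 6 + [0] * 89 + [1] * 4 + [0]

print(accuracy(y_true, always_negative), f1_score(y_true, always_negative))
print(accuracy(y_true, detector), f1_score(y_true, detector))
```

The constant predictor wins on accuracy (0.95 versus 0.93) while scoring an F1 of zero; the detector's F1 is about 0.53.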

> How will you decide whether a customer will buy a product today or not given
> the income of the customer, location where the customer lives, profession
> and gender? Define a machine learning algorithm for this.

More or less, the same as what you said.

> Is it useful to apply PCA to your data before SVM classification?

No, since kernel-SVM optimization is dependent on data (similarity) and not
features. Also, we will be projecting the data into a basis expanded space
anyways, so the dimensional reduction would be redundant.

> How do you compare a neural network that has one layer, one input and output
> to a logistic regression model?

An MLP-NN is the same as a logistic regressor, if all connections are 1-1 (no
basis expansion), no activation function in the data-1st layer connection and
Sigmoid as activation function in the 1st layer-output connection.
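
The equivalence can be checked numerically. This sketch builds the degenerate network described above (identity 1-1 connections with no activation, then a single sigmoid output unit) and compares it with a plain logistic regression; the weights and input are arbitrary made-up values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, w, b):
    """Standard logistic regression: sigmoid of a weighted sum plus bias."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def dense(x, weights, bias):
    """One fully connected layer without an activation function."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(weights, bias)]

def tiny_network(x, w, b):
    """Identity first layer (1-1 connections), then one sigmoid output unit."""
    n = len(x)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    hidden = dense(x, identity, [0.0] * n)  # passes x through unchanged
    (z,) = dense(hidden, [w], [b])          # single output pre-activation
    return sigmoid(z)

x = [0.5, -1.2, 3.0]
w = [0.8, 0.1, -0.4]
b = 0.2
print(logistic_regression(x, w, b), tiny_network(x, w, b))
```

Both functions compute the same value, so if they are also trained with the same log loss they are the same model.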

> From a long sorted list and a short 4 element sorted list, which algorithm
> will you use to search the long sorted list for 4 elements.

I am dreadfully bad at core CS questions...throw some AI questions at me. XD

> How will inspect missing data and when are they important for your analysis?

> You might try to impute the missing data from similar datapoints. Maybe use
> something like K nearest neighbours on the non-missing components to deduce
> the missing values. Got to be careful doing this though, since it could
> massively bias your analysis: consider using a special encoding for "missing
> data" too.

Love your solutions. I might also try using cascade classifiers.

> Estimate the probability of a disease in a particular city given that the
> probability of the disease on a national level is low.

Tough question. Would probably involve me asking a lot of questions about what
data I have to play with. Very interview specific.

~~~
govg
Regarding the PCA before SVM step - PCA would help reduce dimensionality w.r.t
linear correlations, kernels generally are used to obtain non-linear maps. So
wouldn't it be wrong to say doing PCA is redundant? Also, while kernel SVM
optimization is dominated by number of data points, computing the value of the
kernel function for individual data points might depend on the dimensionality
itself, would it not? Another instance of lower dimensional representation
possibly helping.

Please correct me if I'm wrong, I'm still just learning this stuff.

~~~
screye
The cost of computing similarities is definitely linear or lower wrt.
features. But overall, the Kernel SVM will scale to the order of n^3 wrt.
the number of points. In that sense, focussing on the number of features may
not make a lot of sense.
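
To see where the two costs come from, here is a sketch of building an RBF Gram matrix by hand. It only illustrates that the matrix has n x n entries regardless of feature count while each entry costs one pass over the d features; it says nothing about the solver itself, and the `gamma` value and random data are arbitrary.

```python
import math
import random

def rbf_gram(points, gamma=0.5):
    """Pairwise RBF kernel values; the result is n x n for n points."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # One pass over the d features per pair of points.
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            gram[i][j] = gram[j][i] = math.exp(-gamma * sq)
    return gram

random.seed(0)
low_dim = [[random.random() for _ in range(2)] for _ in range(4)]    # d = 2
high_dim = [[random.random() for _ in range(50)] for _ in range(4)]  # d = 50

g_low = rbf_gram(low_dim)
g_high = rbf_gram(high_dim)
print(len(g_low), len(g_low[0]), len(g_high), len(g_high[0]))  # 4 4 4 4
```

Reducing d with PCA makes each kernel evaluation cheaper, but the number of evaluations, and everything the optimizer does afterwards, depends only on n.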

PCA isn't lossless either. We are definitely trading off some richness in the
data for efficiency. When the gain in efficiency is negligible (as is the case
with SVMs) the trade off doesn't look as tempting.

I am a student in the area as well, so take my words with appropriate caution.

------
b_tterc_p
I respectfully disagree with everyone saying these are statistics. This is
analytics. Most of them are down to earth and practical matters that I think
would be lost on someone who did statistics as a math degree (but maybe this
distinction is unclear because it is also observably different from straight
comp sci or neural net expertise).

I recently graduated from a masters of business analytics program and would
have humbly assumed that the breadth and depth of knowledge asked for in these
questions would be beyond my capacity, but it was actually pretty spot on with
what I had learned. Strictly speaking, I am far from an expert in
statistics/math/comp sci. But these seem like reasonable questions. The best
one is

"How will you decide whether a customer will buy a product today or not given
the income of the customer, location where the customer lives, profession and
gender? Define a machine learning algorithm for this."

This is a difficult question because for someone who has played with data
science before, the answer seems to be obvious: throw the data into your
favorite predictive algorithm and see the result (worst case: deep learning).
What they are probably looking to see is a discussion of feature engineering,
hypothetical exploratory analysis, and an effective way to present the answer
(straight probabilities, weighted loss functions, outlier detection, etc.)

------
rothbardrand
It appears that the current generation of "hackers" seems very keen to work
for a big name company-- Amazon, Apple, Microsoft, Google, Facebook. As
someone who has worked for many startups and several of the above named
companies, please let me give you a bit of advice:

- A big company on your resume has no outside value to your prospects.

Seriously. There's no "hey well he worked for amazon so we know he's good"
free pass in the future for having Amazon on your resume.

There might be a "well, she wasted 3 years at amazon and we all know that's a
hellhole so we can get away with abusing her" signal though. (Amazon is a
hellhole, I spent 2 years there, in retrospect I would have been better off
walking the day it became clear they were using cult methodology.[1])

It took me many years to find my calling. Once I did then my career direction
was set. I'll only be working for companies in this industry going forward,
because it really is my calling.

Take a job that maximizes your creativity and ability to contribute.

Don't chase big money or big names. The two almost never go together, anyway.

[1] Magical Phrases are a key indicator of cult methodology. Also known as
"thought terminating cliches" they are multi-purpose sayings that are used to
invoke company ideology and shut down dissent. At amazon there are things like
"It's day one!" and the like. Every time I've seen these cult signals at a
company it turned out to be a toxic work environment.

A good question to ask in interviews is whether the company has any slogans all
the employees know.

~~~
argc
My experience is completely different from yours in regards to every single
point you make. Working at a big company (amazon) has increased my salary
significantly, made me better technically, and noticeably made it easier for
me to find a job elsewhere afterwards (this could be because my technical
skills were better afterwards, and I had more experience). Additionally, I
actually loved my team at Amazon. My manager was incredibly empathetic and
caring, far more than any managers I have had at smaller companies before. I
just wanted to provide a different perspective, and show that generalizing
about companies, especially large ones, is a bad idea and often grossly
incorrect. Your experience and the people you talked to (likely) represent a
very small part of the whole company.

~~~
cheeze
I had a similar experience there. There were parts of the company that seemed
like they sucked, but my experience was only positive and put me in a position
to learn a ton from strong engineers while building my skills up and making a
good amount of money. When I started, I made less than some of my friends who
chose startups, but two years out of college I was making more than almost
anybody I knew.

I might not have worked on my passion in specific there, but I had a great
time, learned a ton, and IMO set myself up for a great trajectory in my
career.

Like most other large companies, YMMV depending on where you're at.

