
My Favorite Engineering Interview Question - brianm
http://skife.org/interviews/design/2010/10/27/favorite-interview-question.html
======
derwiki
Advice for new questions: instead of contrived ones, pick a problem you've
actually encountered in your job and ask the candidate to solve it. I've never
had to implement a datastore like this -- so unless that's what you're
actually doing, it doesn't seem particularly relevant. Another plus is that if
you're doing interesting work, you have an almost endless supply of questions
to choose from. Candidates also seem to respond better when you wrap up the
question with "this is something I actually worked on a few weeks ago."

FWIW, Ning turned me down back in April 2008 so feel free to ignore my advice
:)

~~~
zeraholladay
Asking relevant questions at an interview is a good idea, but I've noticed
that a lot of technical interviewers ask questions it took them weeks to
understand and fix when they use this approach. Understanding a problem is
often more difficult than solving it ... knowing that the interviewer
understands the problem and constraints is even more of a challenge.

I once had an interviewer give me the kobayashi maru without telling me. I
thought the guy was a jerk. He could just have pulled a gun on me and demanded
my wallet if he wanted to judge my personality. I declined the second round.

~~~
jemfinch
Can you describe your kobayashi maru question? I'm curious.

~~~
zeraholladay
Yeah, but it wasn't technical. The point of the question was to assess my
entrepreneurial IQ, or so I was told after. Generally if an unemployed
person's applying for a job, their entrepreneurial IQ is low at the moment and
they need money. Nevertheless: I was to acquire a desk to work at, which
needed to have certain dimensions for under a certain dollar figure. The
internet is down, the local office supply company doesn't have it, etc, etc,
until I'm starting to think the interviewer is nuts. OK, I'll sit on the
floor. His point was that you need to generate options, which isn't bad
advice. But it was a dumb interview question.

~~~
swaits
Very dumb question, indeed.

~~~
tlb
The point of the Kobayashi-Maru test isn't to test whether they're capable of
passing impossible tests, it's to test their character facing death. If you're
choosing fellows to go to war with, you want to choose ones that won't shit
their pants or otherwise make your last moments alive together awkward.

~~~
swaits
That's great for Star Trek. I'll keep interviewing technical candidates on a
technical basis.

------
jsankey
An idea I've tried a bit in the past (and would like to try more, given the
chance): ask a question that you don't know the answer to. This shifts the
power balance of the interview - you don't automatically have the answer over
the candidate. You also don't have your mind closed by your expected
answer(s). Both of these make the resulting discussion more comparable to how
you will work with a colleague, which is what you are trying to test the
candidate for, after all.

Doing this requires more effort and guts from the interviewer, but it's still
easier than being on the other side of the desk. Finding a supply of suitable
questions is probably the trickiest part - you could try looking up some
problems from competitions just prior to the interview, or you could perhaps
take a real problem you are trying to solve at that point in time (abstracted
as required).

~~~
brianm
I have done this before, usually if I am working on something interesting, and
for which I don't like any solution I have yet hit on. In this case, I do the
best I can to describe the problem, thinking thus far, etc, and turn the
interview into a design session at the whiteboard.

I can only recommend doing it with a candidate you are more selling to than
evaluating, and whom you think will be a good fit.

A strong candidate will jump in and you'll have a great session (I had one in
particular with a great young erlanger whom we wound up losing to someone
else, but the interview left a lasting impression and we kept in touch long
afterwards).

------
lrm242
These types of questions are terrible interviewing techniques. An interviewer
who, even jokingly, goes into the room with an interrogative mindset is
fundamentally failing at the core purpose of interviewing: finding good
talent, preferably relatively undervalued talent. An interviewer should not be
looking for someone who mirrors their way of thinking or their approach to
solving a problem. They should be looking for people with skills that
complement the team, not duplicate it. Questions like these are typically
signs of bad interviewing technique, but not always depending on how they are
presented.

More often than not, however, people "design" interview questions to cause
problems for prospective employees. You might think this is clever. You might
think this helps weed out people who don't possess the ability to think
quickly or react to feedback on their feet. All you really are doing is
artificially limiting the prospective candidate pool.

Every dork with a chip on their shoulder can manufacture a question to be
unanswerable. Deep down inside what most of these bad interviewers really want
is to design a question that creates one of two atmospheres in the interview
room: (a) a submissive one, where the candidate is forced to placate the
interviewer and reinforce their apparent intellectual superiority; or, (b) a
confrontational one where the candidate and interviewer battle it out for
dominance as alpha-geek. Either (a) or (b) can be seen as a positive depending
on the type of interviewer that goes in. Ultimately, any interviewer that uses
this technique is a bad one and any input they make into the hiring process
should be seriously discounted.

The interviewer should adapt their interview style to the candidate based on
reading their body language, communication skills, and overall demeanor in the
interview room. It's the interviewer's responsibility to create the best
interview experience possible. Some people are nervous; some don't think on
their feet well; most rely on Google heavily to point them in the right
direction; lots of them know more than they can demonstrate in the 5 minutes
you give them to react to your stupid question.

Your job is to find diamonds in the rough. People who are the best, not people
who act, think, behave, or answer the way you think they should. Questions
like the one in this blog post are tell-tale signs of alpha-geek interview
dominance gone wild, and out of that you'll typically only get one style of
candidate that makes it through the gauntlet, certainly guaranteeing you're
not getting the best talent available.

~~~
jemfinch
I don't understand your objection here.

The question presented is a simple, open-ended question with a number of
acceptable solutions. It's not even remotely "unanswerable". It's at least
mildly interesting as indicated by the number of responses at a similar
question on StackOverflow:
[http://stackoverflow.com/questions/2573653/given-a-1tb-data-...](http://stackoverflow.com/questions/2573653/given-a-1tb-data-set-on-disk-with-around-1kb-per-data-record-how-can-i-find-dupl/2578319) . It's not
far from the problems Ning solves. It's not an "aha" problem that you either
get or you don't get. It's complex enough to reveal the candidate's thought
process and problem solving methodology, but it's small enough for good
candidates to get to a reasonable solution in the 45-60 minutes of an ordinary
interview.

Nothing about the article indicates that the author has a chip on his
shoulder, nor that he's a dork who wants to "battle it out for dominance as
alpha-geek."

Nothing about the article indicates that the author is looking for a specific
solution reflective only of his particular prejudices in programming.

What in the world is the basis for your entire diatribe here?

~~~
lrm242
Seeing as you worked at Ning and likely participated in these types of
interviews it's not surprising you don't get it. Surely Ning was a great place
to work and I'm sure you found really top-notch employees through your well
known tough hiring process, but I'd be willing to wager they all look very
much the same when you blur your eyes a little bit.

My diatribe, as you so eloquently put it, is based on the article. The list of
things that the author expects to see in the answer. The way the question is
presented. The lead off with "interrogation". The "gotcha" follow-up regarding
how things might fail. These questions are great at finding a very specific
type of candidate, but the _entire_ point of my comment is that this one
candidate type is typically not indicative of the best engineer. Of course
you'll disagree, I'm sure, seeing as you were part of this process.

~~~
brianm
Asking how it will fail is a gotcha? To my knowledge and experience thus far,
all systems fail. The question is "given this one, that we now have a good
shared understanding of, which you designed with the tools you are most
familiar with, how is _it_ going to fail?"

Not one candidate I can think of has been either surprised by this question,
or felt stuck, fwiw. Again, I may be completely self delusional, but I hope
not!

------
barrkel
I'm interested in why he's so against the "custom solution". Almost everything
a DB will try and add in for key-value lookup is predicated on the idea that
the requests aren't randomly distributed. The DB index will probably be based
around b-trees, which will do a logarithmic-time search for the top few levels
cached in RAM, but will fall over fairly miserably with multiple seeks as it
has to page in leaf nodes and likely the nodes a level above those: I don't
see it as realistic to expect a DB to do it all in a single seek.

On the other hand, if you have a perfect hash function for the keyspace into a
hash addressing space (you'll want something a bit larger than 2^32), you
could simply choose the device with the first few bits, then do a single seek
into a (large) file on disk with the position indicated by the remaining bits
of the hash addressing space, shifted over appropriately for the 128-byte
stride of values. You don't even need a perfect hash: with a merely good hash
function, you can cheaply read in more than 128 bytes from disk (OS will
probably be reading in 4K anyhow), and use open addressing with linear
probing.

You'll need enough devices to spread the load to get good latency for random
seeks; probably a combination of mirroring and what amounts to sharding, using
a portion of the key hash to select the device. But that's a relatively cheap
numbers game.
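
For illustration, the addressing scheme described above might look roughly like the following. This is a minimal sketch, not a tuned implementation: it assumes SHA-256 as the "merely good" hash, 4 devices selected by the top 2 bits, and 2^33 slots in total. The names `locate`, `probe_offsets`, `DEVICE_BITS` and `SLOT_BITS` are hypothetical.

```python
import hashlib

DEVICE_BITS = 2    # assumption: 4 devices, picked by the top bits of the hash
SLOT_BITS = 31     # assumption: 2^31 slots per device, 2^33 slots in total
STRIDE = 128       # bytes per slot, the stride mentioned above

def locate(key):
    """Map a key (bytes) to (device index, byte offset in that device's file)."""
    digest = hashlib.sha256(key).digest()
    # Keep only as many leading hash bits as we can address.
    h = int.from_bytes(digest[:8], "big") >> (64 - DEVICE_BITS - SLOT_BITS)
    device = h >> SLOT_BITS                 # first few bits choose the device
    slot = h & ((1 << SLOT_BITS) - 1)       # remaining bits choose the slot
    return device, slot * STRIDE            # shift over for the 128-byte stride

def probe_offsets(key, max_probes=32):
    """Yield the home slot and its neighbours, as linear probing would visit them."""
    device, offset = locate(key)
    file_size = (1 << SLOT_BITS) * STRIDE
    for i in range(max_probes):
        yield device, (offset + i * STRIDE) % file_size
```

With these numbers, 32 consecutive slots cover 4K, so one 4K read from the home offset would usually contain the whole probe sequence (apart from wrap-around at the end of the file).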

But I'm not really seeing what a DB buys you here, in this specific scenario.

EDIT: Further investigation of tc and cdb (as linked in the article) suggests
that tc (Tokyo Cabinet) may work in its hash form, but I'd rule out cdb for
using two seeks, which I would expect to double the number of devices
required.

~~~
brown9-2
I would imagine that for a certain number of candidates, they answer with a
custom solution because they believe this is what the interviewer is expecting
to hear (and asking for) when the question contains "you need to design,
build, and deploy...". I think to some people, the immediate thought is that
the interviewer would deduct points for originality if you said "well let's
just use this tool off of the shelf".

This is why interviews are so tricky - it's hard as a candidate to juggle
answering the question and trying to decipher what traits the interviewer is
really trying to expose with this question.

For example with this one: is the interviewer trying to see if I understand my
data structures and the size of this data to the point where I could design
such a system from the ground up? Or are they trying to see how practical I
can be given real-world time constraints?

~~~
brianm
I agree, and I try to make clear that I really want to understand how the
candidate would approach it -- if they genuinely would try to find a perfect
hash and keep a hash -> offset map in memory, then mmap a big data file, that
is fine. I _will_ ask how they will find the perfect hash, etc. If you can
describe well the approach you take, then this is fine.

Ofttimes, though, candidates _have_ started designing a general purpose on-
disk hash database. That is fine and dandy, and we always need a better
database, but it is pretty orthogonal to the immediate problem.

Actually, ofttimes engineers have implemented general purpose on-disk hash
databases as part of j. random project, so that path certainly _does_ occur on
real projects :-)
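
As a rough illustration of the "hash -> offset map in memory, then mmap a big data file" approach mentioned above, here is a small sketch. It makes several assumptions: a plain Python `dict` stands in for the perfect hash (it would not fit in memory for billions of keys), values are padded to a fixed 128 bytes, and the helper names `build`, `lookup` and `demo` are hypothetical.

```python
import mmap
import os
import tempfile

RECORD = 128  # assumption: fixed-size, zero-padded values

def build(path, items):
    """Write values back to back and return an in-memory key -> offset map."""
    offsets = {}
    with open(path, "wb") as f:
        for key, value in items:
            offsets[key] = f.tell()
            f.write(value.ljust(RECORD, b"\0"))
    return offsets

def lookup(mm, offsets, key):
    """One in-memory probe for the offset, then one slice of the mapped file."""
    off = offsets[key]
    return mm[off:off + RECORD].rstrip(b"\0")

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        offsets = build(path, [(b"alpha", b"one"), (b"beta", b"two")])
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return lookup(mm, offsets, b"beta")
    finally:
        os.remove(path)
```

Calling `demo()` builds a two-record file and reads one value back through the mapping. The interesting interview follow-ups (how the perfect hash is found, what happens when the file is larger than RAM) are exactly the parts this sketch leaves out.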

------
msie
Sigh, I'd rather start my own business than run the gauntlet of interview
questions again.

~~~
enneff
Really? I found my interview experience at Google a lot of fun. I would
happily do it again.

------
seiji
When I see "the data never changes" <http://cmph.sourceforge.net/> springs to
mind

~~~
brianm
Oo, very nice. This is going to get explored. Thank you for the link!

------
DannoHung
I'm confused... why specify a 1 TB _disk_? It doesn't seem that the problem's
solution has any dependence on the aforementioned disk aside from the data it
contains unless I'm misreading how he's explaining the answer?

~~~
ntoshev
Yeah, I also thought he expected to serve it all from a single disk. Hopefully
in real life it gets clear in the discussion.

It's a problem when asking interview questions: there are many ways in which
the question may be unclear and the interviewer is misled that the applicant
can't answer. Another problem may be that the interview interferes with the
thinking style of the applicant: you try to discuss it with him while he needs
time to think by himself, or just the opposite: he thinks better in a
discussion and you stay quiet while he sweats in stress.

~~~
user24
> there are many ways in which the question may be unclear and the interviewer
> is misled that the applicant can't answer.

yeah, when I was interviewed for my university, they asked me to design an
algorithm to work out the n'th term of the Fibonacci sequence. They
repeatedly said I didn't need to use recursion or loops.

When I finally gave up, they said: well here's one way:

f(n) = f(n-1)+f(n-2)

I just looked at them and quietly said "well... yeah... I mean... obviously".
By that point I was too depressed to say "but that's frickin' recursive,
wtf?!?" but that's what I was screaming in my head.

Amazingly I still got offered a place.

edit: still bugs the crap out of me. Did I mis-hear them? Were they being
deliberately misleading? Was it some kind of test to see how I reacted to an
impossible problem[1]? Were they really expecting me to come up with some
mathematical algorithm to calculate the nth term without recursion or looping?
Argh!!!

[1]: actually it's not impossible:
[http://mathworld.wolfram.com/BinetsFibonacciNumberFormula.ht...](http://mathworld.wolfram.com/BinetsFibonacciNumberFormula.html)

but I don't have a background in mathematics and they knew that.
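
For reference, the closed form linked above (Binet's formula) can be evaluated without recursion or loops. A minimal sketch, assuming the convention F(0) = 0, F(1) = 1 and ordinary floating-point arithmetic; the function name `fib_binet` is made up for this example.

```python
import math

def fib_binet(n):
    """n-th Fibonacci number via Binet's closed form, with F(0)=0 and F(1)=1."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2              # the golden ratio
    # The second term of Binet's formula is always below 1/2 in magnitude,
    # so rounding the first term gives the exact integer for small n.
    return round(phi ** n / sqrt5)
```

Because it relies on floats, this is only trustworthy for small `n` (roughly up to 70 with double precision); beyond that an integer-based method is needed.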

~~~
ntoshev
They were likely trying to help you and wanted to say they don't care whether
you write an iterative or recursive solution.

~~~
user24
I'm pretty sure I asked for clarification several times... It's possible I
suppose.

Here's the problem it started from:

Eleven lily pads are numbered from 0 to 10. A frog starts on pad 0 and wants
to get to pad 10. At each jump, the frog can move forward by one or two pads,
so there are many ways it can get to pad 10. For example, it can make 10 jumps
of one pad, 1111111111, or five jumps of two pads, 22222, or go 221212 or
221122, and so on. We'll call each of these ways different, even if the frog
takes the same jumps in a different order. How many different ways are there
of getting from 0 to 10?

(source:
[http://www.comlab.ox.ac.uk/admissions/ugrad/Sample_interview...](http://www.comlab.ox.ac.uk/admissions/ugrad/Sample_interview_problems))
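
The connection to Fibonacci is that the last jump onto pad n comes either from pad n-1 or from pad n-2, so ways(n) = ways(n-1) + ways(n-2). A small sketch of that counting argument (the function name `count_paths` is made up for this example):

```python
def count_paths(pads):
    """Number of ways to get from pad 0 to pad `pads` with jumps of 1 or 2."""
    ways_prev, ways_here = 0, 1        # ways to reach "pad -1" (none) and pad 0 (one)
    for _ in range(pads):
        # Each new pad is reached from the pad before it or the one before that.
        ways_prev, ways_here = ways_here, ways_prev + ways_here
    return ways_here
```

For the problem as stated, `count_paths(10)` gives 89.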

------
codeslinger
Why have SSDs killed this question? Seems like you could tweak the amount of
data on disk and/or the size of the values and/or the requests/second
requirement and keep using it. Also, there are no 1TB SSDs available at this
point, so you'd still have to assume spinning platters for this, no?

~~~
drv
I haven't thought about it deeply, but my first guess would be that the seek
time of hard disks is what makes serving 5000 requests per second difficult;
consumer-grade hard disks can perform something in the range of a few hundred
(randomly distributed) IOPS at best due to seek time. SSDs make seeks (almost)
free, so even one decent SSD should be able to service 5000 requests per
second, assuming you can get a big enough one.

If multiple storage devices (hard disks or otherwise) are allowed, RAID (or
just splitting the data across multiple disks) would ameliorate the seek
problem somewhat, since seeks can then happen in parallel. I wouldn't see this
problem as requiring a distributed solution, unless the bottleneck is the bus
or storage controller rather than the single-disk seek performance.
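
The back-of-envelope arithmetic here is easy to write down. The sketch below assumes seeks are the only bottleneck; the per-device figures (250 random IOPS for a hard disk, 45K for an SSD) are illustrative assumptions rather than measurements, and `devices_needed` is a hypothetical helper.

```python
import math

def devices_needed(requests_per_sec, iops_per_device, seeks_per_request=1):
    """Devices required if random seeks are the only bottleneck."""
    return math.ceil(requests_per_sec * seeks_per_request / iops_per_device)

hard_disks = devices_needed(5000, 250)      # assumption: 250 random IOPS per hard disk
ssds = devices_needed(5000, 45000)          # assumption: 45K random IOPS per SSD
```

Under those assumptions, 5000 single-seek requests per second need 20 hard disks or one SSD, and every extra seek per lookup multiplies the hard-disk count accordingly.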

~~~
xtacy
Multiple disks, across different physical servers (instead of RAID) would be
good too.

You could also try increasing the throughput by having an in memory cache. The
index/hash table can also be in main memory. Depending on the keys, locality
can play a crucial role in prefetching data.

------
aphyr
Goshdarnit. I just sat down and started talking to myself about hash
functions, filesystems, the wire protocol, and partitioning scheme for _half
an hour_ instead of making dinner. Nerd sniping?

~~~
pjscott
Why fret about the wire protocol? ZeroMQ will send messages plenty fast
enough, and it's really very simple to use; just a few lines of code.

Damn you, now you've got me doing it. Nerd sniper. ;-)

------
nirajr
I love to ask something that is simple to start with, and then build on it
interactively with the candidate.

One of my favorite questions is - Give me a normalized database structure that
you'll implement if you were to build gmail - incorporate conversations,
messages, multiple message participants and labels.

Then, depending on the candidate, I build upon the question, and go into
various optimizations possible, the ways caching would be implemented,
sharding/splitting/de-normalization would be done with load, etc. etc. With
good candidates, it's always a very interesting discussion. And even the bad
ones can leave the interview thinking they know something and don't need to
feel bad :)
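
To make the starting point of the Gmail question concrete, one possible normalized layout is sketched below using SQLite. It is only one of many reasonable answers, and every table and column name is an assumption made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users         (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
CREATE TABLE conversations (id INTEGER PRIMARY KEY, subject TEXT NOT NULL);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    sender_id INTEGER NOT NULL REFERENCES users(id),
    body TEXT NOT NULL,
    sent_at TEXT NOT NULL
);
-- Multiple participants per message: one row per (message, recipient, kind).
CREATE TABLE message_recipients (
    message_id INTEGER NOT NULL REFERENCES messages(id),
    user_id INTEGER NOT NULL REFERENCES users(id),
    kind TEXT NOT NULL CHECK (kind IN ('to', 'cc', 'bcc')),
    PRIMARY KEY (message_id, user_id, kind)
);
-- Labels belong to a user and are attached to whole conversations.
CREATE TABLE labels (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES users(id),
    name TEXT NOT NULL,
    UNIQUE (owner_id, name)
);
CREATE TABLE conversation_labels (
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    label_id INTEGER NOT NULL REFERENCES labels(id),
    PRIMARY KEY (conversation_id, label_id)
);
"""

def make_db():
    """Create the sketch schema in a throwaway in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db
```

The follow-up discussion (caching, sharding, de-normalizing under load) would then be about which of these joins become too expensive first.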

~~~
ericflo
This. A million times this. I do the same thing starting with a simple kernel,
but keep growing it, throwing in monkey-wrenches, see how the candidate
reasons and adapts.

Good candidates will start recognizing and acknowledging trade-offs, whereas
bad candidates will settle on one clear path.

Good candidates will speak about the ideas they're trying to get across,
whereas bad candidates will drop into talking about specific brands of
technologies.

~~~
nirajr
Definitely.

One candidate when faced with some questions on how he'll handle load said
he'll switch to Oracle :) Promptly asked to leave.

Good ones clearly show that they have the ability to deal with multiple tracks
of ideas, know that trade-offs need to be taken care of, and in general approach
things believing that, with some ingenuity, almost all problems are solvable.
That's the attitude I (and I guess you too) look for.

------
pepsi_can
I was unsure on how to answer this question since most of my experience is in
application development.

I'd like to not be embarrassed when asked this question. What advice would you
give someone wanting to learn enough to answer this question?

Are there any projects, books or other resources one could undertake?

~~~
wazoox
Jon Bentley's "Programming pearls", definitely. This problem is somewhat
comparable to the first "pearl" from this book. Anyway one of the best ever
programming books :)

------
bialecki
I like this question, but I'm wondering what a good interview looks like if
someone is completely blind-sided by the question because they don't have a
lot (or any) systems experience? I'm imagining an engineer whose experience
ends with MySQL so knowledge of stuff like tcb is a definite no. Maybe that's
not an issue, because they wouldn't be interviewing for this job if that was
the case.

~~~
derwiki
... not an issue if the questions are relevant to the job. On the other hand,
recruiting is hard and it's easy to let a great candidate slip by for any
number of reasons.

------
tbrooks
Ning cut 40% of its staff in April. Why are they hiring new outside
developers?

~~~
jemfinch
Because some of us went on to better places :) Ning was recognized for its
selectivity in hiring; though it did come as a shock, everyone that I knew in
my short time there landed on their feet at similar or better operations
(Google, Amazon, Etsy, etc).

In the intervening months since the layoffs, there's also been some attrition
(as there always is) in those who weren't laid off. I know of at least two
people who weren't laid off, but left for Facebook shortly after the layoffs.

I also know that they've already hired back at least one of my former
coworkers since the layoffs.

------
grammaton
This guy is taking a lot of heat for bad interviewing technique, and this may
or may not be the case - but it's hard to say, given that we don't know what
kind of a position he's interviewing for. Given the job requirements of the
position, this may be a perfectly reasonable question - i.e. if the job
requires someone to build and maintain large custom data stores, it's pretty
acceptable practice to ask questions that reveal how much they know about
large custom data stores. Personally I wouldn't be over the moon crazy about
this question, but it's still a hell of a lot better than being peppered with
random "what is X?" type questions that test nothing about your ability as a
developer, and only what you currently directly know. That is to say, those
questions where you end up looking like an idiot if you didn't happen to read
the same man pages as the interviewer. In the types of questions this guy is
asking, you have to basically guess what they're looking for - so it can end
up being a bit of a mind reading exercise, which is why I wouldn't be terribly
crazy about it - but it's still by far the lesser of several evils, simply
because it's a direct demonstration of how a developer thinks about
development, instead of how much rote minutiae they've mastered or whether they
can find a way to stick eight kittens into six holes without any broken legs
or flying fur, etc....

~~~
metageek
> _In the types of questions this guy is asking, you have to basically guess
> what they're looking for_

Don't guess, ask. Having asked questions like this before, I can assure you
that a reasonable interviewer will actually be pleased that you understand the
alternatives.

(And, of course, if you get an _un_reasonable interviewer, you probably don't
want to work there anyway.)

------
jtchang
I've never been asked this question in an interview so when I saw it I started
thinking about how I would answer it.

The first thing I did was work out the actual bytes/sec (throughput) needed
for this many requests. Also given the 2 week timeframe I would question
really hard what the data set looks like. Since the requests aren't randomly
distributed, does that mean you can narrow it down to a small enough portion to
fit inside RAM?

I think the question is a good one because it does touch on a lot of aspects
of infrastructure and systems design.

~~~
follower
> Since the requests aren't randomly distributed

Does this follow when the original article says:

    
    
      "The keys are not distributed evenly within the keyspace."
    

and:

    
    
      "Requests are distributed evenly (and randomly) across the keys"

------
olegkikin
Easily solved with SSDs. Sort the keys, value is stored right after the key.

On each request do a binary search. It takes 32 seeks in the worst case.

Even current consumer level SSDs can handle 45K IOPS, so you'd get 1400
complete searches/sec from one SSD.

You might need to spread your data over 4 SSDs, but you can search in
parallel, so it will even be faster.

If you want more speed, replicate your drives and load balance.

There are better ways (with perfect hash functions), but this is the easiest,
and requires no additional storage.
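
A small sketch of the sorted-records idea described above, assuming fixed-size records with the value stored right after the key. To stay readable it uses tiny 4-byte keys and values and an in-memory `io.BytesIO` object as the "disk"; `search` and `make_store` are hypothetical names.

```python
import io

KEY = 4                 # assumption: tiny 4-byte keys so the demo stays readable
VALUE = 4               # assumption: 4-byte value stored right after each key
RECORD = KEY + VALUE

def search(f, n_records, key):
    """Binary search over sorted fixed-size records; returns (value, seeks)."""
    lo, hi, seeks = 0, n_records, 0
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD)            # one seek per probe
        record = f.read(RECORD)
        seeks += 1
        k = record[:KEY]
        if k == key:
            return record[KEY:], seeks
        if k < key:
            lo = mid + 1
        else:
            hi = mid
    return None, seeks

def make_store(n):
    """Build an in-memory 'disk' of n sorted records mapping i to 2*i."""
    data = b"".join(
        i.to_bytes(KEY, "big") + (2 * i).to_bytes(VALUE, "big") for i in range(n)
    )
    return io.BytesIO(data)
```

The number of probes grows with log2 of the record count, which is where the figure of about 32 seeks for roughly 2^32 records comes from; at 45K IOPS that works out to 45,000 / 32, or about 1,400 lookups per second per device.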

~~~
caustic
In your case it would be better to use interpolation search[1][2] instead of
binary search.

[1] <http://en.wikipedia.org/wiki/Interpolation_search>
[2] <http://sna-projects.com/blog/2010/06/beating-binary-search/>
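
A minimal sketch of interpolation search over a sorted list of integer keys, to show how it differs from binary search: instead of probing the middle, it guesses the position from the key's value. The function name is made up for this example, and the technique only pays off when keys are spread fairly evenly, which may not hold for the data set in the article.

```python
def interpolation_search(keys, target):
    """Return the index of target in a sorted list of ints, or -1 if absent."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            # All remaining keys are equal; avoid dividing by zero.
            return lo if keys[lo] == target else -1
        # Guess the position by assuming keys are evenly spread between lo and hi.
        pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```

On evenly spaced keys the first guess is often right or very close, which is why it can beat binary search's log2(n) probes in that case.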

------
mhewett
Expecting a good answer to this question in 30 minutes without any warning is
totally unreasonable. Expecting anyone to implement a deployable,
generalizable, tested solution to this in two weeks is totally ridiculous. I
wouldn't work at a place that asked questions like this or made development
calendars with schedules like that.

~~~
parbo
I did this in three weeks:

<http://bitbucket.org/rogueops/vinzclortho/wiki/Home>

I worked on it on average two hours per day. It's not that ridiculous to
believe that it would be (near) production quality in two weeks of fulltime
work.

~~~
mhewett
It's not hard to get "near production quality". It is hard to get to
production quality.

------
ckoenig
Hmm, my first thought was ... well ... WTF. Then I thought: man, I'd like to
work for this corp., but alas I think I'm too stupid for that. Some moments of
reflection later, I'm now convinced that I don't want to work there at all.

The question is somewhat similar to "give me the _fastest_ sorting algorithm
and prove that it is". Everyone with a CS degree or something similar will know
this proof and Qsort from their very first years, and I claim almost everyone
will have forgotten the proof. (I can remember that you can prove this by
looking at the "decision" tree based on the one-to-one comparisons, which will
have a depth of about the needed size ;) - but that is it.) I think most people
will get the proof (even without Google or a wiki) in a reasonable time (some
hours?), but within interview time and in this situation? In my case, no way.
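
The half-remembered argument goes roughly like this: a comparison sort has to distinguish all n! possible orderings, and a binary decision tree of depth d has at most 2^d leaves, so the worst case needs at least log2(n!) comparisons, which grows like n log n. A tiny sketch that evaluates that bound (the function name is hypothetical):

```python
import math

def min_comparisons(n):
    """Lower bound on worst-case comparisons for sorting n items: ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))
```

For example, sorting 3 items needs at least 3 comparisons in the worst case, and sorting 10 items needs at least 22.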

------
warmfuzzykitten
It's everybody's favorite interviewing technique: Ask a fairly esoteric
question that you are thoroughly familiar with and sit there and feel superior
while the candidate struggles to get to an answer you will accept. Three kinds
of candidates do well: those who are smarter than you are and spit out answers
that surprise you; those who have heard the question and act like they're
thinking it through for the first time; those who are unfamiliar with the area
but have a personality you find congenial, so you coach them to a solution. In
the end, it's all about ego.

------
VladRussian
Using a general-purpose database in such a case is just an approximation of a
solution, i.e. it works if the database supports an index-optimized type of
table and is smart enough to cache index header blocks quickly enough.

Because this case is so simple, a custom solution would work better here than
a general-purpose DB. Sort the pairs by key and place them in a
tree, with the root node containing 2^11 keys (512K), second level - 2^11
nodes of 2^11 keys each (512K each node size), 3rd level - 2^22 nodes of 2^10
keys (256K each node size). Keep root and all the second level nodes always in
memory (1G RAM required). Given a key to find, look it up through the root and
the second level nodes in memory - this would result in the index of the 3rd
level 256K node to be read from disk. Time-wise reading a 256K block from disk
is just one IO - pretty much the same time as reading 1 byte.
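
The sizes in this layout can be checked with a few lines of arithmetic. The sketch below assumes each entry occupies 256 bytes (which is what makes 2^11 keys come to 512K) and that keys are addressed by their rank in sorted order; `path` is a hypothetical helper.

```python
RECORD = 256                           # assumption: bytes per entry
ROOT_KEYS = 2 ** 11                    # keys in the root node
L2_NODES, L2_KEYS = 2 ** 11, 2 ** 11   # second level
L3_NODES, L3_KEYS = 2 ** 22, 2 ** 10   # third level, read from disk

total_keys = L3_NODES * L3_KEYS
root_bytes = ROOT_KEYS * RECORD
l2_bytes = L2_NODES * L2_KEYS * RECORD
l3_node_bytes = L3_KEYS * RECORD

def path(rank):
    """(root slot, slot within the level-2 node, level-3 node index) for a key rank."""
    leaf = rank // L3_KEYS             # which 256K node must be read from disk
    return leaf // L2_KEYS, leaf % L2_KEYS, leaf
```

With these constants the tree covers 2^32 keys, the root is 512K, the whole second level is 1G, and each third-level node is 256K, matching the figures in the comment.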

5000 lookups/sec at 1 IO/lookup - 20 disk array of 250 IO/sec disks - 15K ones
would do it. 2 arrays of 400GB disks is still ~2 times cheaper than 1 TB SSD
array. RAID5(6)-ing the disks will leave us with 6TB. Short-stroke the first
1TB (70Gb x 20) to get much better speed than 250 IO/sec (or just use less
than 20 disks to start with) and enjoy the rest 5TB for non-IO intensive
purposes.

------
follower
I'm trying to work out if this is a thinko on my part or a typo on your part:

First you write:

> a system that lets you look up the value for a given key

and then later on you write:

> if they think the problem of looking up a key by value has been solved
> before

Did you mean to say "looking up a value by key" or am I misunderstanding it?

(Given that no one else has commented on it makes me question my question but
I'm trying to get over my concern of being wrong on the internet and asking to
be sure. :) )

~~~
brianm
The second one is a typo on my part :-/

~~~
follower
Thanks for clarifying that.

------
sagarun
These days I read "The Guerrilla Guide to Interviewing" by Joel before
attending/conducting an interview.
<http://www.joelonsoftware.com/articles/fog0000000073.html> . The OP's
question doesn't have a place there :-)

~~~
ericlc
Who is Joel?

~~~
die_sekte
Joel Spolsky. Former Microsoft software engineer, now has a company that makes
FogBugz. One of the great software opinionators, like Paul Graham.

------
pdamerval
Hello, I would probably make a very poor candidate, but I wanted to ask this -
the question gives a 1TB hard drive, and fills it literally to the brim with 4
billion sets of 256 bytes. And then you say that 4 or 5 years ago this relied
on some sort of distributed system, but now it can be done on one box. What
changed? If the drive is one terabyte, and is used to store exactly one
terabyte of data, aren't you going to need to put your database overhead
somewhere? Am I completely misunderstanding the question? I find it
interesting though. It's an easy problem without the storage space limitation.
With it, it's fascinating. Are we thinking to compress the values and/or the
keys? Or am I off the mark?

------
josephyeo
It seems to me that the problem could actually be solved by using a 64GB SSD.
Given that the storage is 1TB and each of the keys and values are 128Bytes
(and hence the disk is full to capacity), and assuming the keys are unique,
which it would need to be to have all the keys unique, then we can safely say
that every possible value in the 2^32 space is used. Therefore, we could read
the data from the 1TB disk, and for each key we find, use the value of the key
as the address of a memory location on the 64GB SSD. Net effect is we
essentially remove all the key info from the hard disk and compress the
storage into 16GB. ie, no need for the hard disk, just a 64GB SSD.

------
expeditious
If you have more data than, say, GDBM, can swallow in one bite, how could you
break this data apart and then find the right chunk amongst multiple servers
when given a key?

Also, what does the interviewer mean by this:

> I generally ask these folks if they think the problem of looking up a key by
> value has been solved before, especially given the two weeks to be live in
> production requirement.

Is that a typo? s/key by value/value by key/ ?

------
uwittig
I'm in recruiting myself. If I asked questions like this, we would be
out of new employees very soon. Questions like this only intimidate the
applicant and are much too complex for interviews. Also you seem to have the
idea that you know the only "truth". This is ridiculous.

------
gawker
Great thought question. Being a junior developer, I haven't the slightest clue
on how to answer it. Having been exposed to merely algorithms in school, would
it be safe to assume that this wasn't really targeted at software engineers
but more towards system designers?

------
kplcjl
Sorry, I'm not a hacker, I wouldn't know the first thing about setting up an
interface to a hard drive. However, the first thing that pops out at me is
that you just described a 512 GB addressing scheme. You want half your disk to
be unaddressable?

------
myelin
Hah, I remember Andrey asking me this question in my Ning interviews :)

------
jhuckestein
My favorite: "Write a script that will save you one minute of time every day"

closely followed by

"Draw a picture of your job/Facebook/the future /whatever"

~~~
swaits
A picture? Are you hiring artists? Seriously. What do you get out of that?

~~~
jhuckestein
A picture is a useful abstraction of reality. It forces people to think about
what they'll 'answer' before they do. Some people might draw a diagram, some
might draw a story, who knows. It tells a lot about people and how they think.

For some people this may seem like a fun task and they'll get excited about
it, some may not find it useful and argue about it and others (the ones you
want to avoid) will frown upon it and say 'Fine. I'll do it anyway'.

Most of the people I've interviewed are well accomplished engineers, but that
doesn't always mean I want to work with them.

Why do you think it is NOT useful to draw a picture?

~~~
jemfinch
I think it's not useful because you're not hiring me to draw pictures. I'd
pass on any interview where crayon drawings of my future had any impact
whatsoever on the likelihood of my hire.

------
clarrs
I would walk out of this interview!

