Hacker News new | comments | ask | show | jobs | submit login
A Google Interviewing Story (paultyma.blogspot.com)
201 points by ramanujam on Nov 19, 2010 | hide | past | web | favorite | 116 comments

I enjoy clever ways of approaching problems as much as the next guy, but I would never ding someone in an interview for not coming up with a clever-enough solution. Good software engineering is maybe 99.7% failure-avoidance and 0.3% cleverness. On very rare occasions you need a clever solution, but most of the time you need to solve the problem in a way that you're 100% sure will work, has no nasty failure conditions, and that other competent engineers will understand. If there's no other good solution, or if every bit/cycle matters, then you get to try to be clever, but that happens pretty rarely. I've seen way, way too many problems caused by people using clever solutions for problems that had straightforward-but-less-fun solutions. (And as has been pointed out plenty of times already here, the clever solution in this case is less optimal than a more straightforward one would be). If an interviewer seemed intent on proving they were more clever than me, or on trying to get me to throw out unnecessarily-clever solutions to straightforward problems, it would be a pretty big turnoff.

I doubt that the interviewer mentioned the prime number solution due to insufficient cleverness in the original solution. The answered O(n+m) solution actually contains flaws beyond its lack of cleverness. 1) there's no good reason to use a hash map, hashing is useful in maps for mapping an unevenly distributed set from a large an arbitrarily large space to an evenly distributed set from an arbitrarily small space. The space is only 26 characters so why hash? You can simply use a map (no hash needed), and in this case there's even a very natural way to get indices into an array (index(c) = c - 'a').

Now it's true, that these are both linear time solutions simply with different factors and we might simply say who cares. The thing about the prime solution is that it's actually really dumb itself and I think that's what Guy really wanted you to tell him. He was pointing out that you could leverage the smallness of your data and hoping the interviewee would come up with an even better way to leverage this fact.

The first thing we should notice is that asymptotic bounds are misplaced here if you consider how such a function might actually be used. The strings are basically being used as sets that is AA might as well be A as far as our answer is concerned. So assuming people aren't giving us stupid data (and in an interview you could mention a way to make this assumption true). The strings shouldn't ever have duplicates and thus shouldn't ever be bigger than 26 characters (otherwise they contain all of the characters and the answer is independent). So if a solution has runtime n + m + c, that c actually starts to matter.

The thing about the primes solution is that multiplication and modulo operations are actually very expensive and all you want to keep track of is whether you've seen a character (1 bit of information) so you should really use a bitmap for this. And since it so happens that 26 < 32 these will fit very nicely in a single int. It's even a pretty nice implementation: https://gist.github.com/707639

[edit: link to code]

With your solution, I might have not accepted you to do job because it is obvious you don't have experience with Unicode and localization.

that's outside the scope of the problem. (which is why I really dislike these kinds of tests, because you can't win - if you don't consider unicode the interviewer says "ahh, but what about unicode" and if you do the interviewer says "ahh, but that's outside the scope of the problem".)

The right way to approach this is to clarify the requirements. "Should I consider unicode?" will get you points with the interviewer, and will allow you to quickly clarify the scope of the question.

And I might have not accepted you for over-generalization that is one of the worst things ever you can do as a designer ;)

I don't think you read the post being discussed here.

> 15 seen &= mask(haystack);

Shouldn't it be

seen ^= mask(haystack);

And a personal change would be to define all as:

int all = ~(~0 << ('z' - 'a' + 1));

EDIT: For some reasons, asterisk doesn't appear before haystack.

Right you are, fixed now.

Yup, and you corrected my mistake as well. For some weird reasons, I typed ^ (xor) when I was actually thinking of | (or).

Agreed. The prime-number-division thing is a great answer... FOR SPOCK. Who wants to read code that does that kind of stuff... converting strings in to prime numbers? How about: code it in a reasonable, readable way, and come back and optimize it if and when it needs optimizations. People that code things in the cleverest way, even when it's not needed, drive me batty.

I'd far rather see a developer cross 10 items off the to-do list instead of cross 1/2 of an item off a to-do list, and get lots of style points.

It's good to see that a candidate has creativity to come up with solutions like this. But I'd want to temper that by also seeing if they have the good judgement to know when to use such things (almost never).

Okay, how was it a great answer? Unless I'm miscounting from all the jet lag, it's still O(n+m), only much more convoluted (not to mention slower, since you have a constant factor from all the multiplications/divisions). And seriously, primes? Why not powers of two?

Gah, so much fail in that answer, but the worst thing is that the guy proposed it as a better alternative, and the interviewee admired it!

Powers of two wouldn't work.

If A = 2^1, B = 2^2

AA, B would evaluate to true.

But you're right, the running times aren't any better, except that the constant factors of dealing purely in arithmetic might make it faster in real terms if dealing with a lot of this type of thing than creating hash maps. If response time matters for the application (high frequency trading, etc), it might make a difference.

Sorry, by "powers of two" I meant bitmasks, not actually dividing with powers of two. Let A = 2^1, Z=2^26, and then OR with 2^1 when you encounter an A, etc. If the integer is non-zero when you're done, you've exhausted all the characters in the original string.

This is exactly what I was thinking and I have no idea why it didn't come up. Given an "alphabet" of 26 characters, that neatly fits inside a long int, and that's just crying out for a bitmask.

Ahh, sorry, good answer. Much simpler than finding an arbitrary number of primes.

Powers of two wouldn't actually work there. Using primes lets you use division as a test for presence in the original set. The task is 'detect presence in the set', and integers are uniquely identified by their prime factorization. Powers of two would just give you another power of two. That wouldn't actually preserve the information you're looking for.

Consider an original string 'bb', and a test string 'c'.

With powers of two, you'd have 2*2, and your test would be a division by 4, which would be successful by having no remainder, indicating that 'c' is a subset of 'bb'.

If using powers-of-two, you'd bitwise-OR rather than multiply the values-per-character. Simpler op, and your accumulated value per string never rises (in the 26-letter case) above 2^26.

Yep, that's what I meant, thanks. Powers-of-two is just for getting the correct bits.

Upvote ^10 if I could.

The dinging people for not-clever-enough solutions feels like hazing. It's less about competence than about ego. The most important thing in an interview is to make sure the person's level of skill is at least as big as their conception of their skill.

The most important thing in an interview is to make sure the person's level of skill is enough to be your coworker. Their conception of their skill doesn't matter. Programmers who know they're bad programmers are still bad programmers.

Yeah, but a good programmer who thinks he's a great programmer will likely be a pain in the ass to work with.

Sure but you at least have a chance at helping them out and moving them up. To some extent is culture and happenstance. Sometimes you get off on the wrong foot and you're on each other's nerves. But attitude and work ethic are like multipliers that can go negative.

There are two kinds of cleverness: cleverness that makes things simpler and easier to understand, and cleverness that makes things more complicated and harder to understand. Often the former is the more difficult of the two, and also the mark of a better programmer. Moreover, experienced-but-average programmers will often decry cleverness in general, having observed too many cases of the latter and not enough of the former kind, and then when the former kind comes along, they have the same reaction, and this is bad.

In my life I never got interviewed, even if it is 15 years I work as a programmer. Now I work at VMware thanks to Redis, and VMware did not requested an interview, and my only previous not-self-employed position 10 years ago was likewise triggered by my work on hping.

But, I'm sure, I would suck so much at this kind of interviews. If you are anything like me you'll understand what I mean, in topics where I work day by day I've pretty much the control of what the good solution can be in a few minutes, but for many things to find the best solution requires, at least for me, days of thinking, sleeping, possibly waking up with the solution in mind, to find it's wrong and you need to reiterate the process.

My design abilities are all there, in this days. I'm sure that in the five minutes race I would say many times something of super stupid. Now my question is, are the five-minutes performances really linked to the three days thinking about your problem solution?

Isn't it possible that at least a subset of guys that will get the few-days answer well, will instead provide a poor answer in little time, and sometimes the other way around?

If this can be somewhat true, there is a huge industry selecting runners for 100 meters, in order to run, most of the times, a maraton.

The prime multiplication is a pretty bad solution. It's actually O(n log n) rather than O(n), since you have to use some form of big integer, and multiplying a size-n number by a constant is O(log n). It is also needlessly complicated.

Especially since they're using the primes to encode N = 26 independent variables within a single number. On any modern CPU, you already have 32 (or more) independent variables in each number, they just happen to be called bits. And you can test them and set them in much simpler and faster ways than integer multiplication and mod.

Yeah, seems like dividing ints would be a much more expensive operation than comparing them.

Yeah, I don't get the interest in this answer. It's neat-but-useless.

It's just one anecdote, but it also seems like the kind of thing that happens in companies that grow fast and have trouble maintaining quality in their hiring process. Instead of hiring true cleverness, they hire for someone who feigns the trappings of cleverness (smart enough to regurgitate a "clever" solution that turns out not to be practical in the real world.)

Can you explain what representation you use to get multiplication by a constant in O(log n)? Wouldn't it be at least O(n) for the standard bignum representations, making the algorithm O(n^2).

Multiplying n primes together creates an O(n) digit number. Multiplying an O(n) digit number by a constant is an O(n) operation. So essentially we're doing an O(n) operation n times, hence O(n^2).

The smartest representation that I can think of for the numbers would be to store the prime factor exponents in an array, which would essentially transform the algorithm into the array based version. For example 120 = 2^3x3x5 would be represented as {3,1,1,0,0,...}.

It's O(n) in the number of bits. The number of bits is O(log n) in the value of the number. Multiplying n primes together, where the primes are taken from a restricted set of numbers, creates a number whose value is O(n) (it has n factors, but each factor is bounded by a constant).

BTW, I would've used a bitvector. For lowercase letters only, you could fit it into a 32-bit value (add mixed-case and numbers and you can fit it into 64 bits), then it's just an OR to set a bit and an AND to read if a bit has been set, both fast machine-level operations.

Edit: Didn't see the footnote on the problem. A bitvector doesn't work if counts must be kept, but you can still do this pretty simply with the table solution.

So, the resulting algorithm would be O(n^2) not O(n) where n is the length of the string. The table based solution is exactly like keeping the exponents in an array.

i think it is a joke, you guys here already mentioned multiplication/division are expensive operations, so a simple hash just works well. and who care the complicated way if the simple way just makes the job done nicely. programming is an art.

My last internship in college involved working on thermostat systems for Honeywell with a bunch of people who had been there their entire professional careers. I didn't have much interest in becoming a lifer and ended up interviewing with a couple other companies, including Microsoft.

I interviewed for a Program Management position in the Visual Studio group. My first interview was with the Design Manager for the VS product line. Her final question for me was about building an effective temperature control system for a new house. I launched into a 5 minute analysis of the advantages and disadvantages of a wide range of HVAC systems, obvious ramifications from open floor plans, and so on.

A month later, we sat down for lunch as co-employees for the first time, and I told her the whole story. She got a big laugh out of it. I guess she didn't realize that's what I'd spent the last few months working on.

Your interviewer didn't read your resume?

She did, but I hadn't been super-specific about what I'd been working on.

Angry because neither the hash table or the prime multiplication would be as fast as a boolean array indexed by the char value. As an added bonus, the boolean array actually makes the most intuitive sense.

Even faster is to fill a 32-bit mask with "seen" characters for each string. AND them together and compare result to the mask from the second string.

Maybe. You need 256 bits for all the possible chars, and this solution is perhaps a little less intuitive. Also, the shift + and + cmpz is not necessarily going to be significantly faster than an address lookup + cmpz because of the latency of getting the array from memory or cache. Though it will use less cache space compared to the size of modern caches that's not enough of a win to matter much I would guess.

I was assuming a 26 letter alphabet. At that point the shifting, anding and comparing would take place in registers.

True, although if there are lower case letters too it could become a problem. At any rate, I'm not sure that the register aspect would improve speed significantly because of the latency of getting the string from main memory. You're still going to get pipeline stalls as you reach portions of the string that are not in the cache.

Yeah, the registers will only be faster than manipulating some sort of bool array. Reading the original string (with associated cache stalls) will certainly be necessary.

Doing it this way will use the optimal amount of memory (in the worst case) (assuming an O(n+m) answer). There are exactly 26 bits of state that need to be tracked (where "bit" is used in the information theoretic sense of the word). If you choose to store this state as 26 actual bits, then you win.

Multiplying together the first 26 primes requires that you can store a number up to 232862364358497360900063316880507363070. log2 of that number is about 127.5. So you need 128-bits of storage.

Essentially, the correct solution is to sort both lists, and then do a single pass through checking for unique items in one list. Once you have this solution, you can do some micro-optimizations: - note that the set of symbols is limited, so you can use a counting sort (or other non comparison based "sort") - note that the sorting function doesn't need to be an actual sorting algorithm, and may discard duplicates.

If you apply these two optimizations, you should end up with the bit-field approach (or maybe a slightly more memory hungry one that doesn't store stuff in individual bits... but algorithmically it doesn't bring anything new to the table).

For some reason, a minority of people are then thinking that a further optimization is to use a more inefficient data structure in the sorting algorithm for storing whether or not a letter has been seen. This has the effect of a more expensive read operation, write operation, more memory usage and provides no other benefits.

If you're marking used letters in a mask, why sort first?

It would depend on the hardware used.

Standard counter-intuitive example (from the game development domain) is how to check if an array of integers has a zero in it. The answer, when all things are considered, is to scan through an entire array making a simple arithmetic for each item and one comparison at the end rather than compare each element.

The same may happen with your example. Filling up 32-bit mask might be more expensive than to flip an element in a boolean array. That's not even considering that the array option allows for early termination of the second loop.

Early termination is the biggest advantage of the array version. The 32-bit mask is probably the fastest possibility since it will reside in a register.

The array is a pretty obvious solution (to me, anyways). But the hash table has an added benefit.

Consider the following string, where &#160; is a non-breaking space:

   Main string: "Counter example: &#160;_à²"
   Shorter string: "ಠ_ಠ"

Note: "Say you have one string of alphabetic characters"

The array of bools, bitmap (remember 32-bit int!), hash map; all are the same basic deal, easy to understand code, etc. Any would probably be fine, imho.

Hash map would be my initial implementation, though, if only because I don't know where it would be used, and the map will be immediately understood.

I come from a compiler background. When I hear "alphabet" I don't think a-z - I think of an arbitrary set of distinct symbols that can be arranged into a string.

Yes, but even in compiler land, you don't say "alphabetic" to mean a symbol taken from your particular alphabet (since that is practically a given... a symbol outside the alphabet might as well not exist).

The array is a pretty obvious solution (to me, anyways).

Well, I guess not everyone is an amazing genius like you.

I recently used this algorithm for speeding up an iTunes style search function. Originally I did a strstr over every item in the database, but it wasn't quite fast enough. I precomputed a 32-bit mask - 1 bit per alphabetic char, 5 bits per digits, and the last bit for special characters - for every database item, and used that as an initial search filter. I only needed to use the more costly strstr on items that made it through this filter.

That's pretty clever. It can be tempting in such a situation to immediately go for the big guns (patricia trees or whatever), but often a simple hack like that gives a lot more bang for the buck.

A Bloom filter (http://en.wikipedia.org/wiki/Bloom_filter) might work well in this case.


A Bloom filter is usually used where representation size is important, because it can be orders of magnitude smaller than a hashtable. It doesn't perform very quickly though, because of the hashing operations needed during insert.

A Bloom filter is great for P2P search applications for example. Then peers can pass Bloom filters around, and an initial search can happen locally. If it succeeds then the search can ask the remote host if the file actually exists. The frequency those remote requests will fail depends on the number of false positives the Bloom filter is configured to give (ie, a function of its size).

I haven’t tried a Bloom filter, but I think for my application, the simple bit array might give better performance. The nice thing about the bit array is it works great on the worst-case. After the user has typed in 3 or 4 characters of their query, the search space has been narrowed enough where the performance isn't an issue. It’s that first or second character when the search space is large that gives problems. Using the bit array in the single character case, the 1 to 1 mapping of characters to bits gives the result directly.

the boolean array indexed by char effectively is a hashtable isn't it? where the hashing algorithm is index = ord(theChar).

This is not quite true. It's a mapping, but it's not a hash table because the mapping is bijective (while hash tables just project the keys into the table space). So there is no possibility of collision, the number of keys is fixed, etc. It's just a table, like any other, and most of what is implied when talking about hash tables does not necessarily apply here.

that was a genuine question. I think I'm right, but I might not be. Judging by the upvotes it looks like I am, but I'd like someone to come forward and say "Yes, that's correct" explicitly.

Yes, that's correct.

thanks :)

Exactly. When your domain is restricted (characters or small ints), then why use a hashtable? Use an array. If initializing an array is a concern, use a smartarray.

Your anger is misplaced.

Characters can repeat, so a boolean array doesn't work.

Plus, the story says I mumbled awhile that given the characters were limited to alphabetic (his original specification) that I could use an array instead of a hashtable for some constant time savings but that was about it.

it doesn't matter if characters are repeated. Let's assume ascii characters, so 'A' = 65 in decimal. If you make an array, T[], where T[65] = 1 if 'A' exists and 0 if 'A' didn't exist in the first string, you can run through the 2nd string to check if T[65] was true (1) or false (0), telling you if the character in the 2nd string appeared in the first. Same thing as the hashtable. I think that's what was meant by boolean array.

I guess it depends how the question is presented. Often it is known as the "Ranson Question" (A set of magazines, a given ransom message: can you make the message from the magazines).

In this case the question says find out if all the characters in the smaller string are in the larger string, so yes, I think you are right - the repeats don't matter here.

sorry, i don't get how characters repeating affects things

If there are 3 As in the first string and 4 in the second, it fails. A boolean value can't count to 3.

Maybe in a different problem it would fail, but this is merely a presence test, so booleans are fine. At any rate, an array of int32s instead of bools would defeat the objection for reasonably long strings.

This is that different problem. The author's description is muddy, but you do need the counts.

Oh really? How much faster would that be?

As much faster as the length of the second string times the difference between an address lookup and a hash table lookup.

Which is how much? I wanted you to try and quantify the difference, because the process of doing so would lead to the flaw in your argument.

The hash table solution is O(n+m). 24 operations on the example. The array solution is O(n+m) and 24 operations. The same.

Your intuition tells you that the array is faster because subconsciously you're making assumptions about implementation details. What if you're in a language like PHP where arrays are implemented as hash tables? What if you're in C, but the hash table implementation uses a resizable array and happens to choose an initial size of 26?

Stories like this and the comments that follow just reinforces my belief that I am no where near smart enough to work with most of you guys.

Plus I'll never wear leather pants. That can't be comfortable.

Well, look at the bright side. You're now smarter 'by this question' - You can be rest assured that if you're ever asked this question, you'll be able to answer it.

If you are ever asked an interview question which you've already answered in a previous interview, you should tell the interviewer immediately.

I know some coworkers who will intentionally ask a question they know you were asked in a previous interview to test your integrity. (Edit: Not that I would condone this practice either.)

These types of interview questions are about evaluating how you think far more than what you know. So, more importantly than the risk of getting caught, if you recite an answer from memory and pretend that you're deriving the solution on the fly, you're lying to your future coworker.

I couldn't disagree more.

Where do you draw the line? What if you'd spent the previous two days reading about graph theory and in doing so had come across a neat network flow problem that co es up almost verbatim in the interview? What's the difference?

Interviews are a filter and not necessarily always accurate or fair. This can go both ways. You can have bad days when your brain freezes. You can have good days when you're asked something you know in your sleep. It doesn't really matter how you know it.

Besides just knowing the solution to something doesn't mean you can give the answer, discuss the solution and analyze other solutions, all of which may come up.

And somehow setting up a "trap" for a future colleague to walk right in to shows great "integrity".

How much are you ever deriving on the fly when answering programming questions? You aren't goint to reinvent computer science on the spot, you're going to apply the vast knowledge and experience you hopefully have. The problem in the story is so basic that I would assume anyone who got it right had run into a similar problem before.

A friend of mine interviewed at Microsoft sometime in the mid-90s. They offered him a job, but he declined it. They interviewed him again a few years later, and one of the questions had come up in his previous interview. He said "in fairness I have to disclose that I was asked this in my last interview -- and they must have liked my answer because they offered me the job." For some reason the interviewer was so flustered that the rest of the questions were all softballs :-)

Your mileage may vary.

> If you are ever asked an interview question which you've already answered in a previous interview, you should tell the interviewer immediately.

That's sort of like saying you won't shot a man in the back during a war.

Quite similar to the recently-discussed http://news.ycombinator.com/item?id=1922243 - where hundreds of students received a test they had already obtained from their textbook publisher.

No one came forward - the professor only figured it out after he saw the average mark on the test was one-and-a-half grades higher than normal.

This is right. In my experience, when you tell interviewers that you already know the answer to their favorite interview question (and can explain the answer) they are just as impressed as if you had solved it yourself, and you don't feel like you've deceived them.

In two interviews with G, separated by 5 years, the same 'hard' programming problem came up both times. I answered it badly the first time, then did a bit if research after that interview, and was fully ready to expound on it in the later one. It would be really embarrassing to get tripped up on the same question twice.

If you're future co-worker has on leather pants and is asking questions like that, all bets are off...

There is a limit to how open employee should be. Google actually wants employees to keep things to themselves and not to talk about internal stuff to outside world. 100% honesty would make prospective candidate leak internal Google stuff to the outside world. So it's unlikely that interviewers would set up "integrity traps".

100% honesty would make prospective candidate leak internal Google stuff to the outside world.

That doesn't make any sense.

open != honest. Two different things. It's not dishonest to obey a Non Disclosure Agreement, and so you can be perfectly honest person and still not be "open" about matters which you are not authorized to reveal.

You forgot about context of my comment. I used "100% honesty" in the sense of "never lie and AND be open". Please re-read original context: "If you are ever asked an interview question which you've already answered in a previous interview, you should tell the interviewer immediately."

"Not talking about your past interview experience" has about the level of openness as "not talking about your corporate experience".

I feel like most engineers rarely have to try too hard to get a decent job, but I wonder if it's because of stories like this. I've been on the interview circuit a handful of times so far in my life, and aside from the first time when I was mostly clueless ("So where do you want to be in 5 years?"; "Man, I never think that far ahead."), I feel like interviews have become a very routine process of answering a similar subset of questions over and over again. Am I being hired because they've made a true evaluation of my technical skills, or was I just serendipitously lucky to have heard the answer to how those 5 pirates are going to split those 100 gold pieces?

Probably none of the above. I've heard the interviewer makes up their mind to hire you in the first few seconds of meeting you. Psychology is a strange, strange thing ... but reality is better than fiction.

Never underestimate the influence of narcissism in hiring, manifested as the kind of blink decision you describe. Teams in a large company develop a personality and culture that more often than not extends to attributes that have less to do with skill and competence and more to do with similarity of physical characteristics and outside interests. (The degenerate case is flat out nepotism, but the more typical case is a team of people fitting a core profile--5'10" 30-something males from second tier colleges, or taller-than-average mustachioed conservatives.)

Most large company hiring practices are not so much a selection of specific traits as they are a filter for ensuring a lack of negative traits. When you factor in the bias of narcissism in interviewers, you'll get competent people who are skillful at reflecting the interviewers' traits back at them (such as the author, who didn't demonstrate incompetency with his first answer, but aced the interview by reflecting the cleverness the interviewer must have self-identified with as "Google material".) After a certain point, four or seven or nine layers of interview will certainly guarantee the incompetent ones don't get through, but at the same time it will also select for the kind of highly adaptable social personality that is often found in political operators.

My go to story to describe this involves a guy who was a EMT. He spent his interviews describing his life as an EMT, and rarely dealing with any meaningful technical discussions. He was offered a ton of jobs....

I saw something similar with a badly underqualified sys-admin who had been in marine recon before embarking on a technical career. He had zero issues finding a high paying job.

Yes. I've heard of multiple studies concluding employers hire those candidates that are the most "like" them.

I read a book called "Money Ball" last year. One of the lessons I took away is that a successful baseball team can be created from undervalued stats (i.e. irrational beliefs in value cause inefficiencies). This trend you described forms teams of walkers xor home-runners, for example. I don't know why I believe this (I'm subject to my own criticism), but I strongly believe teams with multiple talents outperform teams with one talent (generalists vs niche).

I've interviewed a fair few people in my time. First impressions count, but they can and do get overturned in the course of the interview. With hindsight, although it was never a conscious choice, if someone made a good first impression I tended to give them an easier ride with the questioning and they were more likely to come out looking like a potential hire as a result.

Seconds? No.

Minutes? Well, yes.

In interviews, my tentative conclusion after two minutes is _usually_ the same as after 60 minutes. (I spend the following 58 minutes trying to disprove my hypothesis, of course.)

I have been in a similar situation, where one interview's curveball ended up being asked in a subsequent interview. I went through the same emotions as the author (including resisting the urge to grin from ear to ear). But like an idiot, I answered with the trick answer right away (though I was calm about it). I then told the interviewer that I had learnt it in a previous interview. It turned out he was looking for the clever answer too; and he was disappointed that I knew it. Maybe he felt I wasn't sufficiently "excited" about the clever answer, but I never got the job. Which is not too bad, since that outfit wasn't my first choice anyways.

I was asked that exact same question over a phone interview.

One of the most interesting questions given to me was, write an algorithm such that given four colors and a rectangle, fill the rectangle with a gradient using a color in each corner. After the fact it wasn't very hard, but having never thought about such a problem, it was pretty difficult white boarding it out. It was a pretty good question because I think that most people aren't thinking or preparing for such a problem, instead opting to study arrays, linked lists, trees, sorting algorithms and running time. You really get to see how a person thinks with it.

Multiplication and division are o(log n) operations.

I've had something similar happen to me at a Microsoft interview. The interviewer asked me a question I explicitly knew the answer to, and it too wasn't "my" answer to the problem but the one I've heard in another interview. So I just told him: "I know this one. Can you ask me something else?" He told me he appreciated my honesty, and didn't have any other questions.

I got the job.

Personally, I would be more inclined to hire a person picking the "throw it in a hash and look it up" solution over any more complex solution. My reasoning is that bad programmers tend to get lost in nuance, or don't understand a problem. Good programmers tend to reason through the proportionate value of a problem. I'm not saying nuance is always a bad thing, but it's probably not what your company is developing unless you work for a math department.

I hate such stupid tricks, frankly a hash table or an array for a restricted domain is way faster, since the whole data structure gets cached on the L1.

I can solve the same problem by using statistical thermodynamics, and show its only o(1), since each string is a configuration of the system and finding common alphabets is like finding degenerate states.

How could you do this in O(1)? Or are you saying you can get a probably right answer in O(1)?

It's impossible to do in O(1) in the worst case, because you have to at least examine all the input (e.g., if the input matches the regex AA*B.)

That was my thought process, but the parent of my post claimed he could do it in O(1) (actually, he claimed it in o(1), but I'm assuming he just typo'd). Hence my question.

The point of the story is not the actual problem that author had to solve, which we have been discussing (but again, its _hacker_ news, after all :-)), but basically to tell how interviewing experience, even at which you fail, prepare you for better.

Hum, Guy's solution is really terrible, division, multiplication and modulo are very expensive operations, and completely unnecessary here. This is basically a case of "I'm clever and you must think like me to be in my team". Uh no thanks.

It is strange how every time someone interviews at Google they feel compelled to write "their" story and share it with the internet.

How many stories like this are out there now? Hundreds?

I paused a bit before reading about the possible solutions, and actually thought of the (prime number) solution the interviewer came up with.

I realise that this is a bit of redundant post .. but, as a person who isn't a brilliant coder I surprised myself. But then again, after reading the comments here - I think maybe I just have an obtuse way of thinking about things.

I'm guessing that the guy in the leather pants probably got the idea from reading about Godel numbering. If you're interested in this, you might like this book: http://www.amazon.com/Godels-Proof-Ernest-Nagel/dp/081475816...

> "Given that our range of characters is limited."

i think what matters is that characters are enumerable, not finite?

Yes, if your code is going to run on an Turing Machine that actually has infinite tape.

Leather pants turn a good story into a great story.

Leather pants in an academic setting always makes me think of Jeff Goldblum in Jurassic Park.

I kind of wish that he didn't emphasize the 'women engineer', would've kinda gotten the picture when he starts using she.

Maybe we'll get there someday...

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact