Hacker News new | comments | show | ask | jobs | submit login
Retiring a Great Interview Problem (thenoisychannel.com)
173 points by dtunkelang 2117 days ago | hide | past | web | 117 comments | favorite

Why retire the question just because you saw it on Glassdoor? The same question was posted to sureinterview last year, so anyone you hired since November is suspect! (If you believe that potential advance knowledge of the question is relevant).

Worse yet, even if you did come up with this problem on your own, this exact problem was a fairly common interview question back in the mid 90s, when string processing interview questions were all the rage for C/C++ programming jobs -- I must have seen it a dozen times in various interviews over the years. The problem is familiar enough that I'd bet a decent amount of money that it must be listed in one of those interview problem books that were popular before the websites for this stuff started showing up over the past couple years... If you really relied on the interviewee having no possible advance knowledge of this question prior to the interview you surely had a false sense of security prior to seeing it appear on glassdoor.

As long as you engage the interviewee to see that s/he really understands the answers they are giving, I don't really see why it matters if the question has appeared on one of these sites. If the interviewee is preparing for their interviews enough that they are actually looking at these sites and understanding the answers to the point where they can intelligently discuss them, that probably already puts them in the upper tier of people you'd want to seriously consider hiring, so retiring the question is probably counter-productive unless you have a non-disclosed alternative that you're sure is as good of a question.

A spiritually similar question at a previous employer resulted in many candidates attempting to iterate over the dictionary rather than iterating over the string. We hired them. At least they could iterate over a dictionary. That's a surprisingly rare skill in the hiring pool.

Maybe I'm just getting cynical in my old age, but sometimes I think the world is awash in incompetence. We see so much of it in tech because our incompetencies are (marginally) harder to hide.

>sometimes I think the world is awash in incompetence. We see so much of it in tech because our incompetencies are (marginally) harder to hide.

A very candid Philip Greenspun in Founders@Work( p341-2) : "In aviation, by the time someone's your employee, their perceived skill and actual skill are reasonably in line. JFK Jr is not working at a charter operation because he's dead. In IT,you have people who think 'I am a great programmer'. But where are the metrics that prove them wrong ? Traffic accidents are very infrequent, so they don't get the feedback that they are a terrible driver.

Programmers walk around with a huge overestimate of their capabilities. That's why a lot of them are very bitter. They sit around stewing at their desks. That's why I don't miss IT, because programmers are very unlikable people. They are not pleasant to manage."

I second that. For musicians it is hard to hide their incompetence, for programmers it is just a matter of picking the right succession of jobs.

I like that you accepted new approaches as valid even if a candidate did not finish them. Reading this I can’t help but think how this person falls into the java trap of trying to make a ridiculously generic solution which I would consider a danger sign but plenty of people love to see.

So, assuming a real world input and real world dictionary you can try plenty of things that break down for a dictionary that includes four hundred A’s but are actually valid solutions. Also, if you want a fast real world solution then sticking with a pure lookup dictionary would slow things down. Being able to toss 5 letters into a lookup table that says the longest and shortest words that start with those letters would save a lot of time. Basically, ‘xxyzy’ = null, null saves you from looking up x to the maximum word length in your dictionary. Secondly sanitizing inputs is going to be both safer and faster. And anything you are coding in 45 minutes is going to be a long way from production ready code.

PS: Even with backtracking you can code an O(n) solution for vary large input sets. Just keep track of the K best output sequences that are aka N, N-1,N-2,…N-K.

Many years ago ('97) we had a Java programming test for new hires that included a question that involved iterating over a Hashtable containing String keys and values and printing them all out.

Nobody ever answered that question correctly and I used to joke that if anyone ever did we'd hire them immediately.

NB I've long since stopped using those kind of trivia questions in interviewing, but I didn't know any better at the time.

I'm sorry, my Java is rusty; is this the equivalent of the Python

    print("\n".join("%s: %s" % (key, value) for key, value in mydict.items()))
? If so, is it hard to do in Java?

In 1997, before for-each and generics, it would've been very verbose. I learned Java in 2006 and haven't written much in a while, so I maybe slightly off, but I think you'd either get an Iterator for the Hashtable's keys and repeatedly call Iterator.next() and Hashtable.get(), or get the length of Hashtable.keys() and use a traditional C-style for loop.

Okay, I couldn't resist looking up the old docs, so this looks about right (Enumeration instead of Iterator -- two interfaces that do basically the same thing):

  void printTable(Hashtable table)
    Enumeration e = table.keys();
    while(e.hasMoreElements()) {
      String key = (String)e.nextElement();
      String value = (String)table.get(key);
      System.out.println(key + ": " + value);

Yeah - that's pretty much it. Except that being lazy I'd have probably not bothered casting the keys and values to Strings and just called toString() on them directly....

Did you expect people to be able to do this by hand? I mean infront of an IDE I could do this in 5-10 minutes. But on a blackboard or pen and paper there is little chance I would get anywhere close to a correct answer.

I don't understand how that's possible. I mean, I'm going to assume that you're telling the truth, but I don't understand how it's possible for that knowledge to reside so entirely in your IDE rather than in your head. Do you mean you know that you would use an Enumeration, but not the names of the Enumeration and Hashtable methods that you would use? If you're trying to debug a piece of code and it's calling the wrong methods on an Enumeration or a Hashtable, how do you tell that they're the wrong methods? How does this work?

I have use a lot of diffrent programming languages. I have trouble seperating what you can in theory do with a hashmap from what a given implementaiton let's you do with a hashmap. I have also engraned typing code out to the point where a lot of basic syntax features quickly become automatic I might think EOL but I type ;<carriage return>

The best way I can discribe it is like singing. You can have trouble speaking the words to something that you can sing without difficulty.

I think onemoreact meant that after relying on a good IDE for awhile you can easily forget whether the method is named table.keys() or table.getKeys(), hasElements() or hasMoreElements(), stuff like that. Forgetting such things doesn't impair your ability to understand or debug the code.

Java's idiosyncratic interfaces don't always help either: there's Enumeration which has nextElement() and hasMoreElements(), and there's Iterator which has next() and hasNext().

This was the last question out of 15 or so that we used to give to people as part of an interview if they claimed they had done Java before - it was a written test. Some people had only done C or C++ before and had only had a quick look at Java and still did pretty well in it.

I see, thanks. It's more complicated than the Python version, which does the iteration implicitly, but I'm still surprised that nobody was able to do it.

FYI - there's a potential defect there - keys are not required to be strings so you should lookup the value before you cast.

Ideally, whatever your key/value is would have the toString method over ridden to meaningfully represent it. Alternatively, I could write it as

for(Map.Entry<K, V> entry : map.entrySet()){




I was relying on the assumption from the original question that the keys and values were all Strings: "...a Hashtable containing String keys and values and printing them all out."

Well, yours does iterate over the dictionary twice and generates a temporary list in the meantime. If you have a huge dictionary, this would explode. The equivalent Java code wouldn't do that (due to lack of list comprehensions).

At first glance, your statement is incorrect: there is no Python code to create a list here, and no list comprehensions. On further examination, though, string_join in stringobject.c invokes PySequence_Fast to convert its iterable argument into a sequence --- a temporary list, in this case --- so your statement that this code generates a temporary list is correct, although I suspect that this is more by accident than because you have a deep knowledge of the implementation of the Python standard library.

The temporary list of temporary strings probably will use less memory than the original dict did, though, so it's only a very mild sort of "explosion". It will, however, be several times bigger than the final output string, which also has to be huge if the dictionary is huge.

Python doesn't have anything like a StringBuffer. If it did, it would be reasonable for string_join to use it instead of generating a temporary list. The Python code above would look the same.

But hey, if you just want to print the values, you could say

    sys.stdout.printlines("%s: %s\n" % (k, v) for k, v in mydict.items())
but frankly I think I would write instead

    for k, v in mydict.items():
        print "%r: %r" % (k, v)

Why doesn't StringIO count as a string buffer?

Sir or madam,

you may have an excellent point, there. But I haven't looked at how cStringIO is implemented: does it construct a list of strings and then join them when they are requested, thus keeping alive all the string objects passed to it? Or does it concatenate the bytes into a buffer?

I just realized that the Python 3 memoryview object also provides the wherewithal to construct a string buffer, even in pure Python.

He is using parentheses instead of square brackets, so the code is actually using a generator expression which will produce each value lazily.

Is mydict.items() also lazy?

In Python 3 yes, in Python 2 no. In Python 2 you'd say mydict.iteritems() if you were concerned about it.

Aside from the fact that I'm using a generator, where does it iterate twice? I don't see it.

No, it's really not - even with the java.util.Hashtable class that was all that Java had at the time. I think it was something that people from a C,C++ background didn't really expect to be able to do.

Iterating over a hashtable (at least in pseudocode if they didn't recall the API functions off-hand) hardly seems a trivia question...

I hope readers aren't getting the impression from this article that the code examples provided are the correct way to do word segmentation in English. (Though I understand this is an article about interviewing and not about word segmentation. And this might be considered a preprocessing step for doing things correctly...)

Norvig gives a very approachable version of English word segmentation that uses a language model below.


Peter actually emailed me directly, though I was already very familiar with his work on this and similar problems. But yes, I make it very clear in the post (and to candidates) that they should not assume an English -- or even a natural language -- dictionary.

I had a similar question with a twist asked of me during an interview. It went something like this:

Given a list of all the short strings in the periodic table of elements (e.g. Na, F, Al, etc) and a list of all the words in the English language: 1) write a method that finds the longest possible English word you can spell given any combination of the strings in the periodic table of elements. Re-usage of elements in the same string are allowed. 2) Describe what kind of data types you would want for the two lists and describe anything special about them. 3) Give a big O estimation.

I thought it was a great question :)

I would have thought of dynamic programming when asked if I could solve it more efficiently, but I've somehow just not had much occasion to use dynamic programming, so would have had to admit it would take an evening of reviewing my algorithms books to actually solve it that way.

However, I would have been able to come up with an alternative O(n^2) solution, involving building a directed graph with vertices representing each position in the string and the end of the string, with edges connecting two vertices if there is a dictionary word starting at the position corresponding to one vertex and ending just before the position corresponding to the second vertex. This can be done in O(n^2), and then you can find the shortest path from the vertex for 0 to the end vertex on this graph in O(n^2) (e.g., by Dijkstra's algorithm), and that gives you an exact covering of the string using the minimal number of dictionary words.

Your approach is correct. The graph that you've built is a DAG. This means that, you can solve for the lower vertices before solving for the upper ones. This is exactly how a typical Dynamic Programming solution works. If for solving a problem, we can see the relationship b/w this problem and other similar but smaller ones, we solve the smaller problems first and use these solutions to build up a solution for the larger case.

EDIT: A naive implementation of your idea can take upto O(n3m) time where m is the no. of words in dictionary. Using a fancy data structure like a trie or suffix tree or an automaton can speed it upto O(n^2). Can we better that?

Actually, my original implementation used Dijkstra's algorithm. But I think the linear table of suffixes (or prefixes if you prefer) is a cleaner approach.

Only slightly joking:

    re.findall("(your|dict|here)", "yourword")
I like the idea of constructing a state machine to do all the matching.

I love it. You spark an (imho) much more interesting debate: that of raw smartness vs pragmatic productivity.

A colleague of mine (now hired) was asked to code a string compare during her interview for a .NET function. She said "String.Compare()". This puzzled the interviewers for a while. They asked her whether she could write it out, she said she didn't see the point.

They ended up hiring her for other reasons (notably, references from trusted colleagues who had worked with her), but I still wonder whether that attitude worked for her or against her.

There's something to be said for NOT reinventing the wheel. Standard libraries are standard for a reason - the "String.Compare()" function is likely faster for just about every case, in addition to being already there.

I keep seeing this argument on threads about interview questions.

The principle of not reinventing the wheel has nothing to do with interview questions. Sure, if you're actually working on solving an actual problem at your job, then most of the time you'll be better off using a standard library function instead of rolling your own.

In the context of the job interview, however, it doesn't matter if the solution to the problem is well-known or whether it can be found in a standard library. The problem of the interviewer is not to find out how to compare strings, it's determining if the candidate will be able to write proper code if hired. And the string-compare question might be a good starting point for evaluating the candidate's level. In any case, the fact that a solution is in a standard library is irrelevant.

If I was the interviewer I'd create a new data structure Foo and then ask the candidate to implement Foo.compare(). The question remains essentially the same, so I wonder what the candidate would reply.

> The problem of the interviewer is not to find out how to compare strings, it's determining if the candidate will be able to write proper code if hired.

The trouble with the question is that proper code to compare strings is almost certainly going to be a call to some existing library function. There are only rare cases (e.g. when you're one of the 50 people in the world who implement libc) that it makes sense to write it yourself. It's not clear from "code a string compare" exactly which set of wheels the interviewer wants reinvented: if strcmp is out of bounds, can you use strlen and memcmp? Because if strcmp was somehow buggy, that might be a reasonable thing to do. If the problem is that strcmp is too slow, should we maybe drop to assembly? Or change to a counted-string representation to avoid byte-by-byte operations? Or calculate hashes when strings are mutated, or intern them?

(Maybe in C strcmp is a bad example, since

    while (*s && *s == *t) { 
    return *t - *s;
is already about as simple as anything you'd do with strlen and memcmp...)

> If I was the interviewer I'd create a new data structure Foo and then ask the candidate to implement Foo.compare().

I think that's a better approach.

They would say Foo.ToString().Compare()... ;)

And not get the job, because it's likely wrong.

That's my point. At work, I'd prefer someone using String.Compare() over some nasty hand-crafted for-loop with switches and things for all kinds of collation issues. Why would I ask something else during the interview?

I love the GGP's solution for the same reason. If the regex is compiled only once, it may be only marginally slower (if at all), and it significantly improves readability and maintainability, and tremendously reduces the chances for bugs.

I'd hire the him. I have the impression most Google-interviewers (and similar dudes/dudettes) wouldn't. I wonder why.

>I have the impression most Google-interviewers (and similar dudes/dudettes) wouldn't. I wonder why

Perhaps that was rhetorical, but I'll answer anyway:

Being able to wire up existing libraries to accomplish a goal is a pretty low bar to set as far as proficiency goes. Google doesn't want code monkeys. The solution above is perfectly good from a software engineering perspective, but it doesn't show the depth of the candidates knowledge nor how strong their grasp of CS techniques is.

Google's interviews are more like IQ tests than software engineering tests, using CS as the measuring tool. When you're Google you can afford to be that selective.

I'm glad that you like my solution, but please note that, as another commenter said, it's a buggier version of the original regex I thought of ("^(dict|here)+$"), which I think should work but doesn't, at least not in Python.

I suspect it's because the match group is being replaced with the last match rather than added as another group, but it will work as a state machine, and is pretty much equivalent to the backtracking example in the article (although with much less code, and no memoization).

That said, I think that the reason interviewers ask about functions for which we have well-known implementations is to see whether or not you know how they work and/or could implement them yourself. Nobody will reasonably expect you to implement your own string comparison routine, but you could score points if, for example, you said Boyer-Moore for string searching rather than the naive iterative version.

Nobody will reasonably expect you to implement your own string comparison routine, ...

Standard string comparisons exit on the first mismatched character, which is insecure.

Insecure how?

Timing attacks. If one of the strings is supplied by the client and the other string is a secret, a comparison that exits at the first mismatch is faster. The client can try every value of the first character until it finds one that takes longer, and it knows that that one is the first character of the secret. It can repeat this with the second character, and so on until the entire secret is known.

Interesting, thanks.

bzzt wrong.

>>> re.findall('(a|aa|aaa|ab)', 'aaab') ['a', 'a', 'a']

The correct answer would be ['aa', 'ab'] but unfortunately findall works greedily and so will not find the optimal solution. It is possible to specify it as a regex, but common implementations might take too much time to come up with the good solution.

The mistake arises from the different semantics of findall versus what I initially wanted to do. I initially tried "^(dict|here)+$", which should return all the subwords, but doesn't (it only returns the last one).

I'm missing something, clearly, but my initial idea was a state machine that contained all the words (sorted by length, perhaps, but that's not necessary if we don't care about the solution with the "most words") and linked back to itself. It's essentially the same as the backtracking example, from a cursory glance.

Of course, findall is a dumb version that only finds disjoint substrings.

EDIT: It turns out that the group count in regexps is fixed, so it's not possible to return all matches of a single group, even if it's repeated. All my solution above does is show you whether there's a match or not, but not what it is (unless it's the single word).

Interestingly, the article explicitly states that our dictionary supports only the exact string lookup command, dict.contains(string). Strictly speaking, the full content of the dictionary isn't available to us, and we can't create the regular expression.

From the problem definition:

Q: What if there are multiple valid segmentations? A: Just return any valid segmentation if there is one.

I think ['a', 'a', 'ab'] is also valid, isn't it? So ['aa', 'ab'] would be one of the correct answers.

(of course ['a', 'a', 'a'] it's not because the 'b' is not a valid word and it's not in the solutions)

Ah right, I was a bit too quick to challenge. Dictionary: aa, aaa, aaaa, ab Word: aaaab

There's a library for constructing a deterministic regular expression from a dictionary, by the way, which would right away give you the exponential-time result. If you wrapped it in a * and applied it with a guaranteed-linear-time regular-expression engine like RE2, you'd find out whether the string was segmentable (and as a bonus you wouldn't have to construct the deterministic RE yourself) but I don't know if you'd get the actual segments.

This was actually going to be my solution as well. Regex engines are optimised for exactly this sort of problem, why duplicate the work manually?

rfumani gave one reason: it won't produce the right answer. Here's another: your regex engine will probably fall over dead if you ask it apply a million-word alternation.

grep could be fine. I wouldn't trust backtracking regex engines, though.

Let's try it:

    >>> re.compile('|'.join(open('/usr/share/dict/words').read().split()), re.I)
    OverflowError: regular expression code size limit exceeded
For smaller dicts, it should work fine, but it evidently doesn't do so well on larger ones.

Just for fun I decided to rewrite his first version in Haskell. This is probably not idiomatic, though.

  segment_string :: String -> Set String -> Maybe String
  segment_string [] _ = Nothing
  segment_string str dict =
    if str `member` dict
    then Just str
    else let pairs = zip (inits str) (tails str)
             pairInDict (x, y) = x `member` dict && y `member` dict in
         do (x, y) <- find pairInDict pairs
            Just (x ++ " " ++ y)

Here's my Haskell attempt (both naive recursive backtrack and dynamic programming version):

    type Dict = Set String

    hasWord :: Dict -> String -> Bool
    hasWord = flip Set.member

    fmtWords :: [[String]] -> Maybe String
    fmtWords = fmap (intercalate " ") . listToMaybe 

    splitWordsSimple :: Dict -> String -> Maybe String
    splitWordsSimple dct = fmtWords . go where
        go [] = return []
        go s = do
            (i,t) <- zip (inits s) (tails s)
            guard $ hasWord dct i
            (i:) <$> go t

    splitWordsDP :: Dict -> String -> Maybe String
    splitWordsDP dct s = fmtWords $ a ! 0 where
        a = listArray (0, len) $ map f [0..len-1] ++ [[[]]]
        len = length s
        f i = do
            l <- [1..len-i]
            let w = take l $ drop i s
            guard $ hasWord dct w
            (w:) <$> a ! (i+l)

So what? This problem (the full version) is more complex than FizzBuzz. Besides, I'm not pasting code here to prove that I can program, but to show how the solution looks like in Haskell.

Yes, doesn't strike me as idiomatic. But I can't come up with a better version on the spot.

How about this, this will compute more than pairs


let dict = Data.Set.fromList["apple", "pie", "bread", "applepie", "piebread"]

let fn q = concat.Data.List.map (\(x,y) -> if (member x dict) then if y == "" then [[x]] else Data.List.map (x:) (fn y) else [] ) $ tail $ zip (inits q) (tails q)

fn "applepiebread" [["apple","pie","bread"],["apple","piebread"],["applepie","bread"]]


Humbling way to start the work week. I could produce the fizzbuzz solution and in my sharper days the recursive backtracing one, but definitely no further.

I wouldn't feel bad. The advanced answers to this question require spending a lot of time understanding string processing. It's like having a CSS question that can be implemented multiple ways: a simple, obvious, slow way and a complicated, "deep knowledge required", fast way. If you have lots of experience with CSS, you might get the fast way, but it doesn't really say how good a programmer you are.

(Yes, not a perfect analogy, but it hopefully gives the idea.)

> The advanced answers to this question require spending a lot of time understanding string processing.

No, the advanced answers are a simple application of dynamic programming. If you've never heard of dynamic programming before, you're unlikely to invent it in response to an interview question, of course; but if you have heard of it, it might occur to you to try it on this problem.

(Actually, if you've heard of memoization but not dynamic programming, you might invent dynamic programming in response to this question.)

I think this is at the opposite end of the spectrum from your CSS example. Dynamic programming has nothing to do with string processing or with any other particular domain. There's a list of 29 significant algorithms that apply it at http://en.wikipedia.org/wiki/Dynamic_programming#Algorithms_.... It might qualify as "deep knowledge", but it's not deep domain knowledge; it's the kind of deep knowledge that would make you want to hire someone from a different domain.

Counterpoint: @Retric congratulations, you’ve reinvented dynamic programming with your O(n) solution That’d be a perfect solution.

And yes, I had zero idea what memorization or dynamic programming meant in that context. After looking it up on wikipedia it seems to mean caching intermediate steps to avoid recalculating them which seems obvious enough.

The advanced answers to this question require spending a lot of time understanding string processing.

Really? Seems like a standard DP question to me

Indeed. The only detail that surprised me was that you could only do exact lookups in the dictionary. That makes it O(n*n). If you had the dictionary stored in a trie it would be O(n) on long strings. (With a maximum constant whose size depends on the length of the longest word in the dictionary.)

You could store the dictionary in a trie in O(m) time before you start. But I don't think it's true that it makes it O(n); being able to tell that the prefix you're considering is the prefix of some word doesn't help you, because you still don't know if the suffix of the string after that word is segmentable. (Or even whether the string contains the entire word.)

It does help you. You're following a dynamic programming solution. Suppose that the maximum dictionary word is of length k. Then for each position in the string you have a maximum of k previous positions that you're tracking for, "We could start a next word here."

Once you've scanned the string, only then do you discover whether the whole string is segmentable.

Put another way, while you're processing you don't immediately know whether or not the whole string is segmentable. But having a trie can let you discard possibilities early. Discarding work early means doing less work means being more efficient.

It's true that the trie allows you to prune the search tree, but I don't think that gets you to O(N). The maximum-dictionary-word-length check does get you to O(N), though, or rather O(kN) if you consider the dictionary as part of the input.

The trie check contains the maximum-dictionary-word-length check as an obvious special case and therefore its worst case is the same order of magnitude efficiency as the other check.

You do have the cost of descending a level in the trie. But that can be made constant (with a jump table) or else the log of the size of the alphabet (which is a constant for all intents and purposes).

I suppose that depends on what you store in your trie. If every node contains a field for the length of the longest path below it, then yes. But that's not a normal thing to store in a trie, and it wasn't at all obvious to me that that was what you meant.

Not true.

If you keep on following the trie, you'll fall out of the trie at the point that there are no words that have that sequence at the start. Which will happen by the time you exceed the longest word in the dictionary, but will probably happen substantially before that.

Am curious to see having the dictionary in a trie allows you to build an implementation that is O(n). Will add it to the blog post if it works.

Bit of a tangent, but a real question: If a writer doesn't approve of a site's behavior (in this case, Glassdoor is desribed as "a site that does not seem to mind when interview candidates violate NDAs"), why does the writer still link to them? Inbound links (w/o a nofollow) help sites, why help sites you don't like?

Point taken. I thought of not linking to them, or even not mentioning them. But they do play an important part in the story of the post, and I don't think I'm helping them so much as I'm raising employees' and employers' awareness of their existence. At least now a few more interviewers might check to see if their questions are posted there. From my limited data, interviewees are already more aware of Glassdoor than interviewers.

As for feeding them page rank, I don't think I have so much to offer that it helps them materially.

> interviewees are already more aware of Glassdoor than interviewers

My first amusing thought when reading this was to assume a correlation; ie. that interviewees who lean heavily on Glassdoor prior to interviewing do not eventually become intervewers.

Quick heads up: page is not rendered properly in Mobile Safari on iPhone: fixed width font lines are cut off.

Sorry about that. Advice on how to fix it while maintaining good rendering elsewhere? Looks good in Chrome on my MacBook and my Nexus One.

It's not rendering well for me, either: http://i.imgur.com/Q89gP.png

My suggested solution: don't wrap it in PRE tags with manual line-breaks. It's not code, so why preserve the exact breaks? Try BLOCKQUOTE - I don't know if it's widely supported anymore - or just italicize the whole thing.

I don't really have a good solution for what to do about the actual code, though. :(

I think the concern about whether or not candidates have seen this, or any, programming question before is missing the point. Think about what we want in the ideal candidate -- we want them to come up with a good (elegant, efficient) solution to the problem, and implement it. We (judging by all the other responses) expect them to do that because they've had a solid CS education (formal or informal) as well as significant experience.

But people with that background will give good answers, even if they haven't seen _this specific problem_, because they have seen lots of problems like it and recognize the pattern. And even in that case, we evaluate them based on how well they can implement the pattern they saw, not just on whether they recognized the correct algorithm. So what if they've seen this problem already? Coding it up efficiently and elegantly in an interview context is still non-trivial, and you can still push them to discuss edge cases and performance tradeoffs.

The person who really has _never_ seen anything like this in his life, and still can give a good answer, I have yet to meet.

With the exception of the last part of the question, you learn everything there in your first year of CS at university. Do people who can't write this really put the language on their resume?

Can I get some stats? I really don't (want to) believe it. What percentage of people get this question wrong? Are they all some sort of eng/cs graduate? I'm not even a coder and I can solve this in a few minutes.

I don't have stats I can share, but I assure you that this problem has confounded many interview candidates with strong resumes. I agree with you that it's all basic material -- that's deliberate. I'm glad you think it's too easy. :-)

Yes, it's basic. But many people who are not fresh out of college may have spent recent years solving a completely different set of programming tasks, and do not have it loaded in their brains.

When applicants prepare for an interview, they do not often know what kind of knowledge to load in their heads. For example, just a couple of weeks ago I was asked to figure out a simple bit-flipping scheme, and bit string manipulations are something that I have not thought about in many years. So it took me about 10 minutes for a problem that I would have done in less than a minute when I was spending time thinking about similar things and my mind was full with them.

Being prepared for a technical interview does not mean to have memorized a few solutions to a few problems, but it means to have played with them sufficiently to have the brain loaded with the material. This helps with intuition as well as specific technical skills.

You have my sympathy :)

Its a great question. Pretty sure its not nearly as much of a secret as the author thinks though. I've seen a detailed write-up before somewhere (HN?) and I'm not even a programmer by profession.

Secret is a strong word -- I link to another post on the subject in the article. Still, seeing it on Glassdoor for my own employer crossed the line of disclosure.

I read this and went, I had someone build this for me years ago!

So I looked at the code, it used the efficient solution. Now I am even more impressed with the programmer. I've always thought highly of him (better than me) but it's hard to evaluate someone better than you. His solution ran circles around mine (I had general simple case with 2 words) and now I know exactly how much more efficient his solution was. Very neat.

Its a nice Dynamic programming problem. The beauty of DP is that simply memorizing one application of it does not guarantee you a solution to an entirely different problem that might have a similar Dynamic Programming solution. Look over the TopCoder SRM archives if you don't believe me.

So even though you are retiring this one, coming up with something similar that tests for basically the same things shouldn't be impossible.

Quick note to the author - holy god, this is annoying: http://i.imgur.com/0cNx4.png

Sorry. The post renders well in Chrome on my Mac and on my Nexus One. But apparently not so well in other browser / platform combinations.

The problem he's having is that good interview questions are getting busted, as people post solutions on the web.

If you have a lot of similar interview questions, then there's no way anyone other than a savant can memorize them without actually learning the theory.

Point taken. But it's hard to come up with good interview questions, as my colleagues here, at Google, and at Endeca can attest. In contrast, it's much easier to post solutions.

That's why I'm working on an approach that assumes the candidate does of prior knowledge of the problem. But not there yet.

There's not too much difference between "people who get dynamic programming, after boning up for the interview", and "people who can answer your dynamic programming question because they read the answer to a similar one while boning up for the interview". Both are probably good candidates.

One thing that I've noticed over the years is that almost no one prepares for an interview in any way so you'll still keep out the worst candidates with this question.

I think Google etc are exceptions, people do prepare for technical interviews there.

The author just mentioned dynamic programming. Usually in dynamic programming, e.g., as in Dreyfus and Law, to say that a problem has a dynamic programming solution we outline the solution. But the author did not outline such a solution.

An outline usually includes at least the definition of the 'stages' of the dynamic programming solution. For the problem of 'string segmentation', the obvious selection of stages would be each of i = 1, 2, ..., n for the given string of length n. But this definition of the stages does not yield a dynamic program because the solution at stage i needs more than just the solution at stage i + 1 and, indeed, potentially needs the solutions at each of stages i + 1, i + 2, ..., n.

So, first-cut, there is no dynamic programming solution. For a second-cut, there might be a dynamic programming solution if the author would outline one!

There is now some question if the Google interviewers really understand dynamic programming!

The "obvious" selection of stages does work. When you are at position j you check all i<j until you find one where there is a segmentation up to i, and the substring [i,j) is a word. Memoization is just syntactic (semantic?) sugar on top of this and he provides the code that basically implements the above. In programming competition circles it's common to just say "it's dynamic programming" when the stages are semi-obvious, and given that he uses memoization I'd say there's a strong chance he's been involved in Topcoder

Yes, from a fast reading, your solution appears to be correct, and so does 'memoization', but, still, in spite of common practice in computing, it's not a dynamic programming algorithm because what 'dynamic programming' is goes back to R. Bellman and his student Dreyfus and the book Dreyfus and Law.

There may be a problem with what you outline: That substring [i, j) is a word is not so impressive! In addition we need to know, for i < p < j < q, is [p, q] a word. That is, even if [i,j) is a word, we can't conclude yet that it will be a word in the final, feasible 'segmentation' of the whole string. Indeed, even if [j, n] 'segments', we can't yet conclude that those segments will be in the final, feasible segmentation of the whole string [1, n].

Maybe in the core of the work what we want to know is, for each i = n, n - 1, ..., 1, is there a 'segmentation' of [i,n]. To tell this, at i, consider j so that i < j <= n and ask if [i, j) is a word AND at the same time if [j,n] can be 'segmented'. If so, then say that [i,n] can be segmented. So, in this, first we ask if [j,n] can be segmented since we have already addressed that issue and recorded the result, true or false.

So, the break with dynamic programming is that at stage i, we have to look at the results of not just stage i + 1 but at the results of each j for j = i + 1, i + 2, ..., n, which breaks the basic 'recursion' that is at the core of dynamic programming. That is, a key feature of dynamic programming is that at stage i, to find what to do, need only the 'state' at stage i and, for each possible state at stage i + 1, what to do at stage i + 1. That is, at stage i, do not have to look at stages i + 2, i + 3, etc.

Also for your approach, you are not working just at stage i but also with [j, i) for 1 <= j < i which is again a break with dynamic programming where we regard each i = 1, 2, ..., n as a stage.

No matter what some common practice is, just saying dynamic programming without being explicit about the stages, states, state transition functions, and the recurrence is sloppy to the point of risking often being wrong. This is not nearly the first time that computing took something solid and precise from applied math and made a mess. Messes, even if common, are not good. You will see a lot of care and discipline in Dreyfus and Law along with essentially all the applied math literature on dynamic programming, back to Nemhauser, Bellman, and others.

It's a cute string exercise with some cute solutions, but it's just not dynamic programming.

I'm confused. Are you saying that if you have a DP table A then to compute A[i+1] you are only allowed to use A[i]? This question is classic DP. A[i] stores the segmentation of the first i letters if it exists (or it can just store a boolean, depending on implementation). Then to compute A[i] you look at the j<i see if A[j] is non-empty and if the [j,i) substring is in the dictionary. If you find a valid j you set A[i] accordingly, otherwise leave it empty.

There is what 'dynamic programming' is, and there is the 'word segmentation' problem of this thread.

The problem is interesting, and so are algorithms to solve it.

My view is that algorithms to solve the problem are not really dynamic programming.

If strain, then can regard the code I include below to solve the problem as dynamic programming, but can do such things quite generally.

For what 'dynamic programming' is, see the original books by R. Bellman or:

Stuart E.\ Dreyfus and Averill M.\ Law, {\it The Art and Theory of Dynamic Programming,\/} ISBN 0-12-221860-4, Academic Press, New York, 1977.\ \

Dimitri P.\ Bertsekas, {\it Dynamic Programming: Deterministic and Stochastic Models,\/} ISBN 0-13-221581-0, Prentice-Hall, Englewood Cliffs, NJ, 1987.\ \

George L.\ Nemhauser, {\it Dynamic Programming,\/} ISBN 0-471-63150-7, John Wiley and Sons, New York, 1966.\ \

Dimitri P.\ Bertsekas and Steven E.\ Shreve, {\it Stochastic Optimal Control: The Discrete Time Case,\/} ISBN 0-12-093260-1, Academic Press, New York, 1978.\ \

E.\ B.\ Dynkin and A.\ A.\ Yushkevich, {\it Controlled Markov Processes,\/} ISBN 0-387-90387-9, Springer-Verlag, Berlin, 1979.\ \

Wendell H.\ Fleming and Raymond W.\ Rishel, {\it Deterministic and Stochastic Optimal Control,\/} ISBN 0-387-90155-8, Springer-Verlag, Berlin, 1979.\ \

For the word segmentation problem, for some positive integer n, we are given a string of length n of characters of the alphabet, and we are given a dictionary of words. We are to show that the string can be partitioned ('segmented') into a sequence of words in the dictionary or show that no such partitioning exists.

A partitioning is a 'feasible segmentation'. We are only looking for a feasible segmentation and not all feasible segmentations.

Consider the string

Okay, the 'substring'

is a word. But that word will not be in a feasible segmentation because the string that remains

     = 'sdoll'
has no 'feasible segmentation'.

Generally, even if there is a feasible segmentation, just because we have found a word does not mean that that word will be in a feasible segmentation.

Generally we suspect that somehow the solution will be 'iterative', 'incremental', maybe 'recursive'. So, in software terms, we suspect that we will have a do-loop with

     i = 1, 2, ..., n

     i = n, n - 1, ..., 1
For 1 <= i <= j <= n, we let s[i,j] be the substring of characters k of the given string for i <= k <= j.

Somehow when we get out of this look we want a feasible segmentation or a claim that there is none.

Let's consider a loop with

     i = 1, 2, ..., n
Okay, let's let b(i) = j > 0, i > j, if s[1, i] has a feasible segmentation and let b(i) = 0 otherwise. When b(i) = j, then s[1, j] has a feasible segmentation and s[j + 1, i] is a word in the dictionary.

In the loop, the pass for i gets the value of b(i) and in this uses the values b(j), 1 <= j <= i - 1.

If we come out of the loop with b(n) = 0 or n = 0, then we conclude that there is no solution.


But it's not really dynamic programming. The obvious candidate for 'stages' would be i = 1, 2, ..., n. But in stage i, we have to consider essentially 'stages' 1 <= j < i, and it's a strain to consider that dynamic programming. That is, in dynamic programming, for the work at stage i, we're supposed to need to use only the results of the work in stage i - 1. Otherwise we have not really decomposed the problem into 'stages' with the usual recurrence relationship between stages. Or if we do the work i = n, n - 1, ..., 1, then to do the work at stage i we're supposed to need to use only the work at stage i + 1.

But if define the stages, decisions, state transition functions, and optimal value functions in relatively tricky ways, then we might be able to regard the problem as dynamic programming.

Below is some corresponding code. The code is in Rexx which is an elegant old 'scripting' language developed something over 25 years age by Mike Cowlishaw at IBM. It is fair to regard all the elementary values as just strings. If in addition, the strings are legal base 10 numbers, then Rexx can perform arithmetic with up to 50 significant decimal digits.

One special feature is a.x for any string x regarded as an 'index'. Then a.x will be a string.

The code:

     /* WORD01.RXS --                                      */
     /*                                                    */
     /*   Solution to 'word segmentation' exercise at      */
     /*                                                    */
     /*        http://news.ycombinator.com/item?id=2859182 */
     /*                                                    */
     /* Created at 15:40:00 on Monday, August 8th, 2011.   */

       exec_name = 'word01.rxs'

     /*   Read dictionary into 'stem' variable dict. from  */
     /*   file in_file:                                    */

       in_file = 'dict.dat'

     /*   Set the 'default' value of 'stem' variable       */
     /*   dict.:                                           */

       dict. = 0
       Do Forever
         If Lines(in_file) = 0 Then Leave
         word = Strip( Linein(in_file) )
         dict.word = 1

     /*   So, now, a string word is in the dictionary if   */
     /*   and only if dict.word = 1                        */

     /*   The given string to check for a feasible 'word   */
     /*   segmentation' solution is:                       */

       input_string = ''
       input_string = 'a'
       input_string = 'b'
       input_string = 'up'
       input_string = 'downup'
       input_string = 'down'
       input_string = 'downzzzz'
       input_string = 'aintgottaword'
       input_string = 'gotword'
       input_string = 'catsdoll'
       input_string = 'therecanbeanotherreasontocontactbefore'

     /*   Then we find the length of this given string:    */

       n = Length( input_string )

     /*   Here is the description of the main logic:  For  */
     /*   i = 1, 2, ..., n, we find b.i so that            */
     /*                                                    */
     /*             /  0,  if Substr( input_string, 1, i ) */
     /*            |       has no solution                 */
     /*            |                                       */
     /*     b.i = <                                        */
     /*            |                                       */
     /*             \  j > 0, otherwise.                   */
     /*                                                    */
     /*   When b.i = j > 0, then                           */
     /*                                                    */
     /*        Substr( input_string, 1, j )                */
     /*                                                    */
     /*   has a solution and                               */
     /*                                                    */
     /*        Substr( input_string, j + 1, i - j )        */
     /*                                                    */
     /*   is a word in the dictionary.                     */

     /*   At in b.jm1 below, it is convenient for the      */
     /*   logic to have:                                   */

       b.0 = 1

     /*   Pass through this look determines if             */
     /*                                                    */
     /*        Substr( input_string, 1, i )                */
     /*                                                    */
     /*   has a feasible word segmentation:                */

       Do i = 1 To n

     /*   We preemptively set b.i = 0 and then in the      */
     /*   loop on j change the value if appropriate:       */

         b.i = 0

     /*   This loop violates the spirit of a dynamic       */
     /*   program with stages i:                           */

         Do j = i To 1 By -1
           jm1 = j - 1
           If b.jm1 = 0 Then Iterate
           word = Substr( input_string, j, i - j + 1 )
           If dict.word = 0 Then Iterate
           b.i = j


     /*   Since in the loop on i above we had i = 1 To n,  */
     /*   it is essentially inevitable that here we will   */
     /*   have to work from i = n down to 1:               */

       If b.n = 0 | n = 0
             Say "There is no solution."
             Say 'There is a solution:'
             k = 0
             i = n
             Do Forever
               j = b.i
               If j = 0 Then
                   If i = 1 Then Leave
                   i = i - 1
               k = k + 1
               segment.k = Substr( input_string, j, i - j + 1 )
               If j = 1 Then Leave
               i = j - 1
             m = k
             Do k = m To 1 By -1
               Say Format( m - k + 1, 5) segment.k

I just put a bottom-up reformulation of the dynamic-programming solution at http://canonical.org/~kragen/sw/inexorable-misc/wordseg.c, in the function "segment". The stages are the prefixes of s: the substrings s[0:0], s[0:1], s[0:2],... s[0:n], where n is the length of s. The state of some stage s[0:i] is a finite map from all of its segmentable prefixes s[0:j] {j∈[0,i)} to segmentations of those prefixes. (Only one segmentation per prefix.) The transition function may leave the state unchanged, or it may add a pair to the map, mapping s[0:i] to some segmentation, which it can do if ∃j:∃w∈dict: (s[0:j] || word) = s[0:i], in which case the segmentation is (state[s[0:j]], word).

(In my program, the state[] table is represented simply as a vector of integers: seglen[i] is simply the length of the last word of state[s[0:i]], or 0 if s[0:i] is not present in the finite map. This is sufficient to efficiently reconstruct a segmentation.)

This is completely different from your formulation where you're thinking about i+1, i+2, etc. It's not surprising that you think that your formulation isn't dynamic programming!

Now, this is a solution by forward induction or "bottom-up dynamic programming", and I wrote it that way because it's easier to see the mapping to dynamic programming. But if you solve the problem by backward induction or "top-down dynamic programming" instead, you may be able to solve the problem a lot more efficiently, because you can avoid computing most of the table entries. And that's what happens if you just write the recurrence out directly as a recursive function and then memoize it.

You may have a sufficiently tricky formulation to have a dynamic programming solution. But your states and stages are a bit strange!

Much of why we use dynamic programming is that the work at stage i needs only the work at stage i + 1 (for the backward iterations), and here in some problems we can get some huge savings in computing. Also this 'framework' does well handling uncertainty.

Yes, the usual way is to do find the solution from the end and then use it starting at the beginning. In the code I posted on this thread, I found the solution starting at the beginning and then used it by starting at the end and then printed out the words in the reverse order in which I found them.

Just to clarify, would you say that you cannot solve optimal matrix chain multiplication using dynamic programming, because each state depends on a non-constant number of other states? If so, what is the proper name for the commonly known algorithm used to solve optimal matrix chain multiplication, which is described in CLRS as "dynamic programming?"

CLRS can write what they want, but they can't rewrite what R. Bellman wrote.

Part of the problem of 'what is dynamic programming' is that if permit the 'states' to keep growing in 'complexity' with the stages, then nearly anything can be called dynamic programming. A big example is stochastic optimal control with an arbitrary exogenous stochastic process. In that case, at each stage, the state is just the whole past history of the process, and then for estimates of the future we take the conditional expectation given the history. So, to make stochastic optimal control, and discrete time stochastic dynamic programming, 'meaningful' we ask for a Markov assumption -- the past and future are conditionally independent given the present -- on the exogenous stochastic process.

The key, central point about dynamic programming is the computational savings we can get, when there is good news, from the basic recurrence relation. There at state i, get to do all the work using just the work did at state i + 1 for backward recurrence and state i - 1 for forward recurrence. If can't do this, and if the 'complexity' of the states keeps growing along with the stages, then are into essentially just arbitrary problems with nothing special, and little hope of gains, from dynamic programming.

How general? There is a big AI theme that the problem is solved: Just use dynamic programming! Alas, the state variables at each stage are the past history of the universe and, thus, a bit much to handle on Windows XP for now!

Again, once again, still again, for, what, a dozen times on this thread, the algorithm of the word segmentation problem is not really dynamic programming because at each stage need to use the results of the work not just at the last stage but at all the previous stages. Yes, can patch up the situation by saying that all the work of all the previous stages is now part of the current state, but if do that then the AI problem is also solved by dynamic programming because can just say that the current state has all the past history of the universe and then we don't have much for dynamic programming being anything different from everything.

For CLRS and "matrix chain multiplication", etc., to know if a solution algorithm is DP, the burden is on the proposer, not me. If someone describes the problem and writes out a clean description of why their solution is DP, then we can consider it.

But clearly what has happened is that essentially every algorithm in computer science with an outer loop from i = 1 to n is called DP, and that's not helpful.

At university we defined dynamic programming as reducing a problem to finding paths in directed acyclic grahp. By that definition, there's a simple way to map the problem to dynamic programming. And yes, it's sloppy, if you don't mention the `stages' or equivalently, how you'd (lazily) construct the graph.

One famous application of dynamic programming is finding shortest paths in directed acyclic graphs. Maybe should credit E. Dijkstra. As I recall, that is not regarded as the most efficient such algorithm, but I haven't considered that problem for years.

As far as I know Dijkstra's algorithm is the fastest path finding algorithm in general directed graphs without cycles of negative length. But you have to employ fibonacci heaps (or so) to get the best run-time.

If you have some heuristics to guide your path finding, e.g. if you are trying to find paths on a map, then you can do better than Dijkstra's algorithm. See A*.

P.S. There's also Bellman-Ford.

The article actually includes a solution that uses memoization, which is equivalent to DP.

I wouldn't say "is equivalent to"; rather "is a form of". While it is a dynamic programming technique, unless it's specified people tend to think of the bottom-up approach to be what's implied by "dynamic", with memoization being a special case. Yes, it's semantics, but I think that where the GP was coming from.

Yes, I can go along with the claim that the article uses memoization. But that 'memoization' is "equivalent to DP" is not correct, not even close to correct, really is just nonsense. What dynamic programming is has been solid, clear, and fixed all the way back to Bellman, and memoization just is not the same thing at all.

Yes, a memoization solution to the segmentation problem might work in a loop on i for i = n, n - 1, ..., 1, and the usual backward recurrence in dynamic programming does also, but that similarity does not mean that the loop is dynamic programming.

For more, dynamic programming has stages, states, state transition functions, and a basic recurrence relationship. The recurrence does the work at stage i just from the work at stage i + 1; the string problem does NOT do this. In the case of uncertainty, we also need essentially a Markov assumption,

Again, the string problem is cute, and there is at least one cute solution, but it's not dynamic programming.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact