
Dynamic Programming for Technical Interviews - Aditya_Ramesh
https://blogarithms.github.io/articles/2019-03/cracking-dp-part-one
======
alangpierce
> problems on DP are pretty standard in most product-company-based hiring
> challenges

This is sad and a little surprising to me. I've always thought of DP problems
as being obviously terrible interview questions, even among the different
types of algo questions (which are already heavily criticized). Candidates who
have learned DP in school usually find them really easy, and candidates who
haven't usually don't even have a chance at solving them, so really they just
test whether the person has learned DP, unlike other algo problems, where you
can at least try to claim they're a proxy for "general problem solving
ability". And DP almost never comes up in the real world (and when it does,
you're often still better off taking a heuristic/inexact approach), so testing
whether the candidate knows DP is also almost completely useless.

If you're an interviewer at a company that asks DP questions, please
reconsider. There are almost certainly alternatives that are more fair and
higher-signal.

~~~
deepGem
"and candidates who haven't usually don't even have a chance at solving them"

I am one of those candidates, and I don't know why it is called Dynamic
programming. To me a very naive understanding of DP is this: it's just a
simple cache mechanism to store intermediate results so that you don't have to
recompute them and waste resources.

In the real world we always think about and do such optimizations, be it I/O,
disk access or db access. I would love to understand how DP is any different.

~~~
01100011
In my "real world" we normally don't care about things like big-O complexity.
We worry about doing dumb things and not taking more time than we have
available. I'm not saying big-O is useless or CS wizards are never helpful.
It's just that you need one or two of them for a large team of normies, IME.

I have a problem with this notion that knowledge of algorithms is required to
be a good engineer though. Case in point: Senior algorithm nerd on my project
is going nuts over algorithmic complexity and delaying an early delivery to
another team so they can begin using our new feature. In our case, n is fairly
small no matter what, so 2n vs n^2 really doesn't matter to us. The code he's
optimizing isn't even the long pole in the tent. It calls other routines which
take 2/3 of the total time anyway. We could just deliver now and improve later
when we have real world data on how people want to use our feature, but nope,
we're wasting time on endless rewrites in search of perfection which may not
matter if the other team can't deploy our stuff in time.

~~~
ajbeach22
>> Senior algorithm nerd on my project is going nuts over algorithmic
complexity

This is me, but luckily where I work I have people who can keep me in check
because we generally do design reviews before anything big is built.

However, I have been in situations at previous companies where big-O was
ignored to take shortcuts up front, because "the data was small", and suddenly
scaling to even just 100 users starts to break things in production because of
those poor design decisions.

I guess the lesson here is more about the importance of design reviews. Also,
n^2 is HUGE even for small data if there are I/O or API calls involved. Making
any public API you provide n^2 is not a good idea, because you never know who
may end up using it for what.

~~~
01100011
> if there is IO or api calls involved

Right. In my case, the operation was an extra memory comparison. For something
already in the cache.

Sure, constraints can change and your assumptions about n<10k may prove
unwise, but that's our call to make as engineers. YAGNI. If you know n is
never going to grow, then why waste time on it? We're not paid to write
pristine code. We're paid to solve problems while hopefully not creating new
ones. Pragmatism and all that.

------
imslavko
This is a comment to demonstrate the differences between DP and memoized
recursion to people in the sibling comments.

When I was learning DP vs memoization, I thought the Floyd-Warshall algorithm,
which finds the shortest path length between all pairs of nodes in a graph,
was a good example of DP that wouldn't work the same way with memoization.

In the FW algorithm, because of the order of filling up the table and
discarding old values on the go, we are able to run the algorithm using only
O(n^2) memory, but if we were to run it in a memoized recursive fashion, I
would guess you would have to make do with O(n^3) memory.

Another example is when a recurrence in DP depends only on the previous row
(if we are going row by row): you can keep around only O(n) memory, swapping
two rows representing the current row and the previous one. In a recursive,
black-boxy memoization method you have to make do with a full O(n^2) memory
usage.
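
To make that concrete, here is a minimal sketch of the row-swapping trick,
using edit distance as the example (my sketch, not from the article):

    
    
      def edit_distance(a, b):
          # bottom-up DP keeping only two rows: prev is row i-1, cur is row i
          prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
          for i, ca in enumerate(a, 1):
              cur = [i]  # distance from a[:i] to ""
              for j, cb in enumerate(b, 1):
                  cur.append(min(prev[j] + 1,                # delete ca
                                 cur[j - 1] + 1,             # insert cb
                                 prev[j - 1] + (ca != cb)))  # match or substitute
              prev = cur  # swap rows: O(n) memory instead of O(n^2)
          return prev[-1]
    
      assert edit_distance("kitten", "sitting") == 3
    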

Finally, there is a technique (Hirschberg's algorithm) to do DP on a table
using O(n) memory instead of O(n^2) while still recovering the optimal path,
by recursively splitting the table into halves and only storing O(1) rows at a
time. This technique is more complicated to explain in an HN comment, though.

Update: I forgot the simplest example: Fibonacci numbers. In the top-down
approach, you would memoize results based on the index of the fib number, so
you would need an array caching O(n) values. But if you build it up bottom-up,
you can get away with using only two variables, previous and current, and then
do something like:

    
    
        prev, cur = 0, 1
        for _ in range(n):
            prev, cur = cur, prev + cur

~~~
agbell
That is interesting but I still don't totally get how DP is different from
simple memoization.

Your fib example can be expressed as corecursion; in fact, it's an example
often used to explain it.

    
    
      val fibsViaUnfold =
          unfold((0, 1)) { case (f0, f1) => Some((f0, (f1, f0 + f1))) }
    
        fibsViaUnfold.take(7).toList shouldBe List(0, 1, 1, 2, 3, 5, 8)
    

That's Scala, but it should work anywhere we can produce a lazy stream of
values.

Here is a Python version from Wikipedia, corecursively generating factorials:

    
    
      def factorials():
          # corecursively produce the stream 1, 1, 2, 6, 24, ...
          n, f = 0, 1
          while True:
              yield f
              n, f = n + 1, f * (n + 1)
    

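For symmetry, a corecursive Fibonacci stream in Python might look like this (my
sketch, analogous to `fibsViaUnfold`):

    
    
      from itertools import islice
    
      def fibs():
          # corecursively produce the Fibonacci stream
          prev, cur = 0, 1
          while True:
              yield prev
              prev, cur = cur, prev + cur
    
      assert list(islice(fibs(), 7)) == [0, 1, 1, 2, 3, 5, 8]
    
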
Is DP a technique to arrive at the corecursive solution?

~~~
zodiac
There's a "simple memoization" version of Fibonacci that takes more space than
`fibsViaUnfold`. I think the term "simple memoization" (and its equivalents
used in these comments) is a bit too imprecise to be useful.

------
candeira
Dynamic Programming and memoization are definitely related techniques, but
they are emphatically _not_ the same. Memoization is a black-box approach that
can be applied to a generic recursive top-down, depth-first algorithm. Dynamic
Programming is about rewriting the recursive top-down algorithm in a bottom-
up, breadth-first manner.

Shriram Krishnamurthi explains it best:

[https://blog.racket-lang.org/2012/08/dynamic-programming-versus-memoization.html](https://blog.racket-lang.org/2012/08/dynamic-programming-versus-memoization.html)

~~~
chillee
This is one definition, but I don't think it's the common one. The more common
definition is that dynamic programming refers to solving a complicated problem
by breaking it up into simpler overlapping subproblems that can be solved
independently.

Solving it with recursion/memoization vs. bottom-up is merely an
implementation detail, while DP refers to a class of algorithms.

EDIT: Corrected definition of DP.

~~~
rifung
> The more common definition is that dynamic programming refers to solving a
> complicated problem by breaking it up into simpler subproblems that can be
> solved independently

I don't think that's sufficient? I thought DP also implies you actually reuse
the answers from subproblems.

From
[https://en.m.wikipedia.org/wiki/Dynamic_programming](https://en.m.wikipedia.org/wiki/Dynamic_programming)

"There are two key attributes that a problem must have in order for dynamic
programming to be applicable: optimal substructure and overlapping sub-
problems. If a problem can be solved by combining optimal solutions to non-
overlapping sub-problems, the strategy is called "divide and conquer" instead"

~~~
chillee
Yeah, you're right. The subproblems must overlap.

------
sanderjd
I have always felt that dynamic programming is only confusing because of the
name. The concept of caching previously calculated values in order to save
time by avoiding recalculation (at the cost of using more space) is intuitive.

What am I missing?

~~~
clickok
Richard Bellman coined the name, and according to legend it was because the
phrase 'dynamic programming' was so anodyne that not even the most officious
bureaucrat could object.

In Bellman's own words[0]:

"An interesting question is, ‘Where did the name, dynamic programming, come
from?’ The 1950s were not good years for mathematical research. We had a very
interesting gentleman in Washington named Wilson. He was Secretary of Defense,
and he actually had a pathological fear and hatred of the word, research. I’m
not using the term lightly; I’m using it precisely. His face would suffuse, he
would turn red, and he would get violent if people used the term, research, in
his presence. You can imagine how he felt, then, about the term, mathematical.
The RAND Corporation was employed by the Air Force, and the Air Force had
Wilson as its boss, essentially. Hence, I felt I had to do something to shield
Wilson and the Air Force from the fact that I was really doing mathematics
inside the RAND Corporation. What title, what name, could I choose? In the
first place I was interested in planning, in decision making, in thinking. But
planning, is not a good word for various reasons. I decided therefore to use
the word, ‘programming.’ I wanted to get across the idea that this was
dynamic, this was multistage, this was time-varying—I thought, let’s kill two
birds with one stone. Let’s take a word that has an absolutely precise
meaning, namely dynamic, in the classical physical sense. It also has a very
interesting property as an adjective, and that is it’s impossible to use the
word, dynamic, in a pejorative sense. Try thinking of some combination that
will possibly give it a pejorative meaning. It’s impossible. Thus, I thought
dynamic programming was a good name. It was something not even a Congressman
could object to. So I used it as an umbrella for my activities."

-----

0. From [http://arcanesentiment.blogspot.com/2010/04/why-dynamic-programming.html](http://arcanesentiment.blogspot.com/2010/04/why-dynamic-programming.html)

~~~
roryokane
A link to the original paper that your source quotes:
[https://web.archive.org/web/20060209011347/http://www.eng.ta...](https://web.archive.org/web/20060209011347/http://www.eng.tau.ac.il/~ami/cd/or50/1526-5463-2002-50-01-0048.pdf).
“Richard Bellman on the Birth of Dynamic Programming” by Stuart Dreyfus.

------
ram_rar
After spending a lot of time interviewing candidates, I have pretty much come
to the conclusion that DP problems don't give good signals about a candidate's
problem-solving skills. In my experience, less than 5% of candidates can
genuinely crack problems using DP. Most of them either give me a memorized
solution or just plain give up.

If you are interviewing for a regular CRUD job, aka a web application, there
are so many other problems which can give a much more refined signal about a
candidate's skill. Please, for the love of God, don't ask DP. Unless you
actually use it at work.

------
vardhanw
In the first problem illustrated, I also wanted to get the list of coins. I
made this attempt, but it hardly seems elegant. Any comments to make it
better?

    
    
      n = 10
      denom = [1, 3, 4]
      dp = [-1 for i in range(n+1)]   # minimum number of coins for each amount
      dpl = [[] for i in range(n+1)]  # the actual coins used for each amount
      def f(n):
          if dp[n] != -1:
              return dp[n]
          ans = 10**10
          if n <= 0:
              return 0
          mini = n
          for i in denom:
              if (n-i) >= 0:
                  new = f(n-i) + 1
                  if new <= ans:
                      mini = i
                      ans = new
          dpl[n].append(mini)
          if mini != n:
              dpl[n] = [item for sublist in [dpl[n], dpl[n-mini]] for item in sublist]
          dp[n] = ans
          return ans
    
      if __name__ == "__main__":
          ans = f(n)
          print(ans, dpl)

~~~
pwaivers
To be more elegant, you can remove the "dp" array. If you want to keep track
of the full list, then you only need "dpl". Here is code that I wrote, and it
works:

    
    
      n = 10
      denom = [1, 3, 4]
      dpl = [[] for i in range(n+1)]
      def f(n):
          if dpl[n]:
              return dpl[n]
      
          if n <= 0:
              return []
    
          ans = list(range(n))  # this is the max size possible
          for i in denom:
              if n-i >= 0:
                  new = f(n-i) + [i]  # append i to the end of the array
                  if len(new) <= len(ans):
                      ans = new
    
          dpl[n] = ans
          return ans
    
      if __name__ == "__main__":
          sol = f(n)
          print(sol)

------
tyingq
So "DP" is just recursion with memoization? Or am I missing another piece?

Edit: apparently I'm not the only one thinking this:
[https://news.ycombinator.com/item?id=19395862](https://news.ycombinator.com/item?id=19395862)

~~~
mrburton
I think many people struggle with understanding "Dynamic Programming."
Hopefully the following clears it up for you.

Dynamic Programming can be done using Memoization (top-down; aka recursively)
or Tabular method (bottom-up). So what's the difference?

When you see top-down, it means to start from the root of the problem and
recursively descend to the base case. As you pop back up the stack, you either
calculate and store the result or look up the value in the cache. E.g., in the
Fibonacci sequence, check whether fib(4) was already calculated. No? Calculate
and store it, so the next time you come across it you can use the result and
not worry about processing fib(1), fib(2), fib(3), etc.

When you see bottom-up, think about filling out a table from the upper-left
corner, cell by cell. To compute each entry quickly, you look at prior values:
the cell above, the cell to the left, or the cell diagonally up and to the
left. I know this sounds a bit strange, but if you solve the following
problems, you'll see a repeating pattern.

Edit Distance, 0/1 Knapsack, Rod Cutting, Longest Common Subsequence, Longest
Path in a Matrix, Coin Change.
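
As a concrete instance of the bottom-up table-filling just described, here is
a minimal sketch for Longest Common Subsequence (my example, assuming the
standard recurrence):

    
    
      def lcs_length(a, b):
          # table[i][j] = length of the LCS of a[:i] and b[:j]
          m, n = len(a), len(b)
          table = [[0] * (n + 1) for _ in range(m + 1)]
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  if a[i - 1] == b[j - 1]:
                      table[i][j] = table[i - 1][j - 1] + 1  # diagonal neighbor
                  else:
                      table[i][j] = max(table[i - 1][j],     # cell above
                                        table[i][j - 1])     # cell to the left
          return table[m][n]
    
      assert lcs_length("ABCBDAB", "BDCABA") == 4  # e.g. "BCBA"
    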

I've been writing about this in detail. Eventually I'll publish my writing to
help others. I solve ~20 problems using Memoization and the Tabular method. As
I solve each problem, I compare the solutions with prior problems, showing the
pattern. What I want to do is help people spot "patterns" vs. memorizing
algorithms that are very problem-specific.

~~~
zestyping
Thanks for helping clear this up.

It seems that many dynamic programming solutions can be arrived at by starting
with a recursive formulation, adding memoization, and then optimizing the
cache based on specialized knowledge of the problem. For example:

1. Q: How can you compute Fibonacci number f(n) recursively? A: f(n) = f(n-1)
+ f(n-2)

2. Q: How can you memoize the results of your recursive function(s) to
dramatically reduce the number of calls? A: Make a cache keyed on n that
stores f(n).

3. Q: Given specialized understanding of the problem, how can you minimize
the size of the cache? A: Notice that you don't need to keep all O(n) slots;
you only need to keep two ints.
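
Those three questions map directly onto three versions of the code; here is a
sketch of my own that mirrors the steps:

    
    
      from functools import lru_cache
    
      # Stage 1: plain recursion, exponential time.
      def fib_rec(n):
          return n if n < 2 else fib_rec(n - 1) + fib_rec(n - 2)
    
      # Stage 2: memoize on n, linear time, O(n) cache.
      @lru_cache(maxsize=None)
      def fib_memo(n):
          return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
    
      # Stage 3: shrink the cache to two ints, O(1) space.
      def fib_iter(n):
          prev, cur = 0, 1
          for _ in range(n):
              prev, cur = cur, prev + cur
          return prev
    
      assert fib_rec(10) == fib_memo(10) == fib_iter(10) == 55
    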

Can every dynamic programming solution be explained this way? Or is there a
good example of a dynamic programming problem for which you really need to
make a leap that can't be sensibly reached through this sequence of three
questions?

------
akhilcacharya
The problem with DP problems (for me) is that there seems to be a really large
set of unique DP problems, and coin change, knapsack, and grid DP problems
like counting paths are only a small subset of them. What's more, the rest of
the problems can't easily be derived from the approaches used for these...
there might be dozens of problem classes to understand!

~~~
ralusek
That might be the only dynamic thing about the otherwise terribly-named set of
problems.

------
adamnemecek
I’ve found this blog post to be the best of the bunch:
[http://blog.ezyang.com/2010/11/dp-zoo-tour/](http://blog.ezyang.com/2010/11/dp-zoo-tour/)

~~~
ctchocula
This way of illustrating categories of DP problems seems really intuitive to
me. Thanks for sharing.

------
neilwilson
Thirty years after it was first highlighted, we still don’t ask a juggler to
actually juggle before hiring them.

Admittedly, we have moved on from simply talking about the balls to getting
candidates to arrange the balls in a particular order. Still not juggling,
though.

(For those that don’t get the reference: it’s a chapter in Peopleware.)

------
js4ever
DP is clearly more related to maths than programming. If you are looking for a
mathematician it make sense... But if you are looking for a good real world
developer it's completely wrong!

------
mesarvagya
One of the techniques as described in CLRS is to first find the subproblem
graph.

Consider Fibonacci sequence: f(0) = 0; f(1) = 1; f(n) = f(n-1) + f(n-2)

If we solve it naively, the complexity will be O(1.6^n).

Now we can solve it in DP using two ways:

1. Top Down: Instead of recursively computing the same subproblem again, just
store the value of the computation and look it up when needed. That's it.

2. Bottom Up: One cannot come up with a bottom-up formulation directly, unless
we identify the recursive pattern and the subproblems. Once we have the
recursive pattern, just plot the graph for some values and we can identify the
overlapping subproblems and the overall graph.

Constructing the graph for fib(5), we can see that the solutions to fib(4) and
fib(3) are needed. Therefore, we need to find fib(3) and fib(4) before even
solving for fib(5). Once we identify this graph, we can go bottom-up: we solve
the base case (the trivial, first node in the graph) and construct our
solution based on it.

Therefore, the easy approach is to solve the problem top-down first. Once that
is done, we can identify the subproblem graph and construct the bottom-up
solution.

~~~
akhilcacharya
This is all well and good for Fibonacci, but it feels like the difficulty
becomes exponentially greater when you start running into more complicated
optimization strategies. Text justification is an example:

[0]
[http://courses.csail.mit.edu/6.006/fall09/lecture_notes/lect...](http://courses.csail.mit.edu/6.006/fall09/lecture_notes/lecture20.pdf)

------
JJMcJ
Coin changing: my understanding is that if each denomination of coin is at
least twice that of the next smaller one, then greedy is optimal. Note to
self: see if you can prove it.

That is the case for current US coinage: 1, 5, 10, 25, 50, and 100 cents.
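
For what it's worth, the doubling condition alone turns out not to be
sufficient in general: with denominations {1, 5, 11}, where 11 >= 2*5, greedy
makes 15 as 11+1+1+1+1 while 5+5+5 is optimal. US coinage does happen to be
greedy-optimal, though. A quick brute-force comparison (a sketch of mine)
makes such conjectures easy to test:

    
    
      def min_coins(amount, denoms):
          # exact DP answer: fewest coins that sum to `amount`
          INF = float("inf")
          dp = [0] + [INF] * amount
          for a in range(1, amount + 1):
              dp[a] = min((dp[a - d] + 1 for d in denoms if d <= a), default=INF)
          return dp[amount]
    
      def greedy_coins(amount, denoms):
          # always take the largest coin that still fits
          count = 0
          for d in sorted(denoms, reverse=True):
              count += amount // d
              amount %= d
          return count if amount == 0 else float("inf")
    
      us = [1, 5, 10, 25, 50, 100]
      assert all(greedy_coins(a, us) == min_coins(a, us) for a in range(1, 200))
      assert greedy_coins(15, [1, 5, 11]) == 5  # doubling holds, greedy loses
      assert min_coins(15, [1, 5, 11]) == 3
    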

------
zerocool2750
In the knapsack example they say...

> "No global variables should be modifed in the function"

Am I wrong, or does the author immediately go on to modify the global variable
'dp'?

p.s. @author, typo in that sentence: 'modifed'

~~~
Aditya_Ramesh
Thanks for pointing out the typo! I'll fix that right away.

And in regards to the "global variable", my variable 'dp' is my cache array.
It's where I'm storing my precomputed results.

If I don't modify it, it's not DP anymore, it's plain recursion. :)

I guess I should've mentioned that the actual memoization table is an
exception.

Anyway, thanks for reading my article! :)

~~~
zerocool2750
Totally makes sense I just think it's slightly confusing in that context.
Anyway, awesome article I enjoyed the read!

~~~
Aditya_Ramesh
Thank you! I'll make sure to go over some non-classical problems in my next
article on DP to add more value :)

------
mlthoughts2018
I never much enjoyed dynamic programming, and I do think it’s a poor choice
for timed interview questions, but I did become more interested in it when I
realized there are patterns to the cache strategies that can be used to group
problems: for example, the usual matrix raster-fill approach for, e.g., 0-1
knapsack, or, separately, using two cursors to fill the upper triangle of a
matrix, as for longest common subsequence and for the optimal order of
associative computation (like matrix chain multiplication).

------
lowdest
I spent 2-3 weeks going through all the DP problems on Leetcode after work. I
got the Tetris effect, where I was starting to hallucinate/dream DP problems
and solutions as I fell asleep at night.

I did this specifically for interviewing. As I've said before, these kinds of
interviews screen for unusual geniuses or for people with the motivation to
spend many hours studying, which are two acceptable types of hires.

------
bjs250
>"I'll show you how to do DP"

Hey, it's the same 3 example problems that are in every textbook,
GeeksforGeeks, Leetcode, etc.

~~~
Aditya_Ramesh
Thanks for taking the time to read my post. :) Like I mentioned, this article
only lays the groundwork. The next article I intend to write is on DP +
strings (for example, finding the longest substring of a string which is a
subsequence of another, etc.).

I'm starting with classical problems and I'll soon diverge into non-classical
problems, as I've mentioned in my article too.

In any case, I hope the other articles on my blog added more value for you in
comparison!

~~~
bjs250
Yeah, I was being snarky up above, but will definitely look forward to the
future articles.

------
petermcneeley
The knapsack problem here is of particular interest since despite the
optimization problem being NP-hard the solution can in fact be found in O(n *
W). This feels similar to how the theoretical best comparison sort is O( n log
n) but radix can do this in O( n).

~~~
mesarvagya
Knapsack is still an NP-hard problem. Even though the DP solution looks
polynomial, it has pseudo-polynomial time complexity [1][2].

[1] [https://en.wikipedia.org/wiki/Knapsack_problem#Computational_complexity](https://en.wikipedia.org/wiki/Knapsack_problem#Computational_complexity)

[2] [https://stackoverflow.com/questions/4538581/why-is-the-knapsack-problem-pseudo-polynomial](https://stackoverflow.com/questions/4538581/why-is-the-knapsack-problem-pseudo-polynomial)
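
To illustrate, here is a minimal sketch of the O(n * W) table (my code, the
standard 0/1 knapsack recurrence). The runtime is polynomial in the numeric
value W, but W occupies only log2(W) bits of input, so the algorithm is
exponential in the input's bit-length:

    
    
      def knapsack(values, weights, W):
          # dp[w] = best value achievable within capacity w
          dp = [0] * (W + 1)
          for v, wt in zip(values, weights):
              for w in range(W, wt - 1, -1):  # downwards, so each item is used once
                  dp[w] = max(dp[w], dp[w - wt] + v)
          return dp[W]
    
      assert knapsack([60, 100, 120], [10, 20, 30], 50) == 220
    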

------
zestyping
Is dynamic programming the same as memoized recursion, or does it include
techniques beyond that?

~~~
gowld
Consider Fibonacci. Memoized recursion uses O(n) memory, because you never
garbage-collect anything. Bottom-up dynamic programming is O(1) memory,
because you only need to remember the two most recently computed values.

------
golergka
So, in all of these examples, the author basically turns the set of all
possible solutions into a tree (where moving from parent to child corresponds
to making a possible decision) and then recursively traverses this tree to
find the best option?

------
sramij
This is poorly written. I was hoping Hacker News would have better scrutiny of
articles.

------
albertzeyer
Note that the memory requirement of the Coin Change solution is O(N), and that
is not optimal. You should be able to get away with a sliding window of the
last max(denom) values (you don't need to cache everything).
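
Something like the following sketch (mine, not the article's code) keeps only
a window of the last max(denom) values:

    
    
      from collections import deque
    
      def min_coins_windowed(amount, denom):
          INF = float("inf")
          # dp[a - d] always lies within the last max(denom) values, so a
          # bounded deque replaces the full O(amount) dp array
          window = deque([0], maxlen=max(denom))  # initially holds dp[0]
          best = 0 if amount == 0 else INF
          for a in range(1, amount + 1):
              best = min((window[-d] + 1 for d in denom if d <= len(window)),
                         default=INF)
              window.append(best)
          return best
    
      assert min_coins_windowed(10, [1, 3, 4]) == 3  # e.g. 3 + 3 + 4
    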

------
aboutruby
The only company I know of asking for "Dynamic Programming" is LinkedIn; are
there any others?

~~~
thewarrior
Google ?

------
tomerbd
Hi, can you point me to the GitHub source project for this blog? Thanks.

------
jparkie
I agree with others' opinions that asking a Dynamic Programming question is a
weak signal for a candidate's competency in general software development work
like frontend, backend, mobile applications, and the like.

However, I would like to counter a common opinion that eventually follows in
similar threads and some of my social circles: "Algorithms is an undergraduate
course in which students learn specialized solutions to esoteric math
problems, all of which they ultimately forget when they spend real time
working in the industry, so the knowledge shouldn't be relevant in an
interview."

It is fair that if you don't exercise what you learned you will gradually
forget it, but I believe it's still important for a candidate to cherish
algorithm design and analysis, because I consider it a great toolbox of the
trade.

I took the concepts and techniques I learned from my undergraduate course in
data structures and algorithms and made them the basis of my Software
Engineering Toolbox.

What is my Software Engineering Toolbox? It is a collection of algorithm
design concepts and techniques that I can employ anytime I am faced with a
novel problem or a problem whose standard Stack Overflow solutions are
inadequate.

The Software Engineering Toolbox contains the following: Arrays, Linked List,
Stack, Queue, Hash Table, Binary Search Tree, Priority Queue, Set, Trie,
Sorting, Binary Search, Divide-and-Conquer, Backtracking, Dynamic Programming,
Range Query Trees, Graph Algorithms, Bit Mask Optimizations, Square Root
Bucket Optimizations, and Multiple Pointer Optimizations.

First, I rarely implement my own data structures from scratch; all the
programming languages that I use provide great standard libraries. Yet I
always remind myself of the uses of these data structures, because you would
be surprised by the number of people who try to answer a problem that boils
down to set membership with a HashMap<Integer, Boolean> when they could just
use a HashSet<Integer>, or by the number of people who manually treat an Array
as a Stack or a Queue when those data structures are readily available.

Second, I rarely implement my own sort or search functions from scratch;
again, all the programming languages that I use provide great optimized
functions. I treat Sorting and Binary Search as techniques that lend
themselves to optimizing the locality of a data set so that you can easily
compute basic statistics, find the bucket for a token in a ring, or merge data
sets. These are simple techniques developers should know exist when optimizing
their code.

Third, why do I have Divide-and-Conquer and Backtracking in my toolbox? I
believe that no matter what problem you face, you should be able to
brute-force it. You can't always tell someone that you can't implement
something because you didn't find a Stack Overflow answer or couldn't piece
together a collage of standard library functions or third-party libraries to
solve your problem. Using these techniques, you can at least arrive at a
pretty weak solution, which is still a solution. To actually relate
Divide-and-Conquer and Backtracking to brute-forcing: these techniques allow
you to easily traverse a search space to filter for a certain combination or
permutation of items that satisfies a customer's constraints. Furthermore,
Backtracking is a relatively easy-to-medium-difficulty technique that is the
basis for a lot of the Graph Algorithms people keep balking at!

Fourth, Dynamic Programming. To be honest, I rarely utilize it, but I
appreciate it because the common subproblem types of 1D, 2D, Range, and
Sub-Trees taught me how to order subproblems successively to solve other
problems, which applies beyond DP. I discourage people from trying to
pattern-match Dynamic Programming problems and solutions, and I encourage them
to truly digest CLRS and understand its four rules for Dynamic Programming: to
consider possible dependencies and structures for various combinations and
permutations of the problem parameters, and to identify what the optimal
substructure really is.

Finally, the remaining things in my toolbox are included because they have
been useful in my work experience with real-time network anomaly detection and
streaming analytics. For example, topologically sorting distributed tracing
events into a rooted tree that I encode into a bit vector using a left-child
right-sibling binary tree. Not everyone will do this, but with my toolbox I
never worry much about facing new frontiers of problems, or about being tasked
to create libraries and tools for myself and others to use instead of being at
the whims of someone else on the Internet.

Overall, I hope people can look back at their courses in algorithm design and
analysis and say, "Yeah, the problems and the solutions were really weird, but
the techniques hidden away within them are actually GENERALIZABLE and are a
fundamental basis for building new things and solving complex problems!"

Nonetheless, I don't want anyone who is weak in algorithm design and analysis
to feel discouraged. Play to your own strengths, whatever they may be, or you
can always strengthen them; it's never too late.

Finally, my Software Engineering Toolbox has way more stuff, like actual
"engineering" stuff: automatic formatters, linters, fuzzers, automation,
tests, mocks, coverage, "Infrastructure as Code", and blah blah blah. :")

I would like to close by saying that a good engineer knows the right tools for
the job. :)

