
The Knapsack Problem Is All Around Us - jgrahamc
https://www.smithsonianmag.com/science-nature/why-knapsack-problem-all-around-us-180974333/
======
samatman
What I find interesting about the knapsack problem, and the traveling salesman
problem as well, is that they're fast to satisfy, but incredibly time-
consuming to solve.

Once upon a time, I had a 4x8 sheet of small plastic parts to route out on a
ShopBot. The naive layout from the vector drawing program was incredibly
inefficient, and we ended up downloading a general-purpose traveling-salesman-
solver written in Java.

It converged upon a dramatically better solution in a few seconds, and didn't
really improve it in the next few minutes. So we just stopped it and ran the
good-enough route, while leaving the solver running for two days out of pure
curiosity.

Now, it was _visibly_ not perfect: there were a couple of skips where we could
see a shorter way to do it. And it did eventually find those routes.

So if all you need is a reasonable route for your traveling salespersons, or
an acceptably efficient packing arrangement for your shipping container, this
is achievable and easy, if you're willing to leave that last 1-2% sitting on
the table.
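
For a sense of why a good-enough route can show up so quickly, here is a minimal Python sketch of one common heuristic pairing: a nearest-neighbor starting tour improved by 2-opt segment reversals. It is not the Java solver mentioned above, whose internals aren't described; `nearest_neighbor_tour`, `two_opt`, and the random test points are all illustrative assumptions.

```python
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def nearest_neighbor_tour(points):
    """Greedy start: always move to the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points, tour):
    """Reverse segments of the tour for as long as doing so shortens it."""
    best = tour[:]
    best_len = tour_length(points, best)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(points, candidate)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
points = [(rng.random(), rng.random()) for _ in range(40)]
start = nearest_neighbor_tour(points)
better = two_opt(points, start)
print(tour_length(points, start), tour_length(points, better))
```

2-opt only guarantees a local optimum, which is consistent with a route that looks slightly off yet takes a long time to improve further.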

~~~
Jedd
> It converged upon a dramatically better solution in a few seconds, and
> didn't really improve it in the next few minutes.

Which in turn is, I _think_, a manifestation of the Secretary Problem.

[https://en.wikipedia.org/wiki/Secretary_problem](https://en.wikipedia.org/wiki/Secretary_problem)

~~~
karussell
I think this is more the general behaviour of heuristics. They are not
guaranteed to give you the optimal solution as they are forced to 'converge'.

------
klenwell
First time I encountered this in the wild was years ago when my sister, who
worked as an auditor, approached me with a problem she hoped I could help her
solve. She would often be given a spreadsheet full of values and she would
have a target or aggregate value she was trying to match. Could I write a
script to find the set of values from the spreadsheet that would add up to her
target value?

Piece of cake, I thought. After a couple hours of fruitless coding I realized
some research might help. After a few more hours of googling, I finally
figured out the name for this type of problem. Actually, I first came across
the term "subset sum problem" which led to "knapsack problem". That led to an
introduction to the concept of "nondeterministic polynomial time".

Eventually I found a diophantine algorithm someone had written in Rexx and
managed to translate it to Python. It worked! Sorta. (I was surprised by how
many different matching combinations a random set of numbers could generate
for a given value.)
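
To illustrate the kind of search involved (this is not the Rexx algorithm mentioned above, which isn't shown), here is a small Python backtracking sketch that lists every combination of values adding up to a target. It assumes positive integer amounts, for example cents; the function name `matching_subsets` and the sample figures are made up for the example.

```python
def matching_subsets(values, target):
    """Return every list of indices whose values sum exactly to target.

    Assumes all values are positive integers (e.g. amounts in cents),
    so a branch can be abandoned as soon as it overshoots the target.
    """
    matches = []

    def search(start, remaining, chosen):
        if remaining == 0 and chosen:
            matches.append(list(chosen))
            return
        for i in range(start, len(values)):
            if values[i] <= remaining:
                chosen.append(i)
                search(i + 1, remaining - values[i], chosen)
                chosen.pop()

    search(0, target, [])
    return matches


amounts = [1250, 300, 950, 700, 550, 400]  # hypothetical spreadsheet values
for combo in matching_subsets(amounts, 1950):
    print([amounts[i] for i in combo])
```

Even this short list yields more than one matching combination, and the worst-case running time grows exponentially with the number of values.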

By the time I returned to my sister with my solution, I think she had found an
Excel plugin that did it for her.

~~~
andreareina
Subset sum is something that comes up quite often when bookkeepers need to
reconcile accounts and there are some transactions missing. And indeed, the
Excel solvers work well enough. These days I'm no longer surprised to find out
that a classic selection problem has an Excel plug-in to solve it. There's
even one for the multiple knapsack apparently.

~~~
larrydag
You can create a solution for the knapsack problem using Excel Solver.

------
_bxg1
I've always thought of it as "the Skyrim problem", since it's precisely what
you do in any RPG with tons of loot and finite carrying capacity.

My mental algorithm is:

1) Set a value/weight ratio above which new items are picked up

2) If I run out of space and find something above that threshold, start
dropping the worst-ratio'd items to make room

3) Raise the threshold accordingly

4) Repeat
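
The four steps above could be sketched in Python roughly as follows. This is one possible reading of the mental algorithm, not a definitive one: the function name `pick_up`, the `(name, value, weight)` tuple layout, and the sample loot are assumptions, and weights are assumed to be positive.

```python
def pick_up(found_items, capacity, threshold=0.0):
    """Process items in the order found; return (bag, final threshold).

    Each item is a (name, value, weight) tuple with weight > 0.
    """
    bag = []
    load = 0
    for name, value, weight in found_items:
        # Step 1: ignore anything at or below the current ratio threshold.
        if value / weight <= threshold or weight > capacity:
            continue
        bag.append((name, value, weight))
        load += weight
        # Step 2: over capacity, so drop the worst-ratio items to make room.
        while load > capacity:
            worst = min(bag, key=lambda item: item[1] / item[2])
            bag.remove(worst)
            load -= worst[2]
            # Step 3: raise the threshold to the ratio just discarded.
            threshold = max(threshold, worst[1] / worst[2])
    # Step 4 is the loop itself: repeat for every new item found.
    return bag, threshold


loot = [("sword", 20, 8), ("gem", 50, 1), ("armor", 30, 6)]  # made-up loot
bag, threshold = pick_up(loot, capacity=10)
print([name for name, _, _ in bag], threshold)
```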

~~~
dahfizz
While this is usually a decent approximation, it's important to note this is
not an optimal solution. For example:

Your bag can hold 10 kg. Item a weighs 10kg and is worth $10. Item b weighs
5kg and is worth $6.

Item b is more value dense, but picking up item a is optimal.
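
The example can be checked with a short Python sketch that compares a density-first greedy pick against the textbook dynamic programming solution for 0/1 knapsack with integer weights. The function names are illustrative.

```python
def greedy_by_density(items, capacity):
    """Take items in order of value/weight while they still fit."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total += value
    return total


def best_value(items, capacity):
    """0/1 knapsack by dynamic programming over integer capacities."""
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Go downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


items = [(10, 10), (6, 5)]  # (value in $, weight in kg): item a, item b
print(greedy_by_density(items, 10))  # picks b only
print(best_value(items, 10))         # picks a
```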

~~~
mjevans
This is true, but in the Skyrim case nearly all items are small enough that a
'near optimal' answer is the indicated algorithm, with that extra space a more
literal 'garbage collection' free zone.

In videogames, making the user GC isn't fun, and therefore reducing the number
of GCs is more optimal than strictly solving the problem.

------
daenz
The Knapsack Problem is what originally led me down the path of Linear
Programming, which is mind-blowingly cool. So cool, in fact, that in the
1950s, LP was used (along with other techniques) to solve a 49-city Travelling
Salesman Problem. They didn't even have to check all possible solutions(!?):
they used the principles of linear programming duality to prove that their
solution was optimal.

~~~
froindt
My optimization class was one of the most interesting classes of my undergrad
(industrial engineering). My professor was trying to explain why optimization
was important.

"Optimizing your solution ensures someone else can't come into your industry,
get the same suppliers and contract terms, and beat you"

We then went through an example for an oil refinery with different sources of
oil (differing proportions of sweet and sour, and quantities available),
refinery production constraints, and market prices for different end products.
He showed the difference between a "naive manual optimization" and the
mathematical optimal.

Ever since that class, I'm particular about what "optimize" means. Factories
talk about optimizing many things at the same time (on-time delivery, profit,
throughput on a machine, throughput on the bottleneck machine, minimizing
labor). I can't tell you how many times I've said "pick one thing to optimize
- you're not going to hit them all at once". You can make an objective
function a weighting of all those factors, but they won't all be at their best
possible values.

~~~
carlmr
> I can't tell you how many times I've said "pick one thing to optimize -
> you're not going to hit them all at once".

What to optimize is actually probably one of the most interesting optimization
problems out there. In a factory you always need to search for the bottleneck.
That's why I think Kaizen is so important. Apply incremental optimization to
the biggest issues and you will succeed.

~~~
froindt
Absolutely! My company makes thousands of unique specs with a constantly
changing mix. The bottleneck is a moving target as business conditions change.
It makes things more challenging for sure, but also keeps you on your toes.

------
tcgv
I first encountered the Knapsack problem at a job interview. Intuitively I
tried to solve it using what I later found out to be the greedy approximation
algorithm [1], which failed to find the optimum for the interview's sample
input.

Since I didn't have much time left I switched to an inefficient brute force
approach, basically a test of all possible combinations, which wasn't well
received by the interviewer back then since he was expecting a recursive
dynamic programming solution [2].

When I started a blog last year I decided to revisit this problem while
writing about complexity classes of problems, and tried to demonstrate how to
reuse a Knapsack solver to solve another NP-complete problem, the partition
problem [3] ;)
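
A minimal Python sketch of that kind of reduction, assuming non-negative integers (this is not the code from the linked post): give every number a value equal to its weight, set the capacity to half the total, and a partition exists exactly when the knapsack can be filled completely. The names `knapsack` and `can_partition` are illustrative.

```python
def knapsack(items, capacity):
    """Best total value for 0/1 knapsack; items are (value, weight) pairs."""
    best = [0] * (capacity + 1)
    for value, weight in items:
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


def can_partition(numbers):
    """True if numbers can be split into two groups with equal sums."""
    total = sum(numbers)
    if total % 2:
        return False  # an odd total can never be split evenly
    half = total // 2
    # Value == weight, so filling the knapsack to 'half' means one group
    # sums to exactly half and the leftovers form the other group.
    return knapsack([(n, n) for n in numbers], half) == half


print(can_partition([3, 1, 1, 2, 2, 1]))  # 3 + 2 on one side
print(can_partition([1, 5]))
```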

It's indeed a very interesting and fun problem to tackle!

[1]
[https://en.wikipedia.org/wiki/Knapsack_problem#Greedy_approx...](https://en.wikipedia.org/wiki/Knapsack_problem#Greedy_approximation_algorithm)

[2]
[https://github.com/TCGV/Knapsack/blob/master/Tcgv.Combinator...](https://github.com/TCGV/Knapsack/blob/master/Tcgv.CombinatorialOptimization/KnapsackProblem/RecursiveDynamicSolver.cs)

[3] [https://thomasvilhena.com/2019/08/complexity-classes-of-prob...](https://thomasvilhena.com/2019/08/complexity-classes-of-problems)

------
hiker
Integer factorization is also reducible to Knapsack:

To factorize an integer N, invoke a Knapsack solver with a knapsack size of
log(N) and items whose sizes are the logarithms of all prime numbers up to N/2
(the larger factor can exceed sqrt(N)): [log 2, log 3, log 5, ...].

If N=pq (say p and q are prime) then log(N)=log(p)+log(q).

So from all possible items in the item set, only log(p) and log(q) will fill
the knapsack as tightly as possible, leaving zero empty space in it.
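
As a toy Python illustration of the idea (far from a practical factoring method), the sketch below brute-forces pairs of log-sized items in place of a real knapsack solver. It makes several assumptions: N is a product of exactly two primes, candidate primes run up to N // 2 since the larger factor of such a number can exceed sqrt(N), and an integer check backs up the floating-point comparison. The function names are made up.

```python
import math
from itertools import combinations_with_replacement


def primes_up_to(n):
    """Primes <= n by trial division (fine for toy sizes only)."""
    return [
        p for p in range(2, n + 1)
        if all(p % d for d in range(2, math.isqrt(p) + 1))
    ]


def factor_semiprime(n, tol=1e-9):
    """Find primes p <= q with log(p) + log(q) == log(n), or return None."""
    target = math.log(n)  # the knapsack size
    candidates = primes_up_to(n // 2)
    # With replacement, so that squares such as 49 = 7 * 7 are covered.
    for p, q in combinations_with_replacement(candidates, 2):
        fills_exactly = abs(math.log(p) + math.log(q) - target) < tol
        if fills_exactly and p * q == n:  # integer check guards float error
            return p, q
    return None


print(factor_semiprime(77))
```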

~~~
stevefan1999
Well, you mentioned integer factorization, which immediately leads me to RSA,
then to PKI, and then to cryptosystems in general. Combining that with the
Knapsack problem, I ultimately arrived at the Merkle-Hellman knapsack
cryptosystem, which has unfortunately been broken [1].

[1]:
[https://en.wikipedia.org/wiki/Merkle%E2%80%93Hellman_knapsac...](https://en.wikipedia.org/wiki/Merkle%E2%80%93Hellman_knapsack_cryptosystem)

------
jonbaer
Probably one of the more practical libraries on the subject I have come across
...
[https://developers.google.com/optimization/bin/knapsack](https://developers.google.com/optimization/bin/knapsack)

~~~
mustntmumble
Do you, or any other HN readers, know of any articles on how to solve the
problem of working out the best Point of Sale checkout promotional offer?

If I have a set of retail promotional offers going such as:

Offer 1: buy two shirts from this set of shirts, and get a 50% discount

Offer 2: buy a shirt from this set of shirts, a jacket from this set of
jackets, and a tie from this set of ties all for a set price of $99

Offer 3: all ties are on sale at half price

I'm trying to figure out if the Google libraries linked above can be used to
solve this problem, but I can't figure out how to convert the offers into
values that can be used by the Google library...

~~~
joshuahutt
This was one of the first problems I tackled as a developer. My naive approach
was to calculate the values of the offers independently, pick the top one, and
then continue to try to apply remaining offers. Obviously, it's not the
optimal solution, but I thought it worked well for the problem, because people
like thinking they got a really big discount, even if their total savings
might be less.

------
beastman82
Autoplay video with sound, no thanks

------
viig99
Scheduling is a knapsack problem.

