Here's an interesting partitioning problem: Split the first N primes into 2 groups such that the difference of products is as small as possible but not 1. If this number is less than the square of the Nth prime, it will be prime. For example:
split 2,3,5 into 2 groups: (2,5) and (3) the difference of their products 2x5-3 = 7.
split 2,3,5,7 into 2 groups (3,7) and (2,5) the difference is 11. Note that (2,7) and (3,5) has a difference of 1 so don't go there.
5x11 - 2x3x7 = 13
Is it always possible to create a partition that produces the N+1th prime?
AFAIK this is a problem of my own conjuring, but I also know that most such things already exist and have a name.
It's an interesting heuristic, but I think your limitation that it be less than the square of the Nth prime is a case of overfitting.
Here's a list of results for the first N primes, for different Ns:
3: 7
4: 11
5: 13
6: 17
7: 107
8: 41
9: 157
10: 1811
11: 1579
12: 18859
13: 95533
14: 310469
15: 1995293
16: 208303
17: 2396687
18: 58513111
19: 299808329
20: 2933961157
21: 3952306763
22: 33298242781
23: 115405393057
I just wrote a script to brute force all combinations here, so the 24th iteration got very slow, but of all those the nonprime ones are n=23, 19, 18, 16, 14, and 13.
So, your rule appears to hold true as long as N is less than the cube of the Nth prime. The actual rule is probably more complex than a simple power, considering 83^5 is far less than the result in N=23.
I think I'll play with this a bit more this evening then ping some mathematician friends about it. They always love playing with weird properties of primes.
>> It's an interesting heuristic, but I think your limitation that it be less than the square of the Nth prime is a case of overfitting.
Maybe, but the result is obviously guaranteed to not be a multiple of any of the N primes. Since the smallest factor has to be greater than N, that means prime for up to N^2 (or (N+1)^2 I suppose).
I don't recall. I don't think I went beyond the use of 32bit integers (or was it 64) when I first looked at it, so not very large. The intermediate products can get quite big quickly. I do remember using a binary number to represent group membership of each prime, and seeing a pattern that broke down after only a few primes.
Just had a great idea for exploring this... If you add the logarithms of the primes in a group you can know how big the product will be. The products must be of similar size, and if so you can do the math mod k where k is a power of 2, like 2^64. This might allow some speedy C code without bignum support.
Great article! This gives an excellent overview of NP-complete problems, the difficulty of constructing hard instances, the critical ratio in determining the solution count, and the analogy to physical systems' phase transitions. Also gives some intuition on why the greedy algorithms can be good enough in pracitce and when they wouldn't be.
This reminds me of a challenge at the Free University Berlin. A lecturer gave the following list of numbers to his students. The task was to find two different subsets that have the same sum:
His point was: It it simple to prove that those subsets must exist, but it is very hard to compute them. To make his point even more clear, he offered some prize money if anyone found an explicit solution.
A friend told me about that challenge, and I tried. With a lot of conceptual work, programming, optimization and some luck I was finally able find such a solution (and received the prize):
You just have to compare the number of subsets with the number of possible sum values.
1) There are 90 numbers, so we have 2^90 subsets.
2) All numbers are < 10^25, so the sum of all 90 numbers is < 90 * 10^25 < 10^27. So each sum of a subset is an integer in the range of 0 to 10^27-1. So we have ≤ 10^27 possible sum values.
Since 2^90 > 10^27, we have more subsets than possible sum values. Hence, there are at least two subsets having the same sum.
Regarding item 1, yes there are 2^90 ways to choose one subset, but since you are comparing two separate subsets, those subsets can't overlap (or else you could just choose the same subset twice!), so it seems like there should be "more" to this calculation? Am I missing something here?
Edit: wait I think I've got it:
Each subset implicitly also chooses it's counter subset, e.g. if you choose the subset consisting of the first 45 numbers, you've also therefore said the other subset consists of the last 45 numbers. Since there are more of these "dual subsets" than possible sums, the pigeonhole principle yadda-yadda.
You are not forced to use all numbers. (And in fact, my explicit solution doesn't use all numbers.) There is no restriction that the union of both subsets must be equal to the full set. Note that this restriction would change the question dramatically. See also: https://news.ycombinator.com/item?id=10023098
Also note that it is allowed for both chosen subsets to overlap. In fact, the proof just says that two different (i.e. not entirely equal) subsets with the same sum exist.
However, once you have a pair of different overlapping subsets of the same sum, you can simply remove the intersection from both sets. Both sums decrease by the same amount. You then get a pair of two disjoint sets that have the same sum.
And no, I don't think the proof would not work when requiring the two subsets to form a partition. Also note that for partitions, the number of sums would be one, as the only possible sum is half of the sum of all numbers. (And the problem would be unsolvable if the sum of all numbers was odd, but all numbers were still integers.)
The first heuristic algorithm that jumps to my mind to solve this problem goes like this:
First sort the numbers into a list.
Second take elements from the head of the list to construct a subset A.
Third take elements from the tail to construct a subset B
The second and third steps are done in such a way that the sum of A and B are nearly equal. Then one can hope that this procedure make the elements that are still in the list to be near identical and so allow you to construct nearly identical sums.
The hope is that the remaining elements in the list l are similar in the sense that max(l)/min(l) is small, the heuristic could be improved one you try to make a trade off between max(l)/min(l) in the list and the difference between sum A and sum B.
Stephen Mertens Paper * The Easiest Hard Problem: Number Partitioning* is a good review paper of the subject covering the phase transition between P and NP this problem exhibits and extending it to k-partitions.
At first his language was a little confusing -- when I saw "set" and then saw two 2s listed in his set. Are "set" and "multiset" often this interchangeable?
No, they aren't. This is just sloppy wording in the article. The concept of sets is deeply established in mathematics and when you talk about multisets rather than sets you have to state this explicitly.
In this, case, it's trivial to word the problem a "sets of objects, each with integer size, partition into sets of equal sum size". It is still polite to state that explicitly outside of the sample data, though.
split 2,3,5 into 2 groups: (2,5) and (3) the difference of their products 2x5-3 = 7.
split 2,3,5,7 into 2 groups (3,7) and (2,5) the difference is 11. Note that (2,7) and (3,5) has a difference of 1 so don't go there.
5x11 - 2x3x7 = 13
Is it always possible to create a partition that produces the N+1th prime?
AFAIK this is a problem of my own conjuring, but I also know that most such things already exist and have a name.