
A Short Guide to Hard Problems - sohkamyung
https://www.quantamagazine.org/a-short-guide-to-hard-problems-20180716/
======
ouid
It’s worth mentioning that the discovery of a deterministic polynomial time
algorithm to determine primality is pretty recent, 2002.
[https://en.m.wikipedia.org/wiki/AKS_primality_test](https://en.m.wikipedia.org/wiki/AKS_primality_test)

~~~
SilasX
It's also worth mentioning that AKS didn't have much impact on practical
cryptography. The probabilistic algorithms for checking primality are (and
were) much faster, and have a tolerably-low chance of failure in selecting
prime numbers for crypto.

I bring this up because, one (heuristic) reason to believe BPP=P is that, in
practice, moving PRIMES from BPP to P didn't make a difference.

~~~
ouid
Someone (recently?) constructed a composite number for which Miller-Rabin
failed for 400 iterations or something.
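
For context, here is a minimal sketch of the random-base Miller-Rabin test in
plain Python (no external library). With k independent random bases, a fixed
composite slips through all k rounds with probability at most 4^-k, which is
part of why constructions that fool many rounds typically target specific,
fixed bases rather than random ones.

```python
import random

def miller_rabin(n: int, rounds: int = 40) -> bool:
    """Probabilistic primality test. Returns False if n is certainly
    composite, True if n is probably prime (error <= 4**-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # fresh random base each round
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a witnesses that n is composite
    return True                          # probably prime
```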

------
daveguy
This seems counter-intuitive to me.

From the article:

>BPP is exactly the same as P, but with the difference that the algorithm is
allowed to include steps where its decision-making is randomized. Algorithms
in BPP are required only to give the right answer with a probability close to
1.

Also from the article:

>Computer scientists would like to know whether BPP = P. If that is true, it
would mean that every randomized algorithm can be de-randomized. They believe
this is the case — that there is an efficient deterministic algorithm for
every problem for which there exists an efficient randomized algorithm — but
they have not been able to prove it.

The reason why I think this seems counter-intuitive:

A BPP algorithm is only required to give the right answer most of the time
(with a probability close to 1). So... a heuristic solution. It seems counter-
intuitive that a not-always-correct algorithm would have a deterministic
counterpart in P. You could get a higher and higher probability of the correct
answer by repeating it N times... so N*O(original). But it seems like that
slice of probability given up by the heuristic shortcut could be significant...
Adding an N multiplier to O(N^2) would still be polynomial, but you still
wouldn't be deterministic, just approaching deterministic.

Maybe I am mixing up heuristics and probability where I shouldn't be, or maybe
there's a good reason to think the randomized algorithm can always be
reformulated as a deterministic one? Because you can approach the answer with
increasing certainty and stay in P? Can someone explain why it is believed BPP = P ?

~~~
SilasX
>Can someone explain why it is believed BPP = P ?

Someone correct me if I'm wrong, but:

One intuition is that if a problem is in BPP, you can repeat the test only a
polynomial number of times to get exponentially high confidence in the
outcome. That means, in effect, arbitrarily high confidence, very quickly.
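
To make that concrete, here is a minimal sketch of accuracy amplification,
where `bpp_algorithm` is a hypothetical stand-in for any BPP decider that is
correct with probability at least 2/3 (not a real API). Repeating it and
taking a majority vote drives the error down exponentially in the number of
rounds, by a Chernoff bound.

```python
import random

def bpp_algorithm(x) -> bool:
    """Stand-in for a BPP decider: correct with probability >= 2/3."""
    correct_answer = True                     # placeholder ground truth
    return correct_answer if random.random() < 2 / 3 else not correct_answer

def amplified(x, rounds: int = 101) -> bool:
    """Run the randomized decider `rounds` times and take the majority
    vote; the error probability decays exponentially in `rounds`."""
    yes_votes = sum(bpp_algorithm(x) for _ in range(rounds))
    return yes_votes > rounds // 2
```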

So you could have situations where you're more confident in the answer to the
question, than in the hardware you're running it on. You would have to believe
in some fundamental separation between "arbitrary confidence" and "proof".

There's the joke about "So this randomized algorithm has a low enough error
rate that we use it in military encryption or multibillion dollar financial
transactions ... but what about theorem proving, where you just can't take any
chances?"

~~~
through_17
Your sketch is correct, but what you're describing is not derandomization. It
is "accuracy amplification" by repeating the test. Each round of the test uses
more random bits, right? So you're converting _more_ random bits into
arbitrary confidence.

The objective of derandomization is to completely remove the need to flip
random bits from the algorithm. If you had a deterministic PRG, you could
completely remove the error from a BPP algorithm. Here is a sketch of why,
where I'm eliding the quantitative details:

A PRG takes as input a short string of truly random bits (a "seed") and
produces as output a long string of "pseudorandom bits". By the definition of
a PRG, an efficient algorithm cannot tell the difference between the PRG's
output and a long string that was really sampled uniformly at random.

So, the obvious way to use this to derandomize a BPP machine M is to toss a
small number of random bits, run them through the PRG, and use the output of
the PRG as the random bits that M would use. We've used the PRG to shrink the
amount of randomness required, but not completely eliminate it.

To fully eliminate the randomness, brute-force over every possible seed, run M
on an input x with each PRG output serving as the random coins, and take the
majority vote. This is a fully deterministic process. It is _efficient_ if the
ratio between the seed length of the PRG and output length of the PRG is such
that 2^(seed length) is still polynomial.
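
Here is a minimal sketch of that brute-force-and-vote step, assuming a
hypothetical `prg(seed_bits, coin_len)` that stretches a short seed into the
coin flips a machine `M(x, coins)` expects; both names are illustrative
parameters, not real APIs. With a seed of length O(log n), the outer loop has
only polynomially many iterations.

```python
from itertools import product

def derandomized(M, x, prg, seed_len: int, coin_len: int) -> bool:
    """Run M(x, coins) once for every possible PRG seed and return the
    majority answer. With seed_len = O(log n) there are only
    2**seed_len = poly(n) seeds, so the whole loop runs in polynomial
    time and uses no randomness at all."""
    votes = 0
    for seed_bits in product((0, 1), repeat=seed_len):
        coins = prg(seed_bits, coin_len)   # pseudorandom coins for this seed
        votes += 1 if M(x, coins) else -1
    return votes > 0
```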

The strategy is _always correct_ because, if there is an input y where this
brute-force-and-vote strategy fails to produce the correct answer, we can use
that input to produce an efficient algorithm that distinguishes between PRG
outputs and truly random strings. Why? Because on random bit-strings, M(y,r)
usually produces the "right answer" on y, by definition of BPP. On PRG
outputs, we just assumed that the majority of answers are wrong, because our
vote got the answer wrong! So M(y,random) and M(y,PRG) will tend to disagree.

This is a contradiction -- we assumed no efficient procedure could distinguish
between PRG outputs and truly random strings.

The above is a hasty sketch, that I hope gets a few of the ideas across. See
Pseudorandom Generators: A Primer
([http://www.wisdom.weizmann.ac.il/~oded/PDF/prg10.pdf](http://www.wisdom.weizmann.ac.il/~oded/PDF/prg10.pdf))
theorem 2.16.

Anyway, if we had good enough lower bounds against procedures with advice (see
other comments) we could produce such PRGs.

------
crsv
This was an interesting read as someone unfamiliar with this space - there
were mentions of researchers trying to prove things about P and its
relationship to other problem classes. Are there any examples of what actual
research is being performed, or how they're experimenting or trying to tackle
these things? Is there practical work involved, or is it more about
mathematical proof from theory?

~~~
through_17
Work on determining the relationships between complexity classes is mostly
theoretical. I've outlined some of the frontier of this work regarding BPP in
another comment. There are two broad classes of exceptions that I know of:

* Where this work intersects cryptography: in order for cryptosystems to be secure, we generally have to assume the difficulty of certain specific problems: a complexity assumption! Recent work (see here: [https://eprint.iacr.org/2015/907](https://eprint.iacr.org/2015/907)) links these assumptions to how easy it is to notice they are false. Intuitively, the easier it is to notice that an assumption is false, the better it is for crypto. A false assumption can be detected and crypto based on this assumption abandoned.

* Where computer search is used to assist proofs: A line of work terminating in [https://eccc.weizmann.ac.il//report/2011/031/](https://eccc.weizmann.ac.il//report/2011/031/) used computer search to show time-space tradeoffs for solving satisfiability.

~~~
commandlinefan
The linked article had an interesting way to phrase the traveling salesman
problem. I've always heard it formulated as "given a set of cities and their
distances, what is the shortest path that takes the salesman through each
city", which wouldn't be NP by the definition given: if you were given the
answer, you couldn't check it in polynomial time (unless you could solve the
problem in polynomial time).

~~~
dorgo
I had a problem with this too. But actually the problem can be stated as an NP
problem: is there a round-trip shorter than k? If you have an algorithm to
solve the NP version, then you can use it to find the shortest round-trip by
repeatedly applying it with different k-parameters.
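
A minimal sketch of that reduction, assuming integer distances and a
hypothetical decision oracle `tour_shorter_than(cities, k)` answering the NP
question "is there a round-trip of length less than k?" (an illustrative name,
not a real API). Binary search over k pins down the optimal tour length with
only logarithmically many oracle calls.

```python
def shortest_tour_length(cities, total_upper_bound, tour_shorter_than):
    """Find the length of the shortest round-trip by binary search over k,
    using the NP decision version as a black box. Assumes integer
    distances, so the answer is pinned down exactly."""
    lo, hi = 0, total_upper_bound            # optimal length lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if tour_shorter_than(cities, mid + 1):   # tour of length <= mid exists?
            hi = mid
        else:
            lo = mid + 1
    return lo
```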

------
danharaj
> This problem is equivalent to the P versus NP problem because if
> (unexpectedly) P = NP, then all of PH collapses to P (that is, P = PH).

It should be noted that the polynomial hierarchy could conceivably collapse at
a higher level even if P!=NP. I wouldn't bet on it though.

------
through_17
The article lists what we'd like to know, but the research frontiers for each
class are also interesting. In each case, how far are we from our objectives?
I'll sketch a little bit of the situation for BPP below.

Simple constructions using hard decision problems would completely derandomize
BPP into P. This is the "hardness to randomness" paradigm: if we could prove
that some problem in exponential time was hard enough for _circuits_, we
could completely derandomize BPP. See
[https://en.wikipedia.org/wiki/Pseudorandom_generator#Pseudor...](https://en.wikipedia.org/wiki/Pseudorandom_generator#Pseudorandom_generators_for_derandomization)
for more information on this approach.

Unfortunately, the hardness to randomness program is far from realization.
Circuit lower bounds are hard to prove; the best known lower bounds separate
NONDETERMINISTIC quasi-polynomial (n^(log n)) time from a circuit class with
stringent structural restrictions (no threshold gates, constant depth,
polynomial size). See
[https://eccc.weizmann.ac.il/report/2017/188/](https://eccc.weizmann.ac.il/report/2017/188/)
for the state of the art, and
[https://eccc.weizmann.ac.il//eccc-reports/1994/TR94-010/index.html](https://eccc.weizmann.ac.il//eccc-reports/1994/TR94-010/index.html)
for formal evidence that circuit lower bounds are difficult to prove.

On the other hand, if we are willing to accept "average-case" derandomization,
we appear closer to BPP "is basically equal to" P. By "average-case"
derandomization, I mean that the deterministic version of the algorithm is
wrong on some inputs, but it is impossible to sample those inputs efficiently.
This is actually another application of the hardness to randomness paradigm:
if we knew EXP != BPP, we would obtain an average-case derandomization of BPP
into _sub_-exponential time. This is not quite what we want, of course: it is
an open problem to improve that to fully-polynomial time! See
[https://link.springer.com/article/10.1007%2Fs00037-007-0233-...](https://link.springer.com/article/10.1007%2Fs00037-007-0233-x)
for the state of the art in this approach.

Because separations like EXP vs BPP seem easier than circuit lower bounds (as
mentioned in the article, we DO know EXP != P), average-case derandomization
seems closer than full derandomization.

For the record, EXP contains BPP, because we can just enumerate over all
possible random strings, compute the probability of acceptance, and answer
accordingly. I don't think the article mentioned this trivial derandomization.
So, that's why getting BPP into even _sub_-exponential time is an interesting
result.
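
For illustration, a minimal sketch of that trivial derandomization, with
`M(x, coins)` again a stand-in for the BPP machine and `coin_len` its number
of coin flips. Unlike the PRG-based sketch above, this enumerates every
possible random string, so it removes the randomness and the error entirely,
but at exponential cost, which is why it only places BPP inside EXP.

```python
from itertools import product

def exp_time_derandomized(M, x, coin_len: int) -> bool:
    """Decide x by running M on every possible random string and checking
    whether a majority of them accept. Exponential in coin_len."""
    accepting = sum(M(x, coins) for coins in product((0, 1), repeat=coin_len))
    return accepting * 2 > 2 ** coin_len
```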

