

Worse Than Random (2008) - spindritf
http://lesswrong.com/lw/vp/worse_than_random/

======
alexbecker
I think Eliezer is failing to consider a very important set of problems--those
for which the set of dumb things to do is small, but delimiting that set is
extremely difficult. An example given in another comment is Monte Carlo
integration: you know that the set of points at which the function is ill-
behaved is small, and that you want to avoid these points, but it is
computationally infeasible to determine which points they are. These examples
abound in mathematics, where it is often easy to show that the set of "trouble
points" in some generalized sense is small, but very hard to determine where
these points are.
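
As a toy sketch of that situation (the integrand and interval here are just my
illustrative choice): sin(x)/x is perfectly integrable on [-1, 1], but a
uniform grid that happens to include x = 0 divides by zero, while independent
random samples land on that single bad point with essentially zero
probability.

    import math
    import random

    def f(x):
        # Integrable on [-1, 1], but undefined at exactly x = 0: a
        # uniform grid containing 0 fails, while a random sample
        # almost surely never lands on the one bad point.
        return math.sin(x) / x

    def monte_carlo_integrate(f, a, b, n=100_000):
        # Estimate the integral as (b - a) times the average of f at
        # n uniformly random points in [a, b].
        total = sum(f(random.uniform(a, b)) for _ in range(n))
        return (b - a) * total / n

    print(monte_carlo_integrate(f, -1.0, 1.0))  # ~1.892 (= 2 * Si(1))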

In a more technical sense, whether randomness gives algorithms more power is
still an open question in computer science (depending on interpretation, the
question is either whether RP=P or whether BPP=P). The answer is widely
believed to be that it does not, but this is on the basis of the hypothesized
existence of strong pseudorandom generators, rather than randomness being
useless.

~~~
Verdex
As far as your first point is concerned, it has been 6 years since Eliezer
first wrote that post, so I wonder whether he would give a more nuanced view
if he had to write it again, perhaps one that includes the twice-mentioned
Monte Carlo integration.

Your second point is something I hadn't really thought about. If I understand
you correctly, you're saying that Eliezer is probably correct, but it doesn't
matter because randomness is an incredibly useful tool that we would use
regardless of whether RP = P or BPP = P. I'd be interested if you (or
really anyone) could expound on that a bit.

I guess the analogue that comes to mind is that I find monads to be a really
easy way to write parsers and it's really lucky for us that bind happens to be
derivable from the monadic laws, but I'm not doing anything that can't be done
on a Turing machine. Is that more or less on track?

~~~
alexbecker
My point is that probably P = RP = BPP, i.e. that randomness probably doesn't
give you any extra computational power. But the reason is not the one Eliezer
gives (that we can come up with an equally good structured way to do things);
rather, it's that we can build deterministic algorithms which look
sufficiently like they're using randomness.
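
A toy sketch of what "look sufficiently like randomness" buys you (the
identity-testing example and the seed width are my own choices, not a proof of
anything): run the randomized test once per seed of a small PRNG and take a
majority vote. If the PRNG's outputs fool the test, this deterministic loop
agrees with the truly random version, which is the intuition behind getting
P = BPP out of strong pseudorandom generators.

    import random

    P = 10**9 + 7  # large prime; random evaluation points rarely collide

    def identity_test(f, g, rng):
        # Schwartz-Zippel style: if f != g as polynomials, a single
        # random evaluation point exposes the difference with high
        # probability.
        x = rng.randrange(P)
        return f(x) % P == g(x) % P

    def derandomized_test(f, g, seed_bits=12):
        # Enumerate every seed of a small PRNG and take a majority
        # vote, instead of flipping true random coins.
        votes = sum(identity_test(f, g, random.Random(s))
                    for s in range(2 ** seed_bits))
        return 2 * votes > 2 ** seed_bits

    print(derandomized_test(lambda x: (x + 1) ** 2,
                            lambda x: x * x + 2 * x + 1))  # True
    print(derandomized_test(lambda x: (x + 1) ** 2,
                            lambda x: x * x + 1))          # False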

I don't think your analogy really holds up: monads are just an abstraction
implemented on a Turing machine, so they can't possibly give you anything a
Turing machine can't, since you are actually using a Turing machine. With
randomness you are, in principle, actually extending a Turing machine,
although it's an open question whether this actually gains you anything.

------
Strilanc
Usually I agree with Eliezer, but this is one of the cases where I don't.
Randomness is a useful tool that you can use to make algorithms more efficient
and harder to exploit. In fact, there are several examples in the Wikipedia
article [1].

A more practical example: suppose you are writing a poker bot. It looks at the
game state, and outputs a single move to make. After you complete the bot you
will publish it, and an opponent will analyze the source code and make a
counter-bot to play against yours. Access to randomness is clearly a huge
advantage here! Without it, your bluffs are less effective.
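
A minimal sketch of that asymmetry (the thresholds and probabilities here are
made up):

    import random

    def deterministic_bot(hand_strength):
        # With the source published, the opponent can predict every
        # move: strong hands raise, and weak hands below 0.2 always
        # bluff.
        if hand_strength > 0.8 or hand_strength < 0.2:
            return "raise"
        return "fold"

    def randomized_bot(hand_strength, bluff_prob=0.3):
        # Mixed strategy: an opponent reading this code learns only
        # the probability of a bluff, never whether this particular
        # weak hand is bluffing.
        if hand_strength > 0.8:
            return "raise"
        if hand_strength < 0.2 and random.random() < bluff_prob:
            return "raise"
        return "fold"

This is just the game-theoretic point that equilibrium poker strategies are
mixed: a deterministic strategy is exploitable precisely because it can be
predicted from the source.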

That being said, I do agree with the general sentiment that in most cases
randomness is a quick simple hack that can be improved upon.

1:
[http://en.wikipedia.org/wiki/Randomized_algorithm#Where_rand...](http://en.wikipedia.org/wiki/Randomized_algorithm#Where_randomness_helps)

~~~
alexbecker
At the end of the article he notes that algorithms designed to work against an
intelligent adversary are a specific exception.

------
Verdex
I wanted to understand more about how randomized algorithms worked, so I
bought Randomized Algorithms[1]. The book is full of proofs involving
probability theory and computational complexity theory. It's also kind of dry;
I had a hard time reading it.

My complexity and probability theory are both at kind of an elementary level,
so I'm sure it will take me several more years of periodically reading this
type of material to get a good grasp of what "randomized algorithms" really
are.

However, like I said, this type of material rests on a lot of complexity
and/or probability theory, and Eliezer's conclusion doesn't really seem to be
drawn with the same level of rigor that I've seen elsewhere (at least he
didn't publish any proofs here; did I miss something?). So I'm not sure how
much I really want to trust his conclusion.

Also, if his conclusion is correct in a way that provides useful
derandomization algorithms (that is, algorithms for derandomizing other
algorithms), then why are the academics all talking about nondeterministic
Turing machines? It seems like we could move past that (although I'm willing
to accept that nondeterministic Turing machines are an archaic idea from a
bygone era that people continue to cling to because it gets them grant money,
but I would like the proof of this so I can be certain it's safe not to spend
my time on it).

Finally, if his conclusion is correct, is it correct in a way that will allow
me to get work done? I guess my point is that I would rather not rely on
intuition in order to derandomize my algorithms if I can use some sort of
technique or series of techniques. Additionally, it seems like understanding
such a technique would also answer a lot of the questions I had about the
nature of randomized algorithms in the first place, so if there's anyone out
there who knows about these things (or has a list of papers/books/websites), I
would really appreciate it if you stepped in and gave us your two cents.

[1] -
[http://books.google.com/books/about/Randomized_Algorithms.ht...](http://books.google.com/books/about/Randomized_Algorithms.html?id=QKVY4mDivBEC)

~~~
pkhuong
There are several kinds of nondeterminism. In the case of nondeterministic
complexity classes like NP, it's angelic nondeterminism: whenever the
algorithm makes a non-deterministic choice, the machine obeys an oracle that
will guide it to an accept state if there is any such choice. (I prefer the
certifier view of NP, but that doesn't really explain the N in NP.)
Nondeterministic complexity classes have little to do with randomness, and I
think it's clear that always guessing right would be a good mutation to have
;)

There are _randomised_ complexity classes, which are germane to the
discussion. Some classical ones are RP, ZPP and BPP (and RP's negative twin,
co-RP). They are all variants of Polynomial-time with access to random bits.

\- RP captures one-sided-error Monte Carlo algorithms (if the algorithm
returns YES, the string is definitely in the language; a NO answer may be a
false negative, with error probability bounded away from 1). A concrete sketch
follows after this list.

\- ZPP is the intersection of RP and co-RP, and captures Las Vegas algorithms:
the answer is always correct when the machine finally terminates, but only the
_expected_ running time is polynomial.

\- BPP captures two-sided-error Monte Carlo algorithms: the machine always
terminates in polynomial time and is correct with probability bounded away
from 1/2. However, any result (accept or reject) can be spurious.
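
As a concrete instance of the one-sided error that RP describes (my example,
not part of the parent comment): the Miller-Rabin compositeness test. A
"composite" answer is always correct; a "probably prime" answer is wrong with
probability at most 4^-rounds.

    import random

    def is_probably_prime(n, rounds=20):
        # Miller-Rabin: one-sided error. "False" (composite) is always
        # correct; "True" is wrong with probability <= 4**-rounds.
        if n < 2:
            return False
        for p in (2, 3, 5, 7):
            if n % p == 0:
                return n == p
        # Write n - 1 as d * 2**r with d odd.
        d, r = n - 1, 0
        while d % 2 == 0:
            d, r = d // 2, r + 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False  # a is a witness: n is definitely composite
        return True  # probably prime

Fittingly for the derandomization story, primality later turned out to be in P
outright (the AKS test), but Miller-Rabin is still what everyone ships because
it is so much faster.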

None of these complexity classes are known to be equivalent to P, although
there's a strong feeling that BPP = P. In practice, any implementation is
likely backed by a quick PRNG with a couple thousand bits of state at most, so
it's not clear what separating ZPP and P would mean ;)

If we visit more exotic complexity classes, we get some interesting insight in
the relationship between P, NP and randomness. Probabilistically checkable
proofs define families of proof systems. A PCP(r(n), q(n)) verifier takes an
input string and a specially constructed certificate (e.g., a proof witness
that this SAT instance is satisfiable), and confirms that the certificate is
valid in polynomial time, by using at most r(n) random bits and reading q(n)
bits of the certificate. The _verifier_ has to run in polynomial time; the
witness itself can be arbitrarily complicated to build.

It's clear that PCP(0, 0), the set of languages that can be checked
probabilistically with 0 bits of randomness and 0 bits of certificate, is
equal to P: the verifier has no additional tools. PCP(O(log(n)), 0) is also
equal to P: a deterministic verifier can simply enumerate all 2^O(log(n)),
i.e. polynomially many, sequences of random bits. The same is true if the
verifier reads O(log(n)) bits of the witness but has no randomness: it can
enumerate all polynomially many values those bits could take.

The magic of the PCP theorem is that PCP(O(log(n)), O(1)) = NP! That is, if I
have access to log(n) random bits and a suitable proof that a given instance
is in a language like SAT (or any other language in NP), I can verify that
proof in polynomial time, by looking at a constant number of bits of that
proof. The key is that we have enough bits of randomness to choose randomly
from all the bits in the witness.

Intuitively, I still have a hard time believing the PCP construction works.
However, it does hint at how randomness can give us more leverage. That said,
I definitely agree that most uses of randomness seem unwarranted in practice,
especially if we understand the problem domain very well. But then again, what
if we don't? I would often rather have a consistently robust but suboptimal
solution than one that is optimal in a small range of parameters but fails
hard outside of the expected input.

~~~
ufo
> (I prefer the certifier view of NP, but that doesn't really explain the N in
> NP.)

I like the "parallel worlds" interpretation: whenever the algorithm needs to
make a choice, it tries both options at once, _in parallel_. It's still magic,
because a real computer can't run an exponential number of threads in
parallel, but it's a bit less magic than an angel that can predict the future.
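
You can fake that view on a real machine, just exponentially slowly. A toy
sketch (the helper names are mine): simulate each nondeterministic choice by
trying both branches and accepting if either "world" accepts.

    def guess(bits_left, partial, accepts):
        # Explore both branches of every nondeterministic choice;
        # exponential in the number of choices, which is exactly the
        # part a real computer can't do in parallel.
        if bits_left == 0:
            return accepts(partial)
        return (guess(bits_left - 1, partial + [0], accepts) or
                guess(bits_left - 1, partial + [1], accepts))

    # "Nondeterministically" pick a subset of nums summing to target.
    nums, target = [3, 5, 8, 13], 16
    print(guess(len(nums), [],
                lambda bs: sum(n for n, b in zip(nums, bs) if b) == target))
    # True: 3 + 13 == 16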

------
stared
How about Monte Carlo integration?

