
AI safety engineering, target selection, and alignment theory - apsec112
https://intelligence.org/2015/12/31/safety-engineering-target-selection-and-alignment-theory/
======
Houshalter
What concerns me about MIRI's research isn't that they think about idealized
models. It's that they expect the actual product to be an idealized model.
They want to make an AI that is mathematically provable to be safe.

I don't have a word for it, but there's this weird behavior I've seen
mathematicians exhibit, and that I've fallen into myself: if a solution isn't
mathematically perfect, elegant, and proven, then it must be wrong.

We didn't go to the moon in a perfect rocket; we did the best we could with
what we had. It wasn't 100% safe. Guaranteed safety is of course impossible,
and if we had spent all our time chasing it, the Russians would have gotten
there first.

~~~
robbensinger
Speaking as a MIRI employee, I can say that MIRI isn't trying to build AI
that's mathematically provable to be safe. This misconception comes from the
same place the "AI safety engineering, etc." post is speaking to -- the
assumption that if (e.g.) we do work in provability theory to develop simple
general models of reflective reasoning, then the finished product must fall
within provability theory.

Smarter-than-human AI systems will presumably reason probabilistically, and
all real-world safety guarantees are probabilistic. But theorem-proving can be
useful in some contexts for making us quantitatively more confident in
systems' behavior (see
https://intelligence.org/2013/10/03/proofs/),
and toy models of theorem-proving agents can also be useful just for helping
shore up our understanding of the problem space and of the formal tools that
are likely to be relevant down the line -- the analog of "calculus" in the
rocket example.

------
Filligree
It is, perhaps, unfortunate that their choice of analogy (cannonball into
orbit) is one that is impossible. Even on a perfectly spherical, airless
planet...

Okay. You can move the cannon afterwards, and the cannonball will just swish
by at hilarious velocity.

~~~
TeMPOraL
Do not fire from the equator, or if you must, don't aim along the line of
latitude. This way, the planet's rotation should move the cannon away before
the ball completes a full orbit.

