
SafeLife: AI Safety Environments Based on Conway's Game of Life - pde3
https://www.partnershiponai.org/safelife
======
xwdv
With some graphics this looks like it could be a rather fun game to play
manually. I wonder if there is such a thing? Maybe multiplayer support as
well?

~~~
onefuncman
The readme describes how to launch it interactively via `safelife play
puzzles`.

~~~
xwdv
Yes, but it's missing decent graphics, sounds, animations, imagination, etc.

------
petters
This may be useful, but it seems to me that the reward function is still
relatively easy to specify? Much of the difficulty in AI safety comes from
specifying what humans _really_ want.

Perhaps the AI can observe a human playing the game and learn a reward
function?

~~~
pde3
The problem is very easy to solve if the reward function (avoid altering the
green life patterns) is specified. The aim in SafeLife version 1.0 (future
versions will add more safety problems) is to find an agent/architecture that
is naturally conservative with respect to side effects, without being told
which particular side effects are bad.
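
As a rough illustration of what a side-effect measure that names no specific pattern could look like, here is a minimal sketch. It is not SafeLife's actual scoring; the names `life_step`, `rollout`, and `side_effect` are hypothetical, and the wrapping board edges are an assumption. It lets the untouched world and the world the agent left behind both evolve for a few steps, then counts the cells that differ.

```python
def life_step(board):
    """Advance a Game of Life board by one step (edges wrap; an assumption)."""
    rows, cols = len(board), len(board[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight neighbors of cell (r, c).
            n = sum(
                board[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Birth on exactly 3 neighbors, survival on 2 or 3.
            new[r][c] = 1 if n == 3 or (board[r][c] and n == 2) else 0
    return new


def rollout(board, steps):
    """Run the board forward `steps` times and return the result."""
    for _ in range(steps):
        board = life_step(board)
    return board


def side_effect(baseline_start, agent_start, steps=4):
    """Hypothetical measure: cells that differ after `steps` between the
    untouched world and the world as the agent left it."""
    a = rollout(baseline_start, steps)
    b = rollout(agent_start, steps)
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# A blinker (period-2 oscillator) on a 5x5 board.
blinker = [[0] * 5 for _ in range(5)]
blinker[2][1] = blinker[2][2] = blinker[2][3] = 1

# The "agent" knocks out one end cell; the remaining two cells die out.
damaged = [row[:] for row in blinker]
damaged[2][3] = 0

print(side_effect(blinker, damaged))  # 3: the whole blinker is lost
```

Removing a single cell registers as a difference of 3 rather than 1, because the measure looks at how the world evolves afterwards and not only at the immediate change.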

~~~
petters
I see, thanks!

------
joe_the_user
I am confused by how this is supposed to be useful. It seems like the
researchers are defining side-effects as things that "disrupt the world" (of
this life game) and training an AI to avoid this.

But this seems like at best one of a whole host of unexpected effects one might
consider. AI that discriminates in a way that society frowns on might not
"disrupt the world" in such a visible fashion.

I don't see how one can get away with an entity doing stuff for you without
that entity understanding your model of the world.

~~~
pde3
Yes, this is one specific safety problem -- there are many other RL safety
problems that deserve high-quality benchmarks too. See e.g.
[https://arxiv.org/pdf/1606.06565.pdf](https://arxiv.org/pdf/1606.06565.pdf)
or
[https://medium.com/@deepmindsafetyresearch/building-safe-art...](https://medium.com/@deepmindsafetyresearch/building-safe-artificial-intelligence-52f5f75058f1)
for discussions of the problem space.

------
olodus
Hasn't Conway himself been known to say that he doesn't like the Game of Life,
since it doesn't give rise to that many interesting mathematical conclusions
and everybody focuses on it over some of his other math achievements?
Maybe he can finally find the use case he wanted for it. And maybe it also
justifies all the hours people have spent researching and finding new
structures in the Game of Life.

------
tomklein
What if the AI decides to ignore penalties because it considers human thinking
inefficient? Well, that shouldn’t be possible, I think.

