
The basic argument is simple: it is plausible that future systems will achieve superhuman capability; capable systems necessarily have instrumental goals; instrumental goals tend to converge; human preferences are unlikely to survive when other goals are heavily selected for, unless they are intentionally preserved; and we don't know how to make AI systems robustly encode any complex preference.

Robert Miles' videos are among the best-presented arguments I have seen, as a casual introduction, about specific points on this list, primarily on the alignment side rather than the capabilities side.

E.g. this one on instrumental convergence: https://youtube.com/watch?v=ZeecOKBus3Q

E.g. this introduction to the topic: https://youtube.com/watch?v=pYXy-A4siMw

He also has the community-led AI Safety FAQ, https://aisafety.info, which gives brief answers to common questions.

If you have specific questions, I might be able to point you to a more specific argument in greater depth.




Technically, I think it's not that instrumental goals tend to converge, but rather that there are instrumental goals common to many terminal goals, the so-called "convergent instrumental goals".

Some of these goals are ones we would really rather a misaligned superintelligent agent not have. For example:

- self-improvement;

- acquisition of resources;

- acquisition of power;

- avoiding being switched off;

- avoiding having one's terminal goals changed.
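
To make "common to many terminal goals" concrete, here is a minimal toy sketch (my own illustration, not from any of the linked videos): three agents with very different terminal goals, in a world where any goal is ultimately limited by resources, all end up with the same optimal plan, one that starts by acquiring resources. The action names, conversion rates, and horizon are all made up for illustration.

    from itertools import product

    ACTIONS = ["acquire_resources", "pursue_goal"]
    HORIZON = 3

    def rollout(plan, goal_rate):
        """Terminal value of a plan: each 'pursue_goal' step converts the
        agent's current resources into goal-stuff at the goal-specific rate."""
        resources, value = 1, 0
        for action in plan:
            if action == "acquire_resources":
                resources *= 2          # resources compound
            else:
                value += resources * goal_rate
        return value

    # Three very different terminal goals, differing only in conversion rate.
    terminal_goals = {"paperclips": 1.0, "stamps": 0.5, "cake": 3.0}

    for goal, rate in terminal_goals.items():
        best_plan = max(product(ACTIONS, repeat=HORIZON),
                        key=lambda plan: rollout(plan, rate))
        print(goal, "->", best_plan)

    # Every goal's optimal plan front-loads 'acquire_resources': the instrumental
    # step is shared even though the terminal goals are not.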



