
Not a great analogy. If

- there were a range of expert opinions that P(destroy-the-world) < 100% AND

- the chemical could turn lead into gold AND

- the chemical would give you a militaristic advantage over your adversaries AND

- the US were in the race and could use the chemical to keep other people from making / using the chemical

Then I think we'd be in the same situation we're in with AI: stopping it isn't really a choice; we need to do the best we can with the hand we've been dealt.




> there were a range of expert opinions that P(destroy-the-world) < 100%

I would hope that "not a 100% chance of destroying the world" would not suffice, because there's a wide range of expert opinions putting the value anywhere from 1% to 99% (see https://en.wikipedia.org/wiki/P(doom) for sample values), and none of those values are even slightly acceptable.

But sure, by all means stipulate all the things you said; they're roughly accurate, and comparably discouraging. I think it's completely, deadly wrong to think that "race to find it" is safer than "stop everyone from finding it".

Right now, at least, the hardware necessary to do training runs is very expensive and produced in very few places, and the power required is on an industrial data-center scale. Let's start there. We're not yet at the point where someone in their basement can train a new frontier model. (They can run one, but not train one.)


> Let's start there

OK, I can imagine a domestic policy like the one you describe. Through the might and force of the US government, I can see this happening in the US (after considerable effort).

But how do you enforce something like that globally? When I say "not really possible" I am leaving out "except by excessive force, up to and including outright war".

For the reasons I've mentioned above, lots of people around the world will want this technology. I haven't seen an argument for how we can guarantee that everyone will agree with your level of "acceptable" P(doom). So all we are left with is "bombing the datacenters", which, if your P(doom) is high enough, is internally consistent.

I guess what it comes down to is: my P(doom) for AI developed by the US is less than my P(doom) from the war we'd need to stop AI development globally.
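
To state that crux explicitly (the conditional labels below are mine, just to pin down what's being compared):

  P(doom | US-led AI development continues) < P(doom | global halt attempted, including the conflict needed to enforce it)

If that inequality holds, racing is the lower-risk option; if it's reversed, stopping development everywhere is.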


OK, it sounds like we've reached a useful crux. And much appreciation for making a consistent argument that takes the matter seriously and shares the premise of "minimize P(doom)" (albeit by different means) rather than dismissing it; thank you. I think your conclusion follows from your premises, and I think your premises are incorrect. It sounds like you agree that my conclusion follows from my premises, and you think my premises are incorrect.

I don't consider the P(destruction of humanity) of stopping larger-than-current-state-of-the-art frontier model training (not all AI) to be higher than that of stopping the enrichment of uranium. (That does lead to conflict, but not to the destruction of humanity.) In fact, I would argue it could be made lower, because enriched uranium is restricted on a hypocritical "we can have it but you can't" basis, while frontier AI training should be restricted on a "we're being extremely transparent about how we're making sure nobody's doing it here either" basis.

(There are also other communication steps that would be useful to take to make that more effective and easier, but those seem likely to be far less controversial.)

If I understand your argument correctly, it sounds like any one of three things would change your mind: becoming convinced that P(destruction of humanity) from AI is higher than you think it is, becoming convinced that P(destruction of humanity) from stopping larger-than-current-state-of-the-art frontier model training is lower than you think it is, or becoming convinced that nothing the US is doing makes its AI particularly more likely to be aligned (at the "don't destroy humanity" level) than anyone else's.

I think all three of those things are, independently, true. I suspect that one notable point of disagreement might be the definition of "destruction of humanity", because I would argue that's much harder to bring about through any conventional conflict, whereas it's a default outcome of unaligned AGI. (I also think there are many, many, many levers available in international diplomacy before you get to open conflict.)

(And, vice versa, if I agreed that all three of those things were false, I'd agree with your conclusion.)



