The Basic Argument for AI Safety
High-stakes uncertainty warrants caution and research
When I see confident dismissals of AI risk from other philosophers, it’s usually not clear whether our disagreement is ultimately empirical or decision-theoretic in nature. (Are they confident that there’s no non-negligible risk here, or do they think we should ignore the risk even though it’s non-negligible?) Either option seems pretty unreasonable to me, for the general reasons I previously outlined in X-Risk Agnosticism. But let me now take a stab at spelling out an ultra-minimal argument for worrying about AI safety in particular:
1. It’s just a matter of time until humanity develops artificial superintelligence (ASI). There’s no in-principle barrier to such technology, nor should we by default expect sociopolitical barriers to automatically prevent the innovation.
   - Indeed, we can’t even be confident that it’s more than a decade away.
   - Reasonable uncertainty should allow at least a 1% chance that it occurs within 5 years (let alone 10).
2. The stakes surrounding ASI are extremely high, to the point that we can’t be confident that humanity would long survive this development.
   - Even on tamer timelines (with no “acute jumps in capabilities”), gradual disempowerment of humanity is a highly credible concern.
3. We should not neglect credible near-term risks of human disempowerment or even extinction. Such risks warrant urgent further investigation and investment in precautionary measures.
   - If there’s even a 1% chance that, within a decade, we’ll develop technology that we can’t be confident humanity would survive, that easily qualifies as a “credible near-term risk” for purposes of applying this principle. (A toy calculation below illustrates the arithmetic.)

Conclusion: AI risk warrants urgent further investigation and precautionary measures.
My question for those who disagree with the conclusion: which premise(s) do you reject?
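To make the arithmetic behind premise 3 a bit more concrete, here is a minimal expected-value sketch in Python. Every number in it (the 10% conditional risk, the units chosen for the stakes and for precaution costs) is an illustrative assumption rather than an estimate from this post, and the argument above rests on a precautionary principle, not on expected-value reasoning; the sketch only shows why a 1% probability doesn’t round to zero when the stakes are extreme.

```python
# Toy expected-value sketch of the threshold reasoning in premise 3.
# All numbers are illustrative assumptions, not estimates from the post.

p_asi_within_decade = 0.01       # premise: at least a 1% chance of ASI within ten years
p_catastrophe_given_asi = 0.10   # assumed conditional risk of existential catastrophe
value_at_stake = 1_000_000       # arbitrary units standing in for humanity's future
cost_of_precaution = 100         # arbitrary units for safety research and precaution

expected_loss_if_ignored = (
    p_asi_within_decade * p_catastrophe_given_asi * value_at_stake
)  # 0.01 * 0.10 * 1,000,000 = 1,000

print(f"Expected loss if the risk is ignored: {expected_loss_if_ignored:,.0f}")
print(f"Cost of precautionary measures:       {cost_of_precaution:,.0f}")
# Even a 1% probability dominates the comparison once the stakes are large
# enough, which is the sense in which the risk "easily qualifies" as credible
# and worth investigating.
```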
[Edited to add:] See also:
- Helen Toner’s “Long” timelines to advanced AI have gotten crazy short
- Kelsey Piper’s If someone builds it, will everyone die?, and
- Vox’s How to kill a rogue AI—tagline: “none of the options are very appealing”.
Of course, there’s a lot of room for disagreement about what precise form this response should take. But resolving that requires further discussion. For now, I’m just focused on addressing those who claim that AI safety isn’t worth discussing at all.



Premise zero should be "ASI defined" and premise one should be "ASI possible." Most dismissive views seem to imagine ASI as something very different from what you have in mind, and many don't believe it is possible at all in some sense, though this is easy to conflate with denying "ASI soon."
I like that you've not neglected gradual disempowerment! The situation seems almost like this: suppose we have AGI/ASI. Haven't solved alignment? AI takeover and extinction. Solved alignment, but the ASI is controlled by some one guy or company? "Human takeover", i.e. extreme power concentration and value lock-in. Neither? Then, eventually, gradual disempowerment. Existential risks on all sides. :( The most robust way of reducing existential risk (not just extinction risk) seems to be to enforce a Pause / Moratorium (perhaps backed by deterrence, cf. Hendrycks' MAIM) while humanity works hard at solving both the alignment problem and the governance/coordination problems (philosophers like yourself, as well as political scientists, economists, etc., are much needed here!). If we only solve the former and not the latter, we leave humanity open to various existential risks. And a Pause looks necessary for solving those two sets of problems, especially if AGI/ASI is on the horizon. So in short, preventing existential risks as a whole (understood as threats to our long-term global flourishing, of which survival is one part) requires a Pause.
And it seems undeniable (unless one places supreme confidence in unaided humanity) that a safe, aligned AGI would be far more likely to solve the problems of philosophy than human philosophers are (substitute "science" and "scientists" and the point still holds; indeed, it's what animates Demis Hassabis). So philosophers, whether they prioritize global flourishing or philosophy alone, should prioritize AI safety first.