This is a linkpost for https://manifold.markets/tailcalled/will-ai-xrisk-seem-to-be-handled-se?r=dGFpbGNhbGxlZA
Would "we get strong evidence that we're not in one of the worlds where iterative design is guaranteed to fail, and it looks like the group's doing the iterative design are proceeding with sufficient caution" qualify as a YES?
No, by the "If my opinion on the inherent danger of AI xrisk changes during the resolution period, I will try to respond based on the level of risk implied by my criteria, not based on my later evaluation of things." rule, but maybe in such a case I would change the title to reflect the relevant criteria.
Lately it seems like a bunch of people have been taking big AI risks seriously, from OpenAI's "Governance of superintelligence" to DeepMind's "Model evaluation for extreme risks".
We're not quite there yet by my standards for safety, and even less so by e.g. Eliezer Yudkowsky's. However, I wonder if this marks a turning point.
Obviously this is a very subjective question, so I am warning you ahead of time that it is going to resolve in opinionated ways. To anchor the discussion, I expect the following to be necessary for a YES resolution:
Please ask me more questions in the comments to help cement the resolution criteria. If my opinion on the inherent danger of AI xrisk changes during the resolution period, I will try to respond based on the level of risk implied by my criteria, not based on my later evaluation of things.
However, if there turns out to be a qualitatively different but similarly powerful way of handling AI xrisk, and it gets implemented in practice, I will also resolve this question to YES.