User Comment Replies

Nice post Ryan! This kind of modeling strikes me as a very useful exercise, despite the fact that reasoning about systems of conditional probabilities based on conditions with complex descriptions at this scale is a little clunky for our human brains.

Regardless of the final estimates of P(scheming), which are likely to have high uncertainty, I also see a lot of value in the list of predictive factors you have called out here and their relative magnitude. If nothing else, these factors can help us by serving as warning signs or signs that things are going i... (read more)

Would catching your AIs trying to escape convince AI developers to slow down or undeploy?

Andrew Dickson 7mo Ω290

I very much agree with this concern and I think that synthetic biology can be a good comparable case to ground our intuitions and help estimate reasonable priors.

For years, researchers have been sounding the alarm about the risks of advanced biotech, especially tools that allow gene synthesis and editing. And then we had Covid-19, a virus that, regardless of the politicization, probably was created in a lab. And in any case, regardless of whether you believe it was or wasn't, it seems clear that it easily could have been. Worse, it's clear that somet... (read more)

Limitations on Formal Verification for AI Safety

Andrew Dickson 8mo 43

Thanks Steve! I love these examples you shared. I wasn't aware of them and I agree that they do a very good job of illustrating the current capability level of formal methods versus what is being proposed for AI safety.

Limitations on Formal Verification for AI Safety

Andrew Dickson 8mo 208

Agustin - thanks for your thoughtful comment.

The concern you raise is something that I thought about quite a bit while writing this post/paper. I do address your concern briefly in several parts of the post and considered addressing it explicitly in greater detail, but ultimately decided not to because the post was already getting quite long. The first part where I do mention it is quoted below. I also include the quote from [1] that is from that part of the post as well, since it adds very helpful context.

At the same time, obtaining an estimate that a D

... (read more)

All of Andrew Dickson's Comments + Replies