Dr_Manhattan comments on Q&A with experts on risks from AI #1 - Less Wrong Discussion
So, I've heard this argument before, and every time I hear it I like this introduction less and less. I feel like it puts me on the defensive and assumes what seems like an unreasonable level of incaution.
Suppose the utility function is something like F(lumens at detector)-G(resources used). F plateaus in the optimal part of the band, then smoothly decreases on either side, and probably considers possible ways for the detectors to malfunction or be occluded. (There would probably be several photodiodes around the street corner.) F also only accumulates for the next 5 years, as we expect to reevaluate the system in 5 years. G is some convex function of some measure of resources, which might be smooth or might shoot up at some level we think is far above reasonable.
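To make that concrete, here's a minimal sketch of such a utility. All the function shapes and constants (the Gaussian falloff, the quartic penalty, the band edges) are my own illustrative assumptions, not part of any actual design:

```python
import math

def F(lumens, lo=400.0, hi=600.0, width=50.0):
    """Lighting reward: plateaus at 1 inside the optimal band [lo, hi],
    then falls off smoothly on either side of it."""
    if lo <= lumens <= hi:
        return 1.0
    dist = (lo - lumens) if lumens < lo else (lumens - hi)
    return math.exp(-(dist / width) ** 2)

def G(resources, budget=100.0, steepness=4.0):
    """Resource penalty: convex, so the marginal cost of resources rises
    with use and shoots up well past the expected budget."""
    return (resources / budget) ** steepness

def utility(lumens_per_step, resources_used, horizon=5 * 365):
    """F accumulates only over the fixed evaluation horizon (e.g. five
    years of daily readings); G is charged on total resources used."""
    return sum(F(l) for l in lumens_per_step[:horizon]) - G(resources_used)
```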
And so the system does resist premature decommissioning (as that's more likely to be hostile than authorized), does worry about asteroids, and so on, but it's cognizant of its resource budget (really, the increasing marginal cost of resources) and so stops worrying about a threat once it no longer expects cost-effective countermeasures (because worry itself consumes resources!). Even if it has a plan guaranteed to succeed, it might not use that plan, because the resource cost would be higher than the expected lighting gains over its remaining lifespan.
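In that spirit, the cost-effectiveness test might look roughly like this; the probabilities, costs, and the `worth_acting` helper are all hypothetical numbers I made up for illustration:

```python
def worth_acting(p_threat, utility_loss_if_hit, plan_cost,
                 marginal_resource_cost):
    """Adopt a countermeasure only if its expected utility savings over
    the remaining lifespan exceed what the convex resource penalty
    charges for the plan. Even a plan guaranteed to succeed can fail
    this test when the remaining horizon is short."""
    expected_savings = p_threat * utility_loss_if_hit
    return expected_savings > plan_cost * marginal_resource_cost

# E.g. an asteroid-deflection scheme with near-certain success but an
# enormous resource bill gets declined near end-of-life:
print(worth_acting(p_threat=1e-6, utility_loss_if_hit=1800.0,
                   plan_cost=5000.0, marginal_resource_cost=2.0))  # False
```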
I don't think I've seen a plausible argument that a moderately well-designed satisficer will destroy humanity, though I agree that even a very well-designed maximizer has an unacceptably high chance of destroying humanity. I'm curious, though, and willing to listen to any arguments about satisficers.
I've thought along somewhat similar 'resource budget' lines before, and can't find anything obviously wrong with the argument. That may be because I haven't quite defined 'resources'. It still seems like an obvious containment strategy; I wonder if it's been discussed here already.
The AI danger crowd seems happy to assume that the AI wants to maximize its available free energy, so I would assume they're similarly happy to assume the AI can measure its available free energy. I do agree that this is a potential sticking point, though, as the system needs to price resources correctly, and that pricing may be vulnerable to tampering.