Elithrion comments on Q&A with experts on risks from AI #1 - Less Wrong Discussion

Post author: XiXiDu, 08 January 2012 11:46AM (29 points)

Comment author: Vaniver 09 January 2012 04:17:56AM 5 points

You (would) have just sentenced humanity to extinction and incidentally burned the entire cosmic commons. Oops.

So, I've heard this argument before, and every time I hear it I like this introduction less and less. I feel like it puts me on the defensive and assumes what seems like an unreasonable level of incaution.

Suppose the utility function is something like F(lumens at detector)-G(resources used). F plateaus in the optimal part of the band, then smoothly decreases on either side, and probably considers possible ways for the detectors to malfunction or be occluded. (There would probably be several photodiodes around the street corner.) F also only accumulates for the next 5 years, as we expect to reevaluate the system in 5 years. G is some convex function of some measure of resources, which might be smooth or might shoot up at some level we think is far above reasonable.
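[A minimal sketch, in Python, of the utility shape described above. The constants, the median-over-several-photodiodes trick, and the polynomial form of G are illustrative assumptions, not anything specified in the comment.]

```python
import math
import statistics

# Illustrative constants (not from the comment)
OPTIMAL_LUMENS = 50.0       # centre of the "optimal part of the band"
PLATEAU_HALF_WIDTH = 5.0    # F is flat within this distance of the optimum
REVIEW_HORIZON_YEARS = 5    # F only accumulates until the scheduled re-evaluation

def F(detector_readings):
    """Plateaus in the optimal band, then smoothly decreases on either side.
    Taking the median of several photodiodes gives some robustness to a
    single malfunctioning or occluded detector."""
    lumens = statistics.median(detector_readings)
    excess = max(0.0, abs(lumens - OPTIMAL_LUMENS) - PLATEAU_HALF_WIDTH)
    return math.exp(-excess ** 2)

def G(resources_used, soft_cap=100.0):
    """Convex cost of resources, rising sharply past a level far above reasonable."""
    return (resources_used / soft_cap) ** 4

def utility(yearly_readings, resources_used):
    """Illumination score accumulated over the review horizon, minus resource cost."""
    return sum(F(r) for r in yearly_readings[:REVIEW_HORIZON_YEARS]) - G(resources_used)

# e.g. five years of readings from three photodiodes, with modest resource use
print(utility([[49.0, 51.0, 50.5]] * 5, resources_used=20.0))  # close to the maximum of 5
```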

And so the system does resist premature decommissioning (as that's more likely to be hostile than authorized), worry about asteroids, and so on, but it's cognizant of its resource budget (really, the increasing marginal cost of resources) and so stops worrying about something once it doesn't expect cost-effective countermeasures (because worry consumes resources!). Even if it has a plan that's guaranteed to succeed, it might not use that plan, because the resource cost would be higher than the expected lighting gains over its remaining lifespan.
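[A toy continuation of the sketch above, showing why even a guaranteed plan can be rejected: under a convex G, the marginal resource cost of a large intervention can exceed any possible lighting gain over the remaining five-year horizon. The plan names and numbers are made up for illustration.]

```python
# Continues the sketch above; G is repeated here so this runs on its own.

def G(resources_used, soft_cap=100.0):
    return (resources_used / soft_cap) ** 4

def plan_value(expected_lighting_gain, extra_resources, resources_so_far):
    """Net change in utility from adopting a plan, given resources already committed."""
    marginal_cost = G(resources_so_far + extra_resources) - G(resources_so_far)
    return expected_lighting_gain - marginal_cost

def choose_plan(plans, resources_so_far=20.0):
    """Adopt the best plan only if it is net-positive; otherwise do nothing."""
    best = max(plans, key=lambda p: plan_value(p["gain"], p["cost"], resources_so_far))
    if plan_value(best["gain"], best["cost"], resources_so_far) > 0:
        return best["name"]
    return None

plans = [
    # The lighting gain can never exceed ~5 (five years of perfect illumination),
    # so a guaranteed but expensive plan is still rejected.
    {"name": "clean the photodiodes", "gain": 0.4, "cost": 1.0},
    {"name": "asteroid-deflection programme (guaranteed to work)", "gain": 4.9, "cost": 500.0},
]
print(choose_plan(plans))  # -> "clean the photodiodes"
```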

I don't think I've seen a plausible argument that a moderately well-designed satisficer will destroy humanity, though I agree that even a very well-designed maximizer has an unacceptably high chance of destroying humanity. I'm curious, though, and willing to listen to any arguments about satisficers.

Comment author: Elithrion 21 January 2013 07:40:19PM 0 points

It seems like this would work for cases where there is little variation in the maximally achievable F and the resource cost is high; however, I suspect that where there is more uncertainty there is more room for problems to arise (especially if the cost of thinking is low relative to the overall resource use, or something like that).

For example, imagine the AI decides that it needs to minimize G. So, it iterates on itself to make itself more intelligent, plays the stock market, makes a lot of money, buys a generator to reduce its own thought cost to zero, then proceeds to take over the world and all that good stuff to make sure that no one messes with all the generators it sticks on all the lamps (alternatively, if the resource cost is monitored internally, it has a duplicate of itself built without this monitor). Now, in this particular case you might plausibly argue that the resource cost of all the thinking would make it not worth it; however, it's not clear that this would be the case for any realistic-scale project. (Although it's possible that I just abused the one minimization-like part you accidentally left in there and there is some relatively simple patch that I'm not seeing.)

Comment author: Vaniver 21 January 2013 11:01:57PM 1 point

Although it's possible that I just abused the one minimization-like part you accidentally left in there and there is some relatively simple patch that I'm not seeing.

I meant "resources used" in the sense of "resources directed towards this goal" rather than "resources drawn from the metropolitian utility company"- if the streetlamps play the stock market and accumulate a bunch of money, spending that money will still decrease their utility, and so unless they can spend the money in a way that improves the illumination cost-effectively they won't.

Now, defining "resources directed towards this goal" in a way that's machine-understandable is a hard problem. But if we already have an AI that thinks causally (such that it can actually make these plans and enact them), then it seems to me like that problem has already been solved.
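[A hedged sketch of the accounting this seems to imply: G is charged on what the system spends in pursuit of the illumination goal, not on where the money came from, so stock-market winnings don't create a free budget. The ledger structure and entries below are illustrative assumptions, not anything from the comment.]

```python
def resources_directed_at_goal(expenditures):
    """Total resources the system spends in pursuit of the illumination goal,
    regardless of where the money originally came from."""
    return sum(cost for _purpose, cost in expenditures)

expenditures = [
    # Stock-market winnings are not free: the moment they are spent on the
    # lighting problem, G charges for them at full price.
    ("generators for every lamp, paid for with market winnings", 300.0),
    ("routine bulb maintenance", 2.0),
]

def G(resources_used, soft_cap=100.0):
    return (resources_used / soft_cap) ** 4

print(G(resources_directed_at_goal(expenditures)))  # a large penalty: (3.02)**4 is roughly 83
```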

Comment author: Elithrion 21 January 2013 11:58:36PM 0 points

Hm, all right, fair enough. That actually sounds plausible, assuming we can be sure that the AI appropriately takes account of something vaguely along the lines of "all resources that will be used in relation to this problem", including, for example, creating a copy of itself that does not care about resources used and obfuscates its activities from the original. Which will probably be doable at that point.