RobertLumley comments on Satisficers want to become maximisers - Less Wrong

21 Post author: Stuart_Armstrong 21 October 2011 04:27PM

Comments (67)

Comment author: RobertLumley 21 October 2011 06:29:21PM 3 points

But then you run into other problems, like the certainty issue the OP touched on. The agent will spend significant resources ensuring that it has made exactly 9 paperclips, and it wouldn't accept a 90% probability of making 10 paperclips, because a 99.9999% probability of making 9 paperclips would yield more utility for it.
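
To make the arithmetic concrete, here is a minimal sketch of that comparison. It rests on assumptions that are not spelled out in the comment: utility is taken to be min(paperclips, 9), so a tenth paperclip adds nothing, and a failed attempt is taken to yield zero paperclips. The function names are hypothetical.

```python
def capped_utility(paperclips: int, cap: int = 9) -> float:
    """Assumed utility: grows with paperclips up to `cap`, then stays flat."""
    return float(min(paperclips, cap))


def expected_utility(p_success: float, paperclips_if_success: int, cap: int = 9) -> float:
    """Expected utility of a gamble that yields `paperclips_if_success` with
    probability `p_success` and zero paperclips otherwise (an assumption)."""
    return (p_success * capped_utility(paperclips_if_success, cap)
            + (1.0 - p_success) * capped_utility(0, cap))


# 90% chance of 10 paperclips: the cap makes this worth 0.9 * 9
risky = expected_utility(0.90, 10)
# 99.9999% chance of 9 paperclips: worth 0.999999 * 9
careful = expected_utility(0.999999, 9)

print(f"90% chance of 10 paperclips:     {risky:.6f}")
print(f"99.9999% chance of 9 paperclips: {careful:.6f}")
```

Under these assumptions the near-certain option always scores higher, so an expected-utility agent with this capped utility keeps spending resources to push the probability toward 1.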

Comment author: timtyler 22 October 2011 12:39:52PM 1 point

Sooo - you would normally give such an agent time and resource-usage limits.

Comment author: RobertLumley 22 October 2011 04:15:15PM 0 points

But the entire point of building FAI is to not require it to have resource-usage limits, because it can't help us if it's limited. And such resource limits wouldn't necessarily be useful for "testing" whether or not an AI was friendly, because if it weren't, it would mimic the behaviour of an FAI so that it could get more resources.

Comment author: timtyler 22 October 2011 07:21:38PM * -1 points

> But the entire point of building FAI is to not require it to have resource usage limits, because it can't help us if it's limited.

Machines can't cause so much damage if they have resource-usage limits. This is a prudent safety precaution. It is not true that resource-limited machines can't help us.

> And such resource limits wouldn't necessarily be useful for "testing" whether or not an AI was friendly, because if it weren't, it would mimic the behaviour of an FAI so that it could get more resources.

So: the main idea is to attempt damage limitation. If the machine behaves itself, you can carry on with another session. If it does not, it is back to the drawing board, hopefully without too much damage done.