Houshalter comments on The AI, the best human advisor - Less Wrong

7 Post author: Stuart_Armstrong 13 July 2015 03:33PM


Comment author: Houshalter 09 September 2015 11:55:51PM 0 points

Then perhaps we should research ways to measure and restrict intelligence/optimization power.

Just off the top of my head, one way would be to add another term to its utility function representing the amount of computing power (or time) used. It would then have an incentive to use as little computing power as possible to meet its goal.

For example, you ask the AI to solve a problem for you. The utility function maximizes the probability that its answer will be accepted by you as a solution. But once that probability goes above 90%, the utility stops increasing, and a penalty is added for using more computing power.

So the AI tries to solve the problem, but uses the minimal amount of optimization necessary and doesn't over-optimize.
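The capped-utility-minus-compute-penalty idea above can be sketched in a few lines. This is only an illustration, not anything from the original discussion: the names `bounded_utility`, `cap`, and `penalty_rate` are hypothetical, and real optimization power is much harder to measure than a simple compute counter.

```python
def bounded_utility(p_accept, compute_used, cap=0.90, penalty_rate=0.01):
    """Toy utility: reward for likely-accepted answers, capped, minus a compute cost.

    p_accept: estimated probability the human accepts the answer (0..1)
    compute_used: some measurable proxy for resources spent (hypothetical units)
    cap: utility from success saturates here (the "90%" threshold in the comment)
    penalty_rate: cost per unit of computing power used (assumed linear)
    """
    success_utility = min(p_accept, cap)  # no benefit from pushing past the cap
    return success_utility - penalty_rate * compute_used


# Pushing acceptance probability from 91% to 99% buys no extra utility,
# but burning more compute always costs, so the agent prefers the
# cheapest answer that clears the threshold.
cheap_good_enough = bounded_utility(p_accept=0.91, compute_used=1)
expensive_overkill = bounded_utility(p_accept=0.99, compute_used=50)
```

Under this toy model, `cheap_good_enough` beats `expensive_overkill`, which is the intended incentive: meet the goal, then stop.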

Comment author: Stuart_Armstrong 10 September 2015 08:51:30AM 0 points

Those approaches fail the "subagent problem": the AI can get around them by creating a subagent to solve the problem for it, with the subagent not subject to those restrictions.

Comment author: Houshalter 11 September 2015 12:04:58AM 0 points

I'm assuming the AI exists in a contained box, where we can accurately measure the time it is running and/or the resources it uses. So it can't create any subagents whose resource use doesn't also count towards the penalty.

If the AI can escape from the box, we've already failed. There is little point in trying to control what it can do with its output channel.

Comment author: Stuart_Armstrong 11 September 2015 08:21:24AM 0 points

Reduced impact can control an AI that has the ability to get out of its box. That's what I like about it.