Stuart_Armstrong comments on 'Dumb' AI observes and manipulates controllers - Less Wrong

33 Post author: Stuart_Armstrong 13 January 2015 01:35PM


Comments (19)


Comment author: Stuart_Armstrong 23 July 2015 08:41:10AM 0 points

> I see no reason this story-writing AI would need to be allowed to plan more than one story at a time.

Because the AI is programmed by people who hadn't thought of this issue, and the other way turned out to be simpler/easier?

> dynamic inconsistency can provide intrinsic protection from unwanted long-term strategies from the AI.

I know. The problem is that inconsistency is unstable (which is why we're using other measures to maintain it, e.g. using a tool AI only). That's one of the reasons I was interested in stable versions of these kinds of unstable motivations: http://lesswrong.com/r/discussion/lw/lws/closest_stable_alternative_preferences/ .

Comment author: V_V 23 July 2015 09:41:34AM -1 points

> Because the AI is programmed by people who hadn't thought of this issue, and the other way turned out to be simpler/easier?

OK, but if this is a narrow AI used for that particular activity rather than an AGI agent, then it seems intuitive to me that designing it to plan over a single task at a time would be simpler.

> I know. The problem is that inconsistency is unstable (which is why we're using other measures to maintain it, e.g. using a tool AI only). That's one of the reasons I was interested in stable versions of these kinds of unstable motivations: http://lesswrong.com/r/discussion/lw/lws/closest_stable_alternative_preferences/ .

The post you linked doesn't deal with dynamic inconsistency. It refers to agents that are expected utility maximizers under von Neumann–Morgenstern utility theory, but that theory only deals with one-shot decision making, not decision making over time.

You can reduce the problem of decision making over time to one-shot decision making by combining the instantaneous utilities into a cumulative utility function ( * ) and then maximizing it as a one-shot utility function.

If you combine the instantaneous utilities by their (exponentially discounted) sum over an infinite time horizon, you obtain a dynamically consistent expected utility maximizer. But if you sum utilities only up to a fixed time horizon, you still obtain an agent that is an expected utility maximizer at each instant, yet one that is not dynamically consistent: as time advances and the horizon slides forward, rewards move in and out of view, so the agent's ranking of plans can reverse.
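This contrast can be made concrete with a small sketch (my own illustration, not from the thread; the discount factor, horizon length, and reward numbers are all assumed). Two candidate plans each deliver a single reward at a fixed calendar time; an exponential discounter ranks them the same way at every instant, while a sliding fixed-horizon agent reverses its ranking once the later reward comes into view.

```python
# Two candidate plans, each a single reward at a fixed calendar time
# (all numbers are illustrative assumptions):
#   plan A: reward 1.0 at time 10
#   plan B: reward 0.9 at time 5
plans = {"A": (10, 1.0), "B": (5, 0.9)}

GAMMA = 0.95   # exponential discount factor (assumed)
HORIZON = 7    # length of the sliding fixed horizon (assumed)

def value_exponential(now, plan):
    """Exponentially discounted value of a future reward, judged at `now`."""
    when, reward = plan
    return reward * GAMMA ** (when - now) if when >= now else 0.0

def value_fixed_horizon(now, plan):
    """Undiscounted sum of rewards falling within the next HORIZON steps."""
    when, reward = plan
    return reward if now <= when <= now + HORIZON else 0.0

def preferred(value_fn, now):
    """Which plan the agent prefers when evaluating at time `now`."""
    return max(plans, key=lambda name: value_fn(now, plans[name]))

# The exponential discounter's ranking never changes while both rewards
# are still in the future (here it always prefers B):
assert {preferred(value_exponential, t) for t in range(5)} == {"B"}

# The fixed-horizon agent prefers B at t=0 (A's reward at t=10 lies
# beyond its horizon 0..7), then switches to A at t=3 (horizon 3..10
# now covers it) -- a preference reversal, i.e. dynamic inconsistency.
print(preferred(value_fixed_horizon, 0))  # B
print(preferred(value_fixed_horizon, 3))  # A
```

The reversal happens while both rewards still lie in the future, so it is a genuine inconsistency between the agent's earlier and later selves, not just a reward being consumed.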

You may argue that dynamic inconsistency is not stable under evolution by random mutation and natural selection, but it is not obvious to me that AIs would face such a scenario. Even an AI that modifies itself or generates successors has no incentive to maximize its evolutionary fitness unless you specifically program it to do so.

Comment author: Stuart_Armstrong 23 July 2015 09:51:26AM 0 points

Actually, you could use corrigibility to get dynamic inconsistency: https://intelligence.org/2014/10/18/new-report-corrigibility/ .