red75 comments on Imagine a world where minds run on physics - Less Wrong

12 points. Post author: cousin_it 31 October 2010 07:09PM


Comment author: red75 04 November 2010 02:54:59PM *  0 points

You seem to assume that the world is indifferent to the agent's goals. But if there's another agent, that may not be the case.

Let G1 be "tile the universe with still-life", and G2 be "tile the upper half of the universe with still-life".

If agent A has goal G1, it will provably be destroyed by agent B; if A changes its goal to G2, then B will not interfere.

A and B have full information about the world's state.

Should A modify its goal?
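The dilemma can be sketched as a tiny decision table. This is a hypothetical illustration, not anything from the post: the 0.0 and 0.5 payoffs are assumptions, and both policies are scored against A's original goal G1 (fraction of the universe tiled with still-life).

```python
# Hypothetical payoffs, evaluated against A's ORIGINAL goal G1
# ("tile the universe with still-life"), on a 0..1 scale.
# Assumption: if A keeps G1, B provably destroys A, so nothing gets tiled.
# Assumption: if A switches to G2, B does not interfere, so half gets tiled.
payoffs_under_G1 = {
    "keep G1": 0.0,       # A is destroyed; no still-life is placed
    "switch to G2": 0.5,  # the upper half of the universe is tiled
}

# Pick the policy with the highest payoff by G1's own lights.
best_policy = max(payoffs_under_G1, key=payoffs_under_G1.get)
print(best_policy)  # prints "switch to G2"
```

Under these assumed payoffs, self-modification wins even as judged by the original goal, which is what makes the question pointed.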

Edit: Goal stability != value stability. So my point isn't valid.

Comment author: cousin_it 04 November 2010 04:13:09PM 0 points

You seem to assume that the world is indifferent to the agent's goals.

No, I don't need that assumption. What conclusion in the post depends on it, in your opinion?

Comment author: red75 04 November 2010 04:43:12PM *  1 point

It's an error on my part: I assumed that goal stability equals value stability. But then it looks like it can be impossible to reconstruct an agent's values given only its current state.

Comment author: cousin_it 04 November 2010 06:38:36PM 0 points

I'm afraid I still don't understand your reasoning. How are "goals" different from "values", in your terms?

Comment author: red75 04 November 2010 08:26:13PM 0 points

A goal is what an agent optimizes for at a given point in time. A value is the initial goal of the agent (in your toy model, at least).

In my root post it seems optimal for agent A to self-modify into agent A', which optimizes for G2; thus agent A' succeeds in optimizing the world according to its values (the goal of agent A). But the original goal no longer influences its optimization procedure. Thus, if we analyze agent A' (without knowledge of the world's history), we will be unable to infer its values (its original goal).
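The unrecoverability claim can be sketched as follows. This is a hypothetical toy model (the `self_modify` function and goal names are my own illustration): two agents with different original values end up in the same post-modification state, so the map from values to current state is not injective, and the original goal cannot be read off the state.

```python
# Toy model: an agent's "current state" is just the goal it now optimizes for.
# Assumption: any agent whose goal would provoke B (here, G1) rationally
# rewrites itself to the tolerated goal G2; other goals are left unchanged.
def self_modify(original_goal):
    provocative_goals = {"G1"}
    return "G2" if original_goal in provocative_goals else original_goal

state_of_A_prime = self_modify("G1")  # A valued G1, now optimizes G2
state_of_native = self_modify("G2")   # this agent always valued G2

# The two states are identical, so no analysis of the current state alone
# can distinguish "originally valued G1" from "originally valued G2".
print(state_of_A_prime == state_of_native)  # prints True
```

The point is that self-modification destroys the information an outside observer would need: inverting `self_modify` is impossible because it maps distinct values to the same state.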

Comment author: cousin_it 04 November 2010 08:30:29PM 1 point

Yes, that seems to be correct.