Mark_Friedenbach comments on Simulation argument meets decision theory - Less Wrong Discussion

14 Post author: pallas 24 September 2014 10:47AM

Comments (54)

Comment author: [deleted] 24 September 2014 07:27:00PM 1 point [-]

It should be (world-history, identity)=>R. Different agents have different goals, which give different utility values to actions.

Comment author: jimrandomh 24 September 2014 08:25:25PM 2 points [-]

You've then incorporated identity twice: once when you gave each agent its own goals, and again inside of those goals. If an agent's goals have a dangling identity-pointer inside, then they won't stay consistent (or well-defined) in case of self-copying. So, by the same argument which says agents should stop their utility functions from drifting over time, such an agent should replace that pointer with a specific value.
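To make the "dangling identity-pointer" concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything from the thread: a world-history is modelled as a set of fact strings, and the names `indexical_utility` and `bind_identity` are hypothetical. It only shows the difference between a utility of type (world-history, identity) => R and one where the identity argument has been fixed to a specific value.

```python
from typing import Callable, FrozenSet

# Assumption: a world-history is just a frozen set of facts that hold in it.
WorldHistory = FrozenSet[str]


def indexical_utility(world: WorldHistory, me: str) -> float:
    """Utility of type (world-history, identity) => R.

    The `me` parameter is the dangling identity-pointer: the value
    depends on who is asking.
    """
    return 1000.0 if f"{me} is king" in world else 0.0


def bind_identity(
    utility: Callable[[WorldHistory, str], float], who: str
) -> Callable[[WorldHistory], float]:
    """Replace the identity-pointer with a specific value.

    The result has type world-history => R and no longer depends on
    which agent evaluates it.
    """
    return lambda world: utility(world, who)


world = frozenset({"D is king"})
utility_about_d = bind_identity(indexical_utility, "D")

# The indexical form gives different answers for different evaluators;
# the bound form gives one answer regardless of who holds it.
value_for_d = indexical_utility(world, "D")
value_for_d2 = indexical_utility(world, "D2")
value_bound = utility_about_d(world)
```

In this toy model, binding is just partial application: the second argument is fixed once, before any copying happens.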

Comment author: lackofcheese 24 September 2014 08:45:24PM *  5 points [-]

So, in other words: If I am D and all I want is to be king of the universe, then before stepping into a copying machine I should self-modify so that my utility function will say "+1000 if D is king of the universe" rather than "+1000 if I am king of the universe", because then my copy D2 will have a utility function of "+1000 if D is king of the universe", and that maximises my chances of being king of the universe.

That is what you mean, right?

I guess the anthropic counter is this: What if, after stepping into the machine, I end up being D2 instead of D? If I were to self-modify to care only about D, then I wouldn't end up being king of the universe; D would!
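The copying scenario can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the `Agent` class, its `copy` method, and the fact-string encoding of worlds are all hypothetical, and the +1000 payoff is taken from the example above. It shows that a copy made from an "I am king" utility values different worlds than the original does, while a copy made from a "D is king" utility agrees with the original. The second case is exactly what the anthropic counter objects to from D2's point of view.

```python
from dataclasses import dataclass, replace
from typing import Callable, FrozenSet

World = FrozenSet[str]  # assumption: a world is a set of facts true in it


@dataclass(frozen=True)
class Agent:
    name: str
    utility: Callable[[World, str], float]  # (world-history, identity) => R

    def value(self, world: World) -> float:
        # The agent always passes its own name as the identity argument.
        return self.utility(world, self.name)

    def copy(self, new_name: str) -> "Agent":
        # The copying machine duplicates the utility function unchanged;
        # only the copy's identity differs.
        return replace(self, name=new_name)


def i_am_king(world: World, me: str) -> float:
    """'+1000 if I am king of the universe' (uses the identity-pointer)."""
    return 1000.0 if f"{me} is king" in world else 0.0


def d_is_king(world: World, me: str) -> float:
    """'+1000 if D is king of the universe' (ignores the identity-pointer)."""
    return 1000.0 if "D is king" in world else 0.0


world_d_king = frozenset({"D is king"})
world_d2_king = frozenset({"D2 is king"})

# Case 1: D copies itself without self-modifying.
d2_indexical = Agent("D", i_am_king).copy("D2")

# Case 2: D self-modifies first, then copies itself.
d2_bound = Agent("D", d_is_king).copy("D2")

# In case 1 the copy wants D2 on the throne, so it competes with D.
copy_competes = d2_indexical.value(world_d2_king) > d2_indexical.value(world_d_king)

# In case 2 the copy wants D on the throne, so it cooperates with D.
copy_cooperates = d2_bound.value(world_d_king) > d2_bound.value(world_d2_king)
```

Note that the model takes no stance on which agent "I" turn out to be after copying; it only tracks what each resulting agent values.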

Comment author: DanArmak 24 September 2014 10:15:28PM 1 point [-]

The agent, and the utility function's implementation in the agent, are already part of the world and its world-history. If two agents in two universes cannot be distinguished by any observation in their universes, then they must exhibit identical behavior. I claim it makes no sense to say two agents have different goals or different utility functions if they are physically identical.