
cousin_it comments on An example of self-fulfilling spurious proofs in UDT - Less Wrong Discussion

20 Post author: cousin_it 25 March 2012 11:47AM




Comment author: cousin_it 26 March 2012 10:08:14AM *  1 point

I don't understand your comment. Our proof-theoretic algorithms can handle worlds with randomness just fine, because the logical fact of the algorithm's return value will be more or less correlated with many physical facts. The proof-theoretic algorithms don't require slight randomness to work, and my grandparent comment gave a reason why it seems hard to write a version that relies completely on slight randomness to kill spurious proofs, as you suggested. Of course, if you do find a way, that will be cool.

Comment author: Dmytry 26 March 2012 10:13:42AM *  0 points

Well, your grandparent comment spoke of a problem with adding randomness, rather than of randomness being unnecessary. Maybe I simply misunderstood it.

Note, btw, that the self-description will have to include randomness to be an accurate description of an imperfect agent.

Comment author: cousin_it 26 March 2012 10:27:17AM *  2 points

Let me try to explain. Assume you have an agent that's very similar to our proof-theoretic ones, but also has a small chance of returning a random action. Moreover, the agent's self-description includes a mention of the random variable V that makes the agent return a random action.

The first problem is that statements like "A()==a" are no longer logical statements. That seems easy to fix by making the agent look at conditional expected values instead of logical implications.

The second and more serious problem is this: how many independent copies of V does the world contain? Imagine the agent is faced with Newcomb's Problem. If the world program contains only one copy of V that's referenced by both instances of the agent, that amounts to abusing the problem formulation to tell the agent "your copies are located here and here". And if the two copies of the agent use uncorrelated instances of V, then looking at conditional expected values based on V buys you nothing. Figuring out the best action reduces to the same logical question that the proof-theoretic algorithms are faced with.
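The second case can be illustrated with a toy calculation. The sketch below (not from the original discussion; the exploration probability EPS and the standard $1,000,000/$1,000 Newcomb payoffs are illustrative assumptions) models V as a three-way branch: explore to one-boxing, explore to two-boxing, or follow the deterministic policy. When the predictor's simulation references the same instance of V, its prediction always matches the actual action, and conditioning recovers the one-boxing answer. With an uncorrelated instance of V, the conditional expected values just reproduce the $1,000 dominance argument for two-boxing, so they settle nothing:

```python
# Hypothetical sketch of an epsilon-exploring agent in Newcomb's Problem.
# V is modeled as a three-way branch: explore to one-box, explore to
# two-box, or follow the deterministic policy.

EPS = 0.05  # assumed exploration probability

def branches(policy):
    """Outcomes of the noise variable V: (probability, resulting action)."""
    return [(EPS / 2, "one-box"), (EPS / 2, "two-box"), (1 - EPS, policy)]

def cond_expectations(shared_v, policy="one-box"):
    """Exact E[payoff | actual action], with shared or independent copies of V."""
    totals = {"one-box": 0.0, "two-box": 0.0}
    probs = {"one-box": 0.0, "two-box": 0.0}
    for p_a, action in branches(policy):
        if shared_v:
            # The predictor's simulation references the same V, so its
            # prediction always coincides with the agent's actual action.
            pairs = [(p_a, action)]
        else:
            # The predictor's copy of V is an uncorrelated instance,
            # so prediction and action vary independently.
            pairs = [(p_a * p_b, pred) for p_b, pred in branches(policy)]
        for p, prediction in pairs:
            box_b = 1_000_000 if prediction == "one-box" else 0
            payoff = box_b if action == "one-box" else box_b + 1_000
            totals[action] += p * payoff
            probs[action] += p
    return {a: totals[a] / probs[a] for a in totals}

print(cond_expectations(shared_v=True))   # conditioning favors one-boxing
print(cond_expectations(shared_v=False))  # two-boxing dominates by $1,000
```

With a shared V, E[payoff | one-box] is $1,000,000 against $1,000 for two-boxing. With independent copies, two-boxing's conditional expectation exceeds one-boxing's by exactly $1,000, regardless of EPS: the exploration noise never reveals the logical correlation between the agent and the predictor's simulation.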