Wei_Dai comments on The Absent-Minded Driver - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (139)
S() is suppose to be an implementation of UDT. By looking at the world program P, it should determine that among all possible input-output mappings, those that return "EXIT" for 1/3 of all inputs (doesn't matter which ones) maximize average payoff. What made me express the payoff as a function of p is by stepping through what S is supposed to do as an implementation of UDT.
Does that make sense?
I'm still confused. Your response seems to just say, "I did it because it works." -- which is a great reason! But I want to know if UDT gave you more guidance than that.
Does UDT require that you look at the consequences of doing something p% of the time (irrespective of which ones), on all problems?
Basically, I'm in the position of that guy/gal that everyone here probably helped out in high school:
"How do you do the proof in problem 29?" "Oh, just used identities 3 and 5, solve for t, and plug it back into the original equation." "But how did you know to do that?"
No, UDT (at least in my formulation) requires that you look at all possible input-output mappings, and choose the one that is optimal. In this case it so happens that any function that returns "EXIT" for 1/3 of inputs is optimal.