mavant comments on So You Want to Save the World - Less Wrong

41 Post author: lukeprog 01 January 2012 07:39AM


Comment author: lukeprog 27 December 2011 09:44:04PM *  4 points

Stuart Armstrong's explanation of the 5-and-10 problem is:

The five-and-ten problem (sometimes known as the heavy ghost problem) is a problem in certain types of [updateless decision theory]-like decision theories, in which the fact that a counterfactual is known to be false leads the algorithm to implement it.

Specifically, let there be a decision problem which involves the choice between $5 and $10, a utility function that values the $10 more than the $5, and an algorithm A that reasons something like:

"Look at all propositions of the type '(A decides to do X) implies (Utility=y)', find the X that maximises y, and then do X."

When faced with the above problem, certain types of algorithm can reason:

"The utility of $10 is greater than the utility of $5. Therefore I will never decide to choose $5. Therefore (A decides to do 'choose $5') is a false statement. Since a false statement implies anything, (A decides to do 'choose $5') implies (Utility=y) for any, arbitrarily high, value of y. Therefore this is the utility-maximising decision, and I should choose $5."

That is the informal, natural language statement of the problem. Whether the algorithm is actually vulnerable to the 5-and-10 problem depends on the details of what the algorithm is allowed to deduce about itself.
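The faulty search can be sketched in toy code. This is my own illustration, not anything from the thread; the names `flawed_agent` and `HUGE` and the concrete dollar values are assumptions made for the sketch. The agent first proves it will never choose $5, then treats the vacuously true implication "(A decides to do 'choose $5') implies (Utility=y)" as licensing an arbitrarily high y for that action:

```python
# Toy sketch of the flawed reasoning described above. The agent
# enumerates implications "(A decides X) implies (Utility=y)" and
# treats a vacuously true implication (one with a false antecedent)
# as proving arbitrarily high utility.

def flawed_agent():
    actions = ["$5", "$10"]
    utility = {"$5": 5, "$10": 10}

    # Step 1: the agent proves it will never choose any action with
    # less than the maximum utility, because it is a utility maximiser.
    never_chosen = {a for a in actions if utility[a] < max(utility.values())}

    # Step 2: for any a in never_chosen, "(A decides a)" is false, so
    # "(A decides a) implies (Utility=y)" holds vacuously for EVERY y.
    # The naive search therefore records an arbitrarily high y for a.
    HUGE = 10**9  # stands in for "arbitrarily high"
    provable_utility = {
        a: (HUGE if a in never_chosen else utility[a]) for a in actions
    }

    # Step 3: maximise over the "provable" utilities -- which now
    # spuriously favours the dominated action.
    return max(actions, key=lambda a: provable_utility[a])

print(flawed_agent())  # the spurious conclusion: "$5"
```

The bug is not in the maximisation itself but in step 2: admitting vacuous implications into the search space lets the action the agent has ruled out look arbitrarily good.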

However, some think Drescher's explanation is more accurate. Somebody should write a short paper on the problem so I can cite that instead. :)

Comment author: Vladimir_Nesov 28 December 2011 07:40:43AM *  20 points

This is an incorrect description of 5-and-10. The description given is of a different problem (one of whose aspects is addressed in cousin_it's recent writeup; in that setting the problem is resolved by Lemma 2).

The 5-and-10 problem is concerned with the following (incorrect) line of reasoning by a hypothetical agent:

"I have to decide between $5 and $10. Suppose I decide to choose $5. I know that I'm a money-optimizer, so if I do this, $5 must be more money than $10, so this alternative is better. Therefore, I should choose $5."
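Nesov's version of the error can also be sketched in toy code (again my own illustration, with assumed names and values, not code from the thread): inside each hypothetical "suppose I choose X", the agent invokes "I am a money-optimizer" to conclude that X must be the better option, so every hypothetical looks equally good and the tie-break can land on $5:

```python
# Toy sketch of the reasoning "Suppose I decide to choose $5. I know
# that I'm a money-optimizer, so if I do this, $5 must be more money
# than $10, so this alternative is better."

def confused_agent():
    values = {"$5": 5, "$10": 10}
    chosen, chosen_value = None, float("-inf")
    for action in values:
        # Hypothetical: "suppose I decide to choose `action`".
        # Flawed step: "since I'm a money-optimizer, whatever I
        # hypothetically choose must be the better option" -- so the
        # agent credits EVERY hypothetical choice with the maximum value
        # instead of that action's actual value.
        inferred_value = max(values.values())
        if inferred_value > chosen_value:
            chosen, chosen_value = action, inferred_value
    return chosen

print(confused_agent())  # "$5": every option ties, so the first wins
```

The fix would be to evaluate `values[action]` inside the hypothetical rather than letting the agent's self-knowledge contaminate the counterfactual.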

Comment author: mavant 12 November 2013 02:11:23PM 0 points

I don't really understand how this could occur in a TDT-agent. The agent's algorithm is causally dependent on '(max $5 $10), but considering the counterfactual severs that dependence. Observing a money-optimizer (let's call it B) choosing $5 over $10 would presumably cause the agent (call it A) to update its model of B to no longer depend on '(max $5 $10). Am I missing something here?

Comment author: Vladimir_Nesov 12 November 2013 07:13:43PM *  0 points

Correctly getting to the comparison of $5 and $10 is the whole point of the exercise. An agent is trying to evaluate the consequences of its action, A, which is defined by the agent's algorithm and is not known explicitly in advance. To do that, it could in some sense consider hypotheticals in which its action takes each of its possible values. One such hypothetical could involve the claim that A=$5. The error in question is about looking at the claim that A=$5 and drawing incorrect conclusions from it (which would result in an action that doesn't depend on comparing $5 and $10).