Vladimir_Nesov comments on So You Want to Save the World - Less Wrong

41 Post author: lukeprog 01 January 2012 07:39AM




Comment author: lukeprog 27 December 2011 09:44:04PM *  4 points [-]

Stuart Armstrong's explanation of the 5-and-10 problem is:

The five-and-ten problem (sometimes known as the heavy ghost problem) is a problem in certain types of [updateless decision theory]-like decision theories, in which the fact that a counterfactual is known to be false leads the algorithm to implement it.

Specifically, let there be a decision problem which involves the choice between $5 and $10, a utility function that values the $10 more than the $5, and an algorithm A that reasons something like:

"Look at all propositions of the form '(A decides to do X) implies (Utility=y)', find the X that maximises y, then do X."

When faced with the above problem, certain types of algorithm can reason:

"The utility of $10 is greater than the utility of $5. Therefore I will never decide to choose $5. Therefore (A decides to do 'choose $5') is a false statement. Since a false statement implies anything, (A decides to do 'choose $5') implies (Utility=y) for any, arbitrarily high, value of y. Therefore this is the utility maximising decision, and I should choose $5."

That is the informal, natural language statement of the problem. Whether the algorithm is actually vulnerable to the 5-and-10 problem depends on the details of what the algorithm is allowed to deduce about itself.
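The failure mode described above can be rendered as a toy Python sketch. Everything here is hypothetical: `proves` is a stand-in for a proof search over the agent's own source code, with the deductions from the quoted reasoning hard-coded in.

```python
def proves(statement):
    """Hypothetical provability oracle, hard-coding the deductions
    described in the text (a real agent would search for proofs)."""
    # The agent has proved it will never pick $5...
    if statement == "A() != 5":
        return True
    # ...so any implication with 'A() == 5' as its antecedent is
    # vacuously provable, for ANY consequent utility y.
    if statement.startswith("A() == 5 implies U() =="):
        return True
    # For $10, only the true utility is provable.
    if statement == "A() == 10 implies U() == 10":
        return True
    return False

def naive_agent(candidate_utilities):
    """Pick the X maximising y over provable 'A()==X implies U()==y'."""
    best_action, best_y = None, float("-inf")
    for action in (5, 10):
        for y in candidate_utilities:
            if proves(f"A() == {action} implies U() == {y}") and y > best_y:
                best_action, best_y = action, y
    return best_action, best_y

# The vacuous implication lets $5 "promise" an arbitrarily high utility:
print(naive_agent(candidate_utilities=[5, 10, 10**6]))  # → (5, 1000000)
```

The agent ends up taking the $5, credited with whatever utility happened to be the largest candidate, exactly because the false antecedent made the implication provable for every y.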

However, some think Drescher's explanation is more accurate. Somebody should write a short paper on the problem so I can cite that instead. :)

Comment author: Vladimir_Nesov 28 December 2011 07:40:43AM *  20 points [-]

This is an incorrect description of 5-and-10. The description given is of a different problem (one of whose aspects is addressed in cousin_it's recent writeup; in that setting the problem is resolved by Lemma 2).

The 5-and-10 problem is concerned with the following (incorrect) line of reasoning by a hypothetical agent:

"I have to decide between $5 and $10. Suppose I decide to choose $5. I know that I'm a money-optimizer, so if I do this, $5 must be more money than $10, so this alternative is better. Therefore, I should choose $5."
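This version of the fallacy can also be put in toy-code form (a hypothetical sketch, not anyone's proposed algorithm): the self-trust premise makes every supposition self-justifying, so the agent commits to whichever action it happens to consider first.

```python
def flawed_agent(actions=(5, 10)):
    """Hypothetical rendering of the flawed inference described above."""
    for supposed in actions:
        # Supposition: "I decide to choose $<supposed>."
        # Illicit step: "I know I'm a money-optimizer, so if I do this,
        # it must be the better option."  This conclusion follows from
        # self-trust alone, never from comparing the dollar amounts:
        this_must_be_better = True
        if this_must_be_better:
            return supposed  # commits to $5 without ever looking at $10

print(flawed_agent())  # → 5
```

Note that the output depends only on iteration order: the same reasoning, started from the other supposition, would "justify" taking the $10 instead.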

Comment author: lukeprog 28 December 2011 02:52:04PM 2 points [-]

Thanks!

Comment author: [deleted] 31 December 2011 07:47:11PM 4 points [-]

Has anyone emailed Judea Pearl, John Harrison, Jon Williamson, et cetera, asking them to look at this?

Comment author: lukeprog 01 January 2012 12:25:07AM 3 points [-]

I doubt it.

Comment author: [deleted] 01 January 2012 02:30:44AM *  7 points [-]

Because academics don't care about blogs? Or doing so would project the wrong image of the Singularity Institute? Or no one thought of doing it? Or someone thought of it, but there were more important things to do first? Perhaps because it's inefficient marketing? Or people who aren't already on lesswrong have failed some rationality competence test? Or you're not sure it's safe to discuss?

Comment author: lukeprog 01 January 2012 02:55:16PM 10 points [-]

My plan has been to write up better, more precise specifications of the open problems before systematically sending them to top academics for comments.

Comment author: XiXiDu 01 January 2012 11:21:12AM 2 points [-]

Has anyone emailed Judea Pearl, John Harrison, Jon Williamson, et cetera, asking them to look at this?

Why don't you do it? I would if I could formulate those problems adequately.

Comment author: Malcolm_Edwards 04 January 2012 06:17:50AM *  0 points [-]

It seems to me that any agent unable to solve this problem would be considerably less intelligent than a human.

Comment author: Benja 25 August 2012 06:25:44AM 1 point [-]

It does seem unlikely that an "expected utility maximizer" reasoning like this would manage to build interstellar spaceships, but that insight doesn't automatically help with building an agent that is immune to this and similar problems.

Comment author: SilasBarta 03 January 2012 08:06:23PM *  0 points [-]

That's strange; Luke normally has a good understanding of his sources, uses and explains them correctly, and so usually recognizes an incorrect explanation.

Comment author: mavant 12 November 2013 02:11:23PM 0 points [-]

I don't really understand how this could occur in a TDT-agent. The agent's algorithm is causally dependent on '(max $5 $10), but considering the counterfactual severs that dependence. Observing a money-optimizer (let's call it B) choosing $5 over $10 would presumably cause the agent (call it A) to update its model of B to no longer depend on '(max $5 $10). Am I missing something here?

Comment author: Vladimir_Nesov 12 November 2013 07:13:43PM *  0 points [-]

Correctly getting to the comparison of $5 and $10 is the whole point of the exercise. An agent is trying to evaluate the consequences of its action, A, which is defined by the agent's algorithm and is not known explicitly in advance. To do that, it could in some sense consider hypotheticals where its action assumes its possible values. One such hypothetical could involve a claim that A=$5. The error in question is about looking at the claim that A=$5 and drawing incorrect conclusions (which would result in an action that doesn't depend on comparing $5 and $10).
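For contrast, here is what the intended endpoint looks like once the agent actually reaches the comparison (a trivial sketch with hypothetical names; the hard part the comment describes is justifying this step for an agent reasoning about its own source code):

```python
# Map each hypothetical action to its consequence, then compare.
OUTCOMES = {"take_5": 5, "take_10": 10}

def sane_agent():
    """Evaluate every hypothetical and pick the one with the
    highest utility, rather than stopping at a self-justifying
    supposition."""
    return max(OUTCOMES, key=OUTCOMES.get)

print(sane_agent())  # → take_10
```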

Comment author: linkhyrule5 26 July 2013 07:24:43AM 0 points [-]

This is probably a stupid question, but is this reducible to the Löbian obstacle? On the surface, it seems similar.