V_V comments on Newcomblike problems are the norm - LessWrong

39 Post author: So8res 24 September 2014 06:41PM

Comment author: V_V 28 September 2014 02:39:27PM 1 point

That would require the AIXI agent to have been pretrained to understand English (or some language as expressive as English) and have some experience at solving problems given a verbal explanation of the rules.

In this scenario, AIXI's internal program ensemble concentrates its probability mass on programs that map each (English specification, action) pair to a predicted reward. Given the English specification, AIXI computes the expected reward for each action and outputs the action that maximizes it.
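As a rough illustration of that computation, here is a toy sketch in Python. It is not AIXI itself, which is uncomputable and weights programs by their length on a UTM; it only shows a weighted ensemble of reward predictors followed by an argmax. The names `expected_rewards`, `choose_action`, `one_box_favoring`, and `two_box_favoring`, along with the weights and payoffs, are all hypothetical and chosen only for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a program in the ensemble:
# it maps an (English specification, action) pair to a predicted reward.
Program = Callable[[str, str], float]
Ensemble = List[Tuple[float, Program]]  # (weight, program) pairs


def expected_rewards(ensemble: Ensemble, spec: str, actions: List[str]) -> Dict[str, float]:
    """Weight-averaged predicted reward for each action."""
    total = sum(weight for weight, _ in ensemble)
    return {
        action: sum(weight * program(spec, action) for weight, program in ensemble) / total
        for action in actions
    }


def choose_action(ensemble: Ensemble, spec: str, actions: List[str]) -> str:
    """Return the action with the highest expected reward under the ensemble."""
    rewards = expected_rewards(ensemble, spec, actions)
    return max(actions, key=lambda action: rewards[action])


# Two toy predictors with made-up payoffs (assumptions, not derived from anything).
def one_box_favoring(spec: str, action: str) -> float:
    # Predicts that the opaque box is full only if the agent one-boxes.
    return 1_000_000.0 if action == "one-box" else 1_000.0


def two_box_favoring(spec: str, action: str) -> float:
    # Predicts that the opaque box is full regardless of the action taken.
    return 1_000_000.0 if action == "one-box" else 1_001_000.0


SPEC = "Newcomb's problem, stated in English"
ACTIONS = ["one-box", "two-box"]

# The weights stand in for whatever the agent's history and UTM bias produced.
mixed = [(0.9, one_box_favoring), (0.1, two_box_favoring)]
print(choose_action(mixed, SPEC, ACTIONS))
```

With these made-up weights the mixture picks "one-box"; an ensemble that puts all its weight on `two_box_favoring` picks "two-box" instead, so the behavior is determined entirely by where the probability mass ends up.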

Note that in principle this can implement any computable decision theory. Which one it would choose depends on the agent's history and the intrinsic bias of its UTM.
It can be CDT, EDT, UDT, or, more likely, some approximation of them that has worked well for the agent so far.

Comment author: lackofcheese 28 September 2014 03:08:43PM * 0 points

> That would require the AIXI agent to have been pretrained to understand English (or some language as expressive as English) and have some experience at solving problems given a verbal explanation of the rules.

I don't think someone posing Newcomb's problem would be particularly interested in excuses like "but what if the agent only speaks French!?" Obviously as part of the setup of Newcomb's problem AIXI has to be provided with an epistemic background that is comparable to that of its intended target audience. This means it doesn't just have to be familiar with English, it has to be familiar with the real world, because Newcomb's problem takes place in the context of the real world (or something very much like it).

I think you're confusing two different scenarios:
- Someone training an AIXI agent to output problem solutions given problem specifications as inputs.
- Someone actually physically putting an AIXI agent into the scenario stipulated by Newcomb's problem.

The second one is Newcomb's problem; the first is the "what is the optimal strategy for Newcomb's problem?" problem.

It's the second one I'm arguing about in this thread, and it's the second one that people have in mind when they bring up Newcomb's problem.

Comment author: V_V 28 September 2014 03:37:18PM * 0 points

Then the AIXI ensemble will be dominated by programs that associate "real world" percepts and actions with predicted rewards.

The point is that there is no way, short of actually running the (physically impossible) experiment, that we can tell whether the behavior of this AIXI agent will be consistent with CDT, EDT, or something else entirely.