
cousin_it comments on Example decision theory problem: "Agent simulates predictor" - Less Wrong Discussion

23 points | Post author: cousin_it 19 May 2011 03:16PM




Comment author: cousin_it 19 May 2011 03:36:18PM *  2 points [-]

Yeah, in ASP the predictor looks inside the agent. But the problem still seems "fair" in a certain sense, because the predictor's prediction is logically tied to the agent's decision. A stupid agent that one-boxed no matter what would get rewarded, and a stupid agent that two-boxed would get punished. In other words, I'd still like a decision theory that does well on some suitably defined class of ASP-like problems, even if that class is wider than the class of "TDT-fair" problems that Eliezer envisioned. Of course we need a lot of progress to precisely define such classes of problems, too.
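The payoff structure described above can be sketched as a toy program (all names and payoffs here are illustrative, not from the original post): the predictor's output is tied to the agent itself rather than to the rest of the world, so a hard-coded one-boxer gets rewarded and a hard-coded two-boxer gets punished.

```python
def predict(agent):
    # Stand-in for a predictor that "looks inside the agent": here it
    # simply simulates the agent to see what it will do.
    return agent()

def world(agent):
    # Newcomb-style payoff: the big box is full iff one-boxing was predicted.
    prediction = predict(agent)
    action = agent()
    if action == "one-box":
        return 1_000_000 if prediction == "one-box" else 0
    return 1_000 + (1_000_000 if prediction == "one-box" else 0)

def always_one_box():
    return "one-box"

def always_two_box():
    return "two-box"

print(world(always_one_box))  # 1000000 (the "stupid" one-boxer wins)
print(world(always_two_box))  # 1000
```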

Comment author: wedrifid 19 May 2011 05:09:17PM 2 points [-]

In other words, I'd still like a decision theory that does well on some suitably defined class of ASP-like problems, even if that class is wider than the class of "TDT-fair" problems that Eliezer envisioned. Of course we need a lot of progress to precisely define such classes of problems, too.

It would be useful to have a list of problems that TDT can handle, a list that current specifications of UDT can handle, and a list that are still in the grey area of not quite resolved. Among other things, that would make the difference between TDT and UDT far more intuitively clear!

Comment author: jimrandomh 19 May 2011 03:51:23PM 0 points [-]

The class of problems which are well-posed in this type system is exactly the class of problems that would not change if you gave the agent a chance to self-modify in between receiving the problem statement and the start of the world program. Problems outside this class are neither fair nor solvable in principle nor interesting.

Comment author: cousin_it 19 May 2011 06:22:01PM *  0 points [-]

I work under the assumption that the problem statement is a world program or a prior over world programs. Maybe it's a bad assumption. Can you suggest a better one?
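As a minimal sketch of the second option, assuming a toy prior over two hypothetical world programs, an action can be scored by its expected utility under that prior:

```python
# Two illustrative world programs, each mapping an action to a utility.
def world_a(action):
    return 10 if action == "left" else 0

def world_b(action):
    return 5  # utility doesn't depend on the action here

# The problem statement: a prior over world programs.
prior = [(0.5, world_a), (0.5, world_b)]

def expected_utility(action):
    # Score an action by averaging utility over the prior.
    return sum(p * w(action) for p, w in prior)

print(expected_utility("left"))   # 7.5
print(expected_utility("right"))  # 2.5
```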

Comment author: jimrandomh 19 May 2011 06:43:24PM 0 points [-]

I use that same assumption, with one slight caveat: I keep the world program and the preference function separate for clarity's sake. You need both, but I often see them combined into one function.

The big difference, I think, is that the way I do it, the world program doesn't get a decision theory directly as input; instead, the world program's source is given to the decision theory, the decision theory outputs a strategy, and then the strategy is given to the world program as input. This is a better match for how we normally talk about decision theory problems, and it prevents a lot of shenanigans.
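A minimal sketch of this protocol, with all names and payoffs being illustrative assumptions: the decision theory sees only the world's source text, returns a strategy (a function from observations to actions), and only that strategy is handed to the world.

```python
# The world's source, given to the decision theory as text.
WORLD_SOURCE = '''
def world(strategy):
    action = strategy("offer")
    return 10 if action == "accept" else 0
'''

def world(strategy):
    # A trivial one-shot world: score the strategy's response to one observation.
    action = strategy("offer")
    return 10 if action == "accept" else 0

def decision_theory(world_source):
    # The decision theory analyzes the source (trivially, here) and
    # returns a strategy: observation -> action.
    def strategy(observation):
        return "accept" if "accept" in world_source else "refuse"
    return strategy

strategy = decision_theory(WORLD_SOURCE)
print(world(strategy))  # 10
```

Because the world never sees the decision theory itself, it cannot condition on the agent's source code, which is exactly what rules out the "shenanigans" mentioned above.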

Comment author: cousin_it 19 May 2011 06:46:36PM *  2 points [-]

Of course the world program shouldn't get the decision theory as input! In the formulation I always use, the world program doesn't have any inputs, it's a computation with no arguments that returns a utility value. You live in the world, so the world program contains your decision theory as a subroutine :-)
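A minimal sketch of this zero-argument formulation, using an illustrative twin-prisoner's-dilemma payoff: the world program takes no input and returns a utility, calling the agent's decision procedure as a subroutine (twice here, since this world happens to contain an identical copy of the agent).

```python
def my_decision_theory():
    # The agent's choice, embedded in the world's source code.
    return "cooperate"

def world():
    # No arguments: everything, including the agent, lives inside the world.
    my_action = my_decision_theory()
    twin_action = my_decision_theory()  # an identical copy of the agent
    if my_action == "cooperate" and twin_action == "cooperate":
        return 3
    return 1

print(world())  # 3
```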

Comment author: jimrandomh 19 May 2011 07:05:29PM 0 points [-]

That's very wrong. You've given up the distinctions between question and answer, and between answer and solution procedure. This also lets in stupid problems, like "you get a dollar iff the source code to your decision theory is prime", which I would classify as not a decision theory problem at all.

Comment author: cousin_it 19 May 2011 07:10:52PM *  3 points [-]

Yeah, you can't control arbitrary world programs that way, but remember that our world really is like that: a human mind runs within the same universe that it's trying to control. One idea would be to define a class of "fair" world programs (perhaps those that can be factorized in appropriate ways) and care only about solving those. But I guess I'd better wait for your writeup, because I don't understand what kind of formalism you prefer.

Comment author: Vladimir_Nesov 19 May 2011 10:57:49PM 2 points [-]

This also lets in stupid problems, like "you get a dollar iff the source code to your decision theory is prime", which I would classify as not a decision theory problem at all.

In this case, the outcome is the same for every action you take. A decision theory can still consider this problem, and rightfully conclude that it's irrelevant which action it takes (or maybe it infers that jumping on the right foot slightly raises the chance that the Dark Lords of the Matrix wrote its source code so that it's prime). I don't see how this is a problem.
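This constancy can be sketched directly (the source-to-number encoding below is an arbitrary illustrative choice): the payoff reads a property of the agent's source and ignores the action entirely, so every action yields the same outcome.

```python
def is_prime(n):
    # Simple trial division; adequate for the small toy encoding below.
    if n < 2:
        return False
    return all(n % i for i in range(2, int(n ** 0.5) + 1))

def world(agent_source, action):
    # The action is accepted but never used: only the source matters.
    code = sum(agent_source.encode())  # toy encoding of source as a number
    return 1 if is_prime(code) else 0

src = "def agent(): return 'jump on the right foot'"
# The outcome is the same for every action the agent might take:
print(world(src, "jump") == world(src, "stand still"))  # True
```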

Comment author: Vladimir_Nesov 19 May 2011 10:59:18PM 1 point [-]

That's very wrong. You've given up the distinctions between question and answer, and between answer and solution procedure.

But the world really is like that, and it is essential for a decision theory to capture this.

Comment author: jimrandomh 19 May 2011 11:24:57PM 0 points [-]

I don't know of any important aspect of the world that can't be captured in question-answer-procedure format. Could you give an example?

Comment author: Vladimir_Nesov 19 May 2011 11:35:43PM *  1 point [-]

But who does the capturing? (To push the analogy one step up, I know of no solvable question that doesn't admit an answer written on a sheet of paper.)

See the section on "utility functions" in this post. UDT/ADT seeks exactly to infer functional dependencies that can then be used in the usual manner (at least, this is one way to look at what's going on). It is intended to solve the very problem which you argue must always be solvable.

My examples of explicit dependence bias show that there are wrong and right ways of parsing the outcome as depending on the agent's decision. If we are guaranteed to be given a right parsing, then that part is indeed covered, and there is no need to worry about the wrong ones. But believing that a right parsing merely exists doesn't particularly help in finding one.

So I guess the problem you are having with UDT is that you assume the problem of finding a correct explicit dependence of outcome on action is already solved, and that we have a function World already specified in a way that doesn't implicitly depend on the agent's actions. But UDT is intended to solve the problem without making this assumption, and instead tries to find an unbiased dependence on its own. Since you assume the goal of UDT as a prerequisite in your thinking about decision theory, you don't see the motivation for UDT; indeed, there would be none, had we assumed this problem solved.

Comment author: jimrandomh 19 May 2011 11:49:09PM 1 point [-]

We're talking past each other, and this back-and-forth conversation isn't going anywhere because we're starting from very different definitions. Let's restart this conversation after I've finished the post that builds up the definitions from scratch.