Vladimir_Nesov comments on Towards a New Decision Theory - Less Wrong

50 Post author: Wei_Dai 13 August 2009 05:31AM


Comments (142)


Comment author: Eliezer_Yudkowsky 16 August 2009 11:27:01PM 2 points

I suggest that regarding crazy decision theories with compact preferences as sane decision theories with noncompact preferences is a step backward that will only confuse you and your readers. What is accomplished by doing so?

Comment author: Vladimir_Nesov 16 August 2009 11:41:59PM *  1 point

How should humans be regarded, then? They certainly don't run a compact decision algorithm, and their actions are not particularly telling of their preferences. Still, they have to be regarded as having a TDT preference, in order to extract that preference and place it in a TDT AI. As I envision it, a theory that defines what TDT preference humans have must also be capable of telling what the TDT preference of a crazy AI, a petunia, or the Sun is.

(Btw, I'm no longer sure that a CDT-generated AI will give crazy answers to questions about the past; it may just become indifferent to the past altogether, since that part of its preference is already erased from its mind. CDT gave crazy answers, but by the time it constructed the TDT, it had already lost the part of its preference that corresponds to giving those crazy answers, and so the TDT won't give them.)

Comment author: Eliezer_Yudkowsky 17 August 2009 12:11:56AM *  2 points

If you regard humans as sane EU maximizers with crazy preferences then you end up extracting crazy preferences! This is exactly the wrong thing to do.

I can't make out what you're saying about the CDT-generated AI because I don't understand this talk about "that part of preference is already erased from its mind". You might be better off visualizing Dai's GLT, of which a "half timeless decision theory" is just the compact generator.

Comment author: Vladimir_Nesov 17 August 2009 12:29:20AM *  3 points

"If you regard humans as sane EU maximizers with crazy preferences then you end up extracting crazy preferences! This is exactly the wrong thing to do."

No, that's not what I mean. Humans are no more TDT agents with crazy preferences than CDT agents are TDT agents with crazy preferences: notice that I defined CDT's preference to be the preference of the TDT into which CDT rewrites itself. The TDT preference is not part of the CDT AI's algorithm, but it follows from it, just as the factorial of 72734 follows from the code of the factorial function. Thus (if I try to connect concepts that don't really fit) humanity's preference is analogous to the preference of the TDT AI that humanity could write if the process of writing this AI were ideal according to the resulting AI's preference (but without this process wireheading on itself; more like a fixpoint, and not really happening in time). This is not to say that it's the AI humanity is most likely to write, as you can see from the example of trying to define a petunia's preferences. Well, if I could formalize this step, I'd have written it up already. It seems to me like a direction towards a better formalization from "if humans thought faster, were smarter, knew more, etc."
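
The factorial analogy can be made concrete with a minimal Python sketch. It only illustrates the point that a value can be fully determined by a piece of code without being written anywhere in it; it says nothing about how a preference would actually be extracted from an agent. The function name `factorial` and the variable `value` are illustrative choices, not anything defined in the thread.

```python
import math


def factorial(n: int) -> int:
    """Iterative factorial. The code stores no precomputed results."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


# The value of 72734! appears nowhere in the source above,
# yet it is completely fixed by that source plus the input.
value = factorial(72734)

# Cross-check against the standard library's implementation.
assert value == math.factorial(72734)

# Report the size of the result rather than printing the huge number itself.
print(f"72734! occupies {value.bit_length()} bits")
```

In the analogy, the CDT AI's algorithm plays the role of the function's source code, and the TDT preference plays the role of the computed value: absent from the text of the algorithm, but following from it.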