Why isn't the following decision theory optimal?
I've recently read the decision theory FAQ, as well as Eliezer's TDT paper. When reading the TDT paper, a simple decision procedure occurred to me which, as far as I can tell, gets the correct answer to every tricky decision problem I've seen. As discussed in the FAQ above, evidential decision theory gets the chewing gum problem wrong, causal decision theory gets Newcomb's problem wrong, and TDT gets counterfactual mugging wrong.
In the TDT paper, Eliezer postulates an agent named Gloria (page 29), who is defined as an agent who maximizes on decision-determined problems. He describes how a CDT-agent named Reena would want to transform herself into Gloria. Eliezer writes:
By Gloria’s nature, she always already has the decision-type causal agents wish they had, without need of precommitment.
Eliezer then goes on to develop TDT, which is supposed to construct Gloria as a byproduct.
Gloria, as we have defined her, is defined only over completely decision-determined problems of which she has full knowledge. However, the agenda of this manuscript is to introduce a formal, general decision theory which reduces to Gloria as a special case.
Why can't we instead construct Gloria directly, using the idea of the thing that CDT agents wished they were? Obviously we can't just postulate a decision algorithm that we don't know how to execute, and then note that a CDT agent would wish they had that decision algorithm, and pretend we had solved the problem. We need to be able to describe the ideal decision algorithm to a level of detail that we could theoretically program into an AI.
Consider this decision algorithm, which I'll temporarily call Nameless Decision Theory (NDT) until I get feedback about whether it deserves a name: you should always make the decision that a CDT-agent would have wished he had precommitted to, if he had previously known he'd be in his current situation and had the opportunity to precommit to a decision.
In effect, you are making a general precommitment to behave as if you had made every specific precommitment that would ever be advantageous to you.
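To make the procedure concrete, here is a minimal sketch of NDT applied to Newcomb's problem. The predictor accuracy `p`, the payoff amounts, and the `ndt_decision` helper are illustrative assumptions, not anything specified in the post: the idea is just that the agent evaluates each candidate precommitment from the vantage point *before* the prediction was made, and then acts on the best one.

```python
# Hedged sketch of NDT on Newcomb's problem.
# Assumed setup (not from the post): a predictor with accuracy p,
# $1,000,000 in the opaque box, $1,000 in the transparent box.

def expected_payoff(action, p=0.99):
    """Expected payoff if the agent had precommitted to `action`
    and the predictor foresaw that precommitment with accuracy p."""
    if action == "one-box":
        # Predictor almost certainly filled the opaque box.
        return p * 1_000_000 + (1 - p) * 0
    else:  # "two-box"
        # Predictor almost certainly left the opaque box empty.
        return p * 1_000 + (1 - p) * (1_000_000 + 1_000)

def ndt_decision(actions, p=0.99):
    """Pick the action a CDT agent would have wished to precommit to:
    the one maximizing expected payoff evaluated before the prediction."""
    return max(actions, key=lambda a: expected_payoff(a, p))

print(ndt_decision(["one-box", "two-box"]))  # one-box
```

A plain CDT agent, reasoning causally after the boxes are fixed, two-boxes by dominance; the precommitment-based evaluation above one-boxes instead, which is the behavior NDT is meant to capture.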
NDT is so simple, and Eliezer comes so close to stating it in his discussion of Gloria, that I assume there is some flaw with it that I'm not seeing. Perhaps NDT does not count as a "real"/"well defined" decision procedure, or can't be formalized for some reason? Even so, it does seem like it'd be possible to program an AI to behave in this way.
Can someone give an example of a decision problem for which this decision procedure fails? Or for which there are multiple possible precommitments that you would have wished you'd made and it's not clear which one is best?
EDIT: I now think this definition of NDT better captures what I was trying to express: You should always make the decision that a CDT-agent would have wished he had precommitted to, if he had previously considered the possibility of his current situation and had the opportunity to costlessly precommit to a decision.
Help create an instrumental rationality "stack ranking"?
I recently heard about SIAI's Rationality Minicamp and thought it sounded cool, but for logistical/expense reasons I won't be going to one.
There are probably lots of people who are interested in improving their instrumental rationality, know about and like LessWrong, but haven't read the vast majority of content because there is just so much material, and the practical payoff is uncertain.
It would be cool if it were much easier for people to find the highest-ROI material on LessWrong.
My rough idea for how this new instrumental rationality tool might work:
- It starts off as a simple wiki focused on instrumental rationality. People only add things to the wiki (often just links to existing LessWrong articles) if they have tried them and found them very useful for achieving their goals.
- People are encouraged to add "exercises" that help you develop the skill represented by the article, of the type that are presumably done at the Rationality Minicamps.
- Only people who have tried the specific thing in question should add comments about their experiences with it.
- Long Term Goal: Every LessWrong user can define their own private stack rank of the most important concepts/techniques/habits for instrumental rationality. These stack ranks are globally merged by some LessWrong software to create an overall stack rank of the highest ROI ideas/behaviors/techniques as judged by the LessWrong community at any given time. People looking to improve their instrumental rationality can then just visit this global stack rank and pick the highest item that they haven't tried yet to experiment with, and work backwards from there if there are any prerequisites.
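One simple way the global merge could work is a Borda-style count: each item earns points based on how high it sits in each user's private stack rank, and the global rank sorts by total points. This is a hypothetical sketch, not the post's specified mechanism, and the function name and example technique names are made up for illustration:

```python
from collections import defaultdict

def merge_stack_ranks(user_ranks):
    """Merge private stack ranks (each a list ordered best-first)
    into one global rank via a Borda count: in a list of length n,
    the top item earns n points, the next n - 1, and so on.
    Items a user hasn't ranked get no points from that user."""
    scores = defaultdict(int)
    for ranking in user_ranks:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
    return sorted(scores, key=scores.get, reverse=True)

# Example with hypothetical technique names:
ranks = [
    ["goal factoring", "pomodoros", "trigger-action plans"],
    ["trigger-action plans", "goal factoring"],
    ["pomodoros", "goal factoring"],
]
print(merge_stack_ranks(ranks))  # goal factoring ranks first
```

Borda counts are only one of many rank-aggregation rules; something more robust to strategic voting or sparse rankings might fit better in practice.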
Do you think others would find this useful? Anyone have suggested improvements?