Dr_Manhattan comments on Stupid Questions Open Thread - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
(I super-upvoted this, since asking stupid questions is a major flinch/ugh field)
Ok, my stupid question, asked in a blatantly stupid way, is: where does the decision theory stuff fit in The Plan? I have gotten the notion that it's important for Value-Preserving Self-Modification in a potential AI agent, but I'm confused because it all sounds too much like game theory - there are all these other agents it deals with. If it's not for VPSM, and is in fact some exploration of how an AI would deal with other potential agents, why is this important at all? Let the AI figure that out; it's going to be smarter than us anyway.
If there is some Architecture document I should read to grok this, please point me there.
I think Eliezer's reply (point '(B)') to this comment by Wei Dai provides some explanation as to what the decision theory is doing here.
From the reply (concerning UDT):
My impression is that, with self-modification and time, continuity of identity becomes a sticky issue. If I can become an entirely different person tomorrow, how I structure my life is not the weak game theory of "how do I bargain with another me?" but the strong game theory of "how do I bargain with someone else?"
Other agents are complicated regularities in the world (or in a more general decision problem setting). Noticing where our understanding breaks down when we try to optimize in the presence of other agents is a good heuristic for spotting gaps in our understanding of the idea of optimization itself.
I think the main reason is simple: it's hard to create a transparent, reliable agent without a decision theory. And since we're talking about a superpowered agent, you don't want to mess this up. CDT and EDT are known to mess up, so it would be very helpful to find a "correct" decision theory. You might be able to get around this by letting the AI self-improve, but it would be nice to have one less thing to worry about, especially because how the AI improves is itself a decision.
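The standard illustration of the two theories disagreeing is Newcomb's problem, where CDT is usually taken to go wrong. Here's a minimal sketch of that divergence - the function names, the payoff structure, and the 0.99 predictor accuracy are all my own illustrative assumptions, not anything from this thread:

```python
# Hypothetical sketch of Newcomb's problem, where CDT and EDT diverge.
# A predictor fills box B with $1,000,000 iff it predicts you will take
# only box B ("one-box"); box A always holds $1,000. "Two-box" takes both.

P_CORRECT = 0.99  # assumed predictor accuracy (illustrative)

def edt_value(action):
    """EDT: treat your own action as evidence about the prediction."""
    if action == "one-box":
        return P_CORRECT * 1_000_000          # predictor likely foresaw this
    return P_CORRECT * 1_000 + (1 - P_CORRECT) * 1_001_000

def cdt_value(action, p_box_full):
    """CDT: the boxes are already filled; your act can't causally change that."""
    if action == "one-box":
        return p_box_full * 1_000_000
    return p_box_full * 1_000_000 + 1_000     # dominance: two-boxing adds $1,000

# EDT one-boxes; CDT two-boxes for any fixed belief p_box_full.
assert edt_value("one-box") > edt_value("two-box")
assert cdt_value("two-box", 0.5) > cdt_value("one-box", 0.5)
```

With an accurate predictor, one-boxers walk away rich, yet CDT's dominance reasoning still two-boxes - which is the sense in which a "correct" decision theory is still an open problem.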