JGWeissman comments on General purpose intelligence: arguing the Orthogonality thesis - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (156)
Such an agent would also not change its decision theory as a result of philosophical consideration, which potentially limits its power.
I wouldn't argue against this as written, but Stuart was claiming that convergence is "very unlikely", which I think is too strong.
I don't think that follows; at the least, the agent could change its decision theory as a result of some consideration, which may or may not be "philosophical". We already have the example that a CDT agent that learns in advance that it will face Newcomb's problem could predict it would do better if it switched to TDT.
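The payoff comparison behind that example can be sketched numerically. This is only an illustration, assuming the standard Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a hypothetical predictor accuracy `p`; none of these numbers come from the thread itself.

```python
# Expected payoffs in Newcomb's problem for a predictor of accuracy p,
# using the standard $1M / $1k payoff structure (illustrative assumption).
def expected_payoff(one_box: bool, p: float) -> float:
    if one_box:
        # The opaque box contains $1M iff the predictor foresaw one-boxing.
        return p * 1_000_000
    # Two-boxing always collects the $1,000, plus $1M only when the
    # predictor mistakenly expected one-boxing.
    return 1_000 + (1 - p) * 1_000_000

p = 0.99  # assumed predictor accuracy
print(expected_payoff(True, p))   # roughly 990,000
print(expected_payoff(False, p))  # roughly 11,000
```

For any accuracy above about 50.05%, the one-boxing disposition wins in expectation, which is why a CDT agent that can commit in advance predicts it would profit from switching.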
I wrote earlier:
XDT<CDT> (or in Eliezer's words, "crippled and inelegant form of TDT") is closer to TDT but still worse. For example, XDT<CDT> would fail to acausally control/trade with other agents living before the time of its self-modification, or in other possible worlds.
Ah, yes, I agree that CDT would modify to XDT&lt;CDT&gt; rather than TDT, though the fact that it self-modifies at all shows that goal-driven agents can change decision theories when the new decision theory helps them achieve their goals. I do think it's important to consider how a particular decision theory can decide to self-modify, and to design an agent with a decision theory that can self-modify in good ways.