JGWeissman comments on General purpose intelligence: arguing the Orthogonality thesis - Less Wrong

Post author: Stuart_Armstrong 15 May 2012 10:23AM




Comment author: Wei_Dai 16 May 2012 07:43:23PM 12 points

I think we don't just lack introspective access to our goals, but can't be said to have goals at all (in the sense of a preference ordering over some well-defined ontology, attached to some decision theory that we're actually running). The kind of pseudo-goals we have (behavior tendencies and semantically unclear values expressed in natural language) don't seem to have the motivational strength to make us think "I should keep my goal G1 instead of avoiding arbitrariness", nor is it clear what it would mean to "keep" such pseudo-goals as one self-improves.

What if it's the case that evolution always or almost always produces agents like us, so the only way they can get real goals in the first place is via philosophy?

Comment author: JGWeissman 16 May 2012 08:06:05PM 4 points

The primary point of my comment was to argue that an agent that has a goal in the strong sense would not abandon its goal as a result of philosophical consideration. Your response seems more directed at my afterthought about how our intuitions based on human experience would cause us to miss the primary point.

I think that we humans do have goals, despite not being able to consistently pursue them. I want myself and my fellow humans to continue our subjective experiences of life in enjoyable ways, without modifying what we enjoy. This includes connections to other people, novel experiences, high challenge, etc. There is, of course, much work to be done to complete this list and fully define all the high-level concepts, but in the end I think there are real goals there, which I would like to see embodied in a powerful agent that actually runs a coherent decision theory. Philosophy probably has to play some role in clarifying our "pseudo-goals" as actual goals, but so does looking at our "pseudo-goals", however arbitrary they may be.

Comment author: Wei_Dai 16 May 2012 08:39:57PM 5 points

The primary point of my comment was to argue that an agent that has a goal in the strong sense would not abandon its goal as a result of philosophical consideration.

Such an agent would also not change its decision theory as a result of philosophical consideration, which potentially limits its power.

Philosophy probably has to play some role in clarifying our "pseudo-goals" as actual goals, but so does looking at our "pseudo-goals", however arbitrary they may be.

I wouldn't argue against this as written, but Stuart was claiming that convergence is "very unlikely" which I think is too strong.

Comment author: JGWeissman 16 May 2012 09:01:18PM 2 points

Such an agent would also not change its decision theory as a result of philosophical consideration, which potentially limits its power.

I don't think that follows, or at least the agent could change its decision theory as a result of some consideration, which may or may not be "philosophical". We already have the example that a CDT agent that learns in advance it will face Newcomb's problem could predict it would do better if it switched to TDT.
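The expected-value calculation behind this point can be sketched concretely. Below is a minimal, hypothetical illustration of Newcomb's problem payoffs (the standard $1,000,000/$1,000 setup with an imperfect predictor); the function name and the predictor-accuracy parameter are assumptions for illustration, not a formal implementation of CDT or TDT.

```python
def expected_payoff(one_boxes: bool, predictor_accuracy: float = 0.99) -> float:
    """Expected payoff in Newcomb's problem for a fixed disposition.

    The predictor fills the opaque box with $1,000,000 iff it predicts
    one-boxing; the transparent box always holds $1,000.
    """
    if one_boxes:
        # The predictor correctly foresees one-boxing with probability p,
        # in which case the opaque box contains the million.
        return predictor_accuracy * 1_000_000
    else:
        # A two-boxer always gets the $1,000, and gets the million only
        # when the predictor mispredicts.
        return 1_000 + (1 - predictor_accuracy) * 1_000_000

# An agent deciding its future disposition before the boxes are filled
# (as a CDT agent self-modifying in advance would) can see that committing
# to one-boxing has the higher expected payoff.
assert expected_payoff(True) > expected_payoff(False)
```

This is the sense in which a CDT agent, told in advance that it will face Newcomb's problem, can calculate that switching decision theories now serves its goal, even though CDT's own causal reasoning would two-box at the moment of choice.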

Comment author: Wei_Dai 16 May 2012 10:15:20PM 2 points

I wrote earlier:

"ability to improve decision theory via philosophical reasoning" (as opposed to CDT-AI changing into XDT<CDT> and then being stuck with that)

XDT<CDT> (or in Eliezer's words, "crippled and inelegant form of TDT") is closer to TDT but still worse. For example, XDT<CDT> would fail to acausally control/trade with other agents living before the time of its self-modification, or in other possible worlds.

Comment author: JGWeissman 16 May 2012 11:30:22PM 0 points

Ah, yes, I agree that CDT would modify to XDT&lt;CDT&gt; rather than TDT. Still, the fact that it self-modifies at all shows that goal-driven agents can change decision theories when a new decision theory helps them achieve their goals. I do think it's important to consider how a particular decision theory can decide to self-modify, and to design an agent with a decision theory that can self-modify in good ways.

Comment author: Dolores1984 16 May 2012 09:37:06PM 1 point

Not strictly. If a strongly goal-directed agent determines that a different decision theory (or any change to itself) would better achieve its goal, it would adopt that new decision theory or change.