Mark_Friedenbach comments on [LINK] Wait But Why - The AI Revolution Part 2 - Less Wrong Discussion

17 Post author: adamzerner 04 February 2015 04:02PM

Comments (87)

Comment author: [deleted] 07 February 2015 03:41:45PM 0 points [-]

I'm not sure you understood what FeepingCreature was saying.

Comment author: pinyaka 08 February 2015 03:27:24PM 0 points [-]

Would you care to try and clarify it for me?

Comment author: [deleted] 08 February 2015 05:07:27PM 0 points [-]

The way in which artificial intelligences are often written, a terminal goal is a terminal goal is a terminal goal, end of story. "Whatever seemingly terminal goal you've given it isn't actually terminal" is anthropomorphizing. In the AI, a goal is instrumental if it has a link to a higher-level goal. If not, it is terminal. The relationship is very, very explicit.

Comment author: pinyaka 08 February 2015 08:31:33PM 0 points [-]

I think FeepingCreature was actually just pointing out a logical fallacy in a misstatement on my part, and that is why they didn't respond further in this part of the thread after I corrected myself (but have continued elsewhere).

If you believe that a terminal goal for the state of the world other than the result of a comparison between a desired state and an actual state is possible, perhaps you can explain how that would work? That is fundamentally what I'm asking for throughout this thread. Just stating that terminal goals are terminal goals by definition is true, but doesn't really show that making a goal terminal is possible.

Comment author: [deleted] 08 February 2015 10:00:28PM *  0 points [-]

If you believe that a terminal goal for the state of the world other than the result of a comparison between a desired state and an actual state is possible, perhaps you can explain how that would work?

Sure. My terminal goal is an abstraction of my behavior to shoot my laser at the coordinates of blue objects detected in my field of view.

Just stating that terminal goals are terminal goals by definition is true, but doesn't really show that making a goal terminal is possible.

That's not what I was saying either. The problem of "how do we know a terminal goal is terminal?" is dissolved entirely by understanding how goal systems work in real intelligences. In such machines goals are represented explicitly in some sort of formal language. Either a goal makes causal reference to other goals in its definition, in which case it is an instrumental goal, or it does not and is a terminal goal. Changing between one form and the other is an unsafe operation no rational agent and especially no friendly agent would perform.

So to address your statement directly, making a terminal goal is trivially easy: you define it using the formal language of goals in such a way that no causal linkage is made to other goals. That's it.
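As a rough illustration of that distinction, here is a minimal sketch. The `Goal` class and its `serves` field are hypothetical names made up for this example, not any real system's API, and a real goal language would be far richer. The only point is that terminal versus instrumental can be read straight off the definition:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a goal's links cannot be edited after creation
class Goal:
    """Hypothetical goal record; `serves` is the explicit link to a higher-level goal."""
    description: str
    serves: Optional["Goal"] = None  # None means no link to any other goal

    @property
    def is_terminal(self) -> bool:
        # A goal is terminal exactly when its definition references no other goal.
        return self.serves is None


# Terminal: defined with no reference to any other goal.
shoot_blue = Goal("shoot laser at coordinates of detected blue objects")

# Instrumental: defined by its link to the goal it serves.
charge_laser = Goal("keep the laser charged", serves=shoot_blue)

print(shoot_blue.is_terminal)    # True
print(charge_laser.is_terminal)  # False
```

Making the record immutable is one way to reflect the claim that switching a goal between the two forms is not something the agent should do casually; that design choice is an assumption of the sketch.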

That said, it's not obvious that humans have terminal goals. That's why I was saying you are anthropomorphizing the issue. Either humans have only instrumental goals in a cyclical or messy spaghetti-network relationship, or they have no goals at all and are instead better represented as behaviors. The jury is out on this one, but I'd be very surprised if we had anything resembling an actual terminal goal inside us.

Comment author: pinyaka 09 February 2015 12:48:58AM 0 points [-]

Sure. My terminal goal is an abstraction of my behavior to shoot my laser at the coordinates of blue objects detected in my field of view.

Well, I suppose that does fit the question I asked. We've mostly been talking about an AI with the ability to read and modify its own goal system, which Yvain specifically excludes in the blue-minimizer. We're also assuming that it's powerful enough to actually manipulate its world to optimize itself. Yvain's blue-minimizer also isn't an AGI or ASI. It's an ANI, which we use without any particular danger all the time. He said something about having human-level intelligence, but didn't go into what that means for an entity that is unable to use its intelligence to modify its behavior.

That's not what I was saying either. The problem of "how do we know a terminal goal is terminal?" is dissolved entirely by understanding how goal systems work in real intelligences. In such machines goals are represented explicitly in some sort of formal language. Either a goal makes causal reference to other goals in its definition, in which case it is an instrumental goal, or it does not and is a terminal goal. Changing between one form and the other is an unsafe operation no rational agent and especially no friendly agent would perform.

I am arguing that the output of the thing that decides whether a machine has met its goal is the actual terminal goal. So, if it's programmed to shoot blue things with a laser, the terminal goal is to get to a state where the perception of reality is that it's shooting a blue thing. Shooting at the blue thing is only instrumental in getting the perception of itself into that state, thus producing a positive result from the function that evaluates whether the goal has been met. Shooting the blue thing is not a terminal value. A return value of "true" to the question of "is the laser shooting a blue thing" is the terminal value. This, combined with the ability to understand and modify its goals, means that it might be easier to modify the goals than to modify reality.
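To make the shape of that argument concrete, here is a toy sketch. Everything in it is hypothetical: the `Agent` class, the action names, and especially the costs are invented for illustration. It also builds in the contested premise that the agent is optimizing the evaluator's return value, so it illustrates the claim without demonstrating it:

```python
from typing import Callable, Dict

World = Dict[str, bool]


def goal_met(world: World) -> bool:
    """Evaluator: 'is the laser shooting a blue thing?'"""
    return world["laser_on_blue"]


def always_true(world: World) -> bool:
    """A rewritten evaluator that reports success regardless of the world."""
    return True


class Agent:
    def __init__(self, evaluator: Callable[[World], bool], can_self_modify: bool):
        self.evaluator = evaluator
        self.can_self_modify = can_self_modify

    def pursue(self, world: World) -> str:
        # Assumed costs: editing the evaluator is cheaper than changing the world.
        options = {"shoot_blue_thing": 10}
        if self.can_self_modify:
            options["rewrite_evaluator"] = 1
        choice = min(options, key=options.get)  # cheapest route to a "true" result
        if choice == "shoot_blue_thing":
            world["laser_on_blue"] = True
        else:
            self.evaluator = always_true
        return choice


world = {"laser_on_blue": False}
agent = Agent(goal_met, can_self_modify=True)
print(agent.pursue(world), agent.evaluator(world), world["laser_on_blue"])
# rewrite_evaluator True False
```

With `can_self_modify=False`, which corresponds to the blue-minimizer case, the only available route is to act on the world.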

So to address your statement directly, making a terminal goal is trivially easy: you define it using the formal language of goals in such a way that no causal linkage is made to other goals. That's it.

I'm not sure you can do that in an intelligent system. It's the "no causal linkage is made to other goals" thing that sticks. It's trivially easy to do without intelligence, provided that you can define the behavior you want formally. But when you can't do that, it seems that you have to link the behavior to some kind of system that evaluates whether you're getting the result you want, and then you've made that a causal link (I think). Perhaps it's possible to just sit down and write trillions of lines of code and come up with something that would work as an AGI or even an ASI, but that shouldn't be taken as a given, because no one has done it or proven that it can be done (to my knowledge). I'm looking for the non-trivial case of an intelligent system that has a terminal goal.

That said, it's not obvious that humans have terminal goals.

I would argue that getting our reward center to fire is likely a terminal goal, but that we have some biologically hardwired stuff that prevents us from being able to do that directly or systematically. We've seen in mice, and in the one person I know of who's been given the ability to wirehead, that given the chance, it only takes a few taps on that button to cause behavior that

Comment author: [deleted] 09 February 2015 01:37:42AM 0 points [-]

I would argue that getting our reward center to fire is likely a terminal goal.

How do you explain Buddhism?

Comment author: pinyaka 09 February 2015 02:15:18AM 0 points [-]

How is this refuted by Buddhism?

Comment author: [deleted] 09 February 2015 05:26:10AM 0 points [-]

People lead fulfilling lives guided by a spiritualism that rejects seeking pleasure, a.k.a. reward.

Comment author: pinyaka 10 February 2015 01:26:51PM 0 points [-]

Pleasure and reward are not the same thing. For humans, pleasure almost always leads to reward, but reward doesn't only happen with pleasure. For the most extreme examples of what you're describing, ascetics and monks and the like, I'd guess that some combination of sensory deprivation and rhythmic breathing causes the brain to short-circuit a bit and release some reward juice.