royf comments on Fake Causality - Less Wrong

41 Post author: Eliezer_Yudkowsky 23 August 2007 06:12PM

Comments (86)

Comment author: wedrifid 04 June 2012 03:55:56AM 1 point [-]

Sadly, we humans can't rewrite our own code, the way a properly designed AI could.

Sure we can!

Not the way a properly designed AI could. The difference is qualitative.

Comment author: royf 04 June 2012 04:51:09AM 0 points [-]

Having asserted that your claim is, in fact, new information: can you please clarify and explain why you believe that?

Comment author: CuSithBell 04 June 2012 04:56:01AM 1 point [-]

An advanced AI could reasonably be expected to be able to explicitly edit any part of its code however it desires. Humans are unable to do this.

Comment author: royf 04 June 2012 05:08:54AM 0 points [-]

I believe that is a misconception. Perhaps I'm not being reasonable, but I would expect the level at which you could describe such a creature in terms of "desires" to be conceptually distinct from the level at which it can operate on its own code.

This is the same old question of "free will" again. Desires don't exist as a mechanism. They exist as an approximate model of describing the emergent behavior of intelligent agents.

Comment author: CuSithBell 04 June 2012 05:18:20AM 0 points [-]

You are saying that a GAI being able to alter its own "code" on the actual code-level does not imply that it is able to alter in a deliberate and conscious fashion its "code" in the human sense you describe above?

Generally GAIs are ascribed extreme powers around here - if it has low-level access to its code, then it will be able to determine how its "desires" derive from this code, and will be able to produce whatever changes it wants. Similarly, it will be able to hack human brains with equal finesse.

Comment author: wedrifid 04 June 2012 05:47:02AM *  0 points [-]

Generally GAIs are ascribed extreme powers around here

(Yes, and this is partly just because AIs that don't meet a certain standard are implicitly excluded from the definition of the class being described. AIs below that critical threshold are considered boring and irrelevant for most purposes.)

Comment author: TheOtherDave 04 June 2012 01:27:20PM 0 points [-]

Indeed, the same typically goes for NIs. Though some speakers make exceptions for some speakers.

Comment author: royf 04 June 2012 06:13:17AM 2 points [-]

I am saying pretty much exactly that. To clarify further, the words "deliberate", "conscious" and "wants" again belong to the level of emergent behavior: they can be used to describe the agent, not to explain it (what could not be explained by "the agent did X because it wanted to"?).

Let's instead make an attempt to explain. A complete control of an agent's own code, in the strict sense, is in contradiction of Gödel's incompleteness theorem. Furthermore, information-theoretic considerations significantly limit the degree to which an agent can control its own code (I'm wondering if anyone has ever done the math. I expect not. I intend to look further into this). In information-theoretic terminology, the agent will be limited to typical manipulations of its own code, which will be a strict (and presumably very small) subset of all possible manipulations.

Can an agent be made more effective than humans in manipulating its own code? I have very little doubt that it can. Can it lead to agents qualitatively more intelligent than humans? Again, I believe so. But I don't see a reason to believe that the code-rewriting ability itself can be qualitatively different than a human's, only quantitatively so (although of course the engineering details can be much different; I'm referring to the algorithmic level here).

Generally GAIs are ascribed extreme powers around here

As you've probably figured out, I'm new here. I encountered this post while reading the sequences. Although I'm somewhat learned on the subject, I haven't yet reached the part (which I trust exists) where GAI is discussed here.

On my path there, I'm actively trying to avoid a certain degree of group thinking which I detect in some of the comments here. Please take no offense, but it's phrases like the above quote which worry me: is there really a consensus around here about such profound questions? Hopefully it's only the terminology which is agreed upon, in which case I will learn it in time. But please, let's make our terminology "pay rent".

Comment author: CuSithBell 04 June 2012 02:49:18PM 0 points [-]

You are saying that a GAI being able to alter its own "code" on the actual code-level does not imply that it is able to alter in a deliberate and conscious fashion its "code" in the human sense you describe above?

I am saying pretty much exactly that. To clarify further, the words "deliberate", "conscious" and "wants" again belong to the level of emergent behavior: they can be used to describe the agent, not to explain it (what could not be explained by "the agent did X because it wanted to"?).

Sure, but we could imagine an AI deciding something like "I do not want to enjoy frozen yogurt", and then altering its code in such a way that it is no longer appropriate to describe it as enjoying frozen yogurt, yeah?

Let's instead make an attempt to explain. A complete control of an agent's own code, in the strict sense, is in contradiction of Gödel's incompleteness theorem. Furthermore, information-theoretic considerations significantly limit the degree to which an agent can control its own code (I'm wondering if anyone has ever done the math. I expect not. I intend to look further into this). In information-theoretic terminology, the agent will be limited to typical manipulations of its own code, which will be a strict (and presumably very small) subset of all possible manipulations.

This seems trivially false - if an AI is instantiated as a bunch of zeros and ones in some substrate, how could Godel or similar concerns stop it from altering any subset of those bits?

Can an agent be made more effective than humans in manipulating its own code? I have very little doubt that it can. Can it lead to agents qualitatively more intelligent than humans? Again, I believe so. But I don't see a reason to believe that the code-rewriting ability itself can be qualitatively different than a human's, only quantitatively so (although of course the engineering details can be much different; I'm referring to the algorithmic level here).

You see reasons to believe that any artificial intelligence is limited to altering its motivations and desires in a way that is qualitatively similar to humans? This seems like a pretty extreme claim - what are the salient features of human self-rewriting that you think must be preserved?

Generally GAIs are ascribed extreme powers around here

As you've probably figured out, I'm new here. I encountered this post while reading the sequences. Although I'm somewhat learned on the subject, I haven't yet reached the part (which I trust exists) where GAI is discussed here.

On my path there, I'm actively trying to avoid a certain degree of group thinking which I detect in some of the comments here. Please take no offense, but it's phrases like the above quote which worry me: is there really a consensus around here about such profound questions? Hopefully it's only the terminology which is agreed upon, in which case I will learn it in time. But please, let's make our terminology "pay rent".

I don't think it's a "consensus" so much as an assumed consensus for the sake of argument. Some do believe that any hypothetical AI's influence is practically unlimited, some agree to assume that because it's not ruled out and is a worst-case scenario or an interesting case (see wedrifid's comment on the grandparent (aside: not sure how unusual or nonobvious this is, but we often use familial relationships to describe the relative positions of comments, e.g. the comment I am responding to is the "parent" of this comment, the one you were responding to when you wrote it is the "grandparent". I think that's about as far as most users take the metaphor, though.)).

Comment author: royf 04 June 2012 11:27:53PM 0 points [-]

Thanks for challenging my position. This discussion is very stimulating for me!

Sure, but we could imagine an AI deciding something like "I do not want to enjoy frozen yogurt", and then altering its code in such a way that it is no longer appropriate to describe it as enjoying frozen yogurt, yeah?

I'm actually having trouble imagining this without anthropomorphizing (or at least zoomorphizing) the agent. When is it appropriate to describe an artificial agent as enjoying something? Surely not when it secretes serotonin into its bloodstream and synapses?

This seems trivially false - if an AI is instantiated as a bunch of zeros and ones in some substrate, how could Godel or similar concerns stop it from altering any subset of those bits?

It's not a question of stopping it. Gödel is not giving it a stern look, saying: "you can't alter your own code until you've done your homework". It's more that these considerations prevent the agent from being in a state where it will, in fact, alter its own code in certain ways. This claim can and should be proved mathematically, but I don't have the resources to do that at the moment. In the meanwhile, I'd agree if you wanted to disagree.

You see reasons to believe that any artificial intelligence is limited to altering its motivations and desires in a way that is qualitatively similar to humans? This seems like a pretty extreme claim - what are the salient features of human self-rewriting that you think must be preserved?

I believe that this is likely, yes. The "salient feature" is being subject to the laws of nature, which in turn seem to be consistent with particular theories of logic and probability. The problem with such a claim is that these theories are still not fully understood.

Comment author: TheOtherDave 05 June 2012 01:17:33AM 0 points [-]

When is it appropriate to describe a natural agent as enjoying something?

Comment author: royf 05 June 2012 01:47:13AM 0 points [-]

As I said, when it secretes serotonin into its bloodstream and synapses.

Comment author: Kindly 05 June 2012 01:58:27AM *  0 points [-]

It's not a question of stopping it. Gödel is not giving it a stern look, saying: "you can't alter your own code until you've done your homework". It's more that these considerations prevent the agent from being in a state where it will, in fact, alter its own code in certain ways. This claim can and should be proved mathematically, but I don't have the resources to do that at the moment. In the meanwhile, I'd agree if you wanted to disagree.

I'd like to understand what you're saying here better. An agent instantiated as a binary program can do any of the following:

  • Rewrite its own source code with a random binary string.

  • Do things until it encounters a different agent, obtain its source code, and replace its own source code with that.

It seems to me that either of these would be enough to provide "complete control" over the agent's source code in the sense that any possible program can be obtained as a result. So you must mean something different. What is it?
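To make the first option concrete, here is a minimal sketch in Python. It assumes the agent's "source code" can be modeled as a mutable byte buffer; the function names are hypothetical and chosen only for illustration.

```python
import secrets


def overwrite_with_random_bits(code: bytearray) -> None:
    """Replace every byte of the buffer, in place, with uniformly random bytes."""
    code[:] = secrets.token_bytes(len(code))


def log2_prob_of_specific_program(n_bytes: int) -> int:
    """log2 of the chance that a uniform overwrite yields one particular program.

    Each of the 8 * n_bytes bits is an independent fair coin flip, so any
    specific target has probability 2 ** -(8 * n_bytes).
    """
    return -8 * n_bytes


# Stand-in for the agent's code (an assumption of this sketch).
buf = bytearray(b"agent code")
overwrite_with_random_bits(buf)
```

Every program of that length is reachable this way, but each one only with probability 2^-(8n), where n is the length in bytes.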

Comment author: royf 05 June 2012 02:19:29AM *  1 point [-]

Rewrite its own source code with a random binary string

This is in a sense the electronic equivalent of setting oneself on fire - replacing oneself with maximum entropy. An artificial agent is extremely unlikely to "survive" this operation.

any possible program can be obtained as a result

Any possible program could be obtained, and the huge number of possible programs should hint that most are extremely unlikely to be obtained.

I assumed we were talking about an agent that is active and kicking, and with some non-negligible chance to keep surviving. Such an agent must have a strongly non-uniform distribution over its next internal state (code included). This means that only a tiny fraction of possible programs will have any significant probability of being obtained. I believe one can give a formula for (at least an upper bound on) the expected size of this fraction (actually, the expected log size), but I also believe nobody has ever done that, so you may doubt this particular point until I prove it.
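As a rough illustration of the kind of bound meant here (a toy model, not the general proof): assume the code is n bits and each bit changes independently with probability p in one step. Both the model and the function names are assumptions made for this sketch. To first order, the number of typical next states is about 2^(n·H(p)) out of 2^n possible ones, so the log2 of the fraction is about n·(H(p) − 1).

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def log2_typical_fraction(n_bits: int, p_flip: float) -> float:
    """First-order estimate of log2(typical next states / all n-bit states).

    Assumes independent bit flips with probability p_flip (toy model).
    """
    return n_bits * (binary_entropy(p_flip) - 1.0)


# p_flip = 0.5 is the "random overwrite" case: every state is typical.
uniform_case = log2_typical_fraction(1000, 0.5)
# A small p_flip models an agent that mostly preserves its own code.
conservative_case = log2_typical_fraction(1000, 0.01)
```

In this toy model, a 1000-bit agent that flips each bit with probability 0.01 has a log2 fraction of roughly −919, while the uniform overwrite gives 0 (every state is typical).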

Comment author: CuSithBell 05 June 2012 06:20:43PM 0 points [-]

Thanks for challenging my position. This discussion is very stimulating for me!

It's a pleasure!

Sure, but we could imagine an AI deciding something like "I do not want to enjoy frozen yogurt", and then altering its code in such a way that it is no longer appropriate to describe it as enjoying frozen yogurt, yeah?

I'm actually having trouble imagining this without anthropomorphizing (or at least zoomorphizing) the agent. When is it appropriate to describe an artificial agent as enjoying something? Surely not when it secretes serotonin into its bloodstream and synapses?

Yeah, that was sloppy of me. Leaving aside the question of when something is enjoying something, let's take a more straightforward example: Suppose an AI were to design and implement more efficient algorithms for processing sensory stimuli? Or add a "face recognition" module when it determines that this would be useful for interacting with humans?

This seems trivially false - if an AI is instantiated as a bunch of zeros and ones in some substrate, how could Godel or similar concerns stop it from altering any subset of those bits?

It's not a question of stopping it. Gödel is not giving it a stern look, saying: "you can't alter your own code until you've done your homework". It's more that these considerations prevent the agent from being in a state where it will, in fact, alter its own code in certain ways. This claim can and should be proved mathematically, but I don't have the resources to do that at the moment. In the meanwhile, I'd agree if you wanted to disagree.

Hm. It seems that you should be able to write a simple program that overwrites its own code with an arbitrary value. Wouldn't that be a counterexample?

You see reasons to believe that any artificial intelligence is limited to altering its motivations and desires in a way that is qualitatively similar to humans? This seems like a pretty extreme claim - what are the salient features of human self-rewriting that you think must be preserved?

I believe that this is likely, yes. The "salient feature" is being subject to the laws of nature, which in turn seem to be consistent with particular theories of logic and probability. The problem with such a claim is that these theories are still not fully understood.

This sounds unjustifiably broad. Certainly, human behavior is subject to these restrictions, but it is also subject to much more stringent ones - we are not able to do everything that is logically possible. Do we agree, then, that humans and artificial agents are both subject to laws forbidding logical contradictions and the like, but that artificial agents are not in principle necessarily bound by the same additional restrictions as humans?

Comment author: royf 05 June 2012 09:50:25PM 0 points [-]

Suppose an AI were to design and implement more efficient algorithms for processing sensory stimuli? Or add a "face recognition" module when it determines that this would be useful for interacting with humans?

The ancient Greeks developed methods of improved memorization. It has been shown that human-trained dogs and chimps are more capable of human-face recognition than others of their kind. None of them were artificial (discounting selective breeding in dogs and Greeks).

It seems that you should be able to write a simple program that overwrites its own code with an arbitrary value. Wouldn't that be a counterexample?

Would you consider such a machine an artificial intelligent agent? Isn't it just a glorified printing press?

I'm not saying that some configurations of memory are physically impossible. I'm saying that intelligent agency entails typicality, and therefore, for any intelligent agent, there are some things it is extremely unlikely to do, to the point of practical impossibility.

Do we agree, then, that humans and artificial agents are both subject to laws forbidding logical contradictions and the like, but that artificial agents are not in principle necessarily bound by the same additional restrictions as humans?

I would actually argue the opposite.

Are you familiar with the claim that people are getting less intelligent since modern technology allows less intelligent people and their children to survive? (I never saw this claim discussed seriously, so I don't know how factual it is; but the logic of it is what I'm getting at.) The idea is that people today are less constrained in their required intelligence, and therefore the typical human is becoming less intelligent.

Other claims are that activities such as browsing the internet and video gaming are changing the set of mental skills which humans are good at. We improve in tasks which we need to be good at, and give up skills which are less useful. You gave yet another example in your comment regarding face recognition.

The elasticity of biological agents is (quantitatively) limited, and improvement by evolution takes time. This is where artificial agents step in. They can be better than humans, but the typical agent will only actually be better if it has to. Generally, more intelligent agents are those which are forced to comply with tighter constraints, not looser ones.

Comment author: wedrifid 04 June 2012 05:09:00AM 0 points [-]

Having asserted that your claim is, in fact, new information

I wouldn't assert that. I thought I was stating the obvious.

can you please clarify and explain why you believe that?

See CuSithBell's reply.

Comment author: CuSithBell 04 June 2012 05:20:49AM 0 points [-]

Having asserted that your claim is, in fact, new information

I wouldn't assert that. I thought I was stating the obvious.

Yes, I think I misspoke earlier, sorry. It was only "new information" in the sense that it wasn't in that particular sentence of Eliezer's - to anyone familiar with discussions of GAI, your assertion certainly should be obvious.

Comment author: wedrifid 04 June 2012 05:23:47AM 0 points [-]

Ahh. That's where the "new information" thing came into it. I didn't think I'd said anything about "new", so I'd wondered.