Lightwave comments on Two straw men fighting - Less Wrong
And what about an AI that can predict its own decisions (because it knows its source code)?
Also, are you a compatibilist?
I believe that a compatibilist can accept both free will and determinism at the same time. I reject them both as not useful to understanding decisions. I think there is a difference between believing both A and B and believing neither A nor B. It seems to me unlikely that an AI could predict its own decisions by examining its source code but not running the code. But I am not sure it is completely impossible just because I cannot see how it would be done. If it were possible, I would be extremely surprised if it was faster or easier than just running the code.
As I've stated before, no AI can predict its own decisions in that sense (i.e. in detail, before it has made them.) Knowing its source code doesn't help; it has to run the code in order to know what result it gets.
I suggest that it can but it is totally pointless for it to do so.
Things can be proved from source code without running it. This applies to any source code, including that of oneself. Again, it doesn't seem a particularly useful thing to do in most cases.
I'm wondering why this got downvoted - it's true!
For example if the top-level decision function of an AI is:
... and the AI doesn't self-modify, then it can predict that it will decide to self destruct if it falls in the water, only by analysing the code, without running it (also assuming, of course, that it is good enough at code analysis).
Of course, you can imagine AIs that can't predict any of their own decisions, and as wedrifid says, in most non-trivial cases, most probably wouldn't be able to.
(This may be important, because having provable decisions in certain situations could be key to cooperation in prisoner's-dilemma-type situations.)
Of course that is predictable, but that code wouldn't exist in any intelligent program, or at least it isn't an intelligent action; predicting it is like predicting that I'll die if my brain is crushed.
Unknowns, we've been over this issue before. You don't need to engage in perfect prediction in order to be able to usefully predict. Moreover, even if you can't predict everything you can still examine and improve specific modules. For example, if an AI has a module for factoring integers using a naive, brute-force factoring algorithm, it could examine that and decide to replace it with a quicker, more efficient module for factoring (that maybe used the number field sieve for example). It can do that even though it can't predict the precise behavior of the module without running it.
I certainly agree that an AI can predict some aspects of its behavior.
That's also because this is a simplified example, merely intended to provide a counter-example to your original assertion.
Agreed, it isn't an intelligent action, but if you start saying intelligent agents can only take intelligent decisions, then you're playing No True Scotsman.
I can imagine plenty of situations where someone might want to design an agent that takes certain unintelligent decisions in certain circumstances, or an agent that self-modifies in that way. If an agent can not only make promises, but also formally prove, by showing its own source code, that those promises are binding and that it can't change them, then it may be at an advantage in negotiations and cooperation over an agent that can't do that.
So "stupid" decisions that can be predicted by reading one's own source code isn't a feature that I consider unlikely in the design-space of AIs.
I would agree with that. But I would just say that the AI would experience doing those things (for example keeping such promises) as we experience reflex actions, not as decisions.
Why not?
In what way is it like that, and how is that relevant to the question?
It's like that precisely because it is easily predictable; as I said in another reply, an AI will experience its decisions as indeterminate, so anything it knows in advance in such a determinate way, will not be understood as a decision, just as I don't decide to die if my brain is crushed, but I know that will happen. In the same way the AI will merely know that it will self-destruct if it is placed under water.
From this, it seems like your argument for why this will not appear in its decision algorithm, is simply that you have a specific definition for "decision" that requires the AI to "understand it as a decision". I don't know why the AI has to experience its decisions as indeterminate (indeed, that seems like a flawed design if its decisions are actually determined!).
Rather, any code that leads from inputs to a decision should be called part of the AI's 'decision algorithm' regardless of how it 'feels'. I don't have a problem with an AI 'merely knowing' that it will make a certain decision. (and be careful - 'merely' is an imprecise weasel word)
It isn't a flawed design because when you start running the program, it has to analyze the results of different possible actions. Yes, it is determined objectively, but it has to consider several options as possible actions nonetheless.
This is false for some algorithms, and so I imagine it would be false for the entirety of the AI's source code. For example (ANSI C):
I know that i is equal to 5 after this code is executed, and I know that without executing the code in any sense.
That isn't an algorithm for making decisions.
No, but note the text:
It is, incidentally, trivial to alter the code to an algorithm for making decisions and also simple to make it an algorithm that can predict it's decision before making it.
The do_self_analysis method (do they call them methods or functions? Too long since I've used C) can browse the entire source code of the AI, determine that the above piece of code is the algorithm for making the relevant decision, prove that do_self_analysis doesn't change anything, performs no output, and does return in finite time, and then go on to predict that the AI will behave like a really inefficient defection rock. Quite a while later it will actually make the decision to defect.
All rather pointless but the concept is proved.
When the AI runs the code for predicting it's action, it will have the subjective experience of making the decision. Later "it will actually make the decision to defect" only in the sense that the external result will come at that time. If you ask it when it made it's decision, it will point to the time when it analyzed the code.
You are mistaken. I consider the explanations given thus far by myself and others sufficient. (No disrespect intended beyond that implicit in the fact of disagreement itself and I did not vote on the parent.)
The explanations given say nothing about the AI's subjective experience, so they can't be sufficient to refute my claim about that.
Consider my reply to be to the claim:
If you ask the AI when it made its decision it will either point to the time after the analysis or it will be wrong.
I avoided commenting on the 'subjective experience' side of things because I thought it was embodying a whole different kind of confusion. It assumes that the AI executes some kind of 'subjective experience' reasoning that is similar to that of humans (or some subset thereof). This quirk relies on lacking any strong boundaries between thought processes: people usually can't predict their decisions without making them. For both the general case and the specific case of the code I gave, a correctly implemented module that could be given the label 'subjective experience' would see the difference between prediction and analysis.
I upvoted the parent for the use of it's. I usually force myself to write its in that context but cringe while doing so. The syntax of the English language is annoying.
Really? Do you also cringe when using theirs, yours, ours, mine, and thine?
"If you ask the AI when it made its decision it will either point to the time after the analysis or it will be wrong."
I use "decision" precisely to refer to the experience that we have when we make a decision, and this experience has no mathematical definition. So you may believe yourself right about this, but you don't have (and can't have) any mathematical proof of it.
(I corrected this comment so that it says "mathematical proof" instead of proof in general.)
No, but surely some chunks of similarly-transparent code would appear in an algorithm for making decisions. And since I can read that code and know what it outputs without executing it, surely a superintelligence could read more complex code and know what it outputs without executing it. So it is patently false that in principle the AI will not be able to know the output of the algorithm without executing it.
Any chunk of transparent code won't be the code for making an intelligent decision. And the decision algorithm as a whole won't be transparent to the same intelligence, but perhaps only to something still more intelligent.
Do you have a proof of this statement? If so, I will accept that it is not in principle possible for an AI to predict what its decision algorithm will return without executing it.
Of course, logical proof isn't entirely necessary when you're dealing with Bayesians, so I'd also like to see any evidence that you have that favors this statement, even if it doesn't add up to a proof.
It's not possible to prove the statement because we have no mathematical definition of intelligence.
Eliezer claims that it is possible to create a superintelligent AI which is not conscious. I disagree with this because it is basically saying that zombies are possible. True, he would say that he only believes that human zombies are impossible, not that zombie intelligences in general are impossible. But in that case he has no idea whatsoever what consciousness corresponds to in the physical world, and in fact has no reason not to accept dualism.
My position is more consistent: all zombies are impossible, and any intelligent being will be conscious. So it will also have the subjective experience of making decisions. But it is essential to this experience that you don't know what you're going to do before you do it; when you experience knowing what you're going to do, you experience deciding to do it.
Therefore any AI that runs code capable of predicting its decisions, will at that very time subjectively experience making those decisions. And on the other hand, given that a block of code will not cause it to feel the sensation of deciding, that block of code must be incapable of predicting its decision algorithm.
You may still disagree, but please note that this is entirely consistent with everything you and wedrifid have argued, so his claim that I have been refuted is invalid.
As I recall, Eliezer's definition of consciousness is borrowed from GEB: it's when the mind examines itself, essentially. That has very real physical consequences, so the idea of a non-conscious AGI doesn't support the idea of zombies, which require consciousness to have no physical effects.
Any AGI would be able to examine itself, so if that is the definition of consciousness, every intelligence would be conscious. But Eliezer denies the latter, so he also implicitly denies that definition of consciousness.
Yes we do: the ability to apply optimization pressure in a wide variety of environments. The platonic ideal of this is AIXI.
Can you please provide a link?
http://lesswrong.com/lw/x5/nonsentient_optimizers/
I don't have any problem granting that "any intelligent being will be conscious", nor that "It will have the subjective experience of making decisions", though that might just be because I don't have a formal specification of either of those - we might still be talking past each other there.
I don't grant this. Can you elaborate?
I'm not sure that's true, or in what sense it's true. I know that if someone offered me a million dollars for my shoes, I would happily sell them my shoes. Coming to that realization didn't feel like the subjective experience of deciding to sell something to someone, at least as compared to my recollection of past transactions.
Okay, that follows from the previous claim.
If I were moved to accept your previous claim, I would now be skeptical of the claim that "a block of code will not cause it to feel the sensation of deciding". Especially since we've already shown that some blocks of code would be capable of predicting some decision algorithms.
This follows, but I draw the inference in the opposite direction, as noted above.
I would distinguish between "choosing" and "deciding". When we say "I have some decisions to make," we also mean to say that we don't know yet what we're going to do.
On the other hand, it is sometimes possible for you to have several options open to you, and you already know which one you will "choose". Your example of the shoes and the million dollars is one such case; you could choose not to take the million dollars, but you would not, and you know this in advance.
Given this distinction, if you have a decision to make, as soon as you know what you will or would do, you will experience making a decision. For example, presumably there is some amount of money ($5? $20? $50? $100? $300?) that could be offered for your shoes such that you are unclear whether you should take the offer. As soon as you know what you would do, you will feel yourself "deciding" that "if I was offered this amount, I would take it." It isn't a decision to do something concretely, but it is still a decision.
Now, I am not certain about this, but we have to examine that code before we know its outcome.
While this isn't "running" the code in the traditional sense of computation as we are familiar with it today, it does seem that the code is sort of run by our brains as a simulation as we scan it.
As sort of meta-process if you will...
I could be so wrong about that though... eh...
Also, that code is useless really, except maybe as a wait function... It doesn't really do anything (Not sure why Unknowns gets voted up in the first post above, and down below)...
Also, leaping from some code to the Entirety of an AI's source code seems to be a rather large leap.
"some code" is part of "the entirety of an AI's source code" - if it doesn't need to execute some part of the code, then it doesn't need to execute the entirety of the code.