sark comments on David Chalmers' "The Singularity: A Philosophical Analysis" - Less Wrong

33 points · Post author: lukeprog · 29 January 2011 02:52AM


Comment author: sark 29 January 2011 11:05:13AM 0 points [-]

Is designing "consider the designer's ideals" into an AI difficult?

Comment author: Vladimir_Nesov 29 January 2011 12:02:16PM *  3 points [-]

Currently expected to be difficult, since we don't know of an easy way to do so. That it'll turn out to be easy (in hindsight) is not totally out of the question.

Comment author: Perplexed 29 January 2011 09:44:11PM 1 point [-]

Is designing "consider the designer's ideals" into an AI difficult?

Currently expected to be difficult, since we don't know of an easy way to do so.

Has anyone considered approaching this problem in the same way we might approach "read the user's handwriting"? That is, the task is not one we program the AI to accomplish - instead, we train the AI to accomplish it. And, most importantly, we train the AI to ask for further clarification in ambiguous cases.
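The train-and-clarify scheme Perplexed describes resembles what the machine-learning literature calls active learning: defer to the human whenever the model's own confidence is low, rather than guessing. A minimal sketch in Python - the function names and the stub predictor are hypothetical, for illustration only:

```python
# Minimal active-learning-style loop: the system defers to its trainer
# whenever its own confidence falls below a threshold, rather than guessing.
# All names here are illustrative, not from any actual FAI proposal.

def classify_with_clarification(predict, examples, threshold, ask):
    """predict(x) -> (label, confidence); ask(prompt) -> the trainer's answer."""
    results = []
    for x in examples:
        label, confidence = predict(x)
        if confidence < threshold:
            # Ambiguous case: ask for clarification instead of guessing.
            label = ask("Unsure about %r (p=%.2f). Correct label? " % (x, confidence))
        results.append((x, label))
    return results

def stub_predict(x):
    """Toy predictor: label by majority character, confidence = majority share."""
    frac_a = x.count("a") / len(x)
    return ("a" if frac_a >= 0.5 else "b"), max(frac_a, 1.0 - frac_a)

# On the mixed example the stub is uncertain, so the trainer gets asked.
out = classify_with_clarification(stub_predict, ["aaaa", "aaab", "bbbb"],
                                  threshold=0.9, ask=lambda prompt: "a")
# out == [("aaaa", "a"), ("aaab", "a"), ("bbbb", "b")]
```

The hard part of the proposal, of course, is not this loop but getting `predict` and its confidence estimates right in the first place.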

Comment author: Vladimir_Nesov 29 January 2011 10:03:49PM 2 points [-]

Mirrors and Paintings (yes, you want to point your program at the world and have it figure out what you referred to), The Hidden Complexity of Wishes (if you need to answer the AI's questions or give it instructions, you're doing something wrong and it won't work).

Comment author: Perplexed 30 January 2011 01:24:48AM *  2 points [-]

I have to admit, as someone who has worked in software testing, I find it difficult to take the suggestion (non-destructive full-brain scan) in the first link very seriously. How, exactly, do I become convinced that the AI can come to know more about what I want by scanning me than I can know by introspection? How can I (or it) even do a comparison between the two without it asking me questions?

But then we get down to doing the comparison. The AI informs me that what I really want is to kill my father and sleep with my mother. I deny this. Do we take this as evidence that the AI really does know me better than I know myself, or as a symptom of a bug?

I would argue that if you don't need to answer the AI's questions or give it instructions, you're doing something wrong and it won't work. By definition. At least for the first ten thousand scans or so. And even then there will remain questions on which the AI and introspection would deliver different answers. Questions with hidden complexity. I just don't see how anyone would trust a CEV extrapolated from brain scans until we had decades of experience suggesting that scanning and modeling yields better results than introspection.

Comment author: jacob_cannell 30 January 2011 02:22:21AM 0 points [-]

I would argue that if you don't need to answer the AI's questions or give it instructions, you're doing something wrong and it won't work. By definition.

Agreed. And any useful AI will have to understand human language to do or learn much of anything of value.

The detailed analyses of full-brain-scanning tech I've seen put it far into the future, well beyond human-level AGI.

Comment author: Vladimir_Nesov 30 January 2011 01:53:39AM *  0 points [-]

And even then there will remain questions on which the AI and introspection would deliver different answers.

You have to make sure the AI predictably gives a better answer even on questions where you disagree. And there will be questions that can't even be asked of a human.

Comment author: Vladimir_Nesov 30 January 2011 01:48:41AM *  0 points [-]

I have to admit, as someone who has worked in software testing, I find it difficult to take the suggestion (non-destructive full-brain scan) in the first link very seriously. How, exactly, do I become convinced that the AI can come to know more about what I want by scanning me than I can know by introspection? How can I (or it) even do a comparison between the two without it asking me questions?

Irrelevant. Assume you magically have a perfect working simulation of yourself.

Comment author: Perplexed 30 January 2011 02:23:26AM 1 point [-]

Assume you magically have a perfect working simulation of yourself.

Why would I want to do that? I.e. how would making that assumption lead me to take Eliezer's suggestion more seriously? My usual practice is to take things less seriously when magic is involved.

And how does this assumption interact with your other comment stating that I have to make sure the AI is somehow even better than myself if there is any difference between simulation and reality? Haven't you just asked me to assume that there are no differences?

Sorry, I simply don't understand your responses, which suggests to me that you did not understand my comment. Did you notice, in my preamble, that I mentioned software testing? Perhaps my point may be clearer to you if you keep this preamble in mind when formulating your responses.

Comment author: Vladimir_Nesov 30 January 2011 02:30:42AM 0 points [-]

Why would I want to do that?

Because that's a conceptually straightforward assumption that we can safely make in a philosophical argument.

The upload is not the AI (and Eliezer's post doesn't refer to uploads IIRC, but for the sake of the argument assume they are available as raw material). You make the AI correct on strong theoretical grounds, and only test things to check that the theoretical assumptions hold in situations where you expect checking to be possible, not in every situation.

Did you notice, in my preamble, that I mentioned software testing?

What am I to make of that?

Comment author: Perplexed 30 January 2011 03:33:36AM 0 points [-]

Because that's a conceptually straightforward assumption that we can safely make in a philosophical argument.

But this is not a philosophical argument.

To recap:

  • I suggested that an AI which is a precursor to the FAI should come to understand human values by interacting (over an extended 'training' period) with actual humans - asking them questions about their values and perhaps performing some experiments as in a psych or game theory laboratory.
  • You responded by linking to this, which as I read it suggests that the most accurate and efficient way to extract the values of a human test subject would be by carrying out a non-destructive brain scan. Quoting the posting:

So when we try to make an AI whose physical consequence is the implementation of what is right, we make that AI's causal chain start with the state of human brains - perhaps non-destructively scanned on the neural level by nanotechnology, or perhaps merely inferred with superhuman precision from external behavior - but not passed through the noisy, blurry, destructive filter of human beings trying to guess their own morals.

  • I asked how we could possibly come to know by testing that the scanning and brain modeling was working properly. I could have asked instead how we could test the hypothesis that the inference from behavior was working properly.

These are questions about engineering and neuroscience, not questions of philosophy. The question of what is right/wrong is a philosophical question. The question of what humans believe about right and wrong is a psychology question. The question of how those beliefs are represented in the brain is a neuroscience question. The question of how an AI can come to learn these things is a GOFAI question. The question of how we will know we have done it right is a QC question - software testing. That was the subject of my comment. It had nothing at all to do with philosophy.

You make AI correct on strong theoretical grounds, and only test things to check that theoretical assumptions hold in ways where you expect it to be possible to check things, not in every situation.

Ok, in this context, I interpret this to mean that we will not program in the neuroscience information that it will use to interpret the brain scans. Instead we will simply program the AI to be a good scientist. A provably good scientist. Provable because it is a simple program and we understand epistemology well enough to write a correct behavioral specification of a scientist and then verify that the program meets the specification. So we can let the AI design the brain scanner and perform the human behavioral experiments to calibrate its brain models. We only need to spot-check the science it generates, because we already know that it is a good scientist.

Hmmm. That is actually a pretty good argument, if that is what you are suggesting. I'll have to give that one some thought.
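The "verify that the program meets the specification" step Perplexed sketches has a small-scale analogue in property-based checking: state the behavioral spec as a predicate and check the implementation against it over a finite input domain (full verification would prove it for all inputs, e.g. with a proof assistant). A toy sketch, with sorting standing in for the far harder spec of "good scientist":

```python
# Toy "verify the program meets its behavioral specification": the spec is a
# predicate, checked exhaustively over a small input domain. Real verification
# would prove the property for all inputs; sorting stands in here for the much
# harder behavioral specification of "good scientist".
from collections import Counter
from itertools import product

def meets_spec(inp, out):
    """Spec for sorting: output is ordered and a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    permutation = Counter(out) == Counter(inp)  # multiset equality
    return ordered and permutation

def implementation(xs):
    """The program under verification."""
    return sorted(xs)

# Exhaustive check over all lists of length <= 3 drawn from {0, 1, 2}.
domain = [list(t) for n in range(4) for t in product(range(3), repeat=n)]
assert all(meets_spec(xs, implementation(xs)) for xs in domain)
```

The catch, as the rest of the thread goes on to discuss, is whether a spec like "good scientist" can be written down as crisply as "sorted".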

Comment author: Vladimir_Nesov 30 January 2011 03:43:59AM *  0 points [-]

These are questions about engineering and neuroscience, not questions of philosophy. The question of what is right/wrong is a philosophical question. The question of what humans believe about right and wrong is a psychology question. The question of how those beliefs are represented in the brain is a neuroscience question. The question of how an AI can come to learn these things is a GOFAI question. The question of how we will know we have done it right is a QC question - software testing. That was the subject of my comment. It had nothing at all to do with philosophy.

Sorry, not my area at the moment. I gave the links to refer to arguments for why having the AI learn in the traditional sense is a bad idea, not as instructions on how to do it correctly in a currently feasible way. Nobody knows that, so you can't expect an answer, but the plan of telling the AI the things we think we want it to learn is fundamentally broken. If nothing better can be done, too bad for humanity.

Ok, in this context, I interpret this to mean that we will not program in the neuroscience information that it will use to interpret the brain scans. Instead we will simply program the AI to be a good scientist.

This is much closer, although a "scientist" is probably a bad word to describe that, and given that I don't have any idea what kind of system can play this role, it's pointless to speculate. Just take as the problem statement what you quoted from the post:

try to make an AI whose physical consequence is the implementation of what is right

Comment author: jacob_cannell 30 January 2011 02:18:50AM *  0 points [-]

Irrelevant. Assume you magically have a perfect working simulation of yourself.

Relevant - Can we just assume you magically have a friendly AI then?

If the plan for creating a friendly AI depends on a non-destructive full-brain scan already being available, the odds of achieving friendly AI before other forms of AI vanish to near zero.

Comment author: Vladimir_Nesov 30 January 2011 02:23:02AM 0 points [-]

One step at a time, my good sir! Reducing the philosophical and mathematical problem of Friendly AI to the technological problem of uploading would be an astonishing breakthrough quite by itself.

Comment author: jacob_cannell 30 January 2011 02:39:21AM -1 points [-]

I think this reflects the practical problem with Friendly AI - it is an ideal of perfection taken to such an extreme that it expands the problem scope far beyond what is likely to be near-term realizable.

I expect that most of the world, research teams, companies, the VC community and so on will be largely happy with an AGI that just implements an improved version of the human mind.

For example, humans have an ability to model other agents and their goals, and through love/empathy value the well-being of others as part of our own individual internal goal systems.

I don't see yet why that particular system is difficult or more complex than the rest of AGI.

It seems likely that once we can build an AGI as good as the brain, we can build one that is human-like but has only the love/empathy circuitry in its goal system, with the rest of the crud stripped out.

In other words, if we can build AGIs modeled after the best components of the best examples of altruistic humans, this should be quite sufficient.

Comment author: jacob_cannell 30 January 2011 02:15:12AM 0 points [-]

That is, the task is not one we program the AI to accomplish - instead, we train the AI to accomplish it. And, most importantly, we train the AI to ask for further clarification in ambiguous cases.

This is the straightforward approach.

Once you have an AGI that has the cognitive capability and learning capacity of a human infant brain, you teach it everything else in human language - right/wrong, ethics/morality, etc.

Programming languages are precise and well suited for creating the architecture itself, but human languages are naturally more effective for conveying human knowledge.

Comment author: Perplexed 30 January 2011 02:36:26AM 1 point [-]

I tend to agree that we need a natural language interface to the AI. But it is far easier to create automatic proofs of program correctness when the really important stuff (like ethics) is presented in a formal language equipped with a deductive system.

There is something to be said for treating all the natural language input as if it were testimony from unreliable witnesses - suitable, perhaps, for locating hypotheses, but not really suitable as strong evidence for accepting the hypotheses.
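The "unreliable witness" treatment has a standard Bayesian form: a report shifts the odds of a hypothesis by a likelihood ratio bounded by the witness's reliability, so weak testimony can bring a hypothesis to attention without coming close to confirming it. A sketch, using a deliberately simplified reliability model:

```python
# Bayesian update on unreliable testimony. A witness who reports correctly
# with probability r multiplies the odds of the hypothesis by r / (1 - r),
# so low-reliability natural-language input can "locate" a hypothesis
# without counting as strong evidence for accepting it.
# (Simplified model: symmetric error rate, one independent report.)

def posterior_after_testimony(prior, reliability):
    odds = prior / (1.0 - prior)
    odds *= reliability / (1.0 - reliability)  # likelihood ratio of the report
    return odds / (1.0 + odds)

# A 70%-reliable report lifts a 1% prior only to about 2.3%:
p = posterior_after_testimony(0.01, 0.7)
```

That is enough to make the hypothesis worth examining but nowhere near acceptance, which is exactly the "locating hypotheses, not accepting them" role described above.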

Comment author: jacob_cannell 30 January 2011 02:42:38AM 0 points [-]

But it is far easier to create automatic proofs of program correctness

I'm not sure how this applies - can you formally prove the correctness of a probabilistic belief network? Is that even a valid concept?

I can understand how you can prove a formal deterministic circuit or the algorithms underlying the belief network and learning systems, but the data values?
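One way to cash out this algorithm-versus-data distinction: you can mechanically check invariants of the inference machinery (for instance, that an update always yields a valid probability distribution) for arbitrary learned values, even though the values themselves carry no correctness proof. A sketch:

```python
# The algorithm/data split in miniature: check an invariant of the update
# rule (output is always a valid probability distribution) for arbitrary
# learned values, without "proving the data values correct".
import random

def bayes_update(prior, likelihoods):
    """Posterior over hypotheses, given a prior and per-hypothesis likelihoods."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    if total == 0:
        raise ValueError("observation impossible under every hypothesis")
    return [j / total for j in joint]

def is_distribution(p, tol=1e-9):
    return all(x >= 0 for x in p) and abs(sum(p) - 1.0) < tol

# The invariant holds for randomly generated stand-ins for learned values.
random.seed(0)
for _ in range(1000):
    n = random.randint(2, 5)
    raw = [random.random() for _ in range(n)]
    prior = [x / sum(raw) for x in raw]
    likelihoods = [random.random() + 1e-6 for _ in range(n)]
    assert is_distribution(bayes_update(prior, likelihoods))
```

This checks the machinery, not the beliefs: nothing here says the learned values encode anything true about the world.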

Comment author: Perplexed 30 January 2011 03:41:34AM 1 point [-]

Agreed. That is why I suggest that the really important stuff - meta-ethics, epistemology, etc. - be represented in some other way than by 'neural' networks: something formal and symbolic, rather than quasi-analog. All the stuff which we (and the AI) need to be absolutely certain doesn't change meaning when the AI "rewrites its own code".

Comment author: jacob_cannell 30 January 2011 04:05:37AM *  0 points [-]

By formal, I assume you mean math/code.

The really important stuff isn't a special category of knowledge. It is all connected - a tangled web of interconnected complex symbolic concepts for which human language is a natural representation.

What is the precise mathematical definition of ethics? If you really think about what it would entail to describe that precisely, you would need to describe humans, civilization, goals, brains, and a huge set of other concepts.

In essence you would need to describe an approximation of our world. You would need to describe a belief/neural/statistical inference network that represented that word internally as a complex association between other concepts that eventually grounds out into world sensory predictions.

So this problem - that human language concepts are far too complex and unwieldy for formal verification - is not a problem with human language itself that can be fixed by using other language choices. It reflects a problem with the inherent massive complexity of the world itself, complexity that human language and brain-like systems evolved to handle.

Comment author: Perplexed 30 January 2011 04:48:49AM 0 points [-]

So this problem - that human language concepts are far too complex and unwieldy for formal verification - is not a problem with human language itself that can be fixed by using other language choices. It reflects a problem with the inherent massive complexity of the world itself, complexity that human language and brain-like systems evolved to handle.

These folks (the Cyc project) seem to agree with you about the massive complexity of the world, but seem to disagree with you that natural language is adequate for reliable machine-based reasoning about that world.

As for the rest of it, we seem to be coming from two different eras of AI research as well as different application areas. My AI training took place back around 1980 and my research involved automated proofs of program correctness. I was already out of the field and working on totally different stuff when neural nets became 'hot'. I know next to nothing about modern machine learning.

Comment author: jacob_cannell 30 January 2011 09:42:50AM 0 points [-]

I've read about CYC a while back - from what I recall/gather, it is a massive hand-built database of little natural-language 'facts'.

Some of the new stuff they are working on with search looks kinda interesting, but in general I don't see this as a viable approach to AGI. A big syntactic database isn't really knowledge - it needs to be grounded in a massive sub-symbolic learning system to get the semantics part.

On the other hand, specialized languages for AGI's? Sure. But they will need to learn human languages first to be of practical value.

Comment author: Vladimir_Nesov 30 January 2011 03:53:30AM 0 points [-]

To get to that point we have to start from the right meaning to begin with, and care about preserving it accurately, and Jacob doesn't agree those steps are important or particularly hard.

Comment author: jacob_cannell 30 January 2011 04:21:55AM -1 points [-]

Not quite.

As for the "start with the right meaning" part, I think it is extremely hard to 'solve' morality in the way typically meant here, with CEV or whatnot.

I don't think that we need (or will) wait to solve that problem before we build AGI, any more or less than we need to solve it for having children and creating a new generation of humans.

If we can build AGI somewhat better than us according to our current moral criteria, they can build an even better successive generation, and so on - a benevolence explosion.

As for the second part about preserving it accurately, I think that ethics/morality is complex enough that it can only be succinctly expressed in symbolic associative human languages. An AGI could learn how to model (and value) the preferences of others in much the same way humans do.

Comment author: wedrifid 30 January 2011 04:27:22AM 3 points [-]

I don't think that we need (or will) wait to solve that problem before we build AGI, any more or less than we need to solve it for having children and creating a new generation of humans.

If we can build AGI somewhat better than us according to our current moral criteria, they can build an even better successive generation, and so on - a benevolence explosion.

Someone help me out. What is the right post to link to that goes into the details of why I want to scream "No! No! No! We're all going to die!" in response to this?

Comment author: nshepperd 30 January 2011 06:05:05AM *  2 points [-]

Why would an AI which optimises for one thing create another AI that optimises for something else? Not every change is an improvement, but every improvement is necessarily a change. Building an AI with a different utility function is not going to satisfy the first AI's utility function! So whatever AI the first one builds is necessarily going to either have the same utility function (in which case the first AI is working correctly), or have a different one (which is a sign of malfunction, and given the complexity of morality, probably a fatal one).

It's not possible to create an AGI that is "somewhat better than us" in the sense that it has a better utility function. To the extent that we have a utility function at all, it would refer to the abstract computation called "morality", which "better" is defined by. The most moral AI we could create is therefore one with precisely that utility function. The problem is that we don't exactly know what our utility function is (hence CEV).

There is a sense in which a Friendly AGI could be said to be "better than us", in that a well-designed one would not suffer from akrasia and whatever other biases prevent us from actually realizing our utility function.
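The goal-stability argument in the first paragraph can be put as a toy model: an AI scores candidate successors by what they would actually do, evaluated under its own utility function, so a successor that would optimize something else is never the top choice. Everything below is illustrative, not a model of any real architecture:

```python
# Toy model of why an optimizer builds successors with its own utility
# function: candidates are scored by what the successor would actually do,
# evaluated under the BUILDER'S utility function. Purely illustrative.

def best_successor(my_utility, candidate_utilities, outcomes):
    """Pick the candidate utility function whose optimizer I'd most want built."""
    def value_to_me(candidate):
        chosen = max(outcomes, key=candidate)  # what that successor would do
        return my_utility(chosen)
    return max(candidate_utilities, key=value_to_me)

# A world with two goods; each outcome is a pair (paperclips, smiles).
outcomes = [(c, s) for c in range(3) for s in range(3)]
paperclip_utility = lambda o: o[0]
smile_utility = lambda o: o[1]

# A paperclip maximizer offered both designs picks the paperclip successor:
chosen = best_successor(paperclip_utility,
                        [smile_utility, paperclip_utility], outcomes)
# chosen is paperclip_utility
```

The force of the point is that a successor can only be "somewhat better than us" relative to some utility function, and the builder's own is the one doing the choosing.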

Comment author: Will_Newsome 29 January 2011 09:19:23PM -1 points [-]

Currently expected to be difficult, since we don't know of an easy way to do so. That it'll turn out to be easy (in hindsight) is not totally out of the question.

There are some promising lines of attack (grounded in decision theory) that might take only a few years of research. We'll see where they lead. Other open problems in FAI might start looking very solvable if we start making progress on this front.

Comment author: Vladimir_Nesov 29 January 2011 09:28:16PM 2 points [-]

Show me.

Comment author: Will_Newsome 29 January 2011 09:45:15PM -1 points [-]

PM'd.

Comment author: wedrifid 29 January 2011 11:49:59AM 0 points [-]

Yes. :)