Gunnar_Zarncke comments on Debunking Fallacies in the Theory of AI Motivation - LessWrong

Post author: Richard_Loosemore 05 May 2015 02:46AM




Comment author: Richard_Loosemore 06 May 2015 04:08:27PM 5 points

I see where you are coming from in what you have just said, but to give a good answer I need to take a high-level stance toward it. There is a theme running through your ideas here, and it is the theme, rather than the specifics, that I need to address.

You have mentioned on several occasions the idea that "AGI-concepts" and "Human-concepts" might not align, with the result that we might have difficulty understanding what an AGI really means when it uses a given concept. In particular, you suggest that there could be some bad misalignments of concepts -- for example, when the AGI makes a conceptual distinction between "giving choices to people" and "forcing them to do something", and even though our own version of that distinction corresponds closely to the AGI's version most of the time, there are some peculiar circumstances (edge cases) where there is a massive or unexpectedly sharp discrepancy.

Putting this idea in the form of an exaggerated, fictional example: it is as if we meet a new culture out in the middle of Darkest Africa, and in the course of translating their words into ours we find a verb that seems to mean "cook". But even though there are many examples (cooking rice, cooking bread, cooking meat, and even brewing a cup of tea) that seem to correspond quite closely, we suddenly find that they ALSO use this verb to refer to a situation where someone writes their initials on a tree, and another case where they smash someone's head with a rock. And the natives insist that the new cases are not homonyms: they claim it is the very same concept in all cases.

We might call this a case of "alien semantics".

The first thing to say about this is that it is a conceptual minefield. The semantics (or ontological grounding) of AI systems is, in my opinion, one of the least well-developed parts of the whole field. People often pay lip service to some kind of model-theoretic justification for an AI's semantic foundations, but in practice this means very little, since the theoretical ideas shade off into philosophy, have some huge unresolved gaps, and frequently take recourse in infinitely large (i.e. uncomputable) mappings between sets of 'possible worlds'. Worst of all, the area is rife with question-begging (like using technical vocabulary that itself has a poorly defined semantics to try to specify exactly what 'semantics' is!).

Why does that matter? Because many of the statements that people make about semantic issues (like the alien semantics problem) are predicated on precisely which semantic theory they subscribe to. And, it is usually the case that their chosen semantic theory is just a vague idea that goes somewhat in the direction of Tarski, or in the direction of Montague, or maybe just what they read in Russell and Norvig. The problem is that those semantic theories have challengers (some of them not very well defined, but even so...), such as Cognitive Semantics, and those other semantic formalisms have a truly gigantic impact on some of the issues we are discussing here.

So, for example, there is an interpretation of semantics on which it is not even coherent to talk about two concept landscapes that are semantic aliens. To be sure, this can happen in language -- things expressible in one language can be very hard to say in another language -- but on this view the idea that two concept spaces could be in some way irreconcilable, or untranslatable, is incoherent (not merely "unlikely" but actually impossible).

[A brief word about how that could be the case. If concepts are defined by large clusters of constraints between concepts, rather than by the precise, atomic relations you find in logical formalisms, then you can always deal with situations in which two concepts seem near to one another but do not properly overlap: you can form some new, translator concepts that take a complex union of the two. There is a lot that could be said about how that complex union takes place, but here is one very important takeaway: it can always be made to happen in such a way that there will not, in the future, be any Gotcha cases (those where you thought you had completely merged the two concepts, but then suddenly find a peculiar situation where you got it disastrously wrong). The reason you won't get any Gotcha cases is that the concepts are defined by large numbers of weak constraints, and no strong constraints -- in such systems, the effect of smaller and smaller numbers of constraints can be guaranteed to converge to zero. (This happens for the same reason that the effect of smaller and smaller sub-populations of the molecules in a gas converges to zero as the population sizes go to zero.) Finally, you will notice the key phrase "large clusters of constraints" in what I just explained... that should be familiar. This is precisely the semantics of the Swarm Relaxation systems that I talked about in the paper.]
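The gas analogy can be made concrete with a small numerical sketch (my own construction, purely illustrative, not taken from the paper): treat a concept as a stable pattern over many units held together by hundreds of weak pairwise constraints, relax the network to a settled state, then delete a handful (k) of constraints. The worst-case change to any unit's total input is bounded by k times the largest constraint weight, a bound that shrinks to zero as k does -- which is the sense in which a few weak constraints cannot produce a Gotcha.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                                  # micro-features of a concept
W = rng.normal(0.0, 0.05, (N, N))        # many weak constraints, no strong ones
W = (W + W.T) / 2                        # symmetric, so settling is well-behaved
np.fill_diagonal(W, 0.0)

def settle(W, s, steps=50):
    """Deterministic asynchronous relaxation of the constraint network."""
    s = s.copy()
    for _ in range(steps):
        for i in range(len(s)):
            s[i] = 1 if W[i] @ s > 0 else -1
    return s

stable = settle(W, np.where(rng.random(N) > 0.5, 1, -1))

# Delete k randomly chosen constraints and bound the resulting disturbance.
k = 3
W2 = W.copy()
for i, j in rng.integers(0, N, size=(k, 2)):
    W2[i, j] = W2[j, i] = 0.0

max_shift = np.abs((W2 - W) @ stable).max()   # worst change to any unit's input
bound = k * np.abs(W).max()                   # provable ceiling on that change
print(max_shift, "<=", bound)                 # a few weak constraints barely matter
```

The bound holds because each row of `W2 - W` contains at most k nonzero entries, each no larger in magnitude than the biggest weight, and every unit's state is ±1; nothing about it depends on the random seed.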

So, one of the implications of that kind of semantics is that different intelligent systems that use massive, weak constraint clusters to build their concepts will tend to converge on the same semantics.

[continued in next comment......]

Comment author: Gunnar_Zarncke 06 May 2015 10:25:03PM 1 point

There is a lot that could be said about how that complex union takes place, but here is one very important takeaway: it can always be made to happen in such a way that there will not, in the future, be any Gotcha cases (those where you thought you had completely merged the two concepts, but then suddenly find a peculiar situation where you got it disastrously wrong). The reason you won't get any Gotcha cases is that the concepts are defined by large numbers of weak constraints, and no strong constraints -- in such systems, the effect of smaller and smaller numbers of constraints can be guaranteed to converge to zero.

That is an interesting aspect of one particular way to deal with the problem, one I have not yet heard about, and I'd like to see a reference for it so I can read up on it.

Comment author: Richard_Loosemore 07 May 2015 01:31:44AM 6 points

I first started trying to explain, informally, how these types of systems could work back in 2005. The reception was so negative that it led to a nasty flame war.

I have continued to work on these systems, but there is a problem with publishing too much detail about them. The very same mechanisms that make the motivation engine a safer type of beast (as described above) also make the main AGI mechanisms extremely powerful. That creates a dilemma: talk about the safety issues, and almost inevitably I have to talk about the powerful design. So, I have given some details in my published papers, but the design is largely under wraps, being developed as an AGI project, outside the glare of publicity.

I am still trying to find ways to write a publishable paper about this class of systems, and when/if I do I will let everyone know about it. In the meantime, much of the core technology is already described in some of the references you will find in my papers (including the one above). The McClelland and Rumelhart reference, in particular, covers the fundamental ideas behind connectionist systems. There is also a good paper by Hofstadter on "Jumbo", which illustrates another simple system that operates with multiple weak constraints. Finally, I would recommend that you check out Geoff Hinton's early work.
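To give a concrete feel for the connectionist style of weak-constraint settling in those references, here is a minimal toy sketch in the spirit of an interactive-activation update rule (my own construction with made-up weights and parameters, not a reproduction of any published simulation): three units that weakly support one another compete with a fourth, and the network relaxes to a state that honors most of the constraints at once.

```python
import numpy as np

# Toy interactive-activation network (hypothetical weights): units 0-2
# weakly support one another; unit 3 competes with all of them.
W = np.array([[ 0.0,  0.2,  0.2, -0.2],
              [ 0.2,  0.0,  0.2, -0.2],
              [ 0.2,  0.2,  0.0, -0.2],
              [-0.2, -0.2, -0.2,  0.0]])
ext = np.full(4, 0.3)            # identical external input to every unit
a = np.zeros(4)                  # activations start at the resting level
A_MAX, A_MIN, DECAY = 1.0, -0.2, 0.1

for _ in range(100):
    net = W @ np.clip(a, 0, None) + ext     # only active units propagate
    grow = (A_MAX - a) * net                # positive net pushes toward ceiling
    shrink = (a - A_MIN) * net              # negative net pushes toward floor
    a += np.where(net > 0, grow, shrink) - DECAY * a
    a = np.clip(a, A_MIN, A_MAX)

print(a)  # units 0-2 settle high together; unit 3 is suppressed
```

Even in this four-unit toy, no single constraint decides the outcome: the coalition of weak mutual supports wins, which is the basic dynamic that the larger swarm-relaxation picture scales up.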

In all your neural net reading, it is important to stay above the mathematical details and focus on the ideas, because the math is a distraction from the more important message.

Comment author: Gunnar_Zarncke 07 May 2015 10:58:00PM 2 points

I first read McClelland and Rumelhart about 20 years ago, and it has a prominent place on my bookshelf. I haven't been able to work actively in AI, but I have followed the field. I put some hope in integrated connectionist-symbolic systems and was lately rewarded with deep neural networks. I think every advanced system will need some non-symbolic approach to integrate reality; I don't know whether it will be NNs or some other statistical means. And the really tricky part will be figuring out how to pre-wire it so that it 'does what it should'. I think a lot will be learned from how the same is realized in the human brain.