Jiro comments on AI risk, new executive summary - Less Wrong

Post author: Stuart_Armstrong 18 April 2014 10:45AM


Comment author: [deleted] 21 April 2014 12:57:29AM 0 points

> Sorry, didn't mean to call you personally any of those adjectives :)

None taken then.

> Pretty much, yes, I find it totally possible. I am not saying that I am confident that this is the case, just that I find it more likely than the alternative, which would require an additional reason why it isn't so.

Well, tell me what you think of this argument:

Let's divide the meta-language into two sets: P (the sentences that cannot be rendered in English) and Q (the sentences that can). If you expect Q to be empty, then let me know and we can talk about that case. But let's assume for now that Q is not empty, since I assume we both think that an AGI will be able to handle human language quite easily. Q is, for all intents and purposes, a 'human' language itself.

Premise one is that translation is transitive: if I can translate language a into language b, and language b into language c, then I can translate language a into language c (though maybe I need to use language b as an intermediate step).

Premise two: If I cannot translate a sentence in language a into an expression in language b, then there is no expression in language b that expresses the same thought as that sentence in language a.

Premise three: Any AGI would have to learn language originally from us, and thereafter either from us or from previous versions of itself.

So by stipulation, every sentence in Q can be rendered in English, and Q is non-empty. Since no sentence in P can be rendered in English, it follows from premise one that no sentence in P can be rendered in Q (otherwise it could thereby be rendered into English). It also follows, if you accept premise two, that Q cannot express any sentence in P. So an AGI knowing only Q could never learn to express any sentence in P; if it could, then any speaker of Q (potentially any non-improved human) could in principle learn to express sentences in P, given an arbitrarily large amount of resources (time, questions and answers, etc.).

Hence, no AGI beginning from a language like English could go on to learn how to express any sentence in P. Therefore no AGI will ever know P.

I'm not super confident this argument is sound, but it seems to me to be at least plausible.
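To make the deductive step explicit, here is one way to formalize the core of the argument (my own reconstruction; I write s ⇝ t for "t expresses the same thought as s"):

\[
\begin{aligned}
&\textbf{Premise one (transitivity):} && s \rightsquigarrow t \;\wedge\; t \rightsquigarrow u \;\Rightarrow\; s \rightsquigarrow u \\
&\textbf{Setup:} && \forall q \in Q\;\exists e \in \text{English}:\, q \rightsquigarrow e \quad\text{and}\quad \forall p \in P\;\neg\exists e \in \text{English}:\, p \rightsquigarrow e \\
&\textbf{Step:} && \text{suppose } p \rightsquigarrow q \text{ for some } q \in Q;\ \text{then } q \rightsquigarrow e \text{ for some English } e, \text{ so } p \rightsquigarrow e \text{ by transitivity, contradicting } p \in P \\
&\textbf{Hence:} && \forall p \in P\;\neg\exists q \in Q:\, p \rightsquigarrow q, \text{ and, by premise two, no sentence of } Q \text{ expresses any thought in } P
\end{aligned}
\]

Premise three then does the remaining work: an AGI whose language bottoms out in Q (or English) has no route to expressing anything in P.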

> If you agree with Eliezer's definition of intelligence as optimization power

Well, that's a fine definition, but it's tricky in this case, because if intelligence is optimization power, and optimizing presupposes something to optimize, then intelligence (on that definition) isn't strictly a factor in (ultimate) goal formation. If that's right, then something's being much more intelligent would (as I think someone else mentioned) just lead to instrumental goals that are very hard to understand. It would have no direct relationship with terminal goals.

Comment author: Jiro 21 April 2014 02:21:39AM 0 points

By this reasoning no AGI beginning from English could ever know French either, for similar reasons. (Note that every language has sentences that cannot be rendered in another language in a truth-value-preserving sense: someone who knows the truth value of the unrendered sentence cannot thereby know the truth value of the rendered one; consider variations on Gödel-undecidable sentences.)
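For instance, one simple sentence of this type (my illustration; a variation on self-reference rather than a full Gödel construction):

\[
S \;:=\; \text{``This very sentence is written in English.''}
\]

S is true, but any French expression that preserves the self-reference (roughly, 'Cette phrase-ci est écrite en anglais') is false, so knowing the truth value of S does not give you the truth value of its rendering.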

Comment author: [deleted] 21 April 2014 02:39:23AM 1 point

> By this reasoning no AGI beginning from English could ever know French either, for similar reasons.

This is true only if this...

> Note that every language has sentences that cannot be rendered in another language

is true. But I don't think it is. English and French, for instance, seem to me to be entirely inter-translatable. I don't mean that we can assign, for every word in French, a word of equivalent meaning in English. But maybe it would be helpful if I made it more clear what I mean by 'inter-translatable'. I think language L is inter-translatable with language M if, for every sentence in language L, I can express the same thought using an arbitrarily complex expression in language M.

By 'arbitrarily complex' I mean this: Say I have a sentence in L. In order to translate it into M, I am allowed to write in M an arbitrarily large number of sentences qualifying and triangulating the meaning of the sentence in L. I am allowed to write an arbitrarily large number of poems, novels, interpretive dances, etymological and linguistic papers, and encyclopedias discussing the meaning and spirit of that sentence in L. In other words, two languages are by my standard inter-translatable if, for any expression in L of n bits, I can translate it into M in n' bits, where n' is allowed to be arbitrarily large.
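In symbols, the definition above amounts to the following (my notation; M* is the set of arbitrarily long finite strings of expressions in M):

\[
\text{InterTranslatable}(L, M) \;:\Longleftrightarrow\; \forall s \in L\;\exists t \in M^{*}:\, t \text{ expresses the same thought as } s
\]

with no upper bound on the length n' = |t| in bits relative to the n bits of s.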

I think, by this standard, French and English count as inter-translatable, as do any languages I can think of. I'm arguing, effectively, that for any language, either none of that language is inter-translatable with any language we know (in which case, I doubt we could recognize it as a language at all), or all of it is.

Now, even if I have shown that we and an AGI will necessarily be able to understand each other entirely in principle, I certainly haven't shown that it can be done in practice. However, I want to push the argument in the direction of a practical problem, because in general I think I can argue that an AGI will be able to overcome practical problems of any reasonable difficulty.