TheDude comments on Failed Utopia #4-2 - Less Wrong

52 Post author: Eliezer_Yudkowsky 21 January 2009 11:04AM

Comments (248)

Comment author: TheDude 31 May 2013 08:12:32PM 2 points

I think you have a point, Will (an AI that interprets speech like a squish djinn would require deliberate effort and is proposed by no one), but I think that it is possible to construct a valid squish djinn/AI analogy (a squish djinn interpreting a command would be roughly analogous to an AI that is hard-coded to execute that command).

Sorry to everyone for the repetitive statements and the resulting wall of text (which unexpectedly needed to be posted as multiple comments since it was too long). Predicting how people will interpret something is nontrivial, and explaining concepts redundantly is sometimes a useful way of making people hear what you want them to hear.

Squish djinn is used here to denote a mind that honestly believes that it was actually instructed to squish the speaker (in order to remove regret, for example), not a djinn that wants to hurt the speaker and is looking for a loophole. The squish djinn cares only about doing what it is requested to do, and does not care at all about the well-being of the requester, so it could certainly be referred to as hostile to the speaker (since it will not hesitate to hurt the speaker in order to achieve its goal of fulfilling the request). A cartoonish internal monologue of the squish djinn would be: "the speaker clearly does not want to be squished, but I don't care what the speaker wants, and I see no relation between what the speaker wants and what the speaker is likely to request, so I determine that the speaker requested to be squished, so I will squish" (which sounds very hostile, but contains no will to hurt the speaker). The typical story djinn is unlikely to be a squish djinn: such djinns usually have a motive to hurt or help the speaker, but are restricted by rules. A clever djinn that wants to hurt the speaker might still squish, but not for the same reasons as a squish djinn. Such a djinn would be a valid analogy when opposing a proposal of the type "let's build some unsafe mind with selfish goals and impose rules on it" (such a project can never succeed, and the proposer is probably fundamentally confused, but a simple, correct, and sufficient counterargument is: "if the project did succeed, the result would be very bad").

Comment author: TheDude 31 May 2013 08:12:57PM 0 points

To expand on you having a point: I have obviously not seen every AI proposal on the internet, but as far as I know, no one is proposing to build a wish-granting AI that parses speech like a squish djinn (and ending up with such an AI would require a deliberate effort). So I don't think the squish djinn is a valid argument against proposed wish-granting AIs. Any proposed or realistic speech-interpreting AI would (as you say) parse English speech as English speech. An AI that makes arbitrary distinctions between different types of meaning would need serious deliberate effort, and as far as I know, no one is proposing to do this. This makes the squish djinn analogy invalid as an argument against proposals to build a wish-granting AI. It is a basic fact that statements do not have specified "meanings" attached to them, and AI proposals take this into account. An extreme example that makes this very clear would be Bill saying "Steve is an idiot" to two listeners, where one listener will predictably think of one Steve and the other listener will predictably think of some other Steve (or a politician making a speech that different demographics will interpret differently and to their own liking). Bill (or the politician) does not have a specific meaning in mind regarding which Steve (or which message) is being referred to. The speaker is deliberately making a statement in order to have different effects on different audiences. Another standard example is responding to a question about the location of an object with "look behind you" (anyone who is able to understand English and has no serious mental deficiencies would be able to guess that the meaning is that the object is/might be behind them (as opposed to following the order, being surprised to see the object lying there, and thinking "what a strange coincidence")).
Building an AI that would parse "look behind you" without understanding that the person is actually saying "it is/might be behind you" would require deliberate effort, as it would be necessary to painstakingly avoid using most of the available information while trying to understand speech. Tone of voice, body language, eye gaze, context, prior knowledge of the speaker, models of people in general, etc. all provide valuable information when parsing speech. And needing to prevent an AI from using this information (even indirectly, for example through models of "what sentences usually mean") would put enormous additional burdens on an AI project. An example in the current context would be writing: "It is possible to communicate in such a way that one class of people will infer one meaning and take the speaker seriously, and another class of people will infer another meaning and dismiss it as nonsense. This could be done by relying on the fact that people differ in their prior knowledge of the speaker and in their ability to understand certain concepts. One can use nonstandard vocabulary, take nonstandard strong positions, describe uncommon concepts, or otherwise give signals indicating that the speaker is a person who should not be taken seriously, so that the speaker is dismissed by most people as talking nonsense. But people who know the speaker would see a discrepancy and look closer (and if they are familiar with the nonstandard concepts behind all the "don't listen to me" signs, they might infer a completely different message)."

To expand on the valid AI squish djinn analogy: I think that hard-coding an AI that executes a command is practically impossible. But if such a project did succeed, the result would act sort of like a squish djinn given that command. And this argument/analogy is a valid and sufficient argument against trying to hard-code such a command, making it relevant as long as there exist people who propose to hard-code such commands. If someone tried to hard-code an AI to execute such a command, and they succeeded in creating something that had a real-world impact, I predict this would represent a failure to implement the command (it would result in an AI that does something other than what the squish djinn does and something other than what the builders expect it to do). So the squish djinn is not a realistic outcome. But it is what would happen if they succeeded, and thus the squish djinn analogy is a valid argument against "command hard-coding" projects. I can't predict what such an AI would actually do, since that depends on how the project failed. Intuitively, the situation where confused researchers fail to build a squish djinn does not feel very optimal, but making an argument on this basis is more vague, and requires that the proposing researchers accept their own limited technical ability (saying "doing x is clearly technically possible, but you are not clever enough to succeed" to the typical enthusiastic project proposer (who considers themselves to be clever enough to maybe be the first in the world to create a real AI) might not be the most likely argument to succeed (here I assume that the intent is to be understood, and not to lay the groundwork for later smugly saying "I pointed that out a long time ago" (if one later wants to be smug, then one should optimize for being loud, taking clear and strong positions, and not being understood))). The squish djinn analogy is simply a simpler argument.
"Either you fail or you get a squish djinn" is true and simple and sufficient to argue against a project. When presenting this argument, you do spend most of the time arguing about what would happen in a situation that will never actually happen (project success). This might sound very strange to an outside observer, but the strangeness is introduced by the project proposers' (invalid) assumption that the project can succeed (analogous to some atheist saying: "if god exists, and is omnipotent, then he is not nice, cuz there is suffering").

Comment author: TheDude 31 May 2013 08:13:23PM 0 points

(I'm arrogantly/wisely staying neutral on the question of whether or not it is at all useful to in any way engage with the sort of people whose project proposals can be validly argued against using squish djinn analogies)

(Jokes often work by deliberately being understood in different ways at different times by the same listener (the end of the joke deliberately changes the interpretation of the beginning of the joke (in a way that makes fun of someone)). In this case the meaning of the beginning of the joke is not one thing or the other thing. The listener is not first failing to understand what was said and then, after hearing the end, succeeding in understanding it. The speaker intends the listener to understand the first meaning until reaching the end, so the listener is not "first failing to encode the transmission". There is no inherently true meaning of the beginning of the joke, no inherently true person that the speaker is actually truly referring to. There is just a speaker who intends to achieve certain effects on an audience by saying things (and if the speaker is successful, then at the beginning of the joke the listener infers a different meaning from the one inferred after hearing the end of the joke). One way to illuminate the concepts discussed above would be to write: "on a somewhat related note, I once considered creating the username "New_Willsome" and starting to post things that sounded like you (for the purpose of demonstrating that if you counter a ban by using sock puppets, you lose your ability to stop people from speaking in your name (I was considering the option of actually acting like I think you would have acted, the option of including subtle distortions of what I think you would have said, and the option of doing my best to give better explanations of the concepts that you talk about)). But then a bunch of usernames similar to yours showed up and were met with hostility, and I was in a hurry, and drunk, and batshit crazy, and God told me not to do it, and I was busy writing fanfic, so I decided not to do it (the last sentence is jokingly false. I was not actually in a hurry ... :) ... )")