Eliezer_Yudkowsky comments on Reply to Holden on 'Tool AI' - Less Wrong

Post author: Eliezer_Yudkowsky 12 June 2012 06:00PM (94 points)

Comment author: Eliezer_Yudkowsky 13 June 2012 01:27:30AM 5 points

Holden didn't actually suggest that. And while this suggestion is in a certain sense ingenious - it's not too far off from the sort of suggestions I flip through when considering how/if to implement CEV or similar processes - how do you "report the actions"? And do you report the reasons for them? And do you check to see if there are systematic discrepancies between consequences in the true model and consequences in the manipulated one? (This last point, btw, is sufficient that I would never try to literally implement this suggestion, but try to just structure preferences around some true model instead.)
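
For concreteness, a minimal sketch (in Python; ModelPair, the outcome features, and both model callables are hypothetical stand-ins, not anything specified in the thread) of what checking for systematic discrepancies between the true model and the report model might look like:

    from dataclasses import dataclass
    from typing import Callable, Mapping

    Plan = str
    Outcome = Mapping[str, float]  # named consequence features, e.g. {"deaths": 0.0}

    @dataclass
    class ModelPair:
        true_model: Callable[[Plan], Outcome]    # best available world model
        report_model: Callable[[Plan], Outcome]  # the (possibly manipulated) model behind the report

    def systematic_discrepancies(models: ModelPair, plan: Plan,
                                 threshold: float) -> list[str]:
        """Return the consequence features on which the two models disagree
        by more than `threshold`; these are the discrepancies worth flagging."""
        true_out = models.true_model(plan)
        shown_out = models.report_model(plan)
        return [k for k in true_out
                if abs(true_out[k] - shown_out.get(k, 0.0)) > threshold]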

Comment author: private_messaging 13 June 2012 04:39:29AM * -2 points

how do you "report the actions"?

How do you report the path the car should take? On the map. How do you report a better transistor design? In the blueprint. How do you report a software design? With a UML diagram. (How do you report why that transistor works? Show the simulation.) It's only the most irreparable clinical psychopaths who generate all their outputs via extensive (and computationally expensive) modelling of the cognition (and decision process) of the listener. edit: i.e. modelling aimed at attaining an outcome favourable to themselves; failing to empathise with the listener, that is, failing to treat the listener as an instance of self, and instead treating the listener as a difficult-to-control servomechanism.
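
As a minimal sketch of reporting in the medium of the domain (all names here, Waypoint, Route, report_route, are invented for illustration): the answer is a structured artifact, and display is a fixed mechanical projection of it, with no model of the listener anywhere in the loop.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Waypoint:
        lat: float
        lon: float

    @dataclass(frozen=True)
    class Route:
        """The 'report' is just the artifact itself, drawable on a map."""
        waypoints: list[Waypoint]
        total_km: float

    def report_route(route: Route) -> list[tuple[float, float]]:
        # Rendering is a fixed projection of the artifact onto the display
        # medium; nothing here depends on who is looking at it.
        return [(w.lat, w.lon) for w in route.waypoints]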

Comment author: khafra 13 June 2012 01:24:07PM 0 points

Isn't the relevant quality of a "clinical psychopath" here something like "explicitly models the cognition of the listener, instead of using empathy," where "empathy" == something like "has an implicit model of the cognition of the listener"?

Comment author: private_messaging 14 June 2012 07:06:37AM * 1 point

An implicit model that is rather incomplete and not wired for exploitation. That's how psychopaths are successful at exploiting other people and talking people into stuff even though they have a substandard model when it comes to actual communication; their model actually sucks and is inferior to the normal one.

Human friendliness works via not modelling the decision processes of other people when communicating; we do that modelling when we deceive, lie, and bullshit, while when we are honest we sort of share the thoughts directly. This idea of the oracle here is outright disturbing. It is clear nothing good comes out of a full model of the listener: firstly it wastes computing time, and secondly it generates bullshit, so you get something that is inferior at solving technical problems and more dangerous at the same time.

Meanwhile, much of the highly complex information that we would want to obtain from an oracle is hopelessly impossible to convey in English anyway: hardware designs, cures, etc.

Comment author: Armok_GoB 13 June 2012 07:25:09PM * 1 point

I can think of a bunch of standard modes of display off the top of my head (top candidate: video and audio of what the simulated user sees and hears, plus subtitles of their internal model), and for the discrepancies you could run the simulation many times with random variations roughly along the same scope and dimensions as the differences between the simulations and reality, either just rejecting plans that have too much divergence, or simply showing the display of all of them (which would also help against frivolous use, if you have to watch the action 1000 times before doing it). I'd also say make the simulated user a total drone, with seriously rewired neurology, so that it tries to always and only do what the AI tells it to.
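
A rough sketch of that divergence filter (Python; `simulate`, the Gaussian noise model, and both thresholds are assumptions of mine, since nothing above pins them down): run the plan under many randomly perturbed world models and reject it if the outcomes spread out too much.

    import random
    import statistics
    from typing import Callable

    def passes_divergence_filter(simulate: Callable[[str, float], float],
                                 plan: str,
                                 n_runs: int = 1000,
                                 noise_scale: float = 0.05,
                                 max_spread: float = 0.1) -> bool:
        """True if the plan's outcome score stays consistent across runs whose
        random variation stands in for the simulation-vs-reality gap."""
        outcomes = [simulate(plan, random.gauss(0.0, noise_scale))
                    for _ in range(n_runs)]
        return statistics.pstdev(outcomes) <= max_spread

Scoring an outcome as a single number is of course the hard part; the sketch only shows where the rejection step would sit.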

Not that this solves the problem (I've countered the really dangerous things I noticed instantly, but give me 5 minutes to think about it and I'll notice 20 more), but I thought someone should actually try to answer the question in its spirit, its letter, and its most charitable interpretation.

Also, it'd make a nice movie.

Comment author: private_messaging 15 June 2012 01:55:16AM * -2 points

I don't see why the 'oracle' has to work towards some real-world goal in the first place. The oracle may have as its terminal goal the output of the relevant information on the screen, at a level of clutter compatible with the human visual cortex, and that's it. It's up to you to ask it to represent the information in a particular way.

Or not even that: the terminal goal of the mathematical system is to make some variables represent such output; an implementation of that system has those variables computed and copied to the screen as pixels. The resulting system does not even self-preserve; the abstract computation making abstract variables take on certain abstract values is attained, in the relevant sense, even if the implementation is physically destroyed. (This is how software currently works.)
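
A toy illustration of that last parenthetical, with both function names made up: the computation's whole 'goal' is attained once the variables hold their values, and putting them on the screen is a separate, dumb copy step.

    def compute_answer(question: str) -> str:
        # Pure function: its 'terminal goal' is realized the moment the
        # return value exists.
        return question.upper()  # stand-in for the actual inference

    def blit_to_screen(text: str) -> None:
        # Mechanical copy of the computed variables to the display.
        print(text)

    answer = compute_answer("what cures disease x?")
    blit_to_screen(answer)
    # Destroying the process after this point does not retroactively
    # un-attain the computation; there is nothing to self-preserve.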

Comment author: Armok_GoB 15 June 2012 02:32:08AM -1 points

The screen is a part of the real world.