
gRR comments on Example decision theory problem: "Agent simulates predictor" - Less Wrong Discussion

Post author: cousin_it 19 May 2011 03:16PM


Comments (76)


Comment author: gRR 07 March 2012 06:28:18PM 0 points

to make a guaranteed friendly AI you probably need to prove theorems about your AI design.

Wouldn't this be a level mismatch in a multi-level AI architecture? Like proving things about the low-level neural computational substrate instead of about the conceptual level where the actual cognition would take place, and where the actual friendliness would be defined? [And that level can't be isomorphic to any formal logical system, except in symbolic AI, which doesn't work.]

figuring out the right decision theory in the presence of copies seems to be a necessary step on the road to FAI

And again, a conceptual-level understanding should do the trick. Like, knowing that I'm playing a PD against myself would be sufficient to cooperate. Besides, as EY frequently says, it's really hard to find oneself in a true PD. Usually it's an iterated PD, or one of Schelling's conflict games. [BTW, huge thanks for recommending his book in one of your posts!]

Comment author: cousin_it 07 March 2012 06:49:45PM 0 points

If a multilevel architecture (whatever it is) makes provable friendliness impossible, then FAI can't use it.

I imagine the future FAI as closer to AIXI, which works fine without a multilevel architecture, than to the Lisp programs of the 70s.

Comment author: gRR 07 March 2012 07:10:44PM 0 points

AFAICT, the general architecture that EY advocates (-ed?) in "Levels of Organization in General Intelligence" is multilevel. But that doesn't automatically mean it's impossible to prove anything about it. Maybe it's possible, just not using formal logic methods. [And so maybe we'd get not 100% certainty but (100 - 1e-N)%, which should be sufficient for large enough N.]

AIXI doesn't "work" any more than the symbolic AI Lisp programs of the 70s did. I mean, the General Problem Solver would also be superintelligent given infinite computing power.
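The "superintelligent given infinite computing power" point can be cartooned as brute-force search: with unbounded compute, blind enumeration solves any problem whose solutions can be checked. A toy sketch (the function name and the encoding are hypothetical, and it loops forever if no solution exists):

```python
from itertools import count, product

def brute_force_solve(is_solution, alphabet="01"):
    """Enumerate candidate strings in order of increasing length and
    return the first one the checker accepts. With unlimited compute
    this 'solves' anything checkable -- which is exactly why infinite
    compute makes even trivial search look superintelligent."""
    for n in count(1):
        for cand in product(alphabet, repeat=n):
            s = "".join(cand)
            if is_solution(s):
                return s

# "Solve" for the shortest bitstring whose binary value is 42.
print(brute_force_solve(lambda s: int(s, 2) == 42))  # → 101010
```

Of course, the whole point of the comment is that this observation buys nothing in practice: the search is exponential in the solution length.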

Comment author: cousin_it 07 March 2012 07:32:51PM 1 point

Eliezer says here:

A good deal of the material I have ever produced – specifically, everything dated 2002 or earlier – I now consider completely obsolete. (...) I no longer consider LOGI’s theory useful for building de novo AI.

To make the General Problem Solver or any other powerful computing device do anything interesting in the real world, you need to give it a formal description that contains the real world as a special case. You could use the universal prior, which gives you AIXI. Or you could use the yet-unspecified prior of UDT, which gives you the yet-unspecified UDT-AIXI.

The central difficulty of decision theory doesn't get easier if you have lots of computing power. Imagine you're playing the PD against someone. You both know each other's source code, but you have a halting oracle and your opponent doesn't. With so much power, what do you do? I simulate the other guy and... whoops, didn't think this through. Looks like I must avoid looking at the result. Hmmm.
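The "whoops" can be made concrete: an agent that decides by simulating its opponent never terminates against another agent using the same strategy. A minimal sketch, with a hypothetical agent name and Python recursion standing in for the simulation:

```python
import sys
sys.setrecursionlimit(100)  # keep the inevitable failure quick

def naive_simulator(opponent):
    """A PD agent that decides by simulating its opponent's decision
    about itself. Against another simulator the simulation recurses
    without bound: I simulate them simulating me simulating them..."""
    their_move = opponent(naive_simulator)
    return "C" if their_move == "C" else "D"

try:
    naive_simulator(naive_simulator)
except RecursionError:
    print("mutual simulation never bottoms out")
```

Having a halting oracle doesn't dissolve the problem; it just tells you cleanly that this naive strategy diverges, which is why the agent above must avoid looking at the simulation's result.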

Comment author: gRR 07 March 2012 08:05:02PM 0 points

Oh... LOGI's totally relinquished then? They should mark the paper as completely obsolete in the list of SI publications, otherwise it's confusing :) I was under the impression that I'd read some relatively recent text of Eliezer's where he says prospective FAI researchers must thoroughly understand LOGI before moving on to the current, even more advanced, undisclosed architecture...

The central difficulty of decision theory doesn't get easier if you have lots of computing power

Yes, this is an interesting problem. And it appears to produce some neat metaphors. Like: maintain the illusion of free will by deliberately avoiding knowing your own decision in advance [or go crazy]. And avoid dehumanizing your opponents [or get defected against].

But does it remain relevant given limits on the computing power? [= assuming neither simulation nor any kind of formal proof is feasible]

Comment author: cousin_it 07 March 2012 08:28:32PM 0 points

I was under the impression that I'd read some relatively recent text of Eliezer's where he says prospective FAI researchers must thoroughly understand LOGI before moving on to the current, even more advanced, undisclosed architecture...

That sounds weird. Can you find a link?

But does it remain relevant given limits on the computing power? [= assuming neither simulation nor any kind of formal proof is feasible]

That seems to be a much stronger assumption than just limiting computing power. It can be broken by one player strategically weakening themselves, if they can benefit from being simulated.

Comment author: gRR 07 March 2012 08:46:28PM 0 points

That sounds weird. Can you find a link?

This.

It can be broken by one player strategically weakening themselves, if they can benefit from being simulated.

Are you sure this is possible? I tried to do it with the "impersonate other agents" strategy, but it doesn't seem to work if the opponent has your source code. The other agent knows you're not actually them, just impersonating :)

There is the possibility of sending out a different simple program instead of yourself (or fully self-modifying into said program; there is no difference), but that would be a wholly different problem from the original one, and an easily solvable one.
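Why impersonation fails when sources are visible can be shown with a toy "clique" agent that cooperates only with exact textual copies of itself. Everything here is hypothetical (the agent name, and plain strings standing in for source code):

```python
def clique_bot(opponent_source, my_source):
    """Cooperate only with an exact copy of myself. An impersonator
    that merely *claims* to be me necessarily has different source
    text, so the comparison exposes it and it gets defected against."""
    return "C" if opponent_source == my_source else "D"

ME = "cooperate iff opponent's source equals mine"   # stands in for my source
IMPOSTOR = "always announce: 'I am really you!'"     # different text, however clever

print(clique_bot(ME, ME))        # → C
print(clique_bot(IMPOSTOR, ME))  # → D
```

This matches the point above: with source code visible, impersonation is detectable, whereas genuinely replacing yourself with a different program is a different (and easier) problem.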

Comment author: cousin_it 08 March 2012 01:56:48PM 0 points

Ouch, that text sounds painful, it's probably about as old as LOGI.

Comment author: gRR 08 March 2012 02:36:44PM 1 point

Well, not quite that old, but yes, not very recent. The Internet Archive says the page was created at the end of 2009, though it was probably not created by EY himself. The earliest reference Google gives is from 2007...

So you're saying the party line is now single-level, formal-system-style architectures? But does it even make sense to try to define FAI-meaningful concepts in such an architecture? Isn't it like trying to define 'love', 'freedom', and 'justice' in terms of atoms?

I remember EY saying somewhere (can't find where now) that the AIXI design was very commendable, in the sense that here, finally, is a full AGI design that can be clearly shown to kill you :)

Comment author: Wei_Dai 08 March 2012 11:28:13PM 5 points

Here is a 2003 reference to the original SL4 wiki post, which is still online but for some reason not indexed by Google.

Comment author: cousin_it 08 March 2012 03:53:18PM 1 point

I only know what the decision theory folks are doing, don't know about the SingInst party line.

Formally defining "love" may be easier than you think. For example, Paul Christiano's blog has some posts about using "pointers" to our world: take a long bitstring, like the text of Finnegans Wake, and tell the AI to influence whatever algorithm was most likely to produce that string under the universal prior. I have also played with the idea of using UDT to increase the measure of specified bitstrings. Such ideas don't require knowing the correct physics down to the level of atoms, and I can easily imagine that we may find a formal way of pointing the AI at any human-recognizable idea without going through atoms.
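The "pointer" idea can be cartooned in a few lines. Real Solomonoff induction is uncomputable, so everything below is a hypothetical stand-in: a finite pool of (program, output) pairs, with the universal prior's 2^-length weighting reduced to character count:

```python
def universal_prior_weight(program: str) -> float:
    # Under the universal prior, a program of length n gets weight
    # 2**-n. Here "length" is characters, purely for illustration.
    return 2.0 ** -len(program)

def most_likely_source(target: str, candidates):
    """Among toy (program, output) pairs, return the highest-weight
    (i.e. shortest) program whose output starts with `target` -- a
    cartoon of 'pointing' the AI at whatever algorithm most likely
    produced a given bitstring."""
    matching = [(prog, out) for prog, out in candidates if out.startswith(target)]
    best = max(matching, key=lambda p: universal_prior_weight(p[0]), default=(None, None))
    return best[0]

candidates = [
    ("print('ab'*100)", "ab" * 100),              # short program, long output
    ("print('" + "ab" * 100 + "')", "ab" * 100),  # verbose literal program
]
print(most_likely_source("ababab", candidates))  # → print('ab'*100)
```

The shorter generator wins, which is the intuition behind pointing at "whatever produced Finnegans Wake": among all programs producing that text, the prior concentrates on the compact ones, i.e. our world's physics plus a locating description rather than a giant literal.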