gRR comments on Example decision theory problem: "Agent simulates predictor" - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (76)
Wouldn't this be a level mismatch in a multi-level AI architecture? Like, proving things about low-level neural computational substrate instead of about the conceptual level where actual cognition would take place, and where the actual friendliness would be defined? [and this level can't be isomorphic to any formal logical system, except in symbolic AI, which doesn't work]
And again, a conceptual-level understanding should do the trick. Like, knowing that I play PD against myself would be sufficient to cooperate. Besides, as EY frequently says, it's really hard to find oneself in a true PD. Usually, it's iterated PD, or some Schelling's conflict game [BTW, huge thanks for recommending his book in one of your posts!]
If a multilevel architecture (whatever it is) makes provable friendliness impossible, then FAI can't use it.
I imagine the future FAI as closer to AIXI, which works fine without multilevel architecture, than to the Lisp programs of the 70s.
AFAICT, the general architecture that EY advocates (-ed?) in "Levels of Organization in GI" is multilevel. But this doesn't automatically mean that it's impossible to prove anything about it. Maybe it's possible, just not using the formal logic methods. [And so maybe getting not a 100% certainty, but 100-1e-N%, which should be sufficient for large enough N].
AIXI doesn't work much better than the symbolic AI Lisp programs of the 70s. I mean, the General Problem Solver would be superintelligent too, given infinite computing power.
Eliezer says here:
To make the General Problem Solver or any other powerful computing device do anything interesting in the real world, you need to give it a formal description that contains the real world as a special case. You could use the universal prior, which gives you AIXI. Or you could use the yet-unspecified prior of UDT, which gives you the yet-unspecified UDT-AIXI.
The central difficulty of decision theory doesn't get easier if you have lots of computing power. Imagine you're playing the PD against someone. You both know each other's source code, but you have a halting oracle and your opponent doesn't. With so much power, what do you do? I simulate the other guy and... whoops, didn't think this through. Looks like I must avoid looking at the result. Hmmm.
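To make the trap concrete, here is a toy sketch in Python. It is not anyone's actual formalism: the class names, the payoff numbers, and the `cost`/`budget` stand-ins for computing power are all assumptions made up for illustration. The weak opponent cooperates only if it can afford to simulate you and sees you cooperate. The "oracle" player simulates the opponent and best-responds to what it sees; the "committed" player only inspects what kind of opponent it faces (a stand-in for reading source code without running it) and never looks at the simulated result.

```python
# Toy model only: names, payoffs, and the cost/budget measure are hypothetical.
PAYOFF = {  # (my move, their move) -> my payoff
    ("C", "C"): 2, ("C", "D"): 0,
    ("D", "C"): 3, ("D", "D"): 1,
}


class BudgetExceeded(Exception):
    """Raised when an agent cannot afford to simulate its target."""


def simulate(simulator, target, against):
    """Let `simulator` run `target` as it would play against `against`."""
    if target.cost > simulator.budget:
        raise BudgetExceeded
    return target.move(against)


class BoundedOpponent:
    """Cooperates only if it can verify, within budget, that you cooperate."""
    cost, budget = 1, 10

    def move(self, opponent):
        try:
            seen = simulate(self, opponent, self)
        except BudgetExceeded:
            return "D"  # cannot predict you, so it plays safe
        return "C" if seen == "C" else "D"


class OracleBestResponder:
    """Simulates the opponent, looks at the result, and best-responds."""
    cost, budget = 1000, float("inf")  # far too expensive to be simulated

    def move(self, opponent):
        theirs = simulate(self, opponent, self)
        return max(("C", "D"), key=lambda mine: PAYOFF[(mine, theirs)])


class Committed:
    """Cheap, predictable rule: never runs the opponent, only checks its type."""
    cost, budget = 1, float("inf")

    def move(self, opponent):
        return "C" if isinstance(opponent, BoundedOpponent) else "D"


def play(a, b):
    """Return both moves for one round of the Prisoner's Dilemma."""
    return a.move(b), b.move(a)


oracle_result = play(OracleBestResponder(), BoundedOpponent())
committed_result = play(Committed(), BoundedOpponent())
print(oracle_result, committed_result)
```

In this toy setup the best-responder ends up at mutual defection, because the opponent cannot predict it and defects, while the cheap committed rule reaches mutual cooperation. The extra power only helps if you decline to act on what the simulation tells you.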
Oh... so LOGI's been totally relinquished then? They should mark the paper as completely obsolete in the list of SI publications, otherwise it's confusing :) I was under the impression that I had read a relatively recent text by Eliezer where he says that prospective FAI researchers must thoroughly understand LOGI before moving on to the current, even more advanced, undisclosed architecture...
Yes, this is an interesting problem. And it appears to produce some neat metaphors. Like: maintain the illusion of free will by deliberately avoiding knowing your own decision in advance [or go crazy]. And avoid de-humanizing your opponents [or get defected against].
But does it remain relevant given limits on the computing power? [= assuming neither simulation nor any kind of formal proof is feasible]
That sounds weird. Can you find a link?
That seems to be a much stronger assumption than just limiting computing power. It can be broken by one player strategically weakening themselves, if they can benefit from being simulated.
This.
Are you sure this is possible? I tried to do this with the "impersonate other agents" strategy, but it does not seem to work if the opponent has your source code. The other agent knows you're not actually them, just impersonating :)
There is a possibility to send out a different simple program instead of yourself (or fully self-modify into the said program, there is no difference), but it would be a wholly different problem (and easily solvable) from the original one.
Ouch, that text sounds painful, it's probably about as old as LOGI.
Well, not quite that old, but yes, not very recent. The Internet Archive says the page was created at the end of 2009, but it was probably not done by EY himself. The earliest reference Google gives is from 2007...
So, you're saying that the party line is now single-level, formal-system-style architectures? But does it even make sense to try to define FAI-meaningful concepts in such an architecture? Isn't it like trying to define 'love', 'freedom', and 'justice' in terms of atoms?
I remember EY saying somewhere (can't find where now) that the AIXI design was very commendable in the sense that here, finally, is a full AGI design that can be clearly shown to kill you :)
Here is a 2003 reference to the original SL4 wiki post, which is still online but for some reason not indexed by Google.
I only know what the decision theory folks are doing, don't know about the SingInst party line.
Formally defining "love" may be easier than you think. For example, Paul Christiano's blog has some posts about using "pointers" to our world: take a long bitstring, like the text of Finnegans Wake, and tell the AI to influence whatever algorithm was most likely to produce that string under the universal prior. Also I have played with the idea of using UDT to increase the measure of specified bitstrings. Such ideas don't require knowing correct physics down to the level of atoms, and I can easily imagine that we may find a formal way of pointing the AI at any human-recognizable idea without going through atoms.