gRR comments on Example decision theory problem: "Agent simulates predictor" - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (76)
Well, yes, but... is there a feeling at SI that these kinds of problems are relevant to AGI, friendly or not? Or is this just something generally interesting (maybe in the hope that these problems point to something tangential that would turn out to be highly relevant)?
I mean, generally I would say that ideas connected to these approaches fall into the "symbolic AI" paradigm, which is supposed to have been discredited by some seriously revered people, like Hofstadter...
Well, to make a guaranteed friendly AI you probably need to prove theorems about your AI design. And our universe most likely contains many copies of everything. So figuring out the right decision theory in the presence of copies seems to be a necessary step on the road to FAI. I don't speak for SingInst here, this is just how I feel.
Wouldn't this be a level mismatch in a multi-level AI architecture? Like, proving things about the low-level neural computational substrate instead of about the conceptual level where the actual cognition would take place, and where the actual friendliness would be defined? [And this level can't be isomorphic to any formal logical system, except in symbolic AI, which doesn't work.]
And again, a conceptual-level understanding should do the trick. Like, knowing that I'm playing the PD against myself would be sufficient to cooperate. Besides, as EY frequently says, it's really hard to find oneself in a true PD. Usually it's an iterated PD, or one of Schelling's conflict games. [BTW, huge thanks for recommending his book in one of your posts!]
If a multilevel architecture (whatever it is) makes provable friendliness impossible, then FAI can't use it.
I imagine the future FAI as closer to AIXI, which works fine without a multilevel architecture, than to the Lisp programs of the 70s.
AFAICT, the general architecture that EY advocates (-ed?) in "Levels of Organization in General Intelligence" is multilevel. But this doesn't automatically mean that it's impossible to prove anything about it. Maybe it's possible, just not using formal-logic methods. [And so maybe getting not 100% certainty, but (100 − 10^(−N))%, which should be sufficient for large enough N.]
AIXI doesn't "work" any more than the symbolic-AI Lisp programs of the 70s did. I mean, the General Problem Solver would also be superintelligent, given infinite computing power.
Eliezer says here:
To make the General Problem Solver or any other powerful computing device do anything interesting in the real world, you need to give it a formal description that contains the real world as a special case. You could use the universal prior, which gives you AIXI. Or you could use the yet-unspecified prior of UDT, which gives you the yet-unspecified UDT-AIXI.
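For reference (my own gloss, not part of the exchange): Hutter-style AIXI is roughly the expectimax agent over the universal prior, something like

$$ a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big(r_k + \cdots + r_m\big) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)} $$

where $U$ is a universal monotone Turing machine, $q$ ranges over environment programs, and $2^{-\ell(q)}$ is the universal-prior weight of a program of length $\ell(q)$. Swapping that prior for a different one is what would give the "yet-unspecified UDT-AIXI".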
The central difficulty of decision theory doesn't get easier if you have lots of computing power. Imagine you're playing the PD against someone. You both know each other's source code, but you have a halting oracle and your opponent doesn't. With so much power, what do you do? I simulate the other guy and... whoops, didn't think this through. Looks like I must avoid looking at the result. Hmmm.
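A minimal sketch of the trap being gestured at here (my own illustration, not from the thread; `simulate`, `best_response`, and the payoff table are hypothetical stand-ins for "run the opponent with your halting oracle" and the usual one-shot PD setup):

```python
# One-shot PD payoffs, (my payoff, their payoff): T > R > P > S.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def simulate(opponent_source, my_source):
    """Hypothetical stand-in for running the opponent to completion
    with a halting oracle. For illustration it returns a fixed move."""
    return "C"

def best_response(opponent_move):
    # Against a move that is already a fixed, computed fact,
    # defection strictly dominates in the one-shot PD.
    return "D"

def naive_oracle_agent(opponent_source, my_source):
    # Simulate the other guy, then best-respond to whatever came back.
    opponent_move = simulate(opponent_source, my_source)
    return best_response(opponent_move)

if __name__ == "__main__":
    my_move = naive_oracle_agent("<their source>", "<my source>")
    their_move = "D"  # an opponent who can read this code predicts the defection
    print(my_move, their_move, PAYOFFS[(my_move, their_move)])  # D D (1, 1)
```

Conditioning on the simulation result collapses the agent into an unconditional defector, which the opponent can see from the source code, so both players land on (1, 1) instead of (3, 3). Hence "looks like I must avoid looking at the result."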
Oh... LOGI's totally relinquished then? They should mark the paper as completely obsolete in the list of SI publications, otherwise it's confusing :) I was under the impression that I'd read some relatively recent text of Eliezer's where he says prospective FAI researchers must thoroughly understand LOGI before moving on to the current, even more advanced, undisclosed architecture...
Yes, this is an interesting problem. And it appears to produce some neat metaphors. Like: maintain the illusion of free will by deliberately avoiding knowing your own decision in advance [or go crazy]. And avoid dehumanizing your opponents [or get defected against].
But does it remain relevant given limits on computing power? [= assuming neither simulation nor any kind of formal proof is feasible]
That sounds weird. Can you find a link?
That seems to be a much stronger assumption than just limiting computing power. It can be broken by one player strategically weakening themselves, if they can benefit from being simulated.
This.
Are you sure this is possible? I tried to do this with the "impersonate other agents" strategy, but it does not seem to work if the opponent has your source code. The other agent knows you're not actually them, just impersonating :)
There is the possibility of sending out a different, simple program instead of yourself (or fully self-modifying into said program; there is no difference), but that would be a wholly different problem from the original one (and an easily solvable one).
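A minimal sketch of that easier version (my own illustration, not anything from the thread): the "simple program you send out" can just be a bot that cooperates with an exact syntactic copy of itself and defects against everything else, so impersonation by anything with different source text fails trivially.

```python
import inspect

def clique_bot(opponent_source: str) -> str:
    """Cooperate iff the opponent's source is literally this function's source."""
    my_source = inspect.getsource(clique_bot)
    return "C" if opponent_source == my_source else "D"

if __name__ == "__main__":
    # An exact copy is recognized and gets cooperation...
    print(clique_bot(inspect.getsource(clique_bot)))     # C
    # ...while anything merely impersonating, with different source, gets D.
    print(clique_bot("def impostor(src): return 'C'"))   # D
```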
Ouch, that text sounds painful, it's probably about as old as LOGI.