Nick_Tarleton comments on Building Phenomenological Bridges - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (116)
Not sure I agree. Let me try to spell it out.
Here's what I always assumed a UDT agent to be doing: on one hand, it has a prior over world programs - let's say without loss of generality that there's just one world program, e.g. a cellular automaton containing a computer containing an operating system running the agent program. It's not pre-sliced, the agent doesn't know what transistors are, etc. On the other hand, the agent knows a quined description of itself, as an agent program with inputs and outputs, in whatever high-level language you like.
Then the agent tries to prove theorems of the form, "if the agent program implements such-and-such mapping from inputs to outputs, then the world program behaves a certain way". To prove a theorem of that form, the agent might notice that some things happening within the world program can be interpreted as "inputs" and "outputs", and the logical dependence between these is provably the same as the mapping implemented by the agent program. (So the agent will end up finding transistors within the world program, so to speak, while trying to prove theorems of the above form.) Note that the agent might discover multiple copies of itself within the world and set up coordination based on different inputs that they receive, like in Wei's post.
This approach also has a big problem, which is kind of opposite to the problem described in the RobbBB's post. Namely, it requires us to describe our utility function at the base level of reality, but that's difficult because we don't know how paperclips are represented at the base level of reality! We only know how we perceive paperclips. Solving that problem seems to require some flavor of Paul's "indirect normativity", but that's broken and might be unfixable as I've discussed with you before.
Do you have a link to this discussion?
Yes, see this post. Most of the discussion happened in PM exchanges, but I think you can still get the idea. Feel free to PM me for explanations if you want.