V_V comments on Building Phenomenological Bridges - Less Wrong

56 Post author: RobbBB 23 December 2013 07:57PM

Comment author: V_V 30 December 2013 12:22:09PM *  1 point [-]

Namely, it requires us to describe our utility function at the base level of reality, but that's difficult because we don't know how paperclips are represented at the base level of reality! We only know how we perceive paperclips.

In principle you could have a paperclip perception module that counts paperclips, define utility in terms of its output, and include huge penalties for world states where the paperclip perception module has been functionally altered (or, more precisely, for world states where you can't prove that the paperclip perception module hasn't been functionally altered).
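
To make the shape of that proposal concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the original suggestion: the `WorldState` type, its two fields, the `utility` function, and the size of `TAMPER_PENALTY` are all hypothetical, and the hard part (actually proving that the module is unaltered) is reduced to a boolean flag.

```python
from dataclasses import dataclass

# Hypothetical magnitude; only assumed to dwarf any achievable paperclip count.
TAMPER_PENALTY = 1e12


@dataclass(frozen=True)
class WorldState:
    # Output of the (hypothetical) paperclip perception module.
    paperclips_seen: int
    # True only if the agent can prove the module has not been
    # functionally altered; "no proof available" counts as False.
    module_proven_intact: bool


def utility(state: WorldState) -> float:
    """Utility defined on the module's output, minus a tamper penalty."""
    if not state.module_proven_intact:
        # Penalize every state where integrity cannot be proven,
        # not just states where tampering is known to have happened.
        return state.paperclips_seen - TAMPER_PENALTY
    return float(state.paperclips_seen)
```

Under these assumptions, a state with a million reported paperclips and no integrity proof scores far below a state with zero paperclips and a provably intact module, so the penalty dominates whatever the module reports.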

Comment author: cousin_it 03 January 2014 05:10:01PM *  2 points [-]

Note that a utility function in UDT is supposed to be a mathematical expression in closed form, with no free variables pointing to "perception". So applying your idea to UDT would require a mathematical model of how agents get their perceptions, e.g. "my perceptions are generated by the universal distribution", as in UDASSA. Such a model would have to address all the usual anthropic questions, such as what happens to subjective probabilities if the perception module gets copied conditional on winning the lottery. And even if we found the right model, I wouldn't build an AI based on that idea, because it might try to hijack the inputs of the perception module instead of doing useful work.

I'd be really interested in a UDT-like agent with a utility function over perceptions instead of a closed-form mathematical expression, though. Nesov called that hypothetical thing "UDT-AIXI" and we spent some time trying to find a good definition, but unsuccessfully. Do you know how to define such a thing?

Comment author: Squark 02 March 2014 07:37:08AM 1 point [-]

I'd be really interested in a UDT-like agent with a utility function over perceptions instead of a closed-form mathematical expression, though. Nesov called that hypothetical thing "UDT-AIXI" and we spent some time trying to find a good definition, but unsuccessfully. Do you know how to define such a thing?

My model of naturalized induction allows it: http://lesswrong.com/lw/jq9/intelligence_metrics_with_naturalized_induction/