Squark comments on Introducing Corrigibility (an FAI research subfield) - Less Wrong

29 Post author: So8res 20 October 2014 09:09PM


Comment author: Squark 06 March 2015 02:55:02PM 1 point

Thanks, that is a good explanation.

Regarding problem 5, one approach I thought of is what I call "epistemic boxing". Namely, we put the AGI in a virtual world (the "box") and program it to optimize expected utility over a hard-coded (stochastic) model of the box, rather than over a Solomonoff measure. This assumes the utility function is given explicitly in terms of the box's degrees of freedom.

Such an AGI can still recursively self-improve and become superintelligent; however, it will never escape the box, since that possibility is a non sequitur in its epistemology. In particular, the box can have external inputs, but the AGI will model them as e.g. random noise and will not attempt to extrapolate whatever pattern they contain (it will always treat the pattern as accidental).
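To make the idea concrete, here is a minimal sketch of an "epistemically boxed" planner. The world, the transition probabilities, and the utilities below are all illustrative assumptions, not anything from the original comment: the agent maximizes expected utility by value iteration over a fixed, hard-coded stochastic model of a tiny three-state box, and the 0.1-probability branches play the role of external inputs folded in as irreducible noise that the agent never tries to model further.

```python
# Hypothetical toy "epistemic box": a 3-state world whose dynamics the
# agent represents with a FIXED stochastic transition table. The agent
# plans by expected-utility maximization over this hard-coded model and
# never updates it from observations, so "escaping the box" is simply
# not representable in its epistemology.

STATES = [0, 1, 2]
ACTIONS = ["stay", "move"]

# Hard-coded stochastic model: MODEL[s][a] = [(next_state, probability)].
# The 0.1 branches stand in for external inputs, modeled as plain noise
# rather than as a pattern to be extrapolated.
MODEL = {
    0: {"stay": [(0, 0.9), (1, 0.1)], "move": [(1, 0.9), (2, 0.1)]},
    1: {"stay": [(1, 0.9), (0, 0.1)], "move": [(2, 0.9), (0, 0.1)]},
    2: {"stay": [(2, 0.9), (1, 0.1)], "move": [(0, 0.9), (1, 0.1)]},
}

# Utility given explicitly in terms of the box's degrees of freedom.
UTILITY = {0: 0.0, 1: 1.0, 2: 5.0}


def expected_values(horizon, gamma=0.9):
    """Finite-horizon value iteration over the fixed box model."""
    values = {s: 0.0 for s in STATES}
    for _ in range(horizon):
        values = {
            s: max(
                sum(p * (UTILITY[s2] + gamma * values[s2])
                    for s2, p in MODEL[s][a])
                for a in ACTIONS
            )
            for s in STATES
        }
    return values


def best_action(state, horizon=20, gamma=0.9):
    """Action maximizing expected utility under the boxed model only."""
    values = expected_values(horizon, gamma)
    return max(
        ACTIONS,
        key=lambda a: sum(p * (UTILITY[s2] + gamma * values[s2])
                          for s2, p in MODEL[state][a]),
    )
```

Self-improvement could make this planner compute the same optimum faster or over a larger horizon, but nothing it computes ever refers to anything outside MODEL.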

Regarding question 2, I think there is a non-negligible probability that it is unsolvable. That is not to say we shouldn't look for solutions, but IMO we should be prepared for the possibility that there are none.