Zetetic comments on Safety Culture and the Marginal Effect of a Dollar - Less Wrong

23 Post author: jimrandomh 09 June 2011 03:59AM


Comments (105)


Comment author: Zetetic 10 June 2011 07:49:37AM 0 points

My understanding was that the CEV approach is a meta-level approach to stable self-improvement: it aims to design code that outputs what we would want an FAI's code to look like (or something like this). I could certainly be wrong, of course, and I have very little to go on here. The Knowability of FAI and CEV are both vaguer than I would like (since, of course, the problems are still way open) and several years old, so I have to piece the picture together indirectly.

If that interpretation is correct, it seems (and I stress that I might be totally off base with this) that stable recursive self-improvement over time is not the biggest conceptual concern. Rather, the biggest conceptual difficulty is determining how to derive a coherent goal set from a bunch of Bayesian utility maximizers, each equipped with an individual person's utility function (and how to extract each person's utility function in the first place), or something like that. Stable self-improving code would then (hopefully) be extrapolated by the resulting CEV, which is actually the initial dynamic.

Comment author: asr 10 June 2011 07:59:08AM 0 points

My comment wasn't directed towards CEV at all -- CEV sounds like a sensible working definition of "friendly enough", and I agree that it's probably computationally hard.

I was suggesting that any program, AI or not, that is coded to rewrite critical parts of itself in substantial ways is likely to go "splat", not "FOOM" -- that is, to degenerate into something that doesn't work at all.