TimS comments on Superintelligent AGI in a box - a question. - Less Wrong

14 Post author: Dmytry 23 February 2012 06:48PM

Comments (77)

Comment author: TimS 25 February 2012 03:24:22AM 0 points

Well, one way to be a better optimizer is to ensure that one's optimizations are actually implemented. When the program self-modifies, how do we ensure that this capacity is not created? The worst-case scenario is that the program learns to improve its ability to persuade you that changes to the code should be authorized.

In short, allowing the program to "optimize" itself does not define what should be optimized. Deciding what should be optimized is the output of some function, so I suggest calling that the "utility function" of the program. If you don't program it explicitly, you risk such a function appearing through unintended interactions of functions that were programmed explicitly.

Comment author: jacobt 25 February 2012 03:36:35AM 0 points

> Well, one way to be a better optimizer is to ensure that one's optimizations are actually implemented.

No, changing program (2) to persuade the human operators will not give it a better score according to criterion (3).

> In short, allowing the program to "optimize" itself does not define what should be optimized. Deciding what should be optimized is the output of some function, so I suggest calling that the "utility function" of the program. If you don't program it explicitly, you risk such a function appearing through unintended interactions of functions that were programmed explicitly.

I assume you're referring to the fitness function (performance on the training set) as a utility function. It is sort of like a utility function, in that the program will try to find code for (2) that improves performance on the fitness function. However, it will not do anything like persuading human operators to let it out in order to improve the utility function. It will only execute program (2) to find improvements. Since it's not exactly a utility function in the VNM sense, it should not be called a utility function.

Comment author: TimS 25 February 2012 04:18:14AM 0 points

> allow the improvement if it makes it do better on average on the sample optimization problems without being significantly more complex (to prevent overfitting). That is, the fitness function would be something like (average performance - k * bits of optimizer program).

Who exactly is doing the "allowing"? If it is the program, the criteria for allowing changes haven't been rigorously defined. If it is the human, how are we verifying that there is an improvement in average performance? There is no particular guarantee that verifying an improvement will be easier than discovering it (by hypothesis, we couldn't discover the improvement without the program).

Comment author: jacobt 25 February 2012 04:21:09AM * 0 points

> Who exactly is doing the "allowing"?

Program (3), which is a dumb, non-optimized program. See this for how it could be defined.
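To make the idea of a "dumb" checker concrete, here is a minimal Python sketch of the fitness criterion quoted above (average performance - k * bits of optimizer program). It is only an illustration, not the definition the link points to: the names `fitness` and `allow_change`, the default value of `k`, and the choice to measure program size as the bit length of its source bytes are all assumptions made for this example.

```python
from typing import Sequence


def fitness(scores: Sequence[float], program_source: bytes, k: float) -> float:
    """Average score on the sample problems, minus k per bit of program size.

    Assumption: program size is approximated by the length of its source in bits.
    """
    average_performance = sum(scores) / len(scores)
    program_bits = 8 * len(program_source)
    return average_performance - k * program_bits


def allow_change(
    old_scores: Sequence[float],
    old_source: bytes,
    new_scores: Sequence[float],
    new_source: bytes,
    k: float = 0.001,  # hypothetical penalty weight, chosen only for illustration
) -> bool:
    """Accept a proposed rewrite only if its penalized fitness is strictly higher."""
    return fitness(new_scores, new_source, k) > fitness(old_scores, old_source, k)
```

Note that nothing in this check looks at anything other than the scores and the program size, which is the sense in which it is not itself an optimizer. How the scores are produced and verified is a separate question.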

> There is no particular guarantee that the verification of improvement will be easier than discovering the improvement (by hypothesis, we couldn't discover the latter without the program).

See this. Many useful problems are easy to verify and hard to solve.
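One standard example of "easy to verify, hard to solve" is subset sum, sketched below in Python. This example is not from the linked material; the function names `verify` and `solve` are made up for illustration. Checking a proposed answer takes time linear in its size, while the brute-force search shown here may try up to 2^n subsets. The problem is NP-complete, so no algorithm is known that solves it in time polynomial in the input size in general.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def verify(numbers: Sequence[int], target: int, indices: Sequence[int]) -> bool:
    """Cheap check: are these distinct, valid indices whose values sum to target?"""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return False  # the same element may not be used twice
    if any(i < 0 or i >= len(numbers) for i in idx):
        return False  # index out of range
    return sum(numbers[i] for i in idx) == target


def solve(numbers: Sequence[int], target: int) -> Optional[Tuple[int, ...]]:
    """Expensive search: try every non-empty subset until one verifies."""
    for size in range(1, len(numbers) + 1):
        for idx in combinations(range(len(numbers)), size):
            if verify(numbers, target, idx):
                return idx
    return None


numbers = [3, 34, 4, 12, 5, 2]
answer = solve(numbers, 9)                  # slow in general
print(answer, verify(numbers, 9, answer))   # the check itself is fast
```

The point carries over to the discussion: a checker like `verify` can stay simple even when the search that produced the candidate is far beyond what the checker could find on its own.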