The objective function that measures the value of the solution should be a part of the problem: it's part of figuring out what counts as a solution in the first place. Maybe in some cases the solution is binary: if you're solving an equation, you either get a root or you don't.
In some cases, the value of a solution is complicated to assess: how do you trade off a cure for cancer that fails in 10% of all cases against a cure for cancer that gives the patient a permanent headache? But either way you need an objective function to tell the AI (or the human) what a "cure for cancer" is; possibly your intuitive understanding of this is incomplete, but that's another problem.
Edit: Your objection does have some merit, though, because two different utility functions can yield the same optimization problem. For instance, you could be playing the stock market to optimize $$ or log($$), and the ranking of solutions would be the same (expected values would come out differently, but that's another issue); however, the concept of a solution that's "a tiny bit better" is different.
the ranking of solutions would be the same
Only for pure solutions - the two utility functions would rank lotteries differently.
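The pure-outcomes-versus-lotteries point can be checked numerically. A minimal sketch (the dollar amounts and probabilities are made up for illustration): $$ and log($$) agree on every pairwise ranking of sure outcomes, but can disagree about a gamble.

```python
import math

def expected(lottery, u):
    """Expected utility of a lottery given as (probability, payoff) pairs."""
    return sum(p * u(x) for p, x in lottery)

# Pure outcomes: $ and log($) always agree on the ranking, since log is monotonic.
assert (100.0 < 200.0) == (math.log(100.0) < math.log(200.0))

# Lotteries: a 50/50 gamble between $100 and $400, versus a sure $210.
gamble = [(0.5, 100.0), (0.5, 400.0)]
sure = [(1.0, 210.0)]

# Under u($) = $, the gamble wins: E[$] = 250 > 210.
assert expected(gamble, lambda x: x) > expected(sure, lambda x: x)

# Under u($) = log($), the sure thing wins:
# 0.5*log(100) + 0.5*log(400) = log(200) < log(210).
assert expected(gamble, math.log) < expected(sure, math.log)
```

So the two functions define the same optimization problem over deterministic outcomes, but not over uncertain ones - which is exactly the vNM sense in which they are different utility functions.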
As every schoolchild knows, an advanced AI can be seen as an optimisation process - something that hits a very narrow target in the space of possibilities. The Less Wrong wiki entry proposes some measure of optimisation power:
This doesn't seem a fully rigorous definition - what exactly is meant by a million random tries? Also, it measures how hard it would be to come up with that solution, but not how good that solution is. An AI that comes up with a solution that is ten thousand bits more complicated to find, but that is only a tiny bit better than the human solution, is not one to fear.
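For what it's worth, the "random tries" idea can be made concrete as a Monte Carlo estimate: count what fraction of randomly sampled solutions score at least as well as the solution under consideration, and take -log2 of that fraction as the optimisation power in bits. This is only my guess at an operationalisation, not the wiki's definition; the function names and the toy objective below are made up.

```python
import math
import random

def optimisation_power(solution_value, objective, sample_solution, n=1_000_000):
    """Estimate optimisation power in bits: -log2 of the fraction of
    n random tries that score at least as well as the given solution."""
    hits = sum(1 for _ in range(n) if objective(sample_solution()) >= solution_value)
    if hits == 0:
        # The solution beat every sample; we only know a lower bound of log2(n) bits.
        return float('inf')
    return -math.log2(hits / n)

# Toy example: objective is -|x|, random tries are uniform on [-1, 1].
random.seed(0)
obj = lambda x: -abs(x)
draw = lambda: random.uniform(-1.0, 1.0)

# A solution with |x| = 0.01: about 1% of random draws do at least as well,
# so the estimate should land near -log2(0.01), i.e. roughly 6.6 bits.
bits = optimisation_power(obj(0.01), obj, draw, n=100_000)
```

Note the sketch shares both weaknesses mentioned above: the answer depends on the sampling distribution (what counts as a "random try"), and it measures how rare the solution is, not how much better it is than the alternatives.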
Other potential measurements could include any of the metrics I suggested in the reduced impact post, but used in reverse: to measure large deviations from the status quo, not small ones.
Anyway, before I reinvent the coloured wheel, I just wanted to check whether there is a fully defined, agreed-upon measure of optimisation power.