AlexMennen comments on Mathematical Measures of Optimization Power - Less Wrong

3 Post author: Alex_Altair 24 November 2012 10:55AM

Comments (16)

Comment author: aaronde 25 November 2012 01:48:59AM *  6 points

What I can't figure out is how to specify possible worldstates “in the absence of an OP”.

Can we just replace the optimizer's output with random noise? For example, if we have an AI running in a black box, that only acts on the rest of the universe through a 1-gigabit network connection, then we can assign a uniform probability distribution over every signal that could be transmitted over the connection over a given time (all 2^(10^9) possibilities per second), and the probability distribution of futures that yields is our distribution over worlds that "could have been". We could do the same thing with a human brain and, say, all combinations of action potentials that could be sent down the spinal cord over a given time. This is desirable, because it separates optimization power from physical power. So paralyzed people aren't less intelligent just because "raise arm" isn't an option for them (That is, no combination of action potentials in their head will cause their arm to move).

More formally, an agent is a function or program that has a range or datatype. The range/datatype is the set of what we would call the agent's options. So assume we can generate counterfactual outcomes for each option in the range, the same way your favorite decision theory does. Then we can take optimization power to be the difference between EU given what the agent actually does, and the average EU over all the counterfactuals.*
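
As a minimal sketch of this definition, assuming the expected utility of each option's counterfactual outcome is already available as a number: the names `optimization_power` and `eu_by_option` and the toy values are hypothetical, chosen only for illustration, and the uniform average over options is the assumption stated above.

```python
from statistics import fmean

def optimization_power(eu_by_option, chosen):
    """EU of the chosen option minus the unweighted mean EU over all options.

    eu_by_option: dict mapping each option in the agent's range to the
                  expected utility of its counterfactual outcome (assumed given).
    chosen:       the option the agent actually outputs.
    """
    baseline = fmean(eu_by_option.values())  # average EU over all counterfactuals
    return eu_by_option[chosen] - baseline

# Toy example with made-up expected utilities for three options.
eu_by_option = {"a": 1.0, "b": 2.0, "c": 6.0}
op = optimization_power(eu_by_option, "c")  # 6.0 minus the mean of 3.0
```

An agent that picks a below-average option gets a negative score under this sketch, and one that picks uniformly at random scores zero in expectation.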

If the OP is some kind of black-box AI agent, it's easier to imagine this. But if the OP is evolution, or a forest fire, it's harder to imagine.

I'm not so sure. Choosing to talk about natural selection as an agent means defining an agent which values self-replication and outputs a replicator. So if you have a way of measuring how good a genome is at replicating, you could just subtract from that how good a random sequence of base-pairs is, on average, at replicating, to get a measure of how much natural selection has optimized that genome. Of course, you could do the same thing with an entire animal versus a random clump of matter, because the range of the agent is just part of the definition.

EDIT: * AlexMennen had a much better idea for normalizing this than I did ;)

Comment author: AlexMennen 25 November 2012 04:45:52AM 1 point

So paralyzed people aren't less intelligent just because "raise arm" isn't an option for them (That is, no combination of action potentials in their head will cause their arm to move).

Caveat: if someone is paralyzed because of damage to their brain, rather than to their peripheral nerves or muscles, then this is not true, which creates an undesirable dependency of the measured optimization power on the location of the cause of the disability. Despite this drawback, I like this formalization.

Erm, you probably want to use something like (EU* - EU[av]) / EU[av], where EU* is just the actual expected utility, and EU[av] is the average of the expected utilities of the counterfactual probability distributions over world states associated with each of the agent's options.

No, that clearly makes no sense if EU[av] <= 0. If you want to divide by something to normalize the measured optimization power (so that multiplying the utility function by a constant doesn't change the optimization power), the standard deviation of the expected utilities of the counterfactual probability distributions over world states associated with each of the agent's options would be a better choice.
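
A minimal sketch of that normalization, assuming the per-option expected utilities are given as a plain list. The function name `normalized_op` is hypothetical, and using the population standard deviation (`statistics.pstdev`) rather than the sample one is an assumption; the comment above does not say which is meant.

```python
from statistics import fmean, pstdev

def normalized_op(option_eus, actual_eu):
    """(EU* - EU[av]) / SD, where SD is the standard deviation of the options' EUs.

    option_eus: expected utility of the counterfactual for each option (assumed given).
    actual_eu:  EU*, the expected utility of what the agent actually does.
    """
    sd = pstdev(option_eus)  # population standard deviation (an assumption)
    if sd == 0:
        # Every option is equally good, so the measure is undefined.
        raise ValueError("all options have the same expected utility")
    return (actual_eu - fmean(option_eus)) / sd

# Toy values: three options, and the agent picks the best one.
option_eus = [0.0, 2.0, 4.0]
score = normalized_op(option_eus, 4.0)
```

Unlike dividing by EU[av], this stays well defined when the average is zero or negative; it only breaks down when every option has the same expected utility.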

Comment author: aaronde 25 November 2012 06:05:31AM 1 point

Caveat: if someone is paralyzed because of damage to their brain, rather than to their peripheral nerves or muscles, then this is not true,

That's why I specified that you don't get penalized for disabilities that have nothing to do with the signals leaving your brain.

which creates an undesirable dependency of the measured optimization power on the location of the cause of the disability.

I disagree. I think that's kind of the point of defining "optimization power" as distinct from "power". A man in a prison cell isn't less intelligent just because he has less freedom.

No, that clearly makes no sense if EU[av] <= 0. If you want to divide by something to normalize the measured optimization power (so that multiplying the utility function by a constant doesn't change the optimization power), the standard deviation of the expected utilities of the counterfactual probability distributions over world states associated with each of the agent's options would be a better choice.

Great idea! I was really sloppy about that, realized at the last minute that taking a ratio was clearly wrong, and just wanted to make sure that you couldn't get different answers by scaling the utility function. I guess |EU[av]| does that, but now we can get different answers by shifting the utility function, which shouldn't matter either. Standard deviation is infinitely better.
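
The scaling and shifting point can be checked numerically. This is a rough sketch with made-up utilities; the function names are hypothetical, the population standard deviation is an assumption, and the scaling constant is taken to be positive.

```python
from statistics import fmean, pstdev

def op_over_abs_mean(eus, chosen):
    """(EU* - EU[av]) / |EU[av]| for the option at index `chosen`."""
    av = fmean(eus)
    return (eus[chosen] - av) / abs(av)

def op_over_std(eus, chosen):
    """(EU* - EU[av]) / standard deviation of the options' EUs."""
    return (eus[chosen] - fmean(eus)) / pstdev(eus)

eus = [1.0, 2.0, 6.0]                # toy expected utilities, one per option
scaled = [10.0 * u for u in eus]     # utility function multiplied by a positive constant
shifted = [u + 100.0 for u in eus]   # utility function shifted by a constant

# Dividing by |EU[av]| survives scaling but not shifting.
ratio_base = op_over_abs_mean(eus, 2)
ratio_scaled = op_over_abs_mean(scaled, 2)
ratio_shifted = op_over_abs_mean(shifted, 2)

# Dividing by the standard deviation survives both.
std_base = op_over_std(eus, 2)
std_scaled = op_over_std(scaled, 2)
std_shifted = op_over_std(shifted, 2)
```

Here `ratio_shifted` comes out far smaller than `ratio_base`, while the three standard-deviation scores agree up to floating-point error.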