A Problem About Bargaining and Logical Uncertainty

Wei Dai

21st Mar 2012

Suppose you wake up as a paperclip maximizer. Omega says "I calculated the millionth digit of pi, and it's odd. If it had been even, I would have made the universe capable of producing either 10²⁰ paperclips or 10¹⁰ staples, and given control of it to a staples maximizer. But since it was odd, I made the universe capable of producing 10¹⁰ paperclips or 10²⁰ staples, and gave you control." You double-check Omega's pi computation and your internal calculator gives the same answer.

Then a staples maximizer comes to you and says, "You should give me control of the universe, because before you knew the millionth digit of pi, you would have wanted to pre-commit to a deal where each of us would give the other control of the universe, since that gives you 1/2 probability of 10²⁰ paperclips instead of 1/2 probability of 10¹⁰ paperclips."
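The staples maximizer's arithmetic can be checked with a minimal sketch. It assumes (this is not stated by Omega) that whoever controls the universe produces only its own good at full capacity, and it treats the 1/2 prior on the digit's parity as a parameter. The names `CAPACITY`, `DEFAULT_CONTROLLER`, `paperclips_produced`, and `expected_paperclips` are illustrative only.

```python
from fractions import Fraction

# Production capacities from Omega's description, keyed by the digit's parity.
CAPACITY = {
    "even": {"paperclips": 10**20, "staples": 10**10},
    "odd": {"paperclips": 10**10, "staples": 10**20},
}
# Which maximizer Omega puts in control in each world.
DEFAULT_CONTROLLER = {"even": "staples", "odd": "paperclips"}


def paperclips_produced(world, swap):
    """Paperclips made in `world`, assuming the controller makes only its own good."""
    controller = DEFAULT_CONTROLLER[world]
    if swap:  # the deal: each maximizer hands control to the other
        controller = "paperclips" if controller == "staples" else "staples"
    return CAPACITY[world]["paperclips"] if controller == "paperclips" else 0


def expected_paperclips(swap, p_odd=Fraction(1, 2)):
    """Expected paperclips given a (logical) probability that the digit is odd."""
    return (p_odd * paperclips_produced("odd", swap)
            + (1 - p_odd) * paperclips_produced("even", swap))
```

With `p_odd` at 1/2, the swap yields an expected 10²⁰/2 paperclips against 10¹⁰/2 without it, which is the staples maximizer's point. With `p_odd` set to 1, as after checking the digit, the swap yields 0 paperclips against 10¹⁰, which is why the request is hard to accept after the fact.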

Is the staples maximizer right? If so, the general principle seems to be that we should act as if we had precommitted to a deal we would have made in ignorance of logical facts we actually possess. But how far are we supposed to push this? What deal would you have made if you didn't know that the first digit of pi was odd, or if you didn't know that 1+1=2?

On the other hand, suppose the staples maximizer is wrong. Does that mean you also shouldn't agree to exchange control of the universe before you knew the millionth digit of pi?

To make this more relevant to real life, consider two humans negotiating over the goal system of an AI they're jointly building. They have a lot of ignorance about the relevant logical facts, like how smart/powerful the AI will turn out to be and how efficient it will be in implementing each of their goals. They could negotiate a solution now in the form of a weighted average of their utility functions, but the weights they choose now will likely turn out to be "wrong" in full view of the relevant logical facts (e.g., the actual shape of the utility-possibility frontier). Or they could program their utility functions into the AI separately, and let the AI determine the weights later using some formal bargaining solution when it has more knowledge about the relevant logical facts. Which is the right thing to do? Or should they follow the staples maximizer's reasoning and bargain under the pretense that they know even less than they actually do?
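The contrast between fixing weights now and bargaining later can be illustrated with a toy model. Everything here is an assumption made for illustration: a single resource is split between parties A and B, each party's utility is linear in its share with an efficiency (`eff_a`, `eff_b`) that is only learned later, the disagreement point is zero utility for both, and the Nash bargaining solution stands in for "some formal bargaining solution", which the post does not specify.

```python
def outcomes(eff_a, eff_b, steps=1000):
    """Feasible (u_a, u_b) pairs as A's share of the resource goes from 0 to 1."""
    return [(eff_a * i / steps, eff_b * (steps - i) / steps)
            for i in range(steps + 1)]


def weighted_sum_choice(points, w_a, w_b):
    """Outcome maximizing a weighted sum of utilities, weights fixed in advance."""
    return max(points, key=lambda u: w_a * u[0] + w_b * u[1])


def nash_bargaining_choice(points, disagreement=(0.0, 0.0)):
    """Outcome maximizing the product of gains over the disagreement point."""
    d_a, d_b = disagreement
    feasible = [u for u in points if u[0] >= d_a and u[1] >= d_b]
    return max(feasible, key=lambda u: (u[0] - d_a) * (u[1] - d_b))


# Suppose it later turns out that A's goals are three times cheaper to satisfy.
points = outcomes(eff_a=3.0, eff_b=1.0)
fixed_weights = weighted_sum_choice(points, 0.5, 0.5)
bargained_later = nash_bargaining_choice(points)
```

In this toy model, equal weights chosen in advance hand the whole resource to whichever party turns out to be more efficient, while the later bargain splits it evenly regardless of the efficiencies. This is only one way the pre-chosen weights can come out "wrong"; it does not settle which procedure the two humans should prefer.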

Tags: Game Theory, Decision theory, Logical Uncertainty

Comments

Vladimir_Nesov

> And suppose you don't have enough computing power to compute the digit yourself at this point. Doesn't it seem right to self-modify into someone who would give control of the universe to the staples maximizer, since that gives you 1/2 "logical" probability of 10^20 paperclips instead of 1/2 "logical" probability of 10^10 paperclips?

Do you mean that I won't have enough computing power later either, after the staples maximizer's proposal is stated, or that there isn't enough computing power just during the thought experiment? (In the latter case, I make the decision to think long enough to compute the digit of pi before making a decision.)

What does it mean to self-modify if no action is being performed, that is, if any decision regarding that action could be computed later without any preceding precommitments?

(One way in which a "self-modification" might be useful is when you won't have enough computational power in the future to afford wasting what you have now, so you must continually make decisions that take some options away from your future self (perhaps by changing instrumental priorities rather than permanently foreclosing the opportunity to reconsider), thereby simplifying future decision-making at the cost of making it less optimal. Another is when you have to signal precommitment to other players who wouldn't be able to follow your more complicated future reasoning.)

Wei Dai

> Do you mean that I won't have enough computing power later either, after the staples maximizer's proposal is stated, or that there isn't enough computing power just during the thought experiment?

You will have enough computing power later.

> What does it mean to self-modify if no action is being performed, that is, if any decision regarding that action could be computed later without any preceding precommitments?

I mean suppose Omega gives you the option (now, when you don't have enough computing power to compute the millionth digit of pi) of replacing yoursel...
