Comment Permalink

Vladimir_Nesov16y00

You are making the situation worse for scenario 2, at least you give the source code for scenario 1. But yes, in general you'd want to not just give source code, but also a way of proving from that source code that it has certain properties (otherwise you are faced with halting problem/Rice theorem issues, and even more practical problems that halt you far short of those). It's easy for the explicit situation you describe in scenario 1, but worse in other cases.

In some cases, you don't really need an intermediary of explicit source code, you may just directly communicate a relatively small formal model allowing to prove required properties. This is useful for formally proving properties of systems for which formal models can't be constructed automatically, but can more or less reliably be constructed manually. See, for example, model checking (although the Wikipedia article is too short to reflect this aspect).

In general, you are interested in some aspect of runtime semantics, not of source code per se; as was noted by one commenter, what the program does also depends on how it's compiled and interpreted, what hardware it runs on, and what that hardware does as a result. But you are also not interested in most of that runtime semantics, only the aspects you care about. The code in a programming language usually (at least nominally) comes with formal semantics, that allows you to construct an overdetailed formal model, from which you can then try to abstract out the details that you don't care about, to find the answers to your questions, like "does this program ever crash at runtime?"

cousin_it16y00

I admit I'm in a fog. Intuitively, abstractly interpreting an opponent that's also an abstract interpreter should only give you information about its unconditional behavior, not its behavior against a concrete you. Do you have a simple specific example of what you're describing - a mathematical game where both players can analyze certain properties of each other?

See in context

32 Re-formalizing PD

by cousin_it

28th Apr 2009

2 min read

32

The Prisoner's Dilemma has been discussed to death here on OB/LW, right? Well, here's a couple new twists to somewhat... uh... expand the discussion.

Warning: programming and math ahead.

Scenario 1

Imagine a PD tournament between programs that can read each other's source code. In every match, player A receives the source code of player B as an argument, and vice versa. Matches are one-shot, not iterated.

In this situation it's possible to write a program that's much better than "always defect". Yes, in an ordinary programming language like C or Python, no futuristic superintelligent oracles required. No, Rice's theorem doesn't cause any problems.

Here's an outline of the program:

 // begin PREFIX
 Strategy main(SourceCode other)
 {
     // get source code of this program from "begin PREFIX" to "end PREFIX",
     // using ordinary quine (self-printing program) techniques
     String PREFIX = "..."
 
     if (other.beginsWith(PREFIX))
         return Strategy.COOPERATE;
     else
         return anythingElse(other);
 }
 // end PREFIX
 
 // from this point you can write anything you wish
 Strategy anythingElse(SourceCode other)
 {
     return Strategy.DEFECT;
 }

Some features of this program:

It cooperates with itself.
It cooperates with any other program that begins with PREFIX.
It's impossible to cheat, because opponents that begin with PREFIX can't not cooperate with this program.
~~Other authors now have an incentive to include PREFIX in their programs, moving their original logic into the "anythingElse" subroutine. This modification has no downside.~~

~~So, introducing such a program into the tournament should lead to a chain reaction until everyone cooperates. Unless I've missed something~~. What say ye?

Edit: the last point and the conclusion were wrong. Thanks to Warrigal for pointing this out.

Scenario 2

Now imagine another tournament where programs can't read each other's source code, but are instead given access to a perfect simulator. So programs now look like this:

Strategy main(ObjectCode self, ObjectCode other, Simulator simulator) {...}

and can call simulator.simulate(ObjectCode a, ObjectCode b) arbitrarily many times with any arguments. To give players a chance to avoid bottomless recursion, we also make available a random number generator.

Problem: in this setting, is it possible to write a program that's better than "always defect"?

The most general form of a reasonable program I can imagine at the moment is a centipede:

Programmer invents a number N and a sequence of real numbers 0 < p₁ < p₂ < ... < p_N < 1.
Program generates a random number 0 < r < 1.
If r < p₁, cooperate.
Simulate the opponent's reaction to you.
If opponent defects, defect.
Otherwise if r < p₂, cooperate.
And so on until N.

Exercise 1: when (for what N and p_i) does this program cooperate against itself? (To cooperate, the recursive tree of simulations must terminate with probability one.)

Exercise 2: when does this program win against a simple randomizing opponent?

Exercise 3: what's the connection between the first two exercises, and does it imply any general theorem?

Epilogue

Ordinary humans playing the PD othen rely on assumptions about their opponent. They may consider certain invariant properties of their opponent, like altruism, or run mental simulations. Such wetware processes are inherently hard to model, but even a half-hearted attempt brings out startling and rigorous formalizations instead of our usual vague intuitions about game theory.

Is this direction of inquiry fruitful?

What do you think?

Prisoner's Dilemma

Personal Blog

32

Mentioned in

60AI cooperation is already studied in academia as "program equilibrium"

45AI cooperation in practice

22An Alternative Approach to AI Cooperation

22Loebian cooperation, version 2

14Fairness and Geometry

Load More (5/7)

New Comment

63 comments, sorted by

top scoring

Click to highlight new comments since: Today at 5:04 PM

[-][anonymous]16y120

'Other authors now have an incentive to include PREFIX in their programs, moving their original logic into the "anythingElse" subroutine. This modification has no downside.'

Not really. Including this prefix makes it impossible to defect against a PREFIX program. It's still better to exclude the PREFIX and get the other problem to cooperate while you defect, if possible. It is true, though, that this isn't worthwhile if many programs will never cooperate if you defect, which includes that where anythingElse == Strategy.DEFECT;.

[-]cousin_it16y10

Good catch, thanks. You're right. Upvoted the comment, edited the post.

A weaker statement seems to be true: that the program provided is an evolutionarily stable strategy.

[-]Wei Dai16y20

Scenario 2 seems to share some similarities with Rolf Nelson's AI Deterrence Problem. You might want to check it out if you haven't already.

[-]Liron16y10

This is in my top 5 favorite LW posts so far.

[-]robert16y10

Let's introduce a time limit. Say that after a maximum of S computations (i.e., computation steps using some standardized notion) have passed, each player is forced to make a decision.

Now, write a program that is opaque to introspection: to find out what it decides (i.e. to COOPERATE or DEFECT) , it must be simulated until it halts. This program could use cryptography or other obsfuscation systems (random numbers would be useful). Engineer this program so that it take exactly S steps to run to completion.

The simulating player does not have time to both simulate and interpret the results of its simulation.

Seemingly, restricting all machines to the same time limit serves to reduce the efficacy of many (all?) of these adversarial simulation strategies.

[-]Eliezer Yudkowsky16y100

The simulating player does not have time to both simulate and interpret the results of its simulation.

...and so defects, because it's obvious what the other player intends.

[-]robert16y00

...and so defects, because it's obvious what the other player intends.

More interestingly, what if the program being simulated has a really clever algorithm that just happens to take S steps to compute?

[-]Liron16y50

A program can't be "clever" if it is indistinguishable from a permanent defector to other programs.

[-]cousin_it16y10

In the second scenario we can, losing a little generality, prohibit obfuscation by accepting only programs already organized as decision graphs (possible nodes: throw dice and branch; simulate opponent and branch; cooperate; defect). The problem stays meaningful because programs have no access to each other's source and must resort to simulation. Maybe the problem thus formulated has a trivial solution, but I don't see it yet.

[-]borisonanovitch16y10

Slightly tangential question about source code swapping: If A's source code depends on what it reads from B, and B on what it reads from A... Is there any chance of a Halting problem?

[-][anonymous]16y10

No. A's action depends on the soucecode of B and vice versa. As the sourcecode does not depend upon the sourcecode, nor the action on the action, you aren't into recursion

[-]AngryParsley16y30

Let's say I write a program A that compiles and runs B with A's source code as input. If B's output is "cooperate", A cooperates. If B's output is "defect", A defects.

Now I pit A against itself. Oops, infinite recursion. This isn't exactly the same as the halting problem, but it rhymes.