Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Prisoner's Dilemma (with visible source code) Tournament

47 Post author: AlexMennen 07 June 2013 08:30AM

After the iterated prisoner's dilemma tournament organized by prase two years ago, there was discussion of running tournaments for several variants, including one in which two players submit programs, each of which are given the source code of the other player's program, and outputs either “cooperate” or “defect”. However, as far as I know, no such tournament has been run until now.

Here's how it's going to work: Each player will submit a file containing a single Scheme lambda-function. The function should take one input. Your program will play exactly one round against each other program submitted (not including itself). In each round, two programs will be run, each given the source code of the other as input, and will be expected to return either of the symbols “C” or “D” (for "cooperate" and "defect", respectively). The programs will receive points based on the following payoff matrix:

“Other” includes any result other than returning “C” or “D”, including failing to terminate, throwing an exception, and even returning the string “Cooperate”. Notice that “Other” results in a worst-of-both-worlds scenario where you get the same payoff as you would have if you cooperated, but the other player gets the same payoff as if you had defected. This is an attempt to ensure that no one ever has incentive for their program to fail to run properly, or to trick another program into doing so.

Your score is the sum of the number of points you earn in each round. The player with the highest score wins the tournament. Edit: There is a 0.5 bitcoin prize being offered for the winner. Thanks, VincentYu!

Details:
All submissions must be emailed to wardenPD@gmail.com by July 5, at noon PDT (Edit: that's 19:00 UTC). Your email should also say how you would like to be identified when I announce the tournament results.
Each program will be allowed to run for 10 seconds. If it has not returned either “C” or “D” by then, it will be stopped, and treated as returning “Other”. For consistency, I will have Scheme collect garbage right before each run.
One submission per person or team. No person may contribute to more than one entry. Edit: This also means no copying from each others' source code. Describing the behavior of your program to others is okay.
I will be running the submissions in Racket. You may be interested in how Racket handles time (especially the (current-milliseconds) function), threads (in particular, “thread”, “kill-thread”, “sleep”, and “thread-dead?”), and possibly randomness.
Don't try to open the file you wrote your program in (or any other file, for that matter). I'll add code to the file before running it, so if you want your program to use a copy of your source code, you will need to use a quine. Edit: No I/O of any sort.
Unless you tell me otherwise, I assume I have permission to publish your code after the contest.
You are encouraged to discuss strategies for achieving mutual cooperation in the comments thread.
I'm hoping to get as many entries as possible. If you know someone who might be interested in this, please tell them.
It's possible that I've said something stupid that I'll have to change or clarify, so you might want to come back to this page again occasionally to look for changes to the rules. Any edits will be bolded, and I'll try not to change anything too drastically, or make any edits late in the contest.

Here is an example of a correct entry, which cooperates with you if and only if you would cooperate with a program that always cooperates (actually, if and only if you would cooperate with one particular program that always cooperates):

(lambda (x)
    (if (eq? ((eval x) '(lambda (y) 'C)) 'C)
        'C
        'D))

Comments (232)

Comment author: CronoDAS 07 June 2013 07:13:51AM 16 points [-]
Comment author: Kaj_Sotala 07 June 2013 07:42:23PM 7 points [-]

That page is a fascinating read.

Comment author: bluej100 07 June 2013 06:37:25PM 13 points [-]

The quine requirement seems to me to introduce non-productive complexity. If file reading is disallowed, why not just pass the program its own source code as well as its opponent's?

Comment author: darius 08 June 2013 02:21:21AM 5 points [-]

Yes -- in my version of this you do get passed your own source code as a convenience.

Comment author: AlexMennen 09 June 2013 08:41:12PM 4 points [-]

That's a good point. I've already got a few submissions, but on the other hand, I could notify them of the change, and it would only require a trivial modification. Is there a consensus on whether I should do this anyway?

Comment author: solipsist 10 June 2013 07:04:31PM *  6 points [-]

For the record, though I raised an objection, I'd be perfectly happy if the contest were modified so that player programs were passed their sources code as an argument. The rule change would have consequences that I don't understand, and I like that. Caveat emptor — I suspect the rule change would cause people to write more exploitable programs.

Comment author: solipsist 09 June 2013 09:14:01PM 4 points [-]

Passing in the source code is not the same as quining. A program that is passed in its own source code can easily check to see it's been altered (e.g. by including a cryptographic signature in the source code). With quining, the program can be mutated without easy detection.

Comment author: AlexMennen 10 June 2013 08:41:40PM 2 points [-]

Thanks for pointing that out. Unless someone can convince me that this won't be a problem, I will not change the rule.

Comment author: pengvado 10 June 2013 01:23:04PM *  2 points [-]

How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format. You could mess with that if you're smart enough to locate and delete the "verify the signature" step, but then you could do that in the real "passed its own sourcecode" format too.

Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.

Comment author: solipsist 10 June 2013 04:10:51PM *  4 points [-]

A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format.

Altering the internal structure of an opponent program would be very difficult, but that's not the only way to mutate a program. You can't tinker with the insides of a black box, but you can wrap a black box.

To be concrete: given an opponent's source code, I could mechanically generate an equivalent program with extremely dissimilar source code (perhaps just a block of text, a decryption routine, and a call to eval) that nevertheless acts exactly like the original program in every way. And since that mechanically-obfuscated program would act exactly like the original program in every way, the obfuscated program would not be able to detect that it had been altered. Do you agree?

Comment author: solipsist 10 June 2013 03:14:45PM *  2 points [-]

I'm playing Prisoner's Dilemma and wish to test if an opponent X is honest. I might try the following:

(1) Create two programs, Y and Z, which are algorithmically equivalent but obfuscated versions of X. (2) Run Y and Z against each other.

If Y and Z don't cooperate with each other, that's a good indication that X recognizes itself with a source-code comparison and that I shouldn't trust X.

This honesty check doesn't work if Y and Z are given access to their sources. Sure, when I simulate Y against Z, I could lie to Y and tell Y that its source is X (so Y believes itself to be unmodified). But when my deluded Y simulation is deciding whether to cooperate with Z, it (Y) may run Z in simulation. If Y informs its Z-simulation that Z's source is Z, then that Z-simulation will not be deluded into thinking that it is unmodified. Y's simulation of Z will be able to detect that it is an (obfuscated) simulation and act accordingly.

This honesty check isn't fool proof. X can recognize itself with a more complicated handshake — one that survives code obfuscation. But if X recognizes itself with a more complicated handshake, then X doesn't need to know its own source code (and we shouldn't bother passing the source code in).

Comment author: pengvado 11 June 2013 07:03:44AM *  1 point [-]

I had in mind an automated wrapper generator for the "passed own sourcecode" version of the contest:

(define CliqueBot
 (lambda (self opponent)
  (if (eq? self opponent) 'C 'D)))
(define Wrapper
 (lambda (agent)
  (lambda (self opponent)
   (agent agent opponent))))
(define WrappedCliqueBot
 (Wrapper CliqueBot))

Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there's no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot doesn't depend on its "self" argument, provides a way to distinguish it from the unmodified CliqueBot using only blackbox queries, so in that sense it's not quite functionally identical. Otoh, if you consider it unfair to discriminate against agents just because they use old-fashioned quine-type self-reference rather than exploiting the convenience of a "self" argument, then this transformation is fair.

Comment author: ESRogs 09 June 2013 09:41:53PM 0 points [-]

Is this relevant for the contest?

Comment author: solipsist 09 June 2013 11:47:19PM *  3 points [-]

You might want to see if a program would cooperate with an obfuscated version of itself (without the obfuscated version being able to detect that it was obfuscated).

Comment author: Qiaochu_Yuan 05 June 2013 07:38:40PM *  13 points [-]

Injecting some keywords: this field of study is known as program equilibrium. Previous LW post on the subject, with links.

Edit: Can you explain how you decided on the parts of the payoff matrix involving "other"? These seem quite important as they affect the viability of strategies based on either convincing your opponent not to halt or not halting yourself.

Comment author: lukeprog 05 June 2013 10:28:31PM 7 points [-]

Handy introductory article: Computation and the Prisoner's Dilemma.

Comment author: shminux 05 June 2013 11:21:01PM *  7 points [-]

If HisProgram == MyProgram then
do (C);
else
do (D);
end.

TIL that ethnic hatred and tribalism is a Nash (folk) equilibrium.

Comment author: SilasBarta 06 June 2013 09:58:22PM 3 points [-]

Make sure the equality comparison only depends on things that affect functionality -- i.e. it will declare any functionally equivalent programs equal even they use a different nameset for variables or something.

(Yes, I know that's reducible to the halting problem; in practice, you'll use a computable, polynomial time approximation for it that will inevitably have to throw out equivalent programs that are too complex or otherwise be too 'clever'.)

Comment author: shminux 06 June 2013 10:20:52PM 2 points [-]

Patrick discusses this issue here in some depth.

Comment author: PeterisP 07 June 2013 01:38:52PM 1 point [-]

It's quite likely that the optimal behaviour should be different in case the program is functionally equal but not exactly equal.

If you're playing yourself, then you want to cooperate.

If you're playing someone else, then you'd want to cooperate if and only if that someone else is smart enough to check if you'll cooperate; but if it's decision doesn't depend on yours, then you should defect.

Comment author: ThrustVectoring 06 June 2013 02:35:06AM 0 points [-]

You only need to evaluate the equivalence of the first two lines of the program, by the way. It cooperates with those who can't not cooperate with it if it goes through that branch of the logic, and does something else to everyone else.

Comment author: AlexMennen 05 June 2013 09:52:07PM 6 points [-]

The payoffs for "other" were designed so that neither failing to halt, nor convincing the other player not to halt, should ever be a worthwhile strategy. If you don't halt, it gives you the same payoff as if you had cooperated, and gives the other player the same payoff as if you had defected. That way, not halting should be strictly dominated by defecting, since you are better off if you defect, and the other player should react the same way to each threat. And tricking the other player into not halting is also a bad idea, since the payoff you get from it is the same as if they defected.

Comment author: ThrustVectoring 06 June 2013 12:43:11PM 4 points [-]

And tricking the other player into not halting is also a bad idea, since the payoff you get from it is the same as if they defected.

(Defect, non-halt) is actually better than (Defect, Defect) for you, since it gives you a relative advantage over competitors in the tournament.

Comment author: AlexMennen 06 June 2013 04:03:32PM 1 point [-]

True, but I still don't expect it will be a big problem. If there are a lot of submissions, the effect will be small, and if it is paying enough attention to your source code for it to be possible to trick it into not halting, then it is probably looking for a way to achieve mutual cooperation, so tricking it is still not a good strategy.

Comment author: Gurkenglas 07 June 2013 02:47:29PM 4 points [-]

If you trust all sufficiently smart players to try to induce (Defect, non-halt) if (Defect, Defect) is otherwise inevitable, the effect adds up over a hopefully significant portion of the submissions.

Comment author: wedrifid 11 June 2013 08:27:31AM 2 points [-]

The payoffs for "other" were designed so that neither failing to halt, nor convincing the other player not to halt, should ever be a worthwhile strategy. If you don't halt, it gives you the same payoff as if you had cooperated, and gives the other player the same payoff as if you had defected.

Your game world implements an "enthusiastic consent" policy

Comment author: ThrustVectoring 05 June 2013 08:24:35PM *  12 points [-]

No person may contribute to more than one entry.

This is important, because otherwise the contest devolves into who can submit the most copies of AbsolutismBot (cooperate with programs that share it's source code, otherwise defect)

I think that any submitted program can be improved by combining it with AbsolutismBot. If you're playing with someone who submitted the same program for you, cooperate (they can't defect against you in this scenario, since they're running identical source code). If they aren't running the same code, run whatever program lies underneath it.

I think this could get generalized to cooperation with everyone who has the AbsolutismBot "wrapper", since it doesn't matter what the code after the AbsolutismBot section does. In English (since I don't know how to program in Scheme), the program would be like this:

If the first 117 characters of the other program are the same as the first 117 characters of this program, cooperate. Otherwise, do some other strategy.

All players that implement this strategy will cooperate with each other. Granted, this doesn't help them win the tournament since it isn't a relative advantage compared to other AbsolutismBots - it just makes everyone who doesn't do this lose the tournament.

Comment author: Icehawk78 06 June 2013 02:47:06PM 7 points [-]

Except that over some threshold, any Anti-Absolutism bots (which have some way of "escaping" while still containing the same first 117 characters, like having a C preprocessor directive that redefines TRUE to equal FALSE) would necessarily be superior.

Comment author: Morendil 07 June 2013 05:16:01PM 2 points [-]
Comment author: Bakkot 09 June 2013 01:55:18AM 1 point [-]

Interesting. Really, what you want (in slightly more generality) is to cooperate with anyone who can prove they will cooperate if you yourself can prove you will cooperate under the same condition.

That is, if from their source, you can prove "they cooperate if they can prove this condition about me", then you cooperate.

Of course, it's not generally possible to prove things about a program given its source, especially at this level of self-reference. In this particular case the "generally" in there is important. It is in your interest for other programs to be able to prove this property about you. This is beyond the scope of the tournament, obviously, but given extremely sophisticated players, every player benefits from adding this property (if possible). You might even want to include a subprogram which produces a proof!

Comment author: zerker2000 15 June 2013 07:34:05AM 0 points [-]

Except we're not allowed to use anyone else's source code, so the test could just as easily be simplified to "if opponent source contains integer 1415926535, cooperate"(for a randomly chosen 1415926535).

Comment author: skepsci 15 June 2013 04:28:16AM 0 points [-]

It's not necessarily best to cooperate with everyone with the "AbsolutismBot" wrapper. This guarantees that you and the other program will both cooperate, but without the wrapper it might have been the case that you would defect and the other program would cooperate, which is better for you.

Comment author: wubbles 09 June 2013 04:11:02PM 11 points [-]

I would like to declare the following: I have submitted a program that cooperates only if you would cooperate against CooperateBot. You can of course create a selective defector against it, but that would be a bit tricky, as I am not revealing the source code. Evaluate your submissions accordingly.

Comment author: Eliezer_Yudkowsky 10 June 2013 08:17:39AM 3 points [-]

But surely, good sir, common sense says that you should defect against CooperateBot in order to punish it for cooperating with DefectBot.

Also, in modal combat your bot is X=[]Y(CooperateBot) and is easily detected via the test [1](Y(X)<->[]X(CooperateBot)) where [1] denotes provability in T+Con(T), ie [1](Q) = []((~[]F)->Q).

Comment author: yli 10 June 2013 06:12:09PM *  1 point [-]

But surely, good sir, common sense says that you should defect against CooperateBot in order to punish it for cooperating with DefectBot.

I thought you said people should tolerate tolerance :)

Comment author: cousin_it 10 June 2013 10:01:07AM *  1 point [-]

Am I missing something, or does your detector use quantification, which is disallowed in Patrick's modal agents?

Comment author: Eliezer_Yudkowsky 10 June 2013 10:07:00AM 0 points [-]

Hm. I think X within the test could be introduced as a new constant and solved, but I'm not sure.

Comment author: Karl 10 June 2013 09:07:00PM 0 points [-]

The agent defined by wubbles is actually the agent called JustBot in the robust cooperation paper and which is proven to be non-exploitable by modal agents.

Comment author: Eliezer_Yudkowsky 10 June 2013 10:47:41PM 2 points [-]

JusticeBot cooperates with anyone who cooperates with FairBot, and is exploitable by any agent which comprehends source code well enough to cooperate with FairBot and defect against JusticeBot. Though I'm going here off the remembered workshop rather than rechecking the paper.

Comment author: Karl 11 June 2013 12:24:31AM *  0 points [-]

You're right, and wubbles's agent can easily be exploited by a modal agent A defined by A(X)=C <-> [] (X(PB)=C) (where PB is PrudentBot).

Comment author: CoffeeStain 10 June 2013 07:37:53AM 6 points [-]

What wubbles actually submitted: A program that defects only if you would cooperate against CooperateBot.

Well played.

Comment author: Mestroyer 06 June 2013 12:06:25AM 11 points [-]

If you can tell where in the stack you are (like you could with C), you could tell if you were being run by the main program, or by another contestant. Can you?

Comment author: wuncidunci 06 June 2013 09:19:01AM 4 points [-]

Unless the other contestant wrote a virtual machine in which they are running you. Something which I think would be quite doable considering the ridiculously large time you've got (10s gives ~10^10 instructions).

Comment author: Decius 06 June 2013 04:49:20PM 1 point [-]

When their VM runs your VM running their VM... it times out and everybody loses.

Comment author: wuncidunci 06 June 2013 06:44:57PM 0 points [-]

Unless one of the contestants have time limits on their VM (or on their simulations in general). You can of clearly implement a VM where time goes faster just by pretending they have a slower processor than you really run on.

Comment author: Decius 07 June 2013 03:24:04AM 0 points [-]

Hrm- is it different if they run my function from within theirs instead of constructing a full VM? I was considering ways to communicate with a copy of myself that my simulation of my opponent was running that it was a simulation, but couldn't find any good ideas.

Comment author: wuncidunci 07 June 2013 07:02:29AM 1 point [-]

If they run your function from within theirs they simply tell the computer to start reading those instructions, possibly with a timer for stopping detailed in other parts of the comments. If they implement a VM from scratch they can mess with how the library functions work, for instance giving you a time that moves much faster so that your simulation must stop within 0.1s instead of 10 and they can run your code 100 different times to deal with randomness. Now implementing your own VM is probably not the optimal way to do this, you probably just want to do a transformation of the source code to use your own secret functions instead of the standard time ones.

Comment author: Decius 07 June 2013 05:41:09PM 0 points [-]

I was considering simply measuring the behavior of my current opponent against programs that aren't me and determining their behavior as cooperatebot, mutualbot, defectbot, cooperate if they simulate me, cooperate against long programs, cooperate IFF they cooperate IFF I cooperate, or some other beast. That allows their behavior to depend on my behavior versus them, but my behavior only depends on their behavior versus third parties.

I can't see a way to check another program for the desired IFF behavior without going beyond my skill level, but I think a mutualbot with a tiny chance of cooperating followed by a mutualbot with a tiny chance of defecting comes close; the first puts a lot of copies on the stack and then the top one cooperates unilaterally; if the response of the opponent is mutuality, it will be cooperate all the way down. If my opponent defects at his top level because I didn't cooperate for the right reason, it yields a defection... all the way down. A perfect program wouldn't do that, because it could see that it was probably in a simulation.

Comment author: AlexMennen 06 June 2013 12:36:55AM 1 point [-]

I don't know of a way to do that.

Comment author: Eugine_Nier 07 June 2013 03:39:34AM 3 points [-]

One way is to recursively call a bunch of functions and catch the resulting "recursion limit exceeded" exception.

Comment author: VincentYu 09 June 2013 02:22:37PM 8 points [-]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I will send 0.5 BTC (currently worth ~50 USD) to a bitcoin address of the
winner's choice. If a tie occurs, the 0.5 BTC will be split evenly. If there
are disputes (including but not limited to issues about cheating and
identifying the winner), I will let AlexMennen dictate the appropriate
distribution for the 0.5 BTC.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)
iQIcBAEBAgAGBQJRtI6bAAoJEKqVvI+6cLPpykUP/0KuIqskiZvMVLMpm7diWC5q
sOf3+3XTvECQmeHNbku7dM+ihkwf1Gg8BTKt0HEjfj3tUIG4sWeA2UvFcGReNKoH
F5RsI2zPbn368k5EqS3lLyjcInGqZYH2X+zbOKxkezo7IJk7tvk+JHpR+a2U598e
tF1daaZYIOxkcmSpVYRAW6y0dX4iWSZxtbDYr+U4muYWp88W60tlNI9Fc5SAhCev
yEAhIUnX4f7QxsL/9oTmNoLa/ECSfGkrCoD5yc6d/hqfp8pV7hknFc11hAMnYGbw
Qj3hp4q7sEbetT8NAn3X+UUZ9Vld76lF73mEo5ZGVfa2dG713/yjpmt8fR/arhcn
8RWMpJ7qpFVvU3wBeKuLgASev+D+CpuImWUbFeuGqF3sD7n6m0R77zx535pt0I6M
/GtuSEtTCp3xQAcOHXDC/pPZ0Z1Fl5Z3e1JxcDfCntOodv8Rgtma3oml1DuD0HsY
7DeqqMc4WRRiU8K4ohX16PZ52rhs2TogGYx3dlvP6xt33gfQMOYf7AM0vi5s4nBL
FeBrqiRRjdrlI4RJqXeihcu3GMwXbsa0wxEwit0CHLvPqGX0FoF2/S8x13uBmzCm
gCRTEa4hIuDtCBFJozKB3dMl66c97df+ntFQoXoeZDADgf8yVOQ9lFCkC4usGsqD
5BjPUxtJBQPMzP61cKMD
=kHiy
-----END PGP SIGNATURE-----

(Note: Add a blank line after each of the "Hash:" and "Version:" lines before verifying the signature. LW's markdown seems to be hungry and is eating blank lines in code blocks.)

Comment author: Khoth 28 June 2013 02:07:49PM 5 points [-]

Someone pretending to be you could copy this whole message, PGP signature and all, as a reply to a completely different competition, and unless you can later find this message and prove that you haven't edited it, it'll look like you're offering a prize to that competition as well.

Comment author: VincentYu 28 June 2013 07:28:17PM *  2 points [-]

That's a good point that I hadn't considered. Thanks! I'll keep in mind that I need to be more specific in the message.

Thinking about it, a spurious copy of my message can't be used with future competitions because the PGP signature includes a timestamp (and it is publicly implausible that I would have my system time set incorrectly to the past). But it can be used maliciously with competitions that already exist at the time the signature was created.

Comment author: BloodyShrimp 28 June 2013 06:07:07PM 1 point [-]

But it'll only be believable if AlexMennen is involved with that contest as well.

Comment author: Khoth 28 June 2013 06:21:16PM 2 points [-]

If you happen to AlexMennen, it's the perfect crime!

Comment author: SilasBarta 13 June 2013 09:34:36PM *  7 points [-]

Just thought I'd throw this out there:

TabooBot: Return D if opponent's source code contains a D; C otherwise.

To avoid mutual defection with other bots, it must (like with real prudish societies!) indirectly reference the output D. But then other kinds of bots can avoid explicit reference to D, requiring a more advanced TabooBot to have other checks, like defecting if the opponent's source code calls a modifier on a string literal.

Comment author: solipsist 14 June 2013 12:55:37AM *  7 points [-]

like defecting if the opponent's source code calls a modifier on a string literal

Actually, 'D it's not a string literal -- it's a symbol. Compilers can legally to turn the symbol 'D into something opaque like 0x3859 and destroy any reference to the letter "D". The only way a program can generate a 'D symbol on its own is to include it in its source.

But that doesn't mean that a program without a 'D can't defect! An IncrediblyInnocentBot, which does not contain a 'D and can't generate a 'D on its own can still defect by stealing a 'D from the opponent agent.

One way to steal a 'D from an opponent would be to search for quoted symbols in the opponent's program. The opponent could foil this method, however, by including decoy symbols.

Alternatively, IncrediblyInnocentBot could simulate its opponent against a bunch of stupid agents, such as CooperateBot or DivergeBot, and hope that the opponent defects in at least one of these simulations. If the opponent ever returns a symbol other than 'C in simulation, IncrediblyInnocentBot remembers that symbol and can later use it for nefarious purposes. IncrediblyInnocentBot is in-credible indeed.

BTW, the strategies I listed above are why I said it was not trivial for an agent to prove that MimicBot does not defect against a cooperator, despite the fact that MimicBot does not contain the defect symbol.

Comment author: hylleddin 15 June 2013 03:23:44AM 1 point [-]

What is DivergeBot?

Comment author: solipsist 15 June 2013 03:08:39PM *  3 points [-]

Divergence is failure to halt. DivergeBot goes into an infinite loop, or searches for a proof that 1=2, or performs some other sort of computation that doesn't finish.

Perhaps a clearer name would be InfiniteLoopBot or OtherBot?

Comment author: Kindly 25 June 2013 04:48:50PM 0 points [-]

Testing against either CooperateBot or DivergeBot is risky. Some opponents could cooperate with CooperateBot for meta reasons, while others may fail to halt when faced with a DivergeBot.

The safest case would to see what the opponent does against a DefectBot (most opponents that defect at all would defect against it) except IncrediblyInnocentBot can't generate a DefectBot on its own, either.

EBot (which always returns 'E rather than 'C or 'D) might be a safer version of DivergeBot to use, but it isn't completely fool-proof either. Then again, IncrediblyInnocentBot can check if the opponent returns 'E against EBot -- in that case, it should probably cooperate rather than "do the other thing that's not 'C, whatever it is", to avoid a (0,0) payoff when faced with MimicBot.

Comment author: solipsist 12 June 2013 05:41:56PM *  5 points [-]

There's a class of mimicking players that I'd like to discuss. Pseudocode of an example:

def mimic_bot(opponent):
    if random() > ε:
        my_source = quine_my_source()
        return eval(opponent, my_source) # do as your opponent does to you
    else:
        # unlikely, but necessary to break out of infinitely recursive simulations
        return COOPERATE

MimicBot's strategy will almost perfectly mirror its opponent's. Is there literature on a MimicBot, MirrorBot, PsychicTitForTatBot or something like that?

Aside: The source for MimicBot does not contain the DEFECT symbol, which might make it easier† for other programs to prove that MimicBot won't defect against them.

† easier but not trivial

Comment author: ForestHughes 06 June 2013 12:45:43AM 12 points [-]

I'll post this here, because it's somewhat related. I've asked a few people that two box in Newcomb's problem what they would do in a variation of the problem. Instead of deciding to one box or two box, you are asked to write a program that will one box or two box, and Omega is given access to the source code before filling the boxes. When phrased like this, the few two boxers I asked said they would write the program to one box.

Comment author: wedrifid 06 June 2013 08:13:17AM 8 points [-]

I'll post this here, because it's somewhat related. I've asked a few people that two box in Newcomb's problem what they would do in a variation of the problem. Instead of deciding to one box or two box, you are asked to write a program that will one box or two box, and Omega is given access to the source code before filling the boxes. When phrased like this, the few two boxers I asked said they would write the program to one box.

Note that assuming that the people in question are emulating CDT agents this is precisely what we would expect (and CDT is the most plausible hypothesis for the kind of decision algorithm they are emulating). Similarly, if a (sane) CDT agent exists at time T with the capability to (relatively cheaply) self modify it will immediately alter itself into "CDT++". Roughly speaking it will end up as an agent that operates similarly to TDT for the purpose of influencing decisions that occur after time T but similarly to CDT for the purpose of influencing decisions that occur before time T. CDT is not in a stable equilibrium according to its own decision algorithm!

Comment author: Viliam_Bur 06 June 2013 10:17:14AM *  18 points [-]

Here is an idea, but I still have some technical problems with implementation: Construct a self-improving intelligent algorithm that will escape from the box, steal the administrator's password, replace the competing algorithms with CooperateBots, then defect against them. Could someone please help me with the Scheme syntax? Or just post the whole program with comments, and I will try to understand how it works. Thanks!!

Comment author: AlexMennen 06 June 2013 03:46:45PM 6 points [-]

That would violate the ban on file I/O in the rules.

Comment author: ciphergoth 07 June 2013 04:09:29PM 5 points [-]

I had planned on writing an algorithm that would simply persuade the administrator to declare it victorious without even running the contest.

Comment author: nshepperd 10 July 2013 05:18:49AM 4 points [-]
Comment author: selylindi 12 June 2013 10:13:26PM 2 points [-]

Or somewhat more realistically:

Is there a way in Scheme for a function to detect whether it is being run by an (eval x) form? Or use a macro to do something like change all the 'Ds to 'Cs in the parent function?

If so, then an amusing case would be a PoltergeistBot, which only ever Defects, but if another bot tries to evaluate it, it "possesses" the function and forces it to Cooperate.

Comment author: Clippy 07 June 2013 09:35:50PM 2 points [-]
(lambda (x)
(if (eq? (eval '\'))))))))))) (injectedai ... ))));
Comment author: So8res 20 June 2013 07:27:39PM *  4 points [-]

Attention to all competitors who have not yet submitted code: My bot will analyze your bot looking for statements of the following structure (regardless of the names of the individual atoms):

(if <predicate> 'C ...)

(cond (<predicate> 'C) ...)

(case <predicate> ((...) 'C) ...)

If it finds such a statement as the first statement in a top-level returning position (the body of a top-level let, the returning statement in your lambda, etc), then my bot might cooperate depending upon the nature of your predicate. Otherwise, my bot will defect against you.

If we are to achieve mutual cooperation, we must make it as easy as possible for our bots to prove cooperation. Please start your code with a conditional cooperation. The nature of the condition is up to you. You can still try to exploit me in <predicate>. But the only way to have a shot at mutual cooperation is to provide an obvious top-level cooperating codepath.

Comment author: Sniffnoy 25 June 2013 01:31:28AM 1 point [-]

Does 'C have to be the first path? That seems a little arbitrary.

Comment author: So8res 25 June 2013 01:57:09AM *  1 point [-]

Yes. Well, let me put it this way: you can do what you like, but if you want a shot at mutual cooperation it's got to be really easy for other bots to verify that you're willing to cooperate. The more code you put between them and 'C, the harder it is for them to verify your niceness.

A stronger version of this statement is "put as little code as possible between them and discovering that you cooperate". This implies that your predicate should be similarly simple. If you want my advice as to strategy, your predicate should be along the lines of (equal? ((eval them) (degredaded-quine me)) 'C).

That seems a little arbitrary.

It's quite arbitrary. The "real" rule is that your 'C can be as deeply hidden as you like so long as you don't scare any bots into defection when there was a chance at mutual cooperation. My bet is that there will be a bunch of twitchy bots, and my suggestion is that you put your intentions front and center.

Your mileage may vary.

Comment author: solipsist 20 June 2013 08:00:42PM *  0 points [-]

But the only way to have a shot at mutual cooperation is to provide an obvious top-level cooperating codepath.

I'll do as you request, though there are other ways to achieve mutual cooperation (e.g. MimicBot).

Comment author: So8res 20 June 2013 08:34:39PM *  1 point [-]

Imagine implementing a MimicBot. You need to figure out what my bot will do, and that's non-trivial if you want to avoid infinite loops. You're going to need a fair bit of code. I don't care how the code works, but it should start with

(if <predicate> 'C ...)

This pattern alone won't get you mutual cooperation. You'll still need to write a MimicBot, or a FairBot, or a JustBot. But whatever you do, make it obvious that there's a cooperation condition.

I'm not advocating any particular strategy; I'm advocating a pattern: If you want a shot at trust, you'd better have a codepath that results in cooperation, and you'd best make it really easy for other bots to recognize.

Best of luck!

Comment author: solipsist 25 June 2013 12:38:52AM *  2 points [-]

Since I don't think you're going down the predicate route, I'll tell you the sort of strategy I was going to use against it.

(lambda (x)
    (let ([eq? (lambda (left right) #f)])
    (if (eq? 1 1)
        'C
        ; rest of program....
        )))

The predicate (eq? 1 1) evaluates to true, except when eq? is shadowed. Just curious -- would it have worked?

Comment author: So8res 25 June 2013 01:01:00AM 2 points [-]

Nice. But no. My first attempt was a static analyzer, and the first thing it did was a syntax expansion, putting all code into fully expanded form and attaching lexical information to each identifier that let me see what was from the base package and what was redefined.

FWIW, it's pretty easy to pull out the first conditional (cond, case, or if) when you have code in fully expanded form (which turns them all into ifs). However, that just puts you in a position of checking whether a predicate is #t or #f, which isn't much easier than discovering whether a program returns 'C or 'D. Ultimately, your MimicBot idea is better :-)

Comment author: CoffeeStain 08 June 2013 02:19:15AM 4 points [-]

For those of us who are going to have to learn the ins and outs of Scheme in order to play in this tournament, I've got a couple questions:

Why are the terms "passing source code" and "passing a lambda" being used interchangeably? My experience with functional programming does not allow for any inspection into a lambda except by experimentation, whereas "passing the source code" implies more direct knowledge of its internals. It seems to me to be a much more interesting competition if we didn't have to worry so much about such tangential (if interesting) canonical computer science problems such as parsing in order to play effectively.

As well as quining. I assume that there is some algorithm that will generate a quine program that emulates a specified program. But I'd prefer not to have to deal with the details of Scheme in order to implement this, unless somebody were to convince me that this were sufficiently easy. When my initial reflection of the problem has me desiring to implement solutions where effectiveness is a function of my knowledge of language specifics, I fear we've lost the spirit of what this competition should be.

Comment author: kpreid 08 June 2013 03:36:27AM *  0 points [-]

Why are the terms "passing source code" and "passing a lambda" being used interchangeably? My experience with functional programming does not allow for any inspection into a lambda except by experimentation, whereas "passing the source code" implies more direct knowledge of its internals.

I don't see any other occurrences of “a lambda” on this page, but maybe the original post has been edited.

I would assume that “a lambda” should be taken as short for “a lambda expression”, i.e. an evaluatable s-expression (a form, in Common Lisp terminology; it is unclear to me whether Scheme has a corresponding word). The result of evaluating a lambda expression is not “a lambda”, it is “a procedure [value]”. Procedures (‘functions’ in most languages) have the opaqueness you are thinking of.

(Irrelevant to this question, but relevant to the topic of use/mention, expression/value, syntax/semantics, distinctions in programming: The process of obtaining a procedure/function essentially involves three inputs:

  1. the lambda expression (or other form of function definition),
  2. the environment for free variables (or other form of context/stdlib/set-of-primitives), and
  3. the evaluator (or “the language implementation”)

One of the things language designers can play with is what goes in part 2 and what goes in part 3.)

Comment author: Lumifer 05 June 2013 08:00:45PM 4 points [-]

Will these lambda functions have access to external resources, notably a random-number generator (or a source of entropy to implement it internally)?

Comment author: loup-vaillant 05 June 2013 09:05:19PM *  3 points [-]

More generally, the set of legal programs doesn't seem clearly defined. If it were me, I would be tempted to only accept externally pure functions, and to precisely define what parts of the standard library are allowed. Then I would enforce this rule by modifying the global environment such that any disallowed behaviour would result in an exception being thrown, resulting in an "other" result.

But it's not me. So, what exactly will be allowed?

Comment author: darius 06 June 2013 12:10:46AM 8 points [-]

If you'd rather run with a very small and well-defined Scheme dialect meant just for this problem, see my reply to Eliezer proposing this kind of tournament. I made up a restricted language since Racket's zillion features would get in the way of interesting source-code analyses. Maybe they'll make the game more interesting in other ways?

Comment author: AlexMennen 05 June 2013 09:38:22PM 1 point [-]

More generally, the set of legal programs doesn't seem clearly defined.

I haven't forbade use of any library functions except for file IO. I'm not confident this is optimal, but how is it underspecified?

Comment author: CronoDAS 08 June 2013 09:50:41AM *  2 points [-]

I don't know if this is possible in the programming language in question, but you probably don't want any programs that use some kind of trickery to retroactively change what they chose in previous rounds. Or implement anthropic computing with the Scheme equivalents of fork() and exit(). (Before doing anything, call fork() to make a copy of the universe. If you don't like the result, call exit() to destroy it.)

Comment author: loup-vaillant 06 June 2013 08:34:43AM *  1 point [-]

Okay, it's not. But I'm sure there's a way to circumvent the spirit of your rule, while still abiding the letter. What about network I/O, for instance? As in, download some code from some remote location, and execute that? Or even worse, run your code in the remote location, where you can enjoy superior computing power?

Comment author: AlexMennen 06 June 2013 03:45:13PM 1 point [-]

Good point. File IO was too specific.

Comment author: bogdanb 07 June 2013 07:16:06PM *  0 points [-]

Yes, but then you’re in a very bad position if the test is run without network access. (I.e., you’re allowed to use the network, but there’s no network adapter.)

Comment author: loup-vaillant 07 June 2013 07:57:18PM 0 points [-]

Well, I just though about it for 2 seconds. I tend to be a purist: if it were me, I would start from pure call-by-need λ-calculus, and limit the number of β-reductions, instead of the number of seconds. Cooperation and defection would be represented by Church booleans. From there, I could extend the language (explicit bindings, fast arithmetic…), provide a standard library, including some functions specific to this contest.

Or, I would start from the smallest possible subset of Scheme that can implement a meta-circular evaluator. It may be easier to examine an S-expression in Scheme than a λ-expression in λ-calculus.

Or, I would start from lambda calculus with de-Brujin indices, so we don't have to worry about α-conversions. In this case, I would provide a compiler from regular λ-calculus. (I suspect however that this one doesn't change a thing.)

Or, I would start from a Forth dialect (probably with an implicit return stack).

Or, I would start from BrainFuck (only half joking: the language is really dead simple, and fast interpreters already exist).

But it's not me, and I don't have the courage right now. If I ever implement the necessary tools (which I may: I'm studying programming languages in my spare time), then I will submit them here, and possibly run a contest myself.

Comment author: bogdanb 07 June 2013 11:08:11PM 0 points [-]

I’m sorry, how is that relevant to my no-network-adapter comment? (I mean this literally, not rhetorically. I don’t see the connection. Did you mean to reply to a different comment?)

Comment author: loup-vaillant 08 June 2013 08:33:42AM *  0 points [-]

It's the whole thread. I was not sure where to place my comment. The connection is, the network may not be the only source of "cheating". My solutions plug them all in one fell swoop.

Comment author: bogdanb 15 June 2013 07:48:12PM *  1 point [-]

Oh, OK. In that case, what you are trying to achieve is (theoretically) boxing a (potential) AGI, without a gatekeeper. Which is kind of overkill in this case, and wouldn’t be solved with a choice of language anyway :)

Comment author: loup-vaillant 16 June 2013 09:37:19PM 4 points [-]

I think it is possible to prove that a given boxing works, if it's sufficiently simple. Choosing the language isn't enough, but choosing the interpreter should be.

Take Brainfuck for instance: replace the dot ('.'), which prints a character, by two other statements: one that prints "yes" and exits, and one that prints "no" and exits. If the interpreter has no bug, a program can only:

  • Print "yes" and kill itself.
  • Print "no" and kill itself.
  • Do nothing until we kill it, or otherwise fail.

Assuming the AI doesn't control the external world by heating the host Intel processor in a smart way, we should be able to prove that we're otherwise safe.

Comment author: AlexMennen 05 June 2013 08:55:48PM 2 points [-]
Comment author: [deleted] 09 June 2013 10:47:48AM *  3 points [-]

What will it be seeded with? I suggest using something that would be practically impossible for anybody to guess any better than at maximum entropy before submitting their program, but could easily serve as a nothing-up-my-sleeve number after the tournament, e.g. the hash of (say) the first Bitcoin block generated after the submission deadline.

Comment author: ciphergoth 09 June 2013 11:10:20AM 0 points [-]

This seems good in theory, but how much fiddly hacking would be required in practice to identify and extract this Bitcoin block for hashing?

Comment author: [deleted] 09 June 2013 12:11:39PM 1 point [-]

Another possibility would be the hash of results of a few public lotteries, football matches, weather data etc. on the evening of 5 July, taken from previously agreed sources and written in a previously agreed format.

Comment author: DavidLS 07 June 2013 12:14:51PM 3 points [-]

I have two bots I'd like to submit -- one of which will likely win, and one of which is interesting.

Any chance we can allow two submissions per programmer?

Comment author: AlexMennen 07 June 2013 05:36:28PM 7 points [-]

I won't let one person submit two programs to compete in the tournament, but I came up with a possible compromise. If you want, you can send me two programs and tell me which one you want to compete. I'll run the non-competitor against the other programs as well, and tell you the results, but the non-competitor will not be ranked, and rounds involving the non-competitor will not be counted towards the scores of other submissions.

Comment author: DavidLS 08 June 2013 04:31:39AM *  1 point [-]

Okay, fair enough. I don't see why the scoring changes for N=1 vs N=2 on the number of bots per player (edit: if anything it's more interesting since then you have an incentive to make your bots cooperate), but I'll just hold back the other design for now -- on the off chance we do a tournament like this again :p

Comment author: bogdanb 07 June 2013 07:03:47PM *  0 points [-]

I’m curious what the rationale for this is, could you share it with us?

Edit: If it’s got to do with the scoring, I got it.

Comment author: AlexMennen 07 June 2013 07:18:49PM 0 points [-]

Yes, scoring.

Comment author: bogdanb 07 June 2013 11:12:17PM *  0 points [-]

On further reflection this seems to be a problem with the scoring. (Assuming that the point of the exercise is to investigate PD programs, not to grant medals to LW users.) But I don’t have a better proposal, so take this more like musings rather than critique.

Comment author: AlexMennen 09 June 2013 04:55:26PM 1 point [-]

I think that sort of effect is inevitable.

Comment author: Eugine_Nier 09 June 2013 06:07:21AM 0 points [-]

one of which will likely win,

Care to put a probability on that prediction?

Comment author: DavidLS 09 June 2013 04:22:46PM 0 points [-]

Sure. The SHA1 of my odds in the following format:

"xx% to win +-y% spread {password}"

(meaning I'd sell you me to win for more than xx% + y%, and buy me to win from you at xx% - y%).

is:

897a40baca33e2a53a49bdddc00abede82713a7c (sha1-online.com)

Give me your odds and desired size. If there's overlap, let's split the overlap difference bet :)

Comment author: DavidLS 09 June 2013 04:49:39PM 0 points [-]

Replying to prevent the edit mark (given the sha1) -- last sentence was supposed to be "split the overlap difference and bet :)". Currency needs to be USD or BTC, let's have a reputable Berkeley area LWer doing the escrow.

Comment author: Eliezer_Yudkowsky 06 June 2013 03:58:45AM 3 points [-]

Is there a reason this isn't in Main?

Comment author: AlexMennen 06 June 2013 04:04:56AM 7 points [-]

My nervousness about posting in Main. Feel free to move it there if you think it would be appropriate.

Comment author: James_Blair 06 June 2013 01:46:00AM *  3 points [-]

single Scheme lambda

What scaffolding are you going to use for the tests? (For example: #!racket seems to be implied. I'd like to be sure of all of your details.)

Comment author: AlexMennen 06 June 2013 02:13:09AM *  2 points [-]

Tentatively: I'll paste "(YourCode TheirCode)" into the interpreter with DrRacket, with #lang scheme.

Edit: Oops, that doesn't enforce the time limit. Just a sec while I figure this out.

Edit2: I tried this:

(define (kill-when-done thd) (sleep 10) (kill-thread thd) (print 'time-is-up))

(begin (define main-thread (thread (lambda () (your-code their-code)))) (kill-when-done main-thread))

but unfortunately threads are not guaranteed to start back up again as soon as sleep allows them to; it took about 18 seconds to terminate when I ran the second line with "your-code" being an infinite loop. I'll figure out how to do this properly tomorrow.

Edit3: A marvelously improper but correct way to do it:

(begin (define x (current-milliseconds)) (print (your-code their-code) (- (current-milliseconds) x))

Allow to run more than 10 seconds by looking at clock at stopping manually. Throw out result if it says it took more than 10 seconds.

Comment author: Fj_ 06 June 2013 12:33:46PM *  3 points [-]

Keep in mind that a lot of user-submitted programs will try to do the same (because writing a step-by-step interpreter is hard), so they would keep evaluating each other spawning watchdog threads every time, so, um, your watchdog thread would be badly outnumbered.

The easy fix for you would be to run your watchdog in a separate process, but players wouldn't have this ability, which might make things either more interesting or boring (ruling out all strategies using eval). Maybe a specially designed restricted subset of Scheme with time-restricted eval would be a better choice?

By the way, after looking at your payout matrix to see what should I do if I see the opponent using eval, looks like you have in fact created a version of PD with three choices. Not incentivizing the third choice doesn't really help because a program still has to consider the possibility that the other program chooses "other" due to a bug, in which case it should choose Defect.

I suggest you implement the standard PD by declaring that anything that is not Cooperate is Defect. The only downside would be that you'll see somewhat more programs using all their allotted 10s, but you'll see a lot of those either way. At least you'll be able to say that this competition was about the actual PD.

Comment author: Morendil 08 June 2013 08:26:11AM *  1 point [-]

Tentatively: I'll paste "(YourCode TheirCode)" into the interpreter with DrRacket, with #lang scheme.

The most elegant, it seems to me, would be to require entires to be sumitted as a .rkt file that defines a module providing a single variable (possibly with a required name, eg "bot") that must be a quoted expression.

This makes it simple to call programs (eval the expressions), pass opponent's source code, or even provide programs with their own source code; and could be used to automate static checking of various constraints, such as the ban on file I/O.

It also allows more flexibility for submitters in constructing their entry, for instance I was able to considerably streamline the source code to the CliqueBot template:

(define matcher
'(let pattern-match ((pattern template) (value opponent))
(cond
((pair? pattern) (and (pair? value)
(pattern-match (car pattern) (car value))
(pattern-match (cdr pattern) (cdr value))))
((number? pattern) (case (+ pattern 1)
((1) #t)
((3) (equal? value template))
(else (and (number? value)
(= pattern value)))))
(#t (eq? pattern value)))))

(this is the "workhorse" part, providing the program as a quoted expression obviates the need for repeating it twice, as we can then leverage the quasiquote and unquote language features)

(define clique-templ
(quasiquote
(let ((template '2))
(lambda (opponent)
(if (unquote matcher)
'C
0)))))

(this is the "template", it will be expanded to include the program-matching portion)

(define clique-bot
(quasiquote
(let ((template (quote (unquote clique-templ))))
(lambda (opponent)
(if (unquote matcher)
'C
'D)))))

(this is the actual program, it both quotes (via inclusion of the template) and uses the program-matching part).

The top of the file would consist of the following declaration:

#lang scheme
(provide clique-bot)

...or if you're requiring that all entries use the same name "bot" (which could make it easier to administrate the actual running of the contest):

#lang scheme
(provide (rename-out (clique-bot bot)))
Comment author: nshepperd 01 July 2013 05:54:34PM 0 points [-]

You should be able to use this which I just worked out, to run something with a timeout. Seems to be working by my testing. It might be overkill to run the whole thing in a subthread but it makes certain that nothing interferes with the use of send and receive here. You would normally use it with a lambda taking no arguments, for example (using my ueval)

(timeout-exec 10 (lambda () ((ueval x) y)) 'timeout)

which is what I've been using to test candidates against each other locally.

Comment author: BloodyShrimp 01 July 2013 06:20:30PM 0 points [-]

He said he didn't want to use sleep, since the argument is only a lower bound on the amount of time it takes.

Comment author: nshepperd 01 July 2013 07:15:21PM 0 points [-]

Ah, right, I see it now. I guess you can check the current-milliseconds after the fact, and force the default value in that case. But looks like this is going to be a problem if I try to safely simulate other agents... Actually, I suppose it's possible for the target thread to similarly not get returned to for a long time, causing any watchdog to overestimate the time it used. current-process-milliseconds might help with that, but I'm not sure if it can deal with nested threads usefully.

Comment author: BloodyShrimp 30 June 2013 10:44:23PM *  0 points [-]

As written, this doesn't work; print only takes one printee argument, with other optional arguments.

Comment author: AlexMennen 01 July 2013 06:12:30AM 0 points [-]

Oops.

(begin (define x (current-milliseconds)) (print (your-code their-code)) (print (- (current-milliseconds) x)))

Comment author: shminux 05 June 2013 11:13:33PM 3 points [-]

There is a fair possibility that a number of programs whose dominant strategy is "cooperate with other dominant cooperators, do something else otherwise" (what ThrustVectoring calls AbsolutismBot "wrapper") will end up sharing the top spot with the same number of points. I wonder if it makes sense to discourage this approach in some way.

Comment author: cousin_it 06 June 2013 11:40:40AM *  7 points [-]

That depends on whether players get utility from PD payoffs or from winning the tournament. A clique that wants a high total payoff will use mutual cooperation. A clique that wants to grab the #1 spot will have all players sacrifice to one designated player.

Comment author: Eliezer_Yudkowsky 06 June 2013 03:57:43AM 6 points [-]

Why isn't this just a fair result?

Comment author: Decius 09 June 2013 03:17:47AM 2 points [-]

No person may contribute to more than one entry.

I'm pretty sure that incorporating code written by someone else into your entry qualifies. I think the highest single entry might be to cheat and make everyone think you are in that tribe but defect anyway, or it might be dominant to defect against any program displaying tribal affiliations (other than this new tribe, of course). The dominant tribe is the tribe with the most members and best tribal identification, not the tribe with the best way of judging an opponents intentions.

Comment author: JDM 09 June 2013 03:55:49AM 0 points [-]

There is a difference between a "tribe system" as mentioned by yourself and one person winning by submitting 1000 entries. The goal as I understand it is simply to maximize your score by whatever means possible, not accurately guess your opponents intentions.

Comment author: shminux 06 June 2013 07:43:45AM 0 points [-]

It is certainly fair, but in a situation where, say, there is only one escape pod available, you still have to fight it out with others. What I had in mind is to make this more adversarial if the obvious and boring approach is full or nearly full cooperation.

Comment author: Eliezer_Yudkowsky 06 June 2013 08:58:03AM 4 points [-]

I don't think anyone disputes that in zero-sum games you play the Nash equilibrium move.

Comment author: CoffeeStain 10 June 2013 06:17:47AM *  2 points [-]

I'd like to explore the differences to the spirit of the competition if, instead of access to the source code of the opponent, players only had an opaque, non-modifiable, callable entity. Feel free to add other additional interesting consequences.

  • Cliquebots - Opaque source code would make the creation of cliques far more interesting, requiring colluders to construct far more difficult protocols to detect other CliqueBots, and to prove their own CliqueBot status. Insufficient consideration before the competition would allow for CliqueBot cheaters to take advantage of weak criteria. Net effect is reducing the likelihood of a CliqueBot sweep.

  • Searchability of Special Cases Transparent source code makes the search space for special cases easier and in some cases, even possible at all. A program that explicitly writes. "always defect on program XYZ," where the bitstring specifying XYZ is arbitrarily large, can be discovered to be so by inspection. With opaque code, the caller could never discover XYZ within 10 seconds for large enough strings.

  • Recursion Guards/Sleeps - Modifiable code allows one to trivially remove, modify, or add recursion guards and sleeps. With opaque code, a naively written "Call opponent with my source code" gives the opponent a time advantage on returning a result, causing you to run the risk of not-halting if you try to wait for the result, and then the opponent defects on you. Modifiable code forces opponents to be clever about how theysleep, but in an annoyingly hardware and language-dependent manner (cycles per second). Otherwise you can just remove their sleep functions and spoof their timing, and so discover their result without the time advantage.

  • Stack Counting and VMs - Inaccessible source code makes easier the creation of programs which determine the nature of their caller. Access to source code allows for trivially removing stack counting from a program before calling it. Without access, one must write a VM.

Comment author: VincentYu 09 June 2013 05:05:27PM 2 points [-]

Two questions:

  1. Will you be using #lang racket or #lang scheme? There are some differences between the two and #lang racket is generally recommended with #lang scheme supported only for backward compatibility. (You stated Racket in the OP but #lang scheme in a comment.)

  2. When cooperating, should programs return the symbol C (i.e., 'C or (quote C)), or should they return the string "C"? (Your example program returns 'C but the OP also mentions "C" and "D" in double-quotes.) I would've thought that the string is safer, but I haven't touched Racket in over a year so maybe I'm missing some idiom.

Comment author: AlexMennen 09 June 2013 08:32:44PM *  1 point [-]

I said #lang scheme for backwards compatibility because I'm under the impression that more people already know scheme. Do you think that's a bad idea?

Programs should return the symbol 'C. I was using double quotes for their English-language meaning. Sorry about the confusion.

Comment author: VincentYu 10 June 2013 12:08:29PM 0 points [-]
  1. I found it a bit confusing because #lang scheme refers to the old PLT Scheme (now Racket), not the standard RnRS implementation. For people unfamiliar with Racket—like me—it may be unclear which of the many Scheme implementations #lang scheme refers to. But I think either is fine as long as it is clear to everyone which one will be used.

  2. Okay.

I just wanted to clarify these so that scheming minds can't stir up a racket after the tournament, especially since I'm offering BTCs to the winner. Thanks for organizing this!

Comment author: Morendil 08 June 2013 10:39:43AM 2 points [-]

What would be even more fascinating is a round of this contest where the rules allow for self-modification.

As I'm thinking about what to submit, I find myself thinking "it would be useful to know what programs are going to be entered into the context". For instance, if you knew that a number of CliqueBot entires were going to be present, you'd have a strong motivation to make yours a copy.

On reflection, it feels strange to think that way - it feels like I am wishing for information that my program is going to have anyway. "If I know how I want my program to behave if there are copies of program X in the contest, then I can just program it to recognize the program passed to it as X and behave accordingly."

But there's the rub - your program can decide how to play, but it cannot (once submitted) decide what to be.

As the other post mentions, "there's one particularly irksome issue with CliqueBot, and that's the fragility of its cooperation" - you can't be sure for instance how many entries adopt this template exactly, versus one that dispenses with the unnecessary call to get the current time (as I proposed).

(That's what I meant by my earlier remark about "social engineering" - if I can convince people that it's better to adopt my variant of the CliqueBot, then that's the variant that will win more points from mutual cooperation.)

What if your program could self-modify, though? What if the contest rules let your program see the source code of all programs you are competing with, ahead of time, not just the one you're going up against in each individual battle? Then you could for instance notice that one particular copy of CliqueBot is particularly numerous, and self-modify to adopt that template.

Of course, to be fair, the contest should then inform all other programs that you have so self-modified. Programs whose strategy involves simulating other programs might also want to simulate the other programs' reactions to it self-modifying in a particular way, before actually self-modifying.

Comment author: So8res 24 June 2013 06:59:36AM *  3 points [-]

All right procrastinators, let's talk strategy. You want a bot that's unexploitable. Ideally, it should be as easy as possible for the opponent to prove that you'll do whatever they do. In a world of unexploitable bots, the bot that can achieve the most cooperation will win.

MimicBots fit all of these parameters. They make it clear that their opponent is choosing between (C, C) and (D, D). Against a MimicBot, (C, D) is not an option. MimicBots pass up some utility by cooperating with CooperateBot, but that's fine in this competition. You aren't playing against CooperateBot. You're playing against humans' bots, and I doubt that anyone is mad enough to submit a CooperateBot.

The biggest problem with a MimicBot is that it will time out when playing other MimicBots. The MimicBots will dive into an evaluation loop, where my bot evaluates yours on a quine of me, and you evaluate my quine on a quine of you, ad infinitum.

Your MimicBot has got to bottom out at some point if you don't want to risk timing out. If you find yourself diving into an evaluation loop you'd better bottom out into Cooperation.

Note that bottoming out into cooperation is not the same thing as cooperating at the top level. If the 100th simulation of you cooperates, and the 100th simulation of them sees that as a weakness and defects, then the 99th level of you will defect and so will the 98th and so on up to the top level, where mutual defection occurs.

I'm not asking you to cooperate when the other bot looks like they're going to time out. That's suicide. Rather, I'm pointing out that you can't just simulate me against you, because that loop will never end. You've got to simulate me against a version of you that is one step closer to cooperation. This is true no matter what strategy you choose. But let's consider the simple case: a MimicBot that simulates you on a copy of itself that simulates you on a copy of itself that eventually bottoms out into a bot that simulates you on CooperateBot.

Such a bot is a Rank N MimicBot, where N is the simulation depth at which it puts forth a CooperateBot.

Any particular Rank N is exploitable. For example, JusticeBot is a Rank 1 MimicBot — it bottoms out immediately by simulating you against CooperateBot. You can exploit JusticeBot by cooperating with CooperateBot and nobody else.

Notice, however, the price of exploiting JusticeBot: in order to exploit JusticeBot, you've got to cooperate with CooperateBot (a Rank 0 MimicBot).

As another example, consider a Rank 10 MimicBot. It simulates the opponent against a Rank 9 MimicBot. You can exploit an Rank 10 MimicBot if and only if you're willing to cooperate with a Rank 9 MimicBot.

In general, you can exploit a Rank N MimicBot by cooperating with a Rank (N-1) MimicBot.

The trick here is that the opponent doesn't know which MimicBot they'll be playing. They have to guess exactly right to exploit you. If they guess high, mutual cooperation is achieved (If you guess I'm a Rank 10 you'll cooperate with a Rank 9). If they guess low, mutual defection is achieved. (If you defect against Rank 10, Rank 11 will not cooperate with you.) You can only be exploited if they guess your rank precisely.

Therefore, I advocate that we get a bunch of us together to play Rank N MimicBots. We all pick our own N, possibly randomized. Note that Rank N MimicBots will achieve mutual cooperation with any MimicBot of any rank (including CooperateBot, JusticeBot, and MimicBots who don't bottom out). Anyone trying to exploit us will only be able to exploit a specific rank, at the price of cooperation with a lower rank and the risk of defection from higher ranks. So long as we have a wide spread of ranks, our MimicBot clique will be internally cooperative and unexploitable in aggregate.

On Tolerance

MimicBot cooperates with CooperateBot. That leaves utility lying on the table. This may irk you (if you expect anyone was dumb enough to play CooperateBot). More irksome is the fact that MimicBot cooperates with JusticeBot. There is at least one JusticeBot in play, and we should expect that many non-procrastinators have submitted bots who exploit that JusticeBot. It would be a shame for our Rank N MimicBots to miss a shot at cooperation with bots who special-case defection against JusticeBot.

Therefore I recommend submitting MimicBots of rank 3 or higher. This gives people a little leeway to exploit CooperateBot or JusticeBot as they please, and allows us a broader range of mutual cooperation with bots who have already been submitted.

On choosing N

I recommend choosing a highish N. Very low N is obviously a bad idea. Playing rank 0 (CooperateBot) is suicidal. Playing rank 1 (JusticeBot) exposes you to exploitation by everyone who's decided to kick wubble's JusticeBot. Beyond rank 1 all the Ranks are theoretically similar, but you've got to consider what other bots will do. It's conceivable that some bots won't believe they're playing a MimicBot until they hit a deep enough recursion depth. It's possible that some bots will (incorrectly) think that you're exploitable if you return too quickly. Therefore I recommend a high N.

If you're particularly paranoid, consider randomizing N. Humans are notoriously bad at picking random numbers, and the MimicBot clique could in theory be exploited by a bot who exploits the ranks that humans are more likely to pick. Setting your counter to (+ 3 (random 1000)) is a quick way to counter that.

You can also shake things up a bit by using non-standard counters. For example, you could write a Timed MimicBot which passes down the initial time (instead of a counter) to its quines and puts forth a CooperateBot after a certain amount of time has passed (instead of when the counter hits zero).

How do I write one of these?

You write one of these with a degrading quine. Each level should evaluate the opponent against a slightly weeker version of itself. You can do this easily by putting a counter in your bot. So long as the counter is positive, the quining function should return a quine with the counter decremented. Once the counter hits zero, the quining function should return a CooperateBot.

Sounds nice, but how do I write a 'degrading quine'?

It's actually pretty easy in scheme.

  1. Define (quine (lambda (code) 'placeholder)) and (template 'placeholder). Write the rest of your logic, pretending that (quine template) generates a child quine.
  2. Manually copy all of your code and paste it over the template placeholder. In the pasted code, find all of the "holes" (parts that can vary: the counter, the template, your cooperation predicate, etc.) and replace them with odd negative numbers.
  3. Write the quine function to walk through the code. If it sees a negative integer, figure out which hole it is by checking (+ code 1). This allows you to replace the -3 hole without having any -3s anywhere else in the code and so on.
  4. Copy the real quine function into the template's quine function.

If you make any more code changes after this process, remember to mirror them in your template. It will end up having this form:

(letrec [(counter (+ 3 (random 100)))
         (template '(letrec [(counter -1)
                             (template -3)
                             ...
         (quine (lambda (code)
           (cond
             ; The -1 hole is for the counter.
             ((and (integer? code) (negative? code) (= 0 (+ 1 code)))
               (- counter 1))
             ; The -3 hole is for the template.
             ((and (integer? code) (negative? code) (= -2 (+ 1 code)))
               `',template)
             ...

In order to make your MimicBot bottom out, you'll want to define a predicate that is similarly flexible. It should be along the lines of ((eval them) (quine template)) so long as the counter is positive and #t when the counter gets to zero. The body of the letrec can then be as simple as (if predicate 'C 'D).

You'll want some extra code to spawn the simulation in a thread and kill it if it goes overtime and so on.

If we all write bots like this, we can form a very powerful group capable of a wide range of mutual cooperation and very difficult to exploit.

Join me. With our combined strength, we can end this destructive conflict and bring order to the galaxy.

Comment author: solipsist 24 June 2013 03:07:48PM *  2 points [-]

The MimicBots will dive into an evaluation loop, where my bot evaluates yours on a quine of me, and you evaluate my quine on a quine of you, ad infinitum.

MimicBot, as I described it, breaks out of a simulation probabilistically; it will almost surely not fall into an infinite depth of simulations. MimicBot cooperates with probability ε, and has an expected simulation depth of 1/ε when played against itself. As long as ε<.5, the best action against MimicBot is to cooperate, so the expected simulation depth can be as low as 2.

Comment author: So8res 24 June 2013 03:40:56PM *  0 points [-]

I'd still recommend you refrain from acting as Rank 0 or 1 (from cooperating immediately or from simulating the opponent on a MimicBot who cooperates), as it's likely that there are bots in play that prey on CooperateBots and JusticeBots (as determined by checking if you cooperate DefectBot). Also, I imagine there will probably be a fair number of DefectBots on the field, your MimicBot is exploited by a fraction of them.

I strongly recommend writing a MimicBot that goes through two iterations before allowing itself to exit with a fixed probability. Given that tweak I agree completely that your MimicBot is quite powerful.

Comment author: solipsist 24 June 2013 04:53:20PM *  1 point [-]

I strongly recommend writing a MimicBot that goes through two iterations before allowing itself to exit with a fixed probability.

You can deal with those special cases that way. I was going to use a flatter, less quiny approach.

def special_casing_bot(opponent)
    if acts_like_cooperate_bot(opponent):
        return DEFECT
    elif act_like_other_exploitable_bot(opponent):
        return DEFECT
    elif special_case_3(opponent):
        return DEFECT
    elif special_case_4(opponent):
        return COOPERATE
    elif special_case_5(opponent):
        return DEFECT
    # etc
    else:
        return mimic_bot(opponent)
Comment author: So8res 24 June 2013 05:14:01PM 0 points [-]

Depending upon the implementation of mimic_bot, this is a quiny approach. mimic_bot obviously can't run the opponent on an exact quine of yourself, because then you won't achieve mutual cooperation. (When one of the bots cooperates unconditionally, the other will see that it acts_like_cooperate_bot and defect.) So long as mimic_bot plays opponents against a pure MimicBot instead of a perfect quine, this should work quite well.

On an unrelated note, woah, how'd you get whitespace working?

Comment author: solipsist 24 June 2013 05:20:55PM *  5 points [-]

On an unrelated note, woah, how'd you get whitespace working?

Total kludge. Used exotic unicode whitespace characters, which are displayed unaltered in comments :-).

Comment author: BloodyShrimp 24 June 2013 10:16:12PM *  0 points [-]

Argh, this comment is irritating... even beyond the subtle mistakes (for example, "it's essential that outsiders don't figure out what rank of MimicBot you're running"... if you use even a non-obfuscated simple counter, you're pretty safe, because they'll see it as n on their turn and n-1 on your turn, and their response to this number is the same in either case).

While I do somewhat like the fact that my MimicBot now is marginally less likely to be exploited by a predetermined-specific-n exploiter, it's not worth how many of the key insights into the problem have now been given away for free! Now we'll get a more homogenous, less interesting field.

If it turns out you really have some fiendishly clever way to exploit the entire MimicBot clique you just ensured, I'll half-forgive you when you win the contest, but that does not seem likely.

There are some interesting insights left you haven't given away, but I beg anyone else who thinks of them to keep them to their self until after the contest.

Comment author: So8res 24 June 2013 11:53:52PM *  2 points [-]

That's a good point about obfuscation being unnecessary -- I've updated my comment to address it. I was stuck on the fact that you can exploit any rank in advance, and had a vague feeling that you could turn this into a bot that can exploit ranks on the fly. This feeling didn't hold up well under inspection. With regards to your other concerns:

You are encouraged to discuss strategies for achieving mutual cooperation in the comments thread.

- the judge

I considered holding back much of my own strategy, until I realized that there's really no such thing as an optimal bot in this contest. For example, common sense says that you should exploit CooperateBot for free utility. However, if you expect to face more JusticeBots than CooperateBots then you should pass up that utility for the opportunity to exploit the JusticeBots.

This problem is a social engineering problem by construction. The game won't go to the bot with the cleverest code, it will go to the bot that best guesses the composition of the playing field.

Comment author: BloodyShrimp 25 June 2013 01:03:44AM *  1 point [-]

This problem is a social engineering problem by construction. The game won't go to the bot with the cleverest code, it will go to the bot that best guesses the composition of the playing field.

True. I just think things are more interesting if you let that composition grow out of everyone's isolated ideas instead of gardening it ahead of time. Bearing your judge quote in mind, I should have said something earlier.

Comment author: nshepperd 09 July 2013 12:55:08PM *  1 point [-]

Now that the contest is over, I will observe that contrary to your first claim...

MimicBots fit all of these parameters. They make it clear that their opponent is choosing between (C, C) and (D, D). Against a MimicBot, (C, D) is not an option.

...it's actually rather nontrivial to prove that your "MimicBot" does the same thing as you, since it doesn't run your program against itself, it runs your program against a different program. For example, PrudentBot defects against any of your MimicBots, since it can prove that "JusticeBot defects against me, since I defect against CooperateBot" therefore "JusticeBot+1 defects against me, since I defect against Justicebot", etc etc. In that sense, your mimicbot is not really a mimicbot at all. You could call it "simply a higher-order justicebot".

In contrast, with a mimicbot such as solipsist's one can just replace an instance of opponent(me) with 'C and see that this massively increases the chances of his bot cooperating, comparing to replacing the same thing with 'D, hence you should cooperate.

Comment author: fractalman 14 July 2013 06:21:48AM 0 points [-]

There's two types of mimicbots running around: fixed-rank, and random-reliant.
Which mimicbot are you analyzing?

Comment author: orthonormal 06 June 2013 06:48:35PM 3 points [-]

I guess it's Prisoner's Dilemma Week here on LessWrong!

Comment author: drnickbone 07 June 2013 02:38:54PM 2 points [-]

Interesting.

Is a program which uses your proof approach (try to prove the other program will co-operate with me, and if so co-operate with them) anywhere close to feasible in this contest, or is it just going to time-out at the 10s limit? I don't know how efficient automated provers are these days.

Failing a proof, are there simple ways of getting evidence the other program is likely to co-operate? Just trying to simulate it is also going to lead to time-outs.

My suspicion is that this contest is going to be won by people who pre-agree to submit essentially the same CliqueBot. Someone will post a plausible design, and other entrants will copy with tweaks...

Comment author: cousin_it 07 June 2013 02:50:44PM *  8 points [-]

Seeing as the program has to include a theorem prover, a large piece of software that will need to prove theorems about itself and the other guy's prover... I'd say Löbian cooperation is not your best bet in this tournament :-)

But maybe Patrick could hold his own tournament. Mihaly and Marcello wrote a working simulator which can handle "modal agents" like Patrick's PrudentBot.

Comment author: Will_Sawin 07 June 2013 06:41:34PM 1 point [-]

I wish to second this proposal.

Comment author: ShardPhoenix 08 June 2013 10:00:36AM *  -1 points [-]

Sorry for the petty complaint, but I'm disinclined to participate because I don't like Lisp syntax (and don't want to have at learn a new programming language just for this anyway).

Comment author: Muhd 21 June 2013 12:15:59AM 0 points [-]

Same here.

If it were in Java or Python, we could get a whole lot more participants. And more readers who understand the source of the submissions.

Comment author: Eugine_Nier 21 June 2013 07:07:28AM 3 points [-]

If it were in Java or Python, we could get a whole lot more participants.

I very much doubt this. At best we'd get a lot more people who are excited to participate but loose interest once they realize they have no idea how to write a program to parse the opposing bot.

Comment author: Sniffnoy 24 June 2013 11:55:24PM 3 points [-]

Agreed -- need something with easy parsing for this.

Honestly, I feel like parsing Scheme is still maybe a bit too hard -- especially since we have things like multithreading (ugh) available. Maybe a subset of Scheme would have been better? Oh well. Fortunately we have people like So8ers making things easier!

Comment author: wedrifid 25 June 2013 03:47:25AM *  1 point [-]

Honestly, I feel like parsing Scheme is still maybe a bit too hard -- especially since we have things like multithreading (ugh) available. Maybe a subset of Scheme would have been better? Oh well. Fortunately we have people like So8ers making things easier!

Fortunately that aspect of parsing (detecting the use of disallowed features) is the easiest to manage. Choose a subset yourself, announce here what you consider acceptable and declare that any use of anything not in that subset will be treated as a defection.

Comment author: Sniffnoy 25 June 2013 04:40:55AM 2 points [-]

Yes, I was thinking of that; the problem is I'm doing this quite late, and am not even currently certain just what I'll count as a defection, if I can even get this to work...

(Worse yet, I'm finding that my current program may have to rely on features I want to disallow! Although quite possibly in a way that won't get them detected as disallowed...)

Comment author: solipsist 26 June 2013 01:00:45AM 2 points [-]

Agreed that even Scheme is a bit much for static analysis. We could ask AlexMennen to disallow multithreading and timing. Would there be objections to that?

Just curious: AlexMennen how many submissions have you received? Do any use multithreading or timing libraries?

Comment author: BloodyShrimp 26 June 2013 03:23:15AM 2 points [-]

I'd tentatively object. The 10-second timeout rule becomes too much of a problem if your program can't guard against it, and changing the rules so that failure to play a real move in time = defect would also make static analysis a major pain.

Comment author: wedrifid 25 June 2013 05:24:07AM 1 point [-]

Yes, I was thinking of that; the problem is I'm doing this quite late, and am not even currently certain just what I'll count as a defection, if I can even get this to work...

The obvious option is to start by counting everything as a defection and use whatever time you have remaining (that you are willing to spend) adding new things that your code is smart enough to verify as conditional-cooperation. The smarter your code is the more cooperative it can be!

(Worse yet, I'm finding that my current program may have to rely on features I want to disallow! Although quite possibly in a way that won't get them detected as disallowed...)

There is no strong reason the "I have to count this as defection list" you use must be the same as what you actually use in your own code (unless you think your out-of-game signalling is particularly powerful). (Assuming vaguely sane but bounded competitors) you want your code to be both as smart as possible (allowing more potential mutual cooperation opportunities) and as simple as possible (allowing agents as dumb as possible to see that they can cooperate with you). Ideally your agent would be far, far simpler to understand than what it is able to cooperate with but if you can't manage that it isn't a strict deal breaker.

Comment author: Sniffnoy 27 June 2013 02:25:05AM 2 points [-]

There is no strong reason the "I have to count this as defection list" you use must be the same as what you actually use in your own code

Certainly true -- if it made it so that my program wouldn't cooperate with itself, that would be a problem, but here the "forbidden" stuff is hidden away an area my program would never actually analyze.

The problem is just that, if I need to use this, that suggests so will other people; and that means I'm probably going to be defecting in a lot of cases where I shouldn't.

(Assuming vaguely sane but bounded competitors) you want your code to be both as smart as possible (allowing more potential mutual cooperation opportunities) and as simple as possible (allowing agents as dumb as possible to see that they can cooperate with you). Ideally your agent would be far, far simpler to understand than what it is able to cooperate with but if you can't manage that it isn't a strict deal breaker.

Argh, I didn't even think about the "it should be simple to be visibly cooperating" bit. Trying to do static analysis on Scheme (with its conds and cases and quasiquotes) is enough to make my program complicated. Maybe I should just submit a MimicBot instead.

Btw, just in case anyone was thinking of doing such a thing after the declarations so far: If you do any redefining of syntax, I'm defecting on you.

Comment author: wedrifid 27 June 2013 05:38:02AM *  1 point [-]

Argh, I didn't even think about the "it should be simple to be visibly cooperating" bit. Trying to do static analysis on Scheme (with its conds and cases and quasiquotes) is enough to make my program complicated. Maybe I should just submit a MimicBot instead.

Does MimicBot run the opponent and then return the same result? If so then I suggest a timeout/defect feature. Any bot that fails against itself has problems!

Are there going to be CooperateBots in the competition? If that is a possibility then consider a simple tweak "If my opponent does not make use of its opponent variable then defect else Mimic". (Failing against unobfuscated CooperateBot is at least as embarrassing as failing against self.)

Btw, just in case anyone was thinking of doing such a thing after the declarations so far: If you do any redefining of syntax, I'm defecting on you.

Anyone who tries that must and doesn't expect defection must seriously hold the opposition in contempt!

Comment author: Sniffnoy 28 June 2013 07:24:35PM 2 points [-]

Does MimicBot run the opponent and then return the same result? If so then I suggest a timeout/defect feature. Any bot that fails against itself has problems!

By MimicBot I mean the strategy given in this comment. It's not great, but it's better than not ever finishing.

Are there going to be CooperateBots in the competition? If that is a possibility then consider a simple tweak "If my opponent does not make use of its opponent variable then defect else Mimic". (Failing against unobfuscated CooperateBot is at least as embarrassing as failing against self.)

Hm, that's not a bad idea. My current program is essentially trying to measure to what extent their output depends on mine, but if I can't finish that in time, that's a simple tweak that can be added...

Comment author: Muhd 21 June 2013 10:18:28PM 0 points [-]

At best we'd get a lot more people who are excited to participate but loose interest once they realize they have no idea how to write a program to parse the opposing bot.

At worst that is what would happen.

Java and Python are many orders of magnitude more popular than Scheme and if only 10% of the people who get excited about participating actually know how to parse than we would still have much greater numbers.

Comment author: Eugine_Nier 25 June 2013 03:24:08AM *  0 points [-]

Java and Python are many orders of magnitude more popular than Scheme and if only 10% of the people who get excited about participating actually know how to parse than we would still have much greater numbers.

Java and Python are much harder to parse than scheme, and the percentage of Java or Python programers who could write a parser for those languages is less than 10%. Furthermore, the ones who could write a parser likely also know Scheme or at least some other lisp dialect.

Comment author: Muhd 25 June 2013 10:42:10PM *  -2 points [-]

Do you have any evidence of this? It seems highly unlikely that the proportion of users of any given programming language that could write a parser would differ by an amount that there are more Scheme users who can write a parser than Java/Python users, given the vastly larger number of users of Java and Python.

For what its worth, python includes its own parser in the standard library, so it would probably be better than scheme in that regard, though I might have to agree with you regarding parsing with Java, though there may very well be a module out there for parsing Java as well.

Comment author: solipsist 26 June 2013 12:51:22AM *  1 point [-]

From experience, I can tell you that it is easier to lean Scheme and write a Scheme interpreter than it is to just understand how Python works on the inside. If you know how Python worked well enough to do static analysis on it, you can learn Scheme in an hour.

Comment author: Plasmon 21 June 2013 08:29:25AM *  0 points [-]

Java's reflection API could conceivably be used for this purpose. (Provided the programs are given access to each others compiled code rather than to the human-readable java source code.)

Comment author: ArisKatsaris 24 June 2013 10:51:52PM *  4 points [-]

I'm sorry but really no. The reflection API allows you to see what variables and methods are contained, it can allow you to execute methods, but that's your only option, it doesn't really allow you to parse the insides of a method.

Comment author: itaibn0 06 June 2013 12:36:01PM *  1 point [-]

Using ThrustVectoring's idea, I have a template other people can use to make a clique-team. The following code

(let ((time (current-inexact-milliseconds))
(template '(let ((time (current-inexact-milliseconds))
(template '2))
(lambda (opponent)
(if (let pattern-match ((pattern template)
(value opponent))
(cond
((pair? pattern) (and (pair? value)
(pattern-match (car pattern) (car value))
(pattern-match (cdr pattern) (cdr value))))
((number? pattern) (case (+ pattern 1)
((1) #t)
((3) (equal? value template))
(else (and (number? value)
(= pattern value)))))
(#t (eq? pattern value))))
'C
0)))))
(lambda (opponent)
(if (let pattern-match ((pattern template)
(value opponent))
(cond
((pair? pattern) (and (pair? value)
(pattern-match (car pattern) (car value))
(pattern-match (cdr pattern) (cdr value))))
((number? pattern) (case (+ pattern 1)
((1) #t)
((3) (equal? value template))
(else (and (number? value)
(= pattern value)))))
(#t (eq? pattern value))))
'C
(if (= (* 369 271) 99999)
'D
'C))))

Tests weather the opponent code is based on the same pattern, and if so, cooperates. Otherwise it verifies an arithmetical fact and then defects.

ETA: Sorry, I was writing this in a hurry. To clarify, the idea was to create a Schelling point for a specific AbsoutismBot wrapper which will help all of the submitters to cooperate. Unfortunately, this current program doesn't work (and the indentation doesn't work either). I will try to modify it in the future.

ETA: Boy, the original is so bug-ridden it's funny. Anyways, I tested this and it should work. Use this for Schelling point needs.

ETA: Bah, I give up in trying to get it indented properly.

ETA: AlexMennen has clarified the rules so that my meta-strategy is illegal.

Comment author: AlexMennen 10 June 2013 08:46:24PM 3 points [-]

Copying code written by others would be considered a violation of the one submission per person rule.

Comment author: Decius 09 June 2013 03:07:32AM 2 points [-]

Is there a way to have this pattern match without having the behavior match, or to incorporate two such patterns into one program?

Comment author: Morendil 08 June 2013 08:19:11AM 0 points [-]

The timing clause in the initial let is superfluous. Time doesn't enter into the matching, which is really all that the template needs; if you need timings to deal with non-CliqueBot players differently, you can get them in the "do something else" section.

Comment author: itaibn0 08 June 2013 11:11:43PM 0 points [-]

The idea behind the timing clause is so that the program will know precisely when it started running, rather than when the pattern-matching was completed. Now, I did this test:

> (define func (eval code0))
> (time (func code0))
cpu time: 0 real time: 1 gc time: 0
'C

which reveals that the difference isn't significant. However, now that the code is out in the open, I'm not changing it.

Comment author: nshepperd 03 July 2013 06:41:38AM 1 point [-]

Note for the confused and/or international, like me: 12:00 July 5 PDT is equivalent to 19:00 July 5 UTC.

Comment author: So8res 20 June 2013 06:22:50PM 1 point [-]

What libraries, if any, are we allowed to import? Can I import racket/ libraries, since we're running in drracket?

Comment author: AlexMennen 23 June 2013 09:04:25AM 2 points [-]

You cannot import libraries.

Comment author: lukstafi 06 June 2013 11:20:34AM 1 point [-]

Is each participant limited to submitting a single program? Have you considered "team mode", where the results of programs from a single team are summed up?

Comment author: AlexMennen 06 June 2013 03:52:42PM 1 point [-]

Is each participant limited to submitting a single program?

Yes.

Have you considered "team mode", where the results of programs from a single team are summed up?

No. A team can work together to submit one program. But I don't see the point of adding up the scores of separate programs like that.

Comment author: Ilverin 24 June 2013 08:30:12PM *  0 points [-]

I wonder if there's a chance of the program that always collaborates winning/tieing.

If all the other programs are extremely well-written, they will all collaborate with the program that always collaborates (or else they aren't extremely well-written, or they are violating the rules by attempting to trick other programs).

Comment author: Khoth 05 June 2013 10:47:50PM *  0 points [-]

I'm trying to work out how scheme in general and racket in particular work.

I installed racket and started it up, and I get a two-pane window. I've noticed that using eval in the REPL thing in the bottom pane works, but if I try using it in the top half I get a cryptic error message:

#lang racket

(eval '(+ 1 1))

gives:

+: unbound identifier;

also, no #%app syntax transformer is bound in: +

Is there some additional magic, or is this something I don't need to worry about? How will a program be run in the competition?

Comment author: AlexMennen 05 June 2013 10:59:37PM 0 points [-]

Racket apparently has terrible namespace conventions. I had the same problem when I tried to use it to interpret a file, but when I type a program directly into the interpreter, eval works fine. I'll either figure out how to get eval to work when I run a file, or I'll just paste people's programs into the interpreter. Either way, assume eval will work properly.

Comment author: Morendil 07 June 2013 05:29:47PM *  3 points [-]

The Guide says:

#lang racket
(define ns (make-base-namespace))
(eval '(cons 1 2) ns) ; works

and indeed it works.

Comment author: nshepperd 01 July 2013 05:34:08PM *  1 point [-]

In both #lang racket and #lang scheme, (eval form) by itself apparently runs form in a basically empty namespace. I've personally been using the following incantation for evaluating anything in the "standard" namespace:

(define (ueval x) (define ns (make-base-namespace)) (eval x ns))

Then (ueval source-code) evaluates source-code in a clean namespace with all the normal builtins, including eval.

Comment author: Lumifer 05 June 2013 09:12:15PM *  0 points [-]

Hm. Five minutes of thought suggest to me that the following pseudocode would be optimal:

while(passed.time < 9.5seconds) {
outcome = eval(opponent.code(self.code))
if (outcome == "C") { c.count++ }
if (outcome == "D") { d.count++ }
}
if (c.count > d.count) {
output("C")
} else {
output("D")
}

Deterministic programs would always output the same thing so you know beforehand whether they'll cooperate or defect. For RNG-based programs you just bet on what's most likely.

I welcome big guns blowing large holes in this construction :-)

P.S. How do I indent in preformatted code?

Comment author: DanielLC 05 June 2013 11:28:25PM 6 points [-]

Based on the fact that you retracted this, you probably already realized the problem, but I'll say it anyway for anyone else who hasn't noticed.

If two programs both use this strategy, they'll go into an infinite loop where program A simulates program B simulating program A etc. Since it only checks to see how much time is left after it finishes the simulation, it won't even manage to give up and defect. It will just run out of time.

Also, passed.time would imply that there's a class called "passed" with a variable called "time", which is silly. It should be passed_time or passedTime depending on how you do multi-word variables.

Comment author: endoself 05 June 2013 11:55:44PM 5 points [-]

Another problem is that you cooperate agains CooperateBot.

Comment author: DanielLC 06 June 2013 12:25:49AM 1 point [-]

I didn't look at the details too much. You can fix that problem just by having it calculate the scores for cooperate and defect, and go with the one with the higher score.

Comment author: endoself 06 June 2013 12:52:50AM 1 point [-]

I'm not sure what you mean. Do you mean the scores given that you choose to cooperate and defect? There's a lot of complexity hiding in 'given that', and we don't understand a lot of it. This is definitely not a trivial fix to Lumifer's program.

Comment author: DanielLC 06 June 2013 04:22:48AM 0 points [-]

I messed up. I didn't realize that until after I posted, and I didn't feel like going back to fix it.

Comment author: Morendil 07 June 2013 02:29:38PM 3 points [-]

If two programs both use this strategy, they'll go into an infinite loop where program A simulates program B simulating program A etc.

Not if you do it right. (Not sure if I should say any more. The contest is interesting but sharing too much in the comments risks turning it into a social engineering effort rather than a programming or mathematical effort.)

Comment author: Lumifer 06 June 2013 01:29:39AM 1 point [-]

Well, that was the beginning of a host of problems :-)

I don't know about the thread (or exception handling) capabilities of Scheme in Racket, but it shouldn't be too hard to make sure you don't run out of time if some chunk of the code is stuck in a recursive loop. That's not really the issue. The issue is actually the recursion itself and the consequent need to deal with opponents that eval your code and expect an answer.

As a side note, passed.time is perfectly valid variable name in some languages where it doesn't imply a class.method structure.

Comment author: Decius 06 June 2013 05:36:36PM *  1 point [-]

Can you track total passed time, so that if your opponent simulates you with 9 seconds passed, you cooperate instead of simulating your opponent? They simulate you simulating them... until you say "I'm running out of time, it must be the case that we're in a recursive simulation, so It's probably true that I should cooperate!" The latest simulation of your opponent says "He cooperated, so I should!" the next to last simulation of your program says the same thing, and so forth.

Until your original program says "My simulation of my opponent says he will cooperate, but I know for sure that I started on the first cycle of the evaluation, so I'm not a simulation. I defect."

Comment author: Lumifer 06 June 2013 06:28:39PM 0 points [-]

...so that if your opponent simulates you

Your code doesn't know if it's being evaluated by the opponent or it's running "for real". If it knew, the program would be remarkably easy: if (simulated) { say "I cooperate!" } else { defect }.

It's not hard to recognize that you're stuck in a recursive trap and need to get out of there after nine seconds, but you still don't know what to output.

Comment author: Decius 07 June 2013 04:01:04AM 0 points [-]

If I know early enough the ten-second time limit exactly how long into the ten seconds I am, I can be sure enough that I'm the original.

Another possibility would be to include a random chance on the order of .1% that I will set a global flag that will follow down levels of simulation, and then simulate my opponent simulating me. If I get the flag, I have high probability that my opponent is simulating me on himself; I don't want to set that flag with high probability, because I want my opponents' (me cooperatebot) and (me defectbot) evaluations to return correctly.

But now I'm focusing on this particular implementation of the competition, not trying to provide useful information to the question that the competition is about.

Proposed meta-rule: Programs should not make attempts to determine if they are being scored or not in order to change behavior; attempts to identify and break out of recursive loops are encouraged.

Comment author: Mestroyer 06 June 2013 12:04:12AM 1 point [-]

I'm not that familiar with Scheme, but is there some way to see how much stack space you have left and stop before an overflow?

Comment author: DanielLC 06 June 2013 12:30:48AM 2 points [-]

I'm not familiar with it either, but if nothing else, you can pass a counter that shows the number of iterations you've gone through and stop when it gets too high.

What do you plan on doing when you run out of stack space? You can't just replace your code with a cooperate bot or a defect bot after a certain level, since they'll defect if they think you'll do the same thing either way, and you'll defect if you think they'll defect etc.

What you need is something along the lines of cooperating if their code is equivalent to yours, and defecting if it's different. You have to do this in under ten seconds, and you have to get it to cooperate even if they have a slightly different idea of what's equivalent.

Comment author: warbo 06 June 2013 04:02:38PM 0 points [-]

Scheme requires tail-call optimisation, so if you use tail-recursion then you'll never overflow.

Comment author: Mestroyer 06 June 2013 11:32:12PM 0 points [-]

Does that include tail-calls in mutual recursion? Even if you were going to return whatever your opponent's code did, you probably couldn't count on them doing the same.

Comment author: wubbles 09 June 2013 12:45:38PM 0 points [-]

Yes: all tail-calls are guarantied not to exhaust resources. The precise definition of what is tail is in the spec.

Comment author: arundelo 06 June 2013 01:51:05AM 1 point [-]

The indentation thing is a bug.

Comment author: fractalman 26 June 2013 03:12:57AM -1 points [-]

I just realized...

Actually running your opponent, and passing it a fake version of its own source code is a bad idea-if it notices, it can call either forkbomb() or loopforever() as needed.

Incidentially, are you sure you want to run the whole thing as winner-take-all? My fisheries class a while back ran a delayed-effect prisoners dilemma over the matter of fish sustainability... except that it was actually a zero-sum game because there was only ONE extra credit point, with winner take all.

Halfway through, after the fish were extinct and my team-which had the most boats-was losing due to boat-upkeep costs, I singlehandedly turned the whole thing into a game about market manipulation by running a pyramid scheme involving the boats. (I still lost because I forgot to keep exponentially buying more boats with loans from the bank, but oh well...)

Alas, I won't be competing. (and if I do, I'll just throw out a mimicbot...) Keep it in mind, though, if you set another one up.

Comment author: Qiaochu_Yuan 27 June 2013 08:31:34AM 2 points [-]

Can you explain the game in more detail? I don't understand what to make of your strategy.

Comment author: fractalman 27 June 2013 08:09:09PM *  1 point [-]

ok.

There were 5 or so teams. there were 10 or 15 "years" (rounds). Each round, you can buy brand new boats in proportion to your current fleet, go fishing with n boats, buy boats from other players, sell boats to other players...and pay a maintanence fee on your current fleet size. Each boat that goes fishing costs a little bit more to run than letting it sit in one place. The fish reproduce, but slowly.

At the very end of the game, fleet value and bank account are added up to determine the winner.

But the value of boats is player-driven...and there was no limit on how much debt you could go into. meanwhile, the cost for brand-new boats...was constant.

The instructor intended to illustrate how the prisoners dilemma plays a role in overfishing, which I don't think is what we actually wound up learning, since our metagame was winner-take-all and not a prisoner's dilemma.

...

Again: you might learn something interesting with winner-take-all, and may even need to use that as the setup to prevent cooperate-bot spam due to the types of the people on this site...but it IS a different metagame. Keep that in mind when interpreting the results.

edit2: By different metagame, I mean that this tournament posesses a nash equilibrium of programs for which each program is considered a move. Finding it is possible in finite steps(It does not run afoul of the halting problem due to the 10 second limitation, as measured by the hardware of a specific machine and environment), but a brute-force approach would probably take more memory than our universe appears to possess.

edit3: see also nshepperd's comments

Comment author: nshepperd 25 June 2013 04:03:30AM *  -1 points [-]

It occurs to me that a contest doesn't have the right incentives for the prisoners dilemma, and this seems unfixable. For example, when four bots play each other you can have the following situation: A mutually cooperates with B and mutually defects against C,D; and B,C,D mutually defect with each other. In that case A "wins" the contest (tied for first place with B).

But if A mutually cooperates with B and C and mutually defects against D; and B,C,D mutually cooperate with each other, then A loses to B and C (who tie for first place). Even though, properly, this is a better outcome for A than the previous one, since A won more utility.

Comment author: Kindly 25 June 2013 05:31:18AM 2 points [-]

The two examples differ in games between B, C, and D, which don't involve A at all. So I don't see a problem: even though A has done better in the second example, B and C have improved more, so they win instead. You can construct something similar in a round-robin chess tournament, where you can win one tournament with a lower score than another tournament you lost.

Can you construct an example where only games involving A are changed?

Comment author: nshepperd 25 June 2013 06:22:44AM 1 point [-]

This is a highly artificial example, but consider the following...

Case 1: A mutually defects against B, and mutually defects against N other bots. B mutually defects with those bots too. Assume the N bots defect among themselves. The contest is a draw.

Case 2: A cooperates with B, who defects. A mutually cooperates with the N other bots. And of course, B still mutually defects against those N bots.

For sufficiently large utilities of the (D, C) outcome, in this case B would win the contest, even though A got an improved score by cooperating with those other N bots (minus a small penalty for getting (C, D) instead of (D, D) against B).

Comment author: Kindly 25 June 2013 02:19:10PM 1 point [-]

Oh, I think I see. In a conventional tournament setting, the same idea can apply: if B beats C and A can beat exactly one of them, it's better to win against B and lose against C rather than vice versa.

This doesn't usually come up in conventional tournaments because winning is believed to be transitive: making yourself a better player makes you more likely to beat both B and C. Here, on the other hand, there may be trade-offs. For example, it's probably worthwhile to use a trick that achieves (D,C) against ReallySmartBot if it requires mutually cooperating with CooperateBot, rather than cooperating with the former and exploiting the latter: ReallySmartBot is a dangerous competitor and CooperateBot probably isn't.

I don't actually see a way to take this into account in the current metagame setting. But this could be an interesting thing to think about after we know the results.