Comment author: alexg 10 March 2014 08:55:45AM 0 points [-]

I can't believe that this one hasn't been done before:

Unless you are Eliezer Yudkowsky, there are 3 things that are certain in life: death, taxes and the second law of thermodynamics.

Comment author: engineeredaway 11 September 2011 04:36:33AM 8 points [-]

I've written a pretty good program to complete variant 3, but I'm such a lurker on this site that I lack the necessary karma to submit it as an article. So here it is as a comment instead:

Inspired by prase's contest (results here) and Eliezer_Yudkowsky's comment, I've written a program that plays iterated games of the prisoner's dilemma, with the caveat that each player can simulate his opponent. I'm now running my own contest. You can enter your own strategy by sending it to me in a message. The deadline to enter is Sunday, September 25. The contest itself will run shortly thereafter.

Each strategy plays 100 iterations against each other strategy, with cumulative points. Each individual iteration is called a turn; each set of 100 iterations is called a match (thanks, prase, for the terminology). Scoring is CC: (4,4), DC: (7,0), CD: (0,7), DD: (2,2).

Every turn, each strategy receives a record of all the moves it and its opponent have made this match, a lump of time to work with, and its opponent's code. Strategies can't read enemy code directly, but they can run it through any simulations they want before deciding their move, within the limits of the amount of time they have to work.

Note that strategies cannot pass information to themselves between iterations or otherwise store it, other than through the record of decisions. They start each turn anew. This way any strategy can be simulated with any arbitrary record of moves without running into issues.
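A minimal sketch of that interface in Python (the real contest runs in MATLAB, and these function names and signatures are my own illustration, not the contest API): each turn a strategy sees only the two move records and its opponent as a callable, and keeps no other state, so any invented history is as good as a real one.

```python
def titfortat(my_hist, opp_hist, opponent=None):
    """Cooperate on the first turn, then copy the opponent's last move."""
    return 'C' if not opp_hist else opp_hist[-1]

def simulate(strategy, own_record, enemy_record, sub_strategy=None):
    """Run one turn of `strategy` against any devised record.

    Because strategies keep no state between turns, a fabricated
    history is indistinguishable from a real one."""
    return strategy(own_record, enemy_record, sub_strategy)

# Probe titfortat with a fabricated record where the enemy just defected:
simulate(titfortat, ['C'], ['D'])   # -> 'D'
```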

Strategies in simulations need an enemy strategy passed into them. To avoid infinite recursion of simulations, they are forbidden from passing themselves. They can have as many secondary strategies as they need, however. This creates a chain of simulations where strategy(1) simulates enemy(1), passing in strategy(2), which enemy(1) simulates and passes in enemy(2), which strategy(2) simulates and passes in strategy(3). The final sub-strategy passed in by such a chain must be a strategy which performs NO simulations.

For example, the first sub-strategy could be an exact duplicate of the main strategy, except that it passes in the titfortat program instead. So mainstrategy => substrategy1 => titfortat.
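One way to sketch that chain in Python (the signature is invented for illustration: a strategy takes its own record, the enemy's record, and the enemy's code as a callable; when it simulates the enemy, it hands over the sub-strategy it is allowed to pass, never itself):

```python
def titfortat(my_hist, opp_hist, opponent=None):
    # The non-simulating strategy that terminates the chain.
    return 'C' if not opp_hist else opp_hist[-1]

def make_simulator(passed_in):
    """Build a strategy that simulates its enemy, handing it
    `passed_in` (never itself) as the enemy's opponent, which
    avoids infinite recursion."""
    def strategy(my_hist, opp_hist, opponent):
        # Ask the enemy for its move this turn; from inside the
        # simulation the enemy sees `passed_in`, not `strategy`.
        prediction = opponent(opp_hist, my_hist, passed_in)
        return prediction  # here: simply mirror the predicted move
    return strategy

# mainstrategy => substrategy1 => titfortat, as in the example:
substrategy1 = make_simulator(titfortat)
mainstrategy = make_simulator(substrategy1)
```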

You can of course use as many different sub-strategies as you need; the programs are limited by processing time, not memory. Strategies can run their simulated opponent on any history they devise, playing whichever side they choose.

Strategies can't read the name of their opponent, see the number of strategies in the game, watch any other matches or see any scores outside their current match.

Strategies are not cut off if they run out of time, but both players will receive 0 points for the turn. The decisions of the turn will be recorded as normal.

I never figured out a good way to keep strategies from realizing they were being simulated simply by looking at how much time they were given. Not knowing how much time they have would make it prohibitively difficult to avoid timing out. My hack solution is to not give a definitive amount of time for the main contest but instead a range: from 0.01 seconds to 0.1 seconds per turn, with the true time known only to me. This is far from ideal, and if anyone has a better suggestion I'm all ears.

To give a reference: a strategy that runs 2 single-turn simulations of titfortat against titfortat takes, on average, 3.5×10^-4 seconds. Running only 1 single-turn simulation took only 1.5×10^-4 seconds. titfortat by itself takes about 2.5×10^-5 seconds to run a single turn. Unfortunately, due to factors outside of my control, MATLAB (the software I'm running) will for unknown reasons take 3 to 5 times as long as normal about 1 time out of 100. Leave yourself some buffer.

Strategies are NOT allowed a random number generator. This is different from prase's contest but I would like to see strategies for dealing with enemy intelligence trying to figure them out without resorting to unpredictability.

I've come up with a couple of simple example strategies that will be performing in the contest.

Careful_cheater:

Simulates its opponent against titfortat to see its next move. If the enemy defects, Careful_cheater defects. If the enemy cooperates, Careful_cheater simulates the enemy's next move after that, with a record showing Careful_cheater defecting this turn, passing in titfortat. If the opponent would have defected on the next turn, Careful_cheater cooperates, but if the opponent would have cooperated despite Careful_cheater's defection, it goes ahead and defects.

Simulations show it doing evenly against titfortat and its usual variations, but Careful_cheater eats forgiving programs like titfor2tats alive.
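My attempt at reconstructing Careful_cheater as Python (the signature is invented: a strategy takes its record, the enemy's record, and the enemy's code as a callable):

```python
def titfortat(my_hist, opp_hist, opponent=None):
    return 'C' if not opp_hist else opp_hist[-1]

def careful_cheater(my_hist, opp_hist, opponent):
    """Defect whenever the enemy won't punish it next turn."""
    # Simulate the enemy's move this turn, passing in titfortat.
    enemy_now = opponent(opp_hist, my_hist, titfortat)
    if enemy_now == 'D':
        return 'D'
    # Enemy cooperates: would it retaliate next turn if we defect now?
    enemy_next = opponent(opp_hist + [enemy_now], my_hist + ['D'], titfortat)
    # Retaliator: stay honest. Forgiver: exploit it.
    return 'C' if enemy_next == 'D' else 'D'
```

Against titfortat the retaliation is visible in simulation, so it cooperates; against a forgiving titfor2tats the simulated defection goes unpunished, so it defects, matching the behaviour described.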

Everyman:

Simulates 10-turn matches of its opponent against several possible basic strategies, including titfortat, titfortat_optimist (cooperates the first two turns), and several others, then compares the scores of each match and plays as the highest-scoring one.
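A sketch of Everyman in Python, using the payoff table from the rules; the candidate list, names, and signature are my own illustration:

```python
# Payoff table from the rules: (my points, their points).
SCORES = {('C', 'C'): (4, 4), ('D', 'C'): (7, 0),
          ('C', 'D'): (0, 7), ('D', 'D'): (2, 2)}

def titfortat(my_hist, opp_hist, opponent=None):
    return 'C' if not opp_hist else opp_hist[-1]

def titfortat_optimist(my_hist, opp_hist, opponent=None):
    # Cooperates on the first two turns, then plays tit for tat.
    return 'C' if len(opp_hist) < 2 else opp_hist[-1]

def sim_match(candidate, opponent, turns=10):
    """Score `candidate` over a simulated `turns`-turn match."""
    c_hist, o_hist, score = [], [], 0
    for _ in range(turns):
        c_move = candidate(c_hist, o_hist, opponent)
        o_move = opponent(o_hist, c_hist, candidate)
        score += SCORES[(c_move, o_move)][0]
        c_hist.append(c_move)
        o_hist.append(o_move)
    return score

def everyman(my_hist, opp_hist, opponent):
    """Play this turn as whichever candidate scores best in simulation."""
    candidates = [titfortat, titfortat_optimist]  # "and several others"
    best = max(candidates, key=lambda c: sim_match(c, opponent))
    return best(my_hist, opp_hist, opponent)
```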

Copy_cat:

Switches player numbers and simulates its opponent playing Copy_cat's record, passing in its opponent, and then performs that action. That is to say, it sees what its opponent would do if it were in Copy_cat's position and up against itself.

This is basically DuncanS's strategy from the comments. DuncanS, you're free to enter another one since I stole yours as an example.
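Under the same kind of invented Python signature (a strategy takes its record, the enemy's record, and the enemy's code), Copy_cat is almost a one-liner; note that it passes the opponent into the simulation, not itself, so the no-self-passing rule is respected:

```python
def copy_cat(my_hist, opp_hist, opponent):
    """Ask what the enemy would do in our seat, facing itself,
    and do exactly that."""
    # Switch player numbers: the enemy gets Copy_cat's record as its
    # own, and is handed its own code as the strategy to simulate.
    return opponent(my_hist, opp_hist, opponent)
```

Against an unconditional cooperator it cooperates, and against DefectBot it defects, since that is what each would do in its place.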

And of course, titfortat and strategy "I" from prase's contest will be competing as well.

One strategy per person. All you need for a strategy is a complete written description, though I may message back for clarification. I reserve the right to veto any strategy I deem prohibitively difficult to implement.

Comment author: alexg 27 November 2013 04:22:43AM *  0 points [-]

Here's a very fancy cliquebot (with a couple of other characteristics) that I came up with. The bot is in one of 4 "modes" -- SELF, HOSTILE, ALLY, PASSWORD.

Before turn 1:

Simulate the enemy for 20 turns against DCCCDDCDDD, Cs thereafter. If the enemy responds with 10 Ds followed by CCDCDDDCDC, then change to mode SELF. (These are pretty much random strings -- the only requirement is that the first begins with D.)

Simulate the enemy for 10 turns against DefectBot. If the enemy cooperates in all 10 turns, change to mode HOSTILE. Otherwise, be in mode ALLY.

In any given turn,

If in mode SELF, cooperate always.

If in mode HOSTILE, defect always.

If in mode ALLY, simulate the enemy against TFT for the next 2 turns. If the enemy defects on either of these turns, or defected on the last turn, switch to mode HOSTILE and defect. Exception: if it is move 1 and the enemy will play DC, then switch to mode PASSWORD and defect. Defect on the last move.

If in mode PASSWORD, check whether the enemy's moves have varied from DCCCDDCDDD, Cs thereafter, beginning from move 1. If so, switch to mode HOSTILE and defect. Otherwise defect on moves 1-10, play CCDCDDDCDC on moves 11-20 respectively, and defect thereafter.

Essentially designed to dominate in the endgame.
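The pre-game self-recognition check can be sketched like this (Python; the signature and helper names are my own, though the strings are the ones from the description):

```python
PROBE    = list("DCCCDDCDDD")  # played for 20 turns: these 10, then Cs
RESPONSE = list("CCDCDDDCDC")  # the reply expected on turns 11-20

def is_self(opponent):
    """Simulate the enemy for 20 turns against the probe string and
    check for the 10 Ds + RESPONSE handshake."""
    my_hist, opp_hist = [], []
    for turn in range(20):
        # The enemy moves knowing only the turns already played.
        opp_hist.append(opponent(opp_hist, my_hist, None))
        my_hist.append(PROBE[turn] if turn < 10 else 'C')
    return opp_hist == ['D'] * 10 + RESPONSE
```

A copy of the bot, seeing the probe from move 1, answers with the handshake and lands in mode SELF; anything else falls through to the DefectBot test and ends up HOSTILE or ALLY.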

Comment author: RichardKennaway 13 November 2013 12:27:34PM 0 points [-]

Rationality is not about maximising the accuracy of your beliefs, nor the accuracy of others. It is about winning!

I don't think you understand what "rationality is about winning" means. It is explained here, here, and here.

Comment author: alexg 13 November 2013 12:41:29PM *  0 points [-]

Possibly I used it out of context. What I mean is that utility(less crime) > utility(society has inaccurate view of justice system) when the latter has few other consequences, and rationality is about maximising utility. Also, in the Least Convenient World, this trial will not affect any others overall, hence negating the point about the accuracy of the justice system. Here knowledge is not an end, it is a means to an end.

Comment author: alexg 13 November 2013 12:33:03PM 2 points [-]

G'day

As you can probably guess, I'm Alex. I'm a high school student from Australia and have been disappointed with the education system here for quite some time.

I came to LW via HPMoR which was linked to me by a fellow member of the Aus IMO team. (I seriously doubt I'm the only (ex-)Olympian around here - seems just the sort of place that would attract them). I've spent the past few weeks reading the sequences by EY, as well as miscellaneous other stuff. Made a few (inconsequential) posts too.

I have very little in the way of controversial opinions to offer (relative to the demographics of this site), as just about all the unusual positions it takes are ones I already agreed with (e.g. atheism) or that seemed pretty obvious to me after some thought (e.g. transhumanism). Maybe it's just hindsight bias.

I'm slightly disappointed with the ban on political discussion. I do agree that it should not be mentioned when not relevant, but it seems a shame to waste this much rationality in one place by forbidding its use where it's most needed. A possible compromise would be to create a politics discussion page to discuss the pros and cons of particular ideologies. (If one already exists, point me to it.) A reason cited is that there are other sites to discuss politics -- if any do so rationally, I'd like to see them.

It is a relief to be somewhere where I don't have to constantly take into account inferential distance, and I shall try to make the most of this. I only resolve to write just that which has not been written.

Comment author: RichardKennaway 13 November 2013 11:25:36AM 1 point [-]

... which would decrease their confidence in the justice system if he is set free...

By condemning X, I uphold the people's trust in the justice system, while making it unworthy of that trust. By condemning Y, I reduce the people's trust in the justice system, while making the system worthy of their trust. But what is their trust worth, without the reality that they trust in?

If I intend the justice system to be worthy of confidence, I desire to act to make it worthy of confidence. If I intend it to be unworthy of confidence, I desire to act to make it unworthy of confidence. Let me not become unattached to my desires, nor attached to what I do not desire.

Also, there is no Least Convenient Possible World. The Least Convenient Possible World for your interlocutors is the Most Convenient Possible World for yourself, the one where you get to just say "Suppose that such and such, which you think is Bad, were actually Good. Then it would be Good, wouldn't it?"

Comment author: alexg 13 November 2013 11:51:27AM *  0 points [-]

You're kind of missing the point here; I probably should have clarified my position more. The reason I want people to trust the justice system is so that people will not be inclined to commit crimes, because it would then be more likely (from their point of view) that, if they did, they would get caught. I suppose there is the issue of precedent to worry about, but the ultimate purpose of the justice system, from the consequentialist viewpoint, is to deter crimes (by either the offender it is dealing with or potential others), not to punish criminals. As the offender is, by assumption, unlikely to reoffend, everyone else's criminal behaviour is the main factor here, and this is minimised through the justice system's reputation. (I also should have added the assumption that attempts to convince people of the truth have failed.) By prosecuting X you are achieving this purpose. The Least Convenient Possible World is the one where there's no third way, or additional factor (I hadn't thought of), that lets you get out of this.

Rationality is not about maximising the accuracy of your beliefs, nor the accuracy of others. It is about winning!

EDIT: Grammar. EDIT: The point is, if you would punish a guilty person for a stabler society, you ought to do the same to an innocent person, for the same benefit.

Comment author: alexg 13 November 2013 10:50:29AM 0 points [-]

Test for Consequentialism:

Suppose you are a judge deciding whether person X or Y committed a murder. Let's also assume your society has the death penalty. A supermajority of society (say, encouraged by the popular media) has come to think that X committed the crime, which would decrease their confidence in the justice system if he is set free, but you know (e.g. because you know Bayes) that Y was responsible. We also assume you know that Y won't reoffend if set free because (say) they have been too spooked by this episode. Will you condemn X or Y? (Before you quibble your way out of this, read The Least Convenient Possible World.)

If you said X, you pass.

Just a response to "Saddam Hussein doesn't deserve so much as a stubbed toe."

N.B. This does not mean I'm against consequentialism.

Comment author: alexg 12 November 2013 11:03:39PM 1 point [-]

I'm not sure if anyone's noticed this, but how do you know that you're not a simulation of yourself inside Omega? If he is superintelligent, he would compute your decision by simulating you, and you and your simulation would be indistinguishable.

This is fairly obviously a PD against said simulation -- if you cooperate in PD, you should one-box.

I personally am not sure, although if I had to decide I'd probably one-box.