Comment author: [deleted] 10 September 2011 01:44:37PM 0 points [-]

Thanks for the link, but it still doesn't make sense to me (I've tried to understand what this qualia thing is a few times before, and I am still baffled about what it is and why everyone other than me thinks it's real).

Comment author: red75 11 September 2011 04:11:52AM 1 point [-]

Maybe it's better to start from obvious things. Color experience, for example. Can you tell which traffic light is illuminated without using the position of the light and without asking yourself which color it is? Is there something in your perception of the different lights that allows you to tell that they are different?

Comment author: JoshuaZ 05 September 2011 06:24:52PM 2 points [-]

I'd be very interested in seeing the same sort of thing where the number of rounds is randomly chosen and the strategies don't know how many rounds they are going to play.

Note that the CliqueBots' failure is not a good reason to think that CliqueBot-like strategies will fail in evolutionary contexts in real life. In particular, organisms generally have a lot of cues for telling how closely related something is to them (such as skin coloration). In this sort of context, CliqueBots lose in part because at one point they need to defect against themselves to perform an identity test. I suspect that if strategies were told in advance whether they were going up against themselves, then CliqueBots would do a lot better.

P seems to be a much better-designed CliqueBot than Q, especially for large sets. Five defections is a lot to burn, and in fact P dies out well before Q leaves.

One other thing that might be interesting, as a way to see what sorts of environments particular strategies do well in, is to never reduce the number of copies of any strategy to 0, keeping 1 as the minimum.

Comment author: red75 05 September 2011 07:08:04PM *  2 points [-]

I think communication cost isn't the main reason for P's failure. O, for example, defects on the last 3 turns even when playing against itself (rule 1 has the highest priority). The reasons are too-harsh punishment of other strategies (and consequently of itself) and a too-strict self-identity check.

The strategy I described here should perform much better when n is in the range 80-95.

Comment author: wedrifid 03 September 2011 04:14:42AM 1 point [-]

Is there any nation that "rationally and selfishly follows its collective interest"?

It is safe to say that there isn't. The rest of us would have been left behind or overwhelmed within months.

Comment author: red75 03 September 2011 04:48:15AM *  1 point [-]

Huh? Do you think that selfishness unambiguously means: dominate Earth (or what's left of it) as fast as possible?

Comment author: prase 01 September 2011 08:17:17PM 0 points [-]

Define strategy S[n] as TfT until turn n, defecting from then on. In the limit of an infinite population with a non-zero initial number of S[n] for each n, S[0], i.e. DefectBot, eventually dominates. Starting with equal subpopulations, the most successful initially is S[99], which preys on S[100] and finally drives it to extinction. But then S[98] gains an advantage over S[99], and so on.

With a not-so-big population, however, the more defectorish strategies die out before the environment becomes suitable for them. (I have done this with a population of 2000 strategies, and the lowest surviving strategy after several hundred generations was S[80] or so.)
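The exploitation ladder described here can be sketched in a few lines. This is a minimal sketch, not the actual simulation from the thread: the 100-turn match length and the S[n] definition come from the comments, while the payoff values (T=5, R=3, P=1, S=0) and the replicator-style update are my assumptions.

```python
# S[n]: tit-for-tat until turn n, defect from turn n on (S[0] = DefectBot).
# Payoff values T=5, R=3, P=1, S=0 are assumed; 100 turns as in the thread.

TURNS = 100
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def s_move(n, turn, opp_hist):
    """Move of S[n] on a 1-indexed turn."""
    if turn >= n:
        return 'D'
    return 'C' if not opp_hist else opp_hist[-1]   # tit-for-tat

def match(m, n):
    """Total payoffs of S[m] vs S[n] over TURNS turns."""
    h1, h2, s1, s2 = [], [], 0, 0
    for t in range(1, TURNS + 1):
        a, b = s_move(m, t, h2), s_move(n, t, h1)
        pa, pb = PAYOFF[(a, b)]
        s1 += pa; s2 += pb
        h1.append(a); h2.append(b)
    return s1, s2

def evolve(shares, generations=100):
    """Replicator dynamics over strategy shares, e.g. {99: 0.5, 100: 0.5}."""
    ns = sorted(shares)
    pay = {(m, n): match(m, n)[0] for m in ns for n in ns}
    for _ in range(generations):
        fit = {m: sum(shares[n] * pay[(m, n)] for n in ns) for m in ns}
        z = sum(shares[m] * fit[m] for m in ns)
        shares = {m: shares[m] * fit[m] / z for m in ns}
    return shares
```

With these payoffs, match(99, 100) shows the exploitation step: S[99] grabs the temptation payoff on turn 99, scoring 300 against S[100]'s 295, so under the replicator update its share grows at S[100]'s expense, exactly the ladder described above.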

Comment author: red75 01 September 2011 08:30:58PM *  2 points [-]

Try another strategy: I[n] plays TfT until turn n, defects on turn n, and on later turns checks whether the result on turn n was (defect, defect); if so, it plays TfT, otherwise it defects. The idea is self-cooperation.
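A sketch of I[n], under one reading of the rule (my assumption): once the mutual defection on turn n confirms the opponent is a copy, the strategy forgives that defection and cooperates on turn n+1, since literally copying the last move would lock two copies into mutual defection. Payoff values T=5, R=3, P=1, S=0 are also assumed.

```python
# I[n]: tit-for-tat until turn n, defect on turn n, then resume tit-for-tat
# if and only if turn n came out (defect, defect); otherwise defect forever.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def i_move(n, turn, my_hist, opp_hist):
    if turn < n:                                  # plain tit-for-tat
        return 'C' if not opp_hist else opp_hist[-1]
    if turn == n:                                 # identity test: defect once
        return 'D'
    if (my_hist[n - 1], opp_hist[n - 1]) == ('D', 'D'):
        if turn == n + 1:
            return 'C'                            # forgive the test defection
        return opp_hist[-1]                       # resume tit-for-tat
    return 'D'                                    # opponent failed the test

def self_play(n, turns=100):
    """I[n] against a copy of itself: total payoff for each side."""
    h1, h2, s1, s2 = [], [], 0, 0
    for t in range(1, turns + 1):
        a = i_move(n, t, h1, h2)
        b = i_move(n, t, h2, h1)
        pa, pb = PAYOFF[(a, b)]
        s1 += pa; s2 += pb
        h1.append(a); h2.append(b)
    return s1, s2
```

Under these assumptions, two copies of I[50] lose only the single handshake turn (298 each over 100 turns, versus 300 for uninterrupted mutual cooperation), which is the self-cooperation idea; the weakness is that any strategy that also defects on turn n passes the identity test without being a copy.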

Comment author: prase 01 September 2011 05:04:01PM 1 point [-]

Yes. It would depend on the concrete implementation and the number of players. The numbers needed for DefectBots to actually thrive are very big, and it would take a long time to converge to Nash, so I was perhaps a bit overconfident in my prediction.

Comment author: red75 01 September 2011 08:03:46PM *  3 points [-]

Moreover, simulations I ran using your rules for the evolutionary tournament show that one strategy quickly dominates and the others go extinct. DefectBot is among the fastest strategies to go extinct (even in the presence of CooperateBot), as it feeds off over-altruistic strategies, which in turn fail to compete with tit-for-tat. So I doubt that the evolutionary tournament, at least, will converge to Nash.

I predict that a strategy which plays tit-for-tat for 99 turns and defects on the 100th will win the evolutionary tournament, given that tit-for-tat is also in the population.

ETA: I've sent another strategy.

Comment author: red75 31 August 2011 07:22:59AM *  0 points [-]

In the meantime I've run my own simulation, studying a group of strategies which play tit-for-tat except that on one specific turn they defect, and then use the result of that turn to decide whether to switch to always-defect or to continue tit-for-tatting. Thus they recognize copies of themselves and cooperate with them. Such a strategy can be exploited by switching to always-defect before it does, or by mimicking its behavior (a second defection check after the first; this case I didn't analyze).

This leads to interesting results in the evolutionary tournament. The second-fairest strategy (the one with the second-longest period of tit-for-tatting) wins. It outperforms less fair strategies through its longer cooperation with the fairest strategy, and it outperforms the fairest strategy by exploiting it.

In response to Modularity and Buzzy
Comment author: red75 07 August 2011 06:41:26PM 0 points [-]

Then the goal of lesswrong (in this framework) seems to be to make the brain act as if it contains a command-and-control center which corrects for errors caused by other parts of the brain. And the list of errors includes the idea that the brain contains a command-and-control center. Sophisticated.

Comment author: AdeleneDawner 05 June 2011 01:56:41PM 3 points [-]

We may stare at the empty plane and ask ourselves whether this is the graveyard of a superintelligence that once lived here, conquered the plane for a brief time, then vanished in a collapse. Several gliders and roses could be everything that remained, like dry fossils.

Or, we could find that the playing field stabilizes to something that can easily be interpreted as a superintelligence's preferred state - perhaps with the field divided into subsections in which interesting things happen in repeated cycles, or whatever.

Comment author: red75 15 June 2011 01:19:09AM 0 points [-]

I wonder why a rational consequentialist agent should do anything but channel all available resources into the instrumental goal of finding a way to circumvent heat death. Mixed strategies are obviously suboptimal, as the expected utility of circumventing heat death is infinite.

Comment author: red75 23 December 2010 12:16:02PM *  0 points [-]

Below is a very unpolished chain of thought, based on a vague analogy with the symmetric state of two indistinguishable quantum particles.

When a participant is told ze is a decider, ze can reason: let's suppose that before the coin was flipped I changed places with someone else; will it make a difference? If the coin came up heads, then I'm the sole decider and there are 9 swaps which make a difference in my observations. If the coin came up tails, then there's one swap that makes a difference. But if a swap doesn't make a difference, it is effectively one world, so there are 20 worlds I can distinguish, 10 of which correspond to my observations: 9 have probability (measure?) 0.5 * 0.1 (heads, I'm a decider), and 1 has probability 0.5 * 0.9 (tails, I'm a decider). Consider the following sentence as edited out. What I designated as P(heads) is actually the total measure (?) of the worlds the participant is in. All these worlds are mutually exclusive, thus P(heads) = 9 * 0.5 * 0.1 + 1 * 0.5 * 0.9 = 0.9.

What is the average benefit of "yea"? 9 * 0.5 * 0.1 * $100 + 1 * 0.5 * 0.9 * $1000 = $495.

Same for "nay": 9 * 0.5 * 0.1 * $700 + 1 * 0.5 * 0.9 * $700 = $630.
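The arithmetic above can be checked mechanically. This is a plain transcription of the numbers in the comment; the variable names are mine.

```python
# 9 swap-worlds on heads with measure 0.5 * 0.1 each, and 1 world on tails
# with measure 0.5 * 0.9 (numbers taken directly from the comment above).
p_heads_part = 9 * 0.5 * 0.1            # 0.45
p_tails_part = 1 * 0.5 * 0.9            # 0.45
p_total = p_heads_part + p_tails_part   # 0.9, the P(heads) figure above

yea = 9 * 0.5 * 0.1 * 100 + 1 * 0.5 * 0.9 * 1000   # 45 + 450 = 495
nay = 9 * 0.5 * 0.1 * 700 + 1 * 0.5 * 0.9 * 700    # 315 + 315 = 630
```

On this accounting "nay" comes out ahead, $630 to $495.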

Comment author: cousin_it 22 December 2010 03:50:42PM *  7 points [-]

And here's a reformulation of Counterfactual Mugging in the same vein. Find two subjects who don't care about each other's welfare at all. Flip a coin to choose one of them who will be asked to give up $100. If ze agrees, the other one receives $10000.

This is very similar to a rephrasing of the Prisoner's Dilemma known as the Chocolate Dilemma. Jimmy has the option of taking one piece of chocolate for himself, or taking three pieces and giving them to Jenny. Jenny faces the same choice: take one piece for herself or three pieces for Jimmy. This formulation makes it very clear that two myopically-rational people will do worse than two irrational people, and that mutual precommitment at the start is a good idea.
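The dominance structure in the Chocolate Dilemma can be written out explicitly. This is a hypothetical encoding of the payoffs as described; the function name and choice labels are mine.

```python
def pieces(mine, theirs):
    """Chocolate I end up with: 'take' = one piece for myself,
    'give' = three pieces handed to the other player."""
    return (1 if mine == 'take' else 0) + (3 if theirs == 'give' else 0)

# 'take' strictly dominates 'give' for a myopic player...
assert pieces('take', 'give') > pieces('give', 'give')   # 4 > 3
assert pieces('take', 'take') > pieces('give', 'take')   # 1 > 0
# ...yet mutual giving beats mutual taking, as in the Prisoner's Dilemma.
assert pieces('give', 'give') > pieces('take', 'take')   # 3 > 1
```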

This stuff is still unclear to me, but there may be a post in here once we work it out. Would you like to cooperate on a joint one, or something?

Comment author: red75 22 December 2010 06:16:45PM 0 points [-]

I'm still unsure whether it is something more than an intuition pump. Anyway, I'll share any interesting thoughts.
