[LINK] EdTech startup hosts AI Hunger Games (cash prize $1k)

MalcolmOcean

14th Aug 2013

TL;DR: write a Python script to win this applied game theory contest for $1000. It's based on the Prisoner's Dilemma / Tragedy of the Commons, but with a few twists. Deadline: Sunday, August 18.

https://brilliant.org/competitions/hunger-games/rules/

I. Food and Winning

Each player begins the game with 300(P−1) units of food, where P is the number of players.

If after any round you have zero food, you will die and no longer be allowed to compete. All players who survive until the end of the game will receive the survivor's prize.

The game can end in two ways. After a large number of rounds, there will be a small chance each additional round that the game ends. Alternatively, if there is only one person left with food then the game ends. In each case, the winner is the person who has the most food when the game ends.

II. Hunts

Each round is divided into hunts. A hunt is a game played between you and one other player. Each round you will have the opportunity to hunt with every other remaining player, so you will have P−1 hunts per round, where P is the number of remaining players.

The choices are H = hunt (cooperate) and S = slack (defect). The rules use confusing wording here, but as far as I can tell the payoff matrix (in units of food) is:

You \ Partner	H / C	S / D
H / C	0:0	-3:1
S / D	1:-3	-2:-2

(Each cell reads your payoff : your partner's payoff.)
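
The matrix above can be written as a small lookup table. This is only a sketch of my reading of the rules: the names `PAYOFF` and `hunt_payoff` and the "h"/"s" string encoding are assumptions for illustration, not taken from the contest's code.

```python
# Payoffs in food units, keyed by (my_choice, partner_choice).
# Each value is (my_change, partner_change), as read from the matrix above.
# "h" = hunt (cooperate), "s" = slack (defect); this encoding is an assumption.
PAYOFF = {
    ("h", "h"): (0, 0),
    ("h", "s"): (-3, 1),
    ("s", "h"): (1, -3),
    ("s", "s"): (-2, -2),
}


def hunt_payoff(mine, theirs):
    """Return (my food change, partner's food change) for a single hunt."""
    return PAYOFF[(mine, theirs)]


# Within a single hunt, slacking pays me more whatever my partner does,
# even though mutual hunting (0, 0) beats mutual slacking (-2, -2).
for theirs in ("h", "s"):
    assert hunt_payoff("s", theirs)[0] > hunt_payoff("h", theirs)[0]
```

Ignoring reputation and the round bonus, this is the usual Prisoner's Dilemma structure: slacking gains exactly 1 food per hunt over hunting, regardless of what the partner picks.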

What's interesting is that you don't get the entirety of your partner's history (so strategies like Tit-Tit-Tit for Tat don't work). Instead, you get only their reputation, which is the fraction of times they've hunted.

To further complicate the Nash equilibria, there's the option to overhunt: a random number m, with 0 < m < P(P−1), is chosen before each round (a round consisting of P−1 hunts per player, remember), and if the total number of hunt-choices made by all players is at least m, then each player is awarded 2(P−1) food units (2 per hunt).
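
As a rough sketch of that bonus rule as I understand it (the function name `round_bonus` and its parameters are hypothetical, not from the contest code):

```python
def round_bonus(total_hunts, m, num_players):
    """Food awarded to EACH player for the round, under my reading of the rules.

    total_hunts: number of "hunt" choices made by all players this round
    m:           the random threshold announced before the round
    num_players: P, the number of players still in the game
    """
    if total_hunts >= m:
        return 2 * (num_players - 1)  # 2 food per hunt the player took part in
    return 0


# With P = 5 players there are at most 5 * 4 = 20 hunt-choices in a round.
print(round_bonus(total_hunts=12, m=12, num_players=5))  # threshold met
print(round_bonus(total_hunts=11, m=12, num_players=5))  # threshold missed
```

Note that the bonus goes to everyone, slackers included, which is where the Tragedy of the Commons flavour comes from.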

Your python program has to decide at the start of each round whether or not to hunt with each opponent, based on:

the round number
your food
your reputation
m
an array of the opponents' reputations
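
To make the shape of the decision concrete, here is a minimal sketch of such a function. The name `choose_hunts`, the argument names, the "h"/"s" return values and the `threshold` parameter are all my assumptions for illustration; check the contest's sample code for the real interface. The threshold rule is just a placeholder, not a recommended strategy.

```python
def choose_hunts(round_number, food, reputation, m, opponent_reputations,
                 threshold=0.5):
    """Return one choice per opponent: "h" (hunt) or "s" (slack).

    Placeholder rule: hunt only with opponents whose reputation (fraction
    of times they have hunted) is at least `threshold`. The other inputs
    are accepted but unused in this sketch.
    """
    return ["h" if rep >= threshold else "s" for rep in opponent_reputations]


# Example round with three opponents.
print(choose_hunts(round_number=1, food=900, reputation=0.0, m=7,
                   opponent_reputations=[0.9, 0.2, 0.5]))
```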

Based on the fact that they're giving you some values you'd already know if you had access to memory, I assumed it must be a memoryless script that gets run each round. (EDIT: I take that back. I looked at the sample code, and while it says you don't need to track this stuff, it notes that you can use instance variables.)
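
If instance variables are allowed, a player can carry state between rounds. A minimal sketch, assuming a class-based interface: the class name `RememberingPlayer`, the method name `choose_hunts` and the "h"/"s" return values are hypothetical, and the rule itself is only there to show memory being used.

```python
class RememberingPlayer:
    """Toy player that remembers its own food level from round to round."""

    def __init__(self):
        # My food at the start of every round seen so far.
        self.food_history = []

    def choose_hunts(self, round_number, food, reputation, m,
                     opponent_reputations):
        self.food_history.append(food)
        # Illustrative rule: if my food dropped since the previous round,
        # slack with everyone this round; otherwise hunt with everyone.
        losing = (len(self.food_history) >= 2
                  and food < self.food_history[-2])
        choice = "s" if losing else "h"
        return [choice] * len(opponent_reputations)


player = RememberingPlayer()
print(player.choose_hunts(1, 600, 0.0, 3, [0.0, 0.0]))  # no history yet
print(player.choose_hunts(2, 594, 1.0, 4, [1.0, 0.5]))  # food fell since round 1
```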

Submissions close on the 18th.

I think the contest could be both better designed and better explained in a number of ways, but thought I'd share it anyway because hey, money. Also, you'd be competing in an arena where they're giving explanations of what Nash equilibria are. Which is probably not really fair. But it's the Hunger Games, and of course it's not fair. (As far as I can tell, they are not enforcing anything related to fairness here.)

I'd be curious to hear ideas for strategies and thoughts about the design of the game.

ThisSpaceAvailable (13y)

If there are two kinds of players, those who throw rock, and those who throw paper, the latter will blow the former out of the water.

You are engaging in two fallacies: you are cherry-picking conditions to favor your particular strategy, and you are evaluating the strategies at the wrong level. Strategies should be evaluated with respect to how they affect the success of the individual person employing them, not on how they affect the success of people, in general, who employ them. This fallacy is behind many of the cooperate/one-box arguments. Sure, if everyone in Group B cooperates with other members of Group B, then Group B will do better, and on a superficial level, it seems like this means "If you're in Group B, you should cooperate with other members of Group B", but that's fallacious reasoning. It's the sort of thing that lies behind identity politics. "If Americans buy American, then Americans will do better, and you're an American, so you will benefit from buying American". Even if we grant that buying American gives a net benefit to America (which is a rather flimsy premise to begin with), it doesn't follow that any American has a rational reason to buy American. In your scenario, the presence of people with the "cooperate with people who have a reputation greater than 0" strategy provides a reason to cooperate in the first round, but there is no reason whatsoever to condition cooperation on someone having a reputation greater than 0. Anyone who, in this scenario, thinks that one should cooperate with people with reputation greater than 0 does indeed not understand game theory.

Emile (13y)

You are engaging in two fallacies: you are cherry-picking conditions to favor your particular strategy, and you are evaluating the strategies at the wrong level.

No, I'm simplifying for argument's sake, using the example given by Alex (cooperating with any positive reputation). I discuss more complex strategies elsewhere in the thread. Of course "cooperate only with people with > 0 reputation" is a pretty stupid and exploitable strategy; my point is that even such a stupid strategy could beat Alex's "always defect".
