I'm thinking iterations just confuse things. With a high enough HP value we should be able to eliminate "luck". So here's a pass with 1 iteration and 20 million initial HP:
2: red/blue
8: red/red
13: yellow/blue
13: yellow/red
15: red/yellow
15: yellow/green
17: blue/yellow
17: red/green
17: yellow/yellow
19: blue/red
19: green/blue
19: green/green
19: green/red
19: green/yellow
21: blue/blue
23: blue/green
Note: this image does not belong to me; I found it on 4chan. It presents an interesting exercise, though, so I'm posting it here for the enjoyment of the Less Wrong community.
For the sake of this thought experiment, assume that all characters have the same amount of HP, and that it is large enough for random effects to be treated as equal to their expected values. There are no NPC monsters, critical hits, or other mechanics; gameplay consists of two PCs getting into a duel and fighting until one of them loses. The winner is fully healed afterwards.
Which sword and armor combination do you choose, and why?
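For anyone who wants to experiment, here is a minimal sketch of how the expected-value duel described above could be scored. The image with the real stats isn't reproduced here, so everything numeric is a placeholder: `SWORD_DAMAGE`, `ARMOR_FACTOR`, and the multiply-damage-by-armor-factor rule are my assumptions, not the actual item effects. The sketch scores every ordered pairing of the 16 builds, self-matches included, for 256 duels in total. The scores in the list above also sum to 256, but I don't know whether they were produced this way.

```python
from itertools import product

# HYPOTHETICAL stats: placeholders only, NOT the values from the original image.
SWORD_DAMAGE = {"red": 10.0, "blue": 8.0, "green": 9.0, "yellow": 7.0}  # expected damage per turn
ARMOR_FACTOR = {"red": 1.0, "blue": 0.8, "green": 0.9, "yellow": 0.7}   # fraction of damage taken

def turns_to_kill(attacker, defender, hp):
    """Expected number of turns for attacker (sword, armor) to drop defender to 0 HP."""
    sword, _ = attacker
    _, armor = defender
    return hp / (SWORD_DAMAGE[sword] * ARMOR_FACTOR[armor])

def duel(a, b, hp=20_000_000):
    """Return the winning build. With expected values the fight is deterministic;
    ties go to the first-listed build (an assumption)."""
    return a if turns_to_kill(a, b, hp) <= turns_to_kill(b, a, hp) else b

def tournament(hp=20_000_000):
    """Count wins for each (sword, armor) build over all ordered pairings."""
    builds = list(product(SWORD_DAMAGE, ARMOR_FACTOR))
    wins = {build: 0 for build in builds}
    for a, b in product(builds, repeat=2):
        wins[duel(a, b, hp)] += 1
    return wins

if __name__ == "__main__":
    # Print builds from fewest to most wins, as "count: sword/armor".
    for (sword, armor), count in sorted(tournament().items(), key=lambda kv: kv[1]):
        print(f"{count}: {sword}/{armor}")
```

Because each side's expected damage per turn is fixed, the HP value cancels out of the comparison in this simplified model. Any item effect that depends on current or maximum HP would break that and would need an actual turn-by-turn loop.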