This is a linkpost for https://arxiv.org/abs/2206.15378

Playing imperfect-information games well has long been a challenge for AI. DeepMind reports their (IMHO impressive) work on Stratego, one such game.


I am confused why it stopped at the human level:

DeepNash was tested against top human players for two weeks in early April 2022, yielding 50 ranking matches in which DeepNash won 42%.

instead of self-improving to beat even the best human player every time.

The quote is:

DeepNash was evaluated against top human players over the course of two weeks in the beginning of April 2022, resulting in 50 ranked matches. Of these matches, 42 (i.e. 84%) were won by DeepNash

Given the game has imperfect information, it's not clear you should expect to be able to win much more than that. (I haven't played much Stratego but I would have guessed that a reasonably strong player going for high-variance strategies could beat God 10-20% of the time.)
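
As a toy illustration of why a cap exists at all (my own construction, nothing from the paper): in a zero-sum game with hidden information, even a game-theoretically optimal player only guarantees the game's value against a rational opponent, and that value can sit well below 100%. A minimal sketch, assuming a made-up 2x2 payoff matrix and solving for the maximin mix with scipy:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2x2 bluffing game (invented numbers): A[i, j] is the
# probability that the "optimal" row player wins when playing option i
# against the opponent's option j. Neither row dominates, so hidden
# information forces the row player to mix.
A = np.array([
    [0.9, 0.4],
    [0.3, 0.8],
])

n_rows, n_cols = A.shape
# Variables z = [x_1, ..., x_n, v]: x is the row player's mixed strategy,
# v the guaranteed win probability. Maximize v s.t. (A^T x)_j >= v, sum(x) = 1.
c = np.zeros(n_rows + 1)
c[-1] = -1.0                                    # linprog minimizes, so use -v
A_ub = np.hstack([-A.T, np.ones((n_cols, 1))])  # encodes v - (A^T x)_j <= 0
b_ub = np.zeros(n_cols)
A_eq = np.zeros((1, n_rows + 1))
A_eq[0, :n_rows] = 1.0                          # probabilities sum to 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * n_rows + [(None, None)]     # v is a free variable

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:n_rows], res.x[-1]
print(f"optimal mix = {x.round(3)}, guaranteed win rate = {v:.3f}")
# -> optimal mix = [0.5 0.5], guaranteed win rate = 0.600
```

In this made-up game, optimal play only guarantees winning 60% of rounds against a rational opponent. Stratego's actual value against top humans is unknown, but the same logic is why "God" need not win every game.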

Hmm, so is this one of those games where a novice can beat an expert a significant fraction of the time, because of the imperfect information? Is there a theoretical upper limit on the win percentage for the perfect player against the best human player?

I am a Stratego player, and the answer is no, not really. In fact, DeepNash won 30/30 (100%) against Probe, which has won the Computer Stratego World Championship three times.

But I think Paul is not wrong. While Stratego is mostly skill, not luck (there is zero randomness, just hidden information; it's not like drawing cards and needing good ones), there is a bit of rock-paper-scissors involved. Novices can't beat experts, but I do think experts can beat God.

My main point was that you quoted 42% when the win rate was 84%.

Even if there's no cap on win rate, I don't think you should necessarily expect to "self-improve to beat the best human players every time." Even in a game of perfect information, I think there are 2+ orders of magnitude of scale (or equivalent algorithmic progress) over which you beat human players 60-99% of the time.

So I think it could make sense to be surprised "Isn't Stratego easy enough that AI should be crushing humans?" but it would not make sense to say "Given that AI is able to beat humans at Stratego, why is it not able to crush them every time?"

(Note that humans could potentially do better if they knew they were playing against a much stronger opponent and trying to play for a lucky win.)
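
To put rough numbers on the 60-99% range (a standard Elo logistic model, my assumption, not anything DeepMind reports): here is a sketch of how large a rating gap is needed before "winning every time" is even approximately true.

```python
# Standard Elo expected-score formula (logistic model): the stronger
# player's expected score at a rating gap of `delta`.
def elo_expected_score(delta: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))

for delta in (70, 200, 400, 800):
    print(f"gap {delta:>3}: expected score {elo_expected_score(delta):.2f}")
# gap  70: 0.60, gap 200: 0.76, gap 400: 0.91, gap 800: 0.99
```

Moving from a 60% to a 99% expected score takes a rating gap more than ten times larger, which is one way to see how a system can decisively beat humans without beating them every time.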

It doesn't have to be that a novice has a chance against an expert in order for there to be declining returns to further expertise. As an example, rock-scissors-paper-nothing (rock beats scissors and nothing, scissors beats paper and nothing, paper beats rock and nothing) has the "expert" strategy of "randomize, but never choose 'nothing'", which beats the hopeless novice who chooses "nothing" most of the time. Further, there is expertise in noticing patterns in your opponents' play while obscuring the patterns in your own. But a very good AI can probably do better than 50% against human experts without getting anywhere near 100%.
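
A quick back-of-the-envelope version of that example (my own toy numbers): let the expert mix uniformly over {rock, paper, scissors}, and let the novice play "nothing" with probability q, otherwise mixing uniformly as well.

```python
# Expert mixes uniformly over {rock, paper, scissors}; novice plays
# "nothing" with probability q and otherwise mixes uniformly too.
def expert_win_draw_loss(q: float) -> tuple[float, float, float]:
    # The expert always beats "nothing"; against the novice's RPS portion,
    # win/draw/loss each happen 1/3 of the time.
    win = q + (1 - q) / 3
    draw = (1 - q) / 3
    loss = (1 - q) / 3
    return win, draw, loss

for q in (0.9, 0.5, 0.0):
    w, d, l = expert_win_draw_loss(q)
    print(f"q={q:.1f}: win={w:.2f}, draw={d:.2f}, loss={l:.2f}")
# q=0.9: win=0.93 | q=0.5: win=0.67 | q=0.0: win=0.33 (50% of decisive rounds)
```

Against the hopeless novice (q=0.9) the expert wins ~93% of rounds, but against a fellow randomizer the edge disappears entirely; anything above 50% of decisive rounds has to come from pattern exploitation, not from the base strategy.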

84% for Stratego is higher than I'd have predicted.

Where did you get that quote? I find this in the paper:

DeepNash was evaluated against top human players over the course of two weeks in the beginning of April 2022, resulting in 50 ranked matches. Of these matches, 42 (i.e. 84%) were won by DeepNash. In the Classic Stratego challenge ranking 2022 this corresponds to a rating of 1799, which resulted in a 3rd place for DeepNash of all ranked Gravon Stratego players

I expect that you mistakenly added the % after 42. 

It's certainly interesting, although to be honest I'm pretty confident the top human Stratego players are nowhere near the top achievable level for a human player (in contrast with games like chess or StarCraft).