One component of AlphaZero is a neural net which takes a board position as input and outputs a guess about how good the position is and what a good next move would be. It combines this neural net with Monte Carlo Tree Search (MCTS), which plays out different ways the game could go before choosing a move. MCTS is used both during self-play to train the neural net and at competitive test time. I'm mainly curious about whether the latter is necessary.
So my question is: Once you have the fully-trained AlphaZero system, if you then turn off the MCTS and just choose moves directly with the neural net policy head, is it any good? Is it professional-level, amateur-level, or child-level?
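To make that concrete, here's a minimal sketch of the two modes I mean. The `net` interface, `legal_mask`, and `run_search` are placeholders I made up, not AlphaZero's actual API:

```python
import numpy as np

def pick_move_raw_policy(net, state, legal_mask):
    """'Search off': play the single move the policy head likes most.
    net(state) is assumed to return (policy_logits, value)."""
    policy_logits, _value = net(state)
    policy_logits = np.where(legal_mask, policy_logits, -np.inf)
    return int(np.argmax(policy_logits))

def pick_move_with_search(net, state, legal_mask, run_search, num_simulations=800):
    """'Search on': spend simulations improving on the raw prior, then
    play the move with the highest visit count."""
    visit_counts = run_search(net, state, legal_mask, num_simulations)
    return int(np.argmax(visit_counts))
```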
(I think this would be a fun little data-point related to discussions of how powerful an AI can be with and without mesa-optimization / search-processes using a generative environmental model.)
The paper includes the Elo for just the NN. I believe it's professional-level but not superhuman, though you should check if you really need to know. However, note that AlphaZero's actual play doesn't use MCTS at all; it uses a simple tree search which only descends a few ply.
This is incorrect. Without tree search it is International Master-level: a good amateur, but there are >1000 players in the world who are better.
And it is neither MCTS nor a "simple tree search"; it uses PUCT, often calculating very deeply along a few lines.
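For reference, a minimal sketch of the PUCT child-selection rule from the AlphaGo Zero / AlphaZero papers: at each node, pick the child maximizing Q(s,a) + U(s,a), where U(s,a) = c_puct · P(s,a) · sqrt(Σ_b N(s,b)) / (1 + N(s,a)). The node attributes `N` (visits), `W` (total value), and `P` (prior from the policy head) are assumed here, and the fixed `c_puct` is a simplification (the paper uses a slowly growing schedule):

```python
import math

def puct_select(children, c_puct=1.25):
    """Pick the child maximizing Q(s,a) + U(s,a).
    `children` is a hypothetical list of nodes with .N (visit count),
    .W (total value), and .P (prior probability from the policy head)."""
    total_visits = sum(child.N for child in children)

    def score(child):
        q = child.W / child.N if child.N > 0 else 0.0
        u = c_puct * child.P * math.sqrt(total_visits) / (1 + child.N)
        return q + u

    return max(children, key=score)
```

Because U shrinks as a child's visit count grows while Q reflects its actual evaluations, the search keeps revisiting the most promising moves, which is why it can go very deep along a handful of lines rather than expanding the tree uniformly.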