The paper includes the Elo rating for just the NN. I believe it's professional level but not superhuman, but you should check if you really need to know. However, note that AlphaZero's actual play doesn't use MCTS at all; it uses a simple tree search that only descends a few ply.
Thanks for your answer! But I'm afraid I'm confused on both counts.
I couldn't, and still can't, find an "Elo for just the NN" in the paper... :-( I checked both the arXiv version and the published version.
As for "actual play doesn't use MCTS at all", well, the authors say it does use MCTS... Am I misunderstanding the authors, or are you saying that the "thing the authors call MCTS" is not actually MCTS? (For example, I understand that it's not actually random.)
But it does use MCTS in training. You might say that it uses MCTS to generate a better player to learn from.
This is incorrect. It is International Master-level without tree search. Good amateur, but there are >1000 players in the world who are better.
And it is neither MCTS nor a "simple tree search"; it uses PUCT, often calculating very deeply along a few lines.
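For reference, the PUCT selection rule amounts to picking, at each node, the child maximizing Q(a) + c · P(a) · √N_parent / (1 + N(a)), where P comes from the policy head and Q from averaged value estimates. A minimal sketch (the constant, field names, and data layout here are illustrative, not from the paper):

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child move maximizing Q(a) + U(a), where
    U(a) = c_puct * P(a) * sqrt(N_parent) / (1 + N(a)).
    `children` maps move -> {"prior": P, "visits": N, "value_sum": W}."""
    n_parent = sum(child["visits"] for child in children.values())
    best_move, best_score = None, -float("inf")
    for move, child in children.items():
        # Mean value of the subtree so far (0 for unvisited children).
        q = child["value_sum"] / child["visits"] if child["visits"] else 0.0
        # Exploration bonus: large for high-prior, rarely-visited moves.
        u = c_puct * child["prior"] * math.sqrt(n_parent) / (1 + child["visits"])
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move
```

Note how an unvisited move with a high prior gets a large exploration bonus, which is what lets the search "calculate very deeply along a few lines" the policy head considers promising.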
One component of AlphaZero is a neural net which takes a board position as input, and outputs a guess about how good the position is and what a good next move would be. It combines this neural net with Monte Carlo Tree Search (MCTS), which plays out different ways the game could go before choosing a move. The MCTS is used both during self-play to train the neural net, and at test time in competitive play. I'm mainly curious about whether the latter is necessary.
So my question is: Once you have the fully-trained AlphaZero system, if you then turn off the MCTS and just choose moves directly with the neural net policy head, is it any good? Is it professional-level, amateur-level, child-level?
(I think this would be a fun little data-point related to discussions of how powerful an AI can be with and without mesa-optimization / search-processes using a generative environmental model.)