Cite? They use the supervised network for policy selection (i.e. tree pruning) which is a critical part of the system.
I'm referring to figure 1a on page 4 and the explanation below. I can't be sure but the self-play should be contributing a large part to the training and can go on and improve the algorithm even if the expert database stays fixed.
DeepMind's go AI, called AlphaGo, has beaten the European champion with a score of 5-0. A match against top ranked human, Lee Se-dol, is scheduled for March.