Yes, binary neural networks are super interesting because they can be implemented much more compactly in hardware than floating-point operations. However, there isn't much (theoretical) advantage otherwise: anything a circuit can do, an NN can do, and vice versa.
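To make the circuit/NN equivalence concrete, here's a rough sketch (my own toy example, not taken from any particular binary-network paper): a single threshold neuron with weights of -1 computes NAND, and since NAND is universal, wiring such neurons together gives any Boolean circuit. The names `neuron`, `nand` and `xor` and the bias of 1.5 are just illustrative choices.

```python
def neuron(inputs, weights, bias):
    """Threshold unit: fires (1) if the weighted sum plus bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def nand(a, b):
    # Weights of -1 and a bias of 1.5: only the input (1, 1) drives the sum below zero.
    return neuron([a, b], weights=[-1, -1], bias=1.5)

def xor(a, b):
    # The standard four-gate NAND construction of XOR, built only from the neuron above.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand(a, b), xor(a, b))
```

Going the other way (a circuit simulating an NN) is just the observation that the multiply-adds are themselves implemented in logic gates.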
A circuit size penalty is already a very common technique. It's called weight decay, where the synapses are encouraged to be as close to zero as possible. A synapse of 0 is the same as it not being there, which means the neural net's parameters require less information to specify.
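For anyone who hasn't seen it written out, here's a minimal sketch of weight decay, assuming the plain L2 form where the loss gets an extra term `(weight_decay / 2) * sum(w**2)`. The function name `sgd_step` and the default values are made up for illustration; real frameworks expose this differently.

```python
def sgd_step(weights, grads, lr=0.1, weight_decay=0.01):
    """One gradient step on loss + (weight_decay / 2) * sum(w**2).

    The penalty's gradient is weight_decay * w, so every step pulls
    each weight a little toward zero on top of the data gradient.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

# With a zero data gradient, decay alone shrinks the weights geometrically.
w = [1.0, -2.0]
for _ in range(100):
    w = sgd_step(w, grads=[0.0, 0.0])
print(w)
```

One caveat: an L2 penalty like this shrinks weights toward zero but generally doesn't make them exactly zero, so actually dropping synapses usually takes an L1 penalty or explicit pruning.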
There have been a couple of brief discussions of this in the Open Thread, but it seems likely to generate more, so here's a place for it.
The original paper in Nature about AlphaGo.
Google Asia Pacific blog, where results will be posted.
DeepMind's YouTube channel, where the games are being live-streamed.
Discussion on Hacker News after AlphaGo's win of the first game.