Houshalter comments on AlphaGo versus Lee Sedol - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (183)
When I started hearing about the latest wave of results from neural networks, I thought to myself that Eliezer was probably wrong to bet against them. Should MIRI rethink its approach to friendliness?
EY was influenced by E.T. Jaynes, who was really against neural networks, in favor of Bayesian networks. He thought NNs were unprincipled and not mathematically elegant, and Bayes nets were. I see the same opinions in some of EY's writings, like the one you link. And the general attitude that "non-elegant = bad" is basically MIRI's mission statement.
I don't agree with this at all. I wrote a thing here about how NNs can be elegant, and derived from first principles. But more generally, AI should use whatever works. If that happens to be "scruffy" methods, then so be it.
This seems like a bizarre statement if we care about knowable AI safety. Near as I can tell, you just called for the rapid creation of AGI that we can't prove non-genocidal.
I don't believe Houshalter was referring to proving Friendliness (or something along those lines); my impression is that he was talking about implementing an AI, in which case neural networks, while "scruffy", should be considered a legitimate approach. (Of course, the "scruffiness" of NNs could very well affect certain aspects of Friendliness research; my relatively uninformed impression is that it's very difficult to prove results about NNs.)
Nice post.
Anyway, according to some recent works (ref, ref), it seems to be possible to directly learn digital circuits from examples using some variant of backpropagation. In principle, if you add a circuit size penalty (which may well be the tricky part), this becomes time-bounded maximum a posteriori Solomonoff induction.
Yes, binary neural networks are super interesting because they can be made much more compact in hardware than floating-point ops. However, there isn't much (theoretical) advantage otherwise. Anything a circuit can do, an NN can do, and vice versa.
A circuit size penalty is already a very common technique. It's called weight decay, where the synapses are encouraged to be as close to zero as possible. A synapse of 0 is the same as it not being there, which means the neural net's parameters require less information to specify.
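To make the weight decay idea concrete, here is a minimal sketch, assuming a plain linear model trained by gradient descent on squared error with an L2 penalty added to the loss. The function name `train`, the toy data, and the hyperparameters are all made up for illustration; none of it comes from the thread or from any particular library.

```python
import math

def train(xs, ys, weight_decay, lr=0.05, steps=2000):
    """Fit a linear model by gradient descent on
    mean squared error + weight_decay * sum(w_j ** 2).
    (Hypothetical helper, for illustration only.)"""
    n_features = len(xs[0])
    w = [0.0] * n_features
    for _ in range(steps):
        grad = [0.0] * n_features
        # Gradient of the mean squared error term.
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(n_features):
                grad[j] += 2 * err * x[j] / len(xs)
        for j in range(n_features):
            # Weight decay: the penalty's gradient pulls each weight toward zero.
            grad[j] += 2 * weight_decay * w[j]
            w[j] -= lr * grad[j]
    return w

def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))

# Toy data (assumed): y = 2 * x1, the second feature is irrelevant.
xs = [[1.0, 0.5], [2.0, -1.0], [3.0, 0.3], [4.0, -0.7]]
ys = [2.0, 4.0, 6.0, 8.0]

w_plain = train(xs, ys, weight_decay=0.0)
w_decay = train(xs, ys, weight_decay=0.1)
print("no decay:  ", w_plain, "norm", norm(w_plain))
print("with decay:", w_decay, "norm", norm(w_decay))
```

With the penalty switched on, the learned weights end up with a smaller norm than without it. Note that an L2 penalty like this shrinks weights toward zero but rarely makes them exactly zero, so treating it as a "circuit size" penalty is an approximation.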
Yes, we need to find a way to make existing AIs safe.
Agreed on all points.
I suppose the main lesson for us can be summarized by the famous verse:
The sequences definitely qualify as shallow draughts that intoxicate the brain :-(