Basically, topics or ideas that are not currently well explained but that could benefit from good expositions.

2 Answers

Daniel Kokotajlo


Scott Garrabrant once said something that really stuck with me: (paraphrasing) "You can get a mind in two ways -- by building it, or by searching for it. The way things are going with deep learning right now, it seems like we aren't going to build AGI, we are going to search for it. And that means trouble."

I feel like this is ultimately a pretty simple point, but I think it's really important: it's a good foundation for thinking about (and taking seriously) things like inner alignment issues. Some comments:

--Obviously it's not binary; there's a spectrum between "building/designing" and "searching."

--Given our current level of understanding, I claim that doing stochastic gradient descent to train big neural networks is by default pretty far towards the "searching" end of the spectrum. The thing we get out the other end is a giant messy glob of neurons, and we don't know in advance what it will be like (and often we don't even understand it afterwards, not even a little bit). Contrast this with traditional software, which is pretty far along the "building/designing" end of the spectrum.
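For concreteness, here is a toy illustration of the "searching" framing (the XOR task, the tiny tanh network, and the training details are all arbitrary choices for the sake of the sketch, not anything from the quote): at no point do we write down the mechanism that computes the target function; we just pick a parameter space and let gradient descent hunt through it until some configuration happens to fit the data.

```python
# Toy sketch: "searching for" a program that computes XOR instead of writing it.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR truth table

# "Building" would mean writing `lambda a, b: a ^ b` directly.
# "Searching" means picking a parameter space and descending through it:
W1 = rng.normal(size=(2, 8))
b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1))
b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)              # hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    return h, p

lr = 1.0
for step in range(5000):
    h, p = forward(X)
    # Backpropagate the gradient of the squared error by hand.
    d_out = (p - y) * p * (1 - p)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

# Whatever circuit we end up with is whichever one the search happened to find;
# its internal structure was never designed and has to be reverse-engineered.
print(forward(X)[1].round(2).ravel())
```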

Daniel Kokotajlo


I'd love to see someone explain what the prior of neural networks is (I think it's called the "minimal circuit" prior or something like that), compared and contrasted with e.g. the Solomonoff prior or the Levin prior. It would answer questions like these:

--What is the neural network prior? (My tentative, shitty answer: It's the probability distribution over types of circuits that tells you how some types are more likely to occur than others, were you to randomly sample parameter configurations. See the sampling sketch below.)

--How is it different from the Solomonoff prior? (My TSA: The Solomonoff prior is over types of programs rather than types of circuits. This isn't by itself a big deal, because circuits are programs and programs can be approximated by circuits. More importantly, there are many circuits that get high NN prior but low Solomonoff prior, and vice versa. In particular, the Solomonoff prior doesn't penalize programs for running for a very long time; it instead entirely emphasizes minimum description length. See the formulas after this list.)

--Why does this matter? (My TSA: As a first approximation, we should think of SGD on big neural networks as selecting the highest-prior circuit that scores perfectly on the training data. It's like making a posterior by conditionalizing the prior on some data (the second formula below). Mingard et al. would argue that this is more than just an approximation, though lots of people disagree with them and think it's more complicated than that. This has important implications for inner alignment issues; see e.g. Paul's stuff.)
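For the last two bullets, here are the two objects being compared, written out (the notation for circuits $c$ and a dataset $D$ of input-output pairs is just for illustration). One standard way to write the Solomonoff prior over outputs of a universal prefix machine $U$ is

$$m(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|},$$

which charges a program only for its description length $|p|$, never for its runtime. The "posterior by conditionalizing the prior on the training data" picture of SGD is, roughly,

$$P(c \mid D) \;\propto\; P_{\text{prior}}(c)\cdot\mathbf{1}\!\left[\,c(x) = y \text{ for all } (x, y) \in D\,\right],$$

with $P_{\text{prior}}$ the neural network prior from the first bullet: selecting the highest-prior circuit that fits the training data perfectly is like taking the mode of this posterior.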
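And to make the "randomly sample parameter configurations" picture from the first bullet concrete, here is a rough sketch (the 3-8-1 tanh architecture, the standard-normal initialization, and the sign thresholding are all arbitrary illustrative choices): initialize a tiny net many times, record which Boolean function on 3 inputs each initialization implements, and use the resulting frequencies as an empirical stand-in for the neural network prior. In runs like this, a few very simple functions (e.g. the constant functions) typically soak up a large share of the probability mass.

```python
# Rough sketch: estimate the "neural network prior" over Boolean functions on 3
# inputs by randomly sampling parameter configurations of a tiny 3-8-1 tanh net.
from collections import Counter
from itertools import product

import numpy as np

rng = np.random.default_rng(0)
inputs = np.array(list(product([0.0, 1.0], repeat=3)))  # all 8 input patterns

def sampled_function():
    """Return the Boolean function implemented by a freshly initialized net."""
    W1 = rng.normal(size=(3, 8))
    b1 = rng.normal(size=8)
    W2 = rng.normal(size=(8, 1))
    b2 = rng.normal(size=1)
    h = np.tanh(inputs @ W1 + b1)
    out = h @ W2 + b2
    return tuple(int(v) for v in out.ravel() > 0)  # truth table as a tuple of bits

n_samples = 100_000
counts = Counter(sampled_function() for _ in range(n_samples))

# The empirical prior: how often each of the 2^8 possible truth tables shows up.
for truth_table, count in counts.most_common(5):
    print(truth_table, count / n_samples)
```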