I really appreciate you taking the time both to write this report and to solicit/respond to all these reviews! I think this is a hugely valuable resource, one that has helped me better understand AI risk arguments and the range of views/cruxes that different people have.
A couple quick notes related to the review I contributed:
First, 0.4% is the credence implied by my credences in individual hypotheses — but I was a little surprised by how small this number turned out to be. (I would have predicted closer to a couple percent at the time.) I’m sympathetic to the ...
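(To make the arithmetic concrete, here's a sketch with made-up stand-in numbers, not the actual credences from my review: several individually moderate credences multiply out to well under a percent.)

```python
# Purely hypothetical per-premise credences, just to show how the product of
# several moderate numbers ends up smaller than one might eyeball.
premises = [0.4, 0.3, 0.5, 0.4, 0.7, 0.25]

joint = 1.0
for p in premises:
    joint *= p

print(f"{joint:.2%}")  # 0.42%
```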
A tricky thing here is that it really depends how quickly a technology is adopted, improved, integrated, and so on.
For example, it seems like computers and the internet caused a bit of a surge in American productivity growth in the 90s. The surge wasn't anything radical, though, for at least a few reasons:
Continued technological progress is necessary just to sustain steady productivity growth.
It's apparently very hard, in general, to increase aggregate productivity.
The adoption, improvement, integration, etc., of information technology was a relatively gradual process...
Since neural networks are universal function approximators, it is indeed the case that some of them will implement specific search algorithms.
I don't think this specific point is true. It seems to me like the difference between functions and algorithms is important. You can also approximate any function with a sufficiently large look-up table, but simply using a look-up table to choose actions doesn't involve search/planning.* In this regard, something like a feedforward neural network with frozen weights also doesn't seem importantly different from a look-up table...
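(To spell out the distinction I have in mind, here's a toy sketch, entirely my own illustration with a made-up counting game: two policies that compute the same state-to-move function, only one of which does anything search-like under the hood.)

```python
# Toy illustration (mine, not from the report): a "planner" that picks moves
# by simulating the game tree, and a lookup table distilled from it. They
# implement the same function from states to moves, but only one involves search.

def best_move_by_search(total):
    """Counting game: players alternately add 1 or 2 to `total`; whoever
    reaches exactly 4 wins. Pick a move by explicitly simulating the tree."""
    def we_win(t, our_turn):
        if t == 4:
            return not our_turn          # the player who just moved won
        moves = [m for m in (1, 2) if t + m <= 4]
        outcomes = [we_win(t + m, not our_turn) for m in moves]
        return any(outcomes) if our_turn else all(outcomes)

    for m in (1, 2):
        if total + m <= 4 and we_win(total + m, our_turn=False):
            return m
    return 1                             # no winning move; play anything

# Distil the search policy's behaviour into a plain lookup table.
TABLE = {s: best_move_by_search(s) for s in range(4)}

def best_move_by_table(total):
    """Same input-output mapping, but nothing search-like happens here."""
    return TABLE[total]

assert all(best_move_by_search(s) == best_move_by_table(s) for s in range(4))
```

Knowing that some network computes the same function as a planner doesn't yet tell you that it computes it by planning.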
...I do agree that OT and ICT by themselves, without any further premises like "AI safety is hard" and "The people building AI don't seem to take safety seriously, as evidenced by their public statements and their research allocation" and "we won't actually get many chances to fail and learn from our mistakes", do not establish more than, say, 1% credence in "AI will kill us all," if even that. But I think it would be a misreading of the classic texts to say that they were wrong or misleading because of this; probably if you went back in time and asked Bost
I agree that your paper strengthens the IC (and is also, in general, very cool!). One possible objection to the ICT, as traditionally formulated, has been that it's too vague: there are lots of different ways you could define a subset of possible minds, and then a measure over that subset, and not all of these ways actually imply that "most" minds in the subset have dangerous properties. Your paper definitely makes the ICT crisper, more clearly true, and more closely/concretely linked to AI development practices.
I still think, though, that the ICT only get...
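(As a concrete toy version of the "measure over minds" point, and this is my own illustration rather than anything from your paper: in a tiny decision problem you can pick a measure under which most randomly drawn reward functions favour the option-preserving choice, which is roughly the flavour of claim the ICT needs.)

```python
# Toy illustration (mine, not the paper's formalism): draw reward functions
# at random and see how often the option-preserving branch is optimal.
import random

def option_preserving_branch_is_optimal():
    open_branch = [random.random() for _ in range(3)]   # outcomes still reachable
    closed_branch = random.random()                     # the single locked-in outcome
    return max(open_branch) > closed_branch

trials = 100_000
frac = sum(option_preserving_branch_is_optimal() for _ in range(trials)) / trials
print(frac)  # ~0.75 under this (arbitrarily chosen) measure
```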
I think we can interpret it as a burden-shifting argument: "Look, given the orthogonality thesis and instrumental convergence, and various other premises, and given the enormous stakes, you'd better have some pretty solid arguments that everything's going to be fine in order to disagree with the conclusion of this book (which is that AI safety is extremely important)." As far as I know no one has come up with any such arguments, and in fact it's now the consensus in the field that no one has.
I suppose I disagree that at least the ...
I think the purpose of the OT and ICT is to establish that lots of AI safety needs to be done. I think they are successful in this. Then you come along and give your analogy to other cases (rockets, vaccines) and argue that lots of AI safety will in fact be done, enough that we don't need to worry about it. I interpret that as an attempt to meet the burden, rather than as an argument that the burden doesn't need to be met.
But maybe this is a merely verbal dispute now. I do agree that OT and ICT by themselves, without any further premises like "...
for example, the "Universal prior is malign" stuff shows that in the limit GPT-N would likely be catastrophic,
If you have a chance, I'd be interested in your line of thought here.
My initial model of GPT-3, and probably the model of the OP, is basically: GPT-3 is good at producing text that it would be unsurprising to find on the internet. If we keep training up larger and larger models, using larger and larger datasets, it will produce text that it would be less and less surprising to find on the internet. Insofar as there are safety concerns, th...
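(One way to make the "unsurprising" framing a bit more precise, and this is my gloss rather than anything the OP committed to: the training objective just minimises the model's average surprisal on internet text,

$$
\mathcal{L}(\theta) = -\,\mathbb{E}_{x \sim \text{internet text}}\Big[\textstyle\sum_{t} \log p_\theta(x_t \mid x_{<t})\Big],
$$

so on this picture, scaling up models and data is just pushing that quantity lower, i.e. making the outputs harder and harder to distinguish from text you might actually find.)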
Hm, I’d probably disagree.
A couple thoughts here:
First: To me, it seems one important characteristic of “planners” is that they can improve their decisions/behavior even without doing additional learning. For example, if I’m playing chess, there might be some move that (based on my previous learning) initially presents itself as the obvious one to make. But I can sit there and keep running mental simulations of different games I haven’t yet played (“What would happen if I moved that piece there…?”) and arrive at better and better decisions.
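(Here's a minimal sketch of what I mean by improving decisions without additional learning, with a made-up toy game rather than chess: the value estimates are frozen, and the only thing that changes is how much simulation we do before committing to a move.)

```python
# Toy illustration: a fixed, previously "learned" heuristic plus a variable
# amount of lookahead. No weights are updated anywhere; deeper simulation
# alone changes (and improves) the decision.

TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
PAYOFF = {"a1": 1, "a2": 2, "b1": 0, "b2": 10}      # true leaf values
HEURISTIC = {"a": 3, "b": 1}                        # frozen estimates; misjudge branch b

def value(node, depth):
    """Estimate a node's value, searching `depth` plies before falling
    back on the frozen heuristic (or the known leaf payoff)."""
    if node in PAYOFF:
        return PAYOFF[node]
    if depth == 0:
        return HEURISTIC[node]
    return max(value(child, depth - 1) for child in TREE[node])

def choose(depth):
    return max(TREE["root"], key=lambda child: value(child, depth))

print(choose(0))  # 'a': the "obvious" move, straight off the frozen heuristic
print(choose(1))  # 'b': same heuristic, more simulation, better decision
```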
It doesn’t seem ...