Steelmanning AI risk critiques

Stuart_Armstrong

At some point soon, I'm going to attempt to steelman the position of those who reject the AI risk thesis, to see if it can be made solid. Here, I'm just asking if people can link to the most convincing arguments they've found against AI risk.

EDIT: Thanks for all the contributions! Keep them coming...

I mildly disagree. A certain amount of AI safety specifically involves trying to extend our available tools for dealing with Problem One to the situations that we expect to arise when we deal with powerful learning agents. Goal-system stability, for instance, is a matter of program verification, which is why all the papers about it deal with mathematical logic.

I haven't read any technical papers on goal-system stability; isn't it the case that real-world attempts at that are going to have at least as much of Problem Two as of Problem One about them? ("Internally" -- in the notion of what counts as self-improvement -- if not "externally" in whatever problem(s) the system is trying to solve.) I haven't thought (or read) enough about this for my opinion to have much weight; I could well be completely wrong about it.

Regardless, you're certainly right that Problem One is going to be important as well as Problem Two, and I should have said something like "AI safety is also an instance of Problem Two".
