I feel like we can already point powerful cognition at certain things pretty well (e.g. chess), and the problem is figuring out how to point AIs at more useful things (e.g. honestly answering hard questions well). So I don't know if I'm being nit-picky, but I think the problem is not pointing powerful cognition at anything at all, but rather pointing it at whatever we want.
It's not clear here, but if you read the linked post it's spelled out (the two are complementary, really). The thesis is that it's easy to make a narrow AI that knows only about chess, but very hard to make an AGI that knows the world and can operate in a variety of situations, yet only cares about chess in a consistent way.
I think this is correct at least with current AI paradigms, and it has both some reassuring and some depressing implications.
Good paper! Thank you for sharing. I have a few nit-picky suggestions about wording and grammar. I'll put them here rather than email directly because some of them are subjective. That way others can feel free to chime in if they feel inclined to nit-pick my nit-picks :)
"artificial general intelligence (AGI) may surpass" -> "artificial general intelligence (AGI) seems likely to surpass" ("may" feels like a somewhat weak word in this context, but I don't feel strongly here.)
"undesirable (in other words, misaligned)" -> "undesira...
This is hilarious and beautiful and exactly what I expect from LessWrong. Also, hello fellow Simon.