Juno_Watt comments on Muehlhauser-Wang Dialogue - Less Wrong
If I understand correctly, Wang's theses are that the normal research path will produce safe AI, because it won't blow up out of our control or generally behave like a Yudkowsky/Bostrom-style AI; and that trying to prove friendliness in advance is futile (and yet AI is still a good idea), because an AI will have to have "adaptive" goals, which for some reason must extend to its terminal goals.
He needs to taboo "adaptive", read and understand Bostrom's work on AI behaviour, comprehend the superpowerful-optimizer view, and then explain exactly why an AI cannot have a fixed goal architecture.
If AIs can't have a fixed goal architecture, Wang needs to show that AIs with unpredictable goals are somehow safe, or start speaking out against AI.
So what sort of inconvenient world would it take for Wang's major conclusions to be correct?
I don't know, I'm not good enough at this steel-man thing, and my wife is sending me to bed.
The reason would be that the goal stability problem is currently unsolved.