TheAncientGeek comments on Debunking Fallacies in the Theory of AI Motivation - LessWrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Stepping in as an interlocutor: while I agree that "all-powerful" is poor terminology, I think the described power is likely within reach of an AGI. One feature an AGI is nearly certain to have is superhuman processing power; this allows it to run large numbers of Monte Carlo simulations, which it could use to predict human responses, especially if there is a Bayesian calibrating mechanism. An above-human ability to predict human responses is an essential component of near-perfect social engineering. I don't see this as an outrageous, magic-seeming power. Such an AGI could theoretically have the power to convince humans to adopt any desired response. I believe your paper maintains that an AGI wouldn't use this power, not that such a power is outrageous.
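To make the "Monte Carlo simulations plus a Bayesian calibrating mechanism" idea concrete, here is a toy sketch of my own (not from the paper): a Beta-Bernoulli model of whether a human will comply with a request, which calibrates on observed outcomes and then estimates the compliance probability by Monte Carlo sampling. All names here are illustrative inventions.

```python
import random

class ResponsePredictor:
    """Toy predictor of a binary human response (comply / refuse)."""

    def __init__(self, alpha=1.0, beta=1.0):
        # Beta(alpha, beta) prior over the probability of compliance.
        self.alpha = alpha
        self.beta = beta

    def update(self, complied: bool):
        # Bayesian calibration: conjugate update from one observed outcome.
        if complied:
            self.alpha += 1
        else:
            self.beta += 1

    def predict(self, n_sims=10_000, seed=0):
        # Monte Carlo estimate of P(comply): draw p from the posterior,
        # then simulate one response per draw and average.
        rng = random.Random(seed)
        hits = 0
        for _ in range(n_sims):
            p = rng.betavariate(self.alpha, self.beta)
            hits += rng.random() < p
        return hits / n_sims

predictor = ResponsePredictor()
for outcome in [True, True, False, True]:   # four observed interactions
    predictor.update(outcome)
print(predictor.predict())                  # close to the posterior mean, 4/6
```

A human forecaster does something like this implicitly; the point in the comment above is only that an AGI could do it at vastly greater scale and speed.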
My personal feeling toward this article is that it sounds suspiciously close to a "No true Scotsman" argument: "No true (designed with friendly intentions) AI would submit to these catastrophic tendencies." While your arguments are persuasive, I wonder: if a catastrophe did occur, would you dismiss it as the work of "not a true AI"? By way of disclaimer, my strengths are in philosophy and mathematics, and decidedly not computer science. I hope you have time to reply anyway.
Loosemore's claim could be steelmanned into the claim that the Maverick Nanny isn't likely: it requires an AI with goals, with hardcoded goals, with hardcoded goals that include a full explicit definition of happiness, and with a buggy full explicit definition of happiness. That's a chain of premises.
That isn't even remotely what the paper said. It's a parody.
Since it is a steelman, it isn't supposed to be what the paper is saying.
Are you maintaining, in contrast, that the Maverick Nanny is flatly impossible?
Sorry, I may have been confused about what you were trying to say, both because you were responding to someone else and because I hadn't come across the term "steelman" before.
I withdraw 'parody' (sorry!) but ... it isn't quite what the logical structure of the paper was supposed to be.
It feels like you steelmanned it onto some other railroad track, so to speak.