CarlShulman comments on Safety Culture and the Marginal Effect of a Dollar - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
It worries me a tad that nobody in the discussion group corrected what I consider to be the obvious basic inaccuracy of the model.
Success on FAI is not a magical result of a researcher caring about safety. The researcher who would have otherwise first created AGI does not gain the power to create FAI just by being concerned about it. They would have to develop a stably self-improving AI which learned an understandable goal system which actually did what they wanted. This could be a completely different set of design technologies than what would have gone into something unstable that improved itself by ad-hoc methods well enough to go FOOM and end the game. The researcher who would have otherwise created AGI might not be good enough to do this. The best you might be able to convince them to do would be to retire from the game. It's a lot harder to convince someone to abandon the incredibly good idea they're enthusiastic about, and start over from scratch or leave the game, than to persuade people to be "concerned about safety", which is really cheap (you just put on a look of grave concern).
If I thought all you had to do to win was to convince the otherwise-first creator of AGI to "take safety seriously", this problem would be tremendously easier and I would be approaching it in a very different way. I'd be putting practically all of my efforts into PR and academia, not trying to assemble a team to solve basic FAI problems over however-many years and then afterward build FAI. A free win just for convincing someone to take something seriously? Hot damn, that'd be one easy planet to save; there'd be no point in pursuing any other avenue until you'd totally exhausted that one.
As it stands, though, you're faced with (a) the much harder sell of convincing AGI people that they will destroy the world and that being concerned is not enough to save them, that they have to tackle much harder problems than they wanted to face on a problem that already seems to them hard enough; and (b) even if you do convince the AGI person who otherwise would've destroyed the world to join the good guys on a different problem, or to retire, you don't win. The game isn't won there. It's just a question of how long it takes the next AGI person in line to destroy the world. Convinced them too? Number three. You keep dealing through the deck until you turn up the ace of spades, unless the people working on the ace of hearts can solve their more difficult problem before that happens.
All academic persuasion does, then, is buy time, and not very much of it; the return on effort invested seems to be pretty low.
Safety-speed tradeoffs, the systematic bias introduced by modeling "one randomly selected researcher," and the relative difficulty of AGI versus FAI were discussed at the time.