JGWeissman comments on What I would like the SIAI to publish - Less Wrong
Comments (218)
Such an AI would still be motivated to FOOM to consolidate its future ability to achieve large utility against the threat of being deactivated before then.
It doesn't know about any threat. You implicitly assume that it has something equivalent to fear, that it perceives threats. You allow for the human ingenuity to implement this, and yet you believe that humans are unable to limit its scope. I just don't see that it would be easy to make an AI that would go FOOM, because it doesn't care to go FOOM. If you tell it to optimize some process then you'll have to tell it what optimization means. If you can specify all that, how is it then still likely that it somehow comes up with its own idea that optimization might mean consuming the universe, if you told it to optimize its software running on a certain supercomputer? Why would it do that, where does the incentive come from? If I tell a human to optimize, he might muse about turning the planets into computronium, but if I tell an AI to optimize, it doesn't know what that means until I tell it what it means, and then it still won't care, because it isn't equipped with all the evolutionary baggage that humans are equipped with.
It is a general intelligence that we are considering. It can deduce the threat better than we can.
Because it is a general intelligence. It is smart. It is not limited to getting its ideas from you, it can come up with its own. And if the AI has been given the task of optimising its software for performance on a certain computer then it will do whatever it can to do that. This means harnessing external resources to do research on computation theory.
No he doesn't. He assumes only that it is a general intelligence with an objective. Potentially negative consequences are just part of possible universes that it models like everything else.
I'm not sure what can be done to make this clear:
SELF IMPROVEMENT IS AN INSTRUMENTAL GOAL THAT IS USEFUL FOR ACHIEVING MOST TERMINAL VALUES.
You have this approximately backwards. A human knows that if you tell her to create 10 paperclips every day, you don't mean that she should take over the world so she can be sure that nobody will interfere with her steady production of paperclips in the future. The AI doesn't.
It has the ability to model and to investigate hypothetical possibilities that might negatively impact the utility function it is optimizing. If it doesn't, it is far below human intelligence and is non-threatening for the same reason a narrow AI is non-threatening (but it isn't very useful either).
The difficulty of detecting these threats is spread out around the range of difficulties the AI is capable of handling, so it can infer that there are probably more threats which it could only detect if it were smarter. Therefore, making itself smarter will enable it to detect more threats and thereby increase utility.
To be able to optimize, it will have to know what it is supposed to optimize. You have to carefully specify what its output (utility function) is supposed to be, or it won't be able to tell how good it is at optimizing. If you just tell it to produce paperclips, it won't be able to self-improve because it doesn't know what paperclips look like etc.; therefore it cannot judge its own success, or that extreme heat would be a negative impact given paperclips made out of plastic. You further assume that it has a detailed incentive, that it is given a detailed pathway that tells it to look for threats and eliminate them.
If it doesn't, it is what most researchers are working on: an intelligence with the potential to learn and make use of what it learnt, with the potential to become intelligent (educated). I'm getting the impression that people here assume that researchers are not working on an AGI but trying to hardcode a FOOM machine. If FOOM is simply part of your definition, then there's no arguing against it going FOOM. But what researchers like Goertzel are working on are systems with the potential to reach human-level intelligence; that does not mean that they will by definition jailbreak their nursery school. I never tried to argue against the possibility, though, only that there are many pathways where this won't happen, in contrast to the way it is portrayed by the SIAI: that any implementation of AGI will most likely consume humanity.
The sorts of intelligences you are talking about are narrow AIs, not general intelligences. If you told a general intelligence to produce paperclips but it didn't know what a paperclip was, then its first subgoal would be to find out. The sort of mind that would give up on a minor obstacle like that wouldn't foom, but it wouldn't be much of an AGI either.
And yes, most researchers today are working on narrow AIs, not on AGI. That means they're less likely to successfully make a general intelligence, but it has no bearing on the question of what will happen if they do make one.