PeterisP comments on What I would like the SIAI to publish - Less Wrong

Post author: XiXiDu | 01 November 2010 02:07PM | 27 points

Comment author: XiXiDu | 03 November 2010 05:46:19PM | 2 points

Yes, I agree with everything. I'm not trying to argue that there is no considerable risk. I'm just trying to identify some antipredictions against AI going FOOM that should be incorporated into any risk estimate, since they might weaken the estimated risk posed by AGI or increase the estimated risk posed by impeding AGI research.

I was insufficiently clear: what I want to argue about is the claim that virtually all pathways lead to destructive results. I have an insufficient understanding of why the concept of general intelligence is inevitably connected with dangerous self-improvement. Learning is self-improvement in a sense, but I do not see how this must imply unbounded improvement in most cases, given any goal whatsoever.

One argument is that the only general intelligence we know, humans, would want to improve if they could tinker with their source code. But why is it so hard to make people learn, then? Why don't we see many more people interested in how to change their minds? I don't think you can draw any conclusions here. So we are back at the abstract concept of a constructed general intelligence (as I understand it right now): an intelligence with the potential to reach at least human standards (like a human toddler).

Another argument is based on this very difference between humans and AIs, namely that there is nothing to distract them: they will possess an autistic focus on one mandatory goal and follow up on it. But in my opinion that same difference implies that while nothing will distract them, there will also be no incentive not to halt. Why would an AI do more than necessary to reach a goal?

The further argument is that it will misunderstand its goals. The problem I see here is, firstly, that the less specific the goal, the less the AI is able to measure its self-improvement against that goal and so quantify the efficiency of its output. Secondly, the vaguer the goal, the larger its general knowledge has to be, prior to any self-improvement, to make sense of the goal in the first place. Shouldn't those two problems offset each other to some extent?

For example, suppose you told the AGI to become as good as possible at Formula 1, so that it drives faster than any human race driver. How is it that the AGI is smart enough to learn all this by itself, yet fails to notice that there are rules to follow? And why would it keep improving once it is faster than any human, rather than simply halt and become impassive? This argument extends to many other goals that have scope-bounded solutions.
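
To make the notion of a scope-bounded goal concrete, here is a minimal Python sketch; improve, lap_time, initial_policy and best_human_lap_time are hypothetical placeholders, not any real AGI design. The point is only that the loop carries an explicit stopping criterion:

```python
def train_driver(initial_policy, improve, lap_time, best_human_lap_time):
    """Improve a driving policy only until it beats the best human lap time."""
    policy = initial_policy
    # The "scope boundary": the loop terminates the moment the goal is met.
    while lap_time(policy) >= best_human_lap_time:
        policy = improve(policy)  # one self-improvement step
    return policy  # goal reached; nothing in the goal rewards further improvement
```

The disagreement in this thread is, in effect, over whether that while-condition gets written down at all: an open-ended objective such as "minimise lap time" has no such exit condition, and the loop never terminates on its own.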

Of course, if you told it to learn as much about the universe as possible, that would be something completely different. Yet I don't see how this risk ranks above other existential risks such as grey goo, since it should be easier to create advanced replicators that destroy the world directly than to create an AGI that creates advanced replicators, fails to halt, and then destroys the world.

Comment author: PeterisP | 05 November 2010 10:14:55AM | 5 points

" How is it that the AGI is yet smart enough to learn this all by itself but fails to notice that there are rules to follow" - because there is no reason for an AGI automagically creating arbitrary restrictions if they aren't part of the goal or superior to the goal. For example, I'm quite sure that F1 rules prohibit interfering with drivers during the game; but if somehow a silicon-reaction-speed AGI can't win F1 by default, then it may find it simpler/quicker to harm the opponents in one of the infinity ways that the F1 rules don't cover - say, getting some funds in financial arbitrage, buying out the other teams, and firing any good drivers or engineering a virus that halves the reaction speed of all homo-sapiens - and then it would be happy as the goal is achieved within the rules.

Comment author: XiXiDu | 05 November 2010 03:05:51PM | 1 point

...because there is no reason for an AGI to automagically create arbitrary restrictions if they aren't part of the goal or superior to it.

That's clear. But let me state again what I'd like to inquire about. Given the large number of restrictions that are inevitably part of any advanced general intelligence (AGI), isn't the nonhazardous subset of all possible outcomes much larger than the subset in which the AGI works perfectly yet fails to halt before it can wreak havoc?

Here is where this question stems from. Given my current knowledge about AGI, I believe that any AGI capable of dangerous self-improvement will be very sophisticated and include a lot of restrictions. For example, I believe that any self-improvement can only be as efficient as the specification of its output is detailed. If the AGI is built with the goal of producing paperclips, the design specification of what a paperclip is will be the yardstick by which to measure and quantify any improvement in the AGI's output. This means that to self-improve effectively up to a superhuman level, the design specification will have to be highly detailed and will by definition include sophisticated restrictions.

Therefore, to claim that any work on AGI will almost certainly lead to dangerous outcomes is to assert that any given AGI is likely to work perfectly well, subject to all restrictions except the one that makes it halt (its spatiotemporal scope boundaries). I'm unable to arrive at that conclusion, because I believe that most AGIs will fail at extensive self-improvement, since that is where failure is most likely: it is the largest and most complicated part of the AGI's design parameters. To put it bluntly: why is it more likely that contemporary AGI research will succeed at superhuman self-improvement (beyond learning) yet fail to limit the AGI, rather than vice versa? Given the larger number of parameters needed to self-improve in the first place, I find it more likely that most AGI research will result in incremental steps towards human-level intelligence than in one huge step towards superhuman intelligence that fails at its scope boundary rather than at self-improvement.
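
The "specification as yardstick" claim can likewise be made concrete with a small hill-climbing sketch (hypothetical names again: propose_variant, score_against, spec). Each self-improvement step is accepted only if the explicit spec scores it higher, so a vague spec leaves the optimiser almost nothing to climb on:

```python
def self_improve(seed_design, propose_variant, score_against, spec, steps=1000):
    """Greedy hill-climbing over candidate designs, scored against the spec."""
    best = seed_design
    best_score = score_against(best, spec)
    for _ in range(steps):
        candidate = propose_variant(best)
        candidate_score = score_against(candidate, spec)
        if candidate_score > best_score:  # progress is only visible through the spec
            best, best_score = candidate, candidate_score
    return best
```

On this picture, the detail of spec bounds how finely improvement can be measured, which is XiXiDu's reason for thinking that a spec rich enough to drive superhuman self-improvement will also carry many restrictions.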

Comment author: jimrandomh | 05 November 2010 03:24:23PM | 0 points

What you are envisioning is not an AGI at all, but a narrow AI. If you tell an AGI to make paperclips, but it doesn't know what a paperclip is, then it will go and find out, using whatever means it has available. It won't give up just because you weren't detailed enough in telling it what you wanted.

Comment author: XiXiDu | 05 November 2010 03:29:55PM | 2 points

Then I don't think that anyone is working on what you envision as 'AGI' right now. If a superhuman level of sophistication regarding the potential for self-improvement is already part of your definition, then there is no argument to be won or lost here regarding the risk assessment of AGI research. I do not believe this is reasonable, or that AGI researchers share your definition. I believe there is a wide range of artificial general intelligences that does not fit your definition yet still deserves the term.

Comment author: jimrandomh | 05 November 2010 04:14:26PM | 2 points

Who said anything about a superhuman level of sophistication? Human-level is enough. I'm reasonably certain that if I had the same advantages an AGI would have - that is, if I were converted into an emulation and given my own source code - then I could foom. And I think any reasonably skilled computer programmer could, too.

Comment author: red75 | 05 November 2010 05:02:56PM | 1 point

Debugging will be a PITA. Both ways.