Kaj_Sotala comments on Reply to Holden on 'Tool AI' - Less Wrong

Post author: Eliezer_Yudkowsky 12 June 2012 06:00PM


Comment author: Kaj_Sotala 12 June 2012 09:10:50PM 16 points

Software that does happen to interface with humans is selectively visible and salient to humans, especially the tiny part of the software that does the interfacing; but this is a special case of a general cost/benefit tradeoff which, more often than not, turns out to swing the other way, because human advice is either too costly or doesn't provide enough benefit.

I suspect this is the biggest counter-argument against Tool AI, even bigger than all the technical concerns Eliezer raised in the post. Even if we could build a safe Tool AI, somebody would soon build an agent AI anyway.

My five cents on the subject, from something that I'm currently writing:

As with external constraints, Oracle AI suffers from the problem that there would always be an incentive to create an AGI that could act on its own, without humans in the loop. Such an AGI would be far more effective in furthering whatever goals it had been built to pursue, but also far more dangerous.

Current-day narrow-AI technology includes high-frequency trading (HFT) algorithms, which make trading decisions within fractions of a second, far too fast to keep humans in the loop. HFT seeks to make a very short-term profit, but even traders looking for a longer-term investment benefit from being faster than their competitors. Market prices are also very effective at incorporating various sources of knowledge (Hanson 2007). As a consequence, a trading algorithm’s performance might be improved both by making it faster and by making it more capable of integrating various sources of knowledge. Since trading is also one of the fields with the most money involved, it seems like a reasonable presumption that most advances towards AGI will quickly be put to use in making more money on the financial markets, with little opportunity for a human to vet all the decisions. Oracle AIs are unlikely to remain pure oracles for long.

In general, any broad domain involving high stakes, adversarial decision-making, and a need to act rapidly is likely to become increasingly dominated by autonomous systems. The extent to which the systems will need general intelligence will depend on the domain, but many domains such as warfare, information security and fraud detection could plausibly make use of all the intelligence they can get. This is especially the case if one’s opponents in the domain are also using increasingly autonomous A(G)I, leading to an arms race where one might have little choice but to give increasing amounts of control to A(G)I systems.

From the same text, also related to Eliezer's points:

Even if humans were technically kept in the loop, they might not have the time, opportunity, or motivation to verify the advice given by an Oracle AI. This may be a danger even with more narrow-AI systems. Friedman & Kahn (1992) discuss this risk in the context of APACHE, a computer expert system that provides doctors with advice regarding treatments. They write that as the medical community starts to trust APACHE, it may become practice to act on APACHE’s recommendations somewhat automatically, and it may become increasingly difficult to challenge the “authority” of the recommendation. Eventually, the consultation system may in effect begin to dictate clinical decisions.

Likewise, Bostrom & Yudkowsky (2011) point out that modern bureaucrats often follow established procedures to the letter, rather than exercising their own judgment and allowing themselves to be blamed for any mistakes that follow. Dutifully following all the recommendations of an AGI system would be an even better way of avoiding blame.

Thus, even AGI systems that function purely to provide advice will need to be explicitly designed as safe in the sense of not providing advice that would go against human values (Wallach & Allen 2009). This requires a way of teaching them the correct values.

Comment author: torekp 15 June 2012 02:05:42AM 0 points

I suspect this is the biggest counter-argument against Tool AI, even bigger than all the technical concerns Eliezer raised in the post. Even if we could build a safe Tool AI, somebody would soon build an agent AI anyway.

Thank you for saying this (and backing it up better than I would have). I think we should concede, however, that a similar threat applies to FAI. The arms-race phenomenon may create uFAI before FAI can be ready. This strikes me as very probable. Alternatively, if AI does not "foom", uFAI might be created after FAI. (I'm mostly persuaded that it will foom, but I still think it's useful to map the debate.) The one advantage is that if Friendly Agent AI comes first and fooms, the threat is neutralized; whereas Friendly Tool AI can only advise us how to stop reckless AI researchers. If reckless agent AIs act more rapidly than we can respond, the Tool AI won't save us.

Comment author: Vladimir_Nesov 03 July 2012 09:53:31AM 1 point

Alternatively, if AI does not "foom", uFAI might be created after FAI.

If uFAI doesn't "foom" either, they both get a good chunk of expected utility. FAI doesn't need any particular capability; it only has to be competitive with other possible things.