Kaj_Sotala comments on The Brain as a Universal Learning Machine - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Autonomy? Arguably that's Greek...
There is clearly a demand for agentive AI, in a sense, because people are already using agents to do their bidding, to achieve specific goals. Those qualifications are important because they distinguish a limited kind of AI, that people would want, from a more powerful kind, that they would not.
The idea of AI as "benevolent" dictator is not appealing to democratically minded types, who tend to suspect a slippery slope from benevolence to malevolence, and it is not appealing to a dictator to have a superhuman rival...so who is motivated to build one?
Yudkowsky seems to think that there is a moral imperative to put an AI in charge of the world, because it would create billions of extra happy human lives, and not creating those lives is the equivalent of mass murder. That is a very unintuitive piece of reasoning, and it therefore cannot stand as a prediction of what AIs will be built, since it does not stand as a prediction about how people will reason morally.
The option of achieving safety by aiming lower...the technique that leads us to have speed limits, rather than struggling to make the fastest possible car safe...is still available.
The God AI concept is related to another favourite MIRI theme, the need to instil the whole of human value into an AI, something MIRI admits would be very difficult.
MIRI makes the methodological proposal that dealing with the whole of human value, rather than identifying a morally relevant subset, simplifies the issue of friendliness or morality or safety. Having done that, it concludes that human morality is extremely complex. In other words, the promised methodological simplification never arrives, for all that MIRI relieves itself of the burden of coming up with a theory of morality. Since dealing with human value in total is, in absolute terms, very complex, the possibility remains open that identifying the morally relevant subset of values is easier (even if still difficult in absolute terms) than designing an AI to be friendly in terms of the totality of value, particularly since philosophy offers a body of work that seeks to identify simple underlying principles of ethics.
Not only are some human values more morally relevant than others, some human values are what make humans dangerous to other humans, bordering on existential threat. I would rather not have superintelligent AIs with paranoia, supreme ambition, or tribal loyalty to other AIs in their value system.
So there are good reasons for thinking that installing subsets of human value would be both easier and safer.
Altruism, in particular, is not needed for a limited agentive AI. Such AIs would perform specialised tasks, leaving it to humans to stitch the results into something that fulfils their values. We don't want a Google car that takes us where it guesses we want to go.
From section 5.1.1. of Responses to Catastrophic AGI Risk:
The weaponisation of AI has indeed already begun, so it is not a danger that needs pointing out. It suits the military to give drones, and so forth, greater autonomy, but it also suits the military to retain overall control...they are not going to build a God AI that is also a weapon, since there is no military mileage in building a weapon that might attack you out of its own volition. So weaponised AI is limited agentive AI. Since the military want to retain overall control, they will in effect conduct their own safety research, increasing the controllability of their systems in parallel with their increasing autonomy. MIRI's research is not very relevant to weaponised AI, because MIRI focuses on the hidden dangers of apparently benevolent AI, and on god AIs, powerful singletons.
You may be tacitly assuming that an AI is either passive, like Oracle AI, or dangerously agentive. But we already have agentive AIs that haven't killed us.
I am making a three-way distinction between:
1. Non-agentive AI
2. Limited agentive AI
3. Maximally agentive AI, or "God" AI
Non-agentive AI is passive, doing nothing once it has finished processing its current request. It is typified by Oracle AI. Limited agentive AI performs specific functions, and operates under effective overrides and safety protocols. (For instance, whilst it would destroy the effectiveness of automated trading software to have a human okaying each trade, it nonetheless has kill switches and sanity checks.) Both are examples of Tool AI. Tool AI can be used to do dangerous things, but the responsibility ultimately falls on the tool user. Maximally agentive AI is not passive by default, and has a wide range of capabilities. It may be in charge of other AIs, or have effectors that allow it to take real world actions directly. Attempts may have been made to add safety features, but their effectiveness would be in doubt...that is just the hard problem of AI friendliness that MIRI writes so much about.
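The kill-switch-plus-sanity-check pattern for limited agentive AI can be sketched in a few lines. This is a hypothetical illustration only (the class, method names, and limits are all invented for the example, not taken from any real trading system): the agent acts autonomously within a bounded envelope, while a human-operated override can halt it entirely.

```python
import threading

class LimitedAgent:
    """Hypothetical sketch of a 'limited agentive AI': autonomous within
    a narrow task, but subject to human-controlled overrides."""

    def __init__(self, max_order_size=1000):
        self.max_order_size = max_order_size  # sanity-check bound
        self._killed = threading.Event()      # human-operated kill switch

    def kill(self):
        """Override: halt all further autonomous action."""
        self._killed.set()

    def place_order(self, size):
        # Kill switch: refuse all action once halted.
        if self._killed.is_set():
            return "halted"
        # Sanity check: reject orders outside the permitted envelope,
        # without a human approving each individual trade.
        if size <= 0 or size > self.max_order_size:
            return "rejected"
        # Autonomous action happens only inside these constraints.
        return "executed"

agent = LimitedAgent(max_order_size=1000)
print(agent.place_order(500))    # executed
print(agent.place_order(5000))   # rejected
agent.kill()
print(agent.place_order(500))    # halted
```

The point of the design is that autonomy (no per-trade human approval) and control (the kill switch, the bounded envelope) are not mutually exclusive.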
The contrary view is that there is no need to render God AIs safe technologically, because there is no incentive to build them. (Which does not mean the whole field of AI safety is pointless.)
ETA
On the other hand you may be distinguishing between limited and maximal agency, but arguing that there is a slippery slope leading from the one to the other. The political analogy shows that people are capable of putting a barrier across the slope: people are generally happy to give some power to some politicians, but resist moves to give all the power to one person.
On the other hand, people might be tempted to give AIs more power once they have a track record of reliability, but a track record of reliability is itself a kind of empirical safety proof.
There is a further argument to the effect that we are gradually giving more autonomy to agentive AIs (without moving entirely away from oracle AIs like Google), but that gradual increase is being paralleled by an incremental approach to AI safety, for instance in automated trading systems, which have been given both more ability to trade without detailed oversight, and more powerful overrides. Hypothetically, increased autonomy without increased safety measures would mean increased danger, but that is not the case in reality. I am not arguing against AI danger and safety measures overall; I am arguing against a grandiose, all-or-nothing conception of AI safety and danger.