XiXiDu comments on No Basic AI Drives - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (43)
The relevant question is not "will an AGI automatically undergo recursive self-improvement", but "how likely is it that at least one of the early AGIs undergoes recursive self-improvement". If one AGI ends up FOOMing and taking over the world, the fact that there were 999 others which didn't is relatively uninteresting.
Presuming that the AGI has been built by humans to solve problems, it has likely also been built to care about the time it takes to reach its goals.
That's true, and we need an organisation like the SIAI to take care of that issue. But I still perceive harsh overconfidence around here when it comes to issues related to risks from AI. It is not clear to me that dangerous recursive self-improvement is easier to achieve than friendliness.
To destroy is easier than to create. But destroying human values by means of unbounded recursive self-improvement seems to me to be one of the most complex existential risks.
The usual difference that is being highlighted around here is how easy it is to create simple goals versus complex goals, e.g. creating paperclips versus the protection of human values. But recursive self-improvement is a goal in and of itself. An artificial agent does not discern between a destination and the route to reach it; both have to be defined in terms of the AI's optimization parameters. Recursive self-improvement doesn't just happen; it is something very complex that needs to be explicitly defined.
So how likely is it? You would need an AGI that is, in my opinion, explicitly designed to be capable of unbounded and uncontrollable recursive self-improvement. There would need to be internal causes that prompt it to keep going in the face of countless undefined challenges.
Something that could take over the world seems to me to be the endpoint of a very long and slow route towards a thorough understanding of many different fields, not something one could stumble upon early on and by accident.
The conservative assumption is that AGI is easy, and FAI is hard.
I don't know if this is actually true. I think FAI is harder than AGI, but I'm very much not a specialist in the area - either area. However, I do know that I'd very much rather overshoot the required safety margin by a mile than undershoot by a meter.
"FAI" here generally means "Friendly AGI", which would make "FAI is harder than AGI" trivially true.
Perhaps you meant one of the following more interesting propositions:
(Assuming even the sub-problem of Friendliness still has part or all of AGI as a prerequisite, the latter proposition implies "Friendliness isn't so easy relative to AGI that progress on Friendliness will lag only insignificantly behind progress on AGI.")