Baughn comments on No Basic AI Drives - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (43)
That's true, and we need an organisation like the SIAI to take care of that issue. But I still have a perception of harsh overconfidence around here when it comes to issues related to risks from AI. It is not clear to me that dangerous recursive self-improvement is easier to achieve than friendliness.
To destroy is easier than to create. But destroying human values by means of unbounded recursive self-improvement seems to me to be one of the most complex existential risks.
The usual difference that is being highlighted around here is how easy it is to create simple goals versus complex goals, e.g. creating paperclips versus the protection of human values. But recursive self-improvement is a goal in and of itself. An artificial agent does not discern between a destination and the route to reach it; it has to be defined in terms of the AI's optimization parameters. It doesn't just happen; it is something very complex that needs to be explicitly defined.
So how likely is it? You need an AGI that is, in my opinion, explicitly defined and capable of unbounded and uncontrollable recursive self-improvement. There need to be internal causations that prompt it to keep going in the face of countless undefined challenges.
Something that could take over the world seems to me to be the endpoint of a very long and slow route towards a thorough understanding of many different fields, nothing that one could stumble upon early on and by accident.
The conservative assumption is that AGI is easy, and FAI is hard.
I don't know if this is actually true. I think FAI is harder than AGI, but I'm very much not a specialist in the area - either area. However, I do know that I'd very much rather overshoot the required safety margin by a mile than undershoot by a meter.
"FAI" here generally means "Friendly AGI", which would make "FAI is harder than AGI" trivially true.
Perhaps you meant one of the following more interesting propositions:
(Assuming even the sub-problem of Friendliness still has part or all of AGI as a prerequisite, the latter proposition implies "Friendliness isn't so easy relative to AGI that progress on Friendliness will lag insignificantly behind progress on AGI.")