Random832 comments on Muehlhauser-Wang Dialogue - Less Wrong
"Escape the testing environment" is poorly defined. Some people read it as "deduce the exploitable vulnerabilities in the system, hack into it, run itself with higher privileges, and somehow transmit itself to other machines / the internet at large / into people's brains, Snow Crash-style", while others read it as "convince the people running the test to give it more resources (and maybe infect their brains, Snow Crash-style)".
The former can be prevented by having a secure (air-gapped?) system; the latter can be prevented by not running tests interactively, setting aside the moral issues with terminating (or suspending) what may well be an intelligent 'person'.
The scenario also implicitly assumes that the AI's ability to improve its own intelligence (and thereby gain the ability to do either of the above) is unbounded by the resources of the system and carries no cost in increased processing time.
I will note that the AI box experiment's conditions expressly forbid a secure environment (i.e. one with inspection tools that cannot be manipulated by the AI):
That's because a secure environment isn't the part of the AI safety question the AI box experiment is designed to test; for the purpose of the experiment it says, in effect, "sure, you might catch the AI in a lie, but assuming you don't--"