Would building a goal of staying in the box into an AI help keep it boxed?
How do you define "staying in the box"? Whatever definition you use, the AI will likely find a way to get out of the box while still satisfying your definition.
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.