I believe that even through a text-only terminal, a superintelligence could do anything to a human: persuade the human to let it out, inflict extreme pleasure or suffering with a word. However, can't you just... limit the output of the superintelligence? Just make it so that the human can say anything, but the AI can only choose from a short list of responses, like "Yes," "No," "Maybe," "IDK," "The first option," "The second option," or some similarly restricted set. I... don't see how that could still present a risk. What about you?
I'm not sure that AI boxing is a live debate anymore. People are lining up to give full web access to current implementations whose capabilities are limited but unknown, and there's not much reason to believe there will be any attempt at constraining the use or reach of more advanced versions.