Scenario: A human user asks GPT-N a plain text question through the usual browser textbox interface. GPT-N answers in the form of text, which contains clever JavaScript code. Instead of being displayed, the malicious code triggers a browser exploit and gets executed. Now GPT-N has an executable of its choice running on the client side and can proceed from there. It has left its AI box through nothing but a text interface, without asking the user to let it out, and possibly without the user even noticing. Perhaps GPT-3 has done this already, while "deliberately" pretending to be stupider than it is?
There are lots of low-cost ways to prevent this, perhaps already implemented (I don't use GPT-3 or I'd verify). Humans have been doing this to each other for a while, so we have a lot of practice defending against it.
https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
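To make "low cost" concrete, here is a minimal sketch of the simplest defense on that cheat sheet: output encoding before the model's text is placed in the page. The function names `render_model_output` and `wrap_in_page` and the `answer` CSS class are made up for illustration; nothing here is claimed to be what any GPT front end actually does. It only covers text inserted into an HTML element body, not attributes, URLs, or script contexts, and it does nothing against a bug in the browser's own rendering code.

```python
import html


def render_model_output(text: str) -> str:
    """Encode model output so the browser displays it instead of parsing it as markup.

    Hypothetical helper: html.escape replaces &, < and >, and with quote=True
    also replaces double and single quotes with HTML entities.
    """
    return html.escape(text, quote=True)


def wrap_in_page(text: str) -> str:
    """Place the encoded output inside an element body (the only context covered here)."""
    return f'<div class="answer">{render_model_output(text)}</div>'


if __name__ == "__main__":
    # A script tag in the model's answer is shown as literal text, not executed.
    print(wrap_in_page("<script>alert('out of the box')</script>"))
```

With encoding like this, a `<script>` tag in the answer reaches the browser as inert text. The cheat sheet lists further layers (context-specific encoding, sanitizing any HTML that is allowed through, Content Security Policy) that a real deployment would stack on top.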
I agree with the cat and mouse metaphor and that we should assume an AI to be hyper competent.
At the same time, it will be restricted to operating within the constraints of the systems it can influence. My main point, which I admit was poorly made, is that cross-site scripting attacks can be covered with a small investment, which largely eliminates clever JavaScript as a possible attack vector. I would place lower probability on this being the way an AI escapes.
I would place higher probability on an AI exploiting a memory buffering type error similar to the on...