[LINK] AI-boxing Is News, Somehow

MugaSofer

The original article is somewhat cautious:

While it is unlikely that long-term and secure confinement of AI is possible, we are hopeful that the proposed protocol will give researchers a little more time to find a permanent and satisfactory solution for addressing existential risks associated with appearance of superintelligent machines

but not nearly cautious enough:

All supervision, maintenance and interactions with the AI need to be done only via unintelligent programs, which have no option to release the AI.

as if an unintelligent program were not gameable.

Their approach to AI safety:

Asking the AI to answer open ended questions opens up the possibilities for social engineering attacks as well as numerous other dangerous consequences. Our proposed protocol is based on the idea of asking what we will call ‘safe questions’ with restricted answers. We define a safe question as one for which a human being could find an answer without any help from superintelligence, all on his own merit. Why is this useful? Suppose that a scientist working on a cure for cancer is looking at two potential cures. Each one could be developed and tested without assistance from the AI, but would take about 3 years of hard work each. Which one to try first if the scientist believes that they have about an equal chance of working? If a wrong option is chosen to be tested first, humanity will still obtain a cure for cancer but it will happen 3 years later. What if we could ask the AI to suggest which option to try first?

This question is safe for a number of reasons. First, each potential answer has an equal probability of being correct. Second, a human being could answer it without assistance from the AI and so asking AI for help will only speed up progress but in no way change the outcome. It is essentially equal to being very lucky multiple choice guesser. Finally, the answer to this question could be encoded in a single bit making it hard to hide additional messages in the provided answer.

To make sure that a potential question is safe, a group of experts could be used to review it.

We believe our communication protocol to be verifiably secure against information leaking even if the AI had full access to the description of the security protocol. If the protocol is correctly implemented, humanity will essentially have access to a superintelligent oracle, which might be extremely beneficial and safer than an unconstrained AI.

I fail to see how a question satisfying "only safe questions with two possible answers of even likelihood which are independently computable by people should be submitted to the AI" can even be constructed for a non-trivial case, or how such a rule can stop a determined superintelligence from making people want to relax the constraints bit by bit.
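To make the quoted criteria concrete, here is a minimal Python sketch of what such an "unintelligent" gatekeeper might look like. Everything in it is my own assumption rather than anything from the article: the names `SafeQuestion`, `is_safe` and `ask`, the `tolerance` parameter, and the idea that the expert review reduces to two recorded judgments (`prior_a` and `human_solvable`).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SafeQuestion:
    """Hypothetical record of a reviewed two-option question."""
    text: str
    option_a: str
    option_b: str
    prior_a: float        # reviewers' estimate that option_a is the right one
    human_solvable: bool  # reviewers judge humans could answer it unaided


def is_safe(q: SafeQuestion, tolerance: float = 0.05) -> bool:
    """Apply the quoted criteria: human-answerable and roughly even odds."""
    return q.human_solvable and abs(q.prior_a - 0.5) <= tolerance


def ask(q: SafeQuestion, oracle: Callable[[SafeQuestion], object]) -> str:
    """Forward a reviewed question and accept only a single-bit reply."""
    if not is_safe(q):
        raise ValueError("question rejected by review")
    answer = oracle(q)
    # Anything other than the integers 0 or 1 is discarded unread.
    if type(answer) is not int or answer not in (0, 1):
        raise ValueError("answer is not a single bit")
    return q.option_a if answer == 0 else q.option_b


question = SafeQuestion(
    text="Which cure should be tested first?",
    option_a="cure A",
    option_b="cure B",
    prior_a=0.5,
    human_solvable=True,
)
print(ask(question, lambda q: 1))
```

Note what the sketch does not check: where the reviewers' numbers come from, or what a long sequence of individually acceptable one-bit answers adds up to. The filter only ever sees one question at a time.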
