Stuart_Armstrong comments on Reduced impact AI: no back channels - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (41)
S is a channel for P' to trick or brick P. Options include:
P' should spend the next 49 years fooming as hard as possible with no concern for E'(U|a), 1 year implementing its anti-P method and then the next eternity optimizing E'(U|a); altering the length of counterfactual time between P' activating and P observing it merely changes the amount of time the universe spends as computronium slaved to plotting against P.
The output channel is intrinsically unsafe, and we have to handle it with care. It doesn't need to do anything subtle with it: it could just take over in the traditional way. This approach does not make the output channel safe, it means that the output channel is the only unsafe part of the system.