Comment author:Sune
20 July 2015 01:08:04PM
0 points
Couldn't you just send one bit X (1 means on, 0 means off), which is most likely 1 but could flip to 0 due to noise, and define the utility u* in the same way as for corrigibility? That is,
u*(A_1, 0, A_2) = u(A_1, 0, A_2)
u*(A_1, 1, A_2) = u(A_1, 1, A_2) + E_{A_2'}[u(A_1, 0, A_2')] - E_{A_2'}[u(A_1, 1, A_2')]
Here A_1 denotes what happens in the world before the signal is sent, and A_2 what happens afterwards.

This way you only use 1 bit rather than 100, and there is no longer a contribution of 2^{-100} from the case where a thermodynamic miracle turns the on-signal into the off-signal (and you don't have to worry about the distribution of the signal given a thermodynamic miracle). The oracle will optimize u given that X=0 until X is revealed. When X is revealed, we will most likely have X=1, and the oracle will then optimize u given X=1 (if it is still running).

Does the above idea achieve something more?
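To make the indifference property concrete, here is a minimal numerical sketch of the u* construction above. The base utility u, the pre-signal history, and the set of post-signal outcomes are all hypothetical toy choices; the point is only that the compensation term makes the agent's expected utility the same whether X turns out to be 0 or 1.

```python
# Hypothetical base utility: depends on the pre-signal history a1,
# the signal bit x, and the post-signal history a2. Toy choice:
# the sign of the payoff flips with the signal bit.
def u(a1, x, a2):
    return a2 if x == 1 else -a2

# Toy distribution over post-signal histories A_2' (uniform over a
# small finite set), standing in for the agent's own expectations.
A2_OUTCOMES = [1.0, 2.0, 3.0]

def expected_u(a1, x):
    return sum(u(a1, x, a2) for a2 in A2_OUTCOMES) / len(A2_OUTCOMES)

# The corrected utility u* from the comment:
#   u*(A_1, 0, A_2) = u(A_1, 0, A_2)
#   u*(A_1, 1, A_2) = u(A_1, 1, A_2)
#                     + E[u(A_1, 0, A_2')] - E[u(A_1, 1, A_2')]
def u_star(a1, x, a2):
    if x == 0:
        return u(a1, 0, a2)
    return u(a1, 1, a2) + expected_u(a1, 0) - expected_u(a1, 1)

# Indifference check: before X is revealed, the agent's expected u*
# is identical whether the bit comes out 0 or 1, so it has no
# incentive to manipulate the signal.
a1 = "history"
e0 = sum(u_star(a1, 0, a2) for a2 in A2_OUTCOMES) / len(A2_OUTCOMES)
e1 = sum(u_star(a1, 1, a2) for a2 in A2_OUTCOMES) / len(A2_OUTCOMES)
print(abs(e0 - e1) < 1e-9)  # True
```

Note that the compensation term is a constant in A_2 (it only shifts the X=1 branch by the gap between the two conditional expectations), so it does not change which A_2 the agent prefers once X=1 is revealed; it only removes the ex-ante incentive to influence X.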