So I would only consider the formulation in terms of semimeasures satisfactory if the semimeasures are specific enough that the correct semimeasure, together with the observation sequence, determines everything that's happening in the environment.
Can you give an example of a situation in which that would not be the case? I think the semimeasure AIXI and the deterministic-programs AIXI are pretty much equivalent; am I overlooking something here?
If we're going to allow infinite episodic utilities, we'll need some way of comparing how big different nonconvergent series are.
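For a toy illustration of the difficulty (my own example; the overtaking-style criterion below is just one standard candidate ordering, not something proposed in this exchange): compare a reward stream paying 1 per step against one paying 2 per step.

```latex
% Naive summation cannot rank the two streams, since both diverge:
\[
U(A) = \sum_{k=1}^{\infty} 1 = \infty,
\qquad
U(B) = \sum_{k=1}^{\infty} 2 = \infty .
\]
% An overtaking-style criterion compares partial sums instead:
\[
A \prec B
\quad :\Longleftrightarrow \quad
\liminf_{n \to \infty} \sum_{k=1}^{n} \bigl( r_k^{B} - r_k^{A} \bigr) > 0 ,
\]
% which here evaluates to \liminf_{n} n = \infty > 0, so B is
% preferred even though both total utilities are "infinite".
```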
I think we need that even without infinite episodic utilities. I still think there might be possibilities involving surreal numbers, but I haven't found the time yet to develop this idea further.
Why?
Because otherwise we definitely end up with an unenumerable utility function, and every approximation will be unable to distinguish between infinitely many futures whose utilities differ by an infinite amount, I think. The set of all binary strings of infinite length is uncountable; how would we feed that into an enumerable/computable function? Your approach avoids this via the use of policies p and q, which are by definition computable.
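To spell the counting argument out (my own sketch of the types involved, not notation from the original post):

```latex
% A utility function over raw futures would need the type
\[
U : \{0,1\}^{\infty} \to \mathbb{R},
\]
% but \{0,1\}^{\infty} is uncountable, and a computable U can only
% read a finite prefix of its input in finite time. Restricting to
% computable policies replaces the domain by a countable one,
\[
U : \{\, (p, q) : p, q \ \textrm{finite programs} \,\} \to \mathbb{R},
\]
% since the infinite interaction history is fully determined by the
% finite pair (p, q).
```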
I think you are proposing to privilege some hypotheses at the beginning of Solomonoff induction, but not too strongly, because the uncertainty helps fight wireheading by providing knowledge about the existence of an idealized, "true" utility function and world model. Is that a correct summary? (Just trying to test whether I understand what you mean.)
In particular, they can make positive use of wireheading to reprogram themselves even if the basic architecture M doesn't allow it.
Can you explain this more?
They just do interpersonal comparisons; lots of their ideas generalize to intrapersonal comparisons though.
I recommend the book "Fair Division and Collective Welfare" by H. J. Moulin; it discusses some of these problems and several related ones.
True. :)
I get that now, thanks.
You forgot to multiply by 2^-l(q).
I think that would count it twice, wouldn't it? My original formula already contains the Solomonoff probability...
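For reference, here is the prior I take this exchange to be about (the standard Solomonoff-style mixture in Hutter's AIXI notation; if the original formula differs, adjust accordingly):

```latex
% Universal prior over observation strings, given actions y_{1:k}:
\[
M(x_{1:k}) \;=\; \sum_{q \,:\, q(y_{1:k}) = x_{1:k}} 2^{-\ell(q)} ,
\]
% i.e. each environment program q already enters with weight
% 2^{-\ell(q)}. Multiplying the whole expression by 2^{-\ell(q)} again
% would therefore weight every program twice.
```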
Let's stick with delusion boxes for now, because the assumption that we can read off from the environment whether the agent has wireheaded breaks dualism. So even if we specify utility directly over environments, we still need to master the task of specifying which action/environment combinations contain delusion boxes in order to evaluate them correctly. It is still the same problem, just phrased differently.
I think there is something off with the formulas that use policies: if you already choose the policy p with p(x_{<k}) = y_{<k}y_k, then you cannot also choose a y_k in the argmax. Also, for the Solomonoff prior you must sum over all programs q with q(y_{1:m_k}) = x_{1:m_k}.
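To make the objection concrete (my reconstruction in Hutter-style notation; V and the history shorthand below are my own, not from the original formulas):

```latex
% In the policy formulation, a fixed pair (p, q) generates the whole
% interaction history, so the value of a policy is
\[
V(p) \;=\; \sum_{q} 2^{-\ell(q)} \,
U\bigl( \textrm{history generated by } (p, q) \bigr),
\]
% and the agent simply picks \arg\max_p V(p). Since p already fixes
% y_k via p(x_{<k}) = y_{<k} y_k, an additional inner \arg\max over
% y_k has nothing left to choose: one optimizes either over policies
% or over individual actions, not both.
```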
Could you maybe expand on the proof of Lemma 1 a little bit? I am not sure I get what you mean yet.
Super hard to say without further specification of the approximation method used for the physical implementation.