The formalism presented in this post turned out to be erroneous (as opposed to the formalism in the previous post). The problem is that the step in the proof of the main proposition in which the soundness schema is applied cannot be generalized to the ordinal setting since we don't know whether ακ is a successor ordinal so we can't replace it by ακ'=ακ-1. I'm not deleting this post primarily to preserve the useful discussion in the comments.
Followup to: Parametric polymorphism in updateless intelligence metric
In the previous post, I formulated a variant of Benja's parametric polymorphism suitable for constructing updateless intelligence metrics. More generally, this variants admits agents which are utility maximizers (in the informal sense of trying their best to maximize a utility function, not in the formal sense of finding the absolutely optimal solution; for example they might be "meliorizers" to use the terminology of Yudkowsky and Herreshoff) rather than satisficers. The agents using this formalism labor under a certain "delusion", namely, since they believe that κ ("the number of ineffable mystery") is an actual finite number (whereas it is secretly infinite, in a sense), they think that the chain of tiling agents is effectively finite as well (since at some point agent #n in the chain will discover that κ > n is false and will fail to construct a subsequent agent of the same "epistemic strength"). The same problem exists in Weaver's intuitionistic assertability predicate formalism.
To overcome this limitation, I suggest to extend κ's semantics from natural numbers to elements of a certain recursive ordinal (at least morally; technically it is done a bit differently, see below). In Benja's original formulation this doesn't appear a valid option, since κ is interpreted as a time interval. However, in my variant κ is just an abstract parameter queries about which can be directed to a special "oracle", so there is no such limitation.
Formalism
Fix α a recursive ordinal and αi a system of notations for ordinals smaller than α such that α0=0, the function f defined by αf(i)=αi+1 is recursive and the bit-valued function g(i,j) which equals 1 iff αi < αj is also recursive. We also assume that
(0) T |- For all i: g(i,f(i))=1
where T is our starting theory. We define Tκ(α) to be T extended by the symbol κ (which stands for the index of the ordinal, not the ordinal itself) and the following self-referential soundness schema:
(1) For all formulae φ: For all n, i: ((@_Tκ(α) "φ(n)" and g(i,κ)=1) -> φ(n)[κ\i])
Tκ(α) is proven to be sound analogously to Yudkowsky and Herreshoff, with transfinite induction over ordinals below α replacing the ordinary induction.
The licensing condition for agent A0 to perform action b0 is
(2) Exists n: Tκ(α) |- (g(n,κ)=1 -> (A0()=b0 -> G))
The property which is sufficient for agent B1 to be creatable by agent A0 is
(3) Tκ(α) |- For all b1: (B1()=b1 -> (b1=0 or Exists n: (g(n,κ)=1 and @_Tκ(α) "(g(n,f(κ))=1 and B1()[κ\f(κ)]=b1) -> G[κ\f(κ)]")))
The proof that (3) indeed enables tiling is analogous to the previous post (but requires using (0)).
Discussion
In order for A0 to know that the tiling chain is infinite, it is enough for her to verify ακ >= ω. In order for her to know that A1 is going to know the tiling chain is infinite, she needs to verify ακ >= ω+1. In order for her to know that all agents are going to know the tiling chain is infinite, she needs to verify ακ >= 2ω. In order for her to know that all agents are going to know that, she needs to verify ακ >= 3ω et cetera.
It remains to decide which ordinal should we actually use. My intuition is that the correct ordinal is the least α with the property that α is the proof-theoretic ordinal of Tκ(α) extended by the axiom schema {g(i,κ)=1}. This seems right since the agent shouldn't get much from ακ > β for β above the proof theoretic ordinal. However, a more formal justification is probably in order.
I commend your enthusiasm! Positive reinforcement for working on these posts. A few notes:
A utility meliorizer is not a utility maximizer. A maximizer finds the optimal strategy (this is untenable). A meliorizer simply improves upon some fallback policy (and can run into a number of problems). Eventually, we're going to want something between the two: meliorizers are just the second thing on the chain of "agents that can provably tile to agents like them" (after satisficers).
Not quite: An agent using parametric polymorphism never 'discovers' anything about kappa. A PP agent can always tile (and PP can prove this). The catch is that the agent believes that if kappa = 0 a PP agent can tile to a potentially-inconsistent agent. An agent using PP doesn't care, as in order for PP to work the agent's goals must be formulated in terms of "the world will be safe for at least kappa steps". PP agents think that they might be inconsistent after the kappath step, but they don't mind, because they don't care about the world after the kappath step.
I'm not convinced that your proof carries. (Did you want the 'Exists n' to go inside the turnstile here?) I don't think that this says what you want it to say---it seems to say that the agent can take action b if there is some n such that a_n < a_k implies executing b is safe. I can trivially satisfy this by choosing an n large enough that the antecedent false. (There's always some n such that a_n >= a_k, in which case g(n, k) = 1 is false and the action criterion is trivially satisfied.)
I'm not entirely confident in this analysis, because your statements are somewhat hard to parse in this format. Do you have a latex version, perchance?
Alas, the tiling chain won't be infinite, by the well-foundedness of the ordinals. The chain may be arbitrarily long, but no such chain may be infinite. (More precisely, in the system that I think you're suggesting, A0 would think that ακ is some ordinal and the world will be safe for so long as its child agents can walk some arbitrarily long but finite chain of ordinals descending from ακ.)
Thus, the simple way of extending 'vanilla' PP to use ordinals doesn't grant us too much extra power. However, there are ways to extend vanilla PP with ordinals to make the system significantly more powerful. With vanilla PP, you have to slice up the agent's safety predicate such that it talks about individual timesteps. In the more powerful version you can express more complex goals (such as "X must be done eventually, but you have a while before you must choose a deadline"). There are still some goals which are difficult to express; this is closely related to the procrastination paradox. (Similar limitations appear in the consistency waterfall, which only trusts its own soundness for Pi1 goals.)
Benja has formulated such a PP. An unpolished version of the technique used is shown in Fallenstein's Monster, further details are forthcoming.
On a tangential note, it's quite exciting to see other people who are interested in getting nice properties out of tiling agents. Have you spoken to Benja at all, and/or contacted others at MIRI?
Hi So8res, thx a lot for reading the post and commenting on it!
I understand the difference. I just think the terminology is a bit confusing, since many people... (read more)