All of Frank_R's Comments + Replies

This is slightly off-topic, but you mentioned that you think other research agendas could be fruitful. How would you rate singular learning theory (SLT) in this context? Do you see connections between SLT and LTA, for example if you try to generalize SLT to reinforcement learning? Are there dangerous topics to avoid?

Arguments like yours are the reason why I do not think that Yudkowsky's scenario is overwhelmingly likely (P > 50%). However, this does not mean that existential risk from AGI is low. Since smart people like Terence Tao exist, you cannot prove with complexity theory that no AGI with the intelligence of Terence Tao can be built. Imagine a world where everyone has one or several AI assistants whose capabilities are the same as those of the best human experts. If the AI assistants are deceptive and are able to coordinate, something like slow disempowerment of human...

You are right, and now it is clear why your original statement is correct, too. Let $U$ be an arbitrary computable utility function. As above, let $s \in \Sigma$ and $\epsilon \in \mathbb{Q}$ with $\epsilon > 0$. Choose $P$ as in your definition of "computable". Since $P(s, \epsilon)$ terminates, its output depends only on finitely many $\pi_n(s)$, say on $\pi_1(s), \dots, \pi_m(s)$. Now

$$V = \{\, t \in \Sigma \mid \pi_n(t) = \pi_n(s) \text{ for all } n \le m \,\}$$

is open and a subset of $U^{-1}\big((U(s) - 2\epsilon,\, U(s) + 2\epsilon)\big)$, since for every $t \in V$ we have $P(t, \epsilon) = P(s, \epsilon)$ and therefore $|U(t) - U(s)| \le |U(t) - P(t, \epsilon)| + |P(s, \epsilon) - U(s)| < 2\epsilon$. Hence $U$ is continuous.
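
To make the finite-prefix step concrete, here is a minimal runnable sketch (my own illustration, not from the discussion; the dyadic utility $U(s) = \sum_n 2^{-(n+1)} s_n$ and the evaluator `P` are assumptions chosen for simplicity):

```python
import math

def P(s, eps):
    """Approximate U(s) = sum_n 2^(-(n+1)) * s_n to within eps.

    `s` is an infinite 0/1 sequence, given as a callable n -> s_n.
    The tail after the first m terms sums to at most 2^(-m), so it
    suffices to read m symbols with 2^(-m) < eps: the output depends
    only on a finite prefix, i.e. P is constant on a cylinder set.
    """
    m = math.ceil(math.log2(1 / eps)) + 1
    return sum(2 ** (-(n + 1)) * s(n) for n in range(m))

# Two sequences that agree on a long prefix get the same estimate:
zeros  = lambda n: 0                      # 0^omega,        U = 0
spiked = lambda n: 1 if n == 50 else 0    # 0^50 1 0^omega, U = 2^(-51)

print(P(zeros, 1e-6), P(spiked, 1e-6))    # both 0.0, each within eps of U
```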

I have discovered another minor point. You have written at the beginning of Direction 17 that any computable utility function $U$ is automatically continuous. This does not seem to be true in all cases.

I fix some definitions to make sure that we talk about the same stuff. For reasons of simplicity, I assume that $A$ and $O$ are finite. Let $\Sigma$ be the space of all infinite sequences with values in $A \times O$. The $n$-th projection $\pi_n : \Sigma \to A \times O$ is given by

$$\pi_n(s) = s_n.$$

The product topology is defined a...
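
The comment is cut off here; for reference, a sketch of the standard definition it presumably continues with (textbook material, not the original wording):

```latex
% Standard background, not the original comment's wording:
% the product topology on \Sigma is the coarsest topology for which
% every projection \pi_n is continuous. A base is given by the
% cylinder sets
\[
  [w] = \{\, s \in \Sigma \mid \pi_n(s) = w_n \ \text{for all}\ n \le m \,\},
  \qquad w \in (A \times O)^m, \; m \in \mathbb{N},
\]
% which are exactly the open sets used in the continuity argument above.
```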

Vanessa Kosoy
Your alleged counterexample is wrong, because the $U$ you constructed is not computable. First, "computable" means that there is a program $P$ which receives $s$ and $\epsilon \in \mathbb{Q}$ as inputs s.t. for all $s$ and $\epsilon$, it halts and $|P(s, \epsilon) - U(s)| < \epsilon$. Second, even your weaker definition fails here. Let $\epsilon = \tfrac{1}{2}$. Then there is no program that computes $U$ within accuracy $\epsilon$, because for every $n$, $U(0^n 2 0^\omega) = \tfrac{3}{4}$ while $U(0^\omega) = 0$. Therefore, determining the value of $U(s)$ within $\epsilon$ requires looking at infinitely many elements of the sequence. Any program that outputs $0$ on $0^\omega$ has to halt after reading some finite $m$ symbols, in which case it would output $0$ on $0^m 2 0^\omega$ as well.
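
A runnable sketch of this halting argument (my own illustration; the reading budget `m = 1000` and the helper names are hypothetical):

```python
def true_U(two_at):
    """The utility from the counterexample: U(0^omega) = 0 and
    U(0^n 2 0^omega) = 3/4. `two_at` is None for 0^omega, else the
    index of the single symbol 2."""
    return 0.0 if two_at is None else 0.75

def candidate_P(seq, eps):
    """A would-be approximator of U. Because it must halt, it can only
    inspect some finite number m of symbols before answering."""
    m = 1000  # every halting program has SOME finite reading budget
    for n in range(m):
        if seq(n) == 2:
            return 0.75
    return 0.0  # saw only zeros, so it guesses U = 0

# The adversary places the 2 just beyond the prefix that was read:
zeros  = lambda n: 0                       # 0^omega
fooled = lambda n: 2 if n == 1000 else 0   # 0^1000 2 0^omega

print(abs(candidate_P(zeros, 0.5)  - true_U(None)))   # 0.0  -> fine
print(abs(candidate_P(fooled, 0.5) - true_U(1000)))   # 0.75 -> exceeds eps = 1/2
```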

I have a question about the conjecture at the end of Direction 17.5. Let $U$ be a utility function with values in $[0,1]$ and let $f : [0,1] \to [0,1]$ be a strictly increasing function. Then $U$ and $f \circ U$ have the same maxima. $f$ can be non-linear, e.g. $f(x) = x^2$. Therefore, I wonder if the condition that $U_1$ and $U_2$ are related by an affine transformation should be weakened.
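
In symbols (my notation, a sketch): a strictly increasing $f$ preserves the pointwise maximizers of $U$, although, as the reply below points out, it does not preserve expected values:

```latex
\[
  U(s) \le U(t) \iff f(U(s)) \le f(U(t))
  \quad\Longrightarrow\quad
  \operatorname*{arg\,max}_{s} \, (f \circ U)(s)
  = \operatorname*{arg\,max}_{s} \, U(s),
\]
\[
  \text{while in general } \mathbb{E}_\mu[f \circ U] \ne f\!\left(\mathbb{E}_\mu[U]\right)
  \text{ for non-affine } f, \text{ e.g. } f(x) = x^2 .
\]
```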

Moreover, I wonder if it is possible to modify $U$ by a small amount at a place far away from the optimal policy such that $\pi^*$ is still optimal fo...

Vanessa Kosoy
No, because it changes the expected value of the utility function under various distributions.

Good catch, the conjecture as stated is obviously false: we can e.g. take $U_2$ to be the same as $U_1$ everywhere except after some action which $\pi^*$ doesn't actually take, in which case we make it identically $0$. Some possible ways to fix it:

  • Require the utility function to be of the form $U : O^\omega \to [0,1]$ (i.e. not depend on actions).
  • Use (strictly) instrumental reward functions.
  • Weaken the conclusion so that we're only comparing $U_1$ and $U_2$ on-policy (but this might be insufficient for superimitation).
  • Require $\pi^*$ to be optimal off-policy (but it's unclear how this can generalize to finite $\gamma$).
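
Written out, the counterexample construction described above is (my notation, a sketch): pick an action $a^\dagger$ that $\pi^*$ never takes and set

```latex
\[
  U_2(h) \;=\;
  \begin{cases}
    0      & \text{if the history } h \text{ contains the action } a^\dagger, \\
    U_1(h) & \text{otherwise.}
  \end{cases}
\]
```

On-policy the two utilities agree, so $\pi^*$ remains optimal for $U_2$, yet $U_2$ is not an affine transformation of $U_1$.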

I have two questions that may be slightly off-topic and a minor remark:

  • Is a list of open and tractable problems related to Infra-Bayesianism available somewhere?
  • Do you plan to publish the results of the Infra-Bayesianism series in a peer-reviewed journal? I understand that there are certain downsides: mostly that it requires a lot of work, that the whole series may be too long for a journal article, and that the peer review process takes a long time. However, if your work is citeable, it could attract more researchers who are able to contribute.
  • On page 22, you should include the condition $a(bv) = (ab)v$ in the definition of a vector space.

Thank you for the link. This clarifies a lot.

I am starting to learn theoretical stuff about AI alignment and have a question. Some of the quantities in your post contain the Kolmogorov complexity of $U$. Since it is not possible to compute the Kolmogorov complexity of a given function, or to write down a list of all functions whose complexity is below a certain bound, I wonder how it would be possible to implement the PreDCA protocol on a physical computer.

Tamsin Leake
Like all the other uncomputable or intractable logic in the post, the AI is to make increasingly informed guesses about these quantities using something like logical induction, where one can estimate the likelihood of a logical statement without having to determine its truth value for sure.
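
As a toy illustration of making computable guesses about an uncomputable quantity (my example, far cruder than logical induction): Kolmogorov complexity is upper-semicomputable, so any compressed encoding yields a valid upper bound that can be refined over time. A minimal sketch in Python:

```python
import zlib

def K_upper_bound(data: bytes) -> int:
    """A computable upper bound on the Kolmogorov complexity of `data`
    (up to an additive constant): the length of any compressed encoding
    is an upper bound on the length of the shortest program printing it."""
    return len(zlib.compress(data, 9))

# Bounds can only improve as we search over more encodings;
# the exact value of K is never determined.
regular = b"0" * 10_000
print(K_upper_bound(regular), "<<", len(regular))  # tiny bound for a regular string
```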