Arguments like yours are the reason why I do not think that Yudkowsky's scenario is overwhelmingly likely (P > 50%). However, this does not mean that existential risk from AGI is low. Since smart people like Terence Tao exist, you cannot prove with complexity theory that no AGI with the intelligence of Terence Tao can be built. Imagine a world where everyone has one or several AI assistants whose capabilities match those of the best human experts. If the AI assistants are deceptive and able to coordinate, something like slow disempowerment of human...
You are right, and now it is clear why your original statement is correct, too. Let $U$ be an arbitrary computable utility function. As above, let $x \in X$ and $\varepsilon > 0$, with $q \in \mathbb{Q}$ and $|U(x) - q| < \varepsilon$. Choose $N$ as in your definition of "computable". Since the program terminates, its output depends only on finitely many coordinates $x_1, \dots, x_N$. Now
$\{\, y \in X : y_n = x_n \text{ for } 1 \le n \le N \,\}$
is open and a subset of $U^{-1}\big((U(x) - 2\varepsilon,\, U(x) + 2\varepsilon)\big)$, since $|U(y) - U(x)| \le |U(y) - q| + |q - U(x)| < 2\varepsilon$ for every $y$ in this set.
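To make the key step concrete — a terminating program can only ever read a finite prefix of its input sequence — here is a small Python sketch; the specific sequence and the toy "utility program" are my own illustration, not anything from the post:

```python
class CountingSequence:
    """Infinite 0/1 sequence x_0, x_1, ... that records which
    coordinates have been inspected so far."""
    def __init__(self):
        self.queried = set()

    def __getitem__(self, n):
        self.queried.add(n)
        return n % 2  # concrete example sequence: 0, 1, 0, 1, ...

def utility_prefix(x):
    """A terminating toy 'utility program': returns 1 if any of the
    first 5 coordinates equals 1, else 0. Because it halts, it can
    only have inspected finitely many coordinates."""
    return 1 if any(x[n] == 1 for n in range(5)) else 0

x = CountingSequence()
u = utility_prefix(x)
print(u, sorted(x.queried))  # only a finite prefix was ever read
```

Any other sequence agreeing with `x` on the queried prefix produces the same output, which is exactly why the cylinder set above lands inside the preimage.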
I have discovered another minor point. At the beginning of Direction 17, you write that any computable utility function is automatically continuous. This does not seem to be true in general.
Let me fix some definitions to make sure that we are talking about the same thing. For simplicity, I assume that $A$ and $O$ are finite. Let $X = (A \times O)^\omega$ be the space of all infinite sequences with values in $A \times O$. The $n$-th projection $\pi_n : X \to A \times O$ is given by $\pi_n(x) = x_n$.
The product topology is defined a...
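For completeness, the standard definition I have in mind (textbook material, not a quote from the post): the product topology on $X$ is the coarsest topology making all projections $\pi_n$ continuous, and a basis for it is given by the cylinder sets

```latex
\[
  [a_1, \dots, a_N]
  = \{\, x \in X : x_n = a_n \ \text{for}\ 1 \le n \le N \,\},
  \qquad N \in \mathbb{N},\ a_1, \dots, a_N \in A \times O .
\]
```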
I have a question about the conjecture at the end of Direction 17.5. Let $U$ be a utility function with values in $[0, 1]$ and let $f$ be a strictly monotone function. Then $U$ and $f \circ U$ have the same maxima, and $f$ can be non-linear, e.g. $f(x) = x^2$. Therefore, I wonder if the condition should be weaker.
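A minimal sanity check of this observation, with made-up utility values for three hypothetical policies:

```python
# Utility values for three hypothetical policies (made-up numbers in [0, 1]).
U = {"pi_1": 0.2, "pi_2": 0.9, "pi_3": 0.5}

def f(x):
    return x ** 2  # strictly increasing on [0, 1], but non-linear

best_for_U = max(U, key=U.get)
best_for_fU = max(U, key=lambda p: f(U[p]))
print(best_for_U, best_for_fU)  # the two argmaxes coincide
```

Any strictly increasing $f$ gives the same result here, since it preserves the ordering of the utility values.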
Moreover, I wonder whether it is possible to modify $U$ by a small amount at a place far away from the optimal policy such that this policy is still optimal fo...
I have two questions that may be slightly off-topic and a minor remark:
I am starting to learn the theoretical side of AI alignment and have a question. Some of the quantities in your post contain the Kolmogorov complexity of $U$. Since it is not possible to compute the Kolmogorov complexity of a given function, or to write down a list of all functions whose complexity is below a certain bound, I wonder how it would be possible to implement the PreDCA protocol on a physical computer.
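As I understand it, $K$ is only upper-semicomputable, so any implementation would have to replace it by a computable upper bound. A common stand-in (my own illustration, not part of PreDCA) is the length of a compressed encoding, which bounds the description length from above up to the compressor's overhead:

```python
import os
import zlib

def description_length_upper_bound(data: bytes) -> int:
    """Computable upper bound on the description length of an encoded
    object: the size of its zlib-compressed form. The true Kolmogorov
    complexity K is uncomputable and can only be approximated from
    above (up to an additive constant for the decompressor)."""
    return len(zlib.compress(data, level=9))

simple = b"0" * 1000           # highly regular: compresses well
random_ish = os.urandom(1000)  # incompressible with high probability

print(description_length_upper_bound(simple))      # far below 1000
print(description_length_upper_bound(random_ish))  # near (or above) 1000
```

Such bounds only ever improve monotonically as one searches longer, which mirrors the semicomputability of $K$ itself.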
This is slightly off-topic, but you mentioned that you think that other research agendas could be fruitful. How would you rate singular learning theory (SLT) in this context? Do you see connections between SLT and LTA, for example if you try to generalize SLT to reinforcement learning? Are there dangerous topics to avoid?