Here I think that if each of the conceptual steps is feasible on a short enough time scale that humans have a reasonable chance of finishing them all before a hypothetical intelligence explosion, then researching decision theory is a viable way forward.
But I'm doubtful as to the feasibility of the stages of the proposed research plan beyond the development of decision theory, in the absence of a detailed taskification of the latter stages.
I could imagine research in decision theory leading to the creation of a Friendly AI, but the same could be true of any area of basic research. For example, the study of solid state physics could lead to useful new technologies which meet many people's needs cheaply and correspondingly quell political unrest, leading to stable political conditions which are more conducive to militaries taking safety precautions in developing artificial intelligence technologies, thereby averting unfriendly AI for long enough for people to come up with a more promising approach to the currently intractable aspects of your proposed research program.
Also, supposing that the research program that you allude to does become taskified to a sufficiently fine degree that it looks tractable, it's plausible that there will be a surge of interest in the relevant decision theory and that academia will solve the relevant problems of its own accord.
To be clear: I'm not necessarily discouraging you personally from studying decision theory - you're visibly passionate about it and my observation is that people are much better at doing what they're passionate about than what they do out of a sense of duty. At the same time, I don't see why decision theory deserves higher priority than other basic scientific research which could plausibly have favorable technological consequences.
As a disclosure of personal bias: I don't find decision theory at all aesthetically attractive (yet?) and it correspondingly seems like something that I would not enjoy or be good at, so I may be motivated to diminish its utilitarian importance or be blind to it. Regardless, I do appreciate that there are people like you, cousin it and Wei Dai who are strongly interested in the subject, as I think that it's good for society to have a diversity of intellectuals researching a variety of subjects.
If you strongly believe that researching decision theory presently deserves high priority for researchers in general, then I would encourage you to write some articles about why you see it as deserving such high priority, with a view toward attracting collaborators and helping SIAI explain its focus on decision theory. Some of your thinking here has come out implicitly in your responses to some of my comments, but I would be interested in hearing a more holistic account of your views and their justification.
Eliezer has written a great deal about the concept of Friendly AI, for example in a document from 2001 titled Creating Friendly AI 1.0. The new SIAI overview states that:
The SIAI Research Program lists under its Research Areas:
Despite the enormous value that the construction of a Friendly AI would have, at present I'm not convinced that researching the Friendly AI concept is a cost-effective way of reducing existential risk. My main reason for doubt is that, as far as I can tell, the problem of building a Friendly AI has not been taskified to a sufficiently fine degree for it to be possible to make systematic progress toward obtaining a solution. I'm open-minded on this point and quite willing to change my position subject to incoming evidence.
The Need For Taskification
In The First Step is to Admit That You Have a Problem Alicorn wrote:
In Let them eat cake: Interpersonal Problems vs Tasks HughRistik wrote:
We know that the problems of making a peanut butter sandwich and of finding a romantic partner can (often) be taskified because many people have succeeded in solving them. It's less clear that a given problem that has never been solved can be taskified. Some problems are in principle unsolvable, whether because they are mathematically undecidable or because physical law provides an obstruction to their solution. Other currently unsolved problems have solutions in the abstract but lack solutions that are accessible to humans. That taskification is in principle possible is not a sufficient condition for solving a problem, but it is a necessary condition.
The Difficulty of Unsolved Problems
There's a long historical precedent of unsolved problems being solved. Humans have succeeded in building cars and skyscrapers, have succeeded in understanding the chemical composition of far away stars and of our own DNA, have determined the asymptotic distribution of the prime numbers and have given an algorithm to determine whether a given polynomial equation is solvable in radicals, have created nuclear bombs and have landed humans on the moon. All of these things seemed totally out of reach at one time.
Looking over the history of human achievement gives one a sense of optimism as to the feasibility of accomplishing a goal. And yet, there's a strong selection effect at play: successes are more interesting than failures and we correspondingly notice and remember successes more than failures. One need only page through a book like Richard Guy's Unsolved Problems in Number Theory to get a sense for how generic it is for a problem to be intractable. The ancient-Greek-inspired question of whether there are infinitely many perfect numbers remains out of reach for the best mathematicians of today. The success of human research efforts has been as much a product of wisdom in choosing one's battles as it has been a product of ambition.
The Case of Friendly AI
My present understanding is that there are potential avenues for researching AGI. Richard Hollerith was kind enough to briefly describe Monte Carlo AIXI to me last month and I could sort of see how it might be in principle possible to program a computer to do Bayesian induction according to an approximation to a universal prior and to equip the computer with a decision making apparatus based on its epistemological state at a given time. Some people have suggested to me that the amount of computer power and memory needed to implement human level Monte Carlo AIXI is prohibitively large, but (in my current, very ill-informed state; by analogy with things that I've seen in computational complexity theory) I could imagine ingenious tricks yielding an approximation to Monte Carlo AIXI which uses much less computing power/memory and which is a sufficiently close approximation to serve as a substitute for practical purposes. This would point to a potential taskification of the problem of building an AGI. I could also imagine that there are presently no practically feasible AGI research programs; I know too little about the state of strong artificial intelligence research to have anything but a very unstable opinion on this matter.
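To make the shape of that idea concrete, here is a toy sketch. It is not Monte Carlo AIXI and not anyone's actual proposal: the two hypotheses, their made-up description lengths in bits, and the 9/10 reward probability are all assumptions chosen for illustration. It only shows the two ingredients named above: a prior that weights each hypothesis by 2^(-description length), updated by Bayes' rule, and a decision rule that picks the action with the highest expected reward under the current belief.

```python
from fractions import Fraction

ACTIONS = (0, 1)
P_HIT = Fraction(9, 10)  # assumed chance of reward when the favored action is taken

# Hypothetical hypothesis class: name -> (favored action, description length in bits).
HYPOTHESES = {"favors_0": (0, 1), "favors_1": (1, 2)}


def prior():
    """Weight each hypothesis by 2^(-length), then normalize to sum to 1."""
    raw = {name: Fraction(1, 2 ** bits) for name, (_, bits) in HYPOTHESES.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def reward_probability(name, action):
    """Probability of reward 1 that hypothesis `name` assigns to `action`."""
    favored_action, _ = HYPOTHESES[name]
    return P_HIT if action == favored_action else 1 - P_HIT


def update(belief, action, reward):
    """Bayes' rule: reweight each hypothesis by how well it predicted the reward."""
    unnormalized = {}
    for name, weight in belief.items():
        p = reward_probability(name, action)
        likelihood = p if reward == 1 else 1 - p
        unnormalized[name] = weight * likelihood
    total = sum(unnormalized.values())
    return {name: weight / total for name, weight in unnormalized.items()}


def expected_reward(belief, action):
    """Expected reward of `action`, averaged over the current belief."""
    return sum(w * reward_probability(name, action) for name, w in belief.items())


def choose_action(belief):
    """Decision rule: act to maximize expected reward given the current belief."""
    return max(ACTIONS, key=lambda a: expected_reward(belief, a))


belief = prior()
first_action = choose_action(belief)          # the shorter hypothesis dominates at first
belief = update(belief, first_action, reward=0)  # a surprising outcome shifts the belief
second_action = choose_action(belief)
```

The hard part that this sketch skips entirely is the one discussed above: a real universal prior ranges over all computable environments, so the sum over hypotheses cannot be enumerated and has to be approximated.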
As Eliezer has said, the problem of creating a Friendly AI is inherently more difficult than that of creating an AGI, and may be much more difficult. At present, the Friendliness aspect of a Friendly AI seems to me to strongly resist taskification. In his poetic Mirrors and Paintings Eliezer gives the most detailed description of what a Friendly AI should do that I've seen, but the gap between concept and implementation here seems so staggeringly huge that it doesn't suggest to me any fruitful lines of Friendly AI research. As far as I can tell, Eliezer's idea of a Friendly AI is at this point not significantly more fleshed out (relative to the magnitude of the task) than Freeman Dyson's idea of a Dyson sphere. In order to build a Friendly AI, beyond conceiving of what a Friendly AI should be in the abstract, one has to convert one's intuitive understanding of friendliness into computer code in a formal programming language.
I don't even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings. Solving this problem would seem to require as a prerequisite an understanding of the makeup of the hypothetical AGI, something which people don't seem to have a clear grasp of at the moment. Even if one does have a model for a hypothetical AGI, writing code conducive to it recognizing humans as distinguished beings seems like an intractable task. And even with a relatively clear understanding of how one would implement a hypothetical AGI with the ability to recognize humans as distinguished beings, one is still left with the problem of making such a hypothetical AGI Friendly toward such beings.
In view of all this, working toward stable whole-brain emulation of a trusted and highly intelligent person concerned about human well being seems to me like a more promising strategy of reducing existential risk at the present time than researching Friendly AI. Quoting a comment by Carl Shulman:
There are various things that could go wrong with whole-brain emulation and it would be good to have a better option, but Friendly AI research doesn't seem to me to be one, in light of an apparent total absence of even the outlines of a viable Friendly AI research program.
But I feel like I may have missed something here. I'd welcome any clarifications of what people who are interested in Friendly AI research mean by Friendly AI research. In particular, is there a conjectural taskification of the problem?