I'm not speaking for SIAI as this is more of a Visiting Fellows thing than an SIAI thing, but there are people working on Friendliness, and creating a Friendliness roadmap. We have lists of hundreds of problems, and lists of potentially relevant fields or concepts. Work is getting started on combining these lists into a real roadmap despite the uncertainty and difference of emphasis among researchers. Obviously we'd rather not release things for the public to see unless there were rather good reasons for doing so -- less output means less chance for screwing up public relations, which is important because SIAI Visiting Fellows output is easy to conflate with SIAI output in ways that might be misleading. I've started a blog where I'll put my own thoughts on something-like-Friendliness that I feel are not at all dangerous, and I might encourage other Friendliness researchers to do so as well. I'll link to my blog in a discussion post once I have a few more posts seeded. At some point you might see summaries of collaborative research somewhere. But until we have a better idea of who our audience is and what security precautions are sane, we'd like to work quietly. Again, I'm mostly speaking for myself, kind of speaking for a group of partially-SIAI-affiliated folk, and not at all for SIAI as an organization.
(There aren't that many people that can speak for SIAI, unfortunately. Like, two maybe. If you're an Oppenheimer (strong rationality and remarkable ability to get uber-nerds to work like a well-oiled machine), please consider applying for Visiting Fellowship. We're a bright group, but that has more to do with being bright than it has to do with being a group, and we'd like to change that.)
You have hundreds of subproblems that need to be solved? And you're making a special effort to keep them secret from people on LW? Just... wow. How dumb would you have to be? Excuse me while I beat my head against the wall for a while.
They're not lists of hundreds of subproblems that need to be solved, they're lists of hundreds of subproblems. Large difference. We don't know which ones need to be solved, nor if the problems are phrased correctly, et cetera. Nor are we making a special effort to keep them secret; we're just not making an effort to publicize quarter-done brainstorming sessions.
If you want an example of some ideas that I wasn't bothering to publicize yet, then here: http://willnewsome.wordpress.com/ . I just made that today in response to this discussion post. They're not polished, but they might be thought-provoking. At some point we'll probably try to start listing open problems for LW or e.g. mathoverflow people to work on, but we're not at that stage yet. A few months maybe?
Thanks for the link.
ETA: the angry tone of my first reply was prompted by your use of the words "dangerous" and "security precautions". Your second comment doesn't mention these, which is nice :-)
Wha? Your grandparent comment did contain the words "dangerous" and "security precautions", or are my eyes lying to me again?
I'm not speaking for SIAI as this is more of a Visiting Fellows thing than an SIAI thing, but there are people working on Friendliness, and creating a Friendliness roadmap. We have lists of hundreds of problems, and lists of potentially relevant fields or concepts.
Meh. Now I'm a bit annoyed in that I did try to poke people into a direction where they'd do something like that when I was there as a Visiting Fellow, but mostly the reaction seemed to be "we should leave all thinking about Friendliness to Eliezer". But upon reflection, I realize that I may not have been as vocal about that as I thought I was (illusion of transparency and all that), so I guess I only have myself to blame for you guys only starting on all the real interesting stuff after I left. ;p
Meh. Now I'm a bit annoyed in that I did try to poke people into a direction where they'd do something like that when I was there as a Visiting Fellow, but mostly the reaction seemed to be "we should leave all thinking about Friendliness to Eliezer".
That's... disturbing, although expected. Why isn't the Visiting Fellows program used to strengthen this line of research? There could be practical difficulties in moving there quickly, but it's confusing if it's not even a goal.
but mostly the reaction seemed to be "we should leave all thinking about Friendliness to Eliezer".
Huh? But, but... surely there's room for at least half a dozen people in that particular basement!
I'm encouraged by what you say here. The doubt as to the value of Friendliness research that I express above is doubt as to the value of researching Friendly AI without a taskification rather than doubt as to the value of researching what a taskification might look like.
If you haven't done so I think that it would be worthwhile to ask the SIAI staff whether they might be comfortable with classifying (some of?) the output of the SIAI Visiting Fellows as part of SIAI's output. As I said in response to a comment by WrongBot, I've gathered that the SIAI visiting fellows program is a good thing; but there's been relatively little public documentation of what the SIAI visiting fellows have been doing. I would guess that a policy of such public documentation would improve SIAI's credibility.
While I didn't read your comment in the way that cousin_it did, I can see why he would do so. I've gotten a vague impression from talking to a number of people loosely or directly connected with SIAI that SIAI has been keeping its research secret on the grounds that releases to the public could be dangerous on account of speeding unfriendly AI research. In view of how primitive the study of AGI looks, the apparent infeasibility of SIAI unilaterally building the first AGI, and the fact that Friendliness research would not seem to significantly speed the creation of unfriendly AI, such a policy seems highly dubious to me. So I was happy to hear that you and your collaborators are planning on putting some of what you've been doing out in the open in roughly a few months.
Thanks for the link to your blog posts.
I know I've written previously about how the next step at the top of my basic theoretical To-Do List is to come up with a reflective decision theory, one that can talk about modifying the part of itself that does the self-modification. Can someone link to that?
This sounds sort of interesting but seems to fall under the rubric of general artificial intelligence research rather than addressing the taskification of Friendly AI research. Note that a research program is only as strong as its weakest link.
You ask after the Friendliness ToDo list, but discount the #1 item according to Eliezer Yudkowsky?!?
Yudkowsky explains why he thinks this area is an important one - e.g. in "Strong AI and Recursive Self-Improvement" - http://video.google.com/videoplay?docid=-821191370462819511
FWIW, IMO, such an approach would make little sense if your plan was just to build a machine intelligence.
We already have a decision theory good enough to automate 99% of jobs on the planet - if only we knew how to implement it. A pure machine intelligence project would be likely to focus on those implementation details - not on trying to adjust decision theory to better handle proofs about the dynamics of self-improving systems.
The beginning of your post My Kind of Reflection seems to talk about that. Couldn't find anything more direct.
FWIW, dealing with self-modification isn't very high on my to-do list, because for now I've shifted to thinking of AI as a one-action construction. This approach handles goal stability pretty much automatically, but I'm not sure if it satisfies your needs.
Nesov's reply sounds right to me. It doesn't handle goal stability automatically, it sweeps an issue that you confess you don't understand under the carpet and hopes the AI handles it, in a case where you haven't described an algorithm that you know will handle it and why.
Thanks. I don't understand your reply yet (and about half of Nesov's points are also unparseable to me as usual), but will think more.
I'm not convinced that researching the Friendly AI concept is a cost-effective way of reducing existential risk.
Researching Friendly AI is a way to reduce the risk of Unfriendly AI.
Personally, I find it very easy to outline a recipe for FAI.
You use cognitive neuroscience to figure out human values and human metaethical thought.
You automate human metaethical thought, apply that to the values determined at step 1, and thereby arrive at a human-relative moral/behavioral ideal.
You design an open-ended AGI architecture, and use the ideal from step 2 to supply it with a goal.
To me, that defines a research program that's just full of highly concrete tasks waiting to be carried out. Some of them may be very difficult or abstract, but nothing leaves me feeling helpless, with no idea where to begin. However, you need to know something about ordinary AI (try this), and you need to have some idea of what a computational model of human decision making might look like (simple examples).
- You use cognitive neuroscience to figure out human values and human metaethical thought.
Why do we need to know about meta-ethics? Why would we even think there is such a thing as human meta-ethics?
Meta-ethics may be relevant for crafting ways to approach the problem. But I don't see how it is a descriptive project answerable with the techniques of cognitive neuroscience the way you describe. I also don't see why an AI would need to know whether we were cognitivists or non-cognitivists to know how we want it to act.
I can think of a way it could matter. If we're non-cognitivists, we think our moral discourse consists of speech acts, so when we argue over morality we're actually issuing imperatives, or declaring universal imperatives exist. If we're cognitivists, we're talking about what is the case. There could very well be grounds for a FAI to treat an imperative and a proposition differently. Doesn't mean the distinction actually matters, but it's not obviously irrelevant.
So it would be sloppy to code what you want an AI to do the way you code propositions/beliefs. That is, you don't want to fit the bulk of the goal architecture inside its belief networks. Nor, certainly, should you expect the AI to learn moral truths by looking at the world. Once you tell it to care about what people want, then it can look at people to find that out, but it can't learn to care about what people want just by observing the world. Those kinds of moral facts don't exist. So certainly knowing things about meta-ethics will help create an FAI.
But that's an argument for smart people to spend time thinking about meta-ethics. It's not an argument for a descriptive program that finds folk-metaethics to form the goal architecture of an AI. For one thing, most humans seem to have really confused meta-ethical beliefs.
On reflection, I think by 'metaethical thought' Mitchell probably meant the normative theory that describes human ethics. I don't think there is one of those either, but it's not obviously wrong and certainly makes more sense.
I meant: innate cognitive architecture which plays a role in metaethical thought.
You might be familiar with the idea that, according to CEV, you figure out the full complexity of human value using neuroscience (rather than relying on people's opinions about what they value), and then you "extrapolate" or "renormalize" that using "reflective decision theory" (which does not yet exist). The idea here is that the method of extrapolation should also be extracted from the details of human cognitive architecture, rather than just figured out through intuition or pure reason.
Suppose we have a person - or an intelligent agent - with a particular "value system" or "private decision theory". Note that we are talking about its actual decision theory, as embodied in its causal structure and decision-making dispositions, and not just its introspective opinions about how it decides. Given this actual value system, RDT is supposed to tell us what would happen to that value system if it were changed according to its own implicit ideals. All I'm saying is that there's a meta-ethical relativism for RDT, for large classes of decision architecture. Different theories about how to normatively self-modify a decision architecture ought to be possible, and the selection of which RDT is used should also be derived from the agent's own cognitive architecture.
Of course you can go meta again and say, maybe the RDT extraction procedure can also take different forms - etc. It's one of the tasks of the FAI/CEV/RDT research program to figure out when and how the ethical metalevels stop.
Thanks for your interesting comment.
To me, that defines a research program that's just full of highly concrete tasks waiting to be carried out. Some of them may be very difficult or abstract, but nothing leaves me feeling helpless, with no idea where to begin.
I imagine that writing up a description of many such concrete tasks would help clarify the situation; I'd encourage you to do so if you have some time.
However, you need to know something about ordinary AI (try this), and you need to have some idea of what a computational model of human decision making might look like (simple examples).
Yes, I don't have subject matter knowledge. Thanks for the links.
Richard Hollerith was kind enough to briefly describe Monte Carlo AIXI to me last month
Since technical types can be quick to criticize, let me clarify that I never held up MC AIXI as evidence for the value of the FAI project -- only as evidence against the argument that AIXI lacks promise because AIXI is uncomputable.
In view of all this, working toward stable whole-brain emulation of a trusted and highly intelligent person concerned about human well-being seems to me like a more promising strategy of reducing existential risk at the present time than researching Friendly AI.
The current situation is that good old-fashioned software engineering has revolutionised the financial markets, saving the world billions of dollars of bad investments, revolutionised the world of information storage and retrieval, helping people to have accurate views of the world, revolutionised the world of forecasting, and created an international marketplace of huge proportions, vastly increasing the extent to which nations depend on each other, and so minimising the chance of armed conflict.
By contrast, emulating brains has been a practically useless approach, hardly responsible for anything of interest. The high-profile flagship products in the area are literally useless, they do nothing - except flash pretty lights in people's eyes - funding via hypnosis.
"Whole-brain emulation" seems likely to continue to be useless for a looooong time. The whole approach seems based on wishful thinking to me. It seems to be an example of the type of thinking that you get if you fail to consult with the likely timelines on a project.
It can simultaneously be the case that whole-brain emulation is unlikely to arise first and that pursuing stable whole-brain emulation is more likely to give rise to a positive singularity than pursuing Friendly AI.
The alleged facts that you cite about software engineering don't seem relevant here: as far as I know the current state of general artificial intelligence research is still very primitive.
I would recur to FAWS's response to one of your comments from nine months ago.
It is true that it's possible that whole-brain emulation could be safer - but humans are psychopaths. Having agents whose brains we don't understand in charge would be a terrible situation - removing one of our safeguards. Human brains are known to be very bad - stupid, unreliable, etc. The idea that engineered machine intelligence would be worse is, of course, possible - but doesn't seem to be very likely. Engineered machine intelligence would be much more configurable and controllable.
Anyway, the safety of WBE seems likely to be irrelevant - if it is sufficiently likely to be beaten. We can imagine all kinds of time-consuming super-safe fantasy contraptions - but there's a clock ticking.
Large-scale bioinspiration is used infrequently by engineers: an aeroplane is not a scanned bird, a car is not a scanned horse, a submarine is not a scanned fish, girders are not scanned logs, ropes are not scanned vines - and so on. We scan when we really want a copy. Photos, videos, audio. In this case, a copy is exactly what we don't want. We have cheap human brains. The main vacancies are for inhuman entities - things like a Google datacentre, or a roomba - for example.
IMO, we can see from the details of the situation that scanning isn't going to happen in this case. If we get a fruit-fly brain scanned and working, someone is bound to scale it up 10,000 times, and then hack it into a shape that is useful for something. There are innumerable short-cuts like that on the path - and some of them seem bound to be taken.
Thinking of "general artificial intelligence" as a field is an artefact. "Artificial General Intelligence" is a marketing term used by a break-away group who were apparently irritated by the barriers to presenting at mainstream machine intelligence conferences. The rest of the field is enormous - by comparison with that splinter group - and the efforts of that mainstream on machine learning seem highly relevant to overall development to me.
Machine intelligence research started to get serious in the 1950s. My projected mode "arrival time" [sic] is 2025. If correct, that makes the field about 80% "there", timewise. Of course, it may not look as though it is close yet - but that could well be an artefact of exponential growth processes, which appear to reach the destination all at once, in a dramatic final surge.
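The "about 80%" figure above can be checked with a short calculation. This is only a sketch: the comment gives "the 1950s" and 2025, so the start year of 1950 and the year of writing of 2010 are assumptions made here for illustration, and `fraction_elapsed` is a hypothetical helper name.

```python
def fraction_elapsed(start_year, now_year, arrival_year):
    """Share of the start-to-arrival interval that has already passed."""
    total = arrival_year - start_year
    if total <= 0:
        raise ValueError("arrival_year must be later than start_year")
    return (now_year - start_year) / total


# Assumed inputs (not stated exactly in the comment): the field "got
# serious" in 1950 and the comment was written in 2010.
print(fraction_elapsed(1950, 2010, 2025))  # 0.8
```

The result is sensitive to the assumed start year: taking 1959 instead of 1950 gives roughly 77%.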
FAWS's comment seems practically all wrong to me - each paragraph after the first one.
Tangential question:
...good old-fashioned software engineering has revolutionised the financial markets, saving the world billions of dollars of bad investments,...
Is this true? I'd like it to be true for many reasons. But I have seen some plausible arguments that in recent years the financial sector has not been doing its job in the economy very well. For example, http://www.newyorker.com/reporting/2010/11/29/101129fa_fact_cassidy?printable=true .
"I don't even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings."
I'm still not convinced that human beings should be treated as a special case, as opposed to getting the AGI to recognize sentient beings in general. It's easy to imagine ways in which either strategy could go horribly wrong.
A visible way forward is to research decision theory in order to understand what data specifies a decision problem, and how it's organized in agents that can be said to be solving that decision problem. The next (conceptual) step would be to understand elicitation of that data from specific instances of agents (e.g. humans), to define human decision problem. After that, we are ready to consider the question of implementation: develop an efficient decision-making algorithm good enough to go on solving the human decision problem (i.e. making decisions), and think of how to best collect the data that gives the human decision problem.
Here I think that if each of the conceptual steps is feasible on a short enough time scale that there's a reasonable chance of humans finishing them all before a hypothetical intelligence explosion, then researching decision theory is a visible way forward.
But I'm doubtful as to the feasibility of the stages of the proposed research plan beyond the development of decision theory in absence of a detailed taskification of the latter stages.
I could imagine research in decision theory leading to the creation of a Friendly AI, but the same could be true of any area of basic research. For example, the study of solid state physics could lead to useful new technologies which can meet many people's needs cheaply and correspondingly quell political unrest; leading to stable political conditions which are more conducive to militaries taking safety precautions in developing artificial intelligence technologies; thereby averting unfriendly AI for long enough for people to come up with a more promising approach to the currently intractable aspects of your proposed research program.
Also, supposing that the research program that you allude to does become taskified to a sufficiently fine degree that it looks tractable, it's plausible that there will be a surge of interest in the relevant decision theory and that academia will solve the relevant problems of its own accord.
To be clear: I'm not necessarily discouraging you personally from studying decision theory - you're visibly passionate about it and my observation is that people are much better at doing what they're passionate about than what they do out of a sense of duty. At the same time; I don't see why decision theory deserves higher priority than other basic scientific research which could plausibly have favorable technological consequences.
As a disclosure of personal bias, I personally don't find decision theory at all aesthetically attractive (yet?) and it correspondingly seems like something that I would not enjoy or be good at, so I may be motivated to diminish its utilitarian importance or be blind to it. Regardless, I do appreciate that there are people like you, cousin_it and Wei Dai who are strongly interested in the subject, as I think that it's good for society to have a diversity of intellectuals researching a variety of subjects.
If you strongly believe that researching decision theory deserves high priority for researchers in general at present, then I would encourage you to write some articles about why you see it as deserving such high priority, with a view toward attracting collaborators and helping SIAI explain its focus on decision theory. Some of your thinking here has come out implicitly in your responses to some of my comments, but I would be interested in hearing a more holistic account of your views and their justification.
It's 1939. No one knows if it is possible to split an atom, no one knows how to split an atom in a controlled fashion, and no one knows how to use the splitting of an atom as a weapon capable of exterminating hundreds of thousands of civilians. There are no clear approaches to these problems.
Six years later, hundreds of thousands of civilians have been exterminated by the weaponized splitting of atoms.
A project that currently appears to be intractable may not remain so. If trustworthy upload technology were available to us today, then it would probably be a good idea to use it to develop FAI. But we don't have it yet, so if there is even a small chance that useful work will be accomplished by meat-brains we might as well take it.
We shouldn't neglect the development of upload technology and its precursors, but those fields are already receiving three or four orders of magnitude more funding and attention than FAI. It's clear where the marginal benefits are.
I'm saddened to see that your bottom line hasn't changed as a result of your series of posts on SIAI's PR.
I don't understand your comment. Surely your estimate of the usefulness of FAI work should depend on your estimate of its chances of success - not just its importance - because otherwise praying to Zeus would look even more attractive than developing FAI. Multi was pointing out that we don't seem to have any viable way of attacking the problem right now.
Was the state of nuclear science really so primitive in 1939 that there wasn't a discernable research program?
Is it really the case that upload precursors have 3-4 orders of magnitude more funding than FAI research?
In my view whether or not the marginal benefit is in FAI depends on whether people have potentially fruitful ideas on the subject. My post inquires about whether people have such ideas.
I'm still open to changing my mind.
My position on SIAI has changed since August; I have a more favorable impression now; the question is just what the optimal strategy is for the organization to pursue.
There likely was. The SIAI also seems to have a research program outlined.
Yup. There's a Blue Gene supercomputer that is being used to (among other things) simulate increasingly large portions of the brain at a neuronal level. That's $100m right there, and then we can throw in the funding for pretty much all neuroanatomy research as well. I'd guesstimate the global annual budget for FAI research at $1-2m. I may be defining upload precursors more loosely than you are, so I understand your skepticism.
The majority of your post focuses on the difficulty of taskifying FAI, which makes it sound as though you're arguing for a predetermined conclusion.
Great! :)
Considering that the SIAI is currently highly specialized to focus on FAI research, retooling the organization to do something else entirely seems like a waste of money. Reading your post from that perspective, your post seemed hostile, though I realize that wasn't intended.
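The funding comparison in point 2 above can be made concrete. The sketch below uses only the two figures given in that reply (about $100m for the Blue Gene project and a guesstimated $1-2m per year for FAI research); the helper name `orders_of_magnitude` is hypothetical, and the totals it prints for a 3-4 order gap are implied thresholds rather than reported budgets.

```python
import math


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two positive amounts."""
    return math.log10(larger / smaller)


blue_gene = 100e6               # figure quoted in the reply
fai_low, fai_high = 1e6, 2e6    # guesstimated range quoted in the reply

# Gap from the Blue Gene project alone.
print(round(orders_of_magnitude(blue_gene, fai_high), 2))  # 1.7
print(round(orders_of_magnitude(blue_gene, fai_low), 2))   # 2.0

# Total upload-precursor funding that a 3-4 order gap would imply.
print(fai_low * 10**3, fai_high * 10**4)
```

On these figures the Blue Gene project by itself accounts for about two orders of magnitude, so the "three or four orders" claim rests on how much of general neuroscience funding is counted as an upload precursor.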
Considering that the SIAI is currently highly specialized to focus on FAI research, retooling the organization to do something else entirely seems like a waste of money.
Bad argument. If in fact FAI research shouldn't be pursued, then they shouldn't pursue it, no matter sunk cost.
Agreed. I should have stated this as an implicit premise in my reasoning; if FAI research shouldn't be pursued, then the SIAI should probably be dissolved and its resources directed to more useful approaches. This is why I read multi as hostile: if FAI research is the wrong approach as he argues, then the SIAI should shut down. Which (in my head) compresses to "multi wants to shut down the SIAI."
Agreed. I should have stated this as an implicit premise in my reasoning; if FAI research shouldn't be pursued, then the SIAI should probably be dissolved and its resources directed to more useful approaches.
Probably not a good assumption; they've changed approaches before (in their earliest days, the idea of FAI hadn't been invented yet, and they were about getting to the Singularity, any Singularity, as quickly as possible). If, hypothetically, there arose some very convincing evidence that FAI is a suboptimal approach to existential risk reduction, then they could change again but retain their network of donors and smart people and so forth. Probably won't need to happen, but still, shutting down SIAI wouldn't be the only option (let alone the best option) if it turned out that FAI was a bad idea.
Thanks for the feedback.
Regarding (1): I looked up historical information on the development of the atomic bomb. According to The Manhattan Project: Making the Atomic Bomb:
Not surprisingly, Ernest Rutherford, Albert Einstein, and Niels Bohr regarded particle bombardment as useful in furthering knowledge of nuclear physics but believed it unlikely to meet public expectations of harnessing the power of the atom for practical purposes anytime in the near future. In a 1933 interview Rutherford called such expectations "moonshine." Einstein compared particle bombardment with shooting in the dark at scarce birds, while Bohr, the Danish Nobel laureate, agreed that the chances of taming atomic energy were remote.
This information has caused me to update my beliefs about how heavily to weight expert opinions about the likelihood of the advancement of a given hypothetical technology. I plan on reading more about the history of the atomic bomb and will think about this matter some more. Note however:
(a) The point that I make in the above article about there being a selection effect where people notice those scientific speculations that actually pan out disproportionately.
(b) XiXiDu's response below in which he (correctly) infers my view that SIAI's research agenda looks to be too broad and general to tackle the Friendly AI problem effectively.
Of these two points, the first doesn't seem so significant to me (small probability; high expected return); the point (b) seems much more significant to me, as in the absence of a compelling research agenda Friendly AI research seems to me to have vanishingly small probability of success. Now, I don't think that Friendly AI research by humans will inevitably have vanishingly small probability of success; I could imagine the problem suddenly coming to look tractable, as happened for the atomic bomb; I'm pointing out that the problem seems to require much finer taskification than the SIAI research program sets out in order to be tractable.
Regarding (5): Supposing that the SIAI staff and/or donors decide that Friendly AI research turns out to have low utilitarian expected value, I could easily imagine SIAI restructuring or rebranding to work toward higher utilitarian expected value activities.
Eliezer has done a lot to spread rationality between his upcoming rationality book, popular Harry Potter fanfiction and creation of Less Wrong. I've heard that the SIAI visiting fellows program has done a good job of building a community of people of high intellectual caliber devoted to existential risk reduction.
My understanding is that many of the recent papers (whether published or in progress) by Carl Shulman, Anna Salamon, Steve Rayhawk and Peter de Blanc as well as the SIAI Uncertain Future software application fall under the heading of advocacy/forecasting rather than Friendly AI research.
If I make a top level post about this subject that's critical of Friendly AI research I'll be sure to point out the many positive contributions of SIAI staff that fall outside of the domain of Friendly AI research.
The space colonization analog of the SIAI research program might read like this:
As far as I understand what multifoliaterose is scrutinizing, such an agenda is too broad and general to tackle effectively (at least so it appears to some outsiders).
It's true that in 1939 they didn't know how to split an atom. They also didn't know how to teleport, or travel backward in time, or do many other dangerous things. Should they have worried about those, too? What percentage of futuristic technologies ever gets developed? What percentage gets developed soon? It might be rational to worry about an unknown future, but it's irrational to worry about one specific scenario of doom unless you have lots of evidence that it will in fact happen.
I have outlined the same problem in my post Simple friendliness: Plan B for AI http://lesswrong.com/r/discussion/lw/327/simple_friendliness_plan_b_for_ai/
We have no time to create FAI before some AI project reaches success. So we need to make the problem of FAI taskable by dividing it into several mutually independent principles which could be so simple that everyone could understand them. In my post I suggested a possible list of such principles.
Eliezer has written a great deal about the concept of Friendly AI, for example in a document from 2001 titled Creating Friendly AI 1.0. The new SIAI overview states that:
The SIAI Research Program lists under its Research Areas:
Despite the enormous value that the construction of a Friendly AI would have, at present I'm not convinced that researching the Friendly AI concept is a cost-effective way of reducing existential risk. My main reason for doubt is that as far as I can tell, the problem of building a Friendly AI has not been taskified to a sufficiently fine degree for it to be possible to make systematic progress toward obtaining a solution. I'm open-minded on this point and quite willing to change my position subject to incoming evidence.
The Need For Taskification
In The First Step is to Admit That You Have a Problem Alicorn wrote:
In Let them eat cake: Interpersonal Problems vs Tasks HughRistik wrote:
We know that the problems of making a peanut butter sandwich and of finding a romantic partner can (often) be taskified because many people have succeeded in solving them. It's less clear that a given problem that has never been solved can be taskified. Some problems are in principle unsolvable whether because they are mathematically undecidable or because physical law provides an obstruction to their solution. Other currently unsolved problems have solutions in the abstract but lack solutions that are accessible to humans. That taskification is in principle possible is not a sufficient condition for solving a problem but it is a necessary condition.
The Difficulty of Unsolved Problems
There's a long historical precedent of unsolved problems being solved. Humans have succeeded in building cars and skyscrapers, have succeeded in understanding the chemical composition of far away stars and of our own DNA, have determined the asymptotic distribution of the prime numbers and have given an algorithm to determine whether a given polynomial equation is solvable in radicals, have created nuclear bombs and have landed humans on the moon. All of these things seemed totally out of reach at one time.
Looking over the history of human achievement gives one a sense of optimism as to the feasibility of accomplishing a goal. And yet, there's a strong selection effect at play: successes are more interesting than failures and we correspondingly notice and remember successes more than failures. One need only page through a book like Richard Guy's Unsolved Problems in Number Theory to get a sense for how generic it is for a problem to be intractable. The question of whether there are infinitely many perfect numbers, which goes back to the ancient Greeks, remains out of reach for the best mathematicians of today. The success of human research efforts has been as much a product of wisdom in choosing one's battles as it has been a product of ambition.
The Case of Friendly AI
My present understanding is that there are potential avenues for researching AGI. Richard Hollerith was kind enough to briefly describe Monte Carlo AIXI to me last month and I could sort of see how it might be in principle possible to program a computer to do Bayesian induction according to an approximation to a universal prior and equip the computer with a decision-making apparatus based on its epistemological state at a given time. Some people have suggested to me that the amount of computer power and memory needed to implement human-level Monte Carlo AIXI is prohibitively large, but (in my current, very ill-informed state; by analogy with things that I've seen in computational complexity theory) I could imagine ingenious tricks yielding an approximation to Monte Carlo AIXI which uses much less computing power/memory and which is a sufficiently close approximation to serve as a substitute for practical purposes. This would point to a potential taskification of the problem of building an AGI. I could also imagine that there are presently no practically feasible AGI research programs; I know too little about the state of strong artificial intelligence research to have anything but a very unstable opinion on this matter.
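To make the phrase "Bayesian induction according to an approximation to a universal prior" a little more concrete, here is a toy sketch in Python. It is not Monte Carlo AIXI and makes no claim to resemble its actual implementation. Everything in it is an assumption chosen for illustration: the hypothesis class (repeating bit patterns of at most four bits), the prior weight 2^(-2 * length) standing in for a weighting by program length, the noise parameter, and all function names. It only shows the shape of the idea: weight simple hypotheses more heavily, update on observations, then pick the action with the highest expected payoff.

```python
from itertools import product


def hypotheses(max_len):
    """Yield every repeating bit pattern of length 1..max_len (toy hypothesis class)."""
    for n in range(1, max_len + 1):
        for bits in product((0, 1), repeat=n):
            yield bits


def prior(pattern):
    # Assumption: shorter patterns get more weight, 2^(-2 * length).
    # This is a crude stand-in for weighting hypotheses by program length.
    return 2.0 ** (-2 * len(pattern))


def likelihood(pattern, observed, noise):
    # Probability of the observed bits if the pattern repeats forever and
    # each bit is flipped independently with probability `noise`.
    p = 1.0
    for i, bit in enumerate(observed):
        p *= (1 - noise) if pattern[i % len(pattern)] == bit else noise
    return p


def posterior(observed, max_len=4, noise=0.05):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    weights = {h: prior(h) * likelihood(h, observed, noise)
               for h in hypotheses(max_len)}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def prob_next_is_one(observed, max_len=4, noise=0.05):
    """Posterior-weighted probability that the next observed bit is 1."""
    post = posterior(observed, max_len, noise)
    t = len(observed)
    return sum(w * ((1 - noise) if h[t % len(h)] == 1 else noise)
               for h, w in post.items())


def choose_guess(observed):
    # "Decision-making apparatus": with a payoff of 1 for a correct guess,
    # the expected-payoff-maximizing action is the more probable bit.
    return 1 if prob_next_is_one(observed) >= 0.5 else 0


if __name__ == "__main__":
    history = [0, 1, 0, 1, 0, 1]
    print(prob_next_is_one(history), choose_guess(history))
```

Even in this toy the cost grows exponentially with the maximum pattern length, which is a small-scale version of the computing power concern raised above.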
As Eliezer has said, the problem of creating a Friendly AI is inherently more difficult than that of creating an AGI and may be a problem much more difficult than that of creating an AGI. At present, the Friendliness aspect of a Friendly AI seems to me to strongly resist taskification. In his poetic Mirrors and Paintings Eliezer gives the most detailed description of what a Friendly AI should do that I've seen, but the gap between concept and implementation here seems so staggeringly huge that it doesn't suggest to me any fruitful lines of Friendly AI research. As far as I can tell, Eliezer's idea of a Friendly AI is at this point not significantly more fleshed out (relative to the magnitude of the task) than Freeman Dyson's idea of a Dyson sphere. In order to build a Friendly AI, beyond conceiving of what a Friendly AI should be in the abstract, one has to convert one's intuitive understanding of friendliness into computer code in a formal programming language.
I don't even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings. Solving this problem would seem to require as a prerequisite an understanding of the makeup of the hypothetical AGI, something which people don't seem to have a clear grasp of at the moment. Even if one does have a model for a hypothetical AGI, writing code conducive to it recognizing humans as distinguished beings seems like an intractable task. And even with a relatively clear understanding of how one would implement a hypothetical AGI with the ability to recognize humans as distinguished beings, one is still left with the problem of making such a hypothetical AGI Friendly toward such beings.
In view of all this, working toward stable whole-brain emulation of a trusted and highly intelligent person concerned about human well-being seems to me like a more promising strategy for reducing existential risk at the present time than researching Friendly AI. Quoting a comment by Carl Shulman:
There are various things that could go wrong with whole-brain emulation, and it would be good to have a better option, but Friendly AI research doesn't seem to me to be one, in light of an apparent total absence of even the outlines of a viable Friendly AI research program.
But I feel like I may have missed something here. I'd welcome any clarifications of what people who are interested in Friendly AI research mean by Friendly AI research. In particular, is there a conjectural taskification of the problem?