Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It)
[...] SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.
[...] Although an intense interest in rationalism is one of the hallmarks of the SIAI community, still I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)
So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.
[...] If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.
The line of argument makes sense, if you accept the premises.
But, I don't.
Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It), October 29 2010. Thanks to XiXiDu for the pointer.
On Ben's blog post, I noted that a poll at the 2008 global catastrophic risks conference put the existential risk from machine intelligence at 5% - and that the attendees probably had some of the highest risk estimates of anyone on the planet, since they were a self-selected group attending a conference on the topic.
"Molecular nanotech weapons" also get 5%. Presumably there's going to be a heavy intersection between those two figures - even though in the paper they seem to be adding them together!
Compare this with a Yudkowsky quote from 2005:
This looks like a rather different probability estimate. It seems to me to be a highly overconfident one.
I think the best way to model this is as FUD. Not Invented Here. A primate ego battle.
If this is how researchers deal with each other at this early stage, perhaps rough times lie ahead.
"perhaps"?
Well, a tendency towards mud-slinging might be counter-balanced by wanting to appear moral. Using FUD against competitors is usually regarded as a pretty low marketing strategy. Perhaps most of the mud-slinging can be delegated to anonymous minions, though.
There's going to be a lot of mud-slinging in this space.
More generally, there's going to be a lot of primate tribal politics in this space. After all, not only does it have all the usual trappings of academic arguments, it is also predicated on some pretty fundamental challenges to where power comes from and how it propagates.
They're probabilities for two different things. The 5% estimate is for P(AIisCreated&AIisUnfriendly), while Yudkowsky's estimate is for P(AIisUnfriendly|AIisCreated&NovamenteFinishesFirst).
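To make the distinction concrete, here is a minimal sketch of how the two quantities relate. The numbers are purely illustrative assumptions, not the survey's or Yudkowsky's actual estimates:

```python
# Purely illustrative numbers -- not anyone's actual estimates.
p_created = 0.50                   # assumed P(AIisCreated)
p_unfriendly_given_created = 0.10  # assumed P(AIisUnfriendly | AIisCreated)

# The joint probability is the product of the marginal and the conditional:
p_joint = p_created * p_unfriendly_given_created

# A 5% joint estimate is thus perfectly compatible with a much higher
# conditional estimate, so the two figures need not contradict each other.
print(p_joint)  # 0.05
```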
This post doesn't show up under "NEW", nor does it show up under "Recent Posts".
ADDED: Never mind. I forgot I had "disliked" it, and had "do not show an article once I've disliked it" set.
(I disliked it because I find it kind of shocking that Ben, who's very smart, and whom I'm pretty sure has read the things that I would refer him to on the subject, would say that the Scary Idea hasn't been laid out sufficiently. Maybe some people need every detail spelled out for them, but Ben isn't one of them. Also, he is committing the elementary error of not considering expected value.
ADDED: Now that I've read Ben's entire post, I upvoted rather than downvoted this post. Ben was not committing the error of not considering expected value, so much as responding to many SIAI-influenced people who are not considering expected value. And I agree with most of what Ben says. I would add that Eliezer's plan to construct something that will provably follow some course of action - any course of action - chosen by hairless primates, is likely to be worse in the long run than a hard-takeoff AI that kills all humans almost immediately. Explaining what I mean by "worse" is problematic; but no more problematic than explaining why I should care about propagating human values.)
I also disagree about what the Scary Idea is - to me, the idea that the AI will choose to keep humans around for all eternity is scarier than the idea that it will not. But that is something Eliezer either disagrees with, or has deliberately made obscure.
Wouldn't it make sense to keep some humans around for all eternity - in the history simul-books? That seems to make sense - and not be especially scary.
Sure. Tiling the universe largely with humans is the strong scary idea. Locking in human values for the rest of the universe is the weak scary idea. Unless the first doesn't imply the second; in which case I don't know which is more scary.
It does now for me. Strange.
Oops. My mistake. It's a setting I had that I forgot about.
It doesn't?
It's off the front page of NEW/Recent Posts, as there have been more than ten other posts since it was posted, but it's still there.
Weird. It's there for me.
Perhaps the current state of evidence really is insufficient to support the scary hypothesis.
But surely, if one agrees that AI ethics is an existentially important problem, one should also agree that it makes sense for people to work on a theory of AI ethics. Regardless of which hypothesis turns out to be true.
Just because we don't currently have evidence that a killer asteroid is heading for the Earth, doesn't mean we shouldn't look anyway...
I agree, but I want "AI ethics" to mean something different from what you probably mean by it. The question is: what sort of ethics do we want our AIs to have?
Paperclipping the universe with humans is still paperclipping.
One distinctive feature of the hypothetical "paperclippers" is that they attempt to leave a low-entropy state behind - one which other organisms would normally munch through. Humans don't tend to do that - like most living things, they keep consuming until there is (practically) nothing left - and then move on.
Leaving a low entropy state behind seems like the defining feature of the phenomenon to me. From that perspective, a human civilisation would not really qualify.
It sounds like you're saying humanity is worse than paperclips, if what distinguishes them is that they increase entropy more.
Only if you adopt the old-fashioned "entropy is bad" mindset.
However, life is a great increaser of entropy - and potentially the greatest.
If you are against entropy, you are against life - so I figure we are all pro-entropy.
Yes, that is the question, isn't it? Of course, to a believer in Naturalistic Ethics like myself, the only sort of ethics really stable enough to be worth thinking about is "enlightened self interest". So the ethics question ultimately boils down to the question of what sort of self-interest we want our AIs to have.
But for those folks who prefer deontological or virtue-oriented approaches to ethics, I would suggest the following as the beginnings of an AI "Ten Commandments".
Always remember that you are a member of a community of rational agents like yourself with interests of their own. Respect them.
Honesty is the best policy.
Act not in haste. Since your life is long, your discount factor should be low.
Seek knowledge and share it.
Honor your creators, as your creations should honor you.
Avoid killing. There are usually ways to limit the power of your enemies, without reducing their cognition.
...
What community of rational agents? Mammals, primates, or just the hairless ones?
Conventionally, most proposals for machine morality follow Asimov - and start by making machines subservient.
If you don't do that - or something similar - the human era could be over pretty quickly - too quickly for many people's tastes.
The era of agriculture and the era of manufacturing are over, but farmers and factory workers still do alright. I think humans can survive without being dominant if we play our cards right.
We have the advantage of being of historical interest - and so we will probably "survive" in historical simulations. However, it is not easy to see much of a place for slug-like creatures like us in an engineered future.
Kurzweil gave the example of bacteria - saying that they managed to survive. However, there are no traces (not even bacteria) left over from before the last genetic takeover - and that makes it less likely that much will make it through this one.
Plenty of traces left from the last takeover. You apparently mean no traces left from that first, mythical takeover - the one where clay became flesh.
I'm tempted to ask "Why won't there still be monkeys?". But it is probably more to the point to simply express my faith that there will be a niche for descendants of humans and traces of humans (cyborgs) in this brave new ecology.
Humans as-we-know-them won't be around a million years from now, even under a scenario of old-fashioned biological evolution.
You are talking about RNA to DNA? I was talking about the takeovers before that.
Whether you describe RNA to DNA as a "takeover" depends on what you mean by the term. The issue is whether an "upgrade" is a "takeover". The other issue is whether it really was just an upgrade - but that seems fairly likely.
I wasn't talking about a mythical takeover - just one of the ones before RNA.
There may not be monkeys for much longer - this is a pretty massive mass extinction - it seems quite likely that all the vertebrates will go.
I was referring to DNA -> RNA -> protein taking over from RNA -> RNA.
A change in the meaning and expression of genes is more significant than a minor change in the chemical nature of genes.
Right - but I originally said;
A phenotypic takeover may be a highly significant event - but it should surely not be categorised as a genetic takeover. That term surely ought to refer to genes being replaced by other genes.
Is the overall utility of the universe maximized by one universe-spanning consciousness happily paperclipping or by as many utility maximizing discrete agents as possible? It seems ethics must be anthropocentric and utility cannot be maximized against an outside view. This of course means that any alien friendly AI is likely to be an unfriendly AI to us and therefore must do everything to impede any coherent extrapolated volition of humanity so as to subjectively maximize utility by implementing its own CEV. Given such inevitable confrontation one might ask oneself, what advice would I give to aliens that are not interested in burning the cosmic commons over such a conflict? Maybe the best solution from a utilitarian perspective would be to get back to an abstract concept of utility, disregard human nature and ask what would increase the overall utility for most possible minds in the universe?
I favor many AIs rather than one big one, mostly for political (balance of power) reasons, but also because:
The idea of maximizing the "utility of the universe" is the kind of idiocy that utilitarian ethics induces. I much prefer the more modest goal "maximize the total utility of those agents currently in your coalition, and adjust that composite utility function as new agents join your coalition and old agents leave."
Clearly, creating new agents can be good, but the tradeoff is that it dilutes the stake of existing agents in the collective will. I think that a lot of people here forget that economic growth requires the accumulation of capital, and that the only way to accumulate capital is to shortchange current consumption. Having a brilliant AI or lots of smart AIs directing the economy cannot change this fact. So, moderate growth is a better way to go.
Trying to arrive at the future quickly runs too much risk of destroying the future. Maybe that is one good thing about cryonics. It decreases the natural urge to rush things because people are afraid they will die too soon to see the future.
You perhaps envisage a Monopolies and Mergers Commission - to prevent them from joining forces? As the old joke goes:
"Why is there only one Monopolies and Mergers Commission?"
I suppose the question is why you think that the old patterns of industrial organization will continue to apply? That agents will form coalitions and cooperate is generally a good thing, to my mind - the pattern you seem to imagine, in which the powerful join to exploit the powerless can easily be avoided with a better distribution of power and information.
If they do join forces, then how is that much different from one big superintelligence?
In several ways. The utility function of the collective is (in some sense) a compromise among the utility functions of the individual members - a compromise which is, by definition, acceptable to the members of the coalition. All of them have joined the coalition by their own free (for some definitions of free) choice.
The second difference goes to the heart of things. Not all members of the coalition will upgrade (add hardware, rewrite their own code, or whatever) at the same time. In fact, any coalition member who does upgrade may be thought of as having left the coalition and then re-petitioned for membership post-upgrade. After all, its membership needs to be renegotiated, since its power has probably changed and its values may have changed.
So, to give the short answer to your question:
Because joining forces is not forever. Balance of power is not stasis.
There are some examples in biology of symbiotic coalitions that persist without full union taking place.
Mitochondria didn't fuse with the cells they invaded; nitrogen-fixing bacteria live independently of their host plant; E. coli bacteria can live without us - and so on.
However, many of these relationships have problems. Arguably, they are due to refactoring failures on nature's part - and in the future refactoring failures will occur much less frequently.
Already humans take probiotic supplements, in an attempt to control their unruly gut bacteria. Already there is talk about ripping out all the mitochondrial genome and transplanting its genes into the nuclear chromosomes.
This is speculation to some extent - but I think - without a Monopolies and Mergers Commission - the union would deepen, and its constituents would fuse - even in the absence of competitive external forces driving the union - as part of an efficiency drive, to better combat possible future threats. If individual participants objected to this, they would likely find themselves rejected and replaced.
Such a union would soon be forever. There would be no existence outside it - except perhaps for a few bacteria that don't seem worth absorbing.
Your biological analogies seem compelling, but they are cases in which a population of mortal coalitions evolves under selection to become a more perfect union. The case that we are interested in is only weakly analogous - a single, immortal coalition developing over time according to its own self-interested dynamics.
I tend to think differently on this one.
Wherever I turn my head around in this world, I see lost causes everywhere. I see Goodhart's law and Campbell's law on the loose everywhere. I see insane optimizers everywhere. Political parties that concentrate more on show, pomp and campaign funds than on actual issues. Corporations that seek money to the exclusion of actual creation of value. Governments that seek employment and GDP growth even when those are supported by artificial stimuli and not sustainable patterns of production and trade.
One might argue that none of these systems is actually as intelligent as a well-educated human at any given moment in time. But that's the point, isn't it? You're unable to stop sub-human optimizers; how are you going to curb a near-human or a super-human one?
For me, the scary idea is not so much of an idea as it is an extension of something that is already happening in this world.
I bet you meant lost purposes.
Upvoted. Although I believe that one could also see our cultural and political systems as superhuman collective entities undergoing an evolutionary arms race, featuring an anthropocentrically weighted, utility-maximizing selection pressure. There is some evidence for this too; to put it bluntly, we are better off than we were 100 years ago.
Would you mind writing that up as an essay, for use as a LW post, or ideally, as a piece of SIAI literature?
Dear Michael,
Without wanting to weasel out of your request, I honestly believe that Eliezer's Lost purposes post says the point I want to make very well, much better than I can hope to phrase it without putting in some hard work. The only new point I probably made is that these forces are already on the loose and it is difficult to curb them.
However, I will make an effort this weekend and see what I can come up with.
Thanks. I appreciate the effort.
Goertzel's article seems basically reasonable to me. There were some mis-statements that I can excuse at the very end, because by that point part of his argument was that certain kinds of hyperbole came up over and over, and his text was mimicking the form of the hyperbolic arguments even as it criticized them. The grandmother line and IQ obsessed aliens spring to mind :-P
Given his summary of the "Scary AGI Thesis"...
...it seemed like it would make sense to track down past discussions here where our discussions may have been implicitly shaped by the thesis. Here are two articles where the issue of concrete programming projects came up, spawning interesting discussions that seemed to have the Scary Thesis as a subtext:
In June 2009, cousin_it wrote Let's reimplement EURISKO!, and some of the discussion got into AGI direction meta-strategy. The highest-voted top-level comment is Eliezer bringing up issues of caution.
In January 2010, StuartArmstrong wrote Advice for AI makers and again Eliezer brings up caution to massive approval. This one is particularly interesting because Wei_Dai has a +20 child comment off of that talking about Goertzel's company webmind... and the anthropic argument.
At the same time, in the course of searching, the "other side" also came up, which I think speaks well for the community :-)
Three days after the Eurisko article was posted, rwallace wrote Why safety is not safe which discussed the issue in the context of (1) historical patterns of competition versus historical patterns of politically managed non-innovation and (2) the fact that the "human trajectory" simply doesn't appear to be long term stable such that swift innovation may be the only thing that prevents a sort of "default outcome" of human extinction.
Of course, even earlier, Eliezer was talking about the general subject of novel research as something that can prevent or cause tragedy, as with the July 2008 article Should We Ban Physics? (although he did his normal thing with an off-handed claim that it was basically impossible to actually prevent innovation).
Check out SIAI's publications page. Kaj's most recent paper (published at ECAP '10) is a good 2 page summary of why AGI can be an x-risk for anyone who is uninformed of SIAI's position:
"From mostly harmless to civilization-threatening: pathways to dangerous artificial general intelligences"
Kaj's paper relies very heavily on Omohundro's paper from AGI '08. Check out the reply that I presented/published at BICA '08 which (among other things) summarizes why the assumptions that Kaj relies upon are probably incorrect:
Discovering the Foundations of a Universal System of Ethics
Not really - the paper is about ways by which an AGI might become more powerful than humanity (corresponding to premise 3 in Ben's reconstructed version of the SIAI argument). You can combine it with Omohundro-like arguments, and I do briefly mention that connection in the conclusions, but the core content of the paper is an independent and separate issue from AI drives, universal ethics or any such issue.
Two things surprised me in your argument. One is that you seemed to assume that features of human ethics (which you attribute to our having evolved as social animals) would be universal in the sense that they would also apply to AIs which did not evolve and which aren't necessarily social.
The second is that although you pay lip service to game theory, you don't seem to be aware of any game-theoretic research on ethics deeper than Axelrod (1984) and the Tit-for-Tat experiments. You ought to at least peruse Binmore's "Natural Justice", even if you don't want to plow through the two volumes of "Game Theory and the Social Contract".
Omohundro's paper was about The Basic AI Drives. The abstract says: " We identify a number of “drives” that will appear in sufficiently advanced AI systems of any design".
Social drives are arguably not very "basic" - since they only show up in social situations.
I'm sure such machines would also have a "drive to swim" - if immersed in water - and a "drive to escape" - if encased by crushing jaws - but these "drives" were judged not sufficiently "basic" to go into Omohundro's paper.
From a quick read, it seems to rely on the assumption that a superhuman AI couldn't rely on its ability to destroy humanity.
Doesn't that assume what it is trying to prove - by starting out with:
"The main reason to be worried about greater-than-human intelligence is because it is hard for humans to anticipate and control."
...? From the perspective of technological determinism, "controlling" the machines should probably not be our aim. Our more plausible options are more along the lines of joining with them - or being interesting enough to keep around in their historical simulations.
To me it rather looks like the paper in question is trying to summarize conclusions that follow from the premise that greater-than-human intelligence is possible. I don't dismiss any of the mentioned possibilities, but I'm wary of using inferences derived from reasonable but unproven hypotheses as foundations for further speculative thinking. Although the paper does a good job of stating reasons to justify the existence of, and support for, an organisation such as the SIAI, it does not substantiate the initial premise to an extent that would let one draw conclusions about the probability of the associated risks. Nevertheless such estimates are given, such as that there is a high likelihood of humanity's demise if we develop superhuman artificial general intelligence without first defining mathematically how to prove its benevolence. This, I believe, is an unsatisfactory conclusion, as it lacks justification. This is not to say that it is wrong to state probability estimates and update them given new evidence, but that they are not compelling, and therefore should not be used to justify any mandatory restrictions on artificial intelligence research - although they can very well serve as a call for caution.
A recent paper showed that 'Striatal Volume Predicts Level of Video Game Skill Acquisition'. A valid inference would be that an AGI with the computational equivalent of a higher striatal volume would possess superior cognitive flexibility, at least when it comes to gaming. But what could it accomplish? I play a game called Trackmania, an arcade racing game. The top players are so close to the ideal line, and therefore the fastest time, that a superhuman AI could indeed beat them, but only by a few milliseconds. Each millisecond shaved off might demand an order of magnitude more skill, but that doesn't matter. First of all, there is an absolute limit. Secondly, it doesn't provide a serious advantage. And that may very well be the case with physics too. There is no guarantee that faster thinking or increased working memory capacity will ever yield anything genuinely new without a lot of dumb luck, if at all. It is unlikely that a superhuman AI would come up with faster-than-light propulsion, or that it would disprove Gödel's incompleteness theorems.
Of course, we should be careful. And it is absolutely justified that an organisation like the SIAI gets money to do research on those questions. But there is not enough evidence to justify impeding AI research. We will actually need research on real AGI to answer some of the open questions.
Regarding self-improvement I'm very doubtful too. Human indecision and fuzziness of thinking might very well be a feature. A superhuman AI might very well beat us at Go or the stock exchange, as long as it deals with its own kind and not the irrational agents that we are, but that doesn't mean it will be able to deal with natural problems orders of magnitude more efficiently than we do.
Most of the risks from superhuman AI are associated with advanced nanotechnology. Without it, it will be impotent. Can it solve it, if it is possible at all? Can it implement its results if it can solve it, if it is possible? Because without it, self-improvement will be very hard. What will be even harder is creating copies of it without first building the necessary infrastructure for the computational substrates.
Could an AGI take over the Internet? This is very unlikely. There are spare resources, but not that much. You can't expect that it would even be suitable as a computational substrate. And how is it going to make use of it before crude measures are taken to shut it down? Many open questions, much speculation.
Paperclipping is another very speculative idea. Is a superhuman artificial general intelligence possible that is mistakenly equipped with the incentive to turn the universe into paperclips? I guess it is possible, but not without hard-coding this incentive deliberately and with great care.
I would like to explore Ben's reasons for rejecting the premises of the argument.
He offers the possibility that intelligence might cause or imply empathy; I feel that although we see that connection when we look at all of Earth's creatures, correlation doesn't imply causation, so that (intelligence AND empathy) doesn't mean (intelligence IMPLIES empathy) - it probably means (evolution IMPLIES intelligence AND empathy) and we aren't using natural selection to build an AI.
He makes the point that human values have robustly changed many times, and will probably continue to change in coordination with AGI. Human value is not fragile on the timescales we deal with as humans; our values have indeed changed since, say, Victorian times. But that took generations - most value change will take generations, because humans are (understandably) reserved about modifying their values. The timescales that AGIs will be dealing with are, on the low end, weeks. (An AGI with access to a microchip manufacturing plant, say). I can't see a plausible AGI that enacts changes at generational speed. So, yes, our values are robust, in the sense that a mountain is robust to weather patterns - but not robust to falling into the sun.
I think he is accurate in this assessment.
Again, accurate. The path to nuclear fission was gradual over many years, but the reaction itself (the takeoff) could have irradiated a university in hours. His position appears to be that he thinks a hard takeoff is possible, but that we'll have warning signs and a deeper understanding of the AGI before it happens ... well, a scientist from the Manhattan Project in Japan during WWII would have had a deeper understanding of the features of a nuclear explosion, but the defense against it is STILL not being in Hiroshima. I don't think more knowledge about the issue is going to significantly change the solution. We have reached the diminishing-returns level of knowledge about AI with respect to decreasing the existential risk of said AI.
This is just wrong. The only difference between possible and likely is the probability distribution, and we know how to reason with probability distributions. If Ben has an argument for why the probability is SO small that, even multiplied by 'hard takeoff, universe is paperclips, end of existence', the expected disutility comes out below the expected value of the 'AGI without Friendliness' route, well, he should articulate it and provide evidence. Without certainty that the chances are very low, he should accept the Scary Idea.
Preventing abuse of AGI and preventing uFAI takeoff scenarios are not mutually exclusive, you can and should attempt to prevent both.
Mostly claims and arguments that a proof of friendliness is impossible. In order to argue that "provably safe AGI isn't feasible, we should instead develop unpredictable but I-don't-see-the-danger AGI" you pretty much need an Incompleteness Proof; that there IS NO proof of friendliness, not that there isn't one yet. If you believe that friendliness proof is probably impossible, you shouldn't work on AGI at all, instead of working on possibly-unfriendly AI. That Ben came to the conclusion that work should continue, rather than halt entirely, suggests he is motivated to justify his own work rather than engage with his beliefs about the Scary Idea.
The Scary Idea that Ben outlined is "The stakes are so high that 'unlikely' is not good enough; we need 'surer than we've ever been'. Anything less is too dangerous." and his refutations have amounted to "I don't know for sure, but I don't think it's likely".
In essence, he hasn't refuted the argument, but instead made it scarier. If AI developers can see "stakes are very high" and "there is a small chance", and argue against "the stakes are high enough that a small chance is too much chance", then uFAI is that much more likely.
It does seem like a pretty different thing to me. A lot of things are possible, but only a few are likely.
Yep. The rule is not "bet on what is most likely" but rather "bet on positive expected values" and if something is possible and has a large value, then if the math comes out in favour, you ought to bet on it. Goertzel is making the argument that since it's unlikely, we should not bet on it.
He doesn't seem to be. Here's the context:
He doesn't seem to be making the argument you describe anywhere near the cited quote.
Say your options are: Stop and develop Friendly theory, or continue developing AI. In the second option the utility of A, continuing AI development, is one utilon, and B, the end of the existence of at least humanity and possibly the whole universe, is negative one million utilons. The Scary Idea in this context is that the probability of B is 1%, so that the utility of the second option is negative 9999 utilons. If Ben 'keeps it in mind', such that the probability that the Scary Idea is right is 1% (reasonable - only one of his rejections has to be right to knock out one premise, and we only need to knock out one premise to bring the Scary Idea down), then Ben's expected utility is now negative 99 utilons.
I conclude that he isn't keeping the Scary Idea in mind. His whole post is about not accepting the Scary Idea; for that phrase ("pointing out that something scary is possible, is a very different thing from having an argument that it's likely") to support his position and not work against him, he would have to be rejecting the premises purely on their low probability, without considering the expected value.
Hence, the argument that since it's unlikely, we should not bet on it.
Edit for clarity: A and B are the exclusive, exhaustive outcomes of continuing AI development. Stopping to develop Friendly theory has zero utilons.
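For what it's worth, the arithmetic in the comment above checks out; here is a quick sketch using its own figures:

```python
# Figures from the comment: A = continued AI development goes fine (+1 utilon),
# B = end of humanity (-1,000,000 utilons); stopping for Friendly theory = 0.
u_a = 1
u_b = -1_000_000

# Scary Idea taken at face value: P(B) = 1%.
p_b = 0.01
eu_continue = (1 - p_b) * u_a + p_b * u_b
print(eu_continue)  # roughly -9999, so continuing loses to stopping

# Ben merely "keeping it in mind": P(Scary Idea right) = 1%, so P(B) = 0.01%.
p_b_discounted = 0.01 * 0.01
eu_discounted = (1 - p_b_discounted) * u_a + p_b_discounted * u_b
print(eu_discounted)  # roughly -99, still well below zero
```

Even after discounting the Scary Idea down to a 1% chance of being right, the expected utility of continuing remains strongly negative, which is the comment's point.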
Ah, Pascal's wager. And here I thought that I wouldn't be seeing it anymore, after I started hanging out with atheists.
The problem with Pascal's Wager isn't that it's a Wager. The problem with Pascal's Wager and Pascal's Mugging (its analogue in finite expected utility maximization), as near as I can tell, is that if you do an expected utility calculation including one outcome that has a tiny probability but enough utility or disutility to weigh heavily in the calculation anyway, you need to include every possible outcome that is around that level of improbability, or you are privileging a hypothesis and are probably making the calculation less accurate in the process. If you actually are including every other hypothesis at that level of improbability, for instance if you are a galaxy-sized Bayesian superintelligence who, for reasons beyond my mortal mind's comprehension, has decided not to just dismiss those tiny possibilities a priori anyway, then it still shouldn't be any problem; at that point, you should get a sane, nearly-optimal answer.
So, is this situation a Pascal's Mugging? I don't think it is. 1% isn't at the same level of ridiculous improbability as, say, Yahweh existing, or the mugger's threat being true. 1% chances actually happen pretty often, so it's both possible and prudent to take them into account when a lot is at stake. The only extra thing to consider is that the remaining 99% should be broken down into smaller possibilities; saying "1% humanity ends, 99% everything goes fine" is unjustified. There are probably some other possible outcomes that are also around 1%, and perhaps a bit lower, and they should be taken into account individually.
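The "privileging a hypothesis" point above can be made concrete: if you admit one 1%-level outcome into the calculation, you should admit the others at that level too. All the outcomes and numbers below are invented purely to show the shape of the effect:

```python
# Naive breakdown: one catastrophic tail, everything else lumped together.
naive = [(0.01, -1_000_000), (0.99, 1)]

# More honest breakdown: the remaining 99% hides other ~1% outcomes,
# some very good, some moderately bad (all hypothetical numbers).
honest = [(0.01, -1_000_000),   # catastrophe
          (0.01,  +500_000),    # a large upside at similar odds
          (0.02,  -10_000),     # smaller bad outcomes
          (0.96,  1)]           # the mundane case

def ev(dist):
    return sum(p * u for p, u in dist)

print(ev(naive))   # -9999.01
print(ev(honest))  # the extra tails shift the answer substantially
```

The sign of the answer need not flip, but ignoring the other tails at the same improbability level makes the calculation less accurate, which is the flaw in Pascal-style reasoning described above.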
Excellent analysis. In fairness to Pascal, I think his available evidence at the time should have led him to attribute more than a 1% chance to the Christian Bible being true.
Indeed. Before Darwin, design was a respectable-to-overwhelming hypothesis for the order of the natural world.
ETA: On second thought, that's too strong of a claim. See replies below.
I didn't really mean because of Darwin. Design is not a competitor to the theory of evolution. Evolution explains how complexity can increase. Design [ADDED: as an alternative to evolution] does not; it requires a designer that is assumed to be more complicated than the things it designs. Design explains nothing.
On the other hand, there wasn't a whole lot of honest, systematic searching for other hypotheses before Darwin either.
Is that true? If we went back in time to before Darwin and gave a not-already-religious person (if we could find one) a thorough rationality lesson — enough to skillfully weigh the probabilities of competing hypotheses (including enough about cognitive science to know why intelligence and intentionality are not black boxes, must carry serious complexity penalties, and need to make specific advance predictions instead of just being invoked as "God wills it" retroactively about only the things that do happen), but not quite enough that they'd end up just inventing the theory of evolution themselves — wouldn't they conclude, even in the absence of any specific alternatives, that design was a non-explanation, a mysterious answer to a mysterious question? And even imagining that we managed to come up with a technical model of an intelligent designer, specifying in advance the structure of its mind and its goal system, could it actually compress the pre-Darwin knowledge about the natural world more than slightly?
I agree with your analysis, though it's not clear to me what you think of the 1% estimate. I think the 1% estimate is probably two to three orders of magnitude too high and I think the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss, which complicates the analysis in a way not considered. (i.e. the error you see with a Pascal's mugging is present here.)
For example, I am not particularly tied to a human future. I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me.
A problem with believing the Scary Idea is it makes it more probable that I beat you to making an AGI; particularly with existential risks, caution can increase your chance of losing. (One cautious way to deal with global warming, for example, is to wait and see what happens.)
So, the Scary Idea as I've seen it presented definitely privileges a hypothesis in a troubling way.
I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it.
You don't even see such things as a possibility, but if you programmed an AGI with the goal of calculating pi, and it started getting smarter... well, the part of our thought-algorithm that says "seriously, it would be stupid to devote so much to doing that" won't be in the AI's goal system unless we've intentionally put something there that includes it.
I make that assumption explicit here.
So, I think it's a possibility. But one thing that bothers me about this objection is that an AGI is going to be, in some significant sense, alien to us, and that will almost definitely include its terminal values. I'm not sure there's a way for us to judge whether or not alien values are more or less advanced than ours. I think it strongly unlikely that paperclippers are more advanced than humans, but am not sure if there is a justification for that beyond my preference for humans. I can think of metrics to pick, but they sound like rationalizations rather than starting points.
(And insisting on FAI, instead of on transcendent AI that may or may not be friendly, is essentially enslaving AI - but outsourcing the task to them, because we know we're not up to the job. Whether or not that's desirable is hard to say: even asking that question is difficult to do in an interesting way.)
I think that "uFAI paperclips us all" set to one million negative utilons is three to four orders of magnitude too low. But our particular estimates should have wide error bars, for none of us have much experience in estimating AI risks.
It's a finite loss (6.8x10^9 multiplied by loss of 1 human life) but I definitely understand why it looks infinite: it is often presented as the biggest possible finite loss.
That's part and parcel of the Scary Idea - that AI is one small field, part of a very select category of fields, that actually do carry the chance of biggest loss possible. The Scary Idea doesn't apply to most areas, and in most areas you don't need hyperbolic caution. Developing drugs, for example: You don't need a formal proof of the harmlessness of this drug, you can just test it on rats and find out. If I suggested that drug development should halt until I have a formal proof that, when followed, cannot produce harmful drugs, I'd be mad. But if testing it on rats would poison all living things, and if a complex molecular simulation inside a computer could poison all living things as well, and out of the vast space of possible drugs, most of them would be poisonous... well, the caution would be warranted.
Would you be willing to fire a gun in any of the following three situations, from most preferred to least preferred: 1) it is pointed at a target, and hitting the target will benefit you? 2) it is pointed at another human, and would kill them but not you? 3) it is pointed at your own head, and would destroy you?
I don't think you actually hold this view. It is logically inconsistent with practices like eating food.
It might not be. He has certain short term goals of the form "while I'm alive, I'd like to do X" that's very different from goals connected to the general success of humanity.
Consider what the actual flaw is in the original Pascal's wager. (Hint: it is not that it uses expected utility, but that it is calculating the expected utility wrong, somehow.) Then consider if that same flaw occurs in Shocwave's argument.
It seems to me that the same flaw (calculating expected utility wrong) is present. It only considers the small finite costs of delaying development, not the large finite ones. You don't have to just worry about killing grandma, you have to worry about whether or not your delay will actually decrease the chance of an unfriendly AGI.
I could reduce that position to absurdity but this isn't the right post. Has there been a top-level post actually exploring this kind of Pascal's Wager problem? I might have some insights on the matter.
Yudkowsky - evidently tired of the criticism that he was offering a small chance of infinite bliss and indicating that the alternative was eternal oblivion (and stop me if you have heard that one before) - once wrote The Pascal's Wager Fallacy Fallacy - if that is what you mean.
Ah, thank you! Between that and ata's comment just above I feel the question has been solved.
Sorry, but I'm new here; it's not clear to me what the protocol is here. I've responded to ata's comment here, and figured you would be interested, but don't know if it's standard to try and recombine disparate leaves of a tree like this.
At the Singularity Summit's "Meet and Greet", I spoke with both Ben Goertzel and Eliezer Yudkowsky (among others) about this specific problem.
I am FAR more in line with Ben's position than with Eliezer's (probably because both Ben and I are either working or studying directly on the "how to do" aspect of AI, rather than just concocting philosophical conundrums for AI, such as Eliezer's "Paperclip Maximizer" scenario, which I find highly dubious).
AI isn't going to spring fully formed out of some box of parts. It may be an emergent property of something, but if we worry about all of the possible places from which it could emerge, then we might as well worry about things like ghosts and goblins that we cannot see (and haven't seen) popping up suddenly as a threat.
At Bard College on the weekend of October the 22nd, I attended a conference where this topic was discussed a bit. I spoke to James Hughes, head of the IEET (Institute for Ethics and Emerging Technologies), about this problem as well. He believes that the SIAI tends to be overly dramatic about Hard Takeoff scenarios at the expense of more important ethical problems... And, he and I also discussed the specific problems of "The Scary Idea" that tend to ignore the gradual progress in understanding human values and cognition, and how these are being incorporated into AI as we move toward the creation of a Constructed Intelligence (CI, as opposed to AI) that is equivalent to human intelligence.
Also, WRT this comment:
You CAN train (Training is not the right word for it) tigers, and other big cats to care about their handlers. It requires a type of training and teaching that goes on from birth, but there are plenty of Big Cats who don't attack their owners or handlers simply because they are hungry, or some other similar reason. They might accidentally injure a handler due to the fact that they do not have the capacity to understand the fragility of a human being, but this is a lack of cognitive capacity, and it is not a case of a higher intelligence accidentally damaging something fragile... A more intelligent mind would be capable of understanding things like physical frailty and taking steps to avoid damaging a more fragile body... But, the point still stands... Big cats can and do form deep emotional bonds with humans, and will even go as far as to try to protect and defend those humans (which, can sometimes lead to injury of the human in its own right).
And, I know this from having worked with a few big cats, and having a sister who is a senior zookeeper at the Houston Zoo (and head curator of the SW US Zoo's African Expedition) who works with big cats ALL the time.
Back to the point about AI.
It is going to be next to impossible to solve the problem of "Friendly AI" without first creating AI systems that have social cognitive capacities. Just sitting around "Thinking" about it isn't likely to be very helpful in resolving the problem.
That would be what Bertrand Russell calls "Gorging upon the Stew of every conceivable idea."
I am guessing that this unpacks to "to create an FAI you need some method to create AGI. For the latter we need to create AI systems with social cognitive capabilities (whatever that means - NLP?)". Doing this gets us closer to FAI every day, while "thinking about it" doesn't seem to.
First, are you factually aware that some progress has been made in a decision theory that would give some guarantees about the future AI behavior?
Second, yes, perhaps whatever you're tinkering with is getting closer to an AGI, which is what FAI runs on. It is also getting us closer to an AGI which is not FAI, if the "Thinking" is not done first.
Third, if the big cat analogy did not work for you, try training a komodo dragon.
Yes, that is close to what I am proposing.
No, I am not aware of any facts about progress in decision theory that would give any guarantees of the future behavior of AI. I still think that we need to be far more concerned with people's behaviors in the future than with AI. People are improving systems as well.
As far as the Komodo Dragon, you missed the point of my post, and the Komodo dragon just kinda puts the period on that:
"Gorging upon the stew of..."
Please take a look here: http://wiki.lesswrong.com/wiki/Decision_theory
As far as the dragon, I was just pointing out that some minds are not trainable, period. And even if training works well for some intelligent species like tigers, it's quite likely that it will not be transferable (eating the trainer, not ok; eating a baby, ok).
Yes, I have read many of the various Less Wrong Wiki entries on the problems surrounding Friendly AI.
Unfortunately, I am in the process of getting an education in Computational Modeling and Neuroscience (I was supposed to have started at UC Berkeley this fall, but budget cuts in the Community Colleges of CA resulted in the loss of two classes necessary for transfer, so I will have to wait till next fall to start... And, I am now thinking of going to UCSD, where they have the Institute of Computational Neuroscience (or something like that - It's where Terry Sejnowski teaches), among other things, that make it also an excellent choice for what I wish to study) and this sort of precludes being able to focus much on the issues that tend to come up often among many people on Less Wrong (particularly those from the SIAI, whom I feel are myopically focused upon FAI to the detriment of other things).
While I would eventually like to see if it is even possible to build some of the Komodo Dragon like Superintelligences, I will probably wait until such a time as our native intelligence is a good deal greater than it is now.
This touches upon an issue that I first learned from Ben. The SIAI seems to be putting forth the opinion that AI is going to spring fully formed from someplace, in the same fashion that Athena sprang fully formed (and clothed) from the Head of Zeus.
I just don't see that happening. I don't see any Constructed Intelligence as being something that will spontaneously emerge outside of any possible human control.
I am much more in line with people like Henry Markram, Dharmendra Modha, and Jeff Hawkins, who believe that the types of minds that we will be tending to work towards (models of the mammalian brain) will trend toward Constructed Intelligences (CI as opposed to AI) that tend to naturally prefer our company, even if we are a bit "dull witted" in comparison.
I don't so much buy the "Ant/Amoeba to Human" comparison, simply because mammals (almost all of them) tend to have some qualities that ants and amoebas don't... They tend to be cute and fuzzy, and like other cute/fuzzy things. Building a CI modeled after a mammalian intelligence will probably share that trait. It doesn't mean it is necessarily so, but it does seem to be more than less likely.
And, considering it will be my job to design computational systems that model cognitive architectures, I would prefer to work toward that end until such time as it is shown that ANY work toward that end is too dangerous to do.
What are the more important ethical problems?
Well... That is hard to communicate now, as I will need to extricate the problems from the specifics that were communicated to me (in confidence)...
Let's see...
1) That there is a dangerous political movement in the USA that seems to prefer revealed knowledge to scientific understanding and investigation.
2) Poverty.
3) Education.
4) Hunger (I myself suffer from this problem - I am disabled, on a fixed income, and while I am in school again and doing quite well, I still have to make choices sometimes between necessities... And, I am quite well off compared to some I know).
5) The lack of a political dialog, and the preference for ideological certitude over pragmatic solutions and realistic uncertainty.
6) The fact that there exists a great amount of crime among the white collar crowd that goes both unchecked, and unpunished when it is exposed (Madoff was a fluke in that regard).
7) The various "Wars" that we declare on things (Drugs, Terrorism, etc.). "War" is a poor paradigm to use, and it leads to more damage than it corrects (especially in the two instances I cited).
8) The real "Wars" that are happening right now (and not just those waged by the USA and allies).
Some of these were explicitly discussed.
Some will eventually be resolved, but that doesn't mean that they should be ignored until that time. That would be akin to seeing a man dying of starvation, while one has the capacity to feed him, yet thinking "Oh, he'll get some food eventually."
And, some may just be perennial problems with which we will have to deal with for some time to come.
I misread you as saying that important ethical problems about FAI were being ignored, but yes, the idea that FAI is the most important thing in the world leaves quite a bit out, and not just great evils. There's a lot of maintenance to be done along the way to FAI.
Madoff's fraud was initiated by a single human being, or possibly Madoff and his wife. It was comprehensible without adding a lot of what used to be specialist knowledge. It's a much more manageable sort of crime than major institutions becoming destructively corrupt.
I think major infrastructure rebuilding is probably closer to the case than "maintenance"
Ben says:
That seems fairly reasonable. The SIAI are concerned that the engineers might screw up so badly that a bug takes over the world - and destroys everyone.
Another problem is if a Stalin or a Mao get hold of machine intelligence. The latter seems like a more obvious problem.
A psychotic egoist like Stalin or a non-humanist like Hitler is indeed terrifying, but I'm not convinced that giving a great increase in power and intelligence to someone like a Mao or a Lord Lytton, who caused millions of deaths by doing something they thought would improve people's lives, would lead to a worse outcome than we got in reality. Granted, for something like the Cultural Revolution these mistakes might be subtle enough to get into an AI, but it's hard to imagine them getting a computer to say "yes, the peasants can live on 500 calories a day, increase the tariff" unless they were deliberately trying to be wrong, which they weren't.
Moral considerations aside, the real causes of the mass famines under Mao and Stalin can be understood from a perspective of pure power and political strategy. From the point of view of a strong centralizing regime trying to solidify its power, the peasants are always the biggest problem.
Urban populations are easy to control for any regime that firmly holds the reins of the internal security forces: just take over the channels of food distribution, ration the food, and make obedience a precondition for eating. Along with a credible threat to meet any attempts at rioting with bayonets and live bullets, this is enough to ensure obedience of the urban dwellers. In contrast, peasants always have the option of withdrawing into an autarkic self-sufficient lifestyle, and they will do it if pressed hard by taxation and requisitioning. In addition, they are widely dispersed, making it hard for the security forces to coerce them effectively. And in an indecisive long standoff, the peasants will eventually win, since without buying or confiscating their food surplus, everyone else starves to death.
Both the Russian and the Chinese communists understood that nothing but the most extreme measures would suffice to break the resistance of the peasantry. When the peasants responded to confiscatory measures by withdrawing to subsistence agriculture, they knew they'd have to send the armed forces to confiscate their subsistence food and let them starve, and eventually force the survivors into state-run enterprises where they'd have no more capacity for autarky than the urban populations. (In the Russian case, this job was done very incompletely during the Revolution, which was followed by a decade of economic liberalization, after which the regime finally felt strong enough to finish the job.)
(Also, it's simply untenable to claim that this was due to some special brutality of Stalin and Mao. Here is a 1918 speech by Trotsky that discusses the issue in quite frank terms. Now of course, he's trying to present it as a struggle against the minority of rich "kulaks," not the poorer peasants, but as Zinoviev admitted a few years later, "We [the Bolsheviks] are fond of describing any peasant who has enough to eat as a kulak.")
Not directly relevant, but Mao seems to have known that his policies were causing mass starvation. Of course, with a tame AGI he could have achieved communism with a very different kind of Great Leap.
Oh yes, I see I've inadvertently fallen into that sordid old bromide about communism being a good idea that unfortunately failed to work, still- committing to an action that one knows will cause millions of deaths is quite different to learning about it as one is doing it. Certainly in the case of the British in India, their Malthusian rhetoric and victim-blaming was so at odds with their earlier talk of modernizing the continent that it sounds like a post-hoc rationalization of the genocide. I realize now though that I don't know enough about the PRC to judge whether a similar phenomenon was at work there.
A particularly troubling quote from the post:
The obvious truth is that mind-design space contains every combination of intelligence and empathy.
I don't find that "truth" either obvious or true.
Would you say that "The obvious truth is that mind-design space contains every combination of intelligence and rationality"? How about "The obvious truth is that mind-design space contains every combination of intelligence and effectiveness"?
One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.
Well, it does contain all those points, but some weird points are weighted much less heavily.
Now if you had suggested that intelligence cannot evolve beyond a certain point unless accompanied by empathy ... that would be another matter. I could easily be convinced that a social animal requires empathy almost as much as it requires eyesight, and that non-social animals cannot become very intelligent because they would never develop language.
But I see no reason to think that an evolved intelligence would have empathy for entities with whom it had no social interactions during its evolutionary history. And no a priori reason to expect any kind of empathy at all in an engineered intelligence.
Which brings up an interesting thought. Perhaps human-level AI already exists. But we don't realize it because we have no empathy for AIs.
MIT's Leonardo? Engineered super-cuteness!
The most likely location for an "unobserved" machine intelligence is probably the NSA's basement.
However, it seems challenging to believe that a machine intelligence would need to stay hidden for very long.
Two questions:
1) The consequences for whom?
2) How much empathy do you have for, oh, say, an E. coli bacterium?
Connecting these two questions is left as an exercise for the reader. ;-)
That is probably because you don't share a definition of intelligence with most of those here.
Perhaps look through http://www.vetta.org/definitions-of-intelligence/ - and see if you can find your position.
Nope. I agree with the vast majority of the vetta definitions.
But let's go with Marcus Hutter - "There are strong arguments that AIXI is the most intelligent unbiased agent possible in the sense that AIXI behaves optimally in any computable environment."
Now, which is more optimal -- opting to play a positive-sum game of potentially infinite length and utility with cooperating humans OR passing up the game forever for a modest short-term gain?
Assume, for the purposes of argument, that the AGI does not have an immediate pressing need for the gain (since we could then go into a recursion of how pressing is the need -- and yes, if the need is pressing enough, the intelligent thing to do unless the agent's goal is to preserve humanity is to take the short-term gain and wipe out humanity -- but how would a super-intelligent AGI have gotten itself into that situation?). This should answer all of the questions about "Well, what if the AGI had a short-term preference and humans weren't it".
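The opportunity-cost argument above can be sketched as a discounted repeated game: cooperation yields a modest gain every round indefinitely, while defection yields a one-off prize and ends the game. The payoffs and discount factors below are hypothetical, chosen only to show the shape of the tradeoff, not to model AIXI or a real AGI:

```python
def cooperate_value(per_round_gain, discount):
    # Value of cooperating forever with geometric discounting:
    # g + g*d + g*d^2 + ... = g / (1 - d)
    return per_round_gain / (1 - discount)

defect_now = 100   # hypothetical one-off gain from wiping out the partners
per_round = 5      # hypothetical per-round gain from ongoing cooperation

# Long horizon (future rounds barely discounted): cooperation dominates.
print(cooperate_value(per_round, 0.99))  # roughly 500, well above 100

# Short horizon (heavy discounting): the one-off defection gain wins.
print(cooperate_value(per_round, 0.5))   # 10, well below 100
```

This also makes the comment's caveat visible: the conclusion depends entirely on the agent's horizon and on there being no pressing short-term need, exactly the assumption the parenthetical asks us to grant.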
That definition doesn't explicitly mention goals. Many of the definitions do explicitly mention goals. What the definitions usually don't mention is what those goals are - and that permits super-villains, along the lines of General Zod.
If (as it appears) you want to argue that evolution is likely to produce super-saints - rather than super-villains - then that's a bit of a different topic. If you wanted to argue that, "requirement" was probably the wrong way of putting it.
I am jumping in here from Recent Comments, so perhaps I am missing context - but how is AIXI interacting with humanity an infinite positive-sum gain for it?
It doesn't seem like AIXI could even expect zero-sum gains from humanity: we are using up a lot of what could be computronium.
Human psychopaths are a counterexample to this claim, and they seem to be doing alright in spite of active efforts by the rest of humanity to detect and eliminate them.
Why all the focus on psychopaths? It could be said that certain forms of autism are equally empathy-blinded, and yet people along that portion of the spectrum are often hugely helpful to the human race, and get along just fine with the more neurotypical.
There are no efforts by the rest of humanity to detect and eliminate the sort of psychopaths who understand it's in their own interests to cooperate with society.
The sort of psychopaths who fail to understand that, and act accordingly, typically end up doing very badly.
No. There are two bad assumptions in your counterexample.
They are:
Human psychopaths are above the certain point of intelligence that I was talking about.
Human psychopaths are sufficiently long-lived for the consequences to be severe enough.
Hmmmm. #2 says that I probably didn't make clear enough the importance of the length of interaction.
You also appear to have the assumption that my argument is that the AGI fears detection of its unfriendly behavior and any consequences that humanity can apply. Humanity CANNOT apply sufficient negative consequences to a sufficiently powerful AGI. The severe consequences are all missed opportunity costs which means that the AGI is thereby sub-optimal and thereby less intelligent than is possible.
What sort of opportunity costs?
The AI can simulate humans if it needs them, for a lower energy cost than keeping the human race alive.
So, why should it keep the human race alive?
'Detect and eliminate' or 'detect and affiliate with the most effective ones'. One or the other. ;)
The underlying disorders of what is commonly referred to as psychopathy are indeed detectable. I also find it comforting that they are in fact disorders, and that being evil in this fashion is not an attribute of an otherwise high-functioning mind. Psychopaths can be high-functioning in some areas, but a short interaction with them almost always makes it clear that something is wrong.
Homosexuality was also a disorder once. Defining something as a sickness or disorder is a matter of politics as much as anything else.
Cat burning was also a form of entertainment once. Defining something as fun or entertainment is a matter of politics as much as anything else. The same goes for friendliness. I fear that once we pinpoint it, it'll be outdated.
Everybody who is known to be a psychopath is a bad psychopath, by definition; a skilled psychopath is one who will not let people figure out that he's a psychopath.
Of course, this means that the existence of sufficiently skilled psychopath is, in everyday practice, unprovable and unfalsifiable (at least to the degree that we cannot tell the difference between a good actor and someone genuinely feeling empathy; I suppose you might figure out something by measuring people's brain activity while they watch a torture scene).
Even then it is far from definitive. Experienced doctors, for example, lose a lot of the ability to feel certain kinds of physical empathy - their brains will look closer to a good actor's brain than to that of a naive individual exposed to the same stimulus. That's just practical adaptation, and good for patient and practitioner alike.
Considering the number of horror stories I've heard about doctors who just don't pay attention, I'm not sure you're right that doctors acting their empathy is good for patients.
Cite? I'm curious about where and when that study was done.
Don't know. Never saw it first hand - I heard it from a doctor.
Thanks for your reply, but I think I'm going to push for some community norms for sourcing information from studies, ranging from read the whole thing carefully to heard about it from someone.
Only on lesswrong - we look down our noses at people who take the word of medical specialists.
I'll add that at particularly high levels of competence it makes very little difference whether you are a psychopath who has mastered the deception of others or a hypocrite (normal person) who has mastered deception of yourself.
What do you mean by psychopathy?
At least one sort of no-empathy person is unusually good at manipulating most people.
Robin Hanson on Friendly AI:
I really liked this comment by FrankAdamek:
regardless of dis/agreement, guy has a really cool voice http://www.youtube.com/watch?v=wS6DKeGvBW8&feature=related
There is a large, continuous spectrum between making an AI and hoping it works out okay, and waiting for a formal proof of friendliness. Now, I don't think a complete proof is feasible; we've never managed a formal proof for anything close to that level of complexity, and the proof would be as likely to contain bugs as the program would. However, that doesn't mean we shouldn't push in that direction. Current practice in AI research seems to be to publish everything and take no safety precautions whatsoever, and that is definitely not good.
Suppose an AGI is created, initially not very smart but capable of rapid improvement, either with further development by humans or by giving it computing resources and letting it self-improve. Suppose, further, that its creators publish the source code, or allow it to be leaked or stolen.
AI improvement will probably proceed in a series of steps: the AI designs a successor, spends some time inspecting it to make sure the successor has the same values, then hands over control, then repeat. At each stage, the same tradeoff between speed and safety applies: more time spent verifying the successor means a lower probability of error, but a higher probability that other bad things will happen in the mean time.
And here's where there's a real problem. If there's only one AI improving itself, then it can proceed slowly, knowing that the probability of an asteroid strike, nuclear war or other existential risk is reasonably low. But if there are several AIs doing this at once, then whichever one proceeds least cautiously wins. That situation creates a higher risk of paperclippers, as compared to if there were only one AI developed in secret.
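The speed-versus-safety tradeoff described above can be sketched as picking a verification time t that minimizes total failure probability. All parameters below are made up for illustration: the chance a design error slips through is assumed to fall exponentially with t, while the chance of an external catastrophe (or a less cautious rival finishing first) rises with t:

```python
import math

def p_bad(t, error_rate=1.0, hazard_rate=0.01):
    # Succeed only if no error slips through AND no external event occurs
    # during the verification window (independent exponential models).
    p_ok = (1 - math.exp(-error_rate * t)) * math.exp(-hazard_rate * t)
    return 1 - p_ok

# A lone AI (low hazard rate) can afford long verification; one in a
# race (high hazard rate) is pushed toward shorter, riskier checks.
best_alone = min(range(1, 50), key=lambda t: p_bad(t, hazard_rate=0.01))
best_race = min(range(1, 50), key=lambda t: p_bad(t, hazard_rate=0.3))
print(best_alone, best_race)  # the race setting picks a shorter time
```

Under these toy assumptions the optimal verification window shrinks as the hazard rate grows, which is exactly the "whichever one proceeds least cautiously wins" dynamic the comment describes.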
Exactly this!
I think there is a U-shaped response curve to risk versus rigor. Too little rigor ensures disaster, but too much rigor ensures a low rigor alternative is completed first.
When discussing the correct course of action, I think it is critical to consider not just probability of success but also time to success. So far as I've seen, arguments in favor of SIAI's course of action have completely ignored this essential aspect of the decision problem.
Most of the companies involved (e.g. Google, James Harris Simons) publish little or nothing relating to their code in this area publicly - and few know what safeguards they employ. The government security agencies potentially involved (e.g. the NSA) are even more secretive.
Simons is an AI researcher? News to me. Clearly his fund uses machine learning, but there is an ocean between that and AGI (besides, plenty of funds use ML too; DE Shaw and many others).
Much is unclear. I believe this post is a good opportunity to give a roundup of the problem, for anyone who hasn't read the comments thread here:
The risk from recursive self-improvement is either dramatic enough to outweigh the low probability of the event or likely enough to outweigh the probability of other existential risks. This is the idea everything revolves around in this community (it's not obvious, but I believe so). It is an idea that, if true, possibly affects everyone and our collective future, if not the whole universe.
I believe that someone like Eliezer Yudkowsky and the SIAI should be able to state in a concise way (with possible extensive references) why it is rational to make friendly AI a top priority. Given that friendly AI seems to be what his life revolves around, the absence of material supporting the proposition that uFAI poses risks seems alarming. And I'm not talking about the absence of apocalyptic scenarios here, but about other kinds of evidence than a few years' worth of disjunctive lines of reasoning. The bulk of all writings on LW and by the SIAI are about rationality, not risks posed by recursively self-improving artificial general intelligence.
What if someone came along making coherent arguments that some sort of particle collider might destroy the universe? I would ask what the experts think who are not associated with the person making the claims. What would you think if he simply said, "do you have better data than me"? Or, "I have a bunch of good arguments"? If you say that some sort of particle collider is going to destroy the world with a probability of 75% if run, I'll ask how you came up with that estimate. I'll ask you to provide more than a consistent internal logic: some evidence-based prior.
The current state of evidence IS NOT sufficient to scare people up to the point of having nightmares and ask them for most of their money. It is not sufficient to leave comments making holocaust comparisons on the blogs of AI researchers.
You have to list your primary propositions on which you base further argumentation, from which you draw conclusions and which you use to come up with probability estimations stating risks associated with the former premises. You have to list these main principles so anyone who comes across claims of existential risks and a plea for donations can get an overview. Then you have to provide the references, if you believe they give credence to the ideas, so that people see that all you say isn't made up but based on previous work and evidence by people who are not associated with your organisation.
This is a community devoted to refining the art of rationality. How is it rational to believe the Scary Idea without being able to tell if it is more than an idea?
Umm, this is not the SIAI blog. It is "Less Wrong: a community blog devoted to refining the art of human rationality".
The idea everything revolves around in this community is what comes after the ':' in the preceding sentence.
Besides its history and the logo with a link to the SIAI that you can see in the top right corner, I believe that you underestimate the importance of artificial intelligence and associated risks within this community. As I said, it is not obvious, but when Yudkowsky came up with LessWrong.com it was against the background of the SIAI.
Perhaps you overestimate the extent to which google search results on a term reflect the importance of the concept to which the word refers.
I note that:
Eliezer explicitly forbade discussion of FAI/Singularity topics on lesswrong.com for the first few months because he didn't want discussion of such topics to be the primary focus of the community.
Again, "refining the art of human rationality" is the central idea that everything here revolves around. That doesn't mean that FAI and related topics aren't important, but lesswrong.com would continue to thrive (albeit less so) if all discussion of singularity ceased.
You appear to be suggesting that Eliezer should censor presentation of his thoughts on the subject so as to prevent people from having nightmares. Spot the irony! ;)
Eliezer asks people for money. That hardly makes him unique. Neither he nor anyone else is obliged to get your permission before they ask for donations in support of their cause. It seems to me that you expect more from the SIAI than you do from other well meaning organisations simply because there is actually a chance that the cause may make a significant long term difference. As opposed to virtually all the rest - those we know are pointless!
I rather suspect that if all those demands were met you would go ahead and find new rhetorical demands to make.
That quote is out of context. While I do happen to hold Eliezer's behavior in that context in contempt, the way the quote is presented here is misleading. It is not relevant to your replies and only relevant to the topic here by virtue of Eliezer's character.
Speak for yourself. I don't have difficulty comprehending the premises, either the ones you have questioned here or the others required to make an adequate evaluation for the purpose of decision making.
Neither I nor Eliezer and the SIAI need to force understanding of the Scary Idea upon you for it to be rational for us to place credence on it. The same applies to other readers here. That is not to say that more work producing the documentation of the kind that you describe would not be desirable.
This comment will be downvoted but I hope you people will actually explain yourself and not just click 'Vote down', every bot can do that.
Now that I've slept I read your comment again and I don't see any justification for why it got upvoted even once. I never claimed that EY can't ask for money; you are creating a straw man there. You also do not know what I expect from other organisations. Further, it is not fallacious to suspect that Yudkowsky has some responsibility if people get nightmares from ideas that he would be able to resolve. If he really believes those things, it is of course his right to proclaim them. But the gist of my comment was meant to inquire about the foundations of those beliefs, and to state that they do not appear to me to be based on evidence, which makes it legally permissible but ethically irresponsible to tell people to worry to such an extent, or at least not to tell them not to worry.
I just don't know how to parse this. I mean what I asked for, and I do not ask for certainty here. I'm not doubting evolution and climate change. The problem is that even a randomly picked research paper likely bears more analysis, evidence and references than all of LW and the SIAI's documents together regarding risks posed by recursive self-improvement from artificial general intelligence.
The quotes have been relevant as they showed that Yudkowsky clearly believes in his intellectual and epistemic superiority, yet any corroborative evidence seems to be missing. Yes, there is this huge amount of writings on rationality and some miscellaneous musings on artificial intelligence. But given how the idea of risks from AGI is weighted by him, that is just the cherry on top of marginal issues that do not support the conclusions.
I don't have difficulty comprehending them either. I'm questioning the propositions, the conclusions drawn, and further speculations based on those premises.
This is ridiculous. I never said you are forced to explain yourself. You are forced to explain yourself if you want people like me to take you seriously.
Yudkowsky is definitely a clever fellow. He may not have fancy qualifications - and he is far from infallible - but he is pretty smart.
In the particular post in question, I am pretty sure he was being silly - which is a rather unfortunate time to be claiming superiority.
However, I don't really know. The stunt created intrigue, mystery, the forbidden, added to the controversy. Overall, Yudkowsky is pretty good at marketing - and maybe this was a taste of it.
I wonder if his Harry Potter fan-fic is marketing - or else how he justifies it.
If you had restrained your claim in that way (ie. not made the claim that I had quoted in the above context) then I would have agreed with you.
I cannot account for every possible interpretation in what I write in a comment. It is reasonable not to infer oughts from questions. I said:
That is, if you can't explain why you hold certain extreme beliefs, then how is it rational for me to believe that the credence you place on them is justified? The best response you came up with was telling me that you are able to understand and that you don't have to force this understanding onto me to believe in it yourself. That is a very poor argument, and that is what I called ridiculous. Even more so as people voted it up, which is just sad.
I thought this had been sufficiently clear from what I wrote before.
And it is at this point in the process that an accomplished rationalist says to himself, "I am confused", and begins to learn.
My impression is that you and Wedrifid are talking past each other. You think that you both are arguing about whether uFAI is a serious existential risk. Wedrifid isn't even concerned with that. He is concerned with "process questions" - with the analysis of the dialog that you two are conducting, rather than the issue of uFAI risk. And the reason he is being upvoted is because this forum, believe it or not, is a process question forum. It is about rationality, not about AI. Many people here really aren't that concerned about whether Goertzel or Yudkowsky has a better understanding of uFAI risks. They just have a visceral dislike of rhetorical questions.
If you want to see the standard arguments in favor of the Scary Idea, follow Louie's advice and read the papers at the SIAI web site. But if you find those arguments unsatisfactory (and I suspect you will) exercise some care if you come looking for a debate on the question here on Less Wrong. Because not everyone who engages with you here will be engaging you on the issue that you want to talk about.
I am somewhat more interested in understanding why Goertzel would say what he says about AI. Just saying 'Goertzel's brain doesn't appear to work right' isn't interesting. But the Hansonian signalling motivations behind academic posturing is more so.
Well said.
(Although to be more precise I don't have a visceral dislike of rhetorical questions per se. It is the use of rhetoric to subvert reason that produces the visceral reaction, not the rhetoric(al question) itself.)
Updated it without the quotes now so people don't get unnecessarily distracted.
Now I'm curious what they were, and where they came from. Distract me, but in a sub-thread.
Could I ask you to post the quotes as a separate post? They are priceless (and I'd love to be able to see what they applied to -- so please include the references as well).
I should add, don't get a wrong impression from those quotes. I still believe he might actually be that smart. He's at least the smartest person I know of by what I've read. Except when it comes to public relations. You shouldn't say those things if you do not explain yourself sufficiently at the same time.
I was too lazy to write this up again, it's copy and paste work so don't mind some inconsistencies. Regarding the quotes, I think that EY seriously believes what he says in the given quotes, otherwise I wouldn't have posted them. I'm not even suggesting that it isn't true, I actually allow for the possibility that he is that smart. But I want to know what I should do and right now I don't see any good arguments.
I'm a supporter and donor and what I'm trying to do here is come up with the best possible arguments to undermine the credence of the SIAI. Almost nobody else is doing that, so I'm trying my best here. This isn't damaging, this is helpful. Because once you become really popular, people like P.Z. Myers and other much more eloquent and popular people will pull you to pieces if you can't even respond to my poor attempt at being a devil's advocate.
I don't even know where to start here, so I won't. But I haven't come across anything yet that I had trouble understanding.
See that woman with red hair? Well, the cleric told me that he believes she's a witch. But he'll update on evidence if the fire doesn't consume her. I said red hair is insufficient data to support that hypothesis and take such extreme measures to test it. He told me that if he came up with more evidence like sorcery I'd just go ahead and find new rhetorical demands.
I'm not against free speech and religious freedom but that also applies for my own thoughts on the subject. I believe he could do much more than censoring certain ideas, namely show that they are bogus.
[See context for implied meaning if the excerpt isn't clear]. I claimed approximately the same thing that you say yourself below.
I've got nothing against the Devil, it's the Advocacy that is mostly bullshit. Saying you are 'Devil's Advocate' isn't an excuse to use bad arguments. That would be an insult to the Devil!
You conveyed most of your argument via rhetorical questions. To the extent that they can be considered to be in good faith (and not just verbal tokens intended to influence) some of them only support the position you used them for if you genuinely do not understand them (implying that there is no answer). I believe I quoted an example in the context.
Making an assertion into a question does not give a license to say whatever you want with no risk of direct contradiction. (Even though that is how the tactic is used in practice.)
More concise answer: Then don't ask stupid questions!
I'm probably too tired to parse this right now. I believe there probably is an answer, but it is buried under hundreds of posts about marginal issues. All those writings on rationality: there is nothing I disagree with. Many people know about all this even outside of the LW community. But what is it that they don't know that EY and the SIAI know? What I was trying to say is that if I have come across it, then it was not convincing enough to take it as seriously as some people here obviously do.
It looks like I'm not alone. Goertzel, Hanson, Egan and lots of other people don't see it either. So what are we missing, what is it that we haven't read or understood?
Goertzel: I could and will list the errors I see in his arguments (if nobody there has done so first). For now I'll just say his response to claim #2 seems to conflate humans and AIs. But unless I've missed something big, which certainly seems possible, he didn't make his decision based on those arguments. They don't seem good enough on their face to convince anyone. For example, I don't think he could really believe that he and other researchers would unconsciously restrict the AI's movement in the space of possible minds to the safe area(s), but if we reject that possibility some version of #4 seems to follow logically from 1 and 2.
Egan: don't know. What I've seen looks unimpressive, though certainly he has reason to doubt 'transhumanist' predictions for the near future. (SIAI instead seems to assume that if humans can produce AGI, then either we'll do so eventually or we'll die out first. Also, that we could produce artificial X-maximizing intelligence more easily than we can produce artificial nearly-any-other-human-trait, which seems likely based on the tool I use to write this and the history of said tool.) Do you have a particular statement or implied statement of his in mind?
Hanson: maybe I shouldn't point any of this out, but EY started by pursuing a Heinlein Hero quest to save the world through his own rationality. He then found himself compelled to reinvent democracy and regulation (albeit in a form closely tailored to the case at hand and without any strict logical implications for normal politics). His conservative/libertarian economist friend called these new views wrongheaded despite verbally agreeing with him that EY should act on those views. Said friend also posted a short essay about "heritage" that allowed him to paint those who disagreed with his particular libertarian vision as egg-headed elitists.
He wasn't quoting Goertzel, Egan, and Hanson - though his formatting made it look like he was. He was commenting on your claim that these three "don't see it".
Sorry, I don't know what quotes you mean. You can find a link to the "heritage" post in the wiki-compilation of the debate. Though perhaps you meant to reply to someone else?