One of the biggest problems with evaluating the plausibility of SI's arguments is that the arguments involve a large number of premises (as any complex argument will), and these premises are often either not written down or are scattered across disparate locations, making it hard to piece the overall argument together. SI is aware of this, and one of their major aims is to state their argument very clearly. I'm hoping to help with this aim.

My specific plan is as follows: I want to map out the broad structure of SI's arguments in "standard form" - that is, as a list of premises that support a conclusion. I then want to write this up into a more readable summary and discussion of SI's views.

The first step to achieving this is making sure that I understand what SI is arguing. Obviously, SI is arguing for a number of different things, but I take their principal argument to be the following:

P1. Superintelligent AI (SAI) is highly likely to be developed in the near future (say, within the next 100 years, and probably sooner).
P2. Without explicit FAI research, superintelligent AI is likely to pose a global catastrophic risk for humanity.
P3. FAI research has a reasonable chance of making it so that superintelligent AI will not pose a global catastrophic risk for humanity.
Therefore
C1. FAI research has a high expected value for humanity.
P4. We currently fund FAI research at a level below that supported by its expected value.
Therefore
C2. Humanity should expend more effort on FAI research.

Note that P1 in this argument can be weakened to simply say that SAI is a non-trivial possibility but, in response, stronger versions of P2 and P3 are required if the conclusion is still to be viable (that is, if SAI is less likely, it needs to be more dangerous, or FAI research needs to be more effective, in order for FAI research to have the same expected value). However, if P2 and P3 already seem strong to you, then the argument can be made more forceful by weakening P1. One further note, however: doing so might also make the move from C1 and P4 to C2 more open to criticism - that is, some people think that we shouldn't make decisions based on expected value calculations when we are talking about low-probability/high-value events.
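To make that tradeoff concrete, here is a minimal sketch of the expected-value reasoning under a deliberately simple multiplicative model. The function name and all probabilities are hypothetical placeholders, chosen only to show how a weaker P1 can be offset by stronger versions of P2 and P3; they are not figures that SI endorses.

```python
# A minimal sketch of the expected-value tradeoff described above.
# All numbers are hypothetical placeholders, used only for illustration.

def ev_of_fai_research(p_sai, p_catastrophe_without_fai, p_fai_research_works,
                       value_of_averting_catastrophe):
    """Expected value of FAI research under a simple multiplicative model:
    P1 (probability SAI is developed) x P2 (probability it is catastrophic
    absent FAI research) x P3 (probability FAI research averts this)
    x the value of averting the catastrophe."""
    return (p_sai * p_catastrophe_without_fai * p_fai_research_works
            * value_of_averting_catastrophe)

# Stronger P1 with moderate P2 and P3.
ev_strong_p1 = ev_of_fai_research(0.5, 0.4, 0.3, 1.0)   # ~0.06

# Weakened P1 (SAI merely a non-trivial possibility): P2 and P3 must be
# stronger for the expected value to come out the same.
ev_weak_p1 = ev_of_fai_research(0.1, 0.8, 0.75, 1.0)    # ~0.06

print(ev_strong_p1, ev_weak_p1)  # identical expected values despite a weaker P1
```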

So I'm asking for a few things from anyone willing to comment:

1.) A sense of whether this is a useful project (I'm very busy and would like to know whether this is a suitable use of my scarce spare time) - I will take upvotes/downvotes as representing votes for or against the idea (so feel free to downvote me if you think this idea isn't worth pursuing even if you wouldn't normally downvote this post).
2.) A sense of whether I have the broad structure of SI's basic argument right.

In terms of my commitment to this project: as I said before, I'm very busy, so I don't promise to finish this project. However, I will commit to notifying Less Wrong if I give up on it and to engaging in handover discussions with anyone who wants to take the project over.


My specific plan is as follows: I want to map out the broad structure of SI's arguments in "standard form" - that is, as a list of premises that support a conclusion.

One of the huge problems here is definitions - laying them out clearly and making sure nothing gets smuggled in. P1, for example, introduces "superintelligent" as one word when it needs to be paragraphs, or perhaps pages. Does superintelligence require the internal use of a utility function?

Does superintelligence require the internal use of a utility function?

No, a utility function is just a useful way of describing how agents behave, whether or not it's explicitly represented in its code.

Utility functions are much more useful descriptions for consequentialist agents than other varieties of agents, but I agree they have at least some use for all types of agents.

The context of that question is a discussion between some SI folks and some SI critics, where some SI publications had assumed that agents will have internal utility functions, along with some dangers associated with those utility functions. The same conclusions may hold for agents without internal utility functions - but new arguments need to be made for that to be clear.

Good point. Part of my response to this is that my plan would be to provide sub-arguments leading up to each of the premises in this broad-brush argument, and a lot more detail would be included in these. However, I think you're right that I also need to be clearer on definitions.

In terms of the specific definition of superintelligence, I had in mind Chalmers's definition: "Let us say that AI++ (or superintelligence) is AI of far greater than human level (say, at least as far beyond the most intelligent human as the most intelligent human is beyond a mouse)", combined with the view (expressed here, for example) that we are interested in optimisation power rather than some more general sense of intelligence.

Right, but general optimization power is also really vague. The amount that an entity narrows potential future distributions and the amount that an entity can control the narrowing of potential future distributions are different things, which is a distinction 'optimization power' doesn't quite seem to respect, unless you get deeper.

(We could do this for a long time.)

Right, but general optimization power is also really vague.

This sounds unattractive at the outset, but could one express optimization power in economic terms? A machine super-optimizer would most likely have to become an economic power on its way to subverting the world's industrial infrastructure. If one can relate optimization power to economic power, then one could make a strong argument for the inability of human civilization to control a machine super-optimizer.

(We could do this for a long time.)

That's probably true so let's not, I take your basic point.

Given that the (initial) aim of the project is to summarise SI's arguments, is this a point that you think SI has been clear on (and if so, is there any post or paper you can direct me to) or do you think this is a problem with SI's argument?

I have not engaged with SI's arguments deeply enough to give them a fair assessment. I know that others think this is a point SI has been unclear on, but it also sounds like SI is moving towards clarity (and a project like this, which finds the holes where premises that should be conclusions are instead premises, will help them do that).

I had in mind Chalmers's definition: "Let us say that AI++ (or superintelligence) is AI of far greater than human level (say, at least as far beyond the most intelligent human as the most intelligent human is beyond a mouse)"

Is this in fact a definition? Is there a meaningful way we can comparatively measure the intelligence of both a human and a mouse? I don't know of any way we can compare them that doesn't assume that humans are smarter. (Cortical folding, brain-to-body mass ratio.)

In fact, is there a definition of the kind of superintelligence discussed here at length at all? We can do this through information theory, defining intelligence in terms of optimization power, but how does one relate such definitions to the supposed capabilities of the superintelligences discussed here?


From what I gathered, SI's relevance rests upon an enormous conjunction of implied assumptions and a very narrow approach to the solution, both of which were decided upon a significant time in the past. Consequently, a truly microscopic probability of relevance is easily attained; I estimate at most 10^-20, due to the multiple narrow guesses into a huge space of possibilities.

Hm, most of the immediate strategies SI is considering going forward strike me as fairly general:

http://lesswrong.com/r/discussion/lw/cs6/how_to_purchase_ai_risk_reduction/

They're also putting up these strategies for public scrutiny, suggesting they're open to changing their plans.

If you're referring to sponsoring an internal FAI team, Luke wrote:

I don't take it to be obvious that an SI-hosted FAI team is the correct path toward the endgame of humanity "winning." That is a matter for much strategic research and debate.

BTW, I wish to reinforce you for the behavior of sharing a dissenting view (rationale: view sharing should be agnostic to dissent/assent profile, but sharing a dissenting view intuitively risks negative social consequences, an effect that would be nice to neutralize), so I voted you up.


Well, those are Luke's aspirations; I was referring to the work done so far. The whole enterprise has the feeling of an over-optimistic startup with ill-defined, extremely ambitious goals; such ventures don't have much of a success rate even with much, much simpler goals.

Note that P1 in this argument can be weakened to simply say that SAI is a non-trivial possibility but, in response, P2 and P3 need to be strengthened (that is, if SAI is less likely, it needs to be more dangerous or FAI research needs to be more effective in order for FAI research to have the same expected value).

This is arguing from the bottom line. If P1 is weakened, it in no way implies that P2 or P3 become stronger; instead, it implies that the expected value of FAI research is lower. If you are targeting a particular strength of argument, then it's OK, but you also seem to be making assertions about the true values of these parameters, and the true values don't work like that.


I read this section of the post as saying, "if one argument is weakened, the others need to be stronger [or else SI's conclusions are no longer supported.]"

But if you don't read that into it, it does sound very sketchy.

This is definitely what I meant - I apologise that it was unclear. So to clarify, I meant that the premises can still support the conclusion if P1 is weakened, as long as P2 and P3 are strong enough that the expected value of FAI research is still adequately high.

In some ways, the strength of P1, P2 and P3 can be traded off against one another. That means that the argument might be more convincing to more people - you can think SAI is unlikely but still think the conclusion is correct (though I'm certainly not arguing that if P1 gets weakened, P2 and P3 must be stronger because the bottom line is undeniable, or any such thing).

Perhaps I should have said: Note that P1 in this argument can be weakened to simply say that SAI is a non-trivial possibility but, in response, stronger versions of P2 and P3 are required if the conclusion is still to be viable (that is, if SAI is less likely, it needs to be more dangerous or FAI research needs to be more effective in order for FAI research to have the same expected value).

Unless anyone comments negatively regarding this new version of the paragraph in the next few hours, I'll update the original post.


What's wrong with the parent comment, what am I missing?

Edit: So this comment summoned some upvotes to the parent, but didn't clarify the problem; staying at 0 Karma for several hours indicates that I'm probably missing something, and in this case I don't have any plausible hypotheses to give more weight to based on that observation (besides "the idea that expected value of FAI research can be overestimated makes people flinch", which seems weak)...

[This comment is no longer endorsed by its author]

fyi, I don't need P1 or P3 to believe C2.

I'm not sure I follow -

P3 says that FAI research has a reasonable chance of success. Presumably you believe that at least a weak version of P3 must be true, because otherwise there's no expected value to researching FAI (unless you just enjoy reading the research?)

Something similar can be said in terms of P1. As I note below the main argument, you can weaken P1 but you surely need at least a weakened version of P1.

Is that what you mean - that you only need very weak versions of P1 and P3 for C2 to follow? So if SAI is even slightly possible and FAI research has even a small chance of success, then C2 follows anyway.

If so, that's fine, but then you put a lot of weight on P2, as well as on two unstated premises: (P2a) global catastrophic risks are very bad, and (P4a) we should decide on funding based on expected value calculations [even in cases with small probabilities/high gains].

Further, note that the more you lower the expected value of FAI research, the harder it becomes to support P4. There are lots of things we would like to fund - other global catastrophic risk research, literature, nice food, understanding the universe etc - and FAI research needs to have a high enough expected value that we should spend our time on this research rather than on these other things. As such, the expected value doesn't just need to be high enough that in an ideal world we would want to do FAI research but high enough that we ought to do the research in this world.

If that's what you're saying, that's fine but by putting so much weight on fewer premises, you risk failing to convince other people to also accept the importance of FAI research. If that's not what you mean then I'd love to get a better sense of what you're saying.

That's basically it. What's missing here is probabilities. I don't need FAI research to have a high enough probability of helping to be considered "reasonable" in order to believe that it is still the best action. Similarly, I don't need to believe that AGI will be developed in the next one hundred or even few hundred years for it to be urgent. Basically, the expected value is dominated by the negative utility if we do nothing (loss of virtually all utility forever) and my belief that UFAI is the default outcome (high probability). I do, however, believe that AGI could be developed soon; that simply adds to the urgency.
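As a rough illustration of this "dominated by the negative utility" point, here is a sketch using entirely hypothetical numbers (none of these figures come from the commenter or from SI): even a modest probability of AGI, multiplied by a high probability of UFAI by default and an enormous loss, yields a large expected loss from inaction.

```python
# Rough sketch of how a huge downside dominates the expected-value calculation.
# Every number below is a hypothetical placeholder, not a claimed estimate.

value_if_all_goes_well = 1.0      # normalised baseline utility
value_if_ufai = -1e9              # "loss of virtually all utility forever"
p_agi_this_century = 0.1          # even a modest probability of AGI...
p_ufai_by_default = 0.9           # ...combined with UFAI as the default outcome

expected_loss_from_inaction = (p_agi_this_century * p_ufai_by_default
                               * (value_if_all_goes_well - value_if_ufai))

print(expected_loss_from_inaction)  # ~9e7: the downside swamps the modest probability
```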

Cool, glad I understood. Yes, the argument could be made more specific with probabilities. At this stage, I'm deliberately being vague because that allows for more flexibility - i.e. there are multiple ways you can assign probabilities and values to the premises such that they will support the conclusion, and I don't want to specify just one of them at the expense of others.

If I get to the end of the project I plan to consider the argument in detail in which case I will start to give more specific (though certainly not precise) probabilities for different premises.

It's important to note the degree to which each of these premises has support.

Yep, that's the eventual plan. The next step is to provide new arguments leading to each of these premises as subconclusions. So, for example, there would be an argument laying out the case for P1.

The final step would then be an evaluation of the strength of the overall arguments taking into account the strength of each premise in the argument.