
MileyCyrus comments on Stupid Questions Open Thread - Less Wrong Discussion

42 Post author: Costanza 29 December 2011 11:23PM


Comments (265)


Comment author: MileyCyrus 30 December 2011 12:35:49AM 14 points [-]

If the SIAI engineers figure out how to construct friendly super-AI, why would they care about making it respect the values of anyone but themselves? What incentive do they have to program an AI that is friendly to humanity, and not just to themselves? What's stopping LukeProg from appointing himself king of the universe?

Comment author: Dr_Manhattan 30 December 2011 02:50:45AM 14 points [-]

Not an answer, but a solution:

You know what they say the modern version of Pascal's Wager is? Sucking up to as many Transhumanists as possible, just in case one of them turns into God. -- Julie from Crystal Nights by Greg Egan

:-p

Comment author: lukeprog 30 December 2011 02:18:29AM 12 points [-]

What's stopping LukeProg from appointing himself king of the universe?

Personal abhorrence at the thought, and lack of AI programming abilities. :)

(But, your question deserves a more serious answer than this.)

Comment author: orthonormal 30 December 2011 06:18:32PM 4 points [-]

This is basically what I was asking before. Now, it seems to me highly unlikely that SIAI is playing that game, but I still want a better answer than "Trust us to not be supervillains".

Comment author: TrueBayesian 30 December 2011 07:35:58PM 16 points [-]

Too late - Eliezer and Will Newsome are already dual kings of the universe. They balance each other's reigns in a Yin/Yang kind of way.

Comment author: Armok_GoB 30 December 2011 03:34:43PM -1 points [-]

Serious or not, it seems correct. There might be some advanced game theory that says otherwise, but it only applies to those who know the game theory.

Comment author: jimrandomh 31 December 2011 08:23:49AM *  4 points [-]

Lots of incorrect answers in other replies to this one. The real answer is that, from Luke's perspective, creating Luke-friendly AI and becoming king of the universe isn't much better than creating regular friendly AI and getting the same share of the universe as any other human. Because it turns out, after the first thousand galaxies' worth of resources and trillion trillion millennia of lifespan, you hit such diminishing returns that having another seven billion times as many resources isn't a big deal.

This isn't true for every value - he might assign value to certain things not existing, like powerful people besides him, which other people want to exist. And that last factor of seven billion is worth something. But these are tiny differences in value, utterly dwarfed by the reduced AI-creation success-rate that would happen if the programmers got into a flamewar over who should be king.
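To make the diminishing-returns argument above concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration, not something stated in the thread: the saturating utility function, the unit of resources, the fair share of 1,000 units, and both success probabilities (0.50 for a cooperative project, 0.45 for one slowed down by a flamewar over who gets to be king).

```python
def saturating_utility(resources: float, half_point: float = 1.0) -> float:
    """Hypothetical bounded utility: 0.5 at half_point, approaching 1.0 as resources grow."""
    return resources / (resources + half_point)


def expected_value(p_success: float, resources: float) -> float:
    """Expected utility when failure (no FAI at all) is worth 0."""
    return p_success * saturating_utility(resources)


POPULATION = 7e9                         # the "factor of seven billion" from the comment
fair_share = 1_000.0                     # assumed: an equal share, in units of half_point
king_share = fair_share * POPULATION     # one person takes everything

# Assumed success rates: infighting lowers the odds that any FAI gets built.
ev_shared = expected_value(0.50, fair_share)
ev_king = expected_value(0.45, king_share)

# How much utility the extra factor of seven billion actually buys.
utility_gain = saturating_utility(king_share) - saturating_utility(fair_share)

print(f"gain from being king: {utility_gain:.6f}")
print(f"EV shared: {ev_shared:.4f}  EV king: {ev_king:.4f}")
```

Under these assumed numbers the extra factor of seven billion adds less than 0.001 to utility, while the lower success rate costs roughly 0.05, so the cooperative project comes out ahead. A less sharply saturating utility function, or a smaller penalty for infighting, could reverse that.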

Comment author: Larks 30 December 2011 05:14:24AM 2 points [-]

I think it would be significantly easier to make FAI than LukeFriendly AI: for the latter, you need to do most of the work involved in the former, but also work out how to get the AI to find you (and not accidentally be friendly to someone else).

If it turns out that there's a lot of coherence in human values, FAI will resemble LukeFriendlyAI quite closely anyway.

Comment author: wedrifid 31 December 2011 08:42:34AM *  8 points [-]

I think it would be significantly easier to make FAI than LukeFriendly AI

Massively backwards! Creating an FAI (presumably 'friendly to humanity') requires an AI that can somehow harvest and aggregate preferences over humans in general but an FAI<Luke> just needs to scan one brain.

Comment author: Larks 31 December 2011 09:16:12PM 0 points [-]

Scanning is unlikely to be the bottleneck for a GAI, and it seems most of the difficulty with CEV is from the Extrapolation part, not the Coherence.

Comment author: wedrifid 31 December 2011 09:54:32PM 5 points [-]

Scanning is unlikely to be the bottleneck for a GAI, and it seems most of the difficulty with CEV is from the Extrapolation part, not the Coherence.

It doesn't matter how easy the parts may be, scanning, extrapolating and cohering all of humanity is harder than scanning and extrapolating Luke.

Comment author: torekp 02 January 2012 06:48:35PM 4 points [-]

Not if Luke's values contain pointers to all those other humans.

Comment author: TheOtherDave 30 December 2011 05:19:26AM 5 points [-]

If FAI is HumanityFriendly rather than LukeFriendly, you have to work out how to get the AI to find humanity and not accidentally optimize for the extrapolated volition of some other group. It seems easier to me to establish parameters for "finding" Luke than for "finding" humanity.

Comment author: Larks 30 December 2011 05:29:22AM 0 points [-]

Yes, it depends on whether you think Luke is more different from humanity than humanity is from StuffWeCareNotOf.

Comment author: TheOtherDave 30 December 2011 10:36:34AM 5 points [-]

Of course an arbitrarily chosen human's values are more similar to the aggregated values of humanity as a whole than humanity's values are similar to an arbitrarily chosen point in value-space. Value-space is big.

I don't see how my point depends on that, though. Your argument here claims that "FAI" is easier than "LukeFriendlyAI" because LFAI requires an additional step of defining the target, and FAI doesn't require that step. I'm pointing out that FAI does require that step. In fact, target definition for "humanity" is a more difficult problem than target definition for "Luke".

Comment author: Armok_GoB 30 December 2011 03:40:54PM 3 points [-]

I find it much more likely that it's the other way around; making one for a single brain that already has a utility function seems much easier than finding out a good compromise between billions. Especially if the form "upload me, then perform this specific type of enhancement to enable me to safely continue self-improving" turns out to be safe enough.

Comment author: Zed 30 December 2011 01:58:45AM *  2 points [-]

Game theory. If different groups compete in building a "friendly" AI that respects only their personal coherent extrapolated volition (extrapolated sensible desires) then cooperation is no longer an option because the other teams have become "the enemy". I have a value system that is substantially different from Eliezer's. I don't want a friendly AI that is created in some researcher's personal image (except, of course, if it's created based on my ideals). This means that we have to sabotage each other's work to prevent the other researchers from getting to friendly AI first. This is because the moment somebody reaches "friendly" AI the game is over and all parties except for one lose. And if we get uFAI everybody loses.

That's a real problem though. If different factions in friendly AI research have to destructively compete with each other, then the probability of unfriendly AI will increase. That's real bad. From a game theory perspective all FAI researchers agree that any version of FAI is preferable to uFAI, and yet they're working towards a future where uFAI is becoming more and more likely! Luckily, if the FAI researchers take the coherent extrapolated volition of all of humanity the problem disappears. All FAI researchers can work toward a common goal that will fairly represent all of humanity, not some specific researcher's version of "FAI". It also removes the problem of different morals/values. Some people believe that we should look at total utility, other people believe we should consider only average utility. Some people believe abstract values matter, some people believe consequences of actions matter most. Here too the solution of an AI that looks at a representative set of all human values is the solution that all people can agree on as most "fair". Cooperation beats defection.

If Luke were to attempt to create a LukeFriendlyAI he knows he's defecting from the game-theoretically optimal strategy and thereby increasing the probability of a world with uFAI. If Luke is aware of this and chooses to continue on that course anyway then he's just become another uFAI researcher who actively participates in the destruction of the human species (to put it dramatically).

We can't force all AI programmers to focus on the FAI route. We can try to raise the sanity waterline and try to explain to AI researchers that the optimal (game theoretically speaking) strategy is the one we ought to pursue because it's most likely to lead to a fair FAI based on all of our human values. We just have to cooperate, despite differences in beliefs and moral values. CEV is the way to accomplish that because it doesn't privilege the AI researchers who write the code.
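The cooperate-or-defect argument above can be sketched as a toy two-team game in Python. All of the numbers are assumptions made up for illustration: the value each outcome has to a given team, the chance of uFAI as a function of how many teams defect, and the simplification that whichever team finishes a working FAI first is a coin flip.

```python
MOVES = ("cooperate", "defect")

# Assumed value of each outcome to one team (uFAI is worst for everyone).
VALUE = {"cev": 0.9, "own": 1.0, "rival": 0.6, "ufai": 0.0}

# Assumed chance of ending up with uFAI, given how many teams defect
# (sabotage and secrecy are taken to make the whole effort less safe).
P_UFAI = {0: 0.2, 1: 0.5, 2: 0.8}


def expected_payoff(my_move: str, other_move: str) -> float:
    """Expected value to 'me', assuming either team is equally likely to finish first."""
    defectors = [my_move, other_move].count("defect")
    p_fai = 1.0 - P_UFAI[defectors]
    if_i_win = VALUE["own"] if my_move == "defect" else VALUE["cev"]
    if_they_win = VALUE["rival"] if other_move == "defect" else VALUE["cev"]
    return p_fai * 0.5 * (if_i_win + if_they_win) + P_UFAI[defectors] * VALUE["ufai"]


def best_response(other_move: str) -> str:
    """The move that maximizes my expected payoff against a fixed rival move."""
    return max(MOVES, key=lambda move: expected_payoff(move, other_move))


for mine in MOVES:
    for theirs in MOVES:
        print(f"{mine:>9} vs {theirs:<9} -> {expected_payoff(mine, theirs):.3f}")
```

With these assumed numbers, cooperating is the best response whatever the other team does, because the rise in uFAI risk outweighs the gain from having the AI built in your own image. That conclusion depends entirely on the assumed risk table; if defection could be hidden and added little risk, the ranking could flip.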

Comment author: Xachariah 30 December 2011 11:01:30PM *  1 point [-]

Game Theory only helps us if it's impossible to deceive others. If one is able to engage in deception, the dominant strategy becomes to pretend to support CEV FAI while actually working on your own personal God in a jar. AI development in particular seems an especially susceptible domain for deception. The creation of a working AI is a one-time event; it's not like most stable games in nature, which allow one to detect defections over hundreds of iterations. The creation of a working AI (FAI or uFAI) is so complicated that it's impossible for others to check if any given researcher is defecting or not.

Our best hope then is for the AI project to be so big it cannot be controlled by a single entity and definitely not by a single person. If it only takes one guy in a basement getting lucky to make an AI go FOOM, we're doomed. If it takes ten thousand researchers collaborating in the biggest group coding project ever, we're probably safe. This is why doing work on CEV is so important: so we can have that piece of the puzzle already built when the rest of AI research catches up and is ready to go FOOM.
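The contrast between a one-time event and a game with hundreds of iterations can be put in numbers with a small Python sketch. It assumes, purely for illustration, that each round gives observers an independent chance q of catching a defector; the value q = 0.05 is hypothetical.

```python
def detection_probability(q: float, rounds: int) -> float:
    """Chance of catching a defector at least once over independent rounds."""
    return 1.0 - (1.0 - q) ** rounds


Q_PER_ROUND = 0.05  # assumed per-round chance of spotting a defection

one_shot = detection_probability(Q_PER_ROUND, 1)     # a single AI launch
iterated = detection_probability(Q_PER_ROUND, 100)   # a long repeated game

print(f"one-shot: {one_shot:.3f}  after 100 rounds: {iterated:.3f}")
```

Under that assumption a defector who would almost certainly be caught over a hundred rounds is very unlikely to be caught in a single round, which is why the one-shot nature of AI creation matters so much to the argument.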

Comment author: Armok_GoB 30 December 2011 03:46:39PM 1 point [-]

This doesn't apply to all of humanity, just to AI researchers good enough to pose a threat.

Comment author: TimS 30 December 2011 02:11:59AM *  1 point [-]

As I understand the terminology, AI that only respects some humans' preferences is uFAI by definition. Thus:

a friendly AI that is created in some researcher's personal image

is actually unFriendly, as Eliezer uses the term. Thus, the researcher you describe is already an "uFAI researcher".


It also removes the problem of different morals/values. Some people believe that we should look at total utility, other people believe we should consider only average utility. Some people believe abstract values matter, some people believe consequences of actions matter most. Here too the solution of an AI that looks at a representative set of all human values is the solution that all people can agree on as most "fair".

What do you mean by "representative set of all human values"? Is there any reason to think that the resulting moral theory would be acceptable to implement on everyone?

Comment author: Zed 30 December 2011 02:22:36AM *  1 point [-]

[a "friendly" AI] is actually unFriendly, as Eliezer uses the term

Absolutely. I used "friendly" AI (with scare quotes) to denote it's not really FAI, but I don't know if there's a better term for it. It's not the same as uFAI because Eliezer's personal utopia is not likely to be valueless by my standards, whereas a generic uFAI is terrible from any human point of view (paperclip universe, etc).

Comment author: TimS 30 December 2011 02:40:31AM -1 points [-]

I guess it just doesn't bother me that uFAI includes both indifferent AI and malicious AI. I honestly think that indifferent AI is much more likely than malicious (Clippy is malicious, but awfully unlikely), but that's not good for humanity's future either.

Comment author: John_Maxwell_IV 31 December 2011 06:35:31AM *  1 point [-]

The good guys do not write an AI which values a bag of things that the programmers think are good ideas, like libertarianism or socialism or making people happy or whatever. There were multiple Overcoming Bias sequences about this one point, like the Fake Utility Function sequence and the sequence on metaethics. It is dealt with at length in the document Coherent Extrapolated Volition. It is the first thing, the last thing, and the middle thing that I say about Friendly AI.

...

The good guys do not directly impress their personal values onto a Friendly AI.

http://lesswrong.com/lw/wp/what_i_think_if_not_why/

The rest of your question has the same answer as "why is anyone altruist to begin with", I think.

Comment author: MileyCyrus 31 December 2011 06:45:58AM 4 points [-]

I understand CEV. What I don't understand is why the programmers would ask the AI for humanity's CEV, rather than just their own CEV.

Comment author: wedrifid 31 December 2011 07:04:36AM 11 points [-]

I understand CEV. What I don't understand is why the programmers would ask the AI for humanity's CEV, rather than just their own CEV.

The only (sane) reason is for signalling - it's hard to create FAI<self> without someone else stopping you. Given a choice, however, CEV<self> is strictly superior. If you actually do want to have FAI<humanity> then FAI<self> will be equivalent to it. But if you just think you want FAI<humanity> but it turns out that, for example, FAI<humanity> gets dominated by jerks in a way you didn't expect then FAI<self> will end up better than FAI<humanity>... even from a purely altruistic perspective.

Comment author: TheOtherDave 31 December 2011 06:58:39AM 2 points [-]

Yeah, I've wondered this for a while without getting any closer to an understanding.

It seems that everything that some human "really wants" (and therefore could potentially be included in the CEV target definition) is either something that, if I was sufficiently well-informed about it, I would want for that human (in which case my CEV, properly unpacked by a superintelligence, includes it for them) or is something that, no matter how well informed I was, I would not want for that human (in which case it's not at all clear that I ought to endorse implementing it).

If CEV-humanity makes any sense at all (which I'm not sure it does), it seems that CEV-arbitrary-subset-of-humanity leads to results that are just as good by the standards of anyone whose standards are worth respecting.

My working answer is therefore that it's valuable to signal the willingness to do so (so nobody feels left out), and one effective way to signal that willingness consistently and compellingly is to precommit to actually doing it.

Comment author: John_Maxwell_IV 31 December 2011 06:54:06AM 1 point [-]

Is this question any different from the question of why there are altruists?

Comment author: TheOtherDave 31 December 2011 07:23:09AM 1 point [-]

Sure. For example, if I want other people's volition to be implemented, that is sufficient to justify altruism. (Not necessary, but sufficient.)

But that doesn't justify directing an AI to look at other people's volition to determine its target directly... as has been said elsewhere, I can simply direct an AI to look at my volition, and the extrapolation process will naturally (if CEV works at all) take other people's volition into account.

Comment author: Solvent 30 December 2011 10:51:43AM 0 points [-]

Short answer is that they're nice people, and they understand that power corrupts, so they can't even rationalize wanting to be king of the universe for altruistic reasons.

Also, a post-Singularity future will probably (hopefully) be absolutely fantastic for everyone, so it doesn't matter whether you selfishly get the AI to prefer you or not.

Comment author: John_Maxwell_IV 26 March 2012 12:25:01AM 0 points [-]

Now that I understand your question better, here's my answer:

Let's say the engineers decide to make the AI respect only their values. But if they were the sort of people who were likely to do that, no one would donate money to them. They could offer to make the AI respect the values of themselves and their donors, but that would alienate everyone else and make the lives of themselves and their donors difficult. The species boundary between humans and other living beings is a natural place to stop expanding the circle of enfranchised agents.

Comment author: TheOtherDave 26 March 2012 12:51:42AM 0 points [-]

This seems to depend on the implicit assumption that their donors (and everyone else powerful enough to make their lives difficult) don't mind having the values of third parties respected.

If some do mind, then there's probably some optimally pragmatic balancing point short of all humans.

Comment author: John_Maxwell_IV 26 March 2012 01:37:06AM 0 points [-]

Probably, but defining that balancing point would mean a lot of bureaucratic overhead to determine who to exclude or include.

Comment author: TheOtherDave 26 March 2012 03:19:22AM 0 points [-]

Can you expand on what you mean by "bureaucratic" here?

Comment author: John_Maxwell_IV 26 March 2012 03:32:31AM 0 points [-]

Are people going to vote on whether someone should be included? Is there an appeals process? Are all decisions final?

Comment author: TheOtherDave 26 March 2012 01:01:00PM 1 point [-]

OK, thanks.

It seems to me all these questions arise for "include everyone" as well. Somewhere along the line someone is going to suggest "don't include fundamentalist Christians", for example, and if I'm committed to the kind of democratic decision process you imply, then we now need to have a vote, or at least decide whether we have a vote, etc. etc, all of that bureaucratic overhead.

Of course, that might not be necessary; I could just unilaterally override that suggestion, mandate "No, we include everyone!", and if I have enough clout to make that stick, then it sticks, with no bureaucratic overhead. Yay! This seems to more or less be what you have in mind.

It's just that the same goes for "Include everyone except fundamentalist Christians."

In any case, I don't see how any of this cumbersome democratic machinery makes any sense in this scenario. Actually working out CEV implies the existence of something, call it X, that is capable of extrapolating a coherent volition from the state of a group of minds. What's the point of voting, appeals, etc. when that technology is available? X itself is a better solution to the same problem.

Which implies that it's possible to identify a smaller group of minds as the Advisory Board and say to X "Work out the Advisory Board's CEV with respect to whose minds should be included as input to a general-purpose optimizer's target definition, then work out the CEV of those minds with respect to the desired state of the world."
Then anyone with enough political clout to get in my way, I add to the Advisory Board, thereby ensuring that their values get taken into consideration (including their values regarding whose values get included).

That includes folks who think everyone should get an equal say, folks who think that every human should get an equal say, folks who think that everyone with more than a certain threshold level of intelligence and moral capacity gets a say, folks who think that everyone who agrees with them gets a say, etc., etc. X works all of that out, and spits out a spec on the other side for who actually gets a say and to what degree, which it then takes as input to the actual CEV-extrapolating process.

This seems kind of absurd to me, but no more so than the idea that X can work out humanity's CEV at all. If I'm granting that premise for the sake of argument, everything else seems to follow.

Comment author: John_Maxwell_IV 26 March 2012 06:31:56PM 1 point [-]

It's just that the same goes for "Include everyone except fundamentalist Christians."

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

And you're not describing a process for how the advisory board is decided either. Different advisory boards may produce different groups of enfranchised minds. So your suggestion doesn't resolve the problem.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV. If a person's CEV is that someone's mind should contribute to the optimizer's target, that will be their CEV regardless of whether it's measured in an advisory board context or not.

Comment author: TheOtherDave 26 March 2012 11:02:11PM *  3 points [-]

There is no clear bright line determining who is or is not a fundamentalist Christian.

There is no clear bright line determining what is or isn't a clear bright line.

I agree that the line separating "human" from "non-human" is much clearer and brighter than that separating "fundamentalist Christian" from "non-fundamentalist Christian", and I further agree that for minds like mine the difference between those two lines is very important. Something with a mind like mine can work with the first distinction much more easily than with the second.

So what?

A mind like mine doesn't stand a chance of extrapolating a coherent volition from the contents of a group of target minds. Whatever X is, it isn't a mind like mine.

If we don't have such an X available, then it doesn't matter what defining characteristic we use to determine the target group for CEV extrapolation, because we can't extrapolate CEV from them anyway.

If we do have such an X available, then it doesn't matter what lines are clear and bright enough for minds like mine to reliably work with; what matters is what lines are clear and bright enough for systems like X to reliably work with.

Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

I have confidence < .1 that either one of us can articulate a specification determining who is human that doesn't either include or exclude some system that someone included in that specification would contest the inclusion/exclusion of.

I also have confidence < .1 that, using any definition of "human" you care to specify, the universe contains no nonhuman systems I would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

And you're not describing a process for how the advisory board is decided either.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

Different advisory boards may produce different groups of enfranchised minds.

Certainly.

So your suggestion doesn't resolve the problem.

Can you say again which problem you're referring to here? I've lost track.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV.

Absolutely agreed.

Consider the implications of that, though.

Suppose you have a CEV-extractor and we're the only two people in the world, just for simplicity.
You can either point the CEV-extractor at yourself, or at both of us.
If you genuinely want me included, then it doesn't matter which you choose; the result will be the same.
Conversely, if the result is different, that's evidence that you don't genuinely want me included, even if you think you do.

Knowing that, why would you choose to point the CEV-extractor at both of us?

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Comment author: John_Maxwell_IV 27 March 2012 12:44:57AM *  1 point [-]

Complicated or ambiguous schemes take more time to explain, get more attention, and risk folks spending time trying to gerrymander their way in instead of contributing to FAI.

I think any solution other than "enfranchise humanity" is a potential PR disaster.

Keep in mind that not everyone is that smart, and there are some folks who would make a fuss about disenfranchisement of others even if they themselves were enfranchised (and therefore, by definition, those they were making a fuss about would be enfranchised if they thought it was a good idea).

I agree there are potential ambiguity problems with drawing the line at humans, but I think the potential problems are bigger with other schemes.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I agree there are potential problems with credibility, but that seems like a separate argument.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

It's not all or nothing. The more inclusive the enfranchisement, the more cooperation there will be in general.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

With that scheme, you're incentivizing folks to prove they have enough political clout to get in your way.

Moreover, humans aren't perfect reasoning systems. Your way of determining enfranchisement sounds a lot more adversarial than mine, which would affect the tone of the effort in a big and undesirable way.

Why do you think that the right to vote in democratic countries is as clearly determined as it is? Restricting voting rights to those of a certain IQ or higher would be a politically unfeasible PR nightmare.

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Again, this is a different argument about why people cooperate instead of defect. To a large degree, evolution hardwired us to cooperate, especially when others are trying to cooperate with us.

I agree that if the FAI project seems to be staffed with a lot of untrustworthy, selfish backstabbers, we should cast a suspicious eye on it regardless of what they say about their project.

Ultimately it probably doesn't matter much what their broadcasted intention towards the enfranchisement of those outside their group is, since things will largely come down to what their actual intentions are.

Comment author: kodos96 01 January 2013 12:32:35AM 1 point [-]

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human.

Is there? What about unborn babies? What about IVF fetuses? People in comas? Cryo-preserved bodies? Sufficiently-detailed brain scans?

Comment author: FiftyTwo 02 January 2012 12:52:41AM -2 points [-]

I for one welcome our new singularitarian overlords!

Comment author: falenas108 30 December 2011 01:39:26AM 0 points [-]

Right now, and for the foreseeable future, SIAI doesn't have the funds to actually create FAI. All they're doing is creating a theory for friendliness, which can be used when someone else has the technology to create AI. And of course, nobody else is going to use the code if it focuses on SIAI.

Comment author: Vladimir_Nesov 30 December 2011 04:19:11PM *  4 points [-]

SIAI doesn't have the funds to actually create FAI

Funds are not a relevant issue for this particular achievement at present time. It's not yet possible to create a FAI even given all the money in the world; a pharaoh can't build a modern computer. (Funds can help with moving the time when (and if) that becomes possible closer, improving the chances that it happens this side of an existential catastrophe.)

Comment author: falenas108 30 December 2011 04:32:13PM -1 points [-]

Yeah, I was assuming that they were able to create FAI for the sake of responding to the grandparent post. If they weren't, then there wouldn't be any trouble with SIAI making AI only friendly to themselves to begin with.

Comment author: EStokes 30 December 2011 02:05:40AM 3 points [-]

If they have all the theory and coded it and whatnot, where is the cost coming from?

Comment author: falenas108 30 December 2011 02:59:15PM -1 points [-]

The theory for friendliness is completely separate from the theory of AI. So, assuming they complete one does not mean that they complete the other. Furthermore, for something as big as AI/FAI, the computing power required is likely to be huge, which makes it unlikely that a small company like SIAI will be able to create it.

Though I suppose it might be possible if they were able to get large enough loans. I don't have the technical knowledge to say how much computing power is needed or how much that would cost.

Comment author: Psy-Kosh 30 December 2011 07:39:00PM 3 points [-]

The theory for friendliness is completely separate from the theory of AI.

??? Maybe I'm being stupid, but I suspect it's fairly hard to fully and utterly solve the friendliness problem without, by the end of doing so, AT LEAST solving many of the tricky AI problems in general.