Yesterday I sat down with Lukeprog for a few hours and we produced this ten-page interview for the Singularity Institute blog. This interview contains information about the Singularity Institute's technical research program and recent staff changes that hasn't been announced anywhere else!  We hope you find it informative. 

http://singinst.org/blog/2011/09/15/interview-with-new-singularity-institute-research-fellow-luke-muehlhuaser-september-2011/

 


Q: What are some of those open problems in Friendly AI theory?

A: ... When extrapolated, will the values of two different humans converge? Will the values of all humans converge? Would the values of all sentient beings converge? ...

I don't think the question about sentient beings should be considered open.

[-][anonymous]50

If we can't consider it open why do we consider the question of the values of two different human beings open? Unless we choose to define humans so as to exclude some Homo Sapiens brains that occupy certain spaces of neurodiversity and/or madness?

For the question about human values, there are ways to put it so that it's interesting and non-trivial. For values of unrelated minds, the answer is clear however you interpret the question.

[-][anonymous]20

For the question about human values, there are ways to put it so that it's interesting and non-trivial.

Basically for some indeterminate but not too small fraction of all human brains?

Sure, brain damage and similar conditions don't seem interesting in this regard.

It isn't clear that autism is brain damage, for one.

E.g., Clippy. Clippy's values wouldn't converge with ours, or with an otherwise similar AI that preferred thumb tacks. So the general case is most certainly 'no'.

[-][anonymous]60

Most philosophers talk about ‘ideal preference theories’, but I prefer to call them ‘value extrapolation algorithms’. If we want to develop Friendly AI, we may not want to just scan human values from our brains and give those same values to an AI. I want to eat salty foods all day, but I kind of wish I didn’t want that, and I certainly don’t want an AI to feed me salty foods all day. Moreover, I would probably change my desires if I knew more and was more rational. I might learn things that would change what I want. And it’s unlikely that the human species has reached the end of moral development. So we don’t want to fix things in place by programming an AI with our current values. We want an AI to extrapolate our values so that it cares about what we would want if we knew more, were more rational, were more morally developed, and so on.

What exactly is human moral development?

I can't put it into words, but I feel like not having slaves and not allowing rape within marriage are both good things that are morally superior for reasons beyond simply "I believe this and people long-ago didn't".

The process whereby things like this occur is what I'd call "human moral development".

[-][anonymous]440

Related to: List of public drafts on LessWrong

I can't put it into words, but I feel like not having slaves and not allowing rape within marriage are both good things that are morally superior for reasons beyond simply "I believe this and people long-ago didn't".

The process whereby things like this occur is what I'd call "human moral development".

So we have a mysterious process that with some deviations has generally over time made values more like those that we have today. Looking back at the steps of change we get the feeling that somehow this looks right.

Very well, considering that we here at LW should be particularly familiar with the power of the human mind to construct convincing narratives in hindsight for nearly any difficult-to-predict sequence of events, and considering that we know of biases that are strong enough to give us that "morally superior for reasons beyond simply they are different" feeling (halo effect for starters) and do indeed give us such feelings on some other matters, I hope I am not too bold to ask...

how exactly would you distinguish the universe in which we live in from the universe in which human moral change was determined by something like a random walk through value space? Now naturally a random walk through value space doesn't sound like something to which you are willing to outsource future moral and value development. But then why is unknown process X that happens to make you feel sort of good, because you like what it's done so far, something which inspires so much confidence that you'd like a godlike AI to emulate (quite closely) its output?

Sure, it's better in the Bayesian sense than a process whose output so far you wouldn't have liked, but we don't have any empirical comparison of results to an alternative process, or do we? Also consider other changes that feel right in the "merely because it's more similar to us" way. It seems plausible that these kinds of changes of values and morality might indeed be far more common. Even if these changes are something that our ancestors would have found to be neutral changes (which seems highly doubtful), they are clearly hijacking our attention away from the fuzzy category of "right in a qualitatively different way than just similar to my own" that is put as the basis of going with the current process.

But again perhaps I simply feel discomforted by such implicit narratives of moral progress considering that North Korean society has demonstrably constructed a narrative with itself at the apex that feels just as good from the inside as does ours. Considering similar comments of mine have been upvoted in the past, I think at the very least a substantial minority agrees with me that the standard LW discourse and state of thought on this matter is woefully inadequate. I mean how is it possible that this process apparently inspires such confidence in LWers, while a process that has so far also given us comparably felicitous change that feels so right to us humans that we often invoke an omnipotent benevolent agent to explain the result, can terrify us once we think about it clear-headedly?

I have a hunch that if we looked at the guts of this process we may find more old sanity shattering Outer Gods waiting for us.

PS Would anyone be interested in a top level/discussion post of some of my more advanced thoughts and arguments on this? Or have I just been sloppy and missed relevant material that covered this? :)

Edit: This comment was adapted as an article for More Right, where I will be writing a full sequence on my thoughts on metaethics.

I have a reason to believe that there is such a thing as moral progress, and it's not completely arbitrary. The reason is not merely that I feel good about my own morality. But I have trouble explaining it in a couple sentences; there is just too much inferential distance.

(It has to do with the fact that making friends with people of different nationalities or ethnicities reliably makes people less nationalist or racist, and inoculates them against experiences that foster nationalism or racism.)

If you write a post about this, maybe I will write a response post.

[-][anonymous]100

(It has to do with the fact that making friends with people of different nationalities or ethnicities reliably makes people less nationalist or racist, and inoculates them against experiences that foster nationalism or racism.)

I don't see why ever-expanding circles of outgroups becoming ingroups qualifies as something that is always objectively better. To be honest, I'm not so sure this is even occurring.

[-][anonymous]10

But I have trouble explaining it in a couple sentences; there is just too much inferential distance.

If you write a post about this, maybe I will write a response post.

I'm really interested in getting a better understanding of the problems involved so, if I do end up writing a post about this (I need to get better acquainted with the newer meta ethics material and do some research before I feel comfortable doing so), please do! :)

Would anyone be interested in a top level/discussion post of some of my more advanced thoughts and arguments on this?

Yes! Please write this post!

PS Would anyone be interested in a top level/discussion post of some of my more advanced thoughts and arguments on this? Or have I just been sloppy and missed relevant material that covered this? :)

I don't think anything like this has been posted before, or has it? I do agree most posters haven't devoted too much thought to it. I mean, how can they be so certain this process is something worth keeping and something that works on all of mankind, and would still be here even if a few random events in our evolutionary past or even written history had happened differently, yet are so sure that it would not apply to AI? Think about that for a little bit: practically everyone agrees that FAI is important precisely because they are sure this process isn't going to kick in for the AI. But most also seem to think that it's guaranteed to have acted on us in some way even if we as humans had a very different history (the only alternative to this interpretation is anthropics: it feels right to us because we are in a very, very lucky universe where the conditions were just right, so the process is turning out fine). For that matter, they seem to implicitly think this process is much stronger than, or at the very least as strong as, genetic (since we can now be pretty sure humans have been changing biologically even in recorded history) and memetic evolution on the scale of a few centuries or millennia.

I can't put it into words, but I feel like not having slaves and not allowing rape within marriage are both good things that are morally superior for reasons beyond simply "I believe this and people long-ago didn't".

I mean how is it possible that this process inspires such confidence in us, while a process that has so far also given us comparably felicitous change that feels so right we often invoke an omnipotent benevolent being to explain it, evolution, can terrify us once we think about it clear-headedly?

I have a hunch that if we looked at the guts of this process we may find more old sanity shattering Outer Gods waiting for us.

Considering established nomenclature perhaps we should call it Yidhra or the Nameless Mist. ^_^

Members of Yidhra's cult can gain immortality by merging with her, though they become somewhat like Yidhra as a consequence. ... She usually conceals her true form behind a powerful illusion, appearing as a comely young woman; only favored members of her cult can see her as she actually is.

Anyone who passes his utility function over to her wisdom's modification is basically home safe, because future development will overall still be progress to his eyes. Moral judgement becomes a snap as all you need to do is wait for a sufficiently long time for society to get more stuff "right", the stuff that isn't "right" in that way and is lost is just random baggage you shouldn't value anyway.

[-]Rain130

I don't think anything like this has been posted before, or has it?

Eliezer gave it brief mention in his metaethics sequence, in posts such as Whither Moral Progress?

[-][anonymous]10

Thanks for the link!

I recalled reading something like that on OB, I think this is where I first stumbled upon the "random walk morality" challenge.

Thanks! I must admit I'm behind on my reading of the metaethics stuff. Some of the other sequences were much more interesting for me personally and until recently I was binging on them with little regard for anything else.

Edit: Interesting, the article barely has a few upvotes. Considering this is EY, this increases the probability that it hasn't been that well read or discussed in the last year or two.

Why doesn't a parallel argument apply to material and scientific progress?

Presumably because it is possible to objectively assess the degree of material and scientific progress (whether they are good is another matter). We can tell that our current knowledge is better because we can say why it is better. If there were no epistemological progress, LW would be in vain!

So presumably the argument that there is no moral progress hinges on morality being something that can't be objectively arrived at or verified. But examples of rational discussion of morality abound, not least on LW. If we can explain our morality better than our predecessors we are justified in thinking it is better. (But progress in morality is not quite the same as progress in values. The values might remain the same, with moral progress consisting of a better expression of those values.)

Why doesn't a parallel argument apply to material and scientific progress?

Because airplanes fly.

how exactly would you distinguish the universe in which we live in from the universe in which human moral change was determined by something like a random walk through value space?

If you use this analogy again in the future, you may want to be more precise.

  • Maybe by "random walk" you convey the idea that human moral change is nondeterministic, or depends on conditions in such a way that change in one direction is just as likely as change in another direction;

  • Or maybe by "random walk" you mean that human moral change is robust and deterministic, but does not approach any kind of limit, or approaches a limit that is very far from our current position in values-space.
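
To make the first reading concrete, here is a minimal Python sketch, offered only as an illustration and not taken from the interview or from any comment here. It assumes a toy "value space" of integer coordinates and an unbiased walk; the names `random_walk` and `distance_to_endpoint`, the five dimensions, and the use of Manhattan distance are all hypothetical choices. It illustrates one narrow point: measured from wherever the walk happens to end, the last stretch of the path always looks like convergence on the present position, even though no step was directed.

```python
import random


def random_walk(steps, dims, seed):
    """Unbiased walk in a toy 'value space' (an assumption for illustration).

    Each step moves one randomly chosen coordinate by +1 or -1.
    Returns the list of positions visited, including the start.
    """
    rng = random.Random(seed)
    pos = [0] * dims
    path = [tuple(pos)]
    for _ in range(steps):
        axis = rng.randrange(dims)
        pos[axis] += rng.choice((-1, 1))
        path.append(tuple(pos))
    return path


def distance_to_endpoint(path):
    """Manhattan distance from every past position to the final position."""
    end = path[-1]
    return [sum(abs(a - b) for a, b in zip(p, end)) for p in path]


path = random_walk(steps=1000, dims=5, seed=0)
dist = distance_to_endpoint(path)

# Distance from "today's values" at a few points in the walk's history.
for t in (0, 250, 500, 750, 900, 990, 1000):
    print(t, dist[t])
```

The distance to the endpoint can never exceed the number of steps remaining, so the tail of any such walk looks like a steady approach when viewed from its end. That suggests the feeling of "history was heading toward us" is weak evidence on its own for a directed process.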

how exactly would you distinguish the universe in which we live in from the universe in which human moral change was determined by something like a random walk through value space?

If historical civilizations agree with us about as much as contemporary ones. That said, there has no more been a constant upward slope than there has been such a slope in technology or equality; we are just unusual due to the Enlightenment, I think.

EDIT: I think you may be assuming that we perfectly understand what we want. If I persuade a racist all humans are people, have I changed his utility function?

[-][anonymous]20

Do you, to a first approximation, equate moral progress with more equality?

I would consider it one form of such progress, yes.

EDIT:

If I persuade a racist all humans are people, have I changed his utility function?

That was a genuine question, incidentally. I really want to know your answer.

If I persuade a racist all humans are people, have I changed his utility function?

Can you taboo what you mean by a person's utility function?

How they decide the relative desirability of a given situation.

Said procedure tends not to resemble a utility function.

Huh?

He gets disutility from people suffering. If Jews are people, then he shouldn't torture them to death - but he didn't suddenly decide to value Jews, just realised they are people.

how exactly would you distinguish the universe in which we live in from the universe in which human moral change was determined by something like a random walk through value space? Now naturally a random walk through value space doesn't sound like something to which you are willing to outsource future moral and value development. But then why is unknown process X that happens to make you feel sort of good, because you like what it's done so far, something which inspires so much confidence that you'd like a godlike AI to emulate (quite closely) its output?

What do you mean by 'value space' (any human values or desires?) and 'moral change' (generally desirable human values?)? Also, a godlike AI is like playing with infinities when inserted in your moral calculus; a godlike AI leading to horrible consequences per your morality doesn't show that your morality was fully flawed (maybe there was a tiny loophole no human would try or be capable of exploiting). And I think you discount the possibility that there are many moral solutions like there are many solutions to chess (this is especially important when noting the impact of culture on values).

[-][anonymous]40

'moral change'

This is a possible case of a currently ongoing mild value change in the US:

What I am proposing here is that for most Americans multi-generational living is a means toward maintaining the lifestyle and values which they hold dear, but the shift itself may change that lifestyle and those values in deep and fundamental ways. The initial trigger here is economic, with the first-order causal effects sociological. But the downstream effects may also be economic, as Americans become less mobile and more familialist. What can we expect? Look abroad, and look to the past.

Is this "generally desirable human values"? Depends on the values you already hold. I certainly think radically undesirable moral change might occur; looking from the value sets of past humans, we see it almost certainly would not be a freak occurrence.

My key argument is that we have very little idea about the mechanics of moral change in real human societies. Thus future moral change is fundamentally unsafe from our current value set and we do not have the tools to do anything about it. Yet. Once we get them, barring our realizing that the process we are currently chained to has some remarkable properties we like, we will probably do away with it.

If moral progress is a salvageable concept then we shall see it for the first time in the history of mankind. If not we will finally do away with the tragedy of being doomed to an alien future devoid of all we value.

Of course "we" is misleading. The society that eventually gets these tools might be one that has values that are quite worthless or even horrifying from our perspective.

Does technological progress giving us the necessary leisure time and surplus resources to care about morality count as moral progress?

how exactly would you distinguish the universe in which we live in from the universe in which human moral change was determined by something like a random walk through value space?

I don't know, but I think that might be some of the stuff that Luke will be working on as a researcher for SIAI.

On "rape in marriage" you are clearly wrong. Freedom of contract is morally superior, the traditional contract for the past two thousand years being that a man and a woman each gave their consent to sex once and forever:

The concept of "rape" in marriage defines marriage, as it was originally understood out of existence, marriage as it was originally understood being the power to bind our future selves to stick it out

According to the New Testament:

let every man have his own wife, and let every woman have her own husband.

Let the husband render unto the wife due benevolence: and likewise also the wife unto the husband.

The wife hath not power of her own body, but the husband: and likewise also the husband hath not power of his own body, but the wife.

Defraud ye not one the other, except it be with consent for a time,

If consent to sex is given moment to moment, rather than once and forever, then marriage cannot be a durable contract: Consent to marriage then has to be moment to moment, which is to say routine hooking up, rather than marriage, thus producing the present situation where men are reluctant to invest in children and posterity, and where eighty percent of fertile age women have sex with twenty percent of men.

The concept of "rape" in marriage defines women as incapable of contract. Like so much of feminism, it infantilizes women in the guise of empowering them.

Saint Paul phrased it more delicately than I phrase it, or people in the eighteenth century phrased it, but what he meant, and what he rather delicately implied, and what people in the eighteenth century said plainly enough, is that if a fertile age wife is not getting done by her husband, she will be getting done by someone else pretty soon, and if a fertile age wife knocks her husband back, she is probably thinking about getting done by someone higher status than her husband, and pretty soon will be so. If she violates the marital contract by not servicing her husband, she is about to violate the marital contract a lot more drastically.

I could consent to have you shoot me, but it will still injure and possibly kill me. A child or their parent could consent to being shown graphic sexual or violent imagery, but it would still cause psychological damage. Rape is not theft but assault, and society does not allow such harm to be perpetrated just because someone thought they could handle it.

If consent to sex is given moment to moment, rather than once and forever, then marriage cannot be a durable contract

Not technically true. Since you have already said marriage is being redefined it just means that the redefinition must be to something which does not necessarily include sex---that is, a contract that allows enforced abstinence. A logically coherent concept even though I find the notion repugnant.

[-][anonymous]40

And it’s unlikely that the human species has reached the end of moral development.

...

Anissimov: What is metaethics and how is it relevant to Friendly AI theory?

Metaethics goes one level deeper. What do terms like ‘good’ and ‘right’ even mean? Do moral facts exist, or is it all relative? Is there such a thing as moral progress? These questions are relevant to friendliness content because presumably, if moral facts exist, we would want an AI to respect them. Even if moral facts do not exist, our moral attitudes are part of what we value, and that is relevant to friendliness content theory.

So we are sure we don't want to fix things into place but we are not sure anything like moral progress exists? Isn't the distinction between moral change and moral progress the notion that the latter is normative while the former may not be? Also is the use of moral development here a synonym of moral progress or is it closer to my use of moral change or does it have a different meaning altogether?

Does the answer to the bolded question seem to be "yes" in Luke's opinion, but he also considers the implications of a "no" answer to carry significant enough impact to be included on the list? Note I don't find it likely the process of moral change or development has ended; I simply question that "is" should be "ought" when it comes to these matters.

In a way I am asking which part of the current or planned metaethics investigations has covered this? It seems to me a very basic thing to ask ourselves.

[-][anonymous]10

Also is the use of moral development here a synonym of moral progress or is it closer to my use of moral change or does it have a different meaning altogether?

Perhaps this?

A formalization of coherent extrapolated volition, a process for extrapolating current human values into ‘matured’ human values (what we would want if we had full information, perfect rationality, etc.).

From the interview, Luke's part:

One reason for us to focus on AI for the moment is that there are dozens of open problems in Friendly AI theory that we can make progress on right now without needing the vast computational resources required to make progress in whole brain emulation.

This is confusing, for WBE progress that can be made right now is not constrained by a "need for vast computational resources", while the corresponding argument on AI's side is about present capability, and there are many more relevant considerations about WBE.

I read it as saying "we can pursue work on AI without spending hundreds of thousands on mid-level supercomputing facilities". That is, WBE isn't constrained by the limit of our computational resources, but it might be constrained by the limit of a non-profit's ability to pay for computational resources.

What kinds of expensive computations would you need running in order to make progress on WBE? (As a separate issue, why should you want to make progress on WBE?)

This idea probably just comes from looking at the Blue Brain project that seems to be aiming in the direction of WBE and uses an expensive supercomputer for simulating models of neocortical columns... right, Luke? :)

(I guess because we'd like to see WBE come before AI, due to creating FAI being a hell of a lot more difficult than ensuring a few (hundred) WBEs behave at least humanly friendly and thereby be of some use in making progress on FAI itself.)

Here's an MP3 of the interview (text-to-speech conversion).

(If you think this is not a fair use of copyright, let me know and I'll take it down.)

When can we expect more info on the spin-off rationality organization?

Right now we're still gathering the initial team, doing market research, experimentally testing our success at teaching rationality, and so on. So one reason for the lack of information about what Rationality Org will look like is that we don't have that information yet. And I'm not sure when we'll have it.

The interview is very clear. Luke is unusually methodical and broad minded for an autodidact and I think that comes across well in the interview. It sounds like he's weighing many alternatives. I consider this an improvement over SIAI's traditional style.

Are there ‘categorical’ oughts that do not depend on an “If you want X” clause? Naturalists tend to deny this possibility, but perhaps categorical epistemic or moral oughts can be derived from the mathematics of game theory and decision theory, as naturalist Gary Drescher suggests in Good and Real. If so, it may be wise to make sure they are included in friendliness content theory, so that an AI can respect them.

The phrasing "it may be wise to make sure they are included in friendliness content theory" makes it sound wrong to my ear, as if you are brewing some kind of informal eclectic theoretical soup.

It struck me as sort of a sly understatement, which went over well compared to the more familiar "I am politely and quietly screaming that this is important" tone.

I'm not confident that is the case, or at least that this meaning is reliably communicated.

I concur.

A minor nitpick: is there a reason for the interview being "Luke vs. Anissimov", and not, say, "Luke vs. Michael"?

Yeah, I go by Anissimov because there are so many Michaels in the Singularity movement.

Descriptively, that could be a cause for why that happened, but should it affect how the interview is formatted?

Ah, I decided to change it to Luke vs. Michael. Thanks for feedback.

[-][anonymous]00

.

Name spelling fixed, thanks.

As it turns out, the heroes who can save the world are not those with incredible strength or the power of flight.

That's levitation homes.

[-][anonymous]-10

Tactile telekinesis FTW.

[This comment is no longer endorsed by its author]

typo: "think a good world would like" -> "think a good world would be like"

Fixed, thanks.

Another one: "the study particular moral questions" -> "the study of particular moral questions"

[-][anonymous]00

Another one: "the study particular moral questions" -> "the study of particular moral questions"

[This comment is no longer endorsed by its author]