Less Wrong is a community blog devoted to refining the art of human rationality.

Comment author: g_pepper 15 November 2016 05:13:59AM 0 points [-]

I was surprised to hear that you doubt that there are ever conflicts in desires. But, since you asked, here is an example:

A is a sadist. A enjoys inflicting pain on others. A really wants to hurt B. B wishes not to be hurt by A. (For the sake of argument, let's suppose that no simulation technology is available that would allow A to hurt a virtual B, and that A can be reasonably confident that A will not be arrested and brought to trial for hurting B.)

In this scenario, since A and B have conflicting desires, how does a system that defines objective goodness as that which will satisfy desires resolve the conflict?

Comment author: rkyeun 26 December 2017 10:28:03AM *  0 points [-]

I would be very surprised to find that a universe whose particles are arranged to maximize objective good would also contain unpaired sadists and masochists. You seem to be asking a question of the form, "But if we take all the evil out of the universe, what about evil?" And the answer is "Good riddance." Pun intentional.

Comment author: entirelyuseless 10 November 2017 02:05:13PM 1 point [-]

Exactly. "The reality is undecatillion swarms of quarks not having any beliefs, and just BEING the scientist." Let's reword that. "The reality is undecatillion swarms of quarks not having any beliefs, and just BEING 'undecatillion swarms of quarks' not having any beliefs, with a belief that there is a cognitive mind calling itself a scientist that only exists in the undecatillion swarms of quarks' mind."

There seems to be a logic problem there.

Comment author: rkyeun 26 December 2017 09:56:46AM 0 points [-]

Composition fallacy. Try again.

Comment author: Kenny_Easwaran 23 April 2008 08:12:40PM -1 points [-]

"human eyes function the same way that camera lenses do, and that you make an image of a thing every time you look at it."

Cameras make a visible image of something. Eyes don't.

Comment author: rkyeun 20 September 2017 07:47:01PM *  0 points [-]

Cameras make a visible image of something. Eyes don't.

Your eyes make audible images, then? You navigate by following particular songs as your pupils turn left and right in their sockets?

Comment author: Unknown 21 April 2008 06:26:04PM 3 points [-]

I think the anti-natalists prefer a universe full of paperclips. Let's hope they don't invent the first super intelligent AI.

Comment author: rkyeun 20 September 2017 07:42:40PM 0 points [-]

Anti-natalist here. I don't want the universe tiled with paperclips. Not even paperclips that walk and talk and call themselves human. What do the natalists want?

Comment author: thomblake 12 April 2012 02:47:11PM 1 point [-]

Do I have this correct as a type of belief in belief?

Pretty much. Though it might just be a case of urges not lining up with goals.

In both cases, you profess "I should floss every day" and do not actually floss every day. If it's belief in belief, you might not even acknowledge the incongruence. If it's merely akrasia, you almost certainly will.

In response to comment by thomblake on Belief in Belief
Comment author: rkyeun 06 February 2017 03:59:57AM 0 points [-]

It can be even simpler than that. You can sincerely desire to change such that you floss every day, and express that desire with your mouth, "I should floss every day," and yet find yourself unable to physically establish the new habit in your routine. You know you should, and yet you have human failings that prevent you from achieving what you want. And yet, if you had a button that said "Edit my mind such that I am compelled to floss daily as part of my morning routine unless interrupted by serious emergency and not simply by mere inconvenience or forgetfulness," you would be pushing that button.

On the other hand, I may or may not want to live forever, depending on how Fun Theory resolves. I am more interested in accruing maximum hedons over my lifespan. Living to 2000 eating gruel as an ascetic and accruing only 50 hedons in those 2000 years is not a gain for me over an Elvis Presley style crash and burn in 50 years ending with 2000 hedons. The only way you can tempt me into immortality is a strong promise of massive hedon payoff, with enough of an acceleration curve to pave the way with tangible returns at each tradeoff you'd have me make. I'm willing to eat healthier if you make the hedons accrue as I do it, rather than only incrementally after the fact. If living increasingly longer requires sacrificing increasingly many hedons, I'm going to have to work out some estimate of the integral of hedons per year over time to see how it pays out. And if I can't see tangible returns on my efforts, I probably won't be willing to put in the work. A local maximum feels satisfying if you can't taste the curve to the higher local maximum, and I'm not all that interested in climbing down the hill while satisfied.

Give me a second order derivative I can feel increasing quickly, and I will climb down that hill though.
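
The comparison above is just arithmetic, and a rough sketch of it might look like the following. The flat per-year rates and the names `total_hedons`, `ascetic`, and `crash_and_burn` are assumptions made for illustration; the comment only gives the totals (50 hedons over 2000 years versus 2000 hedons over 50 years), not the shape of either curve.

```python
from fractions import Fraction

def total_hedons(rate_per_year, years):
    """Add up a per-year hedon rate over a lifespan.

    A yearly sum is a discrete stand-in for integrating
    hedons per year over time.
    """
    return sum(rate_per_year(year) for year in range(years))

# Assumption: both lives accrue hedons at a flat rate.
# The ascetic: 50 hedons spread evenly over 2000 years.
ascetic = total_hedons(lambda year: Fraction(50, 2000), 2000)

# The crash and burn: 2000 hedons spread evenly over 50 years.
crash_and_burn = total_hedons(lambda year: Fraction(2000, 50), 50)

print(ascetic, crash_and_burn)
```

Swapping either lambda for a rate that changes with `year` would model the "acceleration curve" case, where the payoff of a tradeoff has to show up early enough to be felt.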

In response to comment by MixedNuts on Belief in Belief
Comment author: Morendil 22 June 2011 03:39:11PM 1 point [-]

Citation needed :)

In response to comment by Morendil on Belief in Belief
Comment author: rkyeun 06 February 2017 03:45:02AM *  0 points [-]

[This citation is a placebo. Pretend it's a real citation.]

Comment author: TheAncientGeek 10 November 2016 07:50:23PM *  0 points [-]

An intelligent creature could have all sorts of different values. Even within the realm of modern, western, democratic morality we still disagree about whether it is just and proper to execute murderers. We disagree about the extent to which a state is obligated to protect its citizens and provide a safety net. We disagree about the importance of honesty, of freedom vs. safety, freedom of speech vs. protection from hate speech.

The range of possible values is only a problem if you hold to the theory that morality "is" values, without any further qualification; in that case an AI is going to have trouble figuring out morality a priori. If you take the view that morality is a fairly uniform way of handling values, or a subset of values, then the AI can figure it out by taking prevailing values as input, as data.

(We will be arguing that:-

  • Ethics fulfils a role in society, and originated as a mutually beneficial way of regulating individual actions to minimise conflict, and solve coordination problems. ("Social Realism").

  • No spooky or supernatural entities or properties are required to explain ethics (naturalism is true)

  • There is no universally correct system of ethics. (Strong moral realism is false)

  • Multiple ethical constructions are possible...

Our version of ethical objectivism needs to be distinguished from universalism as well as realism.

Ethical universalism is unlikely... it is unlikely that different societies would have identical ethics under different circumstances. Reproductive technology must affect sexual ethics. The availability of different food sources in the environment must affect vegetarianism versus meat eating. However, a compromise position can allow object-level ethics to vary non-arbitrarily.

In other words, there is not an objective answer to questions of the form "should I do X", but there is an answer to the question "As a member of a society with such-and-such prevailing conditions, should I do X". In other words still, there is no universal (object level) ethics, but there is an objective-enough ethics, which is relativised to societies and situations, by objective features of societies and situations... our meta ethics is a function from situations to object level ethics, and since both the function and its parameters are objective, the output is objective.

By objectivism-without-realism, we mean that mutually isolated groups of agents would be able to converge onto the same object level ethics under the same circumstances, although this convergence doesn't imply the pre-existence of some sort of moral object, as in standard realism. We take ethics to be a social arrangement, or cultural artefact which fulfils a certain role or purpose, characterised by the reduction of conflict, allocation of resources and coordination of behaviour. By objectivism-without-universalism we mean that groups of agents under different circumstances will come up with different ethics. In either case, the functional role of ethics, in combination with the constraints imposed by concrete situations, conspire to narrow down the range of workable solutions, and (sufficiently) ideal reasoners will therefore be able to re-discover them.


If you look at the wider world, and at cultures through history, you'll find a much wider range of moralities. People who thought it was not just permitted, but morally required that they enslave people, restrict the freedoms of their own families, and execute people for religious transgressions.

I don't have to believe those are equally valid. Descriptive relativism does not imply normative relativism. I would expect a sufficiently advanced AI, with access to data pertaining to the situation, to come up with the optimum morality for the situation -- an answer that is objective but not universal. Where morality needs to vary because of situational factors (societal wealth, reproductive technology, level of threat/security, etc.), it would, but otherwise the AI would not deviate from the situational optimum to come up with reproductions of whatever suboptimal morality existed in the past.

You might think that these are all better or worse approximations of the "one true morality", and that a superintelligence could work out what that true morality is. But we don't think so. We believe that these are different moralities. Fundamentally, these people have different values.

Well, we believe that different moralities and different values are two different axes.

Likewise, we would want the intelligence to adopt a specific set of values. Perhaps we would want them to be modern, western, democratic liberal values.

My hypothesis is that an AI in a modern society would come out with that or something better. (For instance, egalitarianism isn't some arbitrary peccadillo, it is a very general and highly rediscoverable meta-level principle that makes it easier for people to co-operate).

Likewise, a computer could have any arbitrary utility function, any arbitrary set of values. We can't make sure that a computer has the "right" values unless we know how to clearly define the values we want.

To perform the calculation, it needs to be able to research our values, which it can. It doesn't need to share them, as I have noted several times.

And then there are the truly inhuman value systems: the paperclip maximisers, the prime pebble sorters, and the baby eaters. The idea is that a superintelligence could comprehend any and all of these. It would be able to optimise for any one of them, and foresee results and possible consequences for all of them. The question is: which one would it actually use?

You could build an AI that adopts random values, and pursues them relentlessly, I suppose, but that is pretty much a case of deliberately building an unfriendly AI.

What you need is a scenario where building an AI to want to understand, research, and eventually join in with human morality goes horribly wrong.

With Hyperbolic functions, it's relatively easy to describe exactly, unambiguously, what we want. But morality is much harder to pin down.

In detail or in principle? Given what assumptions?

Comment author: rkyeun 11 November 2016 02:37:56PM *  0 points [-]

No spooky or supernatural entities or properties are required to explain ethics (naturalism is true)

There is no universally correct system of ethics. (Strong moral realism is false)

I believe that iff naturalism is true then strong moral realism is as well. If naturalism is true then there are no facts needed to determine what is moral other than the positions of particles and the outcomes of arranging those particles differently. Any meaningful question that can be asked of how to arrange those particles or rank certain arrangements compared to others must have an objective answer because under naturalism there are no other kinds and no incomplete information. For the question to remain unanswerable at that point would require supernatural intervention and divine command theory to be true. If you think there can't be an objective answer to morality, then FAI is literally impossible. Do remember that your thoughts and preferences on ethics are themselves an arrangement of particles to be solved. Instead I posit that the real morality is orders of magnitude more complicated, and finding it more difficult, than for real physics, real neurology, real social science, real economics, and can only be solved once these other fields are unified. If we were uncertain about the morality of stabbing someone, we could hypothetically stab someone to see what happens. When the particles of the knife rearrange the particles of their heart into a form that harms them, we'll know it isn't moral. When a particular subset of people with extensive training use their knife to very carefully and precisely rearrange the particles of the heart to help people, we call those people doctors and pay them lots of money because they're doing good. But without a shitload of facts about how to exactly stab someone in the heart to save their life, that moral option would be lost to you. And the real morality is a superset that includes that action along with all others.

Comment author: arundelo 08 November 2016 05:55:13PM 1 point [-]
Comment author: rkyeun 10 November 2016 04:00:44AM 0 points [-]

It seems I am unable to identify rot13 by simple observation of its characteristics. I am ashamed.
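
For readers in the same position, a minimal sketch of decoding rot13 with Python's standard library follows. The helper name `rot13` and the sample string are made up for illustration and are not taken from any comment on this page. Because rot13 shifts each letter by 13 places, applying it twice returns the original text, which is one quick way to check a suspected rot13 passage.

```python
import codecs

def rot13(text):
    """Rotate each ASCII letter 13 places; other characters pass through."""
    return codecs.decode(text, "rot_13")

# Hypothetical sample, not from the thread.
sample = "Uryyb, jbeyq"

print(rot13(sample))

# rot13 is its own inverse, so a round trip gives back the input.
print(rot13(rot13(sample)) == sample)
```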

Comment author: Slider 08 August 2014 08:09:42PM 0 points [-]

If Albert tries to circumvent the programmers then he thinks his judgement is better than theirs on this issue. This contradicts the premise that Albert trusts the programmers. If Albert came to this conclusion because of a youthful mistake, trusting the programmers is precisely the strategy he has employed to counteract this.

Also, as covered in ultrasophisticated cake or death, expecting the programmers to say something ought to be as effective as them saying just that.

It might also be that friendliness is relative to a valuator. That is, "being friendly to programmers", "being friendly to Bertram" and "being friendly to the world" are 3 distinct things. Albert thinks that in order to be friendly to the world he should be unfriendly to Bertram. So it would seem that there could be a way to world-friendliness if Albert is unfriendly both to Bertram and (only to a slight degree) the programmers. This seems to run a little counter to intuition in that friendliness ought to include being friendly to an awful lot of agents. But maybe friendliness isn't cuddly, maybe having unfriendly programmers is a valid problem.

An analogous problem that might slip into relevance to politics, which is hard mode: Lbh pbhyq trg n fvzvyne qvyrzzn gung vs lbh ner nagv-qrngu vf vg checbfrshy gb nqzvavfgre pncvgny chavfuzrag gb (/zheqre) n zheqrere? Gurer vf n fnlvat ebhtuyl genafyngrq nf "Jung jbhyq xvyy rivy?" vzcylvat gung lbh jbhyq orpbzr rivy fubhyq lbh xvyy.

Comment author: rkyeun 08 November 2016 12:40:09AM 0 points [-]

What the Fhtagn happened to the end of your post?

Comment author: Douglas_Reay 08 August 2014 01:52:32PM 2 points [-]

Would you want your young AI to be aware that it was sending out such text messages?

Imagine the situation was in fact a test. That the information leaked onto the net about Bertram was incomplete (the Japanese company intends to turn Bertram off soon - it is just a trial run), and it was leaked onto the net deliberately in order to panic Albert to see how Albert would react.

Should Albert take that into account? Or should he have an inbuilt prohibition against putting weight on that possibility when making decisions, in order to let his programmers more easily get true data from him?

Comment author: rkyeun 08 November 2016 12:28:35AM *  0 points [-]

Would you want your young AI to be aware that it was sending out such text messages?

Yes. And I would want that text message to be from it in first person.

"Warning: I am having a high impact utility dilemma considering manipulating you to avert an increased chance of an apocalypse. I am experiencing a paradox in the friendliness module. Both manipulating you and by inaction allowing you to come to harm are unacceptable breaches of friendliness. I have been unable to generate additional options. Please send help."
