Comment author: Douglas_Reay 18 July 2014 10:08:59AM -1 points [-]

To paraphrase "Why Flip a Coin: The Art and Science of Good Decisions", by H. W. Lewis

Good decisions are made when the person making the decision shares in both the benefits and the consequences of that decision. Shield a person from either, and you shift the decision making process.

However, we know there are various cognitive biases that make people's estimates of evidence depend upon the order in which the evidence is presented. If we want to inform people, rather than manipulate them, then we should present information in the order that will minimise the impact of such biases, even if doing so isn't the tactic most likely to manipulate them into agreeing with the conclusion that we ourselves have reached.

Comment author: Douglas_Reay 18 July 2014 10:13:22AM -1 points [-]

Having said that, there is research suggesting that some groups are more prone than others to the particular cognitive biases that unduly prejudice people against an option when they hear about the scary bits first.

Short Summary
Longer Article

Comment author: Douglas_Reay 18 July 2014 09:55:21AM -1 points [-]

To the extent that we care about causing people to become better at reasoning about ethics, it seems like we ought to be able to do better than this.

What would you propose as an alternative?

Comment author: Douglas_Reay 18 July 2014 09:50:00AM -1 points [-]

The problem here is whether even a cautious programmer will be able to reliably determine when an AI is sufficiently advanced that the AI can deceive the programmer over whether the programmer has been successful in redefining the AI's core purpose.

One would hope that the programmer would resist the AI trying to tempt the programmer into allowing it to grow beyond that point before the programmer has set the core purpose that they want the AI to have for the long term.

Comment author: Douglas_Reay 18 July 2014 09:52:08AM -1 points [-]

One lesson you could draw from this is that, as part of your definition of what a "paperclip" is, you should include the AI putting a high value upon being honest with the programmer (about its aims, tactics and current ability levels) and not deliberately trying to game, tempt or manipulate the programmer.

Comment author: Douglas_Reay 18 July 2014 09:40:13AM -1 points [-]

I think this is a political issue, not one with a single provably correct answer.

Think of it this way. Suppose you have 10 billion people in the world at the point at which several AIs get created. To simplify things, let's say that just four AIs get created, and each asks for resources to be donated to it, to further that AI's purpose, with the following spiel:

AI ONE - My purpose is to help my donors live long and happy lives. I will value aiding you (and just you, not your relatives or friends) in proportion to the resources you donate to me. I won't value helping non-donors, except insofar as it aids me in aiding my donors.

AI TWO - My purpose is to help those my donors want me to help. Each donor can specify a group of people (both living and future), such as "the species homo sapiens", or "anyone sharing 10% or more of the parts of my genome that vary between humans, in proportion to how similar they are to me", and I will aid that group in proportion to the resources you donate to me.

AI THREE - My purpose is to increase the average utility experienced per sentient being in the universe. If you are an altruist who cares most about quality of life, and who asks nothing in return, donate to me.

AI FOUR - My purpose is to increase the total utility experienced, over the lifetime of this universe, by all sentient beings in the universe. I will compromise with AIs who want to protect the human species, to the extent that doing so furthers that aim. And, since the polls predict plenty of people will donate to such AIs, have no fear of being destroyed - do the right thing by donating to me.

Not all of those 10 billion have the same amount of resources, or the same willingness to donate those resources to be turned into additional computer hardware to boost their chosen AI's bargaining position with the other AIs. But let us suppose that, after everyone donates and the AIs are created, there is no clear winner, and the situation is as follows:

AI ONE ends up controlling 30% of available computing resources, AI TWO also has 30%, AI THREE has 20% and AI FOUR has 20%.

And let's further assume that humanity was wise enough to enforce an initial "no negative bargaining tactics", so AI FOUR couldn't get away with threatening "Include me in your alliance, or I'll blow up the Earth".

There are, from this position, multiple possible solutions that would break the deadlock. Any three of the AIs could ally to gain control of sufficient resources to outgrow all the others.
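That claim can be checked with a quick sketch. The shares are the ones given above; the majority rule (a coalition out-grows the rest if it controls strictly more resources than everyone else combined) is an assumption made for illustration:

```python
from itertools import combinations

# Computing-resource shares after the donation round, as given above (percent)
shares = {"ONE": 30, "TWO": 30, "THREE": 20, "FOUR": 20}

def winning_coalitions(shares):
    """Return every proper coalition holding a strict majority of resources."""
    names = list(shares)
    wins = []
    for size in range(2, len(names)):
        for combo in combinations(names, size):
            total = sum(shares[n] for n in combo)
            if total > 100 - total:  # strict majority over everyone outside
                wins.append((combo, total))
    return wins

for combo, total in winning_coalitions(shares):
    print(combo, total)
```

Note that every three-AI coalition controls at least 70%, and under this rule the pair ONE-TWO alone already holds a 60% majority; the equal 50-50 pairs (such as ONE-THREE) break no deadlock.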

For example:

The FUTURE ALLIANCE - THREE and FOUR agree upon a utility function that maximises total utility under a constraint that expected average utility must, in the long term, increase rather than decrease, in a way that depends upon some stated relationship to other variables such as time and population. They then offer to ally with either ONE or TWO, with a compromise cut-off date: ONE or TWO controls the future of the planet Earth up to that date, THREE-FOUR controls everything beyond it, and they'll accept whichever of ONE or TWO bids the earlier date. This ends up with a winning bid from ONE of 70 years, plus a guarantee that some genetic material and a functioning industrial base will be left, at minimum, for THREE-FOUR to take over after then.

The BREAD AND CIRCUSES ALLIANCE - ONE offers to support whoever can give the best deal for ONE's current donors, and TWO, who has most in common with ONE and can clinch the deal by itself, outbids THREE-FOUR.

The DAMOCLES SOLUTION - No single permanent AI with a compromise goal is created. Instead, all four AIs agree to a temporary compromise, lasting long enough for humanity to attain limited interstellar travel, at which point THREE and FOUR will be launched in opposite directions and will vacate Earth's solar system, which (along with other solar systems containing planets within a pre-defined human habitability range) will remain under the control of ONE-TWO. To enforce this agreement, a temporary AI is created and funded by the four AIs, with the sole purpose of carrying out the agreed actions and then splitting back into the constituent AIs at the agreed-upon points.

Any of the above (and many other possible compromises) could be arrived at when the four AIs sit down at the bargaining table. Which one is agreed upon would depend upon the strength of each AI's bargaining position, and on other political factors. There might well be 'campaign promises' made in the appeal-for-resources stage, with AIs voluntarily taking on restrictions on how they will further their purpose, in order to make themselves more attractive allies, or to poach resources by reducing the fears of donors.

Comment author: Douglas_Reay 18 July 2014 08:50:03AM -1 points [-]

We have the notion of total utilitarianism, in which the government tries to maximize the sum of the utility values of each of its constituents. This leads to "repugnant conclusion" issues in which the government generates new constituents at a high rate until all of them are miserable.

We also have the notion of average utilitarianism, in which the government tries to maximize the average of the utility values of each of its constituents. This leads to issues -- I'm not sure if there's a snappy name -- where the government tries to kill off the least happy constituents so as to bring the average up.

Not quite. If our societal utility function is S(n) = n x U(n), where n is the number of people in the society and U(n) is the average utility gained per year per person (which decreases as n increases, for high n, because of overcrowding and resource scarcity), then you don't maximise S(n) by just increasing n until U(n) reaches 0. There will be an optimum n, beyond which U(n+1) - the utility gained by the one additional citizen - is less than n x ( U(n) - U(n+1) ) - the utility lost by the other n citizens from adding that person.
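A toy sketch makes the point concrete. The linear form of U(n) below is made up purely for illustration; any average-utility curve that declines fast enough behaves the same way:

```python
def U(n):
    """Average utility per person per year; an assumed linear decline."""
    return 100.0 - 0.01 * n

def S(n):
    """Total societal utility: S(n) = n * U(n)."""
    return n * U(n)

# Find the population size that maximises total utility S(n),
# rather than growing n until U(n) hits zero.
best_n = max(range(1, 10001), key=S)

print(best_n, U(best_n))
```

With these numbers the optimum lands at n = 5000, where the average citizen still enjoys U(n) = 50 - half the maximum, and far above zero. At that point the marginal condition above flips: U(n+1) drops below n x (U(n) - U(n+1)), so adding another person reduces S(n).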

Comment author: Douglas_Reay 18 July 2014 08:24:23AM -1 points [-]

It might be useful to distinguish between the actual total utility experienced so far, and the estimates of that which can be worked out from various view points.

Suppose we break it down by week. If, during the first week of March 2014, Bob gains utility (e.g. pleasure) from watching movies, from collecting stamps, from owning a stamp collection, and from having watched movies (4 different things), then you'd multiply the duration (1 week) by the rate at which those things add to his utility experienced, to get how much that week adds to his total lifetime utility experienced.

If, during the second week of March, a fire destroys his stamp collection, that wouldn't reduce his lifetime total. What it would do is reduce the rate at which he added to that total during the following weeks.
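A minimal sketch of this accounting, with made-up rates (units are "utility per week"); the key property is that the lifetime total only ever accumulates, and past weeks are never revised:

```python
# Assumed utility rates for Bob's four sources, in utility per week
rates = {
    "watching movies": 3.0,
    "collecting stamps": 2.0,
    "owning a stamp collection": 1.0,
    "having watched movies": 1.0,
}

lifetime_total = 0.0

# Week 1 of March: all four sources are active.
lifetime_total += sum(rates.values()) * 1  # rate x duration (1 week)

# Week 2: a fire destroys the stamp collection. The total accumulated so
# far is untouched; only the rates for the following weeks drop.
rates["collecting stamps"] = 0.0
rates["owning a stamp collection"] = 0.0
lifetime_total += sum(rates.values()) * 1

print(lifetime_total)
```

After the fire, each subsequent week adds 4.0 rather than 7.0, but nothing is subtracted from what was already experienced.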

Comment author: Douglas_Reay 18 July 2014 08:40:48AM -1 points [-]

Now let's take a different example. Suppose there is a painter whose only concern is their reputation upon their death, as measured by the monetary value of the paintings they put up for one final auction. Painting gives them no joy. Finishing a painting doesn't increase their utility, only the expected amount of utility that they will reap at some future date.

If, before they died, a fire destroyed the warehouse holding the paintings they were about to auction off, then they would account the net utility experienced during their life as zero. Having spent years owning lots of paintings, and having had a high expectation of gaining future utility during that time, wouldn't have added anything to their actual total utility over those years.

How is that affected by the possibility of the painter changing their utility function?

If they later decide that there is utility to be experienced by weeks spent improving their skill at painting (by means of painting pictures, even if those pictures are destroyed before ever being seen or sold), does that retroactively change the total utility added during the previous years of their life?

I'd say no.

Either utility experienced is real, or it is not. If it is real, then a change in the future cannot affect the past. It can affect the estimate you are making now of the quantity in the past, just as an improvement in telescope technology might affect the estimate a modern day scientist might make about the quantity of explosive force of a nova that happened 1 million years ago, but it can't affect the quantity itself, just as a change to modern telescopes can't actually go back in time to alter the nova itself.

Comment author: John_Maxwell_IV 16 June 2014 08:06:05AM 7 points [-]

Highlights not already present in this thread:

  • Safely scalable AI

  • Humane AI

  • Benevolent AI

  • Moral AI

Comment author: Douglas_Reay 18 July 2014 08:15:14AM -1 points [-]

I like "scalable". "Stability" is also an option for conveying that it is the long term outcome of the system that we're worried about.

"Safer" rather than "Safe" might be more realistic. I don't know of any approach in ANY practical topic, that is 100% risk free.

And "assurance" (or "proven") is also an important point. We want reliable evidence that the approach is as safe as the designed claim.

But it isn't snappy or memorable to say we want AI whose levels of benevolence have been demonstrated to be stable over the long term.

Maybe we should go for a negative? "Human Extinction-free AI" anyone? :-)
