Strange7 comments on Open Thread: March 2010 - Less Wrong

Post author: AdeleneDawner 01 March 2010 09:25AM


Comment author: Strange7 10 March 2010 10:31:09PM 0 points [-]

Playing around with taboos, I think I might have come up with a short yet unambiguous definition of friendliness.

"A machine whose historical consequences, if compiled into a countable number of single-subject paragraphs and communicated, one paragraph at a time, to any human randomly selected from those alive at any time prior to the machine's activation, would cause that human's response (on a numerical scale representing approval or disapproval of the described events) to approach complete approval (as a limit) as the number of paragraphs thus communicated increases."

Not a particularly practical definition, since testing it for an actual, implemented AGI would require at least one perfectly unbiased causality-violating journalist, but as far as I can tell it makes no reference to totally mysterious cognitive processes. Compiling actual events into a text narrative is still a black box, but strikes me as more tractable than something like 'wisdom,' since the work of historical scholars is open to analysis.
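Purely as an illustration of the limit criterion (nothing here is implementable; the approval curves, horizon, and tolerance are invented stand-ins for the causality-violating journalist), a finite approximation might look like:

```python
def looks_friendly(approval_curves, horizon=10_000, tol=1e-3):
    """Finite stand-in for the limit criterion: each curve maps the
    number of paragraphs communicated to one sampled human's approval
    on [0, 1], and every curve must be within tol of 1 at the horizon.
    A single persistent disapprover disqualifies the machine."""
    return all(curve(horizon) >= 1 - tol for curve in approval_curves)

# Toy humans: approval converges to a personal limit L as the number
# of paragraphs n grows.
def converging_to(L):
    return lambda n: L * n / (n + 1)

everyone_converges_to_1 = [converging_to(1.0) for _ in range(100)]
one_holdout = everyone_converges_to_1 + [converging_to(0.6)]

print(looks_friendly(everyone_converges_to_1))  # True
print(looks_friendly(one_holdout))              # False
```

The second case shows the unanimity requirement: one human whose approval converges to anything short of complete approval fails the whole machine.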

I'm probably missing something important. Could someone please point it out?

Comment author: MichaelHoward 10 March 2010 10:44:12PM 4 points [-]
Comment author: Strange7 10 March 2010 11:38:08PM 0 points [-]

A clarification: if even one human is ever found, out of the approx. 10^11 who have ever lived (to say nothing of multiple samples from the same human's life) who would persist in disapproval of the future-history, the machine does not qualify.

Comment author: MichaelHoward 11 March 2010 12:09:34AM 2 points [-]

You roll a 19 :-)

I don't think any machine could qualify. You're requiring every human's response to approach complete approval, and people's preferences are too different.

Even without needing a unanimous verdict, I don't think Everyone Who's Ever Lived would make a good jury for this case.

Comment author: Strange7 11 March 2010 12:39:53AM 0 points [-]

Given that it's possible, would you agree that any machine capable of satisfying such a rigorous standard would necessarily be Friendly?

Comment author: FAWS 11 March 2010 12:54:16AM *  2 points [-]

It would be persuasive, and thus more likely to be friendly than an AI that doesn't even concern itself with humans enough to bother persuading them, but less likely than an AI that strove for genuine understanding of the truth in humans, for which this particular test (as an approximation) would mean certain failure.

Comment author: Strange7 11 March 2010 01:26:41AM 1 point [-]

I'm fairly certain that creating a future which would persuade everyone just by being reported honestly requires genuine understanding, or something functionally indistinguishable therefrom.

The machine in question doesn't actually need to be able to persuade, or, for that matter, communicate with humans in any capacity. The historical summary is compiled, and pass/fail evaluation conducted, by an impartial observer, outside the relevant timeline - which, as I said, makes literal application of this test at the very least hopelessly impractical, maybe physically impossible.

Comment author: FAWS 11 March 2010 01:35:27AM *  1 point [-]

I'm fairly certain that creating a future which would persuade everyone just by being reported honestly requires genuine understanding, or something functionally indistinguishable therefrom.

Your definition didn't include "honestly". And it didn't even sort of vaguely imply neutral or unbiased.

The historical summary is compiled, and pass/fail evaluation conducted, by an impartial observer, outside the relevant timeline -

You never mentioned that in your definition. And defining an impartial observer seems to be a problem of comparable magnitude to defining friendliness in the first place. With a genuinely impartial observer who does not attempt to persuade, there is no possibility of any future passing the test.

Comment author: Strange7 11 March 2010 02:34:50AM 0 points [-]

I referred to a compilation of all the machine's historical consequences - in short, a map of its entire future light cone - in text form, possibly involving a countably infinite number of paragraphs. Did you assume that I was referring to a progress report compiled by the machine itself, or some other entity motivated to distort, obfuscate, and/or falsify?

I think you're assuming people are harder to satisfy than they really are. A lot of people would be satisfied with (strictly truthful) statements along the lines of "While The Machine is active, neither you nor any of your allies or descendants suffer due to malnutrition, disease, injury, overwork, or torment by supernatural beings in the afterlife." Someone like David Icke? "Shortly after The Machine's activation, no malevolent reptilians capable of humanoid disguise are alive on or near the Earth, nor do any arrive thereafter."

I don't mean to imply that the 'approval survey' process even involves cherrypicking the facts that would please a particular audience. An ideal Friendly AI would set up a situation that has something for everyone, without deal-breakers for anyone, and that looks impossible to us for the same reason a skyscraper looks impossible to termites.

Then again, some kinds of skyscrapers actually are impossible. If it turns out that satisfying everyone ever, or even pleasing half of them without enraging or horrifying the other half, is a literal, logical impossibility, degrees and percentages of satisfaction could still be a basis for comparison. It's easier to shut up and multiply when actual numbers are involved.
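The "actual numbers" in the last sentence could be as crude as per-person approval scores averaged across everyone surveyed. A toy comparison (the scores and the averaging rule are invented for illustration, not anything proposed in the thread):

```python
def mean_satisfaction(scores):
    """Average per-person approval on [0, 1] -- a degree-based
    comparison for when unanimous approval is impossible."""
    return sum(scores) / len(scores)

future_a = [0.9, 0.8, 0.95, 0.1]  # something for most, a deal-breaker for one
future_b = [0.6, 0.6, 0.6, 0.6]   # mediocre for everyone

best = max([future_a, future_b], key=mean_satisfaction)
print(round(mean_satisfaction(future_a), 4))  # 0.6875
print(best is future_a)                       # True
```

Under this rule a future that enrages a minority can still beat one that merely bores everyone, which is exactly the kind of trade-off the veto-based definition refuses to make.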

Comment author: FAWS 11 March 2010 02:46:49AM 2 points [-]

Did you assume that I was referring to a progress report compiled by the machine itself, or some other entity motivated to distort, obfuscate, and/or falsify?

No, that the AI would necessarily end up doing that if friendliness was its super-goal and your paragraph the definition of friendliness.

I think you're assuming people are harder to satisfy than they really are.

What would a future a genuine racist would be satisfied with look like? Would there be gay marriage in that future? Would sinners burn in hell? Remember, no attempts at persuasion, so the racist won't stop being a racist, the homophobe being a homophobe, or the religious fanatic being a religious fanatic, no matter how long the report.

Comment author: orthonormal 11 March 2010 02:21:54AM 3 points [-]

Human nature is more complicated by far than anyone's conscious understanding of it. We might not know that future was missing something essential, if it were subtle enough. Your journalist ex machina might not even be able to communicate to us exactly what was missing, in a way that we could understand at our current level of intelligence.

Comment author: PhilGoetz 21 March 2010 11:33:02PM 2 points [-]

I'm probably missing something important. Could someone please point it out?

That most people, historically, have been morons.

Basically the same question: Why are you limited to humans? Even supposing you could make a clean evolutionary cutoff (no one before Adam gets to vote), is possessing a particular set of DNA really an objective criterion for having a single vote on the fate of the universe?

Comment author: orthonormal 22 March 2010 02:43:08AM 0 points [-]

There is no truly objective criterion for such decisionmaking, or at least none that you would consider fair or interesting in the least. The criterion is going to have to depend on human values, for the obvious reason that humans are the agents who get to decide what happens now (and yes, they could well decide that other agents get a vote too).

Comment author: Strange7 22 March 2010 12:38:09AM 0 points [-]

It's not a matter of votes so much as veto power. CEV is the one where everybody, or at least their idealized version of themselves, gets a vote. In my plan, not everybody gets everything they want. The AI just says "I've thought it through, and this is how things are going to go," then provides complete and truthful answers to any legitimate question you care to ask. Anything you don't like about the plan, when investigated further, turns out to be either a misunderstanding on your part or a necessary consequence of some other feature that, once you think about it, is really more important.

Yes, most people historically have been morons. Are you saying that morons should have no rights, no opportunity for personal satisfaction or relevance to the larger world? Would you be happy with any AI that had an equivalent degree of contempt for lesser beings?

There's no particular need to limit it to humans; it's just that humans have the most complicated requirements. If you want to add a few more orders of magnitude to the processing time and set aside a few planets just to make sure that everything macrobiotic has its own little happy hunting ground, go ahead.

Comment author: PhilGoetz 22 March 2010 03:34:45AM 0 points [-]

Are you saying that morons should have no rights, no opportunity for personal satisfaction or relevance to the larger world?

Your scheme requires that the morons can be convinced of the correctness of the AI's view by argumentation. If your scheme requires all humans to be perfect reasoners, you should mention that up front.

Comment author: Vladimir_Nesov 11 March 2010 09:43:46AM *  1 point [-]