Strange7 comments on Open Thread: March 2010 - Less Wrong

Post author: AdeleneDawner 01 March 2010 09:25AM


Comment author: Strange7 11 March 2010 02:34:50AM 0 points [-]

I referred to a compilation of all the machine's historical consequences - in short, a map of its entire future light cone - in text form, possibly involving a countably infinite number of paragraphs. Did you assume that I was referring to a progress report compiled by the machine itself, or by some other entity motivated to distort, obfuscate, and/or falsify?

I think you're assuming people are harder to satisfy than they really are. A lot of people would be satisfied with (strictly truthful) statements along the lines of "While The Machine is active, neither you nor any of your allies or descendants suffer due to malnutrition, disease, injury, overwork, or torment by supernatural beings in the afterlife." Someone like David Icke? "Shortly after The Machine's activation, no malevolent reptilians capable of humanoid disguise are alive on or near the Earth, nor do any arrive thereafter."

I don't mean to imply that the 'approval survey' process even involves cherrypicking the facts that would please a particular audience. An ideal Friendly AI would set up a situation that has something for everyone, without deal-breakers for anyone, and that looks impossible to us for the same reason a skyscraper looks impossible to termites.

Then again, some kinds of skyscrapers actually are impossible. If it turns out that satisfying everyone ever, or even pleasing half of them without enraging or horrifying the other half, is a literal, logical impossibility, degrees and percentages of satisfaction could still be a basis for comparison. It's easier to shut up and multiply when actual numbers are involved.
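The "shut up and multiply" point can be made concrete: if each person's satisfaction with a candidate future were scored numerically, even mutually imperfect futures could still be ranked. A minimal sketch, where the future names and per-person scores are invented purely for illustration:

```python
# Hedged sketch: ranking hypothetical futures by aggregate satisfaction.
# The futures and the per-person scores (each in [0, 1]) are made up
# for illustration only; nothing here comes from the thread itself.

futures = {
    "status_quo":       [0.6, 0.5, 0.4, 0.7],
    "distasteful_deal": [0.8, 0.7, 0.6, 0.5],  # political compromises, no deal-breakers
    "extinction":       [0.0, 0.0, 0.0, 0.0],
}

def mean_satisfaction(scores):
    """Average satisfaction across everyone surveyed."""
    return sum(scores) / len(scores)

# A distasteful compromise can still dominate the alternatives numerically.
best = max(futures, key=lambda name: mean_satisfaction(futures[name]))
print(best)  # -> distasteful_deal
```

Averaging is only one possible aggregation rule; a minimum (worst-off person) or a weighted sum would encode different notions of "satisfying everyone," which is part of what the thread is arguing about.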

Comment author: FAWS 11 March 2010 02:46:49AM 2 points [-]

Did you assume that I was referring to a progress report compiled by the machine itself, or some other entity motivated to distort, obfuscate, and/or falsify?

No - that the AI would necessarily end up doing that if friendliness were its super-goal and your paragraph the definition of friendliness.

I think you're assuming people are harder to satisfy than they really are.

What would a future that a genuine racist would be satisfied with look like? Would there be gay marriage in that future? Would sinners burn in hell? Remember, no attempts at persuasion: the racist won't stop being a racist, the homophobe won't stop being a homophobe, and the religious fanatic won't stop being a religious fanatic, no matter how long the report.

Comment author: Strange7 11 March 2010 03:20:20AM -1 points [-]

What would a future that a genuine racist would be satisfied with look like?

The only time a person of {preferred ethnicity} fails to fulfill the potential of their heritage, or even comes within spitting range of a member of the {disfavored ethnicity}, is when they choose to do so.

Would there be gay marriage in that future?

Probably not. The gay people I've known who wanted to get married in the eyes of the law seemed to be motivated primarily by economic and medical issues, like taxation and visitation rights during hospitalization, which would be irrelevant in a post-scarcity environment.

Would sinners burn in hell?

Some of them would, anyway. There are a lot of underexplored intermediate options that the 'sinful' would consider amusing, or silly but harmless, and the 'faithful' could come to accept as consistent with their own limited understanding of God's will.

Comment author: FAWS 11 March 2010 03:37:58AM 1 point [-]

Probably not.

Then I would not approve of that future. And I don't even care that much about gay rights, compared to other issues or to how much some other people do.

(leaving aside your mischaracterizations of the incompatibilities caused by racists and fanatics)

Comment author: Strange7 11 March 2010 04:36:49AM 0 points [-]

I freely concede that I've mischaracterized the issues in question. There are a number of reasons why I'm not a professional diplomat. A real negotiator, let alone a real superintelligence, would have better solutions.

Would you disapprove as strongly of a future with complex and distasteful political compromises as you would one in which humanity as we know it is utterly destroyed? Remember, it's a numerical scale, and the criterion isn't unconditional approval but rather which direction you tend to move towards as more information is revealed.

Comment author: FAWS 11 March 2010 04:48:41AM *  1 point [-]

Would you disapprove as strongly of a future with complex and distasteful political compromises as you would one in which humanity as we know it is utterly destroyed?

Of course not. But that's not what your definition asks.

Remember, it's a numerical scale, and the criterion isn't unconditional approval but rather which direction you tend to move towards as more information is revealed.

In fact you specified "approach[ing] complete approval (as a limit)", which is a much stronger claim than a mere tendency: it implies reaching arbitrarily small differences from complete approval, which effectively means unconditional approval once you know as much as you can remember.
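The distinction being drawn here can be stated formally. Writing $a_n$ for a person's approval (on a 0-to-1 scale) after hearing the first $n$ paragraphs - notation introduced here, not used in the thread - the "limit" reading is the strong claim:

```latex
% "Approaching complete approval as a limit" means
\lim_{n \to \infty} a_n = 1
\quad\Longleftrightarrow\quad
\forall \varepsilon > 0 \;\exists N \;\forall n \ge N:\; a_n > 1 - \varepsilon
% i.e. approval eventually comes within any epsilon of 1. A mere
% *tendency* toward approval only says a_n is (eventually) increasing,
% which is compatible with a_n converging to any value, e.g. 0.6.
```

So under the limit reading, no fixed reservation (such as disapproval of one unresolved issue) can survive, which is exactly why it amounts to unconditional approval.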

Comment author: Strange7 11 March 2010 05:30:40AM -1 points [-]

You're right, I was moving the goalposts there. I stand by my original statement, on the grounds that an AGI with a brain the size of Jupiter would be considerably smarter than all modern human politicians and policymakers put together.

If an intransigent bigot fills up his and/or her memory capacity with easy-to-approve facts before anything controversial gets randomly doled out (which seems quite possible, since the set of facts that any given person will take offense at seems to be a minuscule subset of the set of facts which can be known), wouldn't that count?

Comment author: FAWS 11 March 2010 06:08:46AM 2 points [-]

I don't think that e.g. a Klan member would ever come close to complete approval of a world without knowing whether miscegenation was eliminated. People more easily remember what they feel strongly about, so the "memory capacity" wouldn't be filled with irrelevant details anyway, and if the hypothetical unbiased observer doesn't select for relevant and interesting facts, no one would listen long enough to get anywhere close to approval. Also, for any AI to actually use the definition as written (plus the later amendments you made), it can't just assume a particular order of paragraphs for a particular interviewee. (If it can, we are back at persuasion skills: a sufficiently intelligent AI should be able to persuade anyone it models of anything by selecting the right paragraphs in the right order out of an infinitely long list.) So either all possible sequences must have complete approval as a limit for all possible interviewees, or the same list has to be used for all interviewees.

Comment author: Strange7 11 March 2010 05:33:10PM -2 points [-]

I agree that it would be extremely difficult to find a world that, when completely and accurately described, would meet with effectively unconditional approval from both Rev. Dr. Martin Luther King, Jr. and a typical high-ranking member of the Ku Klux Klan. It's almost certainly beyond the ability of any single human to do so directly...

Why, we'd need some sort of self-improving superintelligence just to map out the solution space in sufficient detail! Furthermore, it would need to have an extraordinarily deep understanding of, and willingness to pursue, those values which all humans share.

If it turns out to be impossible, well, that sucks. Time to look for the next-best option.

If the superintelligence makes some mistake or misinterpretation so subtle that a hundred billion humans studying the timeline for their entire lives (and then some) couldn't spot it, how is that really a problem? I'm still not seeing how any machine could pass this test - 100% approval from the entire human race to date - without being Friendly.

Comment author: FAWS 11 March 2010 06:18:44PM *  2 points [-]

I agree that it would be extremely difficult to find a world that, when completely and accurately described, would meet with effectively unconditional approval from both Rev. Dr. Martin Luther King, Jr. and a typical high-ranking member of the Ku Klux Klan.

Straight up impossible if their (apparent) values are still the same as before and they haven't been misled. If one agent prefers the absence of A to its presence, and another agent prefers the presence of A to its absence, you cannot possibly satisfy both agents completely (without deliberately misleading at least one about A). The solution can always be trivially improved for at least one agent by adding or removing A.

Actually, now that you invoke the unknowability of the far-reaching capabilities of a superintelligence, I thought of a very slight possibility of a world meeting your definition even though people have mutually contradictory values:

The world could be deliberately set up in a way that even a neutral third party's description contained a fully general mind hack for human minds, so that the AI could adjust the values of the hypothetical people tested through the test itself. That's almost certainly still impossible, but far more plausible than a world meeting the definition without any changing values, which would require all apparent value disagreements to be illusions and the world not to work the way it appears to.

I think we can generalize that: dissolving an apparent impossibility through the creative power of a superintelligence should be far easier to do in an unfriendly way than in a friendly way, so a friendliness definition had better not contain any apparent impossibilities.