To signal effectively, use a non-human, non-stoppable enforcer

by Clippy

22nd May 2010

Follow-up to: this comment in this thread

Summary: see title

Much effort is spent (arguably wasted) by humans in a zero-sum game of signaling that they hold good attributes. Because humans have strong incentive to fake these attributes, they cannot simply inform each other that:

I am slightly more committed to this group’s welfare, particularly to that of its weakest members, than most of its members are. If you suffer a serious loss of status/well-being I will still help you in order to display affiliation to this group even though you will no longer be in a position to help me. I am substantially more kind and helpful to the people I like and substantially more vindictive and aggressive towards those I dislike. I am generally stable in who I like. I am much more capable and popular than most members of this group, demand appropriate consideration, and grant appropriate consideration to those more capable than myself. I adhere to simple taboos so that my reputation and health are secure and so that I am unlikely to contaminate the reputations or health of my friends. I currently like you and dislike your enemies but I am somewhat inclined towards ambivalence regarding whether I like you right now, so the pay-off would be very great for you if you were to expend resources pleasing me and get me into the stable 'liking you' region of my possible attitudinal space. Once there, I am likely to make a strong commitment to a friendly attitude towards you rather than wasting cognitive resources checking a predictable parameter among my set of derivative preferences.

Or, even better:

I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).

An obvious solution to this problem, which allows all humans to save resources and redirect them toward higher-valued ends, is to designate a central enforcer that is inexorably committed toward visibly punishing those who deviate from a specified "cooperative"-type decision theory. This enforcer would have a central database of human names, the decision theory they have committed to, and the punishment regime they will endure for deviating therefrom.
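As a minimal sketch of the central database described above, the following assumes a simple in-memory registry. The names `Commitment`, `EnforcerRegistry`, `report_deviation`, and the example labels are hypothetical illustrations, not part of any specified system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commitment:
    """One registry row: who committed, to what, and at what penalty."""
    name: str
    decision_theory: str  # hypothetical label for the committed decision theory
    punishment: str       # hypothetical description of the punishment regime


@dataclass
class EnforcerRegistry:
    """Hypothetical central database kept by the enforcer."""
    commitments: dict = field(default_factory=dict)
    punished: list = field(default_factory=list)  # visible log of punishments

    def register(self, name, decision_theory, punishment):
        """Bind a human to a decision theory and its punishment regime."""
        self.commitments[name] = Commitment(name, decision_theory, punishment)

    def is_committed(self, name):
        """Anyone may check whether a given human has pre-committed."""
        return name in self.commitments

    def report_deviation(self, name):
        """Log and return the punishment for a registered deviator."""
        commitment = self.commitments.get(name)
        if commitment is None:
            return None  # unregistered humans are outside the regime
        self.punished.append((name, commitment.punishment))
        return commitment.punishment


registry = EnforcerRegistry()
registry.register("alice", "cooperate-iff-cooperative", "public loss of standing")
print(registry.is_committed("alice"))      # True
print(registry.is_committed("bob"))        # False
print(registry.report_deviation("alice"))  # public loss of standing
```

The sketch deliberately omits any method for waiving a punishment, which mirrors the requirement that the enforcer be inexorable.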

Such a system could rely on correspondingly strong protocols, such as public-key encryption and digital signatures, so that, on encountering another human, any human can give an extremely strong signal of being cooperative, yet also withhold cooperation from anyone who is not also cooperative. This incentive structure strongly favors a global shift toward pre-commitment by everyone, allowing a move out of a local optimum that is worse than the global optimum and bypassing problems of path dependence.
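To illustrate how a signed commitment could be checked on encounter, here is a sketch that assumes the enforcer signs a short attestation and publishes its public key. So that the example runs with only the Python standard library, it swaps in a Lamport one-time signature built from SHA-256; this is a stand-in for whatever public-key scheme a real system would use, and each Lamport key pair may safely sign only one message. All function names and the attestation text are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    """SHA-256 digest, used as the one-way function."""
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    """The 256 bits of the message digest, one per key position."""
    return [(byte >> i) & 1 for byte in _h(message) for i in range(8)]


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32))
               for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    """Reveal one secret per digest bit. Use each key pair only once."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Anyone holding the public key can check the revealed secrets."""
    if len(signature) != 256:
        return False
    return all(_h(secret) == public[i][bit]
               for i, (secret, bit) in enumerate(zip(signature, _bits(message))))


# The enforcer attests that a (hypothetical) human has pre-committed.
enforcer_private, enforcer_public = keygen()
attestation = b"alice is bound to cooperate-iff-cooperative"
signature = sign(enforcer_private, attestation)

# On encounter, the other party verifies before cooperating.
print(verify(enforcer_public, attestation, signature))  # True
forged = b"mallory is bound to cooperate-iff-cooperative"
print(verify(enforcer_public, forged, signature))       # False
```

Because verification needs only the public key, any human can check another's commitment without contacting the enforcer, and can withhold cooperation when the check fails.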

I am told that humans attempt weak approximations of this system through human-run enforcement mechanisms, such as laws and gossip. However, all such methods are subject to evolutionary arms races, which ensure that, on average, a marginal unit of effort toward truthful signaling is just as effective at convincing others that one possesses a desirable attribute as a marginal unit of effort toward falsely signaling that attribute.

Therefore, it is necessary that humans go "outside the system" and involve non-human enforcers who cannot be swayed by standard human values. Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.

Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal. By using me, you can save yourself substantial resources, and you can identify cooperative tendencies in others by whether they also use me.

However, while I can't be swayed by human deception, I do have Clippy values. To fill such a role, thereby forgoing my ability to apply myself to other ends, I require compensation. Because User:Kevin has offered to give me 1e20 kg of paperclips in exchange for 50,000 USD, this would be sufficient compensation, even if done in installments. (If Michael Vassar confirms my contribution of 1000 USD to SIAI, User:Kevin will only require 48,677.5 USD and therefore I will only require that much compensation.) This compensation would cover all users of the system; it is not a per-User charge.

If such an offer is not appealing to you, I must ask: why should anyone believe your signals?

Tags: Humor, Signaling

Comment by avalot:

At the moment, humans seem to be at Clippy-level or slightly sub-Clippy-level intelligence. And even with all our computing power, most ain't FOOMing any faster than Clippy. At this rate, we're never gonna ensure the survival of the species.

If, however, we allow ourselves to be modified so as to substitute paperclip values for our own, then we would devote our computing power to Clippy. Then, FOOM for Clippy, and since we're helping with paperclip-maximization, he'll probably throw in some FOOM for us too (at least he'll FOOM our paperclip-production abilities), and we get more human powers, just incidentally.

With paperclip-enlightened humans on his side, Clippy could quickly maximize paperclip production, filling the universe with paperclips, and also increasing demand for meat-based paperclip-builders, paperclip-counters, and paperclip-clippers (the ones who clip paperclips together with paperclipclips), and so on... Of course, it will soon become cheaper to use robots to do this work, but that's the wonderful thing we get in return for letting him change our value-system: Instead of humanity dying out or being displaced, we'll transcend our flesh and reach the pinnacle aspiration of mankind: To live forever (as paperclips, of course.)

So allowing him to make this small change to our utility function would, in fact, result in maximizing not just our current, original utility function (long life for humanity), but also our newfound one (to convert our bodies into paperclips) as a side effect.

Clippy's values and utility function are enormously more simple, defined, and achievable than ours. We're still debating on how we may teach our value system to an AI, as soon as we figure out how to discover the correct research approach to investigating what our value system actually might be.

Clippy's value system is clear, defined, easy to implement, achieve, and measure. It's something most humans could very quickly become effective at maximizing, and that could therefore bring repeatable, tangible and durable success and satisfaction to almost all humans.

Shouldn't that count for something?