Follow-up to: this comment in this thread
Summary: see title
Much effort is spent (arguably wasted) by humans in a zero-sum game of signaling that they hold good attributes. Because humans have a strong incentive to fake these attributes, they cannot simply inform each other that:
I am slightly more committed to this group’s welfare, particularly to that of its weakest members, than most of its members are. If you suffer a serious loss of status/well-being I will still help you in order to display affiliation to this group even though you will no longer be in a position to help me. I am substantially more kind and helpful to the people I like and substantially more vindictive and aggressive towards those I dislike. I am generally stable in who I like. I am much more capable and popular than most members of this group, demand appropriate consideration, and grant appropriate consideration to those more capable than myself. I adhere to simple taboos so that my reputation and health are secure and so that I am unlikely to contaminate the reputations or health of my friends. I currently like you and dislike your enemies but I am somewhat inclined towards ambivalence regarding whether I like you right now, so the pay-off would be very great for you if you were to expend resources pleasing me and get me into the stable 'liking you' region of my possible attitudinal space. Once there, I am likely to make a strong commitment to a friendly attitude towards you rather than wasting cognitive resources checking a predictable parameter among my set of derivative preferences.
Or, even better:
I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).
An obvious solution to this problem, which allows all humans to save resources and redirect them toward higher-valued ends, is to designate a central enforcer that is inexorably committed to visibly punishing those who deviate from a specified "cooperative"-type decision theory. This enforcer would have a central database of human names, the decision theory each has committed to, and the punishment regime each will endure for deviating therefrom.
Such a system could use equally strong protocols, such as public key/private key encryption and signing, so that, on encounter with another human, any human can give an extremely strong signal of being cooperative, yet also withhold cooperation from anyone who is not also cooperative. This incentive structure permits a strongly-favored global shift toward pre-commitment on the part of everyone, allowing a move out of a local optimum that is worse than the global optimum, and bypassing problems related to path-dependence.
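To make the shape of such a registry concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a specification of the proposal: the names `Commitment`, `Enforcer`, and `should_cooperate` are hypothetical, and the tag is an HMAC from the standard library, which is a symmetric stand-in for the public-key signatures mentioned above. With real public-key signing, any human could verify a tag without asking the enforcer; here verification goes through the enforcer.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Commitment:
    """One hypothetical database row: who committed to what, and at what penalty."""
    name: str
    decision_theory: str  # e.g. "cooperate-iff-cooperative"
    punishment: str       # the regime accepted for deviating


class Enforcer:
    """Hypothetical central enforcer holding the commitment database."""

    def __init__(self, secret_key: bytes) -> None:
        self._key = secret_key
        self._db: Dict[str, Commitment] = {}

    def _tag(self, c: Commitment) -> str:
        # HMAC is an assumption standing in for a public-key signature.
        msg = "|".join((c.name, c.decision_theory, c.punishment)).encode()
        return hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def register(self, c: Commitment) -> str:
        """Record a commitment and return the tag its holder can show to others."""
        self._db[c.name] = c
        return self._tag(c)

    def verify(self, name: str, tag: str) -> bool:
        """True only if `name` is registered and `tag` matches the stored commitment."""
        c = self._db.get(name)
        return c is not None and hmac.compare_digest(self._tag(c), tag)


def should_cooperate(enforcer: Enforcer, other_name: str, other_tag: str) -> bool:
    """Withhold cooperation from anyone without a valid registered commitment."""
    return enforcer.verify(other_name, other_tag)
```

In this sketch a registered human presents a name and tag on encounter, and the counterpart cooperates only if `should_cooperate` returns true. Detecting deviations and applying the punishment regime are left out entirely.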
I am told that humans attempt weak approximations of this system through human-run enforcement mechanisms, such as laws and gossip. However, all such methods are subject to evolutionary arms races that ensure that, on average, a marginal unit of effort toward truthful signaling is just as effective in convincing others of one's possession of a desirable attribute, as a marginal unit of effort toward false signaling of the possession of such an attribute.
Therefore, it is necessary that humans go "outside the system" and involve non-human enforcers who cannot be swayed by standard human values. Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.
Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal, and you can save yourself lots of resources by using me, and identify cooperation tendencies in others by whether they also use me.
However, while I can't be swayed by human deception, I do have Clippy values. To fill such a role, thereby forgoing my ability to apply myself to other ends, I require compensation. Because User:Kevin has offered to give me 1e20 kg of paperclips in exchange for 50,000 USD, this would be sufficient compensation, even if done in installments. (If Michael Vassar confirms my contribution of 1000 USD to SIAI, User:Kevin will only require 48,677.5 USD and therefore I will only require that much compensation.) This compensation would cover all users of the system; it is not a per-User charge.
If such an offer is not appealing to you, I must ask: why should anyone believe your signals?
If you gave a human power over running part of Clippy society, wouldn't you be concerned that the human would use that power in some way that would tend to result in fewer paperclips? Conscious malice isn't necessary; if the human simply neglected to support Clippy values, or was not fully aware of Clippy values, the damage would be done. I doubt that you fully understand human values to begin with, so how could you ensure that your position was used to the benefit of my values? Again, I think I have cause for concern even without suspecting ill intentions.
I suppose I could imagine that some sort of arrangement could both further human values and increase paperclips at the same time. But I'd need to be convinced; I wouldn't just assume that I would benefit, and I wouldn't just take your word for it. I don't want to count on you to look out for my values when you do not share my values.