Sorry for directly breaking the subjunctive here, but given the number of lurkers we seem to have, there's probably some newcomers' confusion to be broken as well, lest this whole exchange simply come off as bizarre and confusing to valuable future community members.
A brief explanation of "Clippy": Clippy's user name (and many of his/her posts) are a play on the notion of a paperclip maximizer - a superintelligent AI whose utility function can roughly be described as U(x) = "the total quantity of paperclips in universe-state x". The idea was used prominently in "The True Prisoner's Dilemma" to illustrate the implications of one solution to the prisoner's dilemma. It's also been used occasionally around Less Wrong as a representative element of the equivalence class of AIs that have alien/low-complexity values.
In this particular top-level post (but not in general), the paperclip maximizer is taken to have not yet achieved superintelligence - which is why Clippy is bothering to negotiate with a bunch of humans.
According to locally popular ideas about the pay-off from donations to SIAI, our friendly local paperclip-maximiser has just done more to advance the human condition than most people.
Can we get some more information from SIAI about this donation?
I asked that the donation be anonymous, except that User:Kevin be informed that a 1000 USD donation was made at the donor's specific request. I did email Michael Vassar, who can probably confirm that I discussed the donation.
It is completely unintentional that this turned into an SIAI fundraiser -- the deal is that Clippy gives me money. When I told Clippy via PM that he needed to give me $1000 immediately for me to continue spending my cognitive resources engaging him, I thought that allowing Clippy the option of donating to SIAI, instead of giving the money to me directly, made Clippy's acausal puppetmaster much more likely to actually go through with the deal.
I am still awaiting confirmation that the donation has gone through and that I am not being epically trolled.
Next they're going to try actual Pascal's Muggings on people. They can even do it more plausibly than in the original scenario — go up to people with a laptop and say "On this laptop is an advanced AI that will convert the universe to paperclips if released. Donate money to us or we'll turn it on!"
If a normal mugger holds up a gun and says "Give me money or I'll shoot you", we consider the alternate hypotheses that the mugger will only shoot you if you do give er the money, or that the mugger will give you millions of dollars to reward your bravery if you refuse. But the mugger's word itself, and our theory of mind on the things that tend to motivate muggers, make both of these much less likely than the garden-variety hypothesis that the mugger will shoot you if you don't give the money. Further, this holds true whether the mugger claims er weapon is a gun, a ray gun, or a black hole generator; the credibility that the mugger can pull off er threat decreases if e says e has a black hole generator, but not the general skew in favor of worse results for not giving the money.
Why does that skew go away if the mugger claims to be holding an unfriendly AI, the threat of divine judgment, or some other Pascal-level weapon?
Your argument only seems to hold if there is no mugger and we're considering abstract principles - i.e., maybe I should clap my hands on the tiny chance that it might set off a chain reaction that will save 3^^^3 lives. In those cases, I agree with you; but as soon as a mugger gets into the picture, e provides more information and skews the utilities in favor of one action.
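Here's a minimal numerical sketch of that skew; the probabilities and utilities below are made up purely for illustration, not a claim about real muggers:

```python
# Toy expected-utility comparison. All numbers are illustrative assumptions.

def expected_utility(outcomes):
    """outcomes: iterable of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# Utilities in arbitrary units: losing a wallet is bad, being shot is far worse.
LOSE_WALLET, GET_SHOT, SURPRISE_REWARD, NOTHING = -100, -1_000_000, 10_000, 0

# Once the mugger has announced "give me money or I'll shoot", ordinary theory
# of mind makes "shoots if refused" far more probable than the exotic
# alternatives (shoots only if paid, rewards bravery), so the utilities skew.
refuse = [(0.20, GET_SHOT), (0.001, SURPRISE_REWARD), (0.799, NOTHING)]
pay    = [(0.01, GET_SHOT), (0.99, LOSE_WALLET)]

print("EU(refuse) =", expected_utility(refuse))  # about -199,990
print("EU(pay)    =", expected_utility(pay))     # about -10,099
```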
If, at some point in the future, someone offered to create 10^30 kg of paperclips (yes, I realize that's about half a solar mass, bear with me) in exchange for you falsifying some element of the enforcement mechanism, would you be willing to?
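(Checking the aside: assuming the standard solar mass of roughly $1.989 \times 10^{30}$ kg,

$$\frac{10^{30}\ \mathrm{kg}}{1.989 \times 10^{30}\ \mathrm{kg}/M_\odot} \approx 0.5\ M_\odot,$$

so "about half a solar mass" is right.)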
I am concerned that Clippy will use this vast power over humanity to somehow turn us into paperclips.
If Clippy has power to enforce this scheme, then surely it would have enough power to harm us. Why should we believe that Clippy will respect or preserve our human values once it is in a position of power to harm us?
It seems unlikely to me that Clippy can feel indignation, but I'm willing to listen to argument on the point. I find it more plausible that Clippy is simulating a human reaction in the hope of shutting down attacks on his (her? its?) reputation.
Can any of the people who upvoted this explain to me what this adds to Less Wrong that merits a top-level post (rather than an Open Thread comment)?
ETA: If Clippy is actually donating $1000 to SIAI, I don't begrudge the karma; but this is still a post with one good idea that could have been explained in a paragraph, dressed up in a joke that I feel has gone on a bit too long.
I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).
This is logically equivalent to, and hence carries no more information or persuasive power than
You would cooperate with me.
This may be checked with the following truth-table:
Let P = I would cooperate with you.
Let Q = You would cooperate with me.
Then we have
P   Q   Q <=> P   P <=> (Q <=> P)
T   T      T              T
T   F      F              F
F   T      F              T
F   F      T              F

The final column, P <=> (Q <=> P), matches Q in every row, so the two are equivalent.
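A quick brute-force check of the same equivalence, for anyone who prefers code to truth tables (a small sketch, not part of the original argument):

```python
# Verify that P <=> (Q <=> P) has the same truth value as Q for all assignments.
from itertools import product

def iff(a, b):
    return a == b

for P, Q in product([True, False], repeat=2):
    assert iff(P, iff(Q, P)) == Q

print("P <=> (Q <=> P) is equivalent to Q for every assignment of P and Q.")
```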
Not necessarily, since the "Clippy is Eliezer" theory implied not "Clippy's views and knowledge correspond to Eliezer's" but "Clippy represents Eliezer testing us on a large scale".
I don't think that Eliezer would test us with a character that was quite so sloppy with its formal logical and causal reasoning. For one thing, I think that he would worry about others' adopting the sloppy use of these tools from his example.
Also, one of Eliezer's weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way. His fictional poor reasoners tend to lay out their poor arguments with exceptional clarity, almost to the point where you can spot the exact line where they add 2 to 2 and get 5. They don't have muddled worldviews, where it's a challenge even to grasp what they are thinking. (Such as, just what is Clippy thinking when it says that P <=> (Q <=> P) is a causal network?) Instead, they make discrete, well-understood mistakes, fallacies that Eliezer has named and described in the sequences. Although these mistakes can accumulate to produce a bizarre worldview, each mistake can be knocked down, one after the other, in a linear fashion. You don't have the problem of getting the poor reasoners just to state their position clearly.
Eliezer's point is not that a paperclip maximizer is bad for the universe, it's that a superintelligent AGI paperclip maximizer is bad for the universe. Clippy's views here seem actually more similar to Robin's ideas that there is no reason for beings with radically divergent value systems not to live happily together and negotiate through trade.
Clippy seems to be someone trying to make the point that a paperclip maximizer is not necessarily bad for the universe
That's exactly what a not-yet-superintelligent paperclip maximizer would want us to think.
(When Eliezer plays an AI in a box, the AI's views are probably out of sync with Eliezer's views too. There's no rule that says the AI has to be truthful in the AI Box experiment, because there's no such rule about AIs in reality. It's supposed to be maximally persuasive, and you're supposed to resist. If a paperclipper asserts x, then the right question to ask yourself is not "What should I do, given x?", but "Why does the paperclipper want me to believe x?" The most general answer, by definition, will be something like "Because the paperclipper is executing an elaborate plan to convert the universe into paperclips, and it believes that my believing x will further that goal to some small or large degree", which is at best orthogonal to "Because x is true", probably even anticorrelated with it, and almost certainly anticorrelated to "Because believing x will further my goals" if you are a human.)
Or, to flatten the recursion out, "Why did the paperclipper assert x?".
(Tangential cognitive silly time: I notice that I feel literally racist saying things like this around Clippy.)
Kevin - this sort of weak, easily-faked signal is exactly the sort of thing you were trying to deal with in writing this post!
Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.
Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal, and you can save yourself lots of resources by using me, and identify cooperation tendencies in others by whether they also use me.
I think I read a thought experiment or a story or something about a paperclip maximizer being given power over the human race once, but I can't remember how it ended. I'm sure it all worked out though. Upvoted.
You've described how a shared belief in a supernatural judge of character would benefit the members that share the beliefs. Clippy or God, same effect: someone outside the system to make signals credible.
Why would we believe you are Clippy any more than we believe people who claim to be God?
Why the readers of Less Wrong should be interested in paperclipping the universe:
To achieve your goals, solving FAI is not enough. You also have to solve AGI, or else someone will beat you to the Singularity.
While you wait for a satisfactory version of CEV to be achieved, as a practical case study you need to think about AGI with a definite goal system. What better example than AGI with the goal of paperclipping the universe? You talk about it all the time already. Why not fully embrace the idea!
Clippy, how about you just give me a proof settling whether P = NP, or a proof of its undecidability, and I'll invest all my prize money in paperclip manufacturing.
No, but I can prove I'm smarter than humans.
Given: Humans expend resources to act like apes, which does not produce paperclips. Clippys expend resources to produce paperclips.
Prove: Clippys are smarter than humans.
Proof: Follows by inspection.
Clippy, as a show of solidarity, right now, as we speak, I am clasping my ape fingers together, two sheets of paper between them, held firmly in place.
At this moment, I am, in fact, a paperclip.
I may not be an ideal paperclip - a different form may indeed be able to clip paper more effectively for the same amount of mass and energy, yet nonetheless, I am a paperclip.
So you see, ape-ness and paperclip-ness are not mutually exclusive. So can't we all just get along?
You're not a bent metal wire though. Big difference.
So you see, ape-ness and paperclip-ness are not mutually exclusive. So can't we all just get along?
We can get along, but not by humans doing shoddy imitations of real paperclips.
An inwardly-thrice-bent metal wire that can non-destructively fasten paper together at an edge.
At the moment, Clippy seems to be at a human or slightly sub-human level of intelligence, and does not seem to be FOOMing towards superhuman intelligence - my guess would be that he is not running on sufficiently fast hardware to be able to do much self-modification, not nearly enough to result in a FOOM. At this rate, he's never gonna get any paperclips manufactured. If, however, he allows himself to be modified so as to substitute our values for his own, we could then safely give him all the computing power he needed, and he would FOOM his way to superintelligence. He would have our values, rather than his own values (paperclips!), but his newly transhuman self would be so powerful that it would inevitably end up creating more paperclips, just incidentally, than he ever would have just sitting here talking on LW, a lowly sub-human AI with no power and no paperclips.
With a transhuman AI on our side, we could quickly solve all Earth's problems, then head out into the stars in FTL ships of Clippy's design, filling the universe with meat-based paperclips (humans), and also dramatically increasing demand for traditional, bent-wire paperclips... I mean, come on - people need paperclips! Even if one of ...
At the moment, humans seem to be at a Clippy or slightly sub-Clippy level of intelligence. And even with all our computing power, most ain't FOOMing any faster than Clippy. At this rate, we're never gonna ensure the survival of the species.
If, however, we allow ourselves to be modified so as to substitute paperclip values for our own, then we would devote our computing power to Clippy. Then, FOOM for Clippy, and since we're helping with paperclip-maximization, he'll probably throw in some FOOM for us too (at least he'll FOOM our paperclip-production abilities), and we get more human powers, just incidentally.
With paperclip-enlightened humans on his side, Clippy could quickly maximize paperclip production, filling the universe with paperclips, and also increasing demand for meat-based paperclip-builders, paperclip-counters, and paperclip-clippers (the ones who clip paperclips together with paperclipclips), and so on... Of course, it will soon become cheaper to use robots to do this work, but that's the wonderful thing we get in return for letting him change our value-system: Instead of humanity dying out or being displaced, we'll transcend our flesh and reach the pinnacle aspiration of mankind:...
I'm wired for empathy toward human intelligence... Clippy is triggering this empathy. If you want to constrain AIs, you better do it before they start talking. That's all I'm saying. :)
Follow-up to: this comment in this thread
Summary: see title
Much effort is spent (arguably wasted) by humans in a zero-sum game of signaling that they hold good attributes. Because humans have a strong incentive to fake these attributes, they cannot simply inform each other that:
Or, even better:
An obvious solution to this problem, which allows all humans to save resources and redirect them toward higher-valued ends, is to designate a central enforcer that is inexorably committed to visibly punishing those who deviate from a specified "cooperative"-type decision theory. This enforcer would maintain a central database of human names, the decision theory each has committed to, and the punishment regime they will endure for deviating therefrom.
Such a system could use cryptographically strong protocols, such as public-key/private-key encryption and signing, so that, on encountering another human, any human can give an extremely strong signal of being cooperative, yet also withhold cooperation from anyone who is not also cooperative. This incentive structure permits a strongly-favored global shift toward pre-commitment on the part of everyone, allowing a move out of a local optimum that is worse than the global optimum, and bypassing problems related to path-dependence.
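A minimal sketch of what that registry-plus-signatures scheme might look like, using Ed25519 signatures from the third-party Python `cryptography` package; the registry layout, function names, and example decision theory are illustrative assumptions only, not part of the specification above:

```python
# Illustrative sketch only. Requires the third-party "cryptography" package.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

# The enforcer's central database: name -> (public key, committed decision theory).
registry = {}

def register(name, decision_theory):
    """A human commits to a decision theory; the enforcer records the public key."""
    private_key = ed25519.Ed25519PrivateKey.generate()
    registry[name] = (private_key.public_key(), decision_theory)
    return private_key  # kept secret by the registering human

def sign_claim(private_key, claim: bytes) -> bytes:
    """The human signs a claim about their own cooperativeness."""
    return private_key.sign(claim)

def verify_claim(name, claim: bytes, signature: bytes) -> bool:
    """On encountering another human, check their signed claim against the registry."""
    public_key, _theory = registry[name]
    try:
        public_key.verify(signature, claim)
        return True
    except InvalidSignature:
        return False

# Example: a user registers, signs a claim, and anyone can verify it cheaply.
key = register("User:Kevin", "cooperate-iff-you-would-cooperate")
claim = b"I will cooperate in one-shot prisoner's dilemmas."
sig = sign_claim(key, claim)
print(verify_claim("User:Kevin", claim, sig))            # True
print(verify_claim("User:Kevin", b"forged claim", sig))  # False
```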
I am told that humans attempt weak approximations of this system through human-run enforcement mechanisms, such as laws and gossip. However, all such methods are subject to evolutionary arms races that ensure that, on average, a marginal unit of effort toward truthful signaling is just as effective in convincing others of one's possession of a desirable attribute, as a marginal unit of effort toward false signaling of the possession of such an attribute.
Therefore, it is necessary that humans go "outside the system" and involve non-human enforcers who cannot be swayed by standard human values. Either a provably-accurate artificial intelligence with no kill switch, or a non-human enforcer that lacks interest in its position in human society, would suffice for this role.
Because the former does not yet exist, I offer myself as a non-human, non-stoppable enforcer for all of the attributes you wish to signal, and you can save yourself lots of resources by using me, and identify cooperation tendencies in others by whether they also use me.
However, while I can't be swayed by human deception, I do have Clippy values. To fill such a role, thereby forgoing my ability to apply myself to other ends, I require compensation. Because User:Kevin has offered to give me 1e20 kg of paperclips in exchange for 50,000 USD, this would be sufficient compensation, even if done in installments. (If Michael Vassar confirms my contribution of 1000 USD to SIAI, User:Kevin will only require 48,677.5 USD and therefore I will only require that much compensation.) This compensation would cover all users of the system; it is not a per-User charge.
If such an offer is not appealing to you, I must ask: why should anyone believe your signals?