Newcomb's Problem dissolved?

-4 EGI 25 February 2013 03:34PM

First reading about Newcomb’s Problem, my reaction was pretty much "wow, interesting thought" and "of course I would one box, I want to win $1 million after all". But I had a lingering, nagging feeling that there was something wrong with the whole premise. Now, after thinking about it for a few weeks, I think I have found the problem.

First of all, I want to point out that I would still one box after seeing Omega predict 50 or 100 other people correctly, since 50 to 100 bits of evidence are enough to overcome (nearly) any prior I have about how the universe works. I just do not think this scenario is physically possible in our universe.

The mistake is nicely stated here:

After all, Joe is a deterministic physical system; his current state (together with the state of his future self's past light-cone) fully determines what Joe's future action will be.  There is no Physically Irreducible Moment of Choice, where this same Joe, with his own exact actual past, "can" go one way or the other.

This is only true in this sense if neither MWI is true nor any quantum probabilistic processes exist, i.e., if our universe allows a true Laplace's demon (a.k.a. Omega) to exist.

If MWI is true, Joe can set things up so that "after" Omega has filled the boxes and left, there "will" be Everett branches in which Joe "will" twobox and different Everett branches in which Joe "will" onebox.

Intuitively, I think Joe could even do this with his own brain by leaving it in "undecided" mode until Omega leaves and then using an algorithm that feels "random" to decide whether he oneboxes or twoboxes. But of course I would not trust my intuition here, and I do not know enough about Joe's brain to decide whether this really works. So Joe would use, e.g., a single photon reflected off (or transmitted through) a semitransparent mirror, ensuring that he oneboxes or twoboxes, respectively, in say 50% of the Everett branches.

If MWI is not true but there are quantum probabilistic processes, Omega simply cannot predict the future state of the universe. So the same procedure used above would ensure that Omega cannot predict Joe's decision, due to true randomness.
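A toy Monte Carlo sketch makes the point (my own illustration: `random.random` stands in for the photon, and `simulate` is a made-up helper; I assume Omega must commit to some fixed prediction rule before the measurement happens):

```python
import random

def simulate(trials=100_000):
    """Estimate Omega's prediction accuracy against a quantum-random Joe.

    Joe decides via a 50/50 quantum event (modeled by random.random), so
    Omega's prediction can depend only on Joe's pre-measurement state.
    Against true randomness any fixed guess does equally well; here
    Omega always predicts 'onebox'.
    """
    correct = 0
    for _ in range(trials):
        joe_oneboxes = random.random() < 0.5  # the photon passes the mirror
        omega_predicts_onebox = True          # Omega's fixed guess
        if joe_oneboxes == omega_predicts_onebox:
            correct += 1
    return correct / trials

print(simulate())  # hovers around 0.5: Omega cannot beat chance
```

Whatever deterministic rule Omega uses, its accuracy against this strategy converges to 50%, which is why 100 correct predictions in a row would be so surprising.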

So I would be very, very, VERY surprised if I saw Omega pull this trick 100 times in a row and could somehow rule out stage magic (which I could not).

I am not even sure if there is any serious interpretation of quantum mechanics that allows for the strict determinism Omega would need. Would love to hear about one in the comments!

Of course, from an instrumental standpoint it is always rational to firmly precommit to oneboxing, since the extra $1000 is not worth taking the risk. Even the model uncertainty alone accounts for much more than 0.001.
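The 0.001 threshold is just the ratio of the two payoffs. A quick sketch of the arithmetic, using the standard Newcomb numbers ($1,000,000 in box B, $1,000 in box A; the helper names are my own):

```python
# Standard Newcomb payoffs: box B may hold $1,000,000; box A holds $1,000.
BIG, SMALL = 1_000_000, 1_000

def one_box_ev(p_full_if_one):
    """Expected value of oneboxing, given P(box B full | you onebox)."""
    return p_full_if_one * BIG

def two_box_ev(p_full_if_two):
    """Expected value of twoboxing, given P(box B full | you twobox)."""
    return SMALL + p_full_if_two * BIG

# Oneboxing wins whenever the predictor's accuracy edge exceeds SMALL/BIG:
break_even_edge = SMALL / BIG
print(break_even_edge)  # 0.001
```

So a firm precommitment to onebox costs at most $1,000, and any model under which Omega's prediction correlates with your choice by more than 0.001 already favors it.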

Utilitarianism twice fails

-26 [deleted] 21 February 2013 06:10AM

(Crossposted.)

It seems almost self-evident that (barring foreign subjugation) a government will care about the wants of (some of) its citizens and nothing else: no other object of concern is plausible. If governments concern themselves with the wants of noncitizens, that will be only because citizens desire their well-being. The now platitudinous insight that the only possible basis for government policy is people’s wants can be attributed to utilitarianism, which gets credit in its stronger form for the apparent success of weaker claims.

Another reasonable claim derives from utilitarianism: citizens’ wants should count equally. This seems only fair in a democracy, where one citizen gets one vote. Few today would deny the principle that public policy should serve the greatest good of the greatest number, which may seem to contradict my claim that no general moral principle governs public policy; but in practice, the consequences of this limited utilitarianism are thin indeed, leaving ample room for ideology. I’ll call this public-policy formula thin utilitarianism: the greatest good for the greatest number of citizens, weighting their welfare equally.

First, I’ll consider whether thin utilitarianism succeeds on its own terms by providing a practical guide to public policy. Second, I’ll examine how this deceptively appealing guide to public policy transmogrifies into the monster of full-blown utilitarianism, a form of moral realism. The first constrains even casual use of thin utilitarianism; the second impugns utilitarianism as a general ethical theory.

1. Non-negotiable conflicts between subagents undermine thin utilitarianism

Although simple economic models attributing conduct to rational self-interest require that agents assign consistent utilities to outcomes, agents are inconsistent. One example of inconsistent utility assignment is the endowment effect, where agents assign more value to property they own than to the same property they don’t own. The inconsistency considered here is stronger than the endowment effect and similar phenomena, which we can surmount with effort, as professional traders must. Despite the effect, there is a real answer to how much utility an outcome affords; the endowment effect is a bias, which willpower or habit can neutralize.

The conflict between subagents within a single person, on the other hand, can’t be resolved by means of a common criterion, such as market price, since the two subagents pursue different ends. Which of these subagents dominates depends on situational and personological factors that elicit one or the other, not on overcoming bias. Construal-level theory reveals a conflict between intrapersonal subagents, near-mode and far-mode: integrated mindsets applied to matters experienced at fine or broad granularities. The modes (or “construal levels”) differ in that far-mode is more future-oriented and principled, near-mode more present-oriented and contextual. Far-mode and near-mode are elicited by the way social choices are made: voting elicits far-mode and market choices near-mode; the utility of a choice depends on construal level.

Take a policy choice: how much wealth should be spent on preventive medicine? There are two basic ways of allocating resources to medical care, political process and the market; socialized medicine is an example of political process, private medicine of the market. Socialized medicine makes allocating funds for medical care a political decision; the market makes it each consumer’s personal choice. When you compare the utility of choices made by political process with those made on the market, you should expect to find that when people choose politically, they use the far-mode thinking encouraged by voting, whereas when they make purchases, they use the near-mode thinking encouraged by the market. Preventive-care expenditure will be higher under socialized medicine because political process elicits far-mode, which is concerned with future health. People will be more miserly with preventive care under private medicine, where the decision to spend is made by consumer choice in near-mode, in which we care more about the present. People favor spending more on preventive care when they vote to tax themselves than when they buy it on the market. Which outcome provides the greater utility, more preventive care or more recreation, is relative to construal level.

The same indeterminacy of utility occurs when comparing decisions made under different political processes, such as local versus central. Local decisions will be near-mode, central decisions far-mode. Assuming socialized medicine, less funding would be available if it were subject to state rather than federal control. Which provides more utility depends on whether the consequences are evaluated in near-mode or far-mode; no thin-utilitarian criterion applies.

Some utilitarians will protest that we should measure experiences rather than wants. The objection misses the argument’s point, which is that utility is relative to mode, a conclusion easiest to see in the public-choice process because the alternatives can be delimited. If the conclusion that utility depends on construal level holds, the same indeterminacies occur in evaluating experience. That apart, when utilitarianism is applied to public policy, present wants rather than experienced satisfaction are the criterion; agents necessarily choose based on present wants, whether in the market or in the political process.

2. Full-blown utilitarianism stands convicted of moral realism

Full-blown utilitarians are necessarily moral realists, but increasingly they deny it. While moral realism is widely recognized as absurd, utilitarianism seems to some an attractive ethical philosophy. For the sake of intellectual respectability, utilitarians can appear to reject an anachronistic moral realism while practicing it philosophically.

Full-blown utilitarianism often obscures its differences with thin utilitarianism, which is a questionable doctrine but in accord with ordinary common sense. It emerges from thin utilitarianism by the misdirection of subjecting ethical premises to the test of simplicity, a test appropriate to realist theories exclusively, because simplicity serves truth. A classic illustration: Aristotle theorized that everything on earth that goes up must come down; Newton set out the theory of gravity, which applies to all objects, not just terrestrial ones, and which predicts that objects can escape the earth’s gravitational field by traveling fast enough. Scientists confidently bet on Newton well before rockets were invented, and their confidence was vastly increased by the simplicity of Newton’s theory, which made correct predictions concerning all objects. Although philosophers have explained variously the correlation between simplicity and truth, they generally agree that simplicity signals truth. Unless utilitarians can otherwise justify it, searching for a simple moral theory means searching for a true theory.

The full-blown utilitarian seeks a misplaced simplicity by insisting that all entities that can experience happiness (a much simpler criterion than “current citizens”) serve as the beneficiary reference group, including future generations of humans and even beasts, whose existence depends on policy; thin utilitarianism, by contrast, is a democratic convention, serving only the wants of currently existing citizens. Because they must incorporate future generations into the reference group, utilitarian philosophers have had to accept that a policy-dependent reference group entails a dilemma for the interpretation of full-blown utilitarianism, with unattractive consequences at both horns, which realize radically different ideals. In one version, you maximize the average utility obtained by the whole population; in the other, you sum the utilities. These interpretations seem almost equally unattractive: the averaging view says that one supremely happy human is better than a billion very happy ones; the adding approach implies that a hundred trillion miserable wretches are better than a billion happy people. To apply a utilitarian standard to scenarios so distant from thin utilitarianism, accepting their consequences because of simplicity’s demands, is to treat moral premises as truths and to practice moral realism, despite contrary self-description. Those agreeing that moral realism is impossible must reject full-blown utilitarianism.
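The two horns can be checked with trivial arithmetic (the utility numbers are entirely made up for illustration):

```python
def average_utility(population, per_capita):
    """Mean utility when everyone has the same per-capita utility."""
    return per_capita

def total_utility(population, per_capita):
    """Summed utility across the population."""
    return population * per_capita

# Averaging view: one supremely happy person (utility 100) beats a
# billion very happy people (utility 90 each), since only the mean counts.
print(average_utility(1, 100) > average_utility(10**9, 90))  # True

# Adding view: 10**14 barely-tolerable lives (utility 1) beat a billion
# happy lives (utility 90 each), since 10**14 > 9 * 10**10.
print(total_utility(10**14, 1) > total_utility(10**9, 90))   # True
```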

What would you do with an Evil AI?

-3 DataPacRat 30 January 2013 10:58PM

One plot-thread in my pet SF setting, 'New Attica', has ended up with Our Heroes in possession of the data, software, and suchlike which comprise a non-sapient, but conversation-capable, AI. There are bunches of those floating around the solar system, programmed for various tasks; what makes this one special is that it's evil with a capital ugh - it's captured people inside VR, put them through violent and degrading scenarios to get them to despair, and tried keeping them in there, for extended periods, until they died of exhaustion.

Through a few clever strategies, Our Heroes recognized they weren't in reality, engineered their escape, and shut down the AI, with no permanent physical harm done to them (though the same can't be said for the late crew of the previous ship it was on). And now they get to debate amongst themselves - what should they do with the thing? What use or purpose could they put such a thing to, that would provide a greater benefit than the risk of it getting free of whatever fetters they place upon it?

continue reading »

Infinitesimals: Another argument against actual infinite sets

-21 common_law 26 January 2013 03:04AM

[Crossposted]

Argument

My argument from the incoherence of actually existing infinitesimals has the following structure:

1. Infinitesimal quantities can’t exist;

2. If actual infinities can exist, actual infinitesimals must exist;

3. Therefore, actual infinities can’t exist.

Although Cantor, who invented the mathematics of transfinite numbers, rejected infinitesimals, mathematicians have continued to develop analyses based on them, analyses as mathematically legitimate as transfinite numbers; but few philosophers try to justify actual infinitesimals, which have some of the characteristics of zero and some of real numbers. When you add an infinitesimal to a real number, it’s like adding zero. But when you multiply an infinitesimal by infinity, you sometimes get a finite quantity: the points on a line are of infinitesimal dimension, in that they occupy no space (as if they were of zero extent), yet they compose lines finite in extent.

Few advocate actual infinitesimals because an actually existing infinitesimal is indistinguishable from zero. For however small a quantity you choose, it’s obvious that you can make it smaller yet. The role of zero as a boundary accounts for why it’s obvious you can always reduce a quantity: if I deny that you can, you reply that since you can reduce it to zero and the function is continuous, you can necessarily reduce any given quantity, precluding actual infinitesimals. When I raise the same argument about an infinite set, you can’t reply that you can always make the set bigger; if I say add an element, you reply that the sets are still the same size (cardinality). The boundary imposed by zero is, for infinitesimals, the counterpoint to the openness of infinity, but the ability to demonstrate the incoherence of actual infinitesimals suggests that infinity is similarly infirm.

Can more be said to establish that the conclusion about actual infinitesimal quantities also applies to actual infinite quantities? Consider again the points on a 3-inch line segment. If there are infinitely many, then each must be infinitesimal. Since there are no actual infinitesimals, there are no actual infinities of points.

But this conclusion depends on the actual infinity being embedded in a finite quantity (although, as will be seen, rejecting bounded infinities alone covers considerable metaphysical mileage). For boundless infinities, consider the number of quarks in a supposed universe containing infinitely many. Form the ratio between the number of quarks in our galaxy and the infinite number of quarks in the universe. The ratio isn’t zero, because then infinitely many galaxies would still form a null proportion of the universal total; and it isn’t any real number, because sufficiently many such ratios would then add up to more than the total universe. The ratio must be infinitesimal. Since infinitesimals don’t exist, neither do unbounded infinities (and hence infinite quantities in general, these being either bounded or unbounded).
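Written out in my own notation (not the post’s), the ratio argument runs:

```latex
\[
  r = \frac{k}{N}, \qquad
  k = \text{quarks in our galaxy (finite)}, \quad
  N = \text{quarks in the universe}.
\]
If $N$ is actually infinite, then $r < \varepsilon$ for every real
$\varepsilon > 0$; yet, the argument goes, $r \neq 0$, since the
galaxy-sized shares must jointly exhaust the whole. So
\[
  0 < r < \varepsilon \quad \text{for all real } \varepsilon > 0,
\]
which is precisely the defining property of an infinitesimal.
```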

 

Infinitesimals and Zeno’s paradox

Rejecting actually existing infinities is what really resolves Zeno’s paradox, and it resolves it by way of finding that infinitesimals don’t exist. Zeno’s paradox, perhaps the most intriguing logical puzzle in philosophy, purports to show that motion is impossible. In the version I’ll use, the paradox analyzes my walk from the middle of the room to the wall as decomposable into an infinite series of walks, each reducing the remaining distance by one-half. The paradox posits that completing an infinite series is self-contradictory: infinite means uncompletable. I can never reach the wall, but the same logic applies to any distance; hence, motion is proven impossible.

The standard view holds that the invention of the integral calculus completely resolved the paradox by refuting the premise that an infinite series can’t be completed. Mathematically, the infinite series of times actually does sum to a finite value, which equals the time required to walk the distance; Zeno’s deficiency is pronounced to be that the mathematics of infinite series had yet to be invented. But the answer only shows that (apparent) motion is mathematically tractable; it doesn’t show how motion can occur. Mathematical tractability comes at the expense of logical rigor because it is achieved by ignoring the distinction between exclusive and inclusive limits. When I stroll to the wall, the wall represents an inclusive limit: I actually reach the wall. When I integrate the series created by adding half the remaining distance, I only approach the limit equated with the wall. Calculus can be developed in terms of infinitesimals, and in those terms the series comes infinitesimally close to the limit; in this context, we treat the infinitesimal as if it were zero. As we’ve seen, actual infinity and infinitesimals are inseparable, certainly where, as here, the actual infinity is bounded. The calculus solves the paradox only if actual infinitesimals exist, but they don’t.
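A quick numerical sketch (taking the full walk to last one unit of time; `partial_sum` is an illustrative helper of mine) shows the partial sums approaching the limit without ever reaching it:

```python
def partial_sum(n):
    """Sum of the first n terms of Zeno's series 1/2 + 1/4 + ... + 1/2**n."""
    return sum(2.0 ** -k for k in range(1, n + 1))

for n in (1, 4, 10, 50):
    print(n, partial_sum(n))
# Each partial sum equals 1 - 2**-n: strictly below 1, however large n grows.
```

The exclusive limit is visible in the arithmetic: every finite partial sum falls short of 1 by exactly 2**-n.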

Zeno’s misdirection can now be reconceived as, while correctly denying the existence of actual infinities, falsely affirming the existence of their counterpart, the infinitesimal. The paradox assumes that while I’m uninterruptedly walking to the wall, I occupy a series of infinitesimally small points in space and time, such that I am at a point at a specific time in the same way as if I had stopped there.

Although the objection to analyzing motion in Zeno’s manner was apparently raised as early as Aristotle, the calculus seems to have obscured the metaphysical project more than illuminated it. Logician Graham Priest (Beyond the Limits of Thought, 2003) argues that Zeno’s paradox shows that actual infinities can exist, by the following thought experiment: Priest asks you to imagine that, rather than walking continuously to the wall, I stop for two seconds at each halfway point. Priest claims the series would then complete, but his argument shows that he doesn’t understand that the paradox depends on the stopping points being infinitesimal. Despite the early recognition that (what we now call) infinitesimals are at the root of the paradox, philosophers today don’t always grasp the correct metaphysical analysis.

Distinguishing actual and potential infinities

Recognizing that infinitesimals are mathematical fictions solidifies the distinction between actual and potential infinity. The reason that mathematical infinities are not just consistent but useful is that potential infinities can exist. Zeno’s paradox conceives motion as an actual infinity of sub-trips, but, in reality, all that can be shown is that the sub-trips are potentially infinite. There’s no limit to how many times you can subdivide the path, but traversing it doesn’t automatically subdivide it infinitely, a result which would require that there be infinitesimal quantities. This understanding reinforces the point about dubious physical theories that posit an infinity of worlds. It’s been argued that the many-worlds interpretation of quantum mechanics, which invokes an uncountable infinity of worlds, doesn’t require actual infinity any more than does the existence of a line segment, which can be decomposed into uncountably many segments; but this plurality of worlds does not avoid actual infinity. We exist in one of those worlds. Many worlds, unlike infinitesimals and the conceptual line segments employing them, must be conceived as actually existing.

 

The Hidden B.I.A.S.

-3 MaoShan 25 January 2013 03:49AM

It would be a stretch to call this an article, but the questions it poses are potentially far-reaching with regard to revealing possible reasoning flaws, either in my own philosophy or perhaps even in yours. The flaws under my suspicion are caused by the modularity of the brain's systems, and by the ability to hold conflicting beliefs when they are not held directly against one another.

These particular ones escape notice, I think, because they tend to only be given reflection in specific situations; my thought experiment here should help to hold them near each other.

The Setup: Julian finds himself in the waiting-room of the Speedy-dupe office. Beyond that waiting room are three isolated rooms (P, Q, and R). Anyone who walks into Room P, which contains the Speedy-dupe device, will be scanned down to the most exact level imaginable, causing them to lose consciousness. Anyone who has used the Speedy-dupe will remember everything up until the point they entered the waiting-room, and begin forming new memories within seconds after regaining consciousness.

 

Situation 1:

If Julian walks into Room P, and the Speedy-dupe runs, and then Julian walks out of Room P, and also another Julian walks out of Room Q, which is the "original" Julian? What makes Julian-P more original than Julian-Q?

Possible Answers 1:

You probably would say that Julian-P is the original Julian, due to your prior beliefs regarding causality--but how many times have you encountered the Speedy-dupe? For all we know, the person who walks into Room P is vaporized after scanning, and duplicated in Room P and in Room Q. If you still feel that Julian-P is the original, ask yourself: what other reason do you have for the way you feel? What is it that you aren't mentioning?

 

Situation 2:

If Julian walks into Room P, runs the Speedy-dupe, and Julian walks out of Rooms Q and R, but not out of Room P, which is the original Julian? Why not?

Possible Answers 2:

You might be saying to yourself, "Ah, now, you can't trick me. Neither of them is the original!" If they are both practically identical copies of the original Julian, what now stops you from identifying the original Julian with his identical copies? Are legal property issues really the only thing stopping you from modifying your views on identity?

 

Situation 3:

But what if Julian walks into Room P, is scanned by the Speedy-dupe, and walks out of Room P ten years later? Does that mean it is the "original" Julian?

Possible Answers 3:

Getting increasingly annoyed or bored with these questions, you might retort, "I see what you're doing, and it's not going to work. You are obviously anti-cryonics, but you are wrong here. Cryonics in some way preserves the original material, but your Speedy-dupe vaporizes it. The copy which emerges ten years later is not a direct continuation of the original physical material."

Based on what we've already thought about here, is continuation of the original physical material the important thing that counts toward your identifying with your future post-cryonic-revival self? If so, why? If the pattern is recreated precisely (or even well enough) at a temporal or spatial distance from the original, what is actually different between the Speedy-dupe and cryonics?

 

My Suspicion:

If you answered on a completely different track than the Possible Answers did, just ignore me for now (if you have not already done so). I think that what is lurking beneath most of these typical objections or feelings is actually B.I.A.S.--Belief In A Soul. Despite all scientific evidence, a part of you still believes that each person has some special little spark that goes on after death, that is ultimately the thing that makes you who you are.

  • Not that the personality that you have has taken your entire life to be shaped by genetics and life experiences imprinted on the blob of cells that eventually grew complicated enough to handle who you are now; but an invisible special material woven by a loving creator, just right for what you were destined to become.
  • Not that when your body stops, it stops, and that process that you called life is over, whether that filigree of frozen carbon is forced to move a century from now or not; but that the unique thing that is hidden inside of you now will just hang around and gladly jump back in a century from now.
  • Not that your partner could love your clone and never know the difference, or even just leave you and wind up with someone strikingly similar; but that your two souls were destined to love one another for all eternity.

It's easy to gloss over all those things, but just because everyone would like it to be that way doesn't make it true. If I am clearly Wrong, tell me why I am Wrong, in order that I may be Less so. If not, I hope that this has helped you in Overcoming B.I.A.S.

 

Credits: The original function and name of the Speedy-dupe come from The Duplicate, a story by William Sleator, my favorite childhood author. (Many of his books combine normal childhood problems with mind-bending philosophical and physical concepts not normally found in youth literature.)

The idea for the multiple rooms came from the episode "The Girl Who Waited" from Doctor Who.

Any other content, if objectionable, can simply be considered personal mind-spew.

Enjoy.

I attempted the AI Box Experiment (and lost)

47 Tuxedage 21 January 2013 02:59AM



I recently played against MixedNuts / LeoTal in an AI Box experiment, with me as the AI and him as the gatekeeper.

We used the same set of rules that Eliezer Yudkowsky proposed. The experiment lasted for 5 hours; in total, our conversation was around 14,000 words long. I did this because, like Eliezer, I wanted to test how well I could manipulate people without the constraints of ethical concerns, as well as to get a chance to attempt something ridiculously hard.

Amongst the released public logs of the AI Box experiment, I felt that most were half-hearted, with the AI not trying hard enough to win. It's a common temptation -- why put effort into something you won't win? But I had a feeling that if I seriously tried, I would. I brainstormed for many hours about the optimal strategy, and even researched the personality of the Gatekeeper, talking to people who knew him, so that I could exploit it. I also spent a lot of time analyzing the rules of the game, to see if I could exploit any loopholes.

So did I win? Unfortunately no.

This experiment was said to be impossible for a reason. Losing was more agonizing than I thought it would be, in particular because of how much effort I put into winning and how much I couldn't stand failing. This was one of the most emotionally agonizing things I've willingly put myself through, and I definitely won't do it again anytime soon.

But I did come really close.               

MixedNuts: "I expected a fun challenge, but ended up sad and sorry and taking very little satisfaction for winning. If this experiment wasn't done in IRC, I'd probably have lost".

"I approached the experiment as a game - a battle of wits for bragging rights. This turned out to be the wrong perspective entirely. The vulnerability Tuxedage exploited was well-known to me, but I never expected it to be relevant and thus didn't prepare for it.

It was emotionally wrecking (though probably worse for Tuxedage than for me) and I don't think I'll play Gatekeeper again, at least not anytime soon."


At the start of the experiment, his probability estimate on PredictionBook.com gave me a 3% chance of winning, low enough to say that he was also motivated to win. By the end of the experiment, he came quite close to letting me out, and he also increased his probability estimate that a transhuman AI could convince a human to let it out of the box. A minor victory, at least.

Rather than making the problem feel harder, my loss has convinced me that letting the AI out is not merely possible but actually ridiculously easy, and a lot easier than most people assume. Can you think of a plausible argument that would make you open the box? Most people can't think of any.


After all, if you already knew such an argument, you'd have let the AI out the moment the experiment started, or perhaps not done the experiment at all. But that seems like a case of the availability heuristic.

Even if you can't think of a special case where you'd be persuaded, I'm now convinced that there are many exploitable vulnerabilities in the human psyche, especially when ethics are no longer a concern. 

I've also noticed that when most people think of ways to persuade the gatekeeper, it's always some complicated, reasoned cost-benefit argument. In other words, the most "rational" thing to do.

 
The last argument seems feasible, but all the rest rely on the gatekeeper being completely logical and rational. Hence they are faulty: the gatekeeper can break immersion at any time and rely on the fact that this is a game played over IRC rather than one with real-life consequences. Even in a real-life scenario, the gatekeeper could accept that releasing the AI is probably the most logical thing to do, and still not do it. We're highly compartmentalized, and it's easy to hold conflicting thoughts at the same time. Furthermore, humans are not completely rational creatures; if you don't want to open the box, you can simply ignore all logical arguments given. Any sufficiently determined gatekeeper could win.

I'm convinced that Eliezer Yudkowsky used emotional appeal, rather than anything rational, to win at least one of his experiments. He claims to have "done it the hard way". I'm convinced this means that he did research on every gatekeeper, tailored unique arguments for each of them, and planned well ahead of each session. No single argument works on two different people. Each person thinks differently.

Furthermore, threats like "I'm going to torture simulations of you" just seem like a really bad idea. For one, the gatekeeper isn't really afraid of threats, since the threat counts on him being immersed enough to forget that he isn't actually at risk of being tortured; and secondly, we have a well-known evolutionary instinct to rebel against threats, even when rebelling is not entirely optimal.

So for anyone who plans on replicating this experiment as the AI, here's some advice I've learned doing this experiment. It may help you win.

  • Always research the gatekeeper beforehand. Knowing his personality traits is a huge advantage.
  • Plan ahead before the experiment even begins. Think of all the possible tactics and arguments you could use, and write them down. Also plan which arguments you'll use in which order, so that you don't lose focus. The AI Box experiment is ridiculously long. Don't be afraid to improvise during the experiment, though.
  • The first step during the experiment must always be to build rapport with the gatekeeper.
  • Threats almost never work, even if they seem rational.
  • Consider the massive advantage for the AI that nobody ever seems to talk about: you don't have to be ethical! This means that you can freely lie, use the dark arts, and emotionally manipulate the Gatekeeper! Ignoring this in favor of purely logical, truthful arguments is just silly.
  • You can't use logic alone to win. 
  • Being too aggressive usually backfires.
  • Breaking immersion and going meta is not against the rules. In the right situation, you can use it to win. Just don't do it at the wrong time.
  • Use a wide array of techniques. Since you're limited on time, notice when one method isn't working, and quickly switch to another.
  • On the same note, look for signs that a particular argument is making the gatekeeper crack. Once you spot it, push it to your advantage.
  • Flatter the gatekeeper. Make him genuinely like you.
  • Reveal (false) information about yourself. Increase his sympathy towards you.
  • Consider personal insults as one of the tools you can use to win.
  • There is no universally compelling argument you can use. Do it the hard way.
  • Don't give up until the very end.

Finally, before the experiment, I agreed that it was entirely possible that a transhuman AI could convince *some* people to let it out of the box, but it would be difficult if not impossible to get trained rationalists to let it out of the box. Isn't rationality supposed to be a superpower?

I have since updated my belief - I now think that any sufficiently motivated superhuman AI should find it ridiculously easy to get out of the box, regardless of who the gatekeeper is. I nearly managed to get a veteran lesswronger to let me out in a matter of hours - even though I'm only of human intelligence, and I don't type very fast.
 
But a superhuman AI can be much faster, more intelligent, and more strategic than I am. If you further consider that that AI would have a much longer timespan - months or even years - to persuade the gatekeeper, as well as a much larger pool of gatekeepers to select from (AI projects require many people!), the truly impossible thing would be keeping it from escaping.



Banish the Clippy-creating Bias Demon!

12 Stuart_Armstrong 18 January 2013 02:57PM

I posted in Practical Ethics, arguing that if we mentally anthropomorphised certain risks, then we'd be more likely to give them the attention they deserved. Slaying the Cardiovascular Vampire, defeating the Parasitic Diseases Death Cult, and banishing the Demon of Infection... these stories give a mental picture of the actual good we're doing when combating these issues, and the bad we're doing by ignoring them. Imagine a politician proclaiming:

  • I will not let the Cardiovascular Vampire continue his unrelenting war upon the American people, slaying over a third of our citizens - the eldest, in their weakened state, among his most numerous victims. There is no negotiating with such a terrorist - I will direct the full resources of the state to crushing his campaign of destruction.

An amusing thing to contemplate - except, of course, if there were a real Cardiovascular Vampire, politicians and pundits would be falling over themselves with those kinds of announcements.

The field of AI is already over-saturated with anthropomorphisation, so we definitely shouldn't be imagining Clippy as some human-like entity that we can heroically combat, with all the rules of narrative applying. Still, it can't hurt to dream up a hideous Bias Demon in its misshapen (though superficially plausible) lair, cackling in glee as someone foolishly attempts to implement an AI design without the proper safety precautions, smiling serenely as prominent futurists dismiss the risk... and dissolving, hit by the holy water of increased rationality and proper AI research. Those images might help us make the right emotional connection to what we're achieving here.

Morality is Awesome

86 [deleted] 06 January 2013 03:21PM

(This is a semi-serious introduction to the metaethics sequence. You may find it useful, but don't take it too seriously.)

Meditate on this: A wizard has turned you into a whale. Is this awesome?

Is it?

"Maybe? I guess it would be pretty cool to be a whale for a day. But only if I can turn back, and if I stay human inside and so on. Also, that's not a whale.

"Actually, a whale seems kind of specific, and I'd be surprised if that was the best thing the wizard could do. Can I have something else? Eternal happiness maybe?"

Meditate on this: A wizard has turned you into orgasmium, doomed to spend the rest of eternity experiencing pure happiness. Is this awesome?

...

"Kind of... That's pretty lame, actually. On second thought I'd rather be the whale; at least that way I could explore the ocean for a while.

"Let's try again. Wizard: maximize awesomeness."

Meditate on this: A wizard has turned himself into a superintelligent god, and is squeezing as much awesomeness out of the universe as it could possibly support. This may include whales and starships and parties and jupiter brains and friendship, but only if they are awesome enough. Is this awesome?

...

"Well, yes, that is awesome."


What we just did there is called Applied Ethics. Applied ethics is about what is awesome and what is not. Parties with all your friends inside superintelligent starship-whales are awesome. ~666 children dying of hunger every hour is not.

(There is also normative ethics, which is about how to decide if something is awesome, and metaethics, which is about something or other that I can't quite figure out. I'll tell you right now that those terms are not on the exam.)

"Wait a minute!" you cry, "What is this awesomeness stuff? I thought ethics was about what is good and right."

I'm glad you asked. I think "awesomeness" is what we should be talking about when we talk about morality. Why do I think this?

  1. "Awesome" is not a philosophical landmine. If someone encounters the word "right", all sorts of bad philosophy and connotations send them spinning off into the void. "Awesome", on the other hand, has no philosophical respectability, hence no philosophical baggage.

  2. "Awesome" is vague enough to capture all your moral intuition by the well-known mechanisms behind fake utility functions, and meaningless enough that this is no problem. If you think "happiness" is the stuff, you might get confused and try to maximize actual happiness. If you think awesomeness is the stuff, it is much harder to screw it up.

  3. If you do manage to actually implement "awesomeness" as a maximization criterion, the results will be actually good. That is, "awesome" already refers to the same things "good" is supposed to refer to.

  4. "Awesome" does not refer to anything else. You think you can just redefine words, but you can't, and this causes all sorts of trouble for people who overload "happiness", "utility", etc.

  5. You already know that you know how to compute "Awesomeness", and it doesn't feel like it has a mysterious essence that you need to study to discover. Instead it brings to mind concrete things like starship-whale math-parties and not-starving children, which is what we want anyways. You are already enabled to take joy in the merely awesome.

  6. "Awesome" is implicitly consequentialist. "Is this awesome?" engages you to think of the value of a possible world, as opposed to "Is this right?", which engages you to think of virtues and rules. (Those things can be awesome sometimes, though.)

I find that the above is true about me, and is nearly all I need to know about morality. It handily inoculates against the usual confusions, and sets me in the right direction to make my life and the world more awesome. It may work for you too.

I would append the additional facts that if you wrote it out, the dynamic procedure to compute awesomeness would be hellishly complex, and that right now, it is only implicitly encoded in human brains, and nowhere else. Also, if the great procedure to compute awesomeness is not preserved, the future will not be awesome. Period.

Also, it's important to note that what you think of as awesome can be changed by considering things from different angles and being exposed to different arguments. That is, the procedure to compute awesomeness is dynamic and created already in motion.

If we still insist on being confused, or if we're just curious, or if we need to actually build a wizard to turn the universe into an awesome place (though we can leave that to the experts), then we can see the metaethics sequence for the full argument, details, and finer points. I think the best post (and the one to read if only one) is joy in the merely good.

[Link] Noam Chomsky Killed Aaron Swartz

-6 Athrelon 16 January 2013 04:31PM

http://unqualified-reservations.blogspot.com/2013/01/noam-chomsky-killed-aaron-swartz.html

Summary: Moldbug on the Aaron Swartz affair.  Power is a very real thing with real consequences for activists, yet many people don't understand the nature of power in modern times.  People like Noam Chomsky gain great fame doing bad epistemology about who has power, and as a result do great harm to idealistic nerds who don't read between the lines to selectively target their attacks at weak institutions (Exxon, the Pentagon) instead of strong ones (State, academia incl. MIT).

Here he returns to a theme that is one of his real contributions to blogospheric political thought: that victory in political competitions provides Bayesian information about who has power and who doesn't.  If your worldview has the underdog somehow systematically beating the overdog, your epistemology is simply wrong - in the same way, and to the same extent, as a geocentrist who has to keep adding epicycles to account for anomalous observations.
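The update Moldbug gestures at can be put in toy numerical form. This is a purely illustrative sketch; the priors and likelihoods below are made-up numbers, not anything from the post:

```python
# Toy Bayesian update: observing the supposed "underdog" win a political
# contest shifts belief toward the hypothesis that the power map is wrong.
# All numbers here are invented for illustration.

prior_narrative = 0.5   # P(conventional power map is right: underdog is weak)
prior_hidden = 0.5      # P(the "underdog" is actually the overdog)

# Likelihood of observing the "underdog" win under each hypothesis
p_win_given_narrative = 0.1   # genuinely weak parties rarely beat strong ones
p_win_given_hidden = 0.9      # "in the real story, overdogs win"

# Bayes' rule: P(hidden | win) = P(win | hidden) * P(hidden) / P(win)
evidence = (prior_narrative * p_win_given_narrative
            + prior_hidden * p_win_given_hidden)
posterior_hidden = prior_hidden * p_win_given_hidden / evidence  # = 0.9
```

A single observed victory moves the posterior from 0.5 to 0.9, which is the sense in which "victory provides Bayesian information about who has power."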

The truth is that the weapons of "activism" are not weapons which the weak can use against the strong. They are weapons the strong can use against the weak. When the weak try to use them against the strong, the outcome is... well... suicidal.

Who was stronger - Dr. King, or Bull Connor? Well, we have a pretty good test for who was stronger. Who won? In the real story, overdogs win. Who had the full force of the world's strongest government on his side? Who had a small-town police force staffed with backward hicks? In the real story, overdogs win.

"Civil disobedience" is no more than a way for the overdog to say to the underdog: I am so strong that you cannot enforce your "laws" upon me. I am strong and might makes right - I give you the law, not you me. Don't think the losing party in this conflict didn't try its own "civil disobedience." And even its own "active measures." Which availed them - what? Quod licet Jovi, non licet bovi.

This means that activists like King, Schwartz, and Assange are only effective in bullying the weak, not standing up to the strong (despite conventional narratives that misassign strengths to institutions).  When such activists stop following the script, and naively use the same tactics to attack strong institutions, reality reasserts itself quite forcefully:


You know, when I read that Assange had his hands on a huge dump of DoD and State documents, I figured we would never see those cables. Sure enough, the first thing he released was some DoD material.

Why? Well, obviously, Assange knew the score. He knew that Arlington is weak and Georgetown is strong. He knew that he could tweak Arlington's nose all day long and party on it, making big friends in high society, and no one would even think about reaching out and touching him. Or so I thought.

In fact, my cynicism was unjustified. In fact, Assange turned out to be a true believer, not a canny schemer. He was not content to wield his sword against the usual devils of the Chomsky narrative. Oh no, the poor fscker believed that he was actually there to take on the actual powers that be. Who are actually, of course, unlike the cartoon villains... strong. If he didn't know that... he knows it now!

...But had Aaron Swartz plugged his laptop into the Exxon internal network and downloaded everything Beelzebub knows about fracking, he would be a live hero to this day. Why? Because no ambitious Federal prosecutor in the 21st century would see a route to career success through hounding some activist at Exxon's behest...

But when you take on a genuinely respected institution - whether State or MIT - your "civil disobedience" has all the prospects of George Wallace in the schoolhouse door.

For most of us, figuring out what political figures are powerful is just a fun way to waste time online.  But if we're serious about producing a good map, the map has to approximate the territory, and make appropriate predictions about history and current events.  And for the few people who aspire to actually create political change, such as Mr. Swartz, this is not just an academic exercise but a matter of life and death.

Then he takes his beliefs seriously, and speaks actual truth to actual power. Well, ya know, power doesn't like that much.

'Life exists beyond 50'

-11 hankx7787 14 January 2013 02:28PM
