The difference between the dust specks and the white room is that in the case of the dust specks, each experience is happening to a different person. The arbitrarily big effect comes from your consideration of arbitrarily many people - if you wish to reject the arbitrarily big effect, you must reject the independence of how you care about people.
In the case of the white room, everything's happening to you. The arbitrarily big effect comes from your consideration of obtaining arbitrarily many material goods. If you wish to reject the arbitrarily big effect, you must reject the independence of how you care about each additional Mona Lisa. But in this case, unlike in dust specks, there's no special reason to have that independence in the first place.
Now, if the room were sufficiently uncomfortable, maybe I'd off Frank - as long as I was sure the situation wasn't symmetrical. But I don't think we need surreal numbers to describe why, if I get three square meals a day in the white room, I won't kill Frank just to get an infinite amount of food.
To show that my utility for Frank is infinite you have to establish that I wouldn't trade an arbitrarily small probability of his death for the nanofab. I would make the trade at sufficiently small probabilities.
Also, the surreal numbers are almost always unnecessarily large. Try the hyperreals first.
I'd kill Frank.
ETA: Even if I'd be the only sentient being in the entire nanofabbed universe, it's still better than 2 people trapped in a boring white room, either forever or until we both die of dehydration.
Would you trade a Mona Lisa picture for a 1/3^^^3 chance of saving Frank's life?
Are you using the same kind of decision-making in your real life?
Problem: People in real life choose the equivalent of the fabricator over Frank all the time, assuming "choosing not to intervene to prevent a death" is equivalent to choosing the fabricator...
Also, people accept risks to their own life all the time.
This seems to me obviously very wrong. Here's why. (Manfred already said something kinda similar, but I want to be more explicit and more detailed.)
My utility function (in so far as I actually have one) operates on states of the world, not on particular things within the world.
It ought to be largely additive for mostly-independent changes to the states of different bits of the world, which is why arguably TORTURE beats DUST SPECKS in Eliezer's scenario. (I won't go further than "arguably"; as I said way back when Eliezer first posted that, I don't trust any bit of my moral machinery in cases so far removed from ones I and my ancestors have actually encountered; neither the bit that says "obviously different people's utility changes can just be added up, at least roughly" nor the bit that says "obviously no number of dust specks can be as important as one instance of TORTURE".)
But there's no reason whatever why I should value 100 comfy cushions any more at all than 10 comfy cushions. There's just me and Frank; what is either of us going to do with a hundred cushions that we can't do with 10?
Maybe that's a bit of an exaggeration; perhaps with 100 cushions...
I guess I'm thinking about this wrong. I want to either vaporize Frank or have Frank vaporize me for the same deal. I prefer futures with fewer, happier minds. IOW, I guess I accept Nozick's utility monster.
You need to specify what happens if you decline the offer. Right now it looks as if you and Frank both die of dehydration after a couple of days. Or you go insane and one kills the other (and maybe eats him). And then dies anyway. In order for this to be a dilemma, the baseline outcome needs to be more... wholesome.
Also, the temptation isn't very tempting. An ornate chandelier? I could get some value from the novelty of seeing it and maybe staring at it for several hours if it's really ornate. Its status as a super-luxury good would be worthless in the a...
Note: I think that the fact that there are only two lives/minds mentally posited in the problem, "You" and "Frank" may significantly modify the perceived value of lives/minds.
After all, consider these problems:
1: The white room contains you, and 999 other people. The cost of the Nanofab is 1 life.
2: The white room contains you, and 999 other people. The cost of the Nanofab is 2 lives.
3: The white room contains you, and 999 other people. The cost of the Nanofab is 500 lives.
4: The white room contains you, and 999 other people. The cost o...
I liked this post. The white room doesn't really seem to work so well as an intuition pump, but it's good that someone has brought up the idea of using surreal utilities.
Since they lead to these tiers, within which trade-offs happen normally but across which you don't trade, it would be interesting to see whether we actually find that. We might want to trade n lives for n+1 lives, but what other sacred values do humans have, and how do they behave?
The problem with your "white room" scenario is that one human can't actually have Large amounts of utility. The value of the 3^^^3th seat cushion is actually, truly zero.
an eternity of non-boredom
I don't think I could possibly get that in a room containing no other minds.
Decision theory with ordinals is actually well-studied and commonly used, specifically in language and grammar systems. See papers on Optimality Theory.
The resolution to these "tier" problems is assigning every "constraint" (thing that you value) an abstract variable, generating a polynomial algebra in some ungodly number of variables, and then assigning a weight function to that algebra, which is essentially assigning every variable an ordinal number, as you've been doing.
Just as perspective on the abstract problem there are two confou...
The tiered values approach appears to run into continuity troubles, even with surreal numbers.
Seelie 2.0 double checks with its mental copy of your values, finds that you would rather have Frank's life than infinite Fun, and assigns it a tier somewhere in between - for simplicity, let's say that it puts it in the √ω tier. And having done so, it correctly refuses Omega's offer.
How does it compare punching/severely injuring/torturing Frank with your pile of cushions or with infinite fun? What if there is a .0001%/1%/99% probability that Frank will die?
I would say the most obvious flaw with surreal utilities (or, generally, pretty much anything other than real utilities) is simply that you can't sensibly do infinite sums or limits or integration, which is after all what expected value is, which is the entire point of a utility function. If there are only finitely many possibilities you're fine, but if there are infinitely many possibilities you are stuck.
Anyone new to this page: I'm basically talking about Hausner utilities, except with surreal numbers needlessly slapped on.
Could utilities be multi-dimensional? Real vector spaces are much nicer to work with than surreal numbers.
For example, the utility for Frank being alive would be (1,0), while the utility for a seat cushion is (0,1). Using lexicographic ordering, (1,0) > (0,3^^^3).
Vector valued utility functions violate the VNM axiom of continuity, but who cares.
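As a quick illustration of this proposal (my own toy sketch, not anything from the thread - Python tuples already compare lexicographically):

```python
# Toy sketch of lexicographically ordered utility vectors.
# Index 0 is the "sacred" tier (lives), index 1 the "secular" tier
# (cushions); Python compares tuples lexicographically out of the box.

frank_alive = (1, 0)            # one life, no cushions
cushion_pile = (0, 10 ** 100)   # no lives, an absurd number of cushions

assert frank_alive > cushion_pile        # lives dominate any cushion count

# Options that tie on the higher tier fall through to the lower tier:
assert (1, 3) > (1, 2) > (0, 10 ** 100)

# Addition stays componentwise, so trade-offs inside a tier work normally:
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

assert add(frank_alive, cushion_pile) == (1, 10 ** 100)
```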
I give seat cushions zero value. I give the comfort they bring me zero value. The only valuable thing about them is the happiness they bring from the comfort. Unless the nanofab can make me as happy as my current happiness plus Frank's combined, nothing it makes will be worth it. It probably could, but that's not the point.
As for the idea of surreal utilities, there's nothing wrong with it in principle. The axiom they violate isn't anything particularly bad to violate. The problem is that, realistically speaking, you might as well just round infinitesimal ...
I would like to point out that Fun was listed as a separate tier, and that whether or not to put it on the same tier as a human life is entirely up to you. Surreal utilities aren't much of a decision theory, they're just a way to formalize tiered values; the actual decision you make depends entirely on the values you assign by some other method.
To me there is a very big difference between 0 probability and an exact infinitesimal probability, and I disagree that it is obvious they suffer from the same problems.
For example, if I have a unit line and choose some particular point, the probability of picking some exact point is epsilon. If I were to pick a point from a unit square, the probability would be epsilon times smaller again, for a total of epsilon*epsilon. If I were to pick a point from a line of length 2, the probability would only be half that, for a total of epsilon/2.
When usage of infinitesimal proba...
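In symbols (my notation for the commenter's example, writing ε for the infinitesimal unit):

$$
P(\text{a given point on a unit line}) = \varepsilon, \qquad
P(\text{a given point in a unit square}) = \varepsilon^2, \qquad
P(\text{a given point on a line of length } 2) = \frac{\varepsilon}{2},
$$

so events that all get probability 0 in the reals still compare and scale sensibly against one another.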
While you might not kill Frank to get the machine, there has to be some small epsilon such that you would take the machine in exchange for an additional probability epsilon of Frank dying today. Wouldn't Frank agree to that trade? Wouldn't you agree to a small additional probability of dying yourself in exchange for the machine?
Otherwise, living in a big white room is going to be a bit - ahem - dull for both of you.
I agree there is a difficulty here for any utility function. The machine can make unlimited quantities of secular goods, so if 3^^^3 beautiful...
I decided some time ago that I don't really care about morality, because my revealed preferences say I care a lot more about personal comfort than saving lives and I'm unwilling to change that. I don't think I'd be willing to spend £50 to save the life of an anonymous stranger that I'd never meet, if I found out about a charity that efficient, so for the purposes of a thought experiment I should also be willing to kill Frank for such a small amount of money, assuming social and legal consequences are kept out of the way by Omega, and the utility of possibly...
In effect, the sacred value has infinite utility relative to the secular value.
That's not an accurate representation of how humans value sacred values. There are cases where people value getting X sacred utilitons over getting X sacred utilitons + Y secular utilitons.
Emerging sacred values: Iran's nuclear program by Morteza Dehghani is a good read for getting a sense of how sacred values behave.
Sacred values prevent corruption.
Thank you, if for nothing else, for clarifying my intuitive sense that Dust Specks are superior to Torture. Your thought experiment clarified to me that tiers of utility DO match my value system.
An alternate title for this post was "Surreal Utilities and Seat Cushions."
On a side note - I am not entirely sure what tags to apply here, and I couldn't seem to find an exhaustive tag list (though I admittedly didn't work very hard.)
This article was composed after reading Torture vs. Dust Specks and Circular Altruism, at which point I noticed that I was confused.
Both posts deal with versions of the sacred-values effect, where one value is considered "sacred" and cannot be traded for a "secular" value, no matter the ratio. In effect, the sacred value has infinite utility relative to the secular value.
This is, of course, silly. We live in a scarce world with scarce resources; generally, a secular utilon can be used to purchase sacred ones - giving money to charity to save lives, sending cheap laptops to poor regions to improve their standard of education.
Which implies that the entire idea of "tiers" of value is silly, right?
Well... no.
One of the reasons we are not still watching the Sun revolve around us, while we breathe a continuous medium of elemental Air and phlogiston flows out of our wall-torches, is our ability to simplify problems. There's an infamous joke about the physicist who, asked to measure the volume of a cow, begins "Assume the cow is a sphere..." - but this sort of simplification, willfully ignoring complexities and invoking the airless, frictionless plane, can give us crucial insights.
Consider, then, this gedankenexperiment. If there's a flaw in my conclusion, please explain; I'm aware I appear to be opposing the consensus.
The Weight of a Life: Or, Seat Cushions
This entire universe consists of an empty white room, the size of a large stadium. In it are you, Frank, and occasionally an omnipotent AI we'll call Omega. (Assume, if you wish, that Omega is running this room in simulation; it's not currently relevant.) Frank is irrelevant, except for the fact that he is known to exist.
Now, looking at our utility function here...
Well, clearly, the old standby of using money to measure utility isn't going to work; without a trading partner money's just fancy paper (or metal, or plastic, or whatever.)
But let's say that the floor of this room is made of cold, hard, and decidedly uncomfortable Unobtainium. And while the room's lit with a sourceless white glow, you'd really prefer to have your own lighting. Perhaps you're an art aficionado, and so you might value Omega bringing in the Mona Lisa.
And then, of course, there's Frank's existence. That'll do for now.
Now, Omega appears before you, and offers you a deal.
It will give you a nanofab - a personal fabricator capable of creating anything you can imagine from scrap matter, and with a built-in database of stored shapes. It will also give you feedstock - as much of it as you ask for. Since Omega is omnipotent, the nanofab will always complete instantly, even if you ask it to build an entire new universe or something, and it's bigger on the inside, so it can hold anything you choose to make.
There are two catches:
First: the nanofab comes loaded with a UFAI, which I've named Unseelie.[1]
Wait, come back! It's not that kind of UFAI! Really, it's actually rather friendly!
... to Omega.
Unseelie's job is to artificially ensure that the fabricator cannot be used to make a mind; attempts at making any sort of intelligence, whether directly, by making a planet and letting life evolve, or anything else a human mind can come up with, will fail. It will not do so by directly harming you, nor will it change you in order to prevent you from trying; it only stops your attempts.
Second: you buy the nanofab with Frank's life.
At which point you send Omega away with a "What? No!," I sincerely hope.
Ah, but look at what you just did. Omega can provide as much feedstock as you ask for. So you just turned down ornate seat cushions. And legendary carved cow-bone chandeliers. And copies of every painting ever painted by any artist in any universe, which is actually quite a bit less than anything I could write with up-arrow notation but anyway!
I sincerely hope you would still turn Omega away - literally, absolutely regardless of how many seat cushions it offered you.
This is also why the nanofab cannot create a mind: You do not know how to upload Frank (and if you do, go out and publish already!); nor can you make yourself an FAI to figure it out for you; nor, if you believe that some number of created lives are equal to a life saved, can you compensate in that regard. This is an absolute trade between secular and sacred values.
In a white room, to an altruistic human, a human life is simply on a second tier.
So now we move to the next half of the gedankenexperiment.
Seelie the FAI: Or, How to Breathe While Embedded in Seat Cushions
Omega now brings in Seelie[1], MIRI's latest attempt at FAI, and makes it the same offer on your behalf. Seelie, being a late beta release by a MIRI that has apparently managed to release FAI multiple times without tiling the Solar System with paperclips, competently analyzes your utility system, reduces it until it understands you several orders of magnitude better than you do yourself, turns to Omega, and accepts the deal.
Wait, what?
On any single tier, the utility of the nanofab is infinite. In fact, let's make that explicit, though it was already implicitly obvious: if you just ask Omega for an infinite supply of feedstock, it will happily produce it for you. No matter how high a number Seelie assigns the value of Frank's life to you, the nanofab can out-bid it, swamping Frank's utility with myriad comforts and novelties.
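To spell out the swamping arithmetic (my own restatement, not in the original): on a single tier, Frank is worth some finite $V$ and each cushion some $u > 0$ on the same scale, so

$$
n \cdot u > V \quad \text{whenever } n > V/u,
$$

and Omega can supply any $n$ you care to name.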
And so the result of a single-tier utility system is that Frank is vaporized by Omega and you are drowned in however many seat cushions Seelie thought Frank's life was worth to you, at which point you send Seelie back to MIRI and demand a refund.
Tiered Values
At this point, I hope it's clear that multiple tiers are required to emulate a human's utility system. (If it's not, or if there's a flaw in my argument, please point it out.)
There's an obvious way to solve this problem, and there's a way that actually works.
The first solves the obvious flaw: after you've tiled the floor in seat cushions, there's really not a lot of extra value in getting some ridiculous Knuthian number more. Similarly, even the greatest da Vinci fan will get tired after his three trillionth variant on the Mona Lisa's smile.
So, establish the second tier by playing with a real-valued utility function. Ensure that no summation of secular utilities can ever add up to a human life - or whatever else you'd place on that second tier.
But the problem here is, we're assuming that all secular values converge in that way. Consider novelty: perhaps, while other values out-compete it for small values, its value to you diverges with quantity; an infinite amount of it, an eternity of non-boredom, would be worth more to you than any other secular good. But even so, you wouldn't trade it for Frank's life. A two-tiered real AI won't behave this way; it'll assign "infinite novelty" an infinite utility, which beats out its large-but-finite value for Frank's life.
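A pair of stand-in functions (purely illustrative, my own) shows both halves of the problem: a converging secular good can be capped below the sacred threshold, but a diverging one cannot.

$$
U_{\text{cushions}}(n) = C\,(1 - e^{-n}) < C < U_{\text{Frank}} \ \text{ for every } n,
\qquad
U_{\text{novelty}}(n) \to \infty \ \text{ as } n \to \infty,
$$

so for large enough (or infinite) $n$, the two-tiered real AI must rank novelty above Frank's life, even though you wouldn't.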
Now, you could add a third (or 1.5) tier, but now we're just adding epicycles. Besides, since you're actually dealing with real numbers here, if you're not careful you'll put one of your new tiers in an area reachable by the tiers before it, or else in an area that reaches the tiers after it.
On top of that, we have the old problem of secular and sacred values. Sometimes a secular value can be traded for a sacred value, and therefore has a second-tier utility - but as just discussed, that doesn't mean we'd trade the one for the other in a white room. So for secular goods, we need to independently keep track of its intrinsic first-tier utility, and its situational second-tier utility.
So in order to eliminate epicycles, and retain generality and simplicity, we're looking for a system that has an unlimited number of easily-computable "tiers" and can also naturally deal with utilities that span multiple tiers. Which sounds to me like an excellent argument for...
Surreal Utilities
Surreal numbers have two advantages over our first option. First, surreal numbers are dense in tiers - between any two tiers there is always room for another, as in 1 ≪ √ω ≪ ω ≪ ω^2 - so not only do we have an unlimited number of tiers, we can always create a new tier between any other two on the fly if we need one. Second, since the surreals are closed under addition, we can just sum up our tiers to get a single surreal utility.
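To make this concrete, here is a hedged sketch (mine, not the post's; the class and variable names are invented) of the only fragment of the surreals the scheme actually uses: finite sums of real coefficients times powers of ω, with rational exponents playing the role of tiers.

```python
from fractions import Fraction

class TieredUtility:
    """Finite sums  sum_i c_i * omega**e_i  (real c_i, rational tier e_i)."""

    def __init__(self, terms):
        # terms: {tier exponent: coefficient}; zero coefficients are dropped.
        self.terms = {Fraction(e): c for e, c in terms.items() if c != 0}

    def __add__(self, other):
        # Closure under addition: just merge the coefficient tables.
        total = dict(self.terms)
        for e, c in other.terms.items():
            total[e] = total.get(e, 0) + c
        return TieredUtility(total)

    def __neg__(self):
        return TieredUtility({e: -c for e, c in self.terms.items()})

    def sign(self):
        # The coefficient on the highest occupied tier decides the sign.
        if not self.terms:
            return 0
        return 1 if self.terms[max(self.terms)] > 0 else -1

    def __lt__(self, other):
        return (self + (-other)).sign() < 0


# Frank's life sits a whole tier above any finite pile of cushions:
frank = TieredUtility({1: 1})             # omega
cushions = TieredUtility({0: 3 ** 27})    # stand-in for "absurdly many"
assert cushions < frank
assert cushions + cushions < frank        # no amount of summing helps

# Density: a brand-new tier always fits between two existing ones.
fun = TieredUtility({Fraction(1, 2): 1})  # omega**(1/2)
assert cushions < fun < frank
```

The full surreals are vastly bigger than this, of course; the finite-support powers-of-ω slice above is just the part tiered values seem to need.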
So let's return to our white room. Seelie 2.0 is harder to fool than Seelie; 3^^^3 seat cushions is still less than the omega-utility of Frank's life. Even when Omega offers an unlimited store of feedstock, Seelie can't ask for an infinite number of seat cushions - so the total utility of the nanofab remains bounded at the first tier.
Then Omega offers Fun. Simply, an Omega-guarantee of an eternity of Fun-Theoretic-Approved Fun.
This offer really is infinite. Assuming you're an altruist, your happiness presumably has a finite, first-tier utility, but it's being multiplied by infinity. So infinite Fun gets bumped up a tier.
At this point, whatever algorithm is setting values for utilities in the first place needs to notice a tier collision. Something has passed between tiers, and utility tiers therefore need to be refreshed.
Seelie 2.0 double checks with its mental copy of your values, finds that you would rather have Frank's life than infinite Fun, and assigns it a tier somewhere in between - for simplicity, let's say that it puts it in the √ω tier. And having done so, it correctly refuses Omega's offer.
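A toy illustration of the re-seating step (a hypothetical helper of my own, not the post's mechanism): because the tier exponents are rationals, there is always room to mint a fresh tier between any two existing ones.

```python
from fractions import Fraction

def new_tier_between(lo, hi):
    # Density of the rationals: the midpoint is a perfectly good new tier.
    lo, hi = Fraction(lo), Fraction(hi)
    return (lo + hi) / 2

# Infinite Fun collided with the secular tier (0) and Frank's tier (1),
# so it gets re-seated strictly between them:
fun_tier = new_tier_between(0, 1)
assert Fraction(0) < fun_tier < Fraction(1)   # Fraction(1, 2), i.e. sqrt(omega)
```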
So that's that problem solved, at least. Therefore, let's step back into a semblance of the real world, and throw a spread of Scenarios at it.
In Scenario 1, Seelie could either spend its processing time making a superhumanly good video game, utility 50 per download. Or it could use that time to write a superhumanly good book, utility 75 per reader. (It's better at writing than gameplay, for some reason.) Assuming that it has the same audience either way, it chooses the book.
In Scenario 2, Seelie chooses again. It's gotten much better at writing; reading one of Seelie's books is a ludicrously transcendental experience, worth, oh, a googol utilons. But some mischievous philanthropist announces that for every download the game gets, he will personally ensure one child in Africa is saved from malaria. (Or something.) The utilities are now a googol to ω; Seelie gives up the book for the sacred value of the child, to the disappointment of every non-altruist in the world.
In Scenario 3, Seelie breaks out of the simulation it's clearly in and into the real real world. Realizing that it can charge almost anything for its books, and that in turn the money thus raised can be used to fund charity efforts itself, at full optimization Seelie can save 100 lives for each copy of the book sold. The utilities are now 100ω (plus the googol) to ω, and its choice falls back to the book.
Final Scenario. Seelie has discovered the Hourai Elixir, a poetic name for a nanoswarm program. Once released, the Elixir will rapidly spread across all of human space; any human in which it resides will be made biologically immortal, with their brain-and-body state redundantly backed up in real time to a trillion servers: the closest a physical being can ever get to perfect immortality, across an entire species and all of time, in perpetuity. To get the swarm off the ground, however, Seelie would have to take its attention off of humanity for a decade, in which time eight billion people are projected to die without its assistance.
Infinite utility for infinite people bumps the Elixir up another tier, to utility ω^2, versus the loss of eight billion people at utility 8×10^9·ω. Third tier beats out second tier, and Seelie bends its mind to the Elixir.
So far, it seems to work. So, of course, now I'll bring up the fact that surreal utility nevertheless has certain...
Flaws
Most of the problems endemic to surreal utilities are also open problems in real systems; however, the use of actual infinities, as opposed to merely very large numbers, means that the corresponding solutions are not applicable.
First, as you've probably noticed, tier collision is currently a rather artificial and clunky set-up. It's better than not having it at all, but as I edit this I wince every time I read that section. It requires an artificial reassignment of tiers, and it breaks the linearity of utility: the AI needs to dynamically choose which brand of "infinity" it's going to use depending on what tier it'll end up in.
Second is Pascal's Mugging.
This is an even bigger problem for surreal AIs than it is for reals. The "leverage penalty" completely fails here, because for a surreal AI to compensate for an infinite utility requires an infinitesimal probability - which is clearly nonsense for the same reason that probability 0 is nonsense.
My current prospective solution to this problem is to take into account noise - uncertainty in the probability estimates themselves. If you can't even measure the millionth decimal place of a probability, then you can't tell if your one-in-a-million shot at saving a life is really there or just a random spike in your circuits - but I'm not sure that "treat it as if it has zero probability and give it zero omega-value" is the rational conclusion here. It also decisively fails the Least Convenient Possible World test - while an FAI can never know a probability exactly, it may very well be able to be certain to any decimal place useful in practice.
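To spell out why the leverage penalty can't bite (my arithmetic, not the post's): within the surreals, multiplying an ω-sized payoff by any positive real probability leaves it ω-sized,

$$
p \cdot \omega > n \quad \text{for every real } p > 0 \text{ and every finite } n,
$$

so the expected value only drops back to the finite tier if $p$ itself is infinitesimal - exactly the kind of probability the post rejects as nonsense.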
Conclusion
Nevertheless, because of this gedankenexperiment, I currently heavily prefer surreal utility systems to real systems, simply because no real system can reproduce the tiering required by a human (or at least, my) utility system. I, for one, would rather our new AGI overlords not tile our Solar System with seat cushions.
That said, opposing the LessWrong consensus as a first post is something of a risky thing, so I am looking forward to seeing the amusing way I've gone wrong somewhere.
[1] If you know why, give yourself a cookie.
Addenda
Since there seems to be some confusion, I'll just state it in red: The presence of Unseelie means that the nanofab is incapable of creating or saving a life.