In this case, the only reason the money pumping doesn't work is that Omega is unable to choose its policy based on its prediction of your second decision: if it could, you would want to switch back to b, because if you chose a, Omega would know that and you'd get a payoff of 0. This makes the situation after the coinflip different from the original problem, where Omega can see your decision and make its own decision based on that.
In the Allais problem as stated, there's no particular reason why the choice between $24,000 for certain and a 33/34 chance of $27,000 should differ depending on whether someone simply offers it to you, or offers it to you only after you roll 34 or less on a d100.
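Just to spell out the arithmetic (using the numbers from the problem as I understand them): conditioning the offer on a d100 roll of 34 or less reproduces exactly the 34%/33% gambles from the second pair of choices:

$$P(\$24{,}000) = 0.34 \times 1 = 0.34, \qquad P(\$27{,}000) = 0.34 \times \tfrac{33}{34} = 0.33.$$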
Is everybody's code going to be in Python?
What are the rules about program runtime?
A common concern around here seems to be that, without massive and delicate breakthroughs in our understanding of human values, any superintelligence will destroy all value by becoming some sort of paperclip optimizer. This is what Eliezer claims in Value is Fragile. Therefore, any vision of the future that manages to do better than this without requiring huge philosophical breakthroughs (in particular, one in which we still don't know how to implement CEV before the Singularity happens) is encouraging to me as a proof of concept for how the future might be more likely to go well.
In a future where uploading minds into virtual worlds becomes possible before an AI takeover, there might well be a way to salvage quite a lot of human value with a comparatively simple utility function: simply create a big virtual world, upload lots of people into it, and make the AI's whole goal be to run that simulation for as long as possible.
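To gesture at what I mean (a purely illustrative sketch with made-up names, not a serious proposal for how such an objective would actually be specified):

```python
from dataclasses import dataclass

@dataclass
class Tick:
    # Hypothetical flag: was the fixed world program executed faithfully this step?
    ran_world_program_correctly: bool

def utility(history: list[Tick]) -> int:
    """The AI's entire objective: the number of faithfully executed simulation
    steps. Nothing about what happens inside the simulation enters the score."""
    return sum(1 for t in history if t.ran_world_program_correctly)

# Three faithful ticks and one corrupted tick score 3.
print(utility([Tick(True), Tick(True), Tick(False), Tick(True)]))
```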
This idea of “just run this program” seems a lot more robust, more likely to work, and less likely to be exploited than attempting to maximize some utility function meant to represent human values, and the result would probably be better than what would happen if the latter went wrong. I suspect it would be well within the capability of a society that can upload minds to create a virtual world for those minds in which the only scarce resource is computation cycles and there is no way to forcibly detain someone, so this virtual world would avoid many of the problems our current world has.
This is far from a perfect outcome, of course. The AI would likely destroy everything it touches for resources, killing everyone not fortunate enough to get uploaded. And there are certainly other problems with any idea of “virtual utopia” we could come up with. But this idea gives me hope because it might be improved upon, and because it is a way we don’t lose everything even if CEV proves too hard a problem to solve before the Singularity.
Thanks for the link, I will check it out!
As for cannibalism, it seems to me that its role in Eliezer's story is to trigger a purely illogical revulsion in the humans who anthropomorphise the aliens.
I dunno about you, but my problem with the aliens isn't the cannibalism; it's that the vast majority of them die slow and horribly painful deaths.
No cannibalism takes place, but the same amount of death and suffering is present as in Eliezer's scenario. Should we be less or more revolted at this?
The same.
Which scenario has the greater moral weight?
Neither. They are both horrible.
Should we say the two-species configuration is morally superior because they've developed a peaceful, stable society with two intelligent species coexisting instead of warring and hunting each other?
Not really, because most of them still die slow and horribly painful deaths.
Sorry to necro this here, but I find this topic extremely interesting and I keep coming back to this page to stare at it and tie my brain in knots. Thanks for your notes on how it works in the logically uncertain case. I found a different objection based on the assumption of logical omniscience:
Regarding this, you say:
Perhaps you think that the problem with the above version is that I assumed logical omniscience. It is unrealistic to suppose that agents have beliefs which perfectly respect logic. (Un)Fortunately, the argument doesn't really depend on this; it only requires that the agent respects proofs which it can see, and eventually sees the Löbian proof referenced.
However, this assumes that the Löbian proof exists. That proof of A=cross→U=−10 works by showing that the agent can prove □(A=cross→U=−10)→(A=cross→U=−10), and the agent's proof seems to assume logical omniscience:
Examining the agent, either crossing had higher expected utility, or P(cross)=0. But we assumed □(A=cross→U=−10), so it must be the latter. So the bridge gets blown up.
If □ here means "provable in PA", the logic does not follow through if the agent is not logically omniscient: the agent might find crossing to have a higher expected utility regardless, because it may not have seen the proof. If □ here instead means "discoverable by the agent's proof search" or something to that effect, then the logic seems to follow through (making the reasonable assumption that if the agent can discover a proof of A=cross→U=−10, then it will set its expected value for crossing to −10). However, that would mean we are talking about provability in a system which can only prove finitely many things, which in particular cannot contain PA, and so Löb's theorem does not apply.
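For reference, the step being leaned on above is Löb's theorem (stated here in my own notation, with PA standing in for whatever proof system the agent respects), instantiated with P := (A=cross→U=−10):

$$\text{If } \mathrm{PA} \vdash \Box P \rightarrow P, \text{ then } \mathrm{PA} \vdash P.$$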
I am still trying to wrap my head around exactly what this means, since your logic seems unassailable in the logically omniscient case. It is counterintuitive to me that the logically omniscient agent would be susceptible to trolling but the more limited one would not. Perhaps there is a clever way for the troll to get around this issue? I dunno. I certainly have no proof that such an agent cannot be trolled in such a way.
That's what I was thinking. Garbage in, garbage out.
That's beside the point. In the first case you'd take 1A in the first game and 2A in the second game (a 34% chance of living is better than 33%). In the second case, if you bothered to play at all, you'd probably take 1B/2B. What doesn't make sense is taking 1A and 2B. That policy is inconsistent no matter how you value different amounts of money (unless you don't care about money at all, in which case do whatever; the paradox is better illustrated with something you do care about), so things like risk, capital cost, diminishing returns, etc. are beside the point.
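To spell that out with the money version's numbers (normalizing U($0)=0 and assuming you maximize expected utility), taking 1A over 1B and also 2B over 2A gives contradictory inequalities, whatever U is:

$$1A \succ 1B:\;\; U(\$24{,}000) > \tfrac{33}{34}\,U(\$27{,}000), \qquad 2B \succ 2A:\;\; 0.33\,U(\$27{,}000) > 0.34\,U(\$24{,}000).$$

Dividing the second inequality by 0.34 gives 33/34·U($27,000) > U($24,000), which directly contradicts the first.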