Open Thread, February 15-29, 2012

OpenThreadGuy

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

The dangers pointed to by the thought experiment aren't restricted to exploitation by an outside entity. An AI should be able to safely consider the hypothesis "If I don't destroy my future light cone, 3^^^3 people outside the universe will be killed" regardless of where the hypothesis came from.

But even if we're just worried about mugging, how could you possibly weight it enough? Even if paying once doomed me to spend the rest of my life paying $5 to muggers, the utility calculation still works out the same way.
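
To make that arithmetic concrete, here is a minimal sketch of the naive comparison. It works in log10 because 3^^^3 is far too large to represent. Every number in it (the stand-in victim count, the credence that the mugger is honest, the harm per victim, the number of lifetime payments) is a hypothetical value chosen for illustration, not something stated in the thread.

```python
import math

# All numbers below are illustrative assumptions, not values from the thread.
# 3^^^3 cannot be represented, so a vastly smaller stand-in is used, in log10.
LOG10_VICTIMS = 1e6          # stand-in: 10**(10**6) victims, far below 3^^^3
LOG10_P_TRUTH = -100.0       # assumed credence 10**-100 that the mugger is honest
LOG10_HARM_PER_VICTIM = 0.0  # assumed 1 unit of disutility per tortured victim
COST_PER_PAYMENT = 5.0       # dollars, treated here as units of disutility


def log10_expected_harm_of_refusing() -> float:
    """log10 of P(truth) * victims * harm per victim."""
    return LOG10_P_TRUTH + LOG10_VICTIMS + LOG10_HARM_PER_VICTIM


def log10_cost_of_paying(num_payments: int) -> float:
    """log10 of the total cost of paying $5 num_payments times."""
    return math.log10(COST_PER_PAYMENT * num_payments)


def naive_ev_says_pay(num_payments: int) -> bool:
    """True if the naive expected-utility comparison favours paying."""
    return log10_expected_harm_of_refusing() > log10_cost_of_paying(num_payments)


# Paying once, versus paying a mugger every day for 80 years.
print(naive_ev_says_pay(1))
print(naive_ev_says_pay(365 * 80))
```

Under these assumed numbers the verdict is the same for one payment and for a lifetime of payments: the cost side grows only linearly with the number of payments, which is negligible next to the exponent on the other side.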

My idea is as follows:

Mugger: Give me 5 dollars, or I'll torture 3^^^3 sentient people across the omniverse using my undetectable magical powers.
AI: If I make my decision on this and similar trades using a decision process DP0 that compares disutility(3^^^3 torture) * P(you're telling the truth) against disutility(giving ...

TheOtherDave

I agree with your first paragraph, but I'm not convinced of your second paragraph... at least, if you intend it as a rhetorical way of asserting that there is no possible way to weight the evidence properly. It's just another proposition; there's evidence for and against it.

I think we get confused here because we start with our bottom line already written. I "know" that the EV of destroying my light cone is negative. But theory seems to indicate that, when assigning a probability P1 to the statement "Destroying my future light cone will preserve 3^^^3 extra-universal people" (hereafter, statement S1), a well-calibrated inference engine might assign P1 such that the EV of destroying my light cone is positive. So I become anxious, and I try to alter the theory so that the resulting P1s are aligned with my pre-existing "knowledge" that the EV of destroying my light cone is negative.

Ultimately, I have to ask what I trust more: the "knowledge" produced by the poorly calibrated inference engine that is my brain, or the "knowledge" produced by the well-calibrated inference engine I built? If I trust the inference engine, then I should trust the inference engine.
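
One way to see how P1 decides the sign of that EV is to compute the break-even point. The sketch below assumes the simplest possible model, EV(destroy) = P1 * (lives preserved) - (lives lost), and works in log10. The function names and both magnitudes are hypothetical stand-ins, since 3^^^3 itself cannot be represented.

```python
def break_even_log10_p1(log10_lives_saved: float, log10_lives_lost: float) -> float:
    """log10 of the P1 at which P1 * saved - lost is exactly zero."""
    return log10_lives_lost - log10_lives_saved


def ev_of_destroying_is_positive(
    log10_p1: float, log10_lives_saved: float, log10_lives_lost: float
) -> bool:
    """True if P1 * saved > lost, compared in log10 to avoid overflow."""
    return log10_p1 > break_even_log10_p1(log10_lives_saved, log10_lives_lost)


# Hypothetical stand-ins, not values from the thread.
LOG10_SAVED = 1e6   # 10**(10**6) extra-universal people, far below 3^^^3
LOG10_LOST = 50.0   # assumed 10**50 lives in my future light cone

print(break_even_log10_p1(LOG10_SAVED, LOG10_LOST))
print(ev_of_destroying_is_positive(-1000.0, LOG10_SAVED, LOG10_LOST))
print(ev_of_destroying_is_positive(-2e6, LOG10_SAVED, LOG10_LOST))
```

With these assumed magnitudes, even a P1 of 10^-1000 leaves the EV positive; the sign flips only once P1 falls below roughly 10^-999950. Whether a well-calibrated engine would ever assign a P1 that small to S1 is the open question.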
