Today's post, Pascal's Mugging: Tiny Probabilities of Vast Utilities, was originally published on 19 October 2007. A summary (taken from the LW wiki):
An Artificial Intelligence coded using Solomonoff Induction would be vulnerable to Pascal's Mugging. How should we, or an AI, handle situations in which it is very unlikely that a proposition is true, but if the proposition is true, it has more moral weight than anything else we can imagine?
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was "Can't Say No" Spending, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.
An AI should treat a Pascal's Mugger as an agent trying to gain access to its root systems without proper authorization, or, phrased more simply, as an attack.
To explain why, consider this point about the argument in the original article:
If something is allowed to override EVERYTHING on a computer, it seems functionally identical to saying that it has root access.
Since Pascal's Mugging is commonly known and discussed on the internet, having it act as a root password would be a substantial security hole, like setting your root password to "password".
An AI would presumably need some procedure for handling attempted unauthorized access. That procedure would need to trigger FIRST, before the argument is considered on the merits. Once it triggers, the argument is no longer being considered on the merits; it is being considered as an attack. Saying "Well, but what if there REALLY ARE 3^^^^3 lives at stake?" seems equivalent to saying "Well, but what if the prince of Nigeria REALLY IS trying to give me 1 million dollars, as the email in my spam box claims?"
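The "screen first, evaluate on the merits second" procedure described above can be sketched in a few lines. This is a minimal illustration with hypothetical names and an arbitrary cutoff, not anything from the original post: it assumes the agent can cap the utility it is willing to take at face value, and routes anything above that cap to the attack handler before expected-utility reasoning ever runs.

```python
# Hypothetical sketch of a "screen before evaluating" policy.
# UTILITY_CAP is an arbitrary illustrative threshold, not a principled value.
UTILITY_CAP = 10**9

def looks_like_mugging(claimed_utility):
    """Flag claims whose stated payoff exceeds anything the agent
    could plausibly verify (the signature of a Pascal's Mugging)."""
    return claimed_utility > UTILITY_CAP

def handle_claim(claimed_utility, claimed_probability):
    # Step 1: the security screen runs FIRST, before the merits.
    if looks_like_mugging(claimed_utility):
        return "rejected: treated as an attempted privilege escalation"
    # Step 2: only screened claims reach ordinary expected-utility reasoning.
    return claimed_utility * claimed_probability

# An ordinary claim is evaluated on the merits:
print(handle_claim(100, 0.5))
# A mugging-shaped claim never reaches the expected-utility step:
print(handle_claim(10**100, 1e-9))
```

The key design point is ordering: the screen is not one more term in the expected-utility calculation, it is a gate that decides whether that calculation runs at all.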