antigonus comments on On self-deception - Less Wrong
If before you open the book, you believe that the book will provide incredibly compelling evidence of Zoroastrianism whether or not Zoroastrianism is true, and upon opening the book you find incredibly compelling evidence of Zoroastrianism, your probability of Zoroastrianism should not change, since you didn't observe any evidence which is more likely to exist if Zoroastrianism were true than if it were not true.
It may be that you are underestimating the AI's cleverness, so that you expect to see merely decent evidence of Zoroastrianism but in fact find incredible evidence of it, and so you become convinced. In this case your false belief that the AI is not too convincing is doing the philosophical work of deceiving you, and it's no longer really a case of deceiving yourself. Deceiving yourself seems to be more a matter of starting with all correct beliefs and then talking yourself into an incorrect one.
If you happen to luck out into having a false belief about the AI being unconvincing, and if this situation with the library of theology just falls out of the sky without your arranging it, you got lucky - but that's being deceived by others. If you try to set up the situation, you can't deliberately underestimate the AI because you'll know you're doing it. And you can't set up the theological library situation until you're confident you've deliberately underestimated the AI.
You may want to look at Branden Fitelson's short paper "Evidence of evidence is not (necessarily) evidence". You seem to be arguing that, since we have strong evidence that the book contains strong evidence for Zoroastrianism before we read it, it follows that we already have (the most important part of) our evidence for Zoroastrianism. But it turns out that it's extremely tricky to make this sort of reasoning work. To use the most primitive example from the paper, discovering that a playing card C is black is evidence that C is the ace of spades. Furthermore, that C is the ace of spades is excellent evidence that it's an ace. But discovering that C is black does not give you any evidence whatsoever that C is an ace.
The problem here - at least one of them - is that discovering C is black is just as much evidence for C being the x of spades for any other card-value x. Similarly, before opening the book on Zoroastrianism, we have just as much evidence for the existence of strong evidence for Christianity/atheism/etc, so our credences shouldn't suddenly start favoring any one of these. But once we learn the evidence for Zoroastrianism, we've acquired new information, in just the same way that learning that the card is an ace of spades provides us new information if we previously just knew it was black.
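The card claim is easy to check by brute-force enumeration. A quick illustrative sketch (not from the paper; the deck representation and `prob` helper are my own, and I'm using "evidence for H" to mean "the observation raises P(H)"):

```python
from fractions import Fraction

# Enumerate a standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = [(r, s) for r in ranks for s in suits]

def prob(event, given=lambda c: True):
    """P(event | given) under a uniform draw from the deck."""
    pool = [c for c in deck if given(c)]
    return Fraction(sum(event(c) for c in pool), len(pool))

is_black = lambda c: c[1] in ("spades", "clubs")
is_ace = lambda c: c[0] == "A"
is_ace_of_spades = lambda c: c == ("A", "spades")

# "Black" raises the probability of the ace of spades...
assert prob(is_ace_of_spades) == Fraction(1, 52)
assert prob(is_ace_of_spades, given=is_black) == Fraction(1, 26)

# ...but leaves the probability of "ace" exactly where it was.
assert prob(is_ace) == Fraction(1, 13)
assert prob(is_ace, given=is_black) == Fraction(1, 13)
```

So "black" is genuine evidence for "ace of spades", and "ace of spades" entails "ace", yet "black" is no evidence at all for "ace".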
I do suspect that there are relevant disanalogies here, but don't have a very detailed understanding of them.
Not exactly. I do think this would be a true statement, if the book was a genuine book on Zoroastrianism and not a book which we know was designed to deceive us. But as far as I know it's only tangentially connected to the argument I'm making.
Thanks for summarizing the paper; I tried to read it but it was written in a way that seemed designed to be as obscure as possible. Your explanation makes more sense.
But I still don't see the problem. Learning a card is black increases the chance that it's the ace of spades or the ace of clubs, but decreases the chance that it's the ace of hearts or the ace of diamonds. The chance that it's the ace of spades becomes greater, but the net chance that it's an ace remains exactly the same. Evidence of evidence is still evidence, but evidence of evidence plus evidence of evidence that goes the opposite direction cancel out and make zero evidence.
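The cancellation claim checks out numerically (a rough sketch of my own, tracking only the four aces under a uniform draw from a 52-card deck):

```python
from fractions import Fraction

suits = ["spades", "clubs", "hearts", "diamonds"]
prior = Fraction(1, 52)  # P(card is the ace of a given suit)

# P(ace of each suit | card is black): black aces double, red aces drop to zero.
posterior = {s: (Fraction(1, 26) if s in ("spades", "clubs") else Fraction(0))
             for s in suits}

# The per-suit probability shifts cancel exactly...
shifts = {s: posterior[s] - prior for s in suits}
assert sum(shifts.values()) == 0

# ...so P(ace) is untouched: still 4/52 = 1/13.
assert sum(posterior.values()) == Fraction(1, 13)
```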
Again, I'm not sure about the relevance here. It's not the case that, merely by knowing the book exists without reading it, we have new evidence for the existence of some evidence which both supports and, in a different way, opposes Zoroastrianism.
(I guess I'd say it was written in a way designed to be precise. But I agree that the author isn't the best writer.)
I find this sentence hard to make sense of. Based on the first part of the sentence, you seem to be suggesting that the problem in the card scenario is that our evidence^2 (= the card is black) is both evidence^1 for the card being an ace and evidence^1 against the card being an ace, and the two pieces of evidence^1 balance out to yield the same total probability of the card being an ace as before. But clearly no single piece of information, such as the card being black, can provide evidence^1 both for and against a given hypothesis. It either yields evidence^1 or it doesn't. And if it doesn't, then evidence^2 is not always evidence^1.
Anyway, the relevance is this: When we learn the card is black, we acquire evidence for a bunch of different pieces of information which, taken on their own, have varying probabilistic effects on the hypothesis that the card is an ace. These effects add up in such a way as to leave the posterior probability of the hypothesis untouched. But once we actually learn one of these individual pieces of information, suddenly the posterior shoots way up.
Similarly, before we read the book, we have evidence for a bunch of different pieces of information which, taken on their own, have varying probabilistic effects on the truth of Zoroastrianism. These effects add up in such a way as to leave our posterior in Zoroastrianism untouched (assuming we don't consider non-book-possessing religions). So why is it that when we learn one of these pieces of information by reading the book, our posterior shouldn't change, unlike in the card case?
Thank you, that's what I was trying to get at, but didn't know how.
O.K., here's a disanalogy that may be important. In the card case, learning that C is the ace of spades should drastically lower our credence in the card being the x of spades for any other value x. On the other hand, after reading the Zoroastrianism book, we shouldn't significantly doubt that the other books contain strong evidence as well, given the known capabilities of the AI.
This isn't a very formal treatment, though.
"Evidence of evidence is more likely to be filtered evidence" is a more accurate phrasing.
I'm not exactly sure what the "more likely" here means. More likely than what?
The link keeps omitting the colon in the "http://." I don't know why it's doing that.
Evidence of evidence is not (necessarily) evidence