2 min read
by pwno

I used to think that, given two equally capable individuals, the person with more true information can always do at least as well as the other person, and hence that one can only gain from having true information. One implicit assumption makes this line of reasoning fail in some cases: that we are perfectly rational agents. We are not; our mind isn’t stored in a vacuum, but in a Homo sapiens brain. There are certain false beliefs that benefit you by exploiting your primitive mental warehouse, e.g., self-fulfilling prophecies.

Despite the benefits, adopting false beliefs is an irrational practice. If people never acquire the maps that correspond best to the territory, they won’t have the most accurate cost-benefit analysis for adopting false beliefs. Maybe, in some cases, false beliefs do make you better off. The problem is that you'll have a wrong or sub-optimal cost-benefit analysis unless you first adopt reason.

Also, it doesn’t make sense to say that the rational decision could be to “have a false belief” because in order to make that decision, you would have to compare that outcome against “having a true belief.” But in order for a false belief to work, you must truly believe in it — you cannot deceive yourself into believing the false belief after knowing the truth! It’s like figuring out that taking a placebo leads to the best outcome, yet knowing it’s a placebo no longer makes it the best outcome.

Clearly, it is not in your best interest to choose to believe in a falsity, but what if someone else did the choosing? Can’t someone whose advice you rationally trust decide whether or not to give you false information (e.g., a doctor deciding whether or not you receive a placebo)? They could perform a cost-benefit analysis without diluting the effects of the false belief. We want to know only the truth, yet in some cases we would prefer to be lied to without knowing it.

Which brings me to my question: do we program an AI to only tell us the truth or to lie when the AI believes (with high certainty) the lie will lead us to a net benefit over our expected lifetime?

Added: Keep in mind that knowledge of the truth, even for a truth-seeker, is finite in value. The AI can believe that the benefit of a lie would outweigh a truth-seeker's cost of being lied to. So unless someone values the truth above anything else (which I highly doubt), would a truth-seeker ever choose to be told only the truth by the AI?

32 comments

This topic has already been done to death, and then some.

Ah, my bad.

This should be fixed by adding some articles to the wiki, with references to the previous discussions (so that you can substantiate your claim with a link, showing any newcomer where to look). This point seems to fit in the concepts of Truth and Self-deception; the latter article is currently almost empty.

ETA: Costs of rationality is a closer match.

The wiki is a good starting tool, but it's not yet as fully developed as I would like. I'm still working to develop sufficient background knowledge of the discussions, assumptions, and definitions used in Less Wrong so as to be sufficiently confident in commenting.

So I will forgive the occasions when someone who sincerely wants information and thoughtful reactions stumbles into spaces that have already been well-trodden.

Nevertheless, the wiki itself isn't yet fully developed with interconnections and links to definitions: until such internal tagging is complete, newer people will sometimes fail to find what they are searching for and will instead ask it directly.

I welcome these questions being asked, if only as a sign that Less Wrong does not encourage self-censorship (which, I gather from conversations elsewhere, may have been a concern on Overcoming Bias).

Of course the wiki is still in its infancy. All the more reason to shape it by contributing your own synthesis of the discussed concepts, especially when these concepts leave the impression of having been discussed to death.

It’s like figuring out that taking a placebo leads to the best outcome, yet knowing it’s a placebo no longer makes it the best outcome.

Unless what counts for whether the placebo works is what your System 1 believes, what counts for your rational cost-benefit analysis is what your System 2 believes, and you somehow manage to keep them separate (which is hard but not always impossible).

Yep, done to death. Eliezer's answer: The Third Alternative. I hereby disclose that I downvoted you.

Which brings me to my question: do we program an AI to only tell us the truth or to lie when the AI believes (with high certainty) the lie will lead us to a net benefit over our expected lifetime?

This question is too anthropomorphic. Since you clearly mean a strong AI, it's no longer a question-answering machine; it's an engine for a supervised universe. At that point you ought to talk about the organization of the future, for example in terms of fun theory.

Well, the AI wouldn't only have to predict how the lie will benefit the person who hears it, but also how the actions that result from holding a false belief might affect other individuals.

The above quibble aside, the answer to your question is pretty trivial. To a person who values the truth, knowledge is a benefit and will therefore be part of the AI's 'net benefit' calculation. Consequently, the kind of person who would want to program an AI to only tell him the truth would never be lied to by an AI that does a net benefit calculation.

The only question that remains is whether we would want to program an AI to always tell the truth to people who want to be deceived at least some of the time. In other words, do we want other individuals to believe whatever they like, as long as it's a net benefit to them and doesn't affect the rest of us? My answer would be yes.

Of course, in the real world, popular false beliefs, such as religious ones, often do not lead to a net benefit for those who hold them, and even more often affect the rest of us negatively.

To a person who values the truth, knowledge is a benefit and will therefore be part of the AI's 'net benefit' calculation.

But knowledge of the truth has a finite value. What if the AI believed that the benefit of a lie would outweigh a truth-seeker's cost of being lied to?

So the question is, would any rational truth-seeker choose to only be told the truth by the AI?

A person doesn't have to 'infinitely value' truth to always prefer the truth to a lie. The importance put on truth merely has to be greater than the importance put on anything else.

That said, if the question is whether there is, or has ever been, a human who values truth more than anything else, the answer is almost certainly no. For example, I care about the truth a lot, but if I were given the choice between learning a single, randomly chosen fact about the universe and being given a million dollars, I'd pick the cash without too much hesitation.

However, as Eliezer has said many times, human minds only represent a tiny fraction of all possible minds. A mind that puts truth above anything else is certainly possible, even if it doesn't exist yet.

Now that we know we programmed an AI that may lie to us, our rational expectations will make us skeptical of what the AI says, which is not ideal. Sounds like the AI programmer will have to cover up the fact that the AI does not always speak the truth.