RobbBB comments on Open Thread, May 25 - May 31, 2015 - Less Wrong

3 Post author: Gondolinian 25 May 2015 12:00AM

Comment author: iceman 27 May 2015 07:12:24AM 15 points [-]

(Disclaimer: My lifetime contribution to MIRI is in the low six digits.)

It appears to me that there are two LessWrongs.

The first is the LessWrong of decision theory. Most of the content in the Sequences contributed to making me sane, but the most valuable part was the focus on decision theory and on how different decision processes perform in the prisoner's dilemma. Understanding decision theory is a precondition to solving the friendly AI problem.

The first LessWrong results in serious insights that should be integrated into one's life. In Program Equilibrium in the Prisoner's Dilemma via Löb's Theorem, the authors take a moment to discuss the issue of "Defecting Against CooperateBot": if you know that you are playing against CooperateBot, you should defect. I remember when I first read the paper and the concept just clicked. Of course you should defect against CooperateBot. But this was an insight I had to be told, and LessWrong is valuable to me because it has helped me internalize game theory. The first year that I took the LessWrong survey, I answered that of course you should cooperate in the one-shot, non-shared-source-code prisoner's dilemma. On the latest survey, I instead put the correct answer.
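To make the payoff logic concrete, here is a minimal toy sketch of my own (not the paper's Löbian construction); the numbers are just a standard T > R > P > S payoff assignment:

```python
# One-shot Prisoner's Dilemma where each bot is handed the opponent's strategy.
# Payoffs (mine, theirs) follow the standard ordering T(5) > R(3) > P(1) > S(0).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def cooperate_bot(opponent):
    """Cooperates unconditionally, ignoring the opponent entirely."""
    return 'C'

def defect_bot(opponent):
    """Defects unconditionally."""
    return 'D'

def play(bot_a, bot_b):
    """Run one round and return (bot_a's payoff, bot_b's payoff)."""
    return PAYOFF[(bot_a(bot_b), bot_b(bot_a))]

print(play(defect_bot, cooperate_bot))     # (5, 0): defecting against CooperateBot pays more
print(play(cooperate_bot, cooperate_bot))  # (3, 3): cooperating leaves value on the table
```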

The second LessWrong is the LessWrong of utilitarianism, especially of a Singerian sort, which I find clashes with the first LessWrong. My understanding is that Peter Singer argues that because you would ruin your shoes jumping into a creek to save a drowning child, you should incur an equivalent cost to save the life of a child in the third world.

Now never mind that saving the child might have positive expected value to the jumper. We can restate Singer's moral obligation as a prisoner's dilemma, and then we can apply something like TDT to it and make the FairBot version of Singer: I want to incur a fiscal cost to save a child on the other side of the world iff parents on the other side of the world would incur a fiscal cost to save my child. I believe Singer would deny this statement (and would be more aghast at the PrudentBot version), and would insist that there's a moral obligation regardless of whether the other side would reciprocate.
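Roughly, in behavioural terms (a sketch of my own; the paper's actual FairBot and PrudentBot are defined by bounded proof search and Löb's theorem rather than by simulation, and this depth-limited approximation misbehaves in some matchups, e.g. PrudentBot against itself):

```python
# Depth-limited behavioural approximations of FairBot and PrudentBot.
# Each bot gets the opponent's strategy plus a remaining simulation depth;
# at depth 0 a bot optimistically assumes the opponent cooperates.
MAX_DEPTH = 3

def cooperate_bot(opponent, depth=MAX_DEPTH):
    return 'C'

def defect_bot(opponent, depth=MAX_DEPTH):
    return 'D'

def fair_bot(opponent, depth=MAX_DEPTH):
    """Cooperate iff the (simulated) opponent cooperates with me."""
    if depth == 0:
        return 'C'
    return 'C' if opponent(fair_bot, depth - 1) == 'C' else 'D'

def prudent_bot(opponent, depth=MAX_DEPTH):
    """Cooperate iff the opponent cooperates with me AND defects against
    DefectBot; in particular, exploit unconditional cooperators."""
    if depth == 0:
        return 'C'
    coops_with_me = opponent(prudent_bot, depth - 1) == 'C'
    punishes_defectbot = opponent(defect_bot, depth - 1) == 'D'
    return 'C' if coops_with_me and punishes_defectbot else 'D'

print(fair_bot(cooperate_bot))     # 'C': the FairBot-Singer still helps unconditional helpers
print(prudent_bot(cooperate_bot))  # 'D': the PrudentBot-Singer exploits them
print(fair_bot(fair_bot))          # 'C': conditional helpers help each other
print(fair_bot(defect_bot))        # 'D': and refuse to be exploited
```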

I notice that I am being asked to be CooperateBot. I don't think CFAR has "Don't be CooperateBot" as a rationality technique, but they should.

Practically, I find that 'altruism' and 'CooperateBot' are synonyms. The question of reciprocity hangs in the background. It must, because Azathoth generates both those who are CooperateBots and those who exploit CooperateBots.

I will also point out that this whole discussion is happening on the website that exists to popularize humanity's greatest collective action problem. Every one of us has a selfish interest in solving the friendly AI problem. And while I am not much of a utilitarian, I would assume that the correct utilitarian charity answer in terms of number of people saved/generated would be MIRI, and that the most straightforward explanation is Hansonian cynicism.

Comment author: RobbBB 29 May 2015 12:17:21AM *  5 points [-]

'Altruism' for me doesn't mean "I assign infinite value to my own happiness (and freedom, beauty, etc.) and 0 to others', but everyone would be better off (myself included) if I sacrificed my own happiness for others'. So I'll sacrifice my own happiness for others'." Rather, I assign some value to my own happiness, but a lot more value to others' happiness. I care unconditionally about others' happiness.

Since it's only a Prisoner's Dilemma if I value 'I defect, you cooperate' over 'we both cooperate', for me high-stakes 'defecting' would mean directly indulging my desire to help others, while 'cooperating' via UDT would mean sacrificing humanity's welfare in some small way in order to keep a non-utilitarian agent from doing even more to reduce humanity's welfare. The structure of the PD has nothing to do with whether the agents are selfish or altruistic (as long as you take that into account when initially calculating payoffs).
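To spell that out with made-up numbers (a sketch of my own): start from standard material payoffs and let me weight the other player's material outcome by some w; once w is large enough, 'I defect, you cooperate' is no longer preferred to 'we both cooperate', so the game is no longer a PD for me.

```python
# How caring about the other player's outcome can dissolve the PD structure.
# Material payoffs use the standard ordering T=5 > R=3 > P=1 > S=0.
def my_payoff(my_material, your_material, w):
    """My actual payoff: my material outcome plus weight w on yours."""
    return my_material + w * your_material

for w in (0.0, 0.5, 1.0):
    defect_on_cooperator = my_payoff(5, 0, w)  # (D, C): I defect, you cooperate
    mutual_cooperation = my_payoff(3, 3, w)    # (C, C): we both cooperate
    still_a_pd = defect_on_cooperator > mutual_cooperation
    print(f"w={w}: (D,C)={defect_on_cooperator}, (C,C)={mutual_cooperation}, "
          f"PD ordering intact: {still_a_pd}")
# w=0.0 and w=0.5 preserve the PD ordering; at w=1.0 mutual cooperation
# is directly preferred, so there is nothing left to 'defect' toward.
```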

Thought experiments like Singer's are how I found out that I do in fact terminally value people who are distant from me in space (and time). My behavior isn't perfectly utilitarian, but I'd take a pill to become more so, so my revealed preferences aren't what I'd prefer them to be.