Thanks for the interesting write-up.
Regarding Evidential Cooperation in Large Worlds: the Identical Twin One-Shot Prisoner's Dilemma makes sense to me because the entity giving the payout is connected to both worlds. What is the intuition for ECL, where, as I understand it, there is no such connection?
The "entity giving the payout" in practice for ECL would be just the world states you end up in and requires you to care about the environment of the person you're playing the PD with.
Defecting, then, is just optimising my local environment for my own values, while cooperating is optimising it for some aggregate of my values and the other player's values. So it only works if there are positive-sum aggregates and if each player cares about what the other does to their local environment.
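To make the dependence on both conditions concrete, here is a minimal toy sketch. The payoff numbers and the `care` weight are my own illustrative assumptions, not anything from the comment above:

```python
# Toy model of the ECL cooperation logic sketched above.
# "Defect" = optimise my local environment purely for my own values;
# "cooperate" = optimise it for an aggregate of both value systems,
# which is assumed here to be positive-sum.

# (value for the acting player, value for the other player) produced locally
LOCAL_OUTPUT = {
    "defect":    (10, 0),
    "cooperate": (7, 7),   # positive-sum aggregate: 14 total > 10
}

def my_utility(my_action: str, their_action: str, care: float) -> float:
    """My values realised in my own environment, plus my values realised
    in the other player's environment, discounted by how much I care
    about what happens there."""
    mine_here, _ = LOCAL_OUTPUT[my_action]
    _, mine_there = LOCAL_OUTPUT[their_action]
    return mine_here + care * mine_there

# Print the payoff table for an indifferent player (care = 0) and a
# player who fully cares about the other's environment (care = 1).
for care in (0.0, 1.0):
    for me in ("defect", "cooperate"):
        for them in ("defect", "cooperate"):
            print(f"care={care:.0f}  me={me:9s} them={them:9s} "
                  f"-> my utility = {my_utility(me, them, care):.0f}")
```

With `care = 1` this reproduces a standard prisoner's dilemma: defecting still dominates, but mutual cooperation (14) beats mutual defection (10), which is what ECL-style correlated reasoning can exploit. With `care = 0`, cooperating never pays regardless of what the other player does, so there is nothing for the argument to work with.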
Caspar Oesterheld came up with two of the most important concepts in my field of work: Evidential Cooperation in Large Worlds and Safe Pareto Improvements. He also came up with a potential implementation of evidential decision theory in boundedly rational agents called decision auctions, wrote a comprehensive review of anthropics and how it interacts with decision theory, which most of my anthropics discussions build on, and independently decided to work on AI some time in late 2009 or early 2010.
Needless to say, I have a lot of respect for Caspar’s work. I’ve often felt very confused about what to do in my attempts at conceptual research, so I decided to ask Caspar how he did his research. Below is my writeup from the resulting conversation.
How Caspar came up with surrogate goals
The process
Caspar’s reflections on what was important during the process
How Caspar came up with ECL
The process
Caspar’s reflections on what was important during the process
How Caspar came up with decision auctions
The process
[editor’s note: I find it notable that all the linked papers are in CS venues rather than economics. That said, while Yiling Chen is a CS professor, she studied economics and has an economics PhD.]
How Caspar decided to work on superhuman AI in late 2009 or early 2010
My impression is that a few people in AI safety independently decided that AI was the most important lever over the future and only then discovered LessWrong, Eliezer Yudkowsky, and the AI safety community. Caspar is one of those people. While this didn't turn out to be unique or counterfactually impactful, I am including the story of how he decided to work on superhuman AI. It comes from notes Caspar left in writing after the interview; I mostly copied them verbatim, with some light editing for clarity, and left them in first person.
The process
“Much of this happened when I was very young, so there's some naivete throughout:
Caspar’s reflections on what was important during the process
General notes on his approach to research
What does research concretely look like in his case?
Things he might do when he does research, in no particular order:
Research immersion
Goal orientation vs. curiosity orientation