but you think it's unreasonable to try to solve some ultimately simple problems with the fate of the world at stake
To be clear, I'm not afraid that you'll fail to solve one or more philosophical problems and waste your donors' money. If that were the only worry I'd certainly want you to try. (ETA: Well, aside from the problem of shortening AI timelines.) What I'm afraid of is that you'll solve them incorrectly while thinking that you've solved them correctly.
I recall you used to often say that you've "solved metaethics". But when I looked at your solution I was totally dissatisfied, and wrote several posts explaining why. I also thought you were overconfident about utilitarianism and personal identity, and wrote posts pointing out holes in your arguments about those. "Free will" and "nature of truth" happen to be topics that I've given less time to, but I could write down my inside view of why I'm not confident they are solved problems, if you think that would help with our larger disagreements.
The philosophers who can't agree on free will seem like entirely different sorts of creatures to me.
Is there anyone besides Gary Drescher who you'd consider to be in your reference class? What about the people who came up with the same exact solution to "tree falls in forest" as you? (Did you follow the links I provided?) Or the people who originally came up with utilitarianism, Bayesian decision theory, and Solomonoff induction (all of whom failed to notice the problems later discovered in those ideas)? Do you consider me to be in your reference class, given that I independently came up with some of the same decision theory ideas as you?
Or if it's just Drescher, would it change your mind on how confident you ought to be in your ideas if he were to express disagreement with several of them?
"Free will" and "nature of truth" happen to be topics that I've given less time to, but I could write down my inside view of why I'm not confident they are solved problems
This depends on the threshold of "solved", which doesn't seem particularly relevant to this conversation. What philosophy would consider "solved" is less of an issue than its propensity to miss/ignore available insight (as compared to, say, mathematics). "Free will" and "nature of truth", for example, still have important outs...
On the subject of how an FAI team can avoid accidentally creating a UFAI, Carl Shulman wrote:
In the history of philosophy, there have been many steps in the right direction, but virtually no significant problems have been fully solved, such that philosophers can agree that some proposed idea can be the last words on a given subject. An FAI design involves making many explicit or implicit philosophical assumptions, many of which may then become fixed forever as governing principles for a new reality. They'll end up being last words on their subjects, whether we like it or not. Given the history of philosophy and applying the outside view, how can an FAI team possibly reach "very high standards of proof" regarding the safety of a design? But if we can foresee that they can't, then what is the point of aiming for that predictable outcome now?
Until recently I hadn't paid a lot of attention to the discussions here about inside view vs outside view, because the discussions have tended to focus on the applicability of these views to the problem of predicting intelligence explosion. It seemed obvious to me that outside views can't possibly rule out intelligence explosion scenarios, and even a small probability of a future intelligence explosion would justify investing much more than we currently do in preparing for that possibility. But given that the inside vs outside view debate may also be relevant to the "FAI Endgame", I read up on Eliezer and Luke's most recent writings on the subject... and found them to be unobjectionable. Here's Eliezer:
Does anyone want to argue that Eliezer's criteria for using the outside view are wrong, or don't apply here?
And Luke:
These ideas seem harder to apply, so I'll ask for readers' help. What reference classes should we use here, in addition to past attempts to solve philosophical problems? What inside view adjustments could a future FAI team make, such that they might justifiably overcome the (most obvious-to-me) outside view's conclusion that they're very unlikely to be in possession of complete and fully correct solutions to a diverse range of philosophical problems?