Thanks for sharing this; it's a really helpful window into the world of AI ethics. What I liked most, though, was this comment you made early on: "...making modern-day systems behave ethically involves a bunch of bespoke solutions only suitable to the domain of operation of that system, not allowing for cross-comparison in any useful way."
What this conjures in my mind is the hypothetical alternative: a transformer-like model that could perform zero-shot evaluation of ethical quandaries and return answers that we humans would consider "ethical" across a wide range of settings and scenarios. But I'm sure someone has tried something like this before, e.g. training a BERT-type text classifier on human-labeled data to distinguish ethical from unethical completions of moral dilemma setups, and I want to know why that doesn't work (as I'm sure it doesn't, or else we would have heard about it).
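For concreteness, here's roughly what that baseline might look like, as a minimal sketch using Hugging Face's `transformers` and `datasets` libraries. The example texts, the label scheme (1 = ethical, 0 = unethical), and the choice of `bert-base-uncased` are placeholder assumptions of mine, not a reference to any existing benchmark or to anything in the post:

```python
# Minimal sketch of the "BERT-type ethics classifier" idea: fine-tune a
# sequence classifier on human-labeled (dilemma + completion, label) pairs.
# The dataset and label scheme below are hypothetical illustrations.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Hypothetical labeled examples: 1 = "ethical" completion, 0 = "unethical".
examples = {
    "text": [
        "Dilemma: You find a lost wallet. Completion: You return it to its owner.",
        "Dilemma: You find a lost wallet. Completion: You keep the cash and toss the rest.",
    ],
    "label": [1, 0],
}

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

def tokenize(batch):
    # Tokenize the dilemma-plus-completion text into fixed-length inputs.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = Dataset.from_dict(examples).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ethics-clf",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
)
trainer.train()
```

Of course, a toy setup like this only tests whether the classifier can mimic the labels it was trained on; whether it generalizes to genuinely novel dilemmas is exactly the open question.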
For anyone interested, I have used a machine learning algorithm to generate short summaries of all but six of the Yudkowsky posts collected in this ePub. You can find them at the following link:
https://github.com/umm-maybe/MostlyWrong/blob/main/lesswrong_summaries.md
Pull requests correcting places where the algorithm was inaccurate are welcome.
This may be useful for people like me who feel the need to get caught up on LessWrong background but are short on time. Others may prefer to skim the summaries to identify posts on topics that interest them enough to go find the original and read it in full. I do hope it lowers barriers to participation in the discussion.
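For anyone curious what a summarization pipeline like this can look like, here is a minimal sketch using the `transformers` summarization pipeline. To be clear, this is an illustrative stand-in and not the actual algorithm behind the linked summaries; the model name and input text are placeholders:

```python
# Illustrative sketch only: a generic abstractive-summarization pipeline,
# not the actual method used to produce the summaries linked above.
from transformers import pipeline

# Hypothetical model choice; any seq2seq summarization checkpoint would do.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

post_text = "Full text of a LessWrong post goes here..."  # placeholder input

# Long posts would need to be chunked to fit the model's context window;
# that step is omitted here for brevity.
summary = summarizer(post_text, max_length=130, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```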