Thank you for this! I have sometimes wondered whether it's possible for even a superhuman AI to meaningfully answer a question as potentially undetermined as "What should we want you to do?" Do you think that it would be easier to explain things that we're sure we don't want (something like a heftier version of Asimov's Laws)? Even then it would be hard (both sides of a war invariably claim their side is justified, and maybe you can't forbid harming people's mental health unless you can define mental health), but maybe it would be sufficient to avoid doomsday until we thought of something better?