A lot of AI risk models rest on the assumption that we supervise outcomes, as a consequentialist would, rather than processes, as in CAIS (Comprehensive AI Services).
But how tractable is process supervision for AI, compared to outcome supervision, in both the:
Near term: the next 10 years.
Long term: the next 50 years.
Crucially, is it easier today to supervise processes, CAIS-style, than to supervise outcomes?
And a final question: should we massively fund attempts at process supervision in today's AI systems, in the hope that process supervision is an attractor state, rather than outcome supervision in the consequentialist style?
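To make the distinction concrete, here is a minimal toy sketch of the two regimes, assuming a reasoning-trace setting where each intermediate step can be checked independently. All names here (`outcome_reward`, `process_reward`, the step checker) are hypothetical illustrations for this question, not any existing API.

```python
# Toy contrast between outcome supervision and process supervision.
# Hypothetical sketch: the names and checker below are illustrative only.

def outcome_reward(final_answer: str, correct: str) -> float:
    """Outcome supervision: score only the end result, ignoring how we got there."""
    return 1.0 if final_answer == correct else 0.0

def process_reward(steps: list[str], valid_step) -> float:
    """Process supervision: score each intermediate step individually;
    full reward requires every step in the trajectory to be acceptable."""
    if not steps:
        return 0.0
    return sum(1.0 for s in steps if valid_step(s)) / len(steps)

# A reasoning trace that reaches the right answer via a flawed middle step.
steps = ["2 + 2 = 4", "4 * 3 = 13", "13 - 1 = 12"]
valid_step = lambda s: eval(s.split("=")[0]) == int(s.split("=")[1])

print(outcome_reward(final_answer="12", correct="12"))  # 1.0: the flaw is invisible
print(process_reward(steps, valid_step))                # ~0.67: the flaw is penalized
```

The point of the toy example: outcome supervision gives full reward to a trajectory containing a bad step, while process supervision catches it. The open question above is how far this step-level checking scales when steps are not trivially machine-checkable.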