Responsible scaling policy TLDR

lemonhope

9 Responsible scaling policy TLDR

by lemonhope

28th Sep 2023

1 min read

0

9

I do not speak for ARC Evals!

TLDR²: If you notice risk and pause, then you have a better chance of doing sufficient mitigations.

Assume you are able to

think of all major AI risk categories,
define dummy tasks that are (very) leading indicators of each risk, and
make any AI system to do a genuine best-effort on simple tasks.

Then, if you actually define your risks & tasks and actually try to use your AI to complete the tasks, you get plenty of warning of danger.

You pause training & development until the risks are "properly mitigated".

It is too early to say what mitigations work well when, or how to really be sure whether the risk is gone. All else being equal, it is better to make a tentative mitigation plan anyways.

AI EvaluationsAI

Frontpage

9

New Comment

Moderation Log

LESSWRONG
LW

9

Responsible scaling policy TLDR

9

New to LessWrong?

9