An important line of work in AI safety would be to prove the equivalence of various capability benchmarks to risk benchmarks, so that when an AI lab shows its model crossing a capability threshold, that result automatically implies it has crossed a corresponding AI safety level. Then we wouldn't get two separate reports from the same lab: one saying the model is a PhD-level scientist, and another saying studies show the model's CBRN risk is no greater than an internet search.