This is an executive summary of a post from my personal blog, also cross-posted from the EA Forum. Read the full texts here.
Summary
Benchmarks support the empirical, quantitative evaluation of progress in AI research. Although benchmarks are ubiquitous in most subfields of machine learning, they are still rare in the subfield of AI safety.
I argue that creating benchmarks should be a high priority for AI safety. While this idea is not new, I think it may still be underrated. Among other benefits, benchmarks would make it much easier to:
- track the field’s progress and focus resources on the most productive lines of work;
- create professional incentives for researchers - especially Chinese researchers - to work
...