METR (org)
Written by Ruby, last updated 1st Jul 2024
METR was formerly known as ARC Evals.
Posts tagged METR (org), most relevant first
Relevance | Karma | Title | Author(s) | Posted | Comments
2 | 10 | Review of METR’s public evaluation protocol | nahoj, JaimeRV | 10mo | 0
1 | 241 | METR: Measuring AI Ability to Complete Long Tasks Ω | Zach Stein-Perlman | 16d | 104
1 | 153 | ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks Ω | Beth Barnes | 2y | 12
1 | 108 | Clarifying METR's Auditing Role Ω | Beth Barnes | 11mo | 1
1 | 90 | Introducing METR's Autonomy Evaluation Resources | Megan Kinniment, Beth Barnes | 1y | 0
1 | 65 | METR is hiring! | Beth Barnes | 1y | 1
1 | 48 | Reactions to METR task length paper are insane | Cole Wyeth | 10d | 41
1 | 40 | ARC Evals: Responsible Scaling Policies | Zach Stein-Perlman | 2y | 10
1 | 20 | How far along Metr's law can AI start automating or helping with alignment research? Q | Christopher King | 1mo | 21
1 | 20 | Improved visualizations of METR Time Horizons paper. | LDJ | 1mo | 4
1 | 16 | METR: AI models can be dangerous before public deployment | UnofficialLinkpostBot | 2mo | 0
1 | 13 | METR’s preliminary evaluation of o3 and o4-mini | Christopher King | 4d | 2
1 | 5 | METR is hiring ML Research Engineers and Scientists | Xodarap | 10mo | 0