METR (org) — LessWrong

METR (org)

Edited by Ruby, last updated 1st Jul 2024

Formerly ARC Evals

Posts tagged METR (org)

2

100

METR's Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo

8mo

9

2

97

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

7mo

43

2

21

AXRP Episode 47 - David Rein on METR Time Horizons

1mo

0

2

10

Review of METR’s public evaluation protocol

2y

0

1

242

METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman

10mo

106

1

153

ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks

3y

12

1

145

METR's Evaluation of GPT-5

GradientDissenter

6mo

15

1

108

Clarifying METR's Auditing Role

2y

1

1

90

Introducing METR's Autonomy Evaluation Resources

Megan Kinniment, Beth Barnes

2y

0

1

70

Interpreting the METR Time Horizons Post

10mo

13

1

67

Reactions to METR task length paper are insane

10mo

43

1

65

METR is hiring!

2y

1

1

64

CoT May Be Highly Informative Despite “Unfaithfulness” [METR]

GradientDissenter

6mo

3

1

40

ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman

2y

10

1

40

Is METR Underestimating LLM Time Horizons?

andreasrobinson

1mo

6
