LESSWRONG

AlexMeinke

Posts

Frontier Models are Capable of In-context Scheming (196 karma, 6d, 22 comments)

Training AI agents to solve hard problems could lead to Scheming (61 karma, 23d, 12 comments)

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs (105 karma, 5mo, 28 comments)

Apollo Research 1-year update (93 karma, 6mo, 0 comments)

A starter guide for evals (50 karma, 1y, 2 comments)

Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize (45 karma, 1y, 4 comments)
