AlexMeinke
Posts
Sorted by New
Karma | Title | Age | Comments
112 | Ablations for “Frontier Models are Capable of In-context Scheming” | 3mo | 1
203 | Frontier Models are Capable of In-context Scheming (Ω) | 3mo | 24
61 | Training AI agents to solve hard problems could lead to Scheming (Ω) | 4mo | 12
109 | Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs | 8mo | 37
93 | Apollo Research 1-year update (Ω) | 10mo | 0
51 | A starter guide for evals (Ω) | 1y | 2
45 | Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize (Ω) | 1y | 4