LESSWRONG
mrinank_sharma
Posts
78 · Best-of-N Jailbreaking · Ω · 2mo · Ω 5
66 · Towards Understanding Sycophancy in Language Models · Ω · 1y · Ω 0
70 · Paper: Understanding and Controlling a Maze-Solving Policy Network · Ω · 1y · Ω 0
Wiki Contributions
Comments