Research Agendas
• Applied to Agency overhang as a proxy for Sharp left turn by Eris 1mo ago
• Applied to Seeking Collaborators by Steven Byrnes 1mo ago
• Applied to Self-prediction acts as an emergent regularizer by Cameron Berg 2mo ago
• Applied to NAO Updates, Fall 2024 by ProgramCrafter 2mo ago
• Applied to Towards the Operationalization of Philosophy & Wisdom by Thane Ruthenis 2mo ago
• Applied to [Linkpost] Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication) by Fernando Avalos 3mo ago
• Applied to Why Academia is Mostly Not Truth-Seeking by Zero Contradictions 5mo ago
• Applied to What and Why: Developmental Interpretability of Reinforcement Learning by Ruby 5mo ago
• Applied to Labor Participation is a High-Priority AI Alignment Risk by alex 6mo ago
• Applied to What should I do? (long term plan about starting an AI lab) by not_a_cat 6mo ago
• Applied to What should AI safety be trying to achieve? by EuanMcLean 7mo ago
• Applied to Announcing Human-aligned AI Summer School by Jan_Kulveit 7mo ago
• Applied to EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024 by scasper 7mo ago
• Applied to The Prop-room and Stage Cognitive Architecture by Robert Kralisch 8mo ago
• Applied to Speedrun ruiner research idea by lemonhope 8mo ago
• Applied to Constructability: Plainly-coded AGIs may be feasible in the near future by Charbel-Raphaël 8mo ago
• Applied to Sparsify: A mechanistic interpretability research agenda by Marius Hobbhahn 8mo ago
• Applied to Gradient Descent on the Human Brain by Jozdien 9mo ago