Academic Papers
• Applied to 'Chat with impactful research & evaluations' (Unjournal NotebookLMs) by david reinstein 2mo ago
• Applied to Searching for Impossibility Results or No-Go Theorems for provable safety. by Maelstrom 2mo ago
• Applied to Secret Collusion: Will We Know When to Unplug AI? by schroederdewitt 2mo ago
• Applied to Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs by Owain_Evans 4mo ago
• Applied to How Big a Deal are MatMul-Free Transformers? by JustisMills 5mo ago
• Applied to Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller by Henry Cai 5mo ago
• Applied to Evidence of Learned Look-Ahead in a Chess-Playing Neural Network by Erik Jenner 6mo ago
• Applied to Rawls's Veil of Ignorance Doesn't Make Any Sense by Arjun Panickssery 9mo ago
• Applied to Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search by Arjun Panickssery 9mo ago
• Applied to How to Control an LLM's Behavior (why my P(DOOM) went down) by RogerDearnaley 11mo ago
• Applied to Striking Implications for Learning Theory, Interpretability — and Safety? by RogerDearnaley 11mo ago
• Applied to VLM-RM: Specifying Rewards with Natural Language by ChengCheng 1y ago
• Applied to Paper digestion: "May We Have Your Attention Please? Human-Rights NGOs and the Problem of Global Communication" by Klara Helene Nielsen 1y ago
• Applied to Papers, Please #1: Various Papers on Employment, Wages and Productivity by Kaj_Sotala 2y ago
• Applied to A technical note on bilinear layers for interpretability by RobertM 2y ago
• Applied to Attributes of successful professors by electroswing 2y ago
• Applied to An Overview of Sparks of Artificial General Intelligence: Early experiments with GPT-4 by Annapurna 2y ago