Academic Papers — LessWrong
Academic Papers
Edited by Kaj_Sotala, last updated 9th Jul 2020
Posts either linking to, or summarizing, formal papers published elsewhere.
Posts tagged Academic Papers (most relevant first)
8
206
Some AI research areas and their relevance to existential safety
Ω
Andrew_Critch
5y
Ω
37
6
102
Pretraining on Aligned AI Data Dramatically Reduces Misalignment—Even After Post-Training
Ω
RogerDearnaley
21d
Ω
12
6
66
How to Control an LLM's Behavior (why my P(DOOM) went down)
Ω
RogerDearnaley
2y
Ω
30
6
35
The Best Way to Align an LLM: Is Inner Alignment Now a Solved Problem?
Ω
RogerDearnaley
8mo
Ω
34
5
37
Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley
2y
4
4
115
Thirty-three randomly selected bioethics papers
Rob Bensinger, gwern
5y
46
3
50
My Reservations about Discovering Latent Knowledge (Burns, Ye, et al)
Robert_AIZI
3y
0
2
178
Scientific breakthroughs of the year
technicalities
2mo
13
2
168
2021 AI Alignment Literature Review and Charity Comparison
Ω
Larks
4y
Ω
28
2
159
Comment on "Endogenous Epistemic Factionalization"
Zack_M_Davis
6y
8
2
121
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network
Ω
Erik Jenner
2y
Ω
14
2
109
Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs
L Rudolf L, bilalchughtai, Jan Betley, kaivu, Jérémy Scheurer, Mikita Balesni, AlexMeinke, Owain_Evans, Marius Hobbhahn
2y
40
2
106
Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models"
Zack_M_Davis
6y
10
2
90
Fallacies as weak Bayesian evidence
Kaj_Sotala
14y
42
2
86
I Was Not Almost Wrong But I Was Almost Right: Close-Call Counterfactuals and Bias
Kaj_Sotala
14y
40