LLM Personas — LessWrong
This page is a stub.
Posts tagged LLM Personas
Most Relevant
Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment, by Cam, Puria, Kyle O’Brien, David Africa, Samuel Ratnam, andyk (6mo, Ω). Karma: 201. Comments: 25. Tag relevance: 10.
Simulators, by janus (4y, Ω). Karma: 713. Comments: 170. Tag relevance: 8.
the void, by nostalgebraist (1y, Ω). Karma: 426. Comments: 108. Tag relevance: 8.
Constitutional AI Alignment, by RogerDearnaley (1mo). Karma: 27. Comments: 9. Tag relevance: 7.
Experimental Evidence for Simulator Theory— Part 1: Emergent Misalignment and Weird Generalizations, by RogerDearnaley (3mo). Karma: 25. Comments: 0. Tag relevance: 7.
Experimental Evidence for Simulator Theory— Part 2: The Scalers Strike Back, by RogerDearnaley (3mo). Karma: 21. Comments: 0. Tag relevance: 7.
The Rise of Parasitic AI, by Adele Lopez (9mo). Karma: 769. Comments: 191. Tag relevance: 6.
A Three-Layer Model of LLM Psychology, by Jan_Kulveit (1y, Ω). Karma: 266. Comments: 17. Tag relevance: 6.
Persona Parasitology, by Raymond Douglas (4mo). Karma: 177. Comments: 38. Tag relevance: 6.
Pretraining on Aligned AI Data Dramatically Reduces Misalignment—Even After Post-Training, by RogerDearnaley (5mo, Ω). Karma: 106. Comments: 12. Tag relevance: 6.
Shaping the exploration of the motivation-space matters for AI safety, by Maxime Riché, Victor Gillioz, nielsrolf, Kajetan Dymkiewicz, Filip Sondej, RogerDearnaley, Daniel Tan, dillonkn (4mo). Karma: 83. Comments: 15. Tag relevance: 6.
A Case for Model Persona Research, by nielsrolf, Maxime Riché, Daniel Tan (6mo). Karma: 121. Comments: 11. Tag relevance: 5.
The Bleeding Mind, by Adele Lopez (6mo, Ω). Karma: 68. Comments: 9. Tag relevance: 4.
Selection Pressures on LM Personas, by Raymond Douglas (1y, Ω). Karma: 40. Comments: 0. Tag relevance: 4.
Concrete research ideas on AI personas, by nielsrolf, Maxime Riché, Daniel Tan (5mo). Karma: 69. Comments: 10. Tag relevance: 3.