Alignment Tax
Applied to "Safety-capabilities tradeoff dials are inevitable in AGI" by Steven Byrnes, 17d ago
Applied to "Alignment can be the ‘clean energy’ of AI" by Cameron Berg, 2mo ago
Dakara, v1.4.0, Dec 30th 2024 GMT (+3/-3) 1
Dakara, v1.3.0, Dec 30th 2024 GMT (+17/-12) 1
Applied to "AI safety tax dynamics" by owencb, 6mo ago
Applied to "Safety tax functions" by owencb, 6mo ago
Applied to "The case for a negative alignment tax" by Cameron Berg, 7mo ago
Applied to "How difficult is AI Alignment?" by Sammy Martin, 8mo ago
Applied to "Labor Participation is a High-Priority AI Alignment Risk" by alex, 11mo ago
Applied to "Ten Levels of AI Alignment Difficulty" by Sammy Martin, 2y ago
Applied to "The case for removing alignment and ML research from the training dataset" by beren, 2y ago
markov, v1.2.0, Apr 28th 2023 GMT (+1315/-258) 1
Applied to "Against ubiquitous alignment taxes" by beren, 2y ago
Applied to "[Linkpost] Jan Leike on three kinds of alignment taxes" by Orpheus16, 2y ago
Applied to "On the Importance of Open Sourcing Reward Models" by elandgre, 2y ago
Applied to "The commercial incentive to intentionally train AI to deceive us" by Derek M. Jones, 2y ago
Applied to "Security Mindset and the Logistic Success Curve" by Multicore, 2y ago
markov, v1.1.0, Oct 24th 2022 GMT (+953) 0
markov, v1.0.0, Oct 24th 2022 GMT 0