Goal-Directedness
• Applied to Don't want Goodhart? — Specify the variables more by YanLyutnev 15h ago
• Applied to Don't want Goodhart? — Specify the damn variables by Ян Лютнев 15h ago
• Applied to [Interim research report] Evaluating the Goal-Directedness of Language Models by Rauno Arike 4mo ago
• Applied to A "Bitter Lesson" Approach to Aligning AGI and ASI by RogerDearnaley 5mo ago
• Applied to Emotional issues often have an immediate payoff by Chipmonk 5mo ago
• Applied to Measuring Coherence and Goal-Directedness in RL Policies by dx26 7mo ago
• Applied to Understanding mesa-optimization using toy models by tilmanr 8mo ago
• Applied to Measuring Coherence of Policies in Toy Environments by dx26 8mo ago
• Applied to Refinement of Active Inference agency ontology by Roman Leventov 1y ago
• Applied to Quick thoughts on the implications of multi-agent views of mind on AI takeover by Kaj_Sotala 1y ago
• Applied to Towards an Ethics Calculator for Use by an AGI by sweenesm 1y ago
• Applied to “Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”) by RobertM 1y ago
• Applied to FAQ: What the heck is goal agnosticism? by RobertM 1y ago