LESSWRONG
Situational Awareness
Applied to Revising Stages-Oversight Reveals Greater Situational Awareness in LLMs by Raemon, 4d ago
Applied to Do models know when they are being evaluated? by Joe Needham, 1mo ago
Applied to Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data by Sohaib Imran, 4mo ago
Applied to Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by Raemon, 6mo ago
Applied to Owain Evans on Situational Awareness and Out-of-Context Reasoning in LLMs by Michaël Trazzi, 7mo ago
Applied to Investigating the Ability of LLMs to Recognize Their Own Writing by Nina Panickssery, 8mo ago
Applied to Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs by Owain_Evans, 8mo ago
Applied to LLM Evaluators Recognize and Favor Their Own Generations by Arjun Panickssery, 11mo ago
Applied to My intellectual journey to (dis)solve the hard problem of consciousness by Charbel-Raphaël, 1y ago
Applied to Perceptual Blindspots: How to Increase Self-Awareness by Declan Molony, 1y ago
Applied to Facts vs Interpretations by Declan Molony, 1y ago
Applied to Revealing Intentionality In Language Models Through AdaVAE Guided Sampling by RiversHaveWings, 1y ago
Applied to Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs” by Raemon, 1y ago
Applied to Paper: On measuring situational awareness in LLMs by Owain_Evans, 2y ago
Applied to The Zeroth Skillset by Alan E Dunne, 2y ago
Applied to The intelligence-sentience orthogonality thesis by Ben Smith, 2y ago
Mateusz Bagiński, v1.10.0, Jun 16th 2023 GMT (+23/-23)
Mateusz Bagiński, v1.9.0, Jun 16th 2023 GMT (+37/-37)