I’m grateful to Bogdan Cirstea, Konstantin Pilz and Raphaël S for providing feedback on this post.
This post tries to clarify the concept of situational awareness, in particular with respect to current large language models.
What is situational awareness?
I'm not writing anything new here, just summarizing prior work.
(It’s worth noting that the usage of the term here is different from what’s usually meant by situational awareness in humans.)
Ajeya Cotra introduced the term situational awareness in the context of AI safety, and Richard Ngo et al. recently elaborated on it. Situational awareness describes the degree to which an AI system understands its environment and its own state and behavior, in particular when that understanding causes specific behavior (such as deceptive alignment).