Maximilian Kaufmann

188

Paper: LLMs trained on “A is B” fail to learn “B is A”

This post is a copy of the introduction of this paper on the Reversal Curse. Authors: Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans. Abstract We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained...

Sep 23, 2023 · 125

Paper: On measuring situational awareness in LLMs

This post is a copy of the introduction of this paper on situational awareness in LLMs. Authors: Lukas Berglund, Asa Cooper Stickland, Mikita Balesni, Max Kaufmann, Meg Tong, Tomasz Korbak, Daniel Kokotajlo, Owain Evans. Abstract We aim to better understand the emergence of situational awareness in large language models (LLMs)....

Sep 4, 2023 · 109
