
Answer by Chipmonk, Mar 03, 2025

60

https://x.com/sama/status/1621621724507938816

19

[ Question ]

Examples of self-fulfilling prophecies in AI alignment?

3rd Mar 2025

In the spirit of the post "Self-fulfilling misalignment data might be poisoning our AI models," what are historical examples of self-fulfilling prophecies that have affected AI alignment and development?

I've put a few potential examples below to seed discussion.

Self Fulfilling/Refuting Prophecies, AI

3 Answers, sorted by top scoring

Mar 03, 2025

60

https://x.com/sama/status/1621621724507938816

Mar 03, 2025

60

Training on Documents About Reward Hacking Induces Reward Hacking

Mar 03, 2025

40

Situational Awareness and race dynamics? h/t Jan Kulveit @Jan_Kulveit
