Mikita Balesni

Comments

I think one practical difference is whether filtering pre-training data to exclude cases of scheming is a useful intervention.