Model Spec Midtraining: Improving How Alignment Training Generalizes — LessWrong