alenoach's Shortform
Jul 25, 20252
Rationale It is hard to ensure that a powerful AI model gives good answers just because it believes these answers are true. Deceptive models could give the expected answers for whatever instrumental reasons. And sycophants may be telling the truth only when they expect it to be approved. Being sincere...