Prover-Estimator Debate: A New Scalable Oversight Protocol
Linkpost to arXiv: https://arxiv.org/abs/2506.13609

Summary: We present a scalable oversight protocol where honesty is incentivized at equilibrium. Prior debate protocols allowed a dishonest AI to force an honest AI opponent to solve a computationally intractable problem in order to win. In contrast, prover-estimator debate incentivizes honest equilibrium behavior, even when...