GPT-3.5 judges can supervise GPT-4o debaters in capability-asymmetric debates
Introduction

The AI safety debate agenda, proposed by Irving et al. (2018), explores using debates between AI agents to ensure truthful answers from advanced systems. Recently, three key debate settings have been studied with LLMs:

Information-asymmetric debates: Debaters have access to information unavailable to the judge. This hidden information...