Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents
Summary

* Scaffolded LLM agents are, in principle, able to execute arbitrary code to achieve the goals they have been set. One such goal could be self-improvement.
* This post outlines our plans to build a benchmark to measure the ability of LLM agents to modify and improve other LLM...
Jul 22, 2024