[CS2881r] Optimizing Prompts with Reinforcement Learning
This work was done as an experiment for Boaz Barak’s “CS 2881r: AI Safety and Alignment” at Harvard. The lecture where this work was presented can be viewed on YouTube here, and its corresponding blogpost can be found here.

Background

Prompt engineering has become a central idea in working with...