Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller
In this paper, we aim to control model behaviors. For example, by saying "You hear someone making fun of a topic you're passionate about", we can steer an LLM to respond in an angrier manner. We can also control "any" behavior of an LLM by simply defining a...