What's an accidental optimizer?
Imagine a pool of tenure-track professors. They all want tenure, and they all have some random strategies. If we look at the ones who do get tenure, they happen to have strategies that are good for getting tenure: for instance, strategies for producing lots of citations for their publications, or for collaborating with well-known professors in the field. So if we select for professors who are tenured now, we get a set of professors who are good at accumulating citations or at collaborating with well-established professors. This can be true even if the professors aren't explicitly trying to maximize citations or their chances of collaborating with well-known professors.
This is accidental optimizer selection: agents are selected on some criterion. For instance, professors are selected on their tenure success, college graduates on their job-market success, etc. When we look at the selected agents, they happen to have strategies that are good for succeeding on that criterion, even if succeeding on it is not their explicit goal. The same phenomenon shows up in AI training, successful CEOs, straight-A college students, etc.
In this post, I will discuss what accidental optimizers are, how to spot one in the wild, and why it matters to identify them.
Accidental optimizers vs. non-accidental optimizers
Accidental optimizers can be separated from non-accidental optimizers by answering two questions: does the agent intend to optimize, and does the agent in fact optimize? Not all agents who intend to optimize will in fact optimize; conversely, not all agents who in fact optimize deliberately intend to. The answers to the two questions give us a 2-by-2 matrix:
- Intends to optimize and in fact optimizes: a deliberate optimizer.
- Intends to optimize but does not in fact optimize: an agent whose optimization attempt fails.
- Does not intend to optimize but in fact optimizes: an accidental optimizer.
- Neither intends to optimize nor in fact optimizes: not an optimizer at all.
Let's break down these scenarios with specific examples in the sections below.
Why does it matter?
Why do we care about accidental optimizers? We might, for example, be looking for a CEO to learn from or to hire for a new company; if a CEO accidentally stumbled into the right strategies, those strategies are less likely to generalize to the new context.
Deliberately optimizing a criterion is more likely to produce generalizable strategies than accidentally optimizing it as a side effect. Humans, for example, were selected for genetic fitness but do not pursue it deliberately: "No human being with the deliberate goal of maximizing their alleles' inclusive genetic fitness would ever eat a cookie unless they were starving." To produce generalizable strategies, we prefer deliberate optimizers, who have a clear intention to optimize something and know precisely which strategies optimize it.
Accordingly, we want to be able to separate accidental optimizers from deliberate optimizers, in order to identify strategies that generalize.
How to spot an accidental optimizer?
Imagine that we have a whole bunch of AIs with different random parameters, all playing the video game Fortnite during training. If we look at the ones that survive the training process, then just as with natural selection, they happen to have a set of strategies for winning the game. Now let's have those surviving AIs play a different video game, for instance World of Warcraft. Some of them might fail to survive. When that happens, we argue that they accidentally landed on a strategy that was adaptive within the Fortnite ruleset but failed to generalize. They are "accidental optimizers", not "deliberate optimizers".
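The selection effect above can be sketched in a toy simulation. This is not from the original post; the "games", scoring rule, and thresholds are all made-up assumptions, where each game rewards a hidden direction in strategy space:

```python
import random

random.seed(0)
N_AGENTS, DIM, N_SURVIVORS = 2000, 50, 100

# Each "game" rewards a hidden direction in strategy space; an agent's
# score in a game is the dot product of its strategy vector with that direction.
game_a = [random.gauss(0, 1) for _ in range(DIM)]  # stands in for Fortnite
game_b = [random.gauss(0, 1) for _ in range(DIM)]  # stands in for World of Warcraft

def score(strategy, game):
    return sum(s * g for s, g in zip(strategy, game))

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

# A pool of agents with random strategies: none of them "intends" anything.
agents = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_AGENTS)]

# Select the agents that "survive" game A (the top scorers).
survivors = sorted(agents, key=lambda a: score(a, game_a), reverse=True)[:N_SURVIVORS]

pool_a = mean(score(a, game_a) for a in agents)     # near 0: random strategies
surv_a = mean(score(a, game_a) for a in survivors)  # clearly positive: selected on A
surv_b = mean(score(a, game_b) for a in survivors)  # typically far below surv_a

print(f"pool on A: {pool_a:.2f}, survivors on A: {surv_a:.2f}, survivors on B: {surv_b:.2f}")
```

The survivors look like skilled optimizers of game A, but the selection tells us little about game B; measuring their performance on B is exactly the generalization test described here.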
Another example: suppose we have a pool of CEOs who have managed one or more companies successfully. We select a set of CEOs who have (or happen to have) strategies for succeeding at their own company. To tell whether those CEOs are accidental or deliberate optimizers, we can look at whether they also succeed at managing other companies, ideally in different contexts: different sizes, different industries, etc. If a CEO succeeded at managing company A but failed at companies B and C, they are likely an accidental optimizer: they happened to have strategies for managing company A successfully, but those strategies failed to generalize.
Summary
Not all agents who in fact optimize intended to optimize in the first place. Just because an agent behaves as if it is optimizing for something does not mean it is explicitly trying to optimize for that thing. This post suggests calling such agents accidental optimizers. Accidental optimizers show up in many systems: AI training, professors getting tenure, successful-CEO selection, etc. We expect accidental optimizers' strategies not to generalize well. Therefore, we can look for evidence of generalization to test whether an optimizer is an accidental one or not.
Thank you Miranda Dixon-Luinenburg / lesswrong for editing help!