One of the subskills mentioned in Eliezer's Security Mindset post is mitigating assumption risk–that is, the risk of losing utility because some of your assumptions are wrong. There are two main ways to do this:
- Gain more information about whether your assumptions hold
- Make the assumption irrelevant (as in the password-hashing example)
Here are a bunch more examples:
- Repeating back what someone said in your own words, to check understanding
- Adding a margin of safety when estimating how much load a bridge can bear
- Using statistical models that make fewer assumptions, or have fatter tails (see the sketch just after this list)
- Exposing your work to attack in low-risk situations, such as comedians testing new material in small clubs, or Netflix's Chaos Monkey
- Emphasizing fast adaptation to unexpected circumstances over better forecasting
- Putting spare capacity into steps of your process that aren't the bottleneck
- Testing code frequently while refactoring, to check that functionality doesn't unintentionally change
- Doing an analysis in different ways on different datasets, and only trusting them when the conclusions match
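To make the "fatter tails" item a little more concrete, here is a minimal Python sketch. The data-generating mixture, the choice of a Student-t alternative, and the threshold of 10 are all my own illustrative assumptions, not anything from a particular source:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Mostly well-behaved data, with an occasional large surprise
# (a 95/5 mixture of a narrow and a wide normal; an arbitrary choice).
data = np.where(rng.random(1000) < 0.95,
                rng.normal(0, 1, 1000),
                rng.normal(0, 10, 1000))

normal_fit = stats.norm.fit(data)   # returns (mean, std)
t_fit = stats.t.fit(data)           # returns (df, loc, scale)

# How much probability does each fitted model leave for another extreme value?
x = 10.0
print("P(X > 10) under fitted normal:", stats.norm.sf(x, *normal_fit))
print("P(X > 10) under fitted t:     ", stats.t.sf(x, *t_fit))
# The thin-tailed normal fit typically assigns far less probability to a
# value this extreme than the fat-tailed t fit does, so it is the model that
# gets badly surprised when the "data is roughly normal" assumption fails.
```

The point is not that the t distribution is the "right" model, just that it loses less when the thin-tail assumption turns out to be wrong.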
"Statistical models with fewer assumptions" is a tricky one, because the conditions under which your inferences work is not identical to the conditions you assume when deriving your inferences.
I mostly have in mind a historical controversy in the mathematical study of evolution. Joseph Felsenstein introduced maximum likelihood methods for inferring phylogenetic trees. He assumed a probabilistic model for how DNA sequences change over time, and from that he derived maximum likelihood estimates of phylogenetic trees of species based on their DNA sequences.
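As a toy version of that move (assume a probabilistic model of sequence change, then derive a maximum likelihood estimate from it), here is the simplest case I know: the Jukes-Cantor model for a single pair of aligned sequences, where the maximum likelihood estimate of the evolutionary distance has a closed form. Felsenstein's actual method estimates whole trees, so this two-sequence sketch is only my simplified illustration:

```python
import math

def jukes_cantor_ml_distance(seq1, seq2):
    """Maximum likelihood estimate of the evolutionary distance (expected
    substitutions per site) between two aligned DNA sequences under the
    Jukes-Cantor model, which assumes all substitutions are equally likely
    and occur at a constant rate."""
    assert len(seq1) == len(seq2)
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    if p >= 0.75:
        return math.inf  # sequences look saturated; the MLE diverges
    return -0.75 * math.log(1 - 4 * p / 3)

# Two made-up sequences differing at 2 of 12 sites:
print(jukes_cantor_ml_distance("ACGTACGTACGT", "ACGTACGAACTT"))  # about 0.19
```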
Felsenstein's maximum likelihood method was an alternative to another method, the "maximum parsimony" method. The maximum parsimony tree is the tree that requires you to assume the fewest possible sequence changes when explaining the data.
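Concretely, the number of changes a given tree requires for one site can be counted with Fitch's algorithm, and the maximum parsimony tree is the one that minimizes this count summed over all sites. The tiny four-species example here is made up for illustration:

```python
def fitch_score(tree, states):
    """Minimum number of character changes needed to explain one site on a
    given tree (Fitch's algorithm). `tree` is a nested tuple of leaf names,
    `states` maps each leaf name to its observed character."""
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):      # leaf
            return {states[node]}
        left = state_set(node[0])
        right = state_set(node[1])
        if left & right:               # children can agree: no change needed
            return left & right
        changes += 1                   # children must disagree: one change
        return left | right

    state_set(tree)
    return changes

# A site where species A and B share one base while C and D share another:
site = {"A": "T", "B": "T", "C": "G", "D": "G"}
print(fitch_score((("A", "B"), ("C", "D")), site))  # 1 change needed
print(fitch_score((("A", "C"), ("B", "D")), site))  # 2 changes needed
```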
Some people criticized Felsenstein's maximum likelihood method, since it assumed a statistical model, whereas the maximum parsimony method did not. Felsenstein's response was to exhibit a phylogenetic tree and a model of sequence change under which maximum parsimony fails. Specifically, he exhibited a tree connecting four species such that, when you randomly generate DNA sequences using this tree and the specified probability model for sequence change, maximum parsimony gives the wrong result. With short sequences it may give the right result by chance, but as the sequences get longer, maximum parsimony will, with probability 1, converge on the wrong tree. In statistical terms, maximum parsimony is inconsistent: it fails in the infinite-data limit, at least when that is the data-generating process.
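Here is a rough simulation sketch in the spirit of that counterexample. I use a two-state character model rather than four-state DNA, and the branch-change probabilities are my own illustrative choices rather than Felsenstein's original parameters; the point is just to watch parsimony get pulled toward the tree that groups the two long branches:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_sites(n_sites, p_long=0.4, p_short=0.05):
    """Generate binary characters on the true tree ((A,B),(C,D)), where A and C
    sit on long branches (change probability p_long) and B, D and the internal
    branch are short (change probability p_short)."""
    flip = lambda state, p: state ^ (rng.random(n_sites) < p)
    node_ab = rng.integers(2, size=n_sites)   # internal node joining A and B
    node_cd = flip(node_ab, p_short)          # internal node joining C and D
    return (flip(node_ab, p_long), flip(node_ab, p_short),
            flip(node_cd, p_long), flip(node_cd, p_short))

def parsimony_winner(A, B, C, D):
    """With four species and binary characters, only three site patterns are
    parsimony-informative, and each supports one of the three possible unrooted
    trees; the maximum parsimony tree is the one supported by the most sites."""
    support = {
        "((A,B),(C,D))": np.sum((A == B) & (C == D) & (A != C)),  # true tree
        "((A,C),(B,D))": np.sum((A == C) & (B == D) & (A != B)),
        "((A,D),(B,C))": np.sum((A == D) & (B == C) & (A != B)),
    }
    return max(support, key=support.get)

for n in (100, 1000, 10000, 100000):
    print(n, parsimony_winner(*simulate_sites(n)))
# With these branch lengths, longer sequences make parsimony settle ever more
# firmly on ((A,C),(B,D)), the tree that groups the two long branches,
# instead of the true tree ((A,B),(C,D)).
```

Shrinking the long branches (say p_long = 0.1 with the same p_short) should put you back in a regime where parsimony does converge on the true tree, which is exactly the "which data-generating processes does this method handle" question discussed below.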
What does this mean for the criticism that maximum likelihood makes assumptions? Well, it's true that maximum likelihood works when the data-generating process matches our assumptions, and may not work otherwise. But maximum parsimony also works for a limited set of data-generating processes. Can users of maximum parsimony, then, be accused of making the assumption that the data-generating process is one on which maximum parsimony is consistent?
The field of phylogenetic inference has since become very simulation-heavy: researchers assume a data-generating process, simulate data from it, and test the output of maximum likelihood, maximum parsimony, and other methods. The concern, therefore, is not so much with how many assumptions a statistical method makes as with the range of data-generating processes on which it gives correct results.
This is an important distinction because, while we know the maximum likelihood method works when its assumptions are true, it may also work when they are false. We have to use theory and simulations to explore the set of data-generating processes on which it is effective, just as we do with "assumption-free" methods like maximum parsimony.
For more info, some of this story can be found in Felsenstein's book "Inferring Phylogenies", which also contains references to many of the original papers.