A few quick comments on the same theme as, but mostly unrelated to, the exchange so far:
At some point I recall thinking to myself "huh, LessWrong is really having a surge of good content lately". Then I introspected and realized that about 80% of that feeling was just that you've been posting a lot.
"Please don't roll your own crypto" is a good message to send to software engineers looking to build robust products. But it's a bad message to send to the community of crypto researchers, because insofar as they believe you, then you won't get new crypto algorithms from them.
In the context of metaethics, LW seems much more analogous to the "community of crypto researchers" than to the "software engineers looking to build robust products". Therefore this seems like a bad message to send to LessWrong, even if it's a good message to send to e.g. CEOs who justify immoral behavior with metaethical nihilism.
FWIW, in case this is helpful, my impression is that:
This has pretty low argumentative/persuasive force in my mind.
Note that my comment was not optimized for argumentative force about the overarching point. Rather, you asked how they "can" still benefit the world, so I was trying to give a central example.
In the second half of this comment I'll give a couple more central examples of how virtues can allow people to avoid the traps you named. You shouldn't consider these to be optimized for argumentative force either, because they'll seem ad-hoc to you. However, they might still be useful as datapoints.
Figuring out how to describe the underlying phenomenon I'm pointing at in a compelling, non-ad-hoc way is one of my main research focuses. The best I can do right now is to say that many of the ways in which people produce outcomes which are harmful (by their own lights) seem to arise from a handful of underlying dynamics. I call this phenomenon pessimization. One way in which I'm currently thinking about virtues is as a set of cognitive tools for preventing pessimization. As one example, kindness and forgiveness help to prevent cycles of escalating conflict with others, which is a major mechanism by which people's values get pessimized. This one is pretty obvious to most people; let me sketch out some less obvious mechanisms below.
what if someone isn't smart enough to come up with a new line of illegible research, but does see some legible problem with an existing approach that they can contribute to? What would cause them to avoid this?
This actually happened to me: when I graduated from my master's I wasn't cognitively capable of coming up with new lines of illegible alignment research, in part because I was too status-seeking. Instead I went to work at DeepMind, and ended up spending a lot of my time working on RLHF, which is a pretty central example of a "legible" line of research.
However, I also wasn't cognitively capable of making much progress on RLHF, because I couldn't see how it addressed the core alignment problem, and so it didn't seem fundamental enough to maintain my interest. Instead I spent most of my time trying to understand the alignment problem philosophically (resulting in this sequence) at the expense of my promotion prospects.
In this case I think I had the virtue of deep curiosity, which steered my attention towards illegible problems even though my top-down plan was to contribute to alignment by doing RLHF research. These days, whatever you might think of my research, few people complain that it's too legible.
There are other possible versions of me who had that deep curiosity but weren't smart enough to have generated a research agenda like my current one; however, I think they would still have left DeepMind, or at least not been very productive on RLHF.
And even the hypothetical virtuous person who starts doing illegible research on their own, what happens when other people catch up to them and the problem becomes legible to leaders/policymakers? How would they know to stop working on that problem and switch to another problem that is still illegible?
When a field becomes crowded, there's a pretty obvious inference that you can make more progress by moving to a less crowded field. I think people often don't draw that inference because moving to a less crowded field loses them prestige, is emotionally/financially risky, etc. Virtues help remove those blockers.
though I think you don't need to invoke Knightian uncertainty. I think it's enough to simply model there being a very large attack surface combined with a more intelligent adversary.
One of the problems I'm pointing to is that you don't know what the attack surface is. This puts you in a pretty different situation than if you have a known large attack surface to defend, even against a smarter adversary (e.g. the whole length of a border; or every possible sequence of Go moves).
Separately, I may be being a bit sloppy by using "Knightian uncertainty" as a broad handle for cases where you have important "unknown unknowns", i.e. where you don't even know what ontology to use. But it feels close enough that I'm by default planning to continue describing the research project outlined above as trying to develop a theory of Knightian uncertainty in which Bayesian uncertainty is a special case.
Fair point. Let me be more precise here.
Both the market for lemons in econ and adverse selection in trading are simple models of adversarial dynamics. I would call these non-central examples of paranoia insofar as you know the variable about which your adversary is hiding information (the quality of the car and the price the stock should be, respectively). This makes them too simple to get at the heart of the phenomenon.
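To make that concrete, here's a minimal sketch of the textbook lemons setup (standard Akerlof-style numbers chosen purely for illustration, not taken from the discussion above): car quality $q$ is uniform on $[0,1]$; the seller knows $q$ and values the car at $q$; the buyer values it at $\tfrac{3}{2}q$ but only knows the distribution. At any price $p$, only sellers with $q \le p$ are willing to sell, so the buyer's expected value of a car on offer is

$$\mathbb{E}\!\left[\tfrac{3}{2}q \,\middle|\, q \le p\right] \;=\; \tfrac{3}{2}\cdot\tfrac{p}{2} \;=\; \tfrac{3p}{4} \;<\; p,$$

and the market unravels. The thing to notice is that the buyer knows exactly which variable is hidden and what its prior distribution is; the uncertainty is entirely a "known unknown", which is why I'd call this a non-central example of paranoia.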
I think Habryka is gesturing at something similar in his paragraph starting "All that said, in reality, navigating a lemon market isn't too hard." And I take him to be pointing at a more central characterization of paranoia in his subsequent description: "What do you do in a world in which there are not only sketchy used car salesmen, but also sketchy used car inspectors, and sketchy used car inspector rating agencies, or more generally, competent adversaries who will try to predict whatever method you will use to orient to the world, and aim to subvert it for their own aims?"
This is similar to my criticism of maximin as a model of paranoia: "It's not actually paranoid in a Knightian way, because what if your adversary does something that you didn't even think of?"
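For reference, the standard decision-theoretic statement of maximin (a generic formulation, not anything specific to the discussion above) is

$$a^* \;=\; \arg\max_{a \in A}\; \min_{s \in S}\; u(a, s),$$

where $A$ is your set of actions, $S$ is your set of states (including whatever adversary moves you've enumerated), and $u$ is your utility function. The guarantee is only as strong as the $S$ you wrote down: an adversary who acts outside your enumeration simply isn't covered, which is the sense in which maximin isn't paranoid in a Knightian way.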
Here's a gesture at making this more precise: what makes something a central example of paranoia, in my mind, is that even your knowledge of how your adversary is being adversarial has itself been adversarially optimized. Thus chess is not a central example of paranoia (except insofar as your opponent has been spying on your preparations, say), and even markets for lemons aren't a central example (except insofar as buyers weren't even tracking that dishonesty was a strategy sellers might use, a dynamic which is notably not captured by the economic model).
The people I instinctively checked after reading this: