I tend to explain wages in terms of ease of replacement. Companies will only pay a lot for something they can’t get more cheaply. If AI makes it possible for more people to code, then coders are easier to replace and wages should go down. For entry-level jobs this effect is clear, but for senior positions it depends on how easily an employee can get these productivity boosts from AI. Right now there’s a spectrum where almost anyone can make a simple website with AI, but beyond that people start to get filtered out. I expect the distribution to skew increasingly toward high-skill programmers until eventually everyone is filtered out and the AI can do the job alone.
In the short term, I don’t expect the wages of high-skill employees to change that much. On the one hand, if companies can rely on a few high-skill workers, that saves money on hiring, training, and other logistics. Companies will pay higher wages if it means saving a lot of money in other areas. On the other hand, if there are fewer roles, there’s more competition for them, which drives wages down. These pressures act in opposite directions, so they approximately cancel out unless one turns out to be much stronger.
I think it’s an important caveat that this is meant for early AGI with human-expert-level capabilities, which means we can detect misalignment as it manifests in small-scale problems. When capabilities are weak, the difference between alignment and alignment-faking is less relevant because the model’s options are more limited. But once we scale to more capable systems, the difference becomes critical.
Whether this approach helps in the long term depends on how much the model internalizes the corrections, as opposed to just updating its in-distribution behavior. It’s possible that the behavior we observe is a poor indicator of the model’s internal state, in which case we would only be improving how convincingly the model acts aligned, not fixing the underlying misalignment. This comes down to how much overlap there is between visible misalignment and total misalignment. If most of the misalignment stays invisible until late, then this approach is less helpful in the long term.
As you mention, the three examples here work regardless of whether SSA or SIA is true, because none of the estimated outcomes affects the total number of observers. But the Doomsday Argument is different and does depend on SSA. Under SIA, the early population of a long world is just as likely to exist as the total population of a short world, so there’s no update upon finding yourself in an early-seeming world.
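To make the cancellation concrete, here is a toy calculation with made-up numbers (mine, purely illustrative): a 50/50 prior over a short world with 100 observers and a long world with 10,000, where “early” means being among the first 100.

```python
# Toy Doomsday setup (illustrative numbers, not from the original discussion):
# two equally likely worlds, a "short" one with 100 observers total and a
# "long" one with 10,000, where "early" means being among the first 100.

prior = {"short": 0.5, "long": 0.5}
observers = {"short": 100, "long": 10_000}
EARLY = 100  # the observation: I am among the first 100 observers

# P(I am early | world), treating me as a random observer within that world.
p_early = {w: min(EARLY, observers[w]) / observers[w] for w in prior}

# SSA: update the bare 50/50 prior on the observation that I am early.
ssa_unnorm = {w: prior[w] * p_early[w] for w in prior}
ssa = {w: v / sum(ssa_unnorm.values()) for w, v in ssa_unnorm.items()}

# SIA: first weight each world by how many observers it contains
# (more observers means more chances for "me" to exist at all), then update.
sia_unnorm = {w: prior[w] * observers[w] * p_early[w] for w in prior}
sia = {w: v / sum(sia_unnorm.values()) for w, v in sia_unnorm.items()}

print("SSA posterior:", ssa)  # ~0.99 on the short world: the doomsday shift
print("SIA posterior:", sia)  # exactly 50/50: the update cancels
```

The key step is the SIA weighting by `observers[w]`: the long world gets 100x the prior weight, which exactly offsets the 1-in-100 chance of being early there.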
A total utilitarian observing from outside both worlds cares just as much about the early population of a long world as about the total population of a short world, so the expected value of the two reference classes is the same. This suggests to me that if I care about myself, I should be indifferent between the possibility that I’m early in a long world and the possibility that I’m in a short-lived world. Of course, if my decisions in one world affect more people, then I should adjust my actions accordingly.
The universal doomsday argument still relies on SSA, because under SIA I’m equally surprised to exist as an early person whether most civilizations are short or long. If most civilizations are long, I’m surprised to be early. If most civilizations are short, I’m surprised to exist at all, since I could have been any of the late people who never came to exist. In other words, under SIA the surprise of existing as an early person is the same in both cases, so there’s no update. Only under SSA am I certain that I exist but unsure where in the universe I am.
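The same toy arithmetic carries over to the universal version (again, the numbers are mine and purely illustrative): suppose every civilization has 100 early observers, long civilizations grow to 10,000 observers, and the competing hypotheses are that 1 in 10 versus 9 in 10 civilizations are long.

```python
# Toy universal-doomsday numbers (illustrative, not from the original discussion):
# every civilization has 100 early observers; long civilizations reach 10,000.
# Hypothesis "mostly_short": 1 long civ per 10. Hypothesis "mostly_long": 9 per 10.

prior = {"mostly_short": 0.5, "mostly_long": 0.5}
long_civs = {"mostly_short": 1, "mostly_long": 9}   # out of 10 civilizations
EARLY, TOTAL_LONG, TOTAL_SHORT = 100, 10_000, 100

total_obs = {h: long_civs[h] * TOTAL_LONG + (10 - long_civs[h]) * TOTAL_SHORT
             for h in prior}
early_obs = {h: 10 * EARLY for h in prior}  # the same under both hypotheses

# P(I am early | hypothesis), treating me as a random observer in that hypothesis.
p_early = {h: early_obs[h] / total_obs[h] for h in prior}

# SSA: update the bare prior on the observation that I am early.
ssa_unnorm = {h: prior[h] * p_early[h] for h in prior}
ssa = {h: v / sum(ssa_unnorm.values()) for h, v in ssa_unnorm.items()}

# SIA: first weight each hypothesis by how many observers it contains, then update.
sia_unnorm = {h: prior[h] * total_obs[h] * p_early[h] for h in prior}
sia = {h: v / sum(sia_unnorm.values()) for h, v in sia_unnorm.items()}

print("SSA:", ssa)  # ~0.89 on "mostly short": the universal doomsday update
print("SIA:", sia)  # back to 50/50: the extra possible late observers cancel it
```

Being early carries a large SSA update toward “mostly short”, but under SIA the “mostly long” hypothesis starts with proportionally more weight because it contains far more possible observers, and the two effects cancel exactly.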