The fun thing is that the distribution of wages earned can be absolutely identical in both years and yet produce wildly different results for personal wage changes. For example:
In year 1, A earns $1/hr, B $2, C $3, D $4, and E $5.
In year 2, A earns $2/hr, B $3, C $4, D $5, and E $1.
A, B, C, and D personally all increased their income by substantial amounts and may vote accordingly. E lost a lot more than any of the others gained, but doesn't get more votes because of that. 80% of voters saw their income increase. What's more, this process can repeat endlessly.
If in year 2, A instead earns $5/hr, B $1, C $2, D $3, and E $4 then 80% of voters will be rather unhappy at the change despite the income distribution still being identical.
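To put the same arithmetic in one place, here's a minimal Python sketch using exactly the numbers above. It just checks that two year-2 assignments with the same distribution produce opposite shares of personal winners:

```python
# Two year-2 assignments with the *same* wage distribution produce
# opposite shares of individual winners and losers.

year1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

# Rotation one way: everyone but E gets a raise.
year2_up = {"A": 2, "B": 3, "C": 4, "D": 5, "E": 1}
# Rotation the other way: everyone but A takes a cut.
year2_down = {"A": 5, "B": 1, "C": 2, "D": 3, "E": 4}

def share_with_raise(before, after):
    """Fraction of people whose personal wage went up."""
    winners = sum(after[p] > before[p] for p in before)
    return winners / len(before)

for label, year2 in [("up-rotation", year2_up), ("down-rotation", year2_down)]:
    # The wage distribution is identical in both scenarios...
    assert sorted(year2.values()) == sorted(year1.values())
    # ...but the share of people who personally gained is very different.
    print(label, f"{share_with_raise(year1, year2):.0%} saw a raise")

# up-rotation 80% saw a raise
# down-rotation 20% saw a raise
```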
In the rain forecaster example, it appears that the agent ("you") is more of an expert on Alice's calibration than Alice is. Is this intended?
In practice, a lot of property is transferred into family trusts, and appointed family members exercise decision making over those assets according to the rules of that trust. A 100% death tax would simply ensure that essentially all property is managed in this manner for the adequately wealthy, and only impact families too disadvantaged to use this sort of structure. If you don't personally own anything of note at the time of your death, your taxes will be minimal.
You would also need a 100% gift tax, essentially prohibiting all gifts between private citizens. You bought your child something, or (worse!) gave them money to buy it themselves? That's clearly an attempt to get around inheritance tax and must be prevented.
There are also huge numbers of private businesses, for which this sort of tax would be nothing but an enormous micromanaging nationalization scheme with predictable disastrous results.
This does not work, at all.
I think one argument is that optimizing for IGF basically gives humans two jobs: survive, and have kids.
Animal skulls are evidence that the "survive" part can be difficult. We've nailed that one, though. Very few humans in developed countries die before reaching an age suitable for having kids. I doubt that there are any other animal species that come close to us in that metric. Almost all of us have "don't die" ingrained pretty deeply.
It's looking like we are moving toward failing pretty heavily on the second "have kids" job though, and you would think that would be the easier one.
So if there's a 50% failure rate on preserving outer optimizer values within the inner optimizer, that's actually pretty terrible.
It doesn't completely avoid the problem of priors, just the problem of arbitrarily fixing a specific type of update rule on fixed priors such as in Solomonoff induction. You can't afford this if you're a bounded agent, and a Solomonoff inductor can only get away with it since it has not just unbounded resources but actually infinite computational power in any given time period.
A bounded agent needs to be able to evaluate alternative priors, update rules, and heuristics in addition to the evidence and predictions themselves, or it won't even approximate bounded rationality. While this is a more complicated scenario than the Solomonoff updater in some senses, it is philosophically simpler since we can view it more like a "bootstrap" process and ask what sort of bootstrapping might "generally" do well rather than taking anything as fixed.
I suspect that heuristics that score highly involve "universal" but finite systems (such as Turing machines or other mathematical structures capable of representing their own rules), and a "simple and not too costly" evaluation heuristic (not just simplicity).
There would be "degenerate" distributions of universe rules that would be exceptions, so there is still a problem of describing what sort of distributions I'm thinking of as being "degenerate", and naturally this whole sort of statement is too vague to prove any such thing even if such proofs weren't famously difficult (and plausibly impossible to prove even if not false).
One thing that seems worth exploring from a conceptual point of view is doing away with priors altogether, and working more directly with metrics such as "what are the highest expected-value actions a bounded agent can take given the evidence so far". I suspect that from this point of view it doesn't much matter whether you use a computational basis such as Turing machines, something more abstract, or even something more concrete such as the energy required to assemble and run a predicting machine.
From a computational point of view, the simplest models are not always the cheapest to actually run and compare against other models, so in that respect this differs from Solomonoff induction (which assumes infinite computing power). However, my expectation is that there would usually be some model only a bit more complex than the simplest that makes similar predictions at lower cost. A heuristic that examines models in simplest-first order (but discards ones that look expensive to evaluate) may well end up close to optimal in trading off prediction accuracy against whatever costs there are in evaluating multiple models. There are exponentially fewer simple models to evaluate.
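To make the "simplest-first but cost-aware" idea concrete, here's a toy Python sketch. The candidate models, complexity scores, and cost estimates are all invented for illustration; this is just the shape of the heuristic, not a real induction scheme:

```python
# Toy sketch: walk candidate models in simplest-first order, skip any whose
# estimated evaluation cost exceeds a budget, and keep the simplest one that
# fits the evidence so far.

from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

@dataclass
class Model:
    name: str
    complexity: int                   # stand-in for description length
    eval_cost: float                  # rough estimate of the cost to run it
    predict: Callable[[int], float]   # prediction at time step t

def pick_model(models: Sequence[Model], evidence: List[float],
               cost_budget: float, tolerance: float = 1e-9) -> Optional[Model]:
    """Return the simplest affordable model consistent with the evidence."""
    for m in sorted(models, key=lambda m: m.complexity):  # simplest first
        if m.eval_cost > cost_budget:                      # looks too expensive
            continue
        errors = [abs(m.predict(t) - x) for t, x in enumerate(evidence)]
        if max(errors) <= tolerance:                       # fits what we've seen
            return m
    return None

models = [
    Model("constant",    complexity=1, eval_cost=1.0, predict=lambda t: 3.0),
    Model("brute-force", complexity=2, eval_cost=1e6, predict=lambda t: 2.0 * t + 1.0),
    Model("linear",      complexity=3, eval_cost=1.0, predict=lambda t: 2.0 * t + 1.0),
]

best = pick_model(models, evidence=[1.0, 3.0, 5.0], cost_budget=100.0)
print(best.name if best else "no affordable model fits")  # -> "linear"
```

Here "brute-force" is the slightly simpler model that would fit the evidence but is discarded on cost, and "linear" is the marginally more complex model that makes the same predictions cheaply, which is the trade-off described above.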
What makes you think that we're not at year(TAI)-3 right now? I'll agree that we might not be there yet, but you seem to be assuming that we can't be.
How do you propose that reasonable actors prevent reality from being fragile and dangerous?
Cyber attacks are generally based on poor protocols. Over time smart reasonable people can convince less smart reasonable people to follow better ones. Can reasonable people convince reality to follow better protocols?
As soon as you get into proposing solutions to this sort of problem, they start to look a lot less reasonable by current standards.
No, nobody has a logical solution to that (though there have been many claimed solutions). It is almost certainly not true.
Why not both?
Human design will determine the course of AGI development, and if we do the right things then whether it goes well is fully and completely up to us. Naturally at the moment we don't know what the right things are or even how to find them.
If we don't do the right things (as seems likely), then the kinds of AGI which survive will be the kind which evolve to survive. That's still largely up to us at first, but increasingly less up to us.