I'm pretty suspicious that actually useful context length is currently longer than a few hundred thousand tokens.
Not currently, but this is a kind of brute-force scaling roadmap for one of the major remaining unhobblings, so it has timeline implications. On last year's hardware, it's not really feasible to go that far anyway, and RLVR is only just waking up. So the first public observations of negative results on this will probably come in 2026, if the actually useful context length fails to improve. And then there's 2028-2029, following up on the 147 TB of Rubin Ultra NVL576 (Nvidia's roadmap places it in 2027, which means datacenters with it in 2028, possibly along with models trained for it using older hardware, and then models trained on it in 2029).
But also, for the purpose of automated adaptation to a source of tasks and feedback (such as a job), it doesn't necessarily need as much fidelity; it only needs to work as well as a human who read some book a year ago, retaining the mental skills but not the words. A context in principle gives the words, but that is not the thing that needs to work.
An ASI aligned to a group of people likely should dedicate sovereign slivers of compute (optimization domains) to each of those people, and those people might do well to manage their domains with their own ASIs, aligned to each of them separately. Optimization doesn't imply a uniform pureed soup; it's also possible to optimize autonomy, coordination, and interaction without mixing them up.
An ASI perfectly aligned to me must literally be a smarter version of myself.
Values judge what should be done, but also what you personally should be doing. An ASI value-aligned to you will be doing the things that should be done (according to you, on reflection), but you wouldn't necessarily endorse that you personally should be doing those things. Like, I want the world to be saved, but I don't necessarily want to be in a position where I need to try to save the world personally.
So an ASI perfectly aligned to you might help uplift you into a smarter version of yourself as one of its top priorities, and then go on to do various other things you'd approve of on reflection. But you wouldn't necessarily endorse that it's the smarter version of yourself that is doing those other things; you merely endorse that they get done.
A salient and arguably valid non-extinction meaning of "existential threat" is "the firm will go bust". So currently many companies are considering whether failing to use AI could be existential. Economic and national security implications make this meaning applicable to countries, so governments are considering whether failing to do well in the AI race could be existential.
It would not be stable. The most vicious actors are incentivized to tell their AGI "hide and self improve and take over at any cost so I can have my preferred future" before anyone else does it.
When there are superintelligences, the situation will plausibly be stable, because all intelligent activity will be happening under the management of superintelligent governance. So there might be a point well below superintelligence when a sufficient level of coordination is established, and small groups of AGIs are unable to do anything of consequence (such as launching a mission to another star). Possibly not at human level, even with all the AI advantages, but as AGIs get stronger and stronger (regardless of their alignment), they might get there before superintelligence. (Not a safe thing for humanity, of course. But stable.)
This kind of consideration (with positively valued computations as well) could be the basis for an economy of mostly sovereign slivers of compute under the control of various individuals or groups, hosted within larger superintelligences. A superintelligent host might price their compute according to the value of what a tenant computes with it, as evaluated by the host. The tenants then have a choice between picking computations more valuable to the host and moving to a different host. (This is about managing tenants, rather than the host seeking out more optimal computations in general: a way of setting up laws for the tenants that keep their value to the host within bounds, without strongly interfering with their autonomy.)
Dario Amodei suggests that in-context learning might suffice for continual learning. The way LLMs do in-context learning with long context is disanalogous to anything humans can do, but a context window of 15M tokens is 500 days of 30K tokens per day, which is more than enough to progress from "first day on the job" to knowing what you are doing with this particular source of tasks. It needs to work mostly with text (if it works at all), or 15M tokens won't be enough, but that could be sufficient.
So this might just be about moving from RAG to including more free-form observations that were historically made by the model itself for the same source of tasks, with massively more tokens of context. The current memory features of chatbots, in the form of long text files, might with sufficient scale become the real thing rather than remaining a dead-end crutch, once these text files get into the habit of accumulating megabytes of observations. And RLVR can plausibly teach the models how to make good use of these very long contexts.
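To make the intended shape of this concrete, here is a minimal sketch, assuming a generic text-generation client (client.generate is a hypothetical stand-in, not any real SDK call) and a plain text file as the accumulating memory:

```python
from pathlib import Path

MEMORY_FILE = Path("job_memory.txt")   # hypothetical per-job memory file

def run_task(client, task: str) -> str:
    """One step of the loop: answer a task with the whole memory file in context,
    then append the model's own free-form observations to that file."""
    memory = MEMORY_FILE.read_text() if MEMORY_FILE.exists() else ""
    # The entire accumulated history rides along in the (very long) context window,
    # instead of being filtered down to a few snippets by RAG-style retrieval.
    prompt = f"{memory}\n\n# New task\n{task}"
    answer = client.generate(prompt)   # stand-in for any text-completion call
    # Ask for free-form observations about this source of tasks, written by the
    # model itself, so the file keeps growing toward megabytes over months of use.
    notes = client.generate(prompt + f"\n\n# Draft answer\n{answer}\n\n"
                            "Write down reusable observations about this source of tasks.")
    with MEMORY_FILE.open("a") as f:
        f.write(f"\n# Task\n{task}\n# Observations\n{notes}\n")
    return answer
```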
With this year's 14 TB of HBM per GB200 NVL72, very long context windows become more feasible (than with the ~1 TB of HBM per node that most current models still run on), and then there's the next step in 2028 with Rubin Ultra NVL576 systems that have 147 TB of HBM.
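As a rough back-of-envelope check of how a 15M-token context relates to those HBM capacities (the dense-transformer shape below is an assumption for illustration, not any particular released model):

```python
# Back-of-envelope arithmetic; the model shape is assumed purely for illustration.

def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bytes_per_value: int = 2) -> int:
    """Approximate KV cache size for one sequence in a dense transformer with
    grouped-query attention: 2 (K and V) * layers * kv_heads * head_dim bytes per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

context = 500 * 30_000            # 500 days at 30K tokens/day = 15M tokens
cache = kv_cache_bytes(context, layers=80, kv_heads=8, head_dim=128)  # hypothetical shape
print(f"{context:,} tokens -> ~{cache / 1e12:.1f} TB of KV cache")    # ~4.9 TB

# Compare against the HBM capacities mentioned above (per inference system).
for name, hbm_tb in [("~1 TB node", 1), ("GB200 NVL72 (14 TB)", 14),
                     ("Rubin Ultra NVL576 (147 TB)", 147)]:
    print(f"{name}: one such context uses {cache / (hbm_tb * 1e12):.0%} of HBM")
```

Under these assumptions a single 15M-token KV cache overflows a ~1 TB node, takes a large bite out of a GB200 NVL72, and becomes a small fraction of a Rubin Ultra NVL576.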
Minimal alignment is a necessary premise; I'm not saying humanity's salience as a philanthropic cause is universally compelling to AIs. There are a number of observations that make this case stronger: the language prior in LLMs, preference training for chatbots, the possibility that the first AGIs need nothing fundamentally different from this, and an AGI-driven Pause on superintelligence increasing the chances that the eventual superintelligences in charge are strongly value-aligned with these first AGIs. Then, in addition to the premise of a minimally aligned superintelligence, there's the essentially arbitrarily small cost of a permanently disempowered future of humanity.
So the overall argument indeed doesn't work without humanity actually being sufficiently salient to the values of superintelligences that are likely to end up in charge, and the argument from low cost only helps up to a point.
Yes, the future of humanity being a good place to live (within its resource constraints) follows from it being cheap for superintelligence to ensure (given that it's decided to let it exist at all), while the constraint of permanent disempowerment (at some level significantly below all of the cosmic endowment) is a result of not placing the future of humanity at the level of the superintelligence's own interests. Maybe there's 2% for actually capturing a significant part of the cosmic endowment (the eutopia outcomes), and 20% for extinction. I'm not giving s-risks much credence, but maybe they still get 1% when broadly construed (any kind of warping of the future of humanity that's meaningfully at odds with what humanity and even individual humans would've wanted to happen on reflection, given the resource constraints to work within).
I should also clarify that by "making it harmless" I simply mean the future of humanity being unable to actually do any harm in the end, perhaps through lacking direct access to the physical level of the world. The point is to avoid negative externalities for the hosting superintelligence, so that the necessary sliver of compute stays within budget. This doesn't imply any sinister cognitive changes that make the future of humanity incapable of considering the idea or working in that direction.
The text you quoted is about what happens within the resources already allocated to the future of humanity (for whatever reasons): the overhead of turning those resources into an enduring good place to live, and keeping the world at large safe from humanity's foibles, so that it doesn't end up more costly than just those resources. Plausibly there is no meaningful spatial segregation of where the future of humanity computes (or otherwise exists); it's just another aspect of what is happening throughout the reachable universe, within its share of compute.
a tiny sliver of light enough for earth would correspond to a couple of dollars of the wealth of a contemporary billionaire
Many issues exist in reference classes where solving all instances is not affordable to billionaires, governments, or medieval kingdoms. And there is enough philanthropy that the analogy doesn't by itself seem compelling, given that humanity as a whole (rather than particular groups or individuals within it) is a sufficiently salient thing in the world, and the cost of preserving it actually is quite affordable this time, especially using the cheapest possible options, which still only need a modest tax (in terms of matter/compute) to additionally get the benefits of superintelligent governance.
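As a purely illustrative sanity check of the proportions in that analogy (all the numbers below are rough assumptions of mine, not figures from the discussion):

```python
import math

# Purely illustrative assumptions, not figures from the discussion above.
billionaire_wealth = 100e9        # assume a $100B fortune
sliver_cost = 2.0                 # "a couple of dollars"
print(f"fraction of the fortune: {sliver_cost / billionaire_wealth:.1e}")   # 2.0e-11

# Earth's "sliver of light": the fraction of the Sun's output that Earth intercepts,
# i.e. Earth's cross-section divided by the area of a sphere of radius one AU.
earth_radius = 6.371e6            # meters
au = 1.496e11                     # meters
earth_share_of_sunlight = (math.pi * earth_radius**2) / (4 * math.pi * au**2)
print(f"fraction of the Sun's light reaching Earth: {earth_share_of_sunlight:.1e}")  # ~4.5e-10

# The Sun is also just one star among an astronomically large number in the
# reachable universe, so Earth's sliver of the cosmic endowment is, relative to
# the whole, far smaller than a couple of dollars is to a billionaire's wealth.
```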
superintelligent infrastructure should highly likely by default get in the way of humanity's existence at some point
Yes, the intent to preserve the future of humanity needs to crystallize soon enough that there is still something left. The cheapest option might be to digitize everyone and either upload them or physically reconstruct them when more convenient (because immediately ramping up the industrial explosion starting on Earth is valuable for capturing the cosmic endowment that's running away due to the accelerating expansion of the universe, so that you irrevocably lose a galaxy in expectation with every few years of delay). But again, in the quoted text "superintelligent infrastructure" refers to whatever specifically keeps the future of humanity in good shape (as well as making it harmless), rather than to the rest of the colonized cosmic endowment doing other things.
Superintelligent governance serves as an anchor for the argument about mere AGIs that I'm putting forward. I'm not distinguishing a singleton vs. many coordinated ASIs; that shouldn't matter for the effectiveness of managing their tenants. The stable situation is one where every Earth-originating intelligent entity in the universe must be either one of the governing superintelligences or a tenant of one, and a tenant can't do anything too disruptive without their host's permission. So it's like countries and residency, but with total surveillance of anything potentially relevant, and in principle without the bad things that would go along with total surveillance under a human government. Not getting the bad things seems likely for the factory-farming disanalogy reasons: tenants are not instrumentally useful anyway, so there is no point in doing things that would, in particular, end up having bad side effects for them.
So the argument is that you don't necessarily need superintelligence to make this happen; it could also work with sufficiently capable AGIs as the hosts. Even if merely human-level AGIs are insufficient, there might be some intermediate level well below superintelligence that's sufficient. Then a single AGI can't actually achieve anything in secret or self-improve to ASI, because there is no unsupervised hardware for it to run on, and on supervised hardware it'd be found out and prevented from doing that.