Plausibly, yes. But so does programming capability, which is actually a bigger deal. (And it's unclear that a traditionally envisioned intelligence explosion is possible with systems built on LLMs, though I'm certainly not convinced by that argument.)
But "creating safety guarantees into which you can plug in AI capabilities once they arise anyway" is the point, and it requires at least some non-trivial advances in AI capabilities.
You should probably read the current programme thesis.
It is speculative in the sense that any new technology being developed is speculative - but closely related approaches are already used for assurance in practice, so provable safety isn't actually just speculative; there are concrete benefits in the near term. And I would challenge you to name a different and less speculative framework that actually deals with any of the issues of ASI risk and isn't pure hopium.
Uncharitably, but I think not entirely inaccurately, these include: "maybe AI can't be that much smarter than humans anyways," "let's get everyone to stop forever," "we'll use AI to figure it out, even though we have no real ideas," "we'll just trust that no-one makes it agentic," "the agents will be able to be supervised by other AI which will magically be easier to align," "maybe multiple AIs will compete in ways that aren't a disaster," "maybe we can just rely on prosaic approaches forever and nothing bad happens," "maybe it will be better than humans at having massive amounts of unchecked power by default." These all certainly seem to rely far more on speculative claims, with far fewer concrete ideas about how to validate or ensure them.
It is critical for guaranteed safe AI and many non-prosaic alignment agendas. I agree it has risks, since all AI capabilities and advances pose control risks, but it seems better than most types of general capabilities investments.
Do you have a more specific model of why it might be negative?
I don't think it was betrayal; I think it was skipping verbal steps, which left intent unclear.
If A had said, "I promised to do X - is it OK now if I do Y instead?", there would presumably have been no confusion. Instead, they announced their plan before doing Y, leaving the permission request implicit. The point that "she needed A to acknowledge that he'd unilaterally changed an agreement" was critical to B, but I suspect A thought that stating the new plan did that implicitly.
Strongly agree that there needs to be an institutional home. My biggest problem is that there is still no such new home!
You should also read the relevant sequence about dissolving the problem of free will: https://www.lesswrong.com/s/p3TndjYbdYaiWwm9x
You believe that something inert cannot be doing computation. I agree. But you seem to think it's coherent that a system with no action - a post-hoc mapping of states - can be.
The place where comprehension of Chinese exists in the "Chinese room" is the creation of the mapping - the mapping itself is a static object, and the person in the room by assumption is doing no cognitive work, just looking up entries. "But wait!" we can object, "this means that the Chinese room doesn't understand Chinese!" And I think that's the point of confusion - repeating answers that someone else tells you isn't the same as understanding. The fact that the "someone else" wrote down the answers changes nothing. The question is where and when the computation occurred.
In our scenarios, there are a couple of different computations - but the creation of the mapping unfairly sneaks in the conclusion that the execution of the computation, which is required to build the mapping, isn't what creates consciousness!
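To make the distinction concrete, here's a minimal sketch (my own illustration, not something from the original exchange; the function and variable names are hypothetical): all of the computation happens when the lookup table is built, and answering from the table afterwards involves no cognitive work at all.

```python
# Minimal sketch: the computation runs when the table is built, not when it is used.

def understand_and_reply(question: str) -> str:
    """Stand-in for the actual cognitive work (purely illustrative)."""
    return f"a considered reply to: {question}"

# Building the mapping requires executing the computation on every input.
questions = ["How are you?", "What is rain?", "Do you understand Chinese?"]
lookup_table = {q: understand_and_reply(q) for q in questions}

def room(question: str) -> str:
    # The "person in the room" does no cognitive work here - just a lookup.
    return lookup_table[question]

print(room("Do you understand Chinese?"))
```

The question of where the comprehension lives is then the question of when `understand_and_reply` actually ran - at table-construction time, not at lookup time.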
Good point. The problem I have with that is that in every listed example, the mapping either requires the execution of the conscious mind and a readout of its output and process in order to build it, or it stipulates that it is well enough understood that it can be mapped to an arbitrary process, thereby implicitly also requiring that it was run elsewhere.
I don't understand. You shouldn't get any changes from changing the encoding if it produces the same proteins - the difference for mirror life is that it would also mirror the proteins themselves, etc.
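As a minimal illustration of what I mean by "changing the encoding" (my own sketch, using a small hand-picked subset of the standard codon table): synonymous codon substitutions leave the translated protein identical, whereas mirror life changes the chirality of the proteins themselves.

```python
# Two different DNA "encodings" that translate to the same protein.
CODON_TABLE = {  # small subset of the standard genetic code
    "ATG": "M",  # Met
    "GGT": "G", "GGC": "G",  # Gly (synonymous codons)
    "TGC": "C", "TGT": "C",  # Cys (synonymous codons)
}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence, three bases at a time."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

seq_a = "ATGGGTTGC"  # Met-Gly-Cys, one choice of codons
seq_b = "ATGGGCTGT"  # Met-Gly-Cys, different (synonymous) codons

assert translate(seq_a) == translate(seq_b) == "MGC"
```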