I almost missed that there are new thoughts here; I thought this was a rehash of your previous post, The AI Apocalypse Myth!
The new bit sounds similar to Elon Musk's curious-AI plan. I think this means it has a similar problem: humans are complex and offer a bounty of data to learn from, but as the adage goes, "all happy families are alike; each unhappy family is unhappy in its own way." A curiosity-first/learning-first AI might make many discoveries about happy humans while it is building up power, and then start putting humans into a greater number of awful but novel and "interesting" situations once it no longer needs humanity to survive.
That said, this is only a problem if the AI is unlikely to be empathetic/compassionate, which, if I'm not mistaken, is one of the main things we disagree on. I think that instead of trying to find these technical workarounds, you should argue for the much more interesting (and important!) position that AIs are likely to be empathetic and compassionate by default.
If you do want to be more persuasive with these workarounds, can I suggest adopting more of a security mindset? You appear to be looking for ways in which things could possibly go right, instead of all the ways things could go wrong. Put another way, you don't appear to be modeling the doomer mindset very well, so you can't "put on your doomer hat" and check whether doomers would find your proposal persuasive. Understanding a different viewpoint in depth is a big ask, but I think you'd find more success that way.
Thoughts on the different sub-questions, from someone who doesn't work professionally in AI safety:
I think the 1st argument proves too much: I don't think we usually expect simulations to never work unless proven otherwise. Or maybe I'm misunderstanding your point? I agree with Vaughn's assessment of the downvotes; maybe more specific arguments would help clarify your position (to pull something out of my posterior: "quantization of neuron excitation levels destroys the chaotic cascades necessary for intelligence. Also, chaos is necessary for intelligence because...").
To keep things brief: a human intelligence explosion seems to require open-brain surgery to rearrange neurons, which seems a lot more complicated than flipping bits in RAM.
Interesting, so maybe a more important crux between us is whether AI would have empathy for humans. You seem much more positive about AI cooperating with humanity past the point where AI no longer needs humanity.
Some thoughts:
I'm going to summarize what I understand to be your train of thought; let me know if you disagree with my characterization, or if I've missed a crucial step:
I think other comments have addressed the 1st point. To throw in yet another analogy: Uber needs human drivers to make money today, but that dependence didn't stop it from trying to develop driverless cars (nor did it stop any of the drivers from driving for Uber!).
With regard to robotics progress: in your other post you seem to accept intelligence amplification as possible. Do you think robotics progress would not benefit from smarter researchers? Or, what do you think is fundamentally missing from robotics, given that we can already set up fully automated lights-out factories? If it's about fine-grained control, do you think the articles found with a "robot hand egg" web search indicate that substantial progress is much further away than really powerful AI? (Especially if, say, 10% of the world's thinking power is devoted to the problem?)
My thinking is that robotics is not mysterious: I suspect there are plenty of practical problems and engineering challenges to overcome in scaling to a fully automated supply chain, but we understand, say, kinematics far more completely than we understand how to interpret the inner workings of a neural network.
(You also note that you've assumed a multi-polar AI world, which I think only works as a deterrent as long as killing humans would also destroy the AIs. If the AIs all agree that it is possible to survive without humans, then there's much less reason to prevent a human genocide.)
On second thought, we may disagree only on a question of time scale. Setting up an automated supply chain takes time, but even if it takes a long 30 years, at some point it is no longer necessary to keep humans around (whether for a singleton AI or an AI society). Then what?
Hello kgldeshapriya, welcome to LessWrong!
At first I thought the OTP chips would be locked to a single program, which would make the scheme infeasible since programs need to be updated regularly. But it sounds like the OTP chip either sits on the control plane above the CPU/GPU or physically passes CPU signals through itself, so it can either kill power to the motherboard or completely sever CPU processing. I'll assume one of these schemes is how you'd use the OTP chips.
I agree with JBlack that LW probably already has details on why this wouldn't work, but I'll freehand some problems below:
In summary, I think blowing the programming fuses on a control chip raises the bar for successful attacks a bit, but it does not secure the robotics control system to the point that we can consider any AI advances "safe".