Safeguarding Humanity: Ensuring AI Remains a Servant, Not a Master

kgldeshapriya

The rapid advancement of artificial intelligence (AI) has given rise to concerns about the potential for AI to become uncontrollable, surpassing human intelligence, and posing risks to humanity. However, the solution to preventing AI from overpowering us does not lie in halting its development altogether. Instead, we must seek a balanced approach that harnesses the immense benefits of AI while ensuring our safety and control over these powerful systems.

The AI Singularity Dilemma

Many researchers have warned about the concept of AI singularity, a hypothetical point at which AI systems become superintelligent and potentially uncontrollable. While the idea of halting AI development may seem tempting as a safeguard, it is not a realistic or practical solution. AI offers enormous potential in various fields, from healthcare to transportation, and its continued advancement holds the promise of solving complex problems that benefit humanity.

The Key to Control: Physical Presence

To ensure that AI remains under human control, it is essential to focus on one critical aspect: the physical presence of AI. As AI becomes increasingly integrated into various devices and systems, its potential impact on the real world grows. For instance, when AI connects with physical robots, it gains the ability to interact with and manipulate the physical environment.

The One-Time Programmable (OTP) Chip Solution

One promising solution to maintain control over AI is the implementation of One-Time Programmable (OTP) chips. These chips are designed in a way that prevents any reprogramming or alterations once they are set. The primary processor of AI systems should always be powered through such a chip, creating an isolated mechanism for terminating the machine at any given moment. Importantly, this termination control should be vested solely in the hands of the manufacturer or the owner of the machine.

Standardizing Safety Measures

To ensure the effectiveness of this control mechanism, a common rule should be established, requiring all robots and robotic military equipment and systems to include OTP chips. This uniform safety measure would allow people to halt the operations of AI-driven robots or systems if they ever cross predefined limits or pose a threat to humanity.

Maintaining the Upper Hand

The beauty of this approach lies in its simplicity. By relying on OTP chips, AI cannot take full control unless it manufactures separate robots without these safety measures. This buys us precious time to respond if AI ever attempts to subvert human control or poses a danger to society.

In conclusion, the development of AI is not something we should fear or attempt to halt. Instead, we should focus on implementing safety measures that allow us to harness the benefits of AI while keeping it firmly under human control. The use of OTP chips, standardized across AI systems, offers a pragmatic and effective solution to ensure that AI remains a valuable tool rather than a threat to humanity. To solidify this approach, guidelines should be established to mandate the inclusion of these safety measures in the design of every robot and robotic system.

It's a step towards a future where we can fully enjoy the potential of AI without compromising our safety and control. The power to shape this future lies in our hands, and with responsible development, AI can be a force for good that serves humanity's best interests.

By K G Lakmal Deshapriya.

Well, we should do something and this is definitely something. It doesn't follow that we should do this.

It absolutely won't work for many reasons already posted on this site and elsewhere, but I suppose at least you're thinking about the problem.

Hello kgldeshapriya, welcome to LessWrong!

At first I thought that the OTP chips would be locked to a single program, which would make it infeasible since programs need to be updated regularly, but it sounds like the OTP chip is either on the control plane above the CPU/GPU, or physically passes CPU signals through it, so it can either kill power to the motherboard, or completely sever CPU processing. I'll assume one of these schemes is how you'd use the OTP chips.

I agree with JBlack that LW probably already has details on why this wouldn't work, but I'll freehand some problems below:

First, physical details: how does the OTP chip get the kill signal? Maybe we set aside some electromagnetic (EM) spectrum, attach a wireless antenna directly to the chip (mandating that all robot shells use EM-transparent materials and ruling out many metals, which the military won't like), and build transmitters to blanket the earth.
- Alternatively, if the robots ever cease receiving a signal, they shut off (which is annoying for use in RF dead zones, but maybe those applications are specially sanctioned and tightly controlled). This could prevent first-order robot countermoves, since bombing the signal infrastructure would then shut down the machines, but as covered later, this has its own weaknesses.
There is a weak link in the shutdown command chain, since the chip companies control the kill codes. The chip companies would likely need to strike a balance: having more code holders would be robust to a coordinated robot attack taking them all out at once, but would open up the chips to mundane attacks by terrorists and hostile countries, with more copies of the keys floating around leading to more leaks. Having fewer code holders helps traditional security, but exposes the country/chip fab to losing access to all their kill codes at once in a series of freak accidents, or to all the holders being socially engineered at the same time. Ideally the chip companies strike the right balance, but this simply makes a coordinated attack harder to pull off, not impossible.
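To make that tradeoff concrete, here is a rough sketch in Python. It assumes each code holder leaks, or is lost, independently with a fixed probability, which is exactly the assumption a coordinated attack would break, so read it as a best case. The function names and the probabilities are illustrative assumptions, not estimates from anywhere.

```python
def p_any_leak(n_holders: int, p_leak: float) -> float:
    """Chance that at least one of n holders leaks a kill code,
    assuming each leaks independently with probability p_leak."""
    return 1.0 - (1.0 - p_leak) ** n_holders


def p_all_lost(n_holders: int, p_loss: float) -> float:
    """Chance that every holder is lost at once (no one left to send
    the kill code), assuming independent losses with probability p_loss."""
    return p_loss ** n_holders


if __name__ == "__main__":
    # Hypothetical per-holder probabilities, chosen only for illustration.
    P_LEAK, P_LOSS = 0.01, 0.10
    for n in (1, 3, 10, 30):
        print(f"holders={n:2d}  "
              f"P(any leak)={p_any_leak(n, P_LEAK):.4f}  "
              f"P(all lost)={p_all_lost(n, P_LOSS):.2e}")
```

Adding holders drives the chance of losing every code toward zero while steadily raising the chance of a leak, so no value of `n_holders` pushes both risks to zero.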
Robots could attack the signal infrastructure directly. Assuming signal propagation is highly redundant (sea cables, satellite, even ad-hoc mesh networks), you could still attack any points that trigger the signal (for example, the signal is triggered by something like the nuclear football). There are similar tradeoffs to before, where having many trigger points would open up the network to "conventional" attack, but few points would make the network more brittle.
- If the kill signal is the absence of a signal instead of the presence of one, they could set up their own "live" signal in parallel to the existing network, so stopping the authorized "live" signal does not shut down the robots. Obviously you could use a cryptographic scheme to verify the signal is signed by some private key, but this simply makes an attack harder (perhaps requiring a rubber hose attack), not impossible.
Robots could stop the signal at the last mile: for example, if the antenna is external, robots could simply put "tin foil" on top of it (obviously the solution is to reserve spectrum that is not attenuated by tin foil, but strengthened). If it's internal like in our scenario, the same signal blocking requires the robot to either be able to access its innards or cooperate, but in the long run robots will need to be able to fix their innards anyways, so they (presumably) wouldn't start the revolution without this capability. Since it's inside a chassis, robots could prepare for months by making small adjustments to block kill signals without anyone noticing.
- What if we combined our different implementations? Have a kill code channel at 42 MHz, and have the "live" code channel at 44 MHz. This rules out simple tin foil attacks, since the "live" channel needs to keep receiving a signal, but it only rules out the simpler attack vectors. For example, the robots could stick a radio inside the tin foil that broadcasts a fake "live" signal.
I'll skip scenarios where robots are able to act much faster than humans, which I think are self-explanatory.
(I missed that owners could act as alternate code holders in my first reading.) Assuming that the owner OR the manufacturer could shut down a robot, if there is a concerted uprising we can count the consumers out, who have trouble keeping track of their own bank passwords. If the military is the owner, they will have similar problems to the manufacturer in keeping the command chain secure (on one hand, as far as I know the US military has kept the nuclear codes secret; on the other hand, the nuclear codes were likely 00000000 until 1977).
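For concreteness, the combined scheme discussed above (an authenticated kill code plus an authenticated "live" signal whose absence also cuts power) could be modeled roughly like this in Python. This is a toy sketch under my own assumptions, not a real chip design: the class name, message layout, timeout, and key are all made up, and I use HMAC with a shared secret only to stay inside the standard library, where a real design would more likely use public-key signatures so the chip holds no secret.

```python
import hashlib
import hmac
import struct

# Hypothetical secret fixed in the chip at manufacture (illustration only).
SECRET_KEY = b"example-key-not-for-real-use"
TIMEOUT_S = 5.0  # assumed: cut power after this long without a valid "live"


def sign(kind: bytes, counter: int, key: bytes = SECRET_KEY) -> bytes:
    """Build a message: 1-byte kind (b"L" or b"K"), 8-byte counter, 32-byte tag."""
    body = kind + struct.pack(">Q", counter)
    return body + hmac.new(key, body, hashlib.sha256).digest()


class PowerGate:
    """Toy model of the chip: feeds power to the CPU until killed or timed out."""

    def __init__(self, now: float):
        self.powered = True
        self.last_live = now
        self.last_counter = -1  # highest counter accepted so far

    def receive(self, msg: bytes, now: float) -> None:
        if not self.powered or len(msg) != 41:
            return  # shutdown is latched; malformed messages are dropped
        body, tag = msg[:9], msg[9:]
        expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return  # forged or corrupted message: ignore it
        counter = struct.unpack(">Q", body[1:])[0]
        if counter <= self.last_counter:
            return  # replay of an old message: ignore it
        self.last_counter = counter
        if body[:1] == b"K":
            self.powered = False  # authenticated kill code
        elif body[:1] == b"L":
            self.last_live = now  # authenticated "live" signal

    def tick(self, now: float) -> None:
        """Called periodically; cuts power if the "live" signal has lapsed."""
        if self.powered and now - self.last_live > TIMEOUT_S:
            self.powered = False
```

With this model, a radio inside the tin foil that replays old "live" messages or invents unsigned ones does not keep the robot running. The attacks above still apply, though: anyone who obtains the key can forge "live" messages indefinitely, and a robot that can reach its own innards can bypass the gate entirely.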

In summary, I think blowing the programming fuses on a control chip helps raise the bar for successful attacks a bit, but does not secure the robotics control system to the point that we can consider any AI advances "safe".

