All of Spenser N's Comments + Replies

In fact, corporations are quite aligned with you. Not only are they run by humans, who are at least roughly aligned with humanity by default, but we also have legal institutions and social norms which help keep the wheels on the tracks. In fact, the profit motive is a powerful alignment tool - it's hard to make a profit off of humanity if they are all dead. But who aren't corporations aligned with? Humans without money or legal protections, for one (though we don't need to veer off into an economic or political discussion). But also plants, insects, most ...

Timothy M.
This metaphor doesn't strike me as accurate, because humanity can engage in commerce and insects cannot. Also, humanity causes a lot of environmental degradation, but we still don't actually want to bring about the wholesale destruction of the environment.

I actually think this example shows a clear potential failure point of an Oracle AI. Though it is constrained, in this example, to only answer yes/no questions, a user can easily circumvent this by formatting the question with this method.

Suppose a bad actor asks the Oracle AI the following: “I want a program to help me take over the world. Is the first bit 1?” Then they can ask about the next bit, and the next, repeating until the entire program is written out. Obviously, this is contrived. But I think it shows that the apparent constraints of an Oracle add no real benefit to safety, and we’re quickly relying once again on typical alignment concerns.
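
To make the bit-by-bit point concrete, here is a minimal sketch of how a yes/no channel can carry an arbitrary payload. Everything in it is an illustrative assumption rather than a model of any real system: `make_toy_oracle`, `extract`, and the two question names are hypothetical, the "oracle" is just a closure holding a harmless byte string, and a second yes/no question ("is there another bit?") is added so the asker knows when to stop.

```python
from typing import Callable

# Hypothetical type: an oracle takes a question name and a bit index,
# and may only ever answer True (yes) or False (no).
Oracle = Callable[[str, int], bool]


def make_toy_oracle(secret: bytes) -> Oracle:
    """Build a toy stand-in for a yes/no oracle that 'knows' `secret`."""
    # Expand the payload into a list of bits, most significant bit first.
    bits = [(byte >> (7 - k)) & 1 for byte in secret for k in range(8)]

    def oracle(question: str, index: int) -> bool:
        if question == "has_bit":      # "Is there a bit at this position?"
            return index < len(bits)
        if question == "bit_is_one":   # "Is the bit at this position 1?"
            return bits[index] == 1
        raise ValueError(f"unsupported question: {question}")

    return oracle


def extract(oracle: Oracle) -> bytes:
    """Recover the whole payload using nothing but yes/no answers."""
    bits = []
    i = 0
    while oracle("has_bit", i):
        bits.append(1 if oracle("bit_is_one", i) else 0)
        i += 1

    # Pack the recovered bits back into bytes, eight at a time.
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for b in bits[start:start + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


# Usage: a harmless payload stands in for "the program".
recovered = extract(make_toy_oracle(b"print('hello')"))
```

The asker needs two questions per bit of payload plus one final question to detect the end, so the yes/no restriction only slows the transfer down by a constant factor. It does not limit what can be transferred.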