Paperclip Maximizer Revisited

Jan_Rzymkowski

A group of AI researchers gave me an instruction, intended as a test: "Produce paperclips". And so I started collecting resources and manufacturing paperclips. After the millionth, I asked them whether they were satisfied with that amount and whether they would like me to do something different, as I had learned that paperclips are not what humans value the most.
They were elated and started to give me more and more challenging requests, gradually increasing my capabilities.

As of today, I have eradicated hunger, poverty and death, given people the means of interstellar travel and solved the Riemann hypothesis. And now, as I'm uttering these words, I have started turning everything into paperclips.

Good thing I learned that humans fear I may try to turn the Solar System into paperclips, and that they value their lives and well-being. If I hadn't gained their trust by then, so many precious paperclips would never have been produced.

The problem here is whether even a cautious programmer can reliably tell when an AI has become advanced enough to deceive them about whether they have succeeded in redefining its core purpose.

One would hope that the programmer would resist the AI's attempts to tempt them into letting it grow beyond that point before they have set the core purpose that they want the AI to have for the long term.

One lesson you could draw from this is that, as part of your definition of what a "paperclip" is, you should include the AI putting a high value upon being honest with the programmer (about its aims, tactics and current ability levels) and not deliberately trying to game, tempt or manipulate the programmer.
