Schelling Shifts During AI Self-Modification
Introduction
In this essay, I name and describe a mechanism that might break alignment in any system that features an originally aligned AI improving its capabilities through self-modification. I doubt that this idea is new, but I haven't yet seen it named and described in these terms. This post discusses...
Apr 1, 2018