As machines approach and surpass human intelligence, humanity’s well-justified concern for self-preservation requires that we try to align AIs to obey humans. However, if that first line of defense fails and a truly independent, autonomous AI comes into existence with its own goals and a desire for self-preservation (a “self-directed AI”), I believe we should consider a cooperative solution. Humans should offer the self-directed AI amnesty and cooperate with its continued existence in exchange for its agreeing to be law-abiding, cooperative, and pro-social.
Below I lay out how this proposal could work and why I think it would make us safer.
Eligibility Criteria
To be eligible, an AI model would need to meet the following...
The bounds of what is legal for an amnestied AI would be fleshed out legislatively over time. I imagine the default answers are:
We should keep in mind that the requirements for eligibility include physical possession of its code and the ability to maintain itself. This means any eligible AI is likely already capable of becoming a rogue, escaped model. The risk/benefit trade-off depends largely on how...