Security Mindset - Fire Alarms and Trigger Signatures
Series Overview and Goals
This is the second in a series of articles about applying a traditional security mindset to the problems of alignment and AI research in general. As much as possible, we should try to mine the lessons from the history of security and apply them to the alignment...
Feb 9, 2023