Password-locked models: a stress case for capabilities evaluation — LessWrong