Viktor Rehnberg

176

Concrete empirical research projects in mechanistic anomaly detection

Thanks to Jordan Taylor, Mark Xu, Alex Mallen, and Lawrence Chan for feedback on a draft! This post was mostly written by Erik, but we're all currently collaborating on this research direction. Mechanistic anomaly detection (MAD) aims to flag when an AI produces outputs for “unusual reasons.” It is similar...

Apr 3, 2024 · 43

Intuitions by ML researchers may get progressively worse concerning likely candidates for transformative AI

Epistemic status: Explorative. See results more as a sketch towards a possible issue than a proper derivation of reliable values. Research behind exact numbers is limited and calculations are without error propagation. I'd rather get the idea out there than wait until I find the time to do it properly....

Nov 25, 2022 · 7

Takeaways from the Intelligence Rising RPG

Last Saturday, we played Intelligence Rising. Intelligence Rising is a role-playing game about the global development of transformative AI technologies, where each player or team is a major government or company. We had a great time and we highly recommend you seize any opportunity to play it! What follows is...

Mar 5, 2021 · 51
