Apparently, MIRI has given up on their current mainline approach to understanding agency and is trying to figure out what to do next. It seems like it might be worthwhile to collect some alternative approaches to the problem -- after all, intelligence and agency feature in pretty much all areas of human thought and action, so the space of possible ways to make progress should be pretty vast. By no means is it exhausted by the mathematical analysis of thought experiments! What are people's best ideas?
(By 'understanding agency' I mean research that attempts to develop a better understanding of how agency works, not alignment research in general. So IDA would not count as agent foundations, since it treats ML capabilities as a black box.)
ETA: I originally wrote 'agent foundations' in place of 'understanding agency' in the above, which was ambiguous between a broad sense of the term (any research aimed at obtaining a foundational understanding of agency) and a narrow sense (the set of research directions outlined in the agent foundations agenda document). See this comment by Rob re: MIRI's ongoing work on agent foundations (narrow sense).
FWIW that's not how I read that update. It seems like MIRI was working on some secret "entirely different type of AI not based on existing machine learning techniques" that they've now given up on, but they're still pursuing their agent foundations agenda.
Gotcha. Either way, I think this is a great idea for a thread, and I appreciate you making it. :)
To avoid confusion, when I say "agent foundations" I mean one of these things:
- Work that's oriented toward the original "Agent Foundations" agenda, which put a large focus on "highly reliable agent design" (usually broken up into logical uncertainty and naturalized induction, decision theory, and Vingean reflection), and also tends to apply an HRAD-informed perspective to understanding things like corrigibility and value learning.
- Work that's oriented toward the
...