Mech Interp Wiki Page and Why You Should Edit Wikipedia
TL;DR: A couple of months ago, we (Jo and Noah) wrote the first Wikipedia article on Mechanistic Interpretability. It was oddly missing despite Mech Interp's visibility in alignment circles. We think Wikipedia is a top-of-funnel resource for journalists, policy staffers, and curious students, so filling that gap is cheap field-building. Since the gap existed in one of the most important subfields of AI safety, we suspect there are many others. Below, we (1) list other alignment / EA topics that need pages or whose pages can be upgraded and (2) share some notes about editing Wikipedia. If you know the literature, an afternoon of edits may be surprisingly high-impact.

PS: We also think that the existence of a Wikipedia page for the field one works in increases one's credibility with outsiders. For example, if you tell someone that you're working in AI Control, and the only pages linked are from LessWrong and arXiv, this might not be a good look.

Wikipedia pages worth (re)writing

Below are topics with plentiful secondary literature whose Wikipedia pages are either missing or stubs.

* Reward hacking
* AI Control
* AI Evaluations
* Agent Foundations (and particular ideas from it like Mesa-Optimization)
* Mechanistic superposition
* People pages: Neel Nanda, Dan Hendrycks, Chris Olah, Jacob Steinhardt; each now has mainstream press coverage.
* AI 2027: got a lot of media attention (it was read by J. D. Vance and covered by the NYT, for instance), so this seems like a priority
* The open letter
* LessWrong
* Rationalist Community
* Many ideas on the concepts.
* EA concepts.

Wikipedia Tips

* If you want to work on a new page, discuss it with the community first by going to the talk page of a related topic or meta-page.
* When making substantive updates, prepare the update in your user sandbox first.
* Attribute claims (Nature says x, or expert y says) rather than advocate (z is important because...).
* Special:WantedPages lets you filter red-links for terms like “alignment”, “Go