New OpenAI Paper - Language models can explain neurons in language models
Summary by OpenAI: We use GPT-4 to automatically write explanations for the behavior of neurons in large language models and to score those explanations. We release a dataset of these (imperfect) explanations and scores for every neuron in GPT-2. Link: https://openai.com/research/language-models-can-explain-neurons-in-language-models Please share your thoughts in the comments!


To clarify, here are some examples of the type of projects I would love to help with:
- Sponsoring University Research:
Funding researchers to publish papers on AI alignment and AI existential risk (X-risk). This could start with foundational, descriptive papers that help define the field and open the door for more academics to engage in alignment research. These papers could also provide references and credibility for others to build upon.
- Developing Accessible Pitches:
Creating a "boilerplate" for how to effectively communicate the importance of AI alignment to non-rationalists, whether they are academics, policymakers, or the general public. This could include shareable content designed to resonate with people who may not already be engaged with rationalist or