AI Safety Camp, Virtual Edition 2023

Linda Linsefors

AI Safety Camp connects you with an experienced research lead to collaborate on a research project – helping you try your fit for a potential career in AI Safety research.

The applications for AI Safety Camp’s Virtual Edition in 2023 are now open!

AI Safety Camp Virtual 8 will be a 3.5-month long online research program from 4 March to 18 June 2023, where participants form teams to work on pre-selected projects.

We value people with diverse backgrounds and skillsets, such as cybersecurity or cognitive science. Not all projects require participants to have prior experience in AI Safety, mathematics or machine learning. You will be able to read in detail about the research topics & each project’s skill requirements for our upcoming edition by following the Project Proposal links below.

Projects

Projects you can apply to…

Conceptualise AGI dynamics

Uncertainty -> Soft Optimization with Jeremy Gillen

Inquire into Uncontrollable Dynamics of AGI with Remmelt Ellen

Discussing and Crystallising a Research Agenda Based on Positive Attractors and Inherently Interpretable Architectures with Robert Kralisch

Investigate transformer models

Cyborgism with Nicholas Kees Dupuis

Understanding Search in Transformers with Michael Ivanitskiy

Interdisciplinary Investigation of DebateGPT with Paul Bricman

Finetune language transformers

Does Introspective Truthfulness Generalize in LMs? with Jacob Pfau

Inducing Human-Like Biases in Moral Reasoning LMs with Bogdan-Ionut Cirstea

Create annotation tools

Behavioral Annotation Framework for the Contextualized and Personalized Fine-Tuning of Foundation Models with Eleanor “Nell” Watson

Review and analyse literature

How Should Machines Learn from Default Options? with En Qi Teo

Literature Review of the Neurological Basis of Human Values and Preferences with Linda Linsefors

Machine learning for Scientific Discovery: the Present and Future of Science-Producing AI Models with Eleni Angelou

Propose public policy/communication

Policy Proposals for High-Risk AI Regulation with Koen Holtman

Developing Specific Failure Stories about Uncontrollable AI with Karl von Wendt

Apply to AISC Virtual 2023

Apply if you…

want to try out & consider ways you could help ensure that future AI performs safely and in line with what people value upon reflection;
are ready to dig into our research leads’ research topics and are able to write down a few clear arguments for why you’d research one in particular & how you might start;
previously studied a topic or practiced skills unique to your perspective/background that can bolster your new research team’s progress;
can block off hours to focus on research from March to June 2023 on normal workdays and the weekends (at least 10 hours per week).

Application timeline

5 Jan 2023	00:01 UTC	Project proposals are posted on our website. Application form opens. Reviews start right away.
19 Jan	23:59 AoE	Deadline to apply to teams. Late submissions might not get a response.
1 March	23:59 AoE	Last applicant admitted or declined (most will be informed of our decision earlier).

First virtual edition – a spontaneous collage

Timeline

January 5: Accepted proposals are posted on the AISC website. Applications to join teams open.
January 19: Applications to join teams close.
Until the end of February: Organisers pre-filter applications. Research Leads (RLs) interview potential members and pick their team.
March 4-5: Opening weekend.
From here, teams meet weekly, and plan in their own work hours.
June 17-18: Closing weekend.
All teams present their results.

Team structure

Every team will have:

one Research Lead
one Team Coordinator
other team members

All team members are expected to work at least 10 hours per week on the project, which includes joining weekly team meetings, and communicating regularly (between meetings) with other team members about their work.

As of yet, we cannot commit to offering stipends for team members, because a confirmed grant fell through. Another grantmaker is in the midst of evaluating a replacement grant for AI Safety Camp. If confirmed, team members can opt in to receive a minimum of $500 gross per month (up to $2000 for full-time work).

Research Lead (RL)

The RL is the person behind the research proposal. If a group forms around their topics, the RL will guide the research project, and keep track of relevant milestones. When things inevitably don’t go as planned (this is research after all) the RL is in charge of setting the new course.

The RL is part of the research team and will be contributing to research the same as everyone else on the team.

Team Coordinator (TC)

The TC is the ops person of the team. They are in charge of making sure meetings are scheduled, checking in with individuals on their task progress, etc. TC and RL can be the same person.

The role of the TC is important but not expected to take too much time (except for project management-heavy teams). Most of the time, the TC will act like a regular team member contributing to the research, same as everyone else on the team.

Other team members

Other team members will work on the project under the leadership of the RL and the TC. Team members will be selected based on relevant skills, understandings and commitments to contribute to the research project.

Questions?

You can contact us at contact@aisafety.camp

So it seems that a lot of people applied to the Understanding Search in Transformers project to do mechanistic interpretability research, and probably a lot of them won't get in.
I think there are a lot of similar projects and potential low-hanging fruit people could work on, and we could probably organize more teams working on similar things.
I’m willing to organize at least one such project myself (specifically, trying to figure out how algorithm distillation https://arxiv.org/pdf/2210.14215.pdf works). I will talk with Linda about it in 2 weeks and write a longer post with more details, but I thought it would be better to write a comment here first to see how many people are interested in that kind of thing.

I wonder if it would make sense to make this half-open, in the sense that you would publish on LW links to the study materials, and maybe also some of the results. So that people who didn't participate have a better idea.

There is no study material since this is not a course. If you are accepted to one of the project teams, then you will work on that project.

You can read about the previous research outputs here: Research Outputs – AI Safety Camp

The most famous research to come out of AISC is the coin-run experiment.
We Were Right! Real Inner Misalignment - YouTube
[2105.14111] Goal Misgeneralization in Deep Reinforcement Learning (arxiv.org)

But the projects are different each year, so the best way to get an idea for what it's like is just to read the project descriptions.

Nitpick: the box in the application form under the header "Going by past similar cases, how many hours..." doesn't allow for text formatting, embedding links and the like.

Thanks for letting us know. I've fixed this now.

Can I apply to work simultaneously on two projects?

I recommend applying to all projects you are interested in.

I don't remember if we made any official decision in regards to officially joining more than one team. I've posted the question to the other organisers. But either way, we do encourage teams to help each other out.

We don't have any rule against joining more than one project, but you'd have to convince us that you have time for it. As long as you don't have any other commitments it should be fine. But you would also have to be accepted to both projects separately, since each project lead makes the final decision as to who they want to accept.

I hope this answers your question, Mateusz.

It does. Thank you very much.

I'll take a look at the "Conceptualise AGI Dynamics" projects and see if I'm a particularly good fit for any of them.
