Date: Friday, 17th to 27th September
The prize fund will be $1000
First Place: $400
Second Place: $250
Third Place: $150
Honorable Mention: $100
Honorable Mention: $100
AI-Plans.com is an open platform for AI alignment plans, where users can give feedback on plan strengths and vulnerabilities. The site is already an easy place to discover alignment research (there are currently 180+ AI Safety papers on the site), and will soon be a good place to receive ongoing feedback on alignment work. Multiple independent researchers are posting plans on the site, and researchers from Berkeley, DeepMind, xAI and Cambridge are interested in the site.
The Critique-a-thon event is designed to set a high starting bar for feedback on the site. It has also generated useful resources, such as a list of common alignment vulnerabilities.
If you’re interested in this critique-a-thon, please fill in the details here! If the link doesn’t work, the form is here: https://forms.gle/8c5jZVgwri11J5cbA
What we’ll do:
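Stage 1: 17th to 18th: Improving the Vulnerability list and making a list of Strengths
Creation of a Broad List of Strengths for AI Alignment plans, much like the Broad List of Vulnerabilities.
Some resources for ideas (feel very free to use other resources as well):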
https://www.lesswrong.com/posts/mnoc3cKY3gXMrTybs/a-list-of-core-ai-safety-problems-and-how-i-hope-to-solve#comments
https://www.alignmentforum.org/posts/fRsjBseRuvRhMPPE5/an-overview-of-11-proposals-for-building-safe-advanced-ai
https://www.lesswrong.com/posts/gHefoxiznGfsbiAu9/inner-and-outer-alignment-decompose-one-hard-problem-into
https://arxiv.org/abs/2209.00626
https://www.lesswrong.com/posts/3pinFH3jerMzAvmza/on-how-various-plans-miss-the-hard-bits-of-the-alignment
https://www.alignmentforum.org/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities
Stage 2: 19th to 21st: Matching strengths and vulnerabilities to plans
Everyone will pick a few alignment plans to look at more closely. For each plan, you'll label up to 5 strengths and/or vulnerabilities you think could apply and point out evidence from the plan that supports them. Include your level of confidence in each label as a percentage.
Stage 3: 22nd to 23rd: Argue for and against the strengths/vulnerabilities
You'll team up with another participant and take turns, with one defending and the other questioning the strengths/vulnerabilities suggested in Stage 2. This debate format will help refine and strengthen the critiques. After the first day, we'll swap positions on the for and against stances: if you were previously making the case for a point, now make the case against it; if you were previously making the case against it, now make the case for it.
We'll finish by writing up a concise summary of the refined positions we've arrived at on the specific strengths/vulnerabilities in the specific plans.
Stage 4: 24th to 25th: Rotate partners
We'll rotate partners in the pairs and repeat the previous stage with our new partners, then end the discussion stage with a final summary write-up.
Stage 5: 24th to 25th: Provide feedback on each other's arguments.
Review your partner's reasoning for and against the strength/vulnerability labels. Point out any faulty logic, questionable assumptions, lack of evidence, etc. to improve the critiques.
Final Stage: two weeks of judging
We'll evaluate submissions and award prizes!
Selected experts will judge all the critiques based on accuracy, evidence, insight, and communication. Cash prizes will go to the standout critiques that demonstrate top-notch critical analysis and make the best contributions.
This time the prize fund will be $1000!!
If you’re interested in this critique-a-thon, please fill in the details here! If the link doesn’t work, the form is here: https://forms.gle/8c5jZVgwri11J5cbA
Thanks for reading AI-plans.com! Subscribe for free to receive new posts and support my work.