We hosted an AI Safety “Thinkathon” in Chile. We had participation from 40 students with differing skill levels and backgrounds, with groups totalling 13 submissions.
We see potential in:
Similar introductory events aiming for a broad audience
Collaborating more often with student organizations
Leveraging remote help from external mentors
We experimented with an alternative naming, having remote mentors, different problem sources, and student organization partnerships, with varying results.
We could have improved planning and communicating the difficulty of challenges.
Introduction
In February, we ran the first AI Safety Hackathon in Chile (and possibly in all of South America[1]). This post provides some details about the event, a teaser of some resulting projects and our learnings throughout.
Goals and overview of the event
The hackathon was meant to kick-start our nascent AI Safety Group at UC Chile, generating interest in AI Safety and encouraging people to register for our AGI Safety Fundamentals course group.
It ran between the 25th and the 28th of February, the first two days being in-person events and the other two serving as additional time for participants to work on their proposals, with some remote assistance on our part. Participants formed teams of up to four people, and could choose to assist either virtually (through Discord) or in-person (on the first two days).
Aiming for a broad audience, we named the event “Thinkathon” (instead of hackathon) and provided plenty of introductory material alongside the proposed problems.
We think this was the right choice, as the desired effect was reflected on the participant demographics (see below).
We could have been better at preparing participants. Some participants suggested we could have done an introductory workshop.
We incorporated the two problems from the AI Alignment Awards (Goal Misgeneralization and the Shutdown problem), alongside easier, self-contained problems aimed at students with different backgrounds (like policy or psychology).
We think most teams weren't prepared to tackle the AI Alignment Awards challenges. Most teams (77%) chose them initially regardless of their experience, getting stuck quickly.
This might have worked better by communicating difficulty more clearly, as well as emphasizing that aiming for incremental progress rather than a complete solution is a better strategy for a beginner's hackathon.
As we don't know many people with previous experience in AIS in Chile, we got help from external mentors, which connected remotely to help participants.
We think this was a good decision, as participants rated mentor support highly (see below).
We think this was an excellent choice, as they provided a much broader platform for outreach and crucial logistics help.
We had a great time working with them, and they were eager to work with us again!
Things that went well
40 people attended in-person and 10 people remotely (through Discord), we were surprised by both the high number of attendants and the preference[2] for in-person participation.
We had a total of 13 submitted proposals, much higher than expected.
While all proposals were incremental contributions, most were of high quality.
Skill level and majors varied significantly, going from relatively advanced CS students to freshmen from other fields (like economics). We were aiming for diversity, so this is a win.
While we hugely delayed starting the AGISF program (we opened applications very recently), many participants were eager to apply to AGISF.
Things we could have done better
(Also check out the “Things we experimented with” section)
We had some issues navigating funding. This was in good part because we failed to anticipate some costs. In the end, we ended up spending around $20 USD per person total on food and catering. The collaborating student organizations covered most of the funding for prizes (around $15 USD / pp).
Logistics should have been planned with significantly more anticipation. We were overconfident about being able to sort out some details last minute. It turned out our university couldn't host the event (due to the academic calendar), and we were forced to find two new venues in a couple of days (which we were able to get for free).
At the end of the last day of the in-person phase, we didn't do a proper wrap-up. This contributed to some teams feeling a bit lost during the remote phase.
We didn't do an impressive job at positioning the AI Safety group, as we could have talked more about it during the event, and be quicker to follow-up on participants' interest after the event.
Challenge problems
We presented participants with the following six challenges. The linked write-ups are in Spanish, as they were made for participants.
Goal Misgeneralization from the AI Alignment Awards. Seven submitted entries addressed it, including the one that came in second place.
Shutdown Problem from the AI Alignment Awards. Two submitted entries were about it.
Honesty in language models[5]: Two proposals worked on this problem, both by doing empirical research via the OpenAI API. They earned the first and third places.
Around 50% of students were majoring in engineering (mostly CS engineering[7]), 21% in a (new) purely CS major, 11% in business and economics, and the rest were from a mix of majors (neuroscience, maths, biochemistry, medicine).
About 75% of participants were from UC Chile, our university, while about 25% were from other universities.
There was a significant gender imbalance (only 20% were women or non-binary people). We don't have great ideas on how to improve this.
To our surprise, around 54% of participants were incoming freshmen.
Insights from participants
Anecdotally:
It was surprisingly easy to talk about concerns related to AI Safety with everyone, including people with no previous exposure to them.
Most people didn't have trouble with understanding the problems themselves, independent of their background (except perhaps for goal misgeneralization).
We also ran a quick after-event survey (n=26):
Participants gave a 77% Net Promoter Score (4.6/5 likelihood to recommend), indicating high satisfaction with the event.
Frequent negative feedback included lack of food on the second day (n=5), lack of activities to socialize or meet other participants (n=4) and that the venues used were too far away (n=4).
76% hadn't heard of AI Safety before the event.[8]
Interest in working or doing research in AI Safety and AI more generally increased significantly, with a moderate effect size for the first (r = 0.54) and a small effect size (r = 0.32) for the latter.[9]
Participants reported having learned significantly about both AI in general and AI Safety (4.0/5 and 3.8/5 respectively) and valued mentor help highly (4.4/5)
Also, some random pictures:
Acknowledgements
We want to thank the following people for their amazing help:
A Wilcoxon signed-rank test showed a statistically significant increase in interest in AI (W = 115.5, p < 0.001, r = 0.32) as well as AI Safety (W = 190.0, p < 0.001, r = 0.54). This was based on retrospective Likert-style questions on interest. It was hard to account for survey effects.
TL;DR
Introduction
In February, we ran the first AI Safety Hackathon in Chile (and possibly in all of South America[1]). This post provides some details about the event, a teaser of some resulting projects and our learnings throughout.
Goals and overview of the event
The hackathon was meant to kick-start our nascent AI Safety Group at UC Chile, generating interest in AI Safety and encouraging people to register for our AGI Safety Fundamentals course group.
It ran between the 25th and the 28th of February, the first two days being in-person events and the other two serving as additional time for participants to work on their proposals, with some remote assistance on our part. Participants formed teams of up to four people, and could choose to assist either virtually (through Discord) or in-person (on the first two days).
We had help from Apart Research and partial funding from AI Alignment Awards.
Things we experimented with
Things that went well
Things we could have done better
(Also check out the “Things we experimented with” section)
Challenge problems
We presented participants with the following six challenges. The linked write-ups are in Spanish, as they were made for participants.
Highlighted submissions
Some highlighted proposals from the hackathon:
Investigating the Relationship Between Priming with Multiple Traits and Language Model Truthfulness in GPT-3: A research project exploring how different traits placed in prompts influence the truthfulness of GPT-3 based models, following this idea from AISI.
The challenge of Goal Misgeneralization in AI: Causes and Solutions: A general literature review on goal misgeneralization that also suggests a mix of solutions inspired heavily by previous work.
All proposals (some in Spanish) can be seen here.
Attendance Demographics
Insights from participants
Anecdotally:
We also ran a quick after-event survey (n=26):
Also, some random pictures:
Acknowledgements
We want to thank the following people for their amazing help:
We might or might have not chosen to write South America instead of Latin America to conveniently exclude Mexico.
Especially considering that participating in-person meant showing up at 9:00 of a summer vacation day in an inconveniently-located venue.
Taken from this idea by @Pablo Villalobos.
Adapted from this idea by Caroline Jeanmaire.
Taken from this idea by @Sabrina Zaki, Luke Ring and Aleks Baskakovs, that in turn comes from their (winning) proposal for Apart Research's language model hackathon.
Inspired the ideas in @Geoffrey Irving and Amanda Askell's AI Safety Needs Social Scientists and Riccardo Volpato's Research ideas to study humans with AI Safety in Mind.
People from CS Engineering take common courses in engineering, but are otherwise similar to CS.
This was explicitly defined in contrast to Ethics of AI or similar fields.
A Wilcoxon signed-rank test showed a statistically significant increase in interest in AI (W = 115.5, p < 0.001, r = 0.32) as well as AI Safety (W = 190.0, p < 0.001, r = 0.54). This was based on retrospective Likert-style questions on interest. It was hard to account for survey effects.