Virtually no-one differentiates between a 4 and a 5 on this kind of survey. Either they are the kind of person who "always" puts 5, or they "never" do.
With Rationalist-Adjacent types or other overthinkers, you can give more specific anchors (e.g. 5 = this was the best pairing). Or you can ask specific unscaled questions (i.e.:
I suspect the reason you found little difference in match quality is because "being interested and able to attend a LessWrong event" is already a huge filter. Maybe you already knew that.
Summary: I organized two speed friending sessions that were formatted as social experiments. The first speed friending experiment was organized at the 2022 Less Wrong Community Weekend. More than 50 participants filled out a matchmaking survey before the event and were randomly assigned into two groups: one where participants were matched with each other randomly, and the other where I tried my best to match people with each other using their survey answers. Analysis comparing the two groups did not indicate a difference in match quality.
A second experiment was organized at the 2023 event. This time, there was no matchmaking and instead participants were split into two groups within which they could freely match up with each other. For the experimental part, the two groups were given two different discussion prompts. As prompts, one group was given only the classic and succinct "FORD" — "family, occupation, recreation, and dreams", while the other group was given a version of the 36 love questions[1]. Again, the results showed no difference between the two groups.
The average match rating for the first experiment was 3.68 while for the second it was 3.96. This difference may suggest that either providing discussion prompts in general improves matches, or that giving people the opportunity to choose their own match (even if based on a first impression lasting only a few seconds) improves match quality. A third experiment could test this.
The data overall suggests that people had many positive encounters and almost everyone involved had fun. I recommend running similar experiments in the future.
Acknowledgments: I hired Santeri Koivula to do some of the writing and analysis — thanks! Thank you also to all the participants and everyone who gave me ideas & feedback.
In the first experiment, participants filled out a matchmaking survey consisting of the following questions:
A total of 56 people answered the matchmaking survey. I randomly split participants into two groups, matching one group into pairs at random and the other by hand. Each person was matched with three other people. My algorithm for matching people would be best described as "vibes": I read through everyone's answers and tried to find common factors. I made percentage predictions about the quality of each match and added a short comment to each explaining why I made it, for example: "party philosophers", "poly", "nature entrepreneurs", "dark humor party ppl", "EA achievers" and "similar psychometrics".
For the actual session at the event, I had printed out several sheets listing the matches. In each of the three rounds, participants had 15 minutes to get to know their match. After each round they were asked to fill out another survey with the following questions:
Organizing the session was chaotic. Several people didn't show up, which left some people without matches. Many people wanted to join without having filled out the form beforehand. I prioritized giving people quieter spaces for conversations rather than having everyone in the same room, so there was no easy way to announce when the 15 minutes were up. Nevertheless, the session was reasonably successful, and the match rating survey received 104 answers. The overall mean rating for match quality was 3.7 out of 5.
Our most important research question was “Does the average rate of positive encounters differ for the two groups?”. We defined a positive encounter to be an encounter which both participants rated >=3, and at least one of them rated it >=4. We excluded pairs where only one participant filled the Match Survey from the analysis.
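This rule is simple enough to state as a one-line predicate. Below is a minimal sketch in Python; the function name and signature are my own illustration, not the actual analysis code.

```python
def is_positive_encounter(rating_a: int, rating_b: int) -> bool:
    """A positive encounter: both participants rate the match >= 3,
    and at least one of them rates it >= 4."""
    return min(rating_a, rating_b) >= 3 and max(rating_a, rating_b) >= 4

# Examples: (3, 4) -> True, (3, 3) -> False, (2, 5) -> False
```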
The rate of positive encounters for the hand-picked group was 16/22 ≈ 72.7%.
The rate of positive encounters for the randomized group was 8/11 ≈ 72.7%.
There was no difference between the groups, indicating that matchmaking in this specific case didn't work.
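As a sanity check, here is a sketch of how one could formally compare the two rates using the counts above; the choice of Fisher's exact test is my own for illustration, not part of the original analysis.

```python
from scipy.stats import fisher_exact

# Contingency table: [positive, non-positive] encounters per group
table = [[16, 22 - 16],   # hand-picked group
         [8, 11 - 8]]     # randomized group

odds_ratio, p_value = fisher_exact(table)
print(p_value)  # 1.0 here, since the two rates are identical (72.7%) -- no evidence of a difference
```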
I had made predictions about the probability of each manually chosen match being a very positive encounter (that is, both participants rate the outcome 4 or 5). I made 28 predictions overall, and the Brier score for these was 0.26, which is slightly worse than just guessing 50% for each match. Looking more closely at the results, my largest error was systematic overconfidence: I predicted very positive encounters to be more likely than they actually were.
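For reference, the Brier score is just the mean squared difference between the predicted probability and the binary outcome, so always guessing 50% scores 0.25 no matter what happens. A minimal sketch below; the example predictions are made up for illustration, not my actual predictions.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Hypothetical illustration: overconfident predictions on outcomes that mostly fail
preds    = [0.8, 0.7, 0.9, 0.6]
outcomes = [1,   0,   0,   0]
print(brier_score(preds, outcomes))      # 0.425 -- worse than...
print(brier_score([0.5] * 4, outcomes))  # 0.25  -- ...always guessing 50%
```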
So overall there was no difference between hand-picked matches and random matches for this speed friending session. Additionally, I couldn't predict the quality of the hand-picked matches well.
But plenty of people had good and even amazing matches! What made the matches good? There's no clear leader here, but if a pair had the trio of emotional connection, shared goals, and admiration, the odds were on the side of a good match.
The most common positive aspects described were collaborative opportunities and a good vibe.
The second experiment tested the impact of giving different discussion prompts rather than matchmaking. The match rating form was slightly modified for this, with irrelevant questions removed and two new questions added:
The average ratings for the two experiment groups were again essentially the same: 3.97 and 3.96. Running the second experiment was easier since people matched with each other during the event and less work was required beforehand. Timeboxing the matches was again the main challenge, as this time the session was held outdoors and many pairs exceeded the intended 15-minute slots.
[1] https://www.nytimes.com/2015/01/09/style/no-37-big-wedding-or-small.html