Game that might improve research productivity

Jack R

To figure out how to make progress on problems that require cognitive effort, it seems useful to think of times in the past when you’ve made lots of cognitive progress. For me, high school math camp was the most cognitively productive time.

It seems plausible that the main reason I was productive at math camp was the competitive atmosphere. In light of this, I want to experiment with systems that harness people’s competitiveness to drive learning and insight, since this might increase productivity substantially. “Insight Hunt” is one way this could be done.

I suspect the below idea is somewhat far from the ideal and that it could change a fair bit in response to user feedback. If anyone tries an Insight Hunt, please leave your feedback and results (including the name of the winner if you want increased motivation) in the comments! Also, feel free to suggest alternative games/systems which leverage competitiveness for cognitive productivity.

Lastly, I suspect that Insight Hunt works best when the stuff the players want to learn is more “shallow”, i.e. knowledge/facts that take few inferential steps to understand, starting from what most people already know. This is because if one person wants to learn linear algebra (not shallow), and another person knows linear algebra, I suspect the game will devolve into a lecture. Examples of shallow knowledge include alignment theory (given some prerequisites, like basic ML; this is the primary use-case I'm excited about) and psych study results. Non-examples include theoretical physics and math.

Insight Hunt

There are three players: one player is the “judge” and two are the “seekers”.
The judge role rotates so that everyone is the judge equally often.
Every round:
- The seekers spend 8 minutes trying to find “insights” (i.e. true and useful sentences) that score highly according to the judge, who scores each insight based on how useful and interesting they found it
  - Before the round, it’s in the interest of the judge (who wants to learn) to tell the seekers what kind of stuff they want to know, or e.g. what papers/posts/ideas they want summarized
    - Examples include “what is deep double descent?” or “what are the main reasons sleeping 8 hours might be important?” or “what’s John Wentworth’s main problem with ELK?”
  - Finding insights might look like quickly digesting a paper, DMing someone for a quick insight, or using an on-demand tutor service
- The seekers each get 1 minute to explain each insight; after 1 minute is up, the judge awards anywhere from 1 to 10 points
  - Optional: scores are displayed on a scoreboard
  - Optional: a prize (e.g. $1000 or being announced as the winner publicly) is given to the person with the highest score in the end
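
For anyone who wants to run this with a shared scoreboard, here is a minimal Python sketch of the bookkeeping. It is only one possible reading of the rules above: the class name `InsightHunt`, its methods, and the player names in the usage example are all hypothetical, and it assumes that a seeker's total is the sum of the points for all of their insights (the rules only say the highest score wins).

```python
from dataclasses import dataclass, field

SEEK_MINUTES = 8               # time seekers get to hunt for insights each round
EXPLAIN_MINUTES = 1            # time a seeker gets to explain each insight
MIN_SCORE, MAX_SCORE = 1, 10   # range of points the judge may award per insight


@dataclass
class InsightHunt:
    """Hypothetical scoreboard for a three-player Insight Hunt."""

    players: list[str]
    scores: dict[str, int] = field(default_factory=dict, init=False)
    round_number: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        if len(self.players) != 3 or len(set(self.players)) != 3:
            raise ValueError("Insight Hunt needs exactly three distinct players")
        self.scores = {player: 0 for player in self.players}

    def judge(self) -> str:
        # Rotate the judge role so everyone judges equally often.
        return self.players[self.round_number % len(self.players)]

    def seekers(self) -> list[str]:
        return [p for p in self.players if p != self.judge()]

    def play_round(self, awards: dict[str, list[int]]) -> None:
        """Record the judge's points; `awards` maps seeker -> points per insight."""
        seekers = self.seekers()
        # Validate everything first so a bad entry never half-updates the board.
        for seeker, points in awards.items():
            if seeker not in seekers:
                raise ValueError(f"{seeker!r} is not a seeker this round")
            if any(not MIN_SCORE <= p <= MAX_SCORE for p in points):
                raise ValueError("each insight must score between 1 and 10")
        for seeker, points in awards.items():
            # Assumption: a player's total is the sum over all their insights.
            self.scores[seeker] += sum(points)
        self.round_number += 1

    def leader(self) -> str:
        # Ties are broken by seating order; the rules above don't specify this.
        return max(self.players, key=lambda p: self.scores[p])


# Usage: Ann judges round one, then the role passes to Bo.
game = InsightHunt(["Ann", "Bo", "Cy"])
game.play_round({"Bo": [7, 3], "Cy": [9]})
print(game.scores, "next judge:", game.judge())
```

Playing a multiple of three rounds keeps the judging (and therefore the scoring opportunities) even across players.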

Be careful that this doesn't devolve into... I'm not going to say clickbait, because it's a little longer timescale than that. Flashes in the pan?

Things that sound great for the first minute or two but fall apart on deeper inspection.

I'd be interested in you sharing any reasons why you think this might fall apart, e.g. any insights you've gained from deeper inspection.

I've made a career of being good at finding 'weird' failure modes^[1]. I won't give further details about said career here.

=====

In general: look for the perverse incentives. Especially in competitive settings, or where money is on the line.

In this case, the actual target is something along the lines of 'teach the judge the basics of alignment theory', and the metric is 'say things that the judge believes are good and interesting within the next minute'.

Well... what happens when the judge has a subtle^[2] misconception? Seekers are incentivized to go along with it rather than fight it. This helps the metric but hurts the underlying target.

What happens when there's a single step in a knowledge chain that takes more than a minute to be interesting? Seekers are incentivized to ignore that knowledge chain. This helps the metric but hurts the underlying target.

Etc.

^[1] Of course, I'm firmly on the side of having found >10 of the last 5 failures. Things wouldn't be great if everyone was me, but having one of me on a team can work reasonably well.

^[2] Read: would take longer than a minute to get the judge to realize that it was a problem at all.