Be careful that this doesn't devolve into... I'm not going to say clickbait, because it's a little longer timescale than that. Flashes in the pan?
Things that sound great for the first minute or two but fall apart on deeper inspection.
I'd be interested in you sharing any reasons why you think this might fall apart, e.g. any insights you've gained from deeper inspection.
I've made a career of being good at finding 'weird' failure modes[1]. I won't give further details about said career here.
=====
In general: look for the perverse incentives. Especially in competitive settings, or where money is on the line.
In this case, the actual target is something along the lines of 'teach the judge the basics of alignment theory', and the metric is 'say things that the judge believes are good and interesting within the next minute'.
Well... what happens when the judge has a subtle[2] misconception? Seekers are incentivized to go along with it rather than fight it. This helps the metric but hurts the underlying target.
What happens when there's a single step in a knowledge chain that takes more than a minute to be interesting? Seekers are incentivized to ignore that knowledge chain. This helps the metric but hurts the underlying target.
Etc.
To figure out how to make progress on problems that require cognitive effort, it seems useful to think of times in the past when you’ve made lots of cognitive progress. For me, high school math camp was the most cognitively productive time.
It seems plausible that the main reason I was productive at math camp was the competitive atmosphere. In light of this, I want to experiment with systems that harness people’s competitiveness to drive learning and insight, since this might increase productivity substantially. “Insight Hunt” is one way this could be done.
I suspect the below idea is somewhat far from the ideal and that it could change a fair bit in response to user feedback. If anyone tries an Insight Hunt, please leave your feedback and results (including the name of the winner if you want increased motivation) in the comments! Also, feel free to suggest alternative games/systems which leverage competitiveness for cognitive productivity.
Lastly, I suspect that Insight Hunt works best when the stuff the players want to learn is more “shallow”, i.e. knowledge or facts that take only a few inferential steps to understand, starting from what most people know. This is because if one person wants to learn linear algebra (not shallow), and another person knows linear algebra, I suspect the game will devolve into a lecture. Examples of shallow knowledge include alignment theory (given some prerequisites, like basic ML; this is the primary use case I'm excited about) and psych study results. Non-examples include theoretical physics and math.
Insight Hunt