I think the most straightforward "edutainment" design would be a "rube or blegg" model: present conflicting evidence, then reveal the Word-of-God objective truth at the end of the game. Different biases can be targeted with different forms of evidence, different models of interpretation (e.g., whether players can assign confidence levels to their guesses), and different scoring methods (e.g., whether the game is iterative, or whether it consists of many one-shots where the probability of success across games is the goal).
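One concrete way to score confidence-level guesses (my suggestion, not something the design above specifies) is a proper scoring rule such as the Brier score, which rewards calibration rather than mere boldness. A minimal sketch:

```python
def brier_score(confidence: float, outcome: bool) -> float:
    """Squared error between stated confidence and the revealed truth.

    confidence: player's stated probability (0.0-1.0) that the object is a blegg.
    outcome: True if the end-of-game reveal says it really was a blegg.
    Lower is better; 0.0 is a perfectly confident correct call.
    """
    return (confidence - (1.0 if outcome else 0.0)) ** 2

# A hedged 0.7 on a true blegg scores better than an overconfident
# 0.9 on something that turns out not to be a blegg:
round_1 = brier_score(0.7, True)   # (0.7 - 1.0)^2 = 0.09
round_2 = brier_score(0.9, False)  # (0.9 - 0.0)^2 = 0.81
```

Because the rule is proper, a player's expected score is minimized by reporting their true belief, which is exactly the habit a debiasing game would want to train.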
A more compelling example that won't turn off as many people (ew, edutainment? bo-ring) would probably be a multiplayer game in which players are randomly led to believe incompatible conclusions and then made to interact. The availability of public information, and the importance of having been right all along or of committing strongly to a position early, could be calibrated to target specific biases and fallacies.
As someone with aspirations to game design, I find this a particularly interesting concept. One notable aspect of video game culture is that most multiplayer games are one-offs from a social perspective: there's no social penalty for denigrating an ally's ability, since you will never see them again, and there's no gameplay penalty for being wrong. This means that in every facet of a game where trusting an ally is not strictly necessary, one can grossly underestimate that ally's skill indefinitely without ever being proven wrong. This makes online gaming perhaps the most fertile incubator of socially negative confirmation bias anywhere. If an ally performs poorly early on, there's no penalty for declaring them poor prematurely, and in fact people seem to apply profound confirmation bias to all subsequent evidence for the remainder of the game.
Could a game effectively be designed to target this confirmation bias and give the online gaming community a more constructive and realistic picture? I'll definitely be mulling this over. Great post.
If I understand your 'problem' correctly - estimating a potential ally's capabilities and being right or wrong about that (say, when considering teammates/guildmates/raid members/whatever) - then it's not a game-specific concept at all: it applies to any partner selection under imperfect information, like mating or job interviews. As long as there is a large enough pool of potential partners, and you don't need all of the 'good' ones, false negatives matter much less than the speed and ease of the selection process and the cost of false positives, ...
You may have heard about IARPA's Sirius Program, a proposal to develop serious games that would teach intelligence analysts to recognize and correct their cognitive biases. The intelligence community has a long history of interest in debiasing, and even produced a rationality handbook based on internal CIA publications from the 1970s and '80s. Creating games that systematically improve our thinking skills has enormous potential, and I would highly encourage the LW community to consider this as a way to promote rationality more broadly.
While developing these particular games will require thought and programming, the proposal did inspire the NYC LW community to play a game of our own. Using a list of cognitive biases, we broke up into groups of no more than four and spent five minutes discussing each bias with regard to three questions:
The Sirius Program specifically targets Confirmation Bias, Fundamental Attribution Error, Bias Blind Spot, Anchoring Bias, Representativeness Bias, and Projection Bias. To this list I added the Planning Fallacy, the Availability Heuristic, Hindsight Bias, the Halo Effect, Confabulation, and the Overconfidence Effect. We did this Pomodoro style: six rounds of five minutes, a quick break, another six rounds, and then a longer break followed by a group discussion of the exercise.
Results of this exercise are posted below the fold. I encourage you to try the exercise for yourself before looking at our answers.
Caution: Dark Arts! Explicit discussion of how to exploit bugs in human reasoning may lead to discomfort. You have been warned.
Confirmation Bias
Fundamental Attribution Error
Bias Blind Spot
Anchoring Bias
Representativeness Bias
Projection Bias
Planning Fallacy
Availability Heuristic
Hindsight Bias
Halo Effect
Confabulation
Overconfidence Effect
Summary
How long do you think it should take to solve a major problem if you are not wasting any time? Everything written above was created in a sum total of one hour of work. How many of these ideas had never even occurred to us before we sat down and thought about them for five minutes? Take five minutes right now and write down the areas of your life you could optimize to make the biggest difference. You know what to do from there. This is the power of rationality.