Note: The following is a draft post I've had since 2009, and it's not great but it's worth posting for discussion. I do like the way that it prefigures some of the problems of Quirrell Points when traitors are allowed...
Need to see if this can be easily gamed, but...
Step 1. Introduce Prisoner's Dilemma. Set up computer system so that they can log in and play it in partners with investments of points (caution them: this is their actual grade at stake). Let them know that they currently don't have enough points for a passing grade on that part of the course, but that maximum investment and mutual cooperation will result in A's for (almost) everyone on it (with high probability); also that points are converted to grades on a logarithmic scale. Let them know that creating institutions and alliances is a good strategy in such games.
Initially, each student is allowed to play once per day, with 1 partner. Students log in, enter the name of their requested PD partner, enter how much they're willing to invest, and enter C or D. They'll get a "bank statement" daily as well.
If both enter C, they each get 1.2 points back (per initial point invested). If one enters C and the other D, the cooperator gets nothing back and the defector gets 2 points (per point invested). If both enter D, then each gets 0.5 points back.
Once they've had some practice with this, we move to
Step 2: Bigger investments, luck, observation.
Introduce larger group investments with higher rates of return. E.g. a five-person opportunity that pays each C player 0.2 guaranteed plus 0.4 for each C (not counting themselves), and each D player gets 0.5 guaranteed plus 0.5 for each C. (Set these up to be balanced in some way.)
Add a factor of luck, so that people can't just (be forced to) show one another their bank statements as proof of their cooperation. People should average the proper amounts, but have enough variation that it's often difficult to tell whether they cooperated or defected. (Or is there another way to handle this?)
Finally, allow a third option of observation within a deal. One idea: if you choose to observe, then you take up a spot in the investment, keep your own point, and don't count as a C for others, but you get to see some or all of the other players' choices. Make sure that information is proportionately expensive.
Step 3: Keep scaling it up, adjust (only) as necessary.
If things get too unbalanced, some progressive taxation and welfare might be in order; but disrupting the system too much will tend to destroy the things they need to learn.
Ideally, larger and larger opportunities should be offered, with the kicker being a gigantic one-shot Prisoner's Dilemma that the whole class can partake in. C might get guaranteed 0.2 plus 0.2 for every other C, while D gets guaranteed 1 plus 0.3 for every C.
Implementation:
Need a non-hackable computer program to work off of. Also, have students (for real points) give critiques and suggestions for improvement.
Similar to V_V's comment. I'm only a layman, but I don't understand one of the first steps.
The paper appears to jump straight into the logic, justifying itself (pardon the pun) by a statement about the intricacies of formal systems proving things about other formal systems.
Can you explain how and why all AI systems correspond to formal systems - what precisely you mean by 'formal system'?
(Btw, I have in fact read GEB, but I am still unable to confidently define a formal system.)
Or a better question might be "Why does this need to be possible within the realms of formal logic to be doable at the level of code/AI? There are many general problems intractable as part of pure maths, yet many numerical methods exist to solve particular instantiations. Why must this be doable at the level of pure logic?"
Good question as well! I should add motivation to the notes on this point.
Basically, we study deterministic mathematical objects within formal systems because these cases are easier for us to analyze and prove theorems about; it's an example of looking first under the streetlight before meandering farther into the darkness. Hopefully this way we'll notice many phenomena that still hold when we add randomness, bounded computations, and other realistic limitations.
And indeed, plenty of the topics in MIRI research do hold up when you make them messier. Many of the topics in formal systems can handle random variables just as easily as deterministic facts. There's a version of Löb's Theorem for bounded-length proof searches, and tracing it through shows that Löbian cooperation and other phenomena behave the same there as in the unbounded case (for sufficiently large choices of the bounds). Using quantum random variables to evaluate logical counterfactuals doesn't solve implementing UDT, it just collapses it into CDT; so we still need to study logical counterfactuals. And so on.
Also, when I say "formal system" you can usually substitute in "Peano Arithmetic" or "Peano Arithmetic plus some extra axioms".