The MIT Mystery Hunt is a collection of puzzles, solved in teams over a long weekend every year. The prize for winning is that your team gets to write next year's hunt. Mystery Hunt puzzles are generally designed to take a few hours for a few people. A hunt typically has around 100 such puzzles, organized into a dozen or so metapuzzles; the metapuzzles can typically be solved with only a subset of the answers from the puzzles for that round, so not every puzzle needs to be solved to win.

My team, Codex, won in 2011 and thus wrote the 2012 hunt, which has just concluded. I wanted to share some thoughts about the hunt, and also share one of the puzzles that didn't make it in, but that I think Less Wrong will appreciate.

Edward Z. Yang compared the process of solving puzzles to science. It's not always that way -- in particular, Duck Konundrum is the prototype of a class of puzzle which merely requires following a very complicated set of instructions, while Square Mess is a simple matter of programming (well, and univat n ovt rabhtu qvpgvbanel). But it's a pretty good way of looking at things.

This year, I was a puzzle editor as well as an author. One of the things I learned about puzzles is that authors always think their puzzles are solvable, whether or not they are. This is the Illusion of Transparency in action -- it's obvious to the author how the puzzle ought to be solved. One job of editors is to ensure that every aha is properly clued, and that there is internal confirmation that solvers are on the right track. Internal confirmation means that when there are two steps to solving a puzzle, the intermediate result contains something intelligible even with omissions or errors. For example, if an intermediate result is a set of trigrams, those trigrams should be plausibly English-like. In nature, internal confirmation comes naturally, since all of nature follows a single set of rules. But in a puzzle, the rules are entirely arbitrary, so internal confirmation must be added.
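As a rough illustration of the trigram example, here is a minimal sketch of how a solver (or editor) might check that an intermediate result looks English-like even when a few entries are wrong. The reference set `COMMON_TRIGRAMS` and the function `plausible_fraction` are hypothetical names invented for this sketch, and the tiny hard-coded set is an assumption; a real check would use trigram frequencies from a large corpus.

```python
# Hypothetical, very small sample of common English trigrams.
# Assumption: a real check would use corpus-derived frequencies instead.
COMMON_TRIGRAMS = {
    "THE", "AND", "ING", "ENT", "ION", "HER", "FOR", "THA",
    "NTH", "INT", "ERE", "TIO", "TER", "EST", "ERS",
}


def plausible_fraction(trigrams):
    """Return the fraction of trigrams that appear in the reference set."""
    if not trigrams:
        return 0.0
    hits = sum(1 for t in trigrams if t.upper() in COMMON_TRIGRAMS)
    return hits / len(trigrams)


# One bad entry out of four still leaves a mostly intelligible result.
score = plausible_fraction(["the", "ing", "qxz", "ion"])
```

A high fraction suggests the first step was done correctly, while a result near zero suggests the solver has gone wrong somewhere before the intermediate stage.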

In past hunts, a number of puzzles went completely unsolved, because there wasn't a rigorous testsolving process. Some puzzles were released with serious undetected errors, and some puzzles were simply too hard. In 2012, every puzzle was solved forwards (that is, without inferring the answer from the constraints in the metapuzzle) at least once.

The only way to tell if a puzzle really works is to have some solvers test it. Of course, these solvers can't just be people picked off the street -- they should be familiar with the conventions of the form (for instance, when converting between numbers and letters, A=1, and A+A=B, generally). Sometimes specialized knowledge is needed; some of the puzzles I wrote could not have been solved by non-programmers, and one of Codex's puzzles which failed testing required a solver with perfect pitch. But generally, it should be clear from looking at a puzzle what kind of knowledge is needed (at least for the first step). Codex avoided the problems of the past by testing every puzzle. Every puzzle that wasn't solved cleanly (and some that were) got revised and tested until it either passed, or was cut.
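The number-letter convention mentioned above can be sketched in a few lines. The function names are made up for this example, and wrapping past Z back around to A is an assumption on my part rather than a universal rule.

```python
def letter_to_number(ch):
    """Map A..Z (case-insensitive) to 1..26, so A=1."""
    return ord(ch.upper()) - ord("A") + 1


def number_to_letter(n):
    """Map 1..26 to A..Z; values past 26 wrap around (an assumption)."""
    return chr((n - 1) % 26 + ord("A"))


def add_letters(a, b):
    """Add two letters by value, so A+A=B."""
    return number_to_letter(letter_to_number(a) + letter_to_number(b))


result = add_letters("A", "A")  # "B"
```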

One of the puzzles that failed testing was one that I wrote with Danielle Sucher and Emily Morgan: Write More. We think Less Wrong readers might appreciate it anyway, so I'm posting it here.

Goddamnit! From the start of the hunt, I was waiting for the HPMoR puzzle to come out, and it turns out you guys were holding it back???

I laughed pretty hard at some of the lines. I'm still solving; I'll let you know if I manage to get it.

ROT13

Nyevtug, V tnir hc naq ybbxrq ng gur fbyhgvbaf. Vg gheaf bhg V jnf whfg bar fgrc sebz gur raq, naq V unq rira tbar qbja gur "rnpu punenpgre cebonoyl ercerfragf n pbtavgvir ovnf, nf cre synibegrkg" cngu. Ohg nsgre fpehgvavmvat rnpu punenpgre, V pbhyqa'g cnegvphyneyl frr nal cnggreaf va gur reebef gung gurl znqr, fb V tnir hc ba gung yrnq.

RL'f puncgref ner qryvpvbhfyl zhygvynlrerq va ubj gurl qrzbafgengr gurve gurzr ercrngrqyl, naq lbh cebonoyl pbhyq unir gnxra rknzcyrf qverpgyl sebz gur grkg (v.r. ZpTbantnyy ercrngrqyl znxrf gur shaqnzragny nggevohgvba reebe va puncgre bs gur fnzr anzr)

Jbhyq unir jbexrq zhpu orggre vs lbh unq hfrq yrff bofpher pbtavgvir ovnfrf, naq unq pyhrq gurz va zber fgebatyl, be creuncf n zvk bs irel boivbhf ovnfrf naq zber bofpher barf.

Jr qvqa'g jnag gb hfr gur puncgre gvgyrf, fvapr gung jbhyq zvavzvmr gur nun (be pnhfr pbashfvba, fvapr gurer jrer nyernql puncgre ahzoref orvat hfrq). Nyfb, jr unq nyernql tbar guebhtu n srj erjevgrf naq unq eha bhg bs grfgref. V qba'g guvax jr hfrq gbb-bofpher ovnfrf -- shaqnzragny nggevohgvba naq tnzoyre'f snyynpl ner eryngviryl jryy xabja, naq nyy bs gur ovnfrf jrer va Jvxvcrqvn'f yvfg. Va gur ebhtuyl fvzvyne chmmyrf Rvtug Abg Fb Qrnqyl Fvaf (2008) naq Haangheny Ynj (2011), gurer ner pnabavpny yvfgf, juvyr urer, gurer'f bayl Jvxvcrqvn. V whfg guvax vg'f n uneq ceboyrz gb hanzovthbhfyl qrzbafgengr reebef va gubhtug, tvira gung rirelbar znxrf gurfr fbegf bs reebef.

Nyfb, jr'er cebonoyl abg nf tbbq ng snasvp nf RL.

Lbh'er evtug; V tnir hc gbb rneyl orpnhfr V unq orra cevzrq jvgu "guvf chmmyr jnf fb uneq jr qvqa'g chg vg ba gur uhag". Pbby chmmyr; unir sha npghnyyl fbyivat chmmyrf arkg lrne!

I'm curious - what kind of a puzzle would require perfect pitch that couldn't be done with a microphone and computer? Is that simply against the rules or would that actually be insufficient to solve the problem?

Since the puzzle may end up being resurrected next time we win, I don't want to spoil the details. But we had someone who is doing a Ph.D. in computer music who was unable to get the correct pitches out by computer. Perhaps he could have, given more time.

I should add that I don't really understand music very well, so I wouldn't necessarily be able to explain it even if it were in the hunt.

Or even someone reasonably musical and a piano (or more likely a piano smartphone app).

Well, you could muck things up a bit so that the analysis becomes difficult. For example, you could convert the 30th through 91st prime numbers into a waveform with a pitch that can be picked out by humans but not by a guitar tuner. Maybe.

One of the puzzles that failed testing was one that I wrote with Danielle Sucher and Emily Morgan: Write More. We think Less Wrong readers might appreciate it anyway, so I'm posting it here.

Yeah, that seems like it only makes sense if you've read HPMOR. Particularly Neville's.

At this point HPMOR is very popular. Last year, when a friend and I cosplayed as Harry and Hermione from HPMOR at a con, some people actively recognized who we were supposed to be.

What's the difference between cosplaying HP!HarryHermione and HPMOR!HarryHermione?

We were in camo clothing with wands and army patches for Sunshine and Chaos. Pictures can be found here.

I haven't solved it, but my meta-reasoning would begin with the fact that the puzzle was originally directed to people who cannot be assumed to know about HPMOR, or even HP. The actual puzzle has to use the text in some other way, and the puzzle is to find the actual puzzle. All those day numbers -- what else might they mean? Is there exactly one named character in the text without a secret diary? Then the answer might involve writing theirs. Is there some pattern in who each character's diary refers to on different days?

Oh, I just assumed the answer was the sequence of question-marked days.

Or maybe the answer is to just...(spoiler omitted, but consider gur ragenapr gb Zbevn).

Hm, I never realized you were here too!

Codex is everywhere! That is, I think many Codexians are connected via multiple distinct paths.