Scenario
You're not certain that aliens are real. Even after receiving that email from your friend who works at NASA, with the subject line "What do you make of this?" and a single binary file attached, you're still not certain. It's rather unlikely that aliens would be near enough to Earth to make contact with humanity, and unlikelier still that of all the humans on Earth, you would end up being the one tasked with deciphering their message. You are a professional cryptanalyst, but there are many of those in the world. On the other hand, it's already been months since April Fools, and your friend isn't really the prankster type.
You sigh and download the file.
Rules
No need to use spoiler text in the comments: This is a collaborative contest, where working together is encouraged. (The contest is between all of you and the difficulty of the problem.)
Success criteria: The people of LessWrong will be victorious if they can fully describe the process that generated the message, ideally by presenting a short program that generates the message, but a sufficiently precise verbal description is fine too.
The timeframe of the contest is about 2 weeks. The code that generated the message will be revealed on Tuesday July 12.
Why is this interesting?
This contest is, of course, based on That Alien Message. Recently there was some discussion under I No Longer Believe Intelligence to be "Magical" about how realistic that scenario actually was. Not the part where the entire universe was a simulation by aliens, but the part where humanity was able to figure out the physics of the universe 1 layer up by just looking at a few frames of video. Sure it may be child's play for Solomonoff Induction, but do bounded agents really stand a chance? This contest should provide some experimental evidence.
Hex dump of the first chunk of the file:
I just picked out enough lines to make some obvious observations. There are multiple sequences of
ffffff
, which I assume to be some kind of record separator. The beginning of the file is highly structured; the latter part of the file much less so. The obvious conclusion is that the first part of the file is some kind of metadata or header, while the main body begins further down.The nature of binary representations of floating-point is that nice bit-patterns make for round numbers and vice-versa, so I'm not sure that we can conclude a lot from that. The fact that the floating-point interpretation of the data results in numbers that cluster around certain values is telling, but could still be a red-herring. Part of my reluctance to endorse that theory is narrative: we were told that this is a simulated alien message, and what are the odds that aliens have independently invented double-precision floating point?
In any case, I'm reading those threads attentively, but in the meantime I'm going to pursue some hunches of my own.