These results are all well within each other's margin of error, i.e. with another programmer their order might be reversed.
Since these games were all meant for humans to play them (initially), we could also define some kind of "human-centered complexity", i.e. "how complex are the rules, given their players". There could be rules that are complex in a vacuum, but that mimick some behavioral rule that's already implemented in the (average) human player, and thus may less add complexity than a supposedly simpler (but not human-natural) rule.
(The ordering of two strings A and B by their complexity may yield a different result compared to the ordering of two strings (by complexity) AH and BH. Let A and B be game rules and H be some human component, then the games were devised as AH and BH, not as A and B.)
"with another programmer their order might be reversed"
This seems unlikely to me, having done it, but I would love to see someone else's attempt.
Consider that the other programmer might choose another programming language for her attempt, or compile into different ASM code to compress (or choose some other suitably-looking intermediate stage to compress at). Some programming language may (will) generate smaller code for certain commands than for others, compared to another PL.
Just considering some fixed PL, if you use one kind of loop the number of resulting GOTOs etc. may be different from using another kind of loop, even if one or the other seems to be the shorter implementation in the high-level program code you're writing, that's no guaranteed invariant.
The question you were asking was not "Choosing this one programming language I like, which game has the simplest rule?", so unless you have a really good argument why your PL of choice happens to generate the shortest code (not even the most efficient code), variances may be significant.
There are specialized languages (or at least versions of common PLs) for e.g. real time programming / optimized for satisfying specific constraints, these would be much better candidates for generating e.g. the minimum amount of machine code.
Writing Hex, for example, felt so much simpler than Go that I really have trouble imagining a programmer trying their hardest to write succinct code for both and ending up with a shorter program for Go than Hex.
I agree that a replication would be helpful, but I expect the ordering of Hex < Dots and Boxes < Go < Checkers to remain the same. If their programs came out different lengths, I suspect if we used each others insights to improve each others code we would still end up with Hex < Dots and Boxes < Go < Checkers.
(If two did switch, I would expect it to be Go and Checkers. I used someone else's carefully thought out ruleset for Go but couldn't fine a similar one for Checkers, so that one's probably got more slack to take out.)
Chinese rules for Go are quite simple. Japanese rules are quite complex (to the point where a world championship level match had a rule disagreement that resulted in a player agreeing to being forced to play a certain move in return for a promise that the rule would get changed in the future. Ouch.)
I agree that the chinese rules are more elegant, anyway.
I'm not sure about that. In a sense I'd say the Platonic ideal of what the Japanese rules are supposed to be is more elegant that the Chinese rules, it's just hard to write down the ideal rules explicitly.
(without the swap rule)
I'd like to point out that without the swap rule it's also very easy to write a program that plays perfectly.
I don't believe your comment is true in any meaningful sense. Can you explain what you mean?
Details: It's easy to prove that the first player wins in Hex without the swap rule, but it's even easier to prove the second wins in any (deterministic, ...) game with the swap rule. Neither proof is constructive, and so neither provides an efficient program.
Interpreting your statement differently, it's easy to write a program that plays any (deterministic, ...) game optimally. Just explore the full game tree! The program won't terminate for a while, however, and this interpretation makes no distinction between the versions with and without the swap rule.
How well does this map to the human experience of the game? Do two experienced players need the swap rule for the game to remain interesting?
It depends on the skill difference and the size of the board, on smaller boards the advantage is probably pretty large: Discussion on LittleGolem
Here's a fun game:
Define a set of software specifications. For example, you could specify that the software needs to correctly implement a board game.
Players submit implementations that must meet the specifications; a submission that does not meet the specification is considered incorrect and automatically disqualified.
Submissions that correctly implement the specification are compressed. The winner is the implemetation with the shortest compressed file size.
I think this game would be fun, but to make it really interesting you would want to use language-specific compressors instead of gzip. For example, you would want the Java-specific compressor to know about things like abstract classes, public/private access notations, and other aspects of Java syntax. So the compressor would actually be closely related to the compiler.
I think this kind of quantitative approach would correspond closely to a lot of our intuitions about what good software looks like. For example, the compression principle would prompt people to make most methods private, because that would allow the compressor to save bits when encoding object method calls. It would also promote code reuse for obvious reasons.
Code golfing is basically this, without the compression. You can see some examples here. The result matches very badly with our intuition of good code. Even with language-specific compression, which would presumably allow comments and reasonable variable names without any loss, I would still the optimal code to be bad from a conventional point of view.
What bad coding habits do you think trying to optimize for compressibility would promote? I can think of about 10 good habits it would promote, the most important being Don't Repeat Yourself.
Obviously you need to avoid penalizing for good variable names and comments.
What bad coding habits do you think trying to optimize for compressibility would promote?
Being easily understandable by humans.
Knuth: “The best programs are written so that computing machines can perform them quickly and so that human beings can understand them clearly. A programmer is ideally an essayist who works with traditional aesthetic and literary forms as well as mathematical concepts, to communicate the way that an algorithm works and to convince a reader that the results will be correct.”
I'm thinking in particular of two bad features:
Bad interfaces: Functions need only work on the inputs which are actually given to them in practice. They may behave unexpectedly for valid input, and they definitely won't give useful error messages for invalid input. They may have ad-hoc alterations to their behavior because it is useful for the particular domain.
Bad idioms: There is no disincentive to use unfamiliar features and unconventional patterns which will confusing anybody trying to read the code. The canonical way to do something is often not the most efficient one.
You can see these features in code golfing.
More abstractly, good code is defined by a broad range of criteria, and optimizing for only one feature is likely to do less well than intentionally setting out to write good code.
On the other hand, STEPS is project designed to make a working personal computer ins as little lines of code as possible. According to their report, making shorter code also results in simpler design. This is evidence in favor of your conclusion.
The point about good error messages seems right; I would deal with this by saying that the specification should require that the code give meaningful error messages when called on invalid input. I'm not sure what you mean by ad-hoc alterations; it seems to me that the compression principle discourages the use of ad-hoc alterations and that's exactly the point.
Here are some good cases where the compression principle would agree with standard good coding practices:
Don't import java.util.*; use import java.util.List and java.util.Map
I am curious about the compression algorithm that would compress "import java.util.List; import java.util.Map" to a shorter string than "import java.util.*;".
Unless you would artificially make it compress the latter very inefficiently -- but, of course, using that kind of strategy you could "prove" almost anything.
It's not the import statements that get compressed, it's the code beneath the imports that gets better compression because you limited the number of classes you imported. The compressor can compress references to class names to log(N) bits, where N is the number of class names that are in scope/imported.
I interpret Daniel_Burfoot's idea as: "import java.util.*" makes subsequent mentions of List longer, since there are more symbols in scope that it has to be distinguished from.
But I don't think that idea actually works. You can decompose the probability of a conjunction into a product of conditional probabilities, and you get the same number regardless of the order of said decomposition. Whatever probability (and corresponding total compressed size) you assign to a certain sequence of imports and symbols, you could just as well record the symbols first and then the imports. In which case by the time you get to the imports you already know that the program didn't invoke any utils other than List and Map, so even being able to represent the distinction between the two forms of import is counterproductive for compression.
The intuition of "there are more symbols in scope that it has to be distinguished from" goes wrong because it fails to account for updating a probability distribution over what symbol comes next based on what symbols you've seen. Such an update can include knowledge of which symbols come in the same header, if that's correlated with which symbols are likely to appear in the same program.
You're totally right about the fact that the compressor could change the order of the raw text in the encoding format, and could infer the import list from the main code body, and encode class name references based on the number of previous occurrences of the class name in the code body. It's not clear to me a priori that this will actually give better compression in practice, but it's possible.
But even if that's true, the main point still holds: limiting the number of external classes you use allows the compressor to same bits.
I believe his imagined compression scheme compresses variables so that when there are N variables in scope, it takes log(N) bits to refer to a variable, and similarly for methods.
If the source code compressor is well-written (and specialized to the language in question), then the programmer can achieve compression by using the practices mentioned above. itaibn0 has the right idea.
Code golfing encourages factoring out coincidental (as opposed to fundamental) redundancy, which is bad for understandability/maintainability. For example, there are C golf programs that read (coincidentally useful) characters from their own source to avoid having to include string literals. Basically, golfed code tends to have fragile interdependence between parts (high coupling), which makes it hard to extend.
Usually, with contests like this, you have to include size of the compression software in the program itself. Basically, you make an executable file that decompresses itself into the game. For examle, .kkrieger is a first-person-shooter with rules that are 96k in size. You simply download the game and run it.
If the compressor is neutral (gzip) or language-specific but not task-specific (as Daniel_Burfoot proposes) you should be able to ignore it.
You might also look at the computation complexity of solving the games. Games, Puzzles, and Computations, by Hearn and Demaine would be relevant here. Dots and Boxes is apparently NP-hard; a variant of Go called Rengo Kriegspiel might be undecidable; NxN Checkers is Exptime-complete.
You might also look at the computation complexity of solving the games.
You mean as another, mostly unrelated task.
Well, it's not entirely unrelated, since jkaufman says:
Go is the most interesting of the three, and has stood up to centuries of analysis and play, but Dots and Boxes is surprisingly complex (pdf) and there used to be professional Checkers players.
The interest here is not provided by the complexity of the rules themselves, but by the complexity of solving the games (or, rather, playing them well, but this is probably related). One can easily imagine games with very complex rules that nonetheless admit simple strategies and are thus boring.
We seem to agree: if I tell you the length of two pieces of code but nothing else, you won't be able to tell me which of them is more likely to terminate. It could be the one, or it could be the other. The relationship may not be strictly orthogonal (e.g.: longer code could contain more unintended infinite loops), but enough to call it mostly unrelated.
Same with complexity of rules versus "solving the games". Go compensates for the simplicity of its rules by blowing up the search space (it's a big board), which doesn't take any noteworthy additional complexity. The rules of a Go variant played on a 5 x 5 board would have about the same complexity as if played on a 3^^^3 x 3^^^3 board.
Some of the easiest-to-play children's board games have some of the hardest rules, compared to Go.
Yes, but we can generalize the games (which is what Hearn and Demain do), and see how the solving complexity changes with the size of the board. This is the only reasonable way to talk about the computational complexity of games.
a variant of Go called Rengo Kriegspiel might be undecidable
Where by "variant" you mean so different it belongs in a different category of games.
FWIW, I tried replacing gzip by better compressors. These files are so small that bzip and LZMA do worse, but paq was consistently 10% better than gzip. In particular, it didn't change the order. (I'm surprised that LZMA ever does worse than LZ. Maybe xz, the implementation I used, has big headers?)
Maybe xz, the implementation I used, has big headers?
Maybe. Does performance improve when you disable the default CRC-64 integrity check with --check=none
?
I tried and that's only 8 bytes, a single checksum. I'm seeing 50-100 more bytes for xz than gz on all of them, from the smallest hex-sm.py to the largest checkers.py.
Then I read the man page. It mentions "the filter chain...which normally would be stored in the container headers," which could be it, but doesn't sound like a lot of bytes. Also, I discovered another option: --format=lzma gets checkers almost all the way down to gz (xz,lzma,gz=1864,1826,1821), but gets hex-sm only halfway (xz,lzma,gz=428,391,356). (xz meaning without checksum)
I'm afraid I'm lacking on details, it was in a foreign country and every was speaking in a foreign language (Netherlands, Dutch). I was told that he was a professional. I'll add that the guy was being filmed, but for what I don't know. Anyway, he walked around twenty of us like the chess professionals, playing a move against each as soon as he arrived. He beat everyone too.
How about graph traversal problems like Shortest Path, Traveling Salesman (Shortest Tour), Capacitated Vehicle Routing, Arc Routing with Profits, et cetera. People in computer science and operations research seem to have nearly unlimited interest in such things. Or are you looking for something adversarial?
I've seen various contenders for the title of simplest abstract game that's interesting enough that a professional community could reasonably play it full time. While Go probably has the best ratio of interest to complexity, Checkers and Dots and Boxes might be simpler while remaining sufficiently interesting. [1] But is Checkers actually simpler than Go? If so, how much? How would we decide this?
Initially you might approach this by writing out rules. There's an elegant set for Go and I wrote some for Checkers, but English is a very flexible language. Perhaps my rules are underspecified? Perhaps they're overly verbose? It's hard to say.
A more objective test is to write a computer program that implements the rules. It needs to determine whether moves are valid, and identify a winner. The shorter the computer program, the simpler the rules of the game. This only gives you an upper bound on the complexity, because someone could come along and write a shorter one, but in general we expect that shorter programs imply shorter possible programs.
To investigate this, I wrote ones for each of the three games. I wrote them quickly, and they're kind of terse, but they represent the rules as efficiently as I could figure out. The one for Go is based off Tromp's definition of the rules while the other two implement the rules as they are in my head. This probably gives an advantage to Go because those rules had a lot of care go into them, but I'm not sure how much of one.
The programs as written have some excess information, such as comments, vaguely friendly error messages, whitespace, and meaningful variable names. I took a jscompiler-like pass over them to remove as much of this as possible, and making them nearly unreadable in the process. Then I ran them through a lossless compressor, gzip, and computed their sizes:
(The programs are on github. If you have suggestions for simplifying them further, send me a pull request.)
[1] Go is the most interesting of the three, and has stood up to centuries of analysis and play, but Dots and Boxes is surprisingly complex (pdf) and there used to be professional Checkers players. (I'm having a remarkably hard time determining if there are still Checkers professionals.)
I also posted this on my blog.