timtyler comments on AI cooperation in practice - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (157)
You don't specify the implementation of "proves(code, maxsteps)".
The output would appear to depend on that.
Of course, there's the whole business of universal heat death once again...
Nothing depends on the details of proof verifier, since it completely covers all proofs up to some length, a set of proofs which is completely defined by the logical language. The arguments of proves(-,-) determine its value (and it's even a primitive recursive function).
On the other hand, pathological formal language can be constructed, that makes proof of "main()==1" arbitrary large (e.g. where 1 is represented by 3^^^3 symbols). Thus it is better to be proven that formal language isn't pathological, i.e there is no prover proves' that satisfies condition
...but the the "logical language" is not specified! That is one of the implementation details that is missing in the problem specification.
You know, there are always unstated assumptions, like sanity of the reader.
Usually, when posing puzzles, it is best to try to minimise the number of unstated assumptions the solution depends upon - or there won't be a unique answer.
Enumeration order of proofs is not specified, so we can construct pathological prover, that reorders proofs such that all proofs of "main()==1" require more steps than maxsteps.
You can't influence the length of a proof by placing it at a different position in some list of proofs. Parameter 'maxsteps' specifies the maximum length (or Gödel number) of the proofs that get checked, not the maximum number of steps performed by the proof checker algorithm.
Ouch. maxsteps means maxprooflenght. My inattention, sorry.
The argument is admittedly sketchy, but I don't think it depends much on the implementation of "proves". Of course I'd want the proof system to hold the same properties that are used in the proof of Löb's Theorem, because my argument is almost an exact replica of that; but pretty much all proof systems actually used by people (apart from some extremely weak ones) satisfy these properties. And you're guaranteed not to become worse off if you make your proof system stronger while your opponent stays the same. That's a really big deal, if my argument is indeed correct.
About heat death: even though the two programs say they can iterate up to 3^^^3 or something, the actual time to cooperation will be much less, because the reasoning steps that I outlined are exactly the reasoning steps they'd need to stumble upon. Yes, that would make a proof of length 10000 or so (which would mean 2^10000 proofs tried), but a that's long way from 3^^^3. If the programs use some kind of heuristics to search for working proofs (but still guarantee that they will do the brute force search if heuristics fail), the proofs can be found quicker still.
Re: "apart from some extremely weak ones"
It does depend on the implementation - because it is relatively easy to imagine poor provers that result in a failure to understand and prove anything much. So, you would need to specify some constraints on the prover before you can make statements about what it outputs.
timtyler: I think you're confused about how proof verifiers work. A proof verifier takes a proof as input, and verifies that the given proof is correct. It doesn't have to generate the proof itself, and it gets to assume that each step is some small, easily-checkable application from a finite set of axioms.
Put another way, if the proof you give a verifier makes a large leap in reasoning, then the verifier will just tell you that the proof is wrong. Moreover, by the formalization that we demand of those proofs, the proof is wrong.
So, the problem that proof verifiers solve is computable. So, we need only to assume that the proof verifier that these programs have is correctly implemented, we don't really need to know their internal details.
(If you already understand this, though, then I'm confused about your confusion. Could happen.)
[edit:] Actually, I think I might see the point you're making; we have to assume the theory the prover understands is rich enough to model stepwise program semantics. This probably should have been specified, but we can just assume it.
It's a nice result, but I doubt it'll ever be practically relevant.
You could implement the algorithm today, although with a limit of less than 3^^^3. Rough guesstimates suggest that about a limit of 50 would be workable at home, 100 for a worldwide distributed computing effort and 500 for a universe-sized computer that does one operation per bit per plank time. In all cases, nowhere near 10000 (let alone 3^^^3).
Heuristics could help a bit, but making the algorithm more complex is likely to also make the proof that it works longer and so harder to find. The fallback to brute-force search is unhelpful because no computer that could actually exist would be able to find the proof that way.
If program A uses heuristics, program B doesn't need to prove that these heuristics are correct. It only needs to see that A's use of heuristics makes A either output 1 or move on to the brute-force search.
Agreed with the rest of your comment. An easier way to reach the same conclusion: if program B is implemented as "always defect", program A has to run the brute-force search to the end, there's no shortcut (because having any shortcut in the code would make the proofs invalid if B were a copy of A).