Passing in the source code is not the same as quining. A program that is passed in its own source code can easily check to see it's been altered (e.g. by including a cryptographic signature in the source code). With quining, the program can be mutated without easy detection.
How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format. You could mess with that if you're smart enough to locate and delete the "verify the signature" step, but then you could do that in the real "passed its own sourcecode" format too.
Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.
Are we using summand to mean the same thing here? To me, if we have an expression X1 + X2 + X3, then the summands are X1, X2, and X3. If we want to estimate Y, and E[X1+X2+X3] = Y, but E[X2] is close to 0 while Var[X2] is large, then X1+X3 is a better estimate for Y than X1+X2+X3 is.
Assume you have noisy measurements X1, X2, X3 of physical quantities Y1, Y2, Y3 respectively; variables 1, 2, and 3 are independent; X2 is much noisier than the others; and you want a point-estimate of Y = Y1+Y2+Y3. Then you shouldn't use either X1+X2+X3 or X1+X3. You should use E[Y1|X1] + E[Y2|X2] + E[Y3|X3]. Regression to the mean is involved in computing each of the conditional expectations. Lots of noise (relative to the width of your prior) in X2 means that E[Y2|X2] will tend to be close to the prior E[Y2] even for extreme values of X2, but E[Y2|X2] is still a better estimate of that portion of the sum than E[Y2] is.
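The advantage of summing the conditional expectations can be checked numerically. The sketch below is mine, not from the comment: it assumes Gaussian priors and Gaussian noise with arbitrary parameter values (`tau2`, `sigma2`), and compares the mean squared error of the three candidate estimators by simulation.

```python
import random

random.seed(0)

tau2 = 1.0                 # prior variance of each Y_i (prior mean 0)
sigma2 = [0.1, 10.0, 0.1]  # measurement noise; X2 is much noisier

def shrink(x, s2):
    # Posterior mean E[Y|X=x] for a Gaussian prior and Gaussian noise:
    # x is pulled toward the prior mean 0 by the factor tau2/(tau2+s2).
    return x * tau2 / (tau2 + s2)

def squared_errors():
    ys = [random.gauss(0, tau2 ** 0.5) for _ in range(3)]
    xs = [y + random.gauss(0, s2 ** 0.5) for y, s2 in zip(ys, sigma2)]
    y = sum(ys)
    naive = sum(xs)                                   # X1 + X2 + X3
    dropped = xs[0] + xs[2]                           # X1 + X3
    shrunk = sum(shrink(x, s2) for x, s2 in zip(xs, sigma2))
    return [(est - y) ** 2 for est in (naive, dropped, shrunk)]

n = 20000
mse = [sum(e) / n for e in zip(*(squared_errors() for _ in range(n)))]
print(mse)  # shrunk beats dropped, which beats naive
```

With these parameters the naive sum is dominated by X2's noise, dropping X2 does much better, and shrinking each measurement toward its prior mean does better still.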
Yes, but the distinction between the two isn't captured by computability theory (unless you are allowed to pose infinitely many problems). Also, if the universe is spatially infinite, it can solve the halting problem in a deeply silly way, namely there could be an infinite string of bits somewhere, each a fixed distance from the next, that just hardcodes the solution to the halting problem.
This is obviously unsatisfying, but part of the reason it's unsatisfying is that even if such an infinite string of bits existed (and even if we somehow verified that it was trustworthy), it would constitute a horribly inefficient algorithm for solving the halting problem: moving at constant speed, it would take time exponential in the length of the code for a Turing machine to go to the place in the universe where the bit determining whether that Turing machine halts is stored. But this is a complexity consideration, not a computability consideration.
unless you are allowed to pose infinitely many problems
Or one selected at random from an infinite class of problems.
Also, if the universe is spatially infinite, it can solve the halting problem in a deeply silly way, namely there could be an infinite string of bits somewhere, each a fixed distance from the next, that just hardcodes the solution to the halting problem.
That's why both computability theory and complexity theory require algorithms to have finite-sized source code.
What's impossible is MWI with something exactly like Time-Turners including Novikov.
I am ignorant on these topics, but isn't Novikov consistency predicated on QM? In that the "actual" paradox-free world is produced by a sum-over-histories? What about MWI prevents this?
Sorry if this is an incredibly stupid question.
Novikov consistency is synonymous with Stable Time Loop, where all time travelers observe the same events as they remember from their subjectively-previous iteration. This is as opposed to MWI-based time travel, where the no paradox rule merely requires that the overall distribution of time travelers arriving at t0 is equal to the overall distribution of people departing in time machines at t1.
Yes, Novikov talked about QM. He used the sum-over-histories formulation, restricted to the subset of histories that each singlehandedly form a classical stable time loop. This allows some form of multiple worlds, but not standard MWI: This forbids any Everett branching from happening during the time loop (if any event that affects the time traveler's state branched two ways, one of them would be inconsistent with your memory), and instead branches only on the question of what comes out of the time machine.
The obvious continuing example of this is Tor, developed by the US government and still supported and not cracked down upon, because cracking down would defeat the point of making it, which was to enable its servants to browse anonymously and also help out its enemies' critics & foes.
The US government made Tor? Awesome. I wonder which part of the government did it. The intelligence agencies could be expected to oppose it, because they effectively lose power.
Right. In general, spawning time travel is paradox-free, which is why I'm not clear on the claimed "impossibility of Time-Turners given MWI". Presumably if you already spawn uncountable numbers of worlds all the time, it's not a big deal to spawn one more.
You can certainly postulate a physics that's both MWI and contains something sorta like Time-Turners except without the Novikov property. The problem with that isn't paradox, it just doesn't reproduce the fictional experimental evidence we're trying to explain. What's impossible is MWI with something exactly like Time-Turners including Novikov.
Thanks for writing this! This is definitely an important skill and it doesn't seem like there was such a post on LW already.
Some mild theoretical justification: one reason to expect this procedure to be reliable, especially if you break up an estimate into many pieces and multiply them, is that you expect the errors in your pieces to be more or less independent. That means they'll often more or less cancel out once you multiply them (e.g. one piece might be 4 times too large but another might be 5 times too small). More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).
Another mild justification is the notion of entangled truths. A lot of truths are entangled with the truth that there are about 300 million Americans and so on, so as long as you know a few relevant true facts about the world your estimates can't be too far off (unless the model you put those facts into is bad).
More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).
If success of a Fermi estimate is defined to be "within a factor of 10 of the correct answer", then that's a constant bound on the allowed error of the logarithm. No "compared to the expected value of the logarithm" is involved. Besides, I wouldn't expect the value of the logarithm to grow with the number of pieces either: the log of an individual piece can be negative, and the true answer doesn't get bigger just because you split the problem into more pieces.
So assuming independent errors, and using either Hoeffding's inequality or the central limit theorem to estimate the error of the result, you conclude that you're better off using as few inputs as possible. The reason Fermi estimates even involve more than one step is that you can make the per-step error smaller by choosing pieces that you're somewhat confident of.
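The independence argument in this exchange can be checked numerically. The error model below is my own (each piece's log10 error uniform in [-0.5, 0.5], i.e. each piece within roughly a factor of 3): the typical log-scale error of the product grows like sqrt(n), not n, so per-piece cancellation is real, but the absolute error still grows as you add pieces.

```python
import math
import random

random.seed(0)

def log_error(n_pieces):
    # log10 error of the product = sum of the per-piece log10 errors,
    # each drawn independently and uniformly from [-0.5, 0.5].
    return sum(random.uniform(-0.5, 0.5) for _ in range(n_pieces))

def std_log_error(n_pieces, trials=20000):
    errs = [log_error(n_pieces) for _ in range(trials)]
    mean = sum(errs) / trials
    return math.sqrt(sum((e - mean) ** 2 for e in errs) / trials)

# Quadrupling the number of pieces only doubles the typical
# log-scale error: sqrt(n) growth, not linear growth.
for n in (1, 4, 16):
    print(n, round(std_log_error(n), 3))
```

This is consistent with both comments: errors partially cancel relative to a naive worst case, yet with equally noisy pieces you still do best with as few of them as possible.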
(Bit off-topic:)
Why are we so invested in solving the Löb problem in the first place?
In a scenario in which there is AGI that has not yet foomed, would that AGI not refrain from rewriting itself until it had solved the Löb problem, just as Gandhi would reject taking a pill which would make him more powerful at the risk of changing his goals?
In other words, does coming up with a sensible utility function not take precedence? Let the AGI figure out the rewriting details: if the utility function is properly implemented, the AGI won't risk changing itself until it has come up with its own version of staying reflectively consistent. If it did not protect its own utility function in such a way, that would be such a fundamental flaw that having solved Löb's problem wouldn't matter; it would just propagate the initial state of the AGI not caring about preserving its utility function.
It certainly would be nice having solved that problem in advance, but since resources are limited ...
edit: If the answer is along the lines of "we need to have Löb's problem solved in order to include 'reflectively consistent under self-modification' in the AGI's utility function in a well-defined way", I'd say "Doesn't the AGI already implicitly follow that goal, since the best way to adhere to utility function U1 is to keep utility function U1 unchanged?" This may not be true; maybe without having solved Löb's problem, AGIs may end up wireheading themselves.
What if, in building a non-Löb-compliant AI, you've already failed to give it part of your inference ability / trust-in-math / whatever-you-call-it? Even if the AI figures out how to not lose any more, that doesn't mean it's going to get back the part you missed.
Possibly related question: Why try to solve decision theory, rather than just using CDT and let it figure out what the right decision theory is? Because CDT uses its own impoverished notion of "consequences" when deriving what the consequence of switching decision theories is.
I got a different updated value ratio in part 2. If my calculations are wrong, would someone correct me?
V = Value; A = Analysis predicted value
Prior Probabilities:
P(V=0)=0.4999
P(V=1)=0.5
P(V=100)=0.0001
Analysis Result Probabilities:
P(A=0)=(0.5*0.4999)+((1/3)*0.5*1)=0.4166
(half the ones that are zero plus a third of the half where the test fails)
P(A=1)=0.4167
P(A=100)=0.1667
Accurate analysis results:
P(A=0|V=0)=P(A=1|V=1)=P(A=100|V=100)=(1/2)+((1/2)*(1/3))=2/3
(the half when the analysis works and reports accurately, plus
the sixth when it fails but happens to give the correct value anyway)
Inaccurate analysis results:
P(A=0|V=1)=P(A=0|V=100)=P(A=1|V=0)=P(A=1|V=100)=P(A=100|V=0)=P(A=100|V=1)=(1/2)*(1/3) = 1/6
Posterior Probabilities:
P(V=0|A=0)=P(A=0|V=0)*P(V=0)/P(A=0) = (2/3)*0.4999/0.4166 = 0.8000
P(V=1|A=1)=0.7999
P(V=100|A=100)=0.0004
P(V=0|A=1)=0.2000
P(V=0|A=100)=0.4998
P(V=1|A=0)=0.2000
P(V=1|A=100)=0.4999
P(V=100|A=0)=4E-5
P(V=100|A=1)=4E-5
So,
EV(A=0)=0.800*0+0.2000*1+4E-5*100=0.2040
EV(A=1)=0.8039
EV(A=100)=0.5399
So the ratio goes from 50:1 to 80:54 unless I'm off somewhere. I'm just starting to learn this stuff so any feedback will be welcome.
EDIT: formatting
DOUBLE EDIT: I realize this isn't the point of the article and has no bearing on the conclusion. This was an exercise in how to update an EV for me.
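For what it's worth, the whole update can be checked with a short script under this comment's model (the analysis reports the true value with probability 1/2, and otherwise a uniformly random value from {0, 1, 100}); the expected values it prints match the figures above up to rounding.

```python
# Prior over the true value V and the analysis result A.
values = [0, 1, 100]
prior = {0: 0.4999, 1: 0.5, 100: 0.0001}

def likelihood(a, v):
    # P(A=a | V=v): a correct report half the time, plus a 1/6 chance
    # ((1/2) * (1/3)) of landing on any particular value when it fails.
    return (0.5 if a == v else 0.0) + 0.5 / 3

# Bayes: P(V=v | A=a) = P(A=a | V=v) P(V=v) / P(A=a).
posterior = {}
for a in values:
    p_a = sum(likelihood(a, v) * prior[v] for v in values)
    posterior[a] = {v: likelihood(a, v) * prior[v] / p_a for v in values}

for a in values:
    ev = sum(v * p for v, p in posterior[a].items())
    print(f"EV(A={a}) = {ev:.4f}")
```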
"50:4" in the post refers to "P(V=1|A=100)*1 : P(V=100|A=100)*100", not "EV(A=1) : EV(A=100)". EV(A=1) is irrelevant, since we know that A is in fact 100.
I'm playing Prisoner's Dilemma and wish to test if an opponent X is honest. I might try the following:
(1) Create two programs, Y and Z, which are algorithmically equivalent but obfuscated versions of X. (2) Run Y and Z against each other.
If Y and Z don't cooperate with each other, that's a good indication that X recognizes itself with a source-code comparison and that I shouldn't trust X.
This honesty check doesn't work if Y and Z are given access to their sources. Sure, when I simulate Y against Z, I could lie to Y and tell Y that its source is X (so Y believes itself to be unmodified). But when my deluded Y simulation is deciding whether to cooperate with Z, it (Y) may run Z in simulation. If Y informs its Z-simulation that Z's source is Z, then that Z-simulation will not be deluded into thinking that it is unmodified. Y's simulation of Z will be able to detect that it is an (obfuscated) simulation and act accordingly.
This honesty check isn't foolproof. X can recognize itself with a more complicated handshake, one that survives code obfuscation. But if X recognizes itself with a more complicated handshake, then X doesn't need to know its own source code (and we shouldn't bother passing the source code in).
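A toy version of the check, with names and representation of my own choosing: a bot is a function of (my_source, opponent_source), and this X "recognizes itself" by literal source comparison. Y and Z stand in for the obfuscated copies, behaviorally identical but textually different.

```python
def X(my_src, opp_src):
    # Self-recognition by exact source-text comparison.
    return "C" if opp_src == my_src else "D"

x_src = "return 'C' if opp_src == my_src else 'D'"
y_src = x_src + "  # obfuscated copy 1"   # same behavior, different text
z_src = x_src + "  # obfuscated copy 2"

print(X(x_src, x_src))  # X against its own source: cooperates
print(X(y_src, z_src))  # obfuscated copies fail to recognize each other
```

The mutual defection between the Y and Z sources is exactly the signal the honesty check looks for: X keys on its literal source rather than on behavior.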
I had in mind an automated wrapper generator for the "passed own sourcecode" version of the contest:
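The generator itself isn't shown in this thread; the sketch below is my reconstruction of the transformation being described. A bot is a function bot(self_src, opponent_src), and the wrapper discards whatever "self" source it is handed, always telling CliqueBot that its source is the canonical CliqueBot source.

```python
def wrap(CliqueBot, cliquebot_src):
    # cliquebot_src: the canonical source text of the unmodified CliqueBot.
    def WrappedCliqueBot(self_src, opponent_src):
        # (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y) for all X, Y:
        # the reported "self" source is ignored entirely.
        return CliqueBot(cliquebot_src, opponent_src)
    return WrappedCliqueBot
```

For example, if CliqueBot cooperates exactly when the opponent's source matches its own, the wrapped version behaves the same way no matter what self-source the tournament reports to it.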
Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there's no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot doesn't depend on its "self" argument provides a way to distinguish it from the unmodified CliqueBot using only blackbox queries, so in that sense it's not quite functionally identical. On the other hand, if you consider it unfair to discriminate against agents just because they use old-fashioned quine-type self-reference rather than exploiting the convenience of a "self" argument, then this transformation is fair.