The black box scenario has been discussed on everything-list and on FOM. The question of "believing" the aliens is more subtle than you imply.
If "believing" refers to epistemic probabilities, then your claim about SI is wrong. To see why, view the Solomonoff prior as a probability distribution over possible sequences of future observations instead of over possible physics. It dominates all computable priors up to a multiplicative constant. In other words, in the long run SI never loses to any computable belief-outputting algorithm by more than a constant difference in log score, even over an uncomputable input sequence. (For more details, see Shane Legg's text.) So if you're a computable human who assigns consistent probabilities to future observations involving the box, then SI won't do epistemically worse than you, no matter what strange machinery may hide inside the box.
On the other hand, if "believing" refers to making better decisions (essentially, winning more money than SI in games that involve the black box), then the answer depends on the exact formulation of the game you're going to play. I explored that question here and here, the comments are also pretty good.
Yes, these are certainly good arguments, and valid as far as they go. But I'm not sure they entirely solve the problem.
For concreteness, suppose what the aliens sold us is not a black box, but a sheet of paper with a thousand digit binary number written on it, claimed to be the first thousand bits of Chaitin's omega (which they claim to have generated using a halting oracle that works by a hack they discovered that allows them to exploit the infinite computation underlying physics). We quickly verified the first few bits, then over the next couple of generations verified a few more bits at decreasing rate.
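For reference (the standard definition, not part of the aliens' story): for a prefix-free universal machine $U$,

$$ \Omega \;=\; \sum_{p\,:\,U(p)\ \text{halts}} 2^{-|p|}, $$

and knowing the first $k$ bits of $\Omega$ is essentially equivalent to knowing which programs of length up to about $k$ halt, which is why independent verification of further bits slows to a crawl.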
Omega is algorithmically random, so as far as Solomonoff induction is concerned, no matter how many bits are verified, the probability of the next bit being correct is 50%. On the other hand, we humans assign a higher probability than that, based on a nonzero credence that the whole sequence is correct; and this credence becomes higher the more bits are verified.
It is true that computable humans cannot consistently beat SI in the long run. But does that not just mean in this case that we cannot verify indefinitely many bits of omega? For all the other bits written on that page, there is a fact of the matter regarding whether they are right or wrong. Supposing they are indeed right, do we not still end up holding a more accurate belief than SI, even if we have difficulty translating that into winning bets?
I prefer to think that SI doesn't even have "beliefs" about the external universe, only beliefs about future observations. It just kinda does its own thing, and ends up outperforming humans in some games even though humans may have a richer structure of "beliefs".
I prefer to think that SI doesn't even have "beliefs" about the external universe, only beliefs about future observations.
A program can use logical theories that reason about abstract ideas that are not at all limited to finite program-like things. In this sense, SI can well favor programs that have beliefs about the world, including arbitrarily abstract beliefs, like beliefs about black-box halting oracles, and not just beliefs about observations.
Fair enough. It seems to me that SI has things it is most reasonable to call beliefs about the external universe, but perhaps this is just a disagreement about intuition and semantics, not about fact; it doesn't jump out at me that there is a practical way to turn it into a disagreement about predictions.
Think of Solomonoff induction as trying to guess what prediction algorithm you should be using. Past observations tell you how well various prediction algorithms would have done. Then it's suddenly not such a big assumption that the things it weights are mere algorithms. And I expect Solomonoff induction would beat whatever procedure you are actually using to predict the behavior of the black boxes (though having actual math for this would be better).
notwithstanding that I know very well the universe is not constrained by my ideas of efficiency, it still strikes me as grossly inefficient to the point of inelegance for the universe to perform infinite computation of which only a finite fraction can be used even in principle.
As someone with the occasional hobby of hand optimizing assembly code, I can certainly empathize with that intuition. But I think it's wrong. Even assuming that efficiency-as-elegance is epistemically relevant, efficiency can only be defined relative to a model of computation, and there is no reason to think that the model of computation we usually have in mind, due to being familiar with such computers in our daily lives, is relevant for what is efficient/elegant for the universe.
It might help to recalibrate your intuitions to consider the following. Suppose we eventually build a quantum computer that can factor large integers in polynomial time. To simulate such a quantum computer using a classical computer would take exponential time, whereas there are known subexponential (more than polynomial but less than exponential) time classical factoring algorithms. Would it be somehow grossly inefficient for nature to "actually" do quantum computation when we turn on that quantum computer, instead of quietly substituting a subexponential classical factoring computation behind the scenes?
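To put rough numbers on that comparison (standard complexity estimates, not from the original comment): the best known classical algorithm, the general number field sieve, factors an integer $N$ in heuristic time

$$ \exp\!\Big( (c + o(1))\,(\ln N)^{1/3}(\ln\ln N)^{2/3} \Big), \qquad c = (64/9)^{1/3}, $$

while Shor's algorithm on a quantum computer runs in time polynomial in $\log N$ (roughly cubic with schoolbook arithmetic), and a classical simulation of that quantum computation takes time exponential in the number of qubits.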
If the epistemically relevant model of computation is not a classical computer, but at least a quantum computer, then why can't it be a computer that handles continuous mathematical structures as easily as it handles discrete ones, or one that can query halting oracles?
Even assuming that efficiency-as-elegance is epistemically relevant, efficiency can only be defined relative to a model of computation, and there is no reason to think that the model of computation we usually have in mind, due to being familiar with such computers in our daily lives, is relevant for what is efficient/elegant for the universe.
Would you also say that our observations don't necessarily follow a simplicity prior because there's no reason to think our notion of simplicity (which is also defined only relative to a model of computation) is relevant for the universe?
(To side step the issue of how to make sense of probabilities over observations given anthropic-reasoning problems, I'll assume UDT and just talk about priors on universes instead.)
I'm still torn between thinking that the universe must follow a simplicity prior in some objective sense and there might be a way to find out what that objective prior is, and the idea that a prior is just a way of specifying how much we care about one universe versus another. (See the third and fourth bullet points in What Are Probabilities, Anyway?)
If it's the former, then any notion of simplicity that we currently have might be wrong in the sense of not matching up with the objective "reality fluid". If the latter, a notion of simplicity might be wrong in the sense of not reflecting how much we actually care about one universe versus another. In either case, it seems unlikely that any specific notion of simplicity, based on a model of computation (e.g., universal Turing machine) that was chosen because it happens to be useful for reasoning about practical computers, would turn out to be right.
Does that answer your question?
I'm still torn between thinking that the universe must follow a simplicity prior in some objective sense and there might be a way to find out what that objective prior is, and the idea that a prior is just a way of specifying how much we care about one universe versus another.
It doesn't seem to me that these are necessarily at odds---this ties in with the "is value/morality simple?" question, no?
(Optional conceptual soup: And obviously some mixture of the two is possible, and perhaps a mixture that changes over time/causality as things (decision procedures) get more timeless over time (evolution being timeful, humans less so, AIs even less so). One of my models of morality represents this as deals between algorithms with increasingly timeless discount rates, from hyperbolic to exponential to none, over time, where each algorithm gets satisfied but the more timeless algorithms win out over time by their nature (and are progressively replaced by ever more timeless algorithms until you have a completely acausal-trade-like situation). This highlights interesting parallels between discounting and cooperation---which can be thought of as symmetries between time and space, or future selves and present compatriots---and is generally a pretty useful perspective on the moral universe. That's the conclusion I have cached anyway. Ainslie's book "Breakdown of Will" provides some relevant background concepts.) (ETA: /Insert some sheer nonsense about the ergodic hypothesis and generally making analogies to statistical mechanics / probability theory / quantum information theory, simply because, well, at this point why not? I suspect that reading tons of academic paper abstracts without taking time to really understand any of them is a rather "attractive" form of pure Platonic wireheading.)
Indeed, once I realized quantum mechanics took exponential computing power, I considered this inelegant until I concluded the many-worlds interpretation means it's being put to good use after all. Is that a case of ending up at the right belief for the wrong reason? On the one hand you could say SI doesn't bat an eyelid at exponentially inefficient computation. On the other hand you could say it does, when you take into account the need to specify spacetime coordinates of the observer as well as underlying laws; in a sense, that discourages too much inefficiency of the wrong sort.
Having said that, I'm inclined to think continuum arithmetic isn't the 'wrong sort' of inefficiency in this sense. But see my reply to Daniel - how do you bite this bullet without making the 'arbitrary choice of basis' limitation much worse?
So "SI" appearing in a random LW comment can now mean superintelligence, Singularity Institute, Système international or Solomonoff induction. Is that all of them so far?
galactic intellectual property law
Be precise. Do you mean galactic patent law, galactic copyright law, or galactic trademark law?
I suppose we'll find out when we reach a high enough tech level to be able to build our own halting oracles, and promptly get sued. If the lawsuit rests on a claim that we reverse engineered the one we bought, it must be copyright law; if the lawsuit doesn't even need to claim that, it must be patent law.
Under galactic law, the prosecution presents a weighted distribution over possible charges, claims, and arguments, and the weights are hyperreal quaternions.
(Not really, of course. The truth will be far stranger than that.)
I see no reason to think that galactic law will resemble ours. Extremely vague concepts like "intellectual property" are about as close of an analogy as we're likely to get.
That was a joke on my part, but also a warning against using overly general umbrella terms. Our copyright and patent laws developed out of particular historical circumstances, and it is entirely possible that a hypothetical alien civilization would treat the sharing and distribution of ideas entirely differently, in ways that resemble none of our historical precedents.
I would certainly hope so! For that matter, I hope future human civilization will treat it differently as well. (I'd like to replace "hope" with "am confident", but alas I'm not quite that much of an optimist.)
As of this thread, "SI" appearing in a random LW post can now mean superintelligence, Système international, Singularity Institute or Solomonoff induction. Is that all of them so far (not counting one-offs)?
To say why Solomonoff induction can't predict halting oracles, you needn't use the concept of manifest infinity: one can describe the behavior of a halting oracle in a formal language, but one can't write a program that will predict its behavior; therefore the halting oracle hypothesis is not found among Solomonoff induction's ensemble of hypotheses.
However, you can write a program that takes arbitrary-precision numbers as inputs, and produces outputs with the same precision. Continuous models of physics are like this. They are infinite in that there is no limit on the precision of the quantities they can manipulate. But it is only in interpreting these programs that we imagine manifest infinities, or electrons that contain infinite information.
(I'm aware that the programs in Solomonoff induction's ensemble of hypotheses don't take inputs, but I don't think that's important.)
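As a toy illustration of that kind of program (my sketch in Python, not anything from the comment above): a finite program that serves up a "continuous" quantity to whatever precision you ask for, with the infinity living only in the unbounded input parameter, not in the program itself.

```python
from fractions import Fraction

def sqrt2_to_precision(n_bits: int) -> Fraction:
    """Return a rational approximation of sqrt(2) accurate to within 2**-n_bits.

    A finite program handling a 'continuous' quantity: the only infinity
    is in the unbounded precision parameter, not in the program.
    """
    x = Fraction(3, 2)                 # initial guess, above sqrt(2)
    err = Fraction(1, 2 ** n_bits)     # target error bound
    while abs(x * x - 2) > err * err:  # |x - sqrt(2)| <= |x^2 - 2| / 2 when x > 1
        x = (x + 2 / x) / 2            # Newton iteration for x^2 = 2
    return x

# sqrt2_to_precision(1000) returns a fraction within 2**-1000 of sqrt(2).
```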
What you say is true as far as it goes; if it were only a case of getting precision in observed physical quantities proportional to the precision of our measurements, I wouldn't be so bothered about it.
But things like the Born probabilities on electron spin measurements, or the gamma ray observations, suggest that a small number of bits being given to us as measurements are backed by much higher precision calculations behind the scenes, with no obvious way to put an upper bound on that precision.
In particular, we had for a long time been thinking of the Planck scale as the likely upper bound, but that now appears to be broken, and I'm not aware of any other natural/likely bound short of infinity.
An alternative interpretation of the halting oracle hypothesis that does involve a manifest infinity: There is a program of infinite length (and infinite information content) that solves the halting problem. It contains all the digits of Chaitin's constant.
Yeah. I'm actually prepared to bite the bullet on that version of it, and say Solomonoff induction is correct in dismissing infinitely long programs as infinitely improbable. What bothers me is the version of it that gets the same results with a finite program plus infinite computing power, together with the gamma ray observations that suggest our universe may indeed be using infinite computing power.
Aliens who have solved chess can convince us of that fact without showing us the solution. It would be somewhat surprising if aliens who have a halting oracle could not.
Interesting! Looking quickly over the argument, what it seems to be saying is this (someone please correct me if I'm misunderstanding it):
We already know that anyone with a solution to an NP problem can convince us of this in P time. (This is obvious for some NP problems e.g. propositional satisfiability, but surprising for others e.g. traveling salesman.)
Surprisingly, a variant of this extends to all PSPACE problems, if the verification procedure is allowed to be an interactive dialogue and if we are satisfied with an exponentially small probability of error. This applies even to problems like chess, via the application of nontrivial (but still polynomial time) transformations.
But it's not jumping out at me how to extend something like this to incomputable problems. Do you have an approach in mind?
... but surprising for others e.g. traveling salesman.
A "yes" answer to the decision version of TSP ("is there a path shorter than x?") is straightforwardly demonstrable: just specify the path. A "no" answer is not demonstrable (short of interactive provers): TSP is in NP, not coNP. And an answer to the optimization version of TSP ("find the shortest path") is also not demonstrable (short of interactive provers): it's FP^NP-complete, which is stronger than NP. So I don't see any surprises there.
Do you have an approach in mind?
Alas, no. Perhaps it is impossible for someone who has a halting oracle to convince someone without a halting oracle that they have a halting oracle, even in universes whose physics allows halting oracles.
Aliens who have solved chess can convince us of that fact without showing us the solution. It would be somewhat surprising if aliens who have a halting oracle could not.
Damn, the argument is in a postscript file. That would require downloading something. I wonder if I care that much!
I don't think the situation is comparable.
Recall that in the chess case the setup is this: the verifier is given a string of n random bits, asks P(n) interactive questions of the alien, and then says yes or no. If White can force a win in chess, the verifier says yes at least 2/3 of the time; if White cannot force a win, the verifier says yes at most 1/3 of the time.
It's a deep result that you can set up a verification protocol like this for any problem in PSPACE. But it's easy to see that it only works if the problem is in PSPACE: using P(n) space, the alien can simulate all possible question-answer interactions with the verifier, and give the answers that will make the verifier answer yes if there are any at all. So you could never use a protocol like this to prove that you possess a halting oracle, since even without an oracle you can do a finite amount of computation to figure out what answers the verifier will accept.
I would guess this principle holds in general: no matter which particular notion of "proof" or "convincing" you settle on, it will ultimately involve running a finite computation to verify a purported proof. But such a procedure can always be fooled by another (longer) finite computation.
(Edit: corrected error in the description of the protocol)
I find this suspect. If someone who was only good enough at chess to make moves at random played against a chess master, the latter would win so often that if they play for any reasonable amount of time, the former could never tell if the latter is capable of losing.
Who says you have to test the chessmaster only on board positions you can reach by playing from the canonical opening position?
You should be able to ask the supposed chess-solver about whether and how to win from arbitrarily-chosen board positions.
He doesn't have to know how to do that. Any information he has about positions he'd never get into can be wrong and he'd still be unbeatable. This includes knowing whether or not a given position is one he can get into. The only way to reach a contradiction is to show both that he can lose from a given position and that he can reach that position from the starting position.
You could try working backward, checking every position that might lead to this one and seeing whether he would move into it, but there might be no way to reach it at all, or many ways. You'd have to follow the entire tree back to prove that it doesn't connect to the starting position. Worse, he isn't even guaranteed to play deterministically, and just because he didn't move into that position this time doesn't mean he can't.
Suppose we assume a "level 4 multiverse". It may contain some completely finite universes, some countably infinite universes, and some uncountably infinite universes. Won't the universes whose complexity exists at the highest order of infinity dominate the counting? Combinatorially, the number of such universes should be greater than the number of universes of lesser cardinalities. So one might see this as a reason to expect that continua (or even "super-continua", such as feature in the theory of surreal numbers) should play a role in physics.
I don't think so. Think of it this way, even when we consider only computable universes, we distinguish between the size of the description and the size of the output. If we counted across outputs, we would expect the observed universe to be a random bit stream, therefore we would expect to be Boltzmann brains, therefore we would expect to observe only the minimum order necessary for observation at all. As we actually observe much more order than this, we believe the right way to do it is to count across descriptions with a prior biased towards shorter ones. And even uncountably infinite universes can have finite descriptions.
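In symbols (the standard universal prior; not claiming this is the uniquely right weighting): counting across descriptions means weighting an output $x$ by

$$ m(x) \;=\; \sum_{p\,:\,U(p)=x} 2^{-|p|}, $$

whereas counting across outputs means giving every $n$-bit output the same weight $2^{-n}$, under which almost all of the probability mass sits on incompressible noise.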
I don't understand your response; but all I'm saying is that discrete universes are going to be measure zero in a multiverse which includes continua.
See my reply to Why no uniform weightings for ensemble universes? for a longer explanation of why not.
If we counted across outputs, we would expect the observed universe to be a random bit stream
Why should I expect that? Who decides what is designated as random?
In a nutshell, somebody calculated that if space is really granular at 1e-35 meters, that should actually affect the propagation of polarized gamma rays from a GRB 300 million light years distant, in a measurable way. Measurement found no such effect, apparently showing that the universe calculates space down to at least 1e-48 meters.
Uh huh. Or maybe space is granular at the 1-cm scale - and there's a large state-space per cell.
Or maybe it's being run on a one-dimensional cellular automaton or a Turing machine etc. What of it? Regardless, the system in question is simulating space with, apparently, a granularity of at least 1e-48 meters, and there remains the question of whether it is using finite or infinite computing power.
I see no reason why Solomonoff induction can't just use definitions, and it's perfectly possible to define a halting oracle in a finite space.
If we are stuck with algorithms, why would we be allowed to use NOR as a command, but not a halting oracle? There's nothing wrong with the idea of a universe where the former isn't possible, nor with a universe where the latter is.
That may indeed end up being the best solution to the problem. On the other hand, what formal language do we write such definitions in? SI already has the limitation that you have to supply a basis of computation, the choice of which weakly affects the result (additive constant on Kolmogorov complexity (in practice small), multiplicative constant on measure). Is there a way to use definitions which are not necessarily algorithms, so as to get a formally precise result without making this limitation much worse?
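For reference, the "weakly affects" part is the usual invariance theorem: for any two universal machines $U_1, U_2$ there is a constant $c$ (depending on the pair, not on $x$) with

$$ K_{U_1}(x) \;\le\; K_{U_2}(x) + c, \qquad M_{U_1}(x) \;\ge\; 2^{-c}\,M_{U_2}(x), $$

and the worry in the question is that allowing arbitrary non-algorithmic definitions might break even this weak form of machine-independence.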
It's an attempt to make the limitation only that bad. They will only be guaranteed to differ by a constant if both languages are equivalent. If you choose between a Turing machine and a Turing machine with a halting oracle, the former will at best beat the latter by a constant, but the latter can beat the former infinitely.
You obviously can't get it to work for all possible oracles, as there are uncountably many of them, but you can at least do it for all definable oracles, which is what this is.
It is worth noting that Solomonoff induction would do otherwise. SI is based on the assumption that the universe is computable; it assigns a halting oracle a prior probability (and therefore a posterior probability after any finite amount of evidence) of zero.
A box being a halting oracle isn't something you can determine or test. Solomonoff induction only assigns probability to things that can be observed, not untestable things.
Well, we can observe what answer it gives for the next case we run it on, and the next, and the next. So there is still the question of whether we expect, given that the box has passed every case we were able to test, that it will continue to give the right answer for future cases.
Right - and the answers Solomonoff induction would give for such questions look pretty reasonable to me.
Remaining forever certain that the box can't really be a halting oracle and its successes thus far have been essentially luck, no matter how many successes are accumulated? If so, you're the first human I've seen express that view. Or do you have a different interpretation of how to apply Solomonoff induction to this case?
For any finite subset of Turing machines, there exists a program that will act as a halting oracle on that subset. For example, the alien box might be a Giant Look-Up Table that has the right answer for every Turing Machine up to some really, really big number. (Would we be able to tell the difference between a true halting oracle and one that has an upper bound on the size of a Turing machine that it can analyze accurately?)
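Concretely, a toy sketch of the look-up-table idea (Python, with entirely made-up entries):

```python
# A perfectly ordinary finite program that answers "does it halt?" correctly
# for every machine description in its table, and simply fails beyond it.
HALTING_TABLE = {
    # machine description -> halts?   (entries here are invented placeholders)
    "loop_forever":          False,
    "print_and_stop":        True,
    # ... finitely many more rows, up to some really, really big bound ...
}

def fake_oracle(machine_description: str) -> bool:
    """Behaves exactly like a halting oracle on the finite domain it covers."""
    return HALTING_TABLE[machine_description]  # KeyError outside the table
```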
Luck?!? A system that can apparently quickly and reliably tell if TMs halt would not be relying on luck.
Most people (not all, but most) are reasonably comfortable with infinity as an ultimate (lack of) limit. For example, cosmological theories that suggest the universe is infinitely large and/or infinitely old, are not strongly disbelieved a priori.
By contrast, most people are fairly uncomfortable with manifest infinity, actual infinite quantities showing up in physical objects. For example, we tend to be skeptical of theories that would allow infinite amounts of matter, energy or computation in a finite volume of spacetime.
Consider the following thought experiment (I forget where I first heard it):
Aliens in a passing flying saucer offer to sell us a halting oracle. It's a black box of ordinary size and mass; galactic intellectual property law prohibits them from explaining how it works, and it's far beyond our ability to reverse engineer. Nonetheless, if you feed it the description of a Turing machine, it will calculate for a millisecond and then indicate whether that Turing machine halts or not. Obviously we're skeptical and exhaustive testing is impossible, but the device passes every test we can throw at it.
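To make "every test we can throw at it" concrete, the testing might look something like this sketch (Python; `query_box` is a hypothetical interface to the device, and the test cases are machines whose behaviour we can settle ourselves):

```python
def runs_forever():        # provably non-halting
    while True:
        pass

def halts_quickly():       # obviously halting: just run it
    return sum(range(10 ** 6))

KNOWN_CASES = [
    (runs_forever, False),
    (halts_quickly, True),
    # ... plus whatever harder cases our mathematicians manage to settle ...
]

def test_box(query_box):
    """True if the box agrees with us on every case we could check ourselves.

    Passing says nothing, by itself, about the infinitely many cases
    we could not settle, which is the whole point of the post.
    """
    return all(query_box(machine) == answer for machine, answer in KNOWN_CASES)
```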
In practice, willingness to pay for it might be based on a belief that it will probably work for every case we are going to be in a position to care about in the near future, but do we believe the sales pitch that it is a true halting oracle, i.e. a device that performs infinite computation in finite spacetime? Some people would give more than 50% credence to this proposition, and some people less, but almost everyone would give it a subjective probability greater than zero.
It is worth noting that Solomonoff induction would do otherwise. SI is based on the assumption that the universe is computable; it assigns a halting oracle a prior probability (and therefore a posterior probability after any finite amount of evidence) of zero. In other words, while human intuition is finitely skeptical of manifest infinity, SI is infinitely skeptical.
This has been used as a reductio ad absurdum of SI, but is that correct? If a halting oracle really is absolutely impossible or at least infinitely improbable across the Tegmark multiverse by a correct weighting, then SI is right and human intuition is wrong. If not, then vice versa. At this time, I don't know which is the case, or even whether there is a fact of the matter regarding which is the case.
In the absence of aliens offering unlikely bargains, this would appear to be of little concern, but consider a much more familiar object: the humble electron.
When we measure the spin of an electron, how much information can we get? One bit: up or down.
But how much computation is the universe doing behind the scenes? According to quantum mechanics, the spin of an electron is represented by a complex number, which if taken at face value would mean the universe is actually doing infinite computation to provide us with one bit. Nor is this entirely unobservable, because by repeated measurements we can verify that quantum mechanics seems to work; the probability distribution we get is that which would be given if the spin really were represented as a complex number.
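As a toy illustration of that last point (my sketch, simplified to the single real parameter that matters for a fixed measurement axis): each measurement yields one bit, but with enough identically prepared electrons the underlying continuous amplitude shows through statistically.

```python
import math
import random

def measure_spin(theta: float) -> int:
    """One measurement along the z-axis of a spin prepared at angle theta
    to that axis.  Born rule: P(up) = cos^2(theta / 2).  One bit per call."""
    return 1 if random.random() < math.cos(theta / 2) ** 2 else 0

def estimate_angle(theta: float, n: int) -> float:
    """Recover the hidden continuous parameter from n one-bit measurements."""
    p_hat = sum(measure_spin(theta) for _ in range(n)) / n
    return 2 * math.acos(math.sqrt(p_hat))

# With n = 10**6 preparations the estimate is good to roughly 1/sqrt(n),
# i.e. about three decimal places; the precision grows without any obvious
# bound as n does, which is the sense in which the continuous amplitude
# is "there" behind the single bit.
```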
On a larger scale, current theory strongly conjectures that the maximum information contained in a volume is given by the Bekenstein bound, one bit per Planck area give or take a small constant factor. Leaving aside the surprising holographic theory that gives a limit in terms of area rather than volume as we would intuitively expect, this sounds perfectly reasonable - except does "information contained in a volume" refer, like the one bit obtained from measuring the spin of an electron, only to the information we can extract? Or is it an indicator of some final limit to the computation the universe performs behind the scenes?
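For reference, the holographic form of the bound is usually written

$$ S \;\le\; \frac{k_B\,A}{4\,\ell_P^{\,2}}, \qquad \ell_P = \sqrt{\frac{\hbar G}{c^3}} \approx 1.6\times 10^{-35}\ \text{m}, $$

which works out to roughly one bit per $4\ln 2 \approx 2.8$ Planck areas of the bounding surface.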
Put another way, is space really granular at the Planck scale of 1e-35 meters? Or does the universe go ahead and implement infinitely subdivisible space, with the Planck limit only being on what use we can make of it? How skeptical should we be of manifest infinity?
For what it's worth, my intuitive preference is for the finite answer, to an even greater extent than with the halting oracle; notwithstanding that I know very well the universe is not constrained by my ideas of efficiency, it still strikes me as grossly inefficient to the point of inelegance for the universe to perform infinite computation of which only a finite fraction can be used even in principle.
Which was why I was distinctly disconcerted when I read this result: http://www.cosmosmagazine.com/node/4472.
(In a nutshell, somebody calculated that if space is really granular at 1e-35 meters, that should actually affect the propagation of polarized gamma rays from a GRB 300 million light years distant, in a measurable way. Measurement found no such effect, apparently showing that the universe calculates space down to at least 1e-48 meters.)
Must we, contrary to Solomonoff induction, accept the likelihood of manifest infinity after all? Or is there another interpretation of these results?