I feel much the same about this post as I did about Roko's Final Post. It's imaginative, it's original, it has an internal logic that manages to range from metaphysics to cosmology; it's good to have some crazy-bold big-picture thinking like this in the public domain; but it's still wrong, wrong, wrong. It's an artefact of its time rather than a glimpse of reality. The reason it's nonetheless interesting is that it's an attempt to grasp aspects of reality which are not yet understood in its time - and this is also why I can't prove it to be "wrong" in a deductive way. Instead, I can only oppose my postulates to the author's, and argue that mine make more sense.
First I want to give a historical example of human minds probing the implications of things new and unknown, which in a later time became familiar and known. The realization that the other planets were worlds like Earth, a realization we might date from Galileo forwards, opened the human imagination to the idea of other worlds in the sky. People began to ask themselves: what's on those other worlds, is there life, what's it like; what's the big picture, the logic of the situation. In the present day, when robot pro...
The philosophical implication is that actually running such an algorithm on an infinite Turing Machine would have the interesting side effect of actually creating all such universes.
That's an interesting point! At least, it's more interesting than Tipler's way of arriving at that conclusion.
If you accept that the reasonable assumption of progress holds, then AIXI implies that we almost certainly live in a simulation now.
See my response to the claim that the anthropic argument suggests it is highly improbable that you would find yourself to be a hum...
The set of simulation possibilities can be subdivided into PHS (posthuman historical), AHS (alien historical), and AFS (alien future) simulations (as posthuman future simulation is inconsistent).
What these categories meant was not clear to me on first reading.
I currently understand AFS as something like aliens finding an earlier humanity and trying to predict what we will do. AHS would be the result of aliens interacting with a more mature humanity and trying to deduce particulars about our origin, perhaps for use in an AFS.
If I have that right, PFS migh...
If you absolutely have to summarize the forbidden topic at least rot13 it and preface it with an appropriate warning.
I have a question. What does it mean for AIXI to be the optimal time bounded AI? If it's so great, why do people still bother with ANNs and SVNs and SOMs and KNNs and TLAs and T&As? My understanding of it is rather cloudy (as is my understanding of all but the last two of the above), so I'd appreciate clarification.
First of all, AIXI isn't actually "the optimal time bounded AI". What AIXI is "optimal" for is coming to correct conclusions when given the smallest amount of data, and by "optimal" it means "no other program does better than AIXI in at least one possible world without also doing worse in another".
Furthermore AIXI itself uses Solomonoff induction directly, and Solomonoff induction is uncomputable. (It can be approximated, though.)
AIXItl is the time-limited version of AIXI, but it amounts to "test all the programs that you can, find the best one, and use that" - and it's only "optimal" when compared against the programs that it can test, so it's not actually practical to use, either.
(At least, that's what I could gather from reading the PDF of the paper on AIXI. Could someone who knows what they're talking about correct any mistakes?)
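To make the "test all the programs that you can, find the best one, and use that" description concrete, here is a toy sketch in Python. It is not AIXItl itself: the "programs" are just repeating bit patterns (a made-up stand-in for real programs), the budget is a maximum pattern length, and the names candidate_programs, score and best_program are hypothetical.

```python
from itertools import product

def candidate_programs(max_len):
    """Enumerate every toy 'program' within the budget, shortest first.

    Assumption: a program is a bit pattern that is repeated forever.
    """
    for n in range(1, max_len + 1):
        for bits in product((0, 1), repeat=n):
            yield bits

def predict(program, t):
    """Bit that the program outputs at time step t."""
    return program[t % len(program)]

def score(program, history):
    """Number of past observations the program would have predicted correctly."""
    return sum(predict(program, t) == bit for t, bit in enumerate(history))

def best_program(history, max_len):
    """Brute force: test every candidate and keep the best scorer.

    max() keeps the first maximal item, and candidates are enumerated
    shortest first, so ties go to the shorter program.
    """
    return max(candidate_programs(max_len), key=lambda p: score(p, history))

history = [0, 1, 1, 0, 1, 1, 0, 1]
best = best_program(history, max_len=4)
next_bit = predict(best, len(history))
```

The search is only "optimal" relative to the candidates it can afford to test, and the number of candidates doubles with every extra bit of allowed length, which is the impracticality the comment points at.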
because the runtime complexity of AIXI is astronomically larger than the universe is.
'Astronomically'? That's the first time I've seen that superlative inadequate for the job.
Memetic hazard warning. Decode first part first.
ROT13: Ebxb ceboyrz. Qb abg ernq hayrff lbh nyernql xabj nobhg gur Ebxb ceboyrz. Vs lbh qb xabj nqqvgvbany qnatre vf rkgerzryl hayvxryl.
ROT13: Nal cbffvoyr nethzrag sbe jul gur qnatre vf erny vf nyfb fbzrguvat gung znxrf gur qnatre zber erny. Zbfg crbcyr nera'g va qnatre va gur svefg cynpr orpnhfr gur qnatre qbrfa'g znxr frafr gb gurz. Fhccylvat gurz jvgu nethzragf jbhyq or pbhagrecebqhpgvir orpnhfr rvgure gurl fgvyy qba'g haqrefgnaq naq lbh evfx gurz fcernqvat gur nethzragf (va gung ertneq lbh unir cebir...
You can't just say 'learning' as if all possible minds will learn the same things from the same input, and internalize the same values from it.
There is something you have to hardcode to get it to adopt any values at all
Yes, you have to hardcode 'something', but that doesn't exactly narrow down the field much. Brains have some emotional context circuitry for reinforcing some simple behaviors (primary drives, pain avoidance, etc), but in humans these are increasingly supplanted and to some extent overridden by learned beliefs in the cortex. Human values are thus highly malleable - socially programmable. So my comment was "this is one approach - hardcode very little, and have all the values acquired later during development".
Well, what is that limit?
It seems to me that an imaginary perfectly efficient algorithm would read, process, and output data as fast as the processor could shuffle the bits around,
Unfortunately, we need to be a little more specific than imaginary algorithms.
Computational complexity theory is the branch of computer science that deals with the computational costs of different algorithms, and specifically with the best possible solutions to a given problem.
Universal intelligence is such a problem. AIXI is an investigation into optimal universal intelligence in terms of the upper limits of intelligence (the most intelligent possible agent), but while interesting, it shows that the most intelligent agent is unusably slow.
Taking a different route, we know that a universal intelligence can never do better in any specific domain than the best known algorithm for that domain. For example, an AGI playing chess could do no better than just pausing its AGI algorithm (pausing its mind completely) and instead running the optimal chess algorithm (assuming that the AGI is running as a simulation on general hardware instead of faster special-purpose AGI hardware).
So there is probably an optimal unbiased learning algorithm, which is the core building block of a practical AGI. We don't know for sure what that algorithm is yet, but if you survey the field, there are several interesting results. The first thing you'll see is that we have a variety of hierarchical deep learning algorithms now that are all pretty good; some appear to be slightly better for certain domains, but there is not at the moment a clear universal winner. Also, the mammalian cortex uses something like this. More importantly, there is a lot of recent research, but no massive breakthroughs - the big improvements are coming from simple optimization and massive datasets, not fancier algorithms. This is not definite proof, but it looks like we are approaching some sort of bound for learning algorithms - at least at the lower levels.
There is not some huge space of possible improvements, that's just not how computer science works. When you discover quicksort and radix sort, you are done with serial sorting algorithms. And then you find the optimal parallel variants, and sorting is solved. There are no possible improvements past that point.
Computer science is not like Moore's Law at all. It's more like physics. There's only so much knowledge, and so many breakthroughs, and at this point a lot of it honestly is already solved.
So it's just pure naivety to think that AGI will lead to some radical recursive breakthrough in software. Poppycock. It's reasonably likely humans will have narrowed in on the optimal learning algorithms by the time AGI comes around. Further improvements will be small optimizations for particular hardware architectures - but that's really not much different at all than hardware design itself, and eventually you want to just burn the universal learning algorithms into the hardware (as the brain does).
Hardware is quite different, and there is a huge train of future improvements there. But AGI's impact there will be limited by computer speeds! Because you need regular computers running compilers and simulators to build new programs and new hardware. So AGI can speed Moore's Law up some, but not dramatically - an AGI that thought 1000x faster than a human would just spend 1000x longer waiting for its code to compile.
I am a software engineer, and I spend probably about 30-50% of my day waiting on computers (compiling, transferring, etc). And I only think at human speeds.
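The waiting argument can be put in numbers with Amdahl's law. This is a rough sketch that assumes the 30-50% waiting share quoted above carries over unchanged to an AGI and that only the thinking share speeds up; the function name overall_speedup is made up for illustration.

```python
def overall_speedup(wait_fraction, thinking_speedup):
    """Amdahl's law: only the thinking share of the work gets faster.

    wait_fraction    -- share of the day spent waiting on computers (assumed fixed)
    thinking_speedup -- how many times faster the thinker is than a human
    """
    thinking_fraction = 1.0 - wait_fraction
    return 1.0 / (wait_fraction + thinking_fraction / thinking_speedup)

# Waiting shares taken from the comment above; 1000x is the hypothetical AGI.
for wait in (0.3, 0.5):
    print(wait, round(overall_speedup(wait, 1000.0), 2))
```

Under those assumptions a 1000x faster thinker finishes the whole job only about 3.3x faster at 30% waiting and about 2x faster at 50% waiting, because the result can never exceed 1 / wait_fraction.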
AGIs will soon have a massive speed advantage, but ironically they will probably leverage that to become best selling authors, do theoretical physics and math, and non-engineering work in general where you don't need a lot of computation.
You know it's possible, and a superintelligence would figure it out, but how do you rule out a superintelligence figuring out twelve tricks like that, each of which provides a 1000x speedup, in its first calendar month?
Say you had an AGI that thought 10x faster. It would read and quickly learn everything about its own AGI design, software, etc etc. It would get a good idea of how much optimization slack there was in its design and come up with a bunch of ideas. It could even write the code really fast. But unfortunately it would still have to compile it and test it (adding extra complexity in that this is its brain we are talking about).
Anyway, it would only be able to get small gains from optimizing its software - unless you assume the human programmers were idiots. Maybe a 2x speed gain or something - we are just throwing numbers out, but we have extensive experience with real-time software on fixed hardware in, say, the video game industry (and other industries), and this asymptotic wall is real, and complexity theory is solid.
Big gains necessarily must come from hardware improvements. This is just how software works - we find optimal algorithms and use them, and further improvement without increasing the hardware hits an asymptotic wall. You spend a few years and you get something 3x better, spend 100 more and you get another 50%, and spend 1000 more and get another 30% and so on.
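As a worked version of those throwaway numbers, the sketch below multiplies the gains together. It reads "3x better" as a factor of 3.0, "another 50%" as 1.5 and "another 30%" as 1.3, which is an interpretation rather than anything measured; the stage labels and the function cumulative_gains are illustrative only.

```python
# Illustrative stages taken from the paragraph above (not measured data).
stages = [
    ("a few years", 3.0),
    ("100 more years", 1.5),
    ("1000 more years", 1.3),
]

def cumulative_gains(stages):
    """Return the running total improvement after each stage."""
    total = 1.0
    results = []
    for label, factor in stages:
        total *= factor
        results.append((label, total))
    return results

for label, total in cumulative_gains(stages):
    print(f"after {label}: {total:.2f}x overall")
```

On that reading, more than a thousand years of extra effort less than doubles what the first few years delivered (3x to roughly 5.85x), which is the shape of the asymptotic wall being described.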
EDIT: After saying all this, I do want to reiterate that I think there could be a quick (even FOOMish) transition from the first AGIs to AGIs that are 100-1000x or so faster thinking, but the constraint on progress will quickly be the speed of regular computers running all the software you need to do anything in the modern era. Specialized software already does much of the heavy lifting in engineering, and will do even more of it by the time AGI arrives.
So my comment was "this is one approach - hardcode very little, and have all the values acquired later during development".
Hardcode very little?
What is the information content of what an infant feels when it is fed after being hungry?
I'm not trying to narrow the field, the field is always narrowed to whatever learning system an agent actually uses. In humans, the system that learns new values is not generic.
Using a 'generic' value learning system will give you an entity that learns morality in an alien way. I cannot begin to guess what it would...
Implications of the Theory of Universal Intelligence
If you hold the AIXI theory for universal intelligence to be correct; that it is a useful model for general intelligence at the quantitative limits, then you should take the Simulation Argument seriously.
AIXI shows us the structure of universal intelligence as computation approaches infinity. Imagine that we had an infinite or near-infinite Turing Machine. There then exists a relatively simple 'brute force' optimal algorithm for universal intelligence.
Armed with such massive computation, we could just take all of our current observational data and then use a particular weighted search through the subspace of all possible programs that correctly predict this sequence (in this case all the data we have accumulated to date about our small observable slice of the universe). AIXI in raw form is not computable (because of the halting problem), but the slightly modified time limited version is, and this is still universal and optimal.
The philosophical implication is that actually running such an algorithm on an infinite Turing Machine would have the interesting side effect of actually creating all such universes.
AIXI’s mechanics, based on Solomonoff Induction, bias against complex programs with an exponential falloff (2^-l(p), where l(p) is the length of program p), a mechanism similar to the principle of Occam’s Razor. The bias against longer (and thus more complex) programs lends strong support to the goal of String Theorists, who are attempting to find a simple, shorter program that can unify all current physical theories into a single compact description of our universe. We must note that to date, efforts towards this admirable (and well-justified) goal have not borne fruit. We may actually find that the simplest algorithm that explains our universe is more ad-hoc and complex than we would desire it to be. But leaving that aside, imagine that there is some relatively simple program that concisely explains our universe.
If we look at the history of the universe to date, from the Big Bang to our current moment in time, there appears to be a clear local telic evolutionary arrow towards greater X, where X is sometimes described as or associated with: extropy, complexity, life, intelligence, computation, etc. It's also fairly clear that X (however quantified) is an exponential function of time. Moore’s Law is a specific example of this greater pattern.
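The exponential falloff can be illustrated with a toy sketch in Python. It is not Solomonoff induction proper, which is defined over prefix-free programs on a universal machine and is uncomputable; here a "program" is assumed to be a bit pattern repeated forever, and the names prior_weight, consistent and posterior are hypothetical.

```python
from fractions import Fraction
from itertools import product

def prior_weight(program):
    """Weight 2^-l(p), where l(p) is the program length in bits."""
    return Fraction(1, 2 ** len(program))

def consistent(program, observations):
    """Toy semantics (an assumption): the program repeats its bits forever."""
    return all(program[t % len(program)] == bit
               for t, bit in enumerate(observations))

def posterior(observations, max_len):
    """Keep the programs that reproduce the data, then renormalise their weights."""
    weights = {}
    for n in range(1, max_len + 1):
        for program in product((0, 1), repeat=n):
            if consistent(program, observations):
                weights[program] = prior_weight(program)
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

post = posterior([0, 1, 0, 1], max_len=4)
for program, weight in post.items():
    print(program, weight)
```

Both (0, 1) and (0, 1, 0, 1) reproduce the observations exactly, but the two-bit program ends up with four times the weight of the four-bit one, which is the Occam-style bias in miniature.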
This leads to a reasonable inductive assumption, let us call it the reasonable assumption of progress: local extropy will continue to increase exponentially for the foreseeable future, and thus so will intelligence and computation (both physical computational resources and algorithmic efficiency). The reasonable assumption of progress appears to be a universal trend, a fundamental emergent property of our physics.
Simulations
If you accept that the reasonable assumption of progress holds, then AIXI implies that we almost certainly live in a simulation now.
As our future descendants expand in computational resources and intelligence, they will approach the limits of universal intelligence. AIXI says that any such powerful universal intelligence, no matter what its goals or motivations, will create many simulations which effectively are pocket universes.
The AIXI model proposes that simulation is the core of intelligence (with human-like thoughts being simply one approximate algorithm), and as you approach the universal limits, the simulations which universal intelligences necessarily employ will approach the fidelity of real universes - complete with all the entailed trappings such as conscious simulated entities.
The reasonable assumption of progress modifies our big-picture view of cosmology and the predicted history and future of the universe. A compact physical theory of our universe (or multiverse), when run forward on a sufficient Universal Turing Machine, will lead not to one single universe/multiverse, but to an entire ensemble of such multiverses embedded within each other in something like a hierarchy of Matryoshka dolls.
The number of possible levels of embedding and the branching factor at each step can be derived from physics itself, and although such derivations are preliminary and necessarily involve some significant unknowns (mainly related to the final physical limits of computation), suffice to say that we have sufficient evidence to believe that the branching factor is absolutely massive, and many levels of simulation embedding are possible.
Some seem to have an intrinsic bias against the idea based solely on its strangeness.
Another common mistake stems from the anthropomorphic bias: people tend to imagine the simulators as future versions of themselves.
The space of potential future minds is vast, and it is a failure of imagination on our part to assume that our descendants will be similar to us in details, especially when we have specific reasons to conclude that they will be vastly more complex.
Asking whether future intelligences will run simulations for entertainment or other purposes is not the right question, not even the right mode of thought. They may, they may not; it is difficult to predict future goal systems. But those aren’t important questions anyway, as all universal intelligences will ‘run’ simulations, simply because that precisely is the core nature of intelligence itself. As intelligence expands exponentially into the future, the simulations expand in quantity and fidelity.
The Ensemble of Multiverses
Some critics of the SA rationalize their way out by advancing a position of ignorance concerning the set of possible external universes our simulation may be embedded within. The reasoning then concludes that since this set is essentially unknown, infinite and uniformly distributed, the SA as such tells us nothing. These assumptions do not hold water.
Imagine our physical universe, and its minimal program encoding, as a point in a higher multi-dimensional space. The entire aim of physics in a sense is related to AIXI itself: through physics we are searching for the simplest program that can consistently explain our observable universe. As noted earlier, the SA then falls out naturally, because it appears that any universe of our type when run forward necessarily leads to a vast fractal hierarchy of embedded simulated universes.
At the apex is the base level of reality and all the other simulated universes below it correspond to slightly different points in the space of all potential universes - as they are all slight approximations of the original. But would other points in the space of universe-generating programs also generate observed universes like our own?
We know that the fundamental constants in the current physics are apparently well-tuned for life, thus our physics is a lone point in the topological space supporting complex life: even just tiny displacements in any direction result in lifeless universes. The topological space around our physics is thus sparse for life/complexity/extropy. There may be other topological hotspots, and if you go far enough in some direction you will necessarily find other universes in Tegmark’s Ultimate Ensemble that support life. However, AIXI tells us that intelligences in those universes will simulate universes similar to their own, and thus nothing like our universe.
On the other hand we can expect our universe to be slightly different from its parent due to the constraints of simulation, and we may even eventually be able to discover evidence of the approximation itself. There are some tentative hints from the long-standing failure to find a GUT of physics, and perhaps in the future we may find our universe is an ad-hoc approximation of a simpler (but more computationally expensive) GUT theory in the parent universe.
Alien Dreams
Our Milky Way galaxy is vast and old, consisting of hundreds of billions of stars, some of which are more than 13 billion years old, nearly three times as old as our sun. We have direct evidence of technological civilization developing in 4 billion years from simple protozoans, but it is difficult to generalize past this single example. However, we do now have mounting evidence that planets are common, the biological precursors to life are probably common, simple life may even have had a historical presence on Mars, and all signs are mounting to support the principle of mediocrity: that our solar system is not a precious gem, but is in fact a typical random sample.
If the evidence for the mediocrity principle continues to mount, it provides further strong support for the Simulation Argument. If we are not the first technological civilization to have arisen, then technological civilization arose and achieved Singularity long ago, and we are thus astronomically more likely to be in an alien rather than posthuman simulation.
What does this change?
The set of simulation possibilities can be subdivided into PHS (posthuman historical), AHS (alien historical), and AFS (alien future) simulations (as posthuman future simulation is inconsistent). If we discover that we are unlikely to be the first technological Singularity, we should assume AHS and AFS dominate. For reasons beyond this scope, I imagine that the AFS set will outnumber the AHS set.
Historical simulations would aim for historical fidelity, but future simulations would aim for fidelity to a 'what-if' scenario, considering some hypothetical action the alien simulating civilization could take. In this scenario, the first civilization to reach technological Singularity in the galaxy would spread out, gather knowledge about the entire galaxy, and create a massive number of simulations. It would use these in the same way that all universal intelligences do: to consider the future implications of potential actions.
What kinds of actions?
The first-born civilization would presumably encounter many planets that already harbor life in various stages, along with planets that could potentially harbor life. It would use forward simulations to predict the final outcome of future civilizations developing on these worlds. It would then rate them according to some ethical/utilitarian theory (we don't even need to speculate on the criteria), and it would consider and evaluate potential interventions to change the future historical trajectory of that world: removing undesirable future civilizations, pushing other worlds towards desirable future outcomes, and so on.
At the moment it's hard to assign a priori weights to future vs historical simulation possibilities, but the apparent age of the galaxy compared to the relative youth of our sun is a tentative hint that we live in a future simulation, and thus that our history has potentially been altered.