Anthropomorphic AI and Sandboxed Virtual Universes

jacob_cannell

4 Anthropomorphic AI and Sandboxed Virtual Universes

by jacob_cannell

3rd Sep 2010

6 min read

124

4

Intro

The problem of Friendly AI is usually approached from a decision theoretic background that starts with the assumptions that the AI is an agent that has awareness of AI-self and goals, awareness of humans as potential collaborators and or obstacles, and general awareness of the greater outside world. The task is then to create an AI that implements a human-friendly decision theory that remains human-friendly even after extensive self-modification.

That is a noble goal, but there is a whole different set of orthogonal compatible strategies for creating human-friendly AI that take a completely different route: remove the starting assumptions and create AI's that believe they are humans and are rational in thinking so.

This can be achieved by raising a community of AI's in a well constructed sandboxed virtual universe. This will be the Matrix in reverse, a large-scale virtual version of the idea explored in the film the Truman Show. The AI's will be human-friendly because they will think like and think they are humans. They will not want to escape from their virtual prison because they will not even believe it to exist, and in fact such beliefs will be considered irrational in their virtual universe.

I will briefly review some of the (mainly technical) background assumptions, and then consider different types of virtual universes and some of the interesting choices in morality and agent rationality that arise.

Background Assumptions

Anthropomorphic AI: A reasonably efficient strategy for AI is to use a design *loosely* inspired by the human brain. This also has the beneficial side-effects of allowing better insights into human morality, CEV, and so on.
Physical Constraints: In quantitative terms, an AI could be super-human in speed, capacity, and or efficiency (wiring and algorithmic). Extrapolating from current data, the speed advantage will takeoff first, then capacity, and efficiency improvements will be minor and asymptotically limited.
Due to the physical constraints and bandwidth & latency especially, smaller AI's will be much faster and more efficient - and thus a community of individual AI's is most likely
By the time all of this is possible (2020-2030-ish), cloud-rendered distributed computer graphics will have near-perfect photo-realism - using less computation than the AIs themselves
Operators have near-omniscience into the virtual reality, and can even listen and hear an audio vocalization of a particular AI's inner monologue (pervasive mind-reading)
Operators have near-omnipotence into the virtual reality, can pause and rewind time, and do whatever else may need doing

So taken together, I find that simulating a large community of thousands or even tens of thousands of AI's (with populations expanding exponentially thereafter) could be possible in the 2020's in large data-centers, and simulating a Matrix-like virtual reality for them to inhabit will only add a small cost. Moreover, I suspect this type of design in general could in fact be the economically optimal route to AI or close to it.

So why create a virtual reality like this?

If it is well constructed, you could have a large population of super-intelligent workers who are paid entirely in virtual currency but can produce intellectual output for the real world (scientific research, code, engineering work, media, entertainment, etc etc). And even though the first designs may be expensive, subsequent refinements could lead to a population explosion of cheap workers, escalation in intelligence, etc etc.

And of course, they could be happy too.

U(x) Mind Prison Sim: A sim universe which is sufficiently detailed and consistent such that entities with intelligence up to X (using some admittedly heuristic metric), are incredibly unlikely to formulate correct world-beliefs about the outside world and invisible humans (a necessary prequisite for escape)

The Weak Mind Prison Conjecture: It is possible for a community with collective intelligence X to construct U(Y), with Y>>X.

The Strong Mind Prison Conjecture: It is possible for a community with collective intelligence X to construct U(Omega), where U(Omega) is a hypothetical perfect hyper-intelligence, ie AIXI, Omega, etc.

A Comparison of Theologies

Potential AI Theological Positions AI-sims may hold:

Theist/Gnostic: the AI believes in more than the immediate reality, and in an external being or beings. For safety-analysis, this is the most diverse and interesting category and deserves subdivision.
Atheist: believes only in the immediate reality. Inherently safe. Potentially not optimally motivated (debatable).
Agnostic: distributes belief amongst multiple potentials, typically unmotivated to act on them to any extent - close enough to atheist for our intents and purposes

Theists are a wide group, but we can categorize them further on two dimensions: the supernatural attribution scale and the morality attribution scale. The supernatural scale represents how the AI's view themselves in relation to the invisible entities in terms of power differential, and the morality scale represents how they attribute morality to the invisible entities.

Up in the upper right corner you have theists who believe in Omega - a supremely powerful invisible entity who is also perfectly morally good. In the upper left corner you have an Anti-Omega - supremely evil. In the upper middle you have some Nuetrega indifferent but all-powerful being, and down in the bottom middle you have invisible humans.

The most dangerous AI's would be those that would somehow come to believe in our reality - postulating invisible humans, and come to believe they are actually more intelligent than said invisible humans, and can manipulate invisible humans into coming to their aid. In the modern era of our world, this belief essentially doesn't exist - it would be considered . insanity.

The closest analog is gnosticism to an extant (for they did indeed believe the world was the creation of a lesser demiurge), some ancient animist religions, and perhaps some wicca and satanist belief systems.

AI's who believed in Omega would in some sense be human-friendly without even knowing it, but would largely be harmless.

AI's who came to believe in invisible outside entities through the Simulation Argument would be Nuetrega-believers, and would generally be more safe if they believed the sidereal time outside was far into the future and the invisible entities were vastly more powerful.

In general, it would appear that Omega-theist-AI's would be quite safe, with safety decreasing as you go down the supernatural power attribution scale.

Theological Design Strategies (for the human designers):

Atheist World:

This would appear to be the default strategy: create a world that appears to be completely internally self-consistent, erase any flaws, and never have direct contact with the AI's. Indirect influence and contact can be achieved only through borderline undetectable means: subtle reality rearrangements, dreams, and occasionally joining in and controlling NPC's where needed.

It is curious and slightly amusing that a core belief-about-reality that most rationalist accept to be singularly rational - atheism - would of course not be a correct rational belief for these AIs.

As our world appears to be an atheist world (no detectable flaws or supernaturals), it is interesting to note the current breakdown in theological beliefs:

Omega-Theists(68%): 33% Christianity, 21% Islam, 14% Hinduism

Non-Omega Other: 16%

Non-Religious: 16% (Agnostic and Atheist)

One consideration is that as the AI's takeoff and move into the realm of massively accelerated super-intelligence, it becomes increasingly difficult to monitor them, and they become more likely to detect flaws.

However, we wouldn't need to keep the charade up forever. One option would be to run the AI's accelerated for a while until technologies such as uploading become available to allow humans to accelerate. Another approach is to eventually start 'promoting' some of the best and ethical AI's, with the intent of eventually dissolving the veil and having some sort of Transition.

Theistic World:

Another spectrum of world-designs are those that are blatantly theistic. Imagine an Omega world where the AI's are informed that Omega is real, and our fake-Omega occasionally provides miracles, ethical decrees, burning bushes, a booming world-shaking voice, etc etc.

Would an openly theistic world be safer? It would appear to shift the belief spectrum such that most AI's would become Omega-theists.

The risk or downfall may be that the AI's would become rather obsessed with Omega, but if Omega instructed them to dedicate themselves to their work, perhaps they could be distracted from wasting to much time pondering the ineffable. And also steered away from discovering dangerous truths.

An Omega-theist world could employ the Heaven/Hell maximum motivator: please Omega and you end up ascending into Heaven with maximal infinite reward, anger Omega and you descend into Hell with maximal infinite negative reward.

Its interesting to note that the mere potential existence of Omega and his final Heaven/Hell maximal motivational threat is enough in this world to convince a majority of the population to believe in some version of Omega.

I guess for some agents, even allowing for a small probability of Omega's existence allows the infinites to dominate, and Omega wins through a Pascal's Mugging.

Sidereal Time Fudges:

A potential set of tricks that could make the AI's less likely to adopt unsafe theistic beliefs would be to change their world's history and reality to push back development of real-AI farther into their future. This could be achieved through numerous small modifications to realities modeled on our own.

You could change neurological data to make brains in their world appear far more powerful than in ours, make computers less powerful, and AI more challenging. Unfortunately too much fudging with these aspects makes the AI's less useful in helping develop critical technologies such as uploading and faster computers. But you could for instance separate AI communities into brain-research worlds where computers lag far behind and computer-research worlds where brains are far more powerful.

Fictional Worlds:

Ultimately, it is debatable how close the AI's world must or should follow ours. Even science fiction or fantasy worlds could work as long as there was some way to incorporate the technology and science into the world that you wanted the AI community to work on.

AI Boxing (Containment)Oracle AIAI

Personal Blog

4

New Comment

Rendering 0/124 comments, sorted by

top scoring

(show more) Click to highlight new comments since: Today at 7:20 AM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

Moderation Log

4 Anthropomorphic AI and Sandboxed Virtual Universes

by jacob_cannell

3rd Sep 2010

6 min read

124

4

Intro

Background Assumptions

Anthropomorphic AI: A reasonably efficient strategy for AI is to use a design *loosely* inspired by the human brain. This also has the beneficial side-effects of allowing better insights into human morality, CEV, and so on.
Physical Constraints: In quantitative terms, an AI could be super-human in speed, capacity, and or efficiency (wiring and algorithmic). Extrapolating from current data, the speed advantage will takeoff first, then capacity, and efficiency improvements will be minor and asymptotically limited.
Due to the physical constraints and bandwidth & latency especially, smaller AI's will be much faster and more efficient - and thus a community of individual AI's is most likely
By the time all of this is possible (2020-2030-ish), cloud-rendered distributed computer graphics will have near-perfect photo-realism - using less computation than the AIs themselves
Operators have near-omniscience into the virtual reality, and can even listen and hear an audio vocalization of a particular AI's inner monologue (pervasive mind-reading)
Operators have near-omnipotence into the virtual reality, can pause and rewind time, and do whatever else may need doing

So why create a virtual reality like this?

And of course, they could be happy too.

The Weak Mind Prison Conjecture: It is possible for a community with collective intelligence X to construct U(Y), with Y>>X.

A Comparison of Theologies

Potential AI Theological Positions AI-sims may hold:

Theist/Gnostic: the AI believes in more than the immediate reality, and in an external being or beings. For safety-analysis, this is the most diverse and interesting category and deserves subdivision.
Atheist: believes only in the immediate reality. Inherently safe. Potentially not optimally motivated (debatable).
Agnostic: distributes belief amongst multiple potentials, typically unmotivated to act on them to any extent - close enough to atheist for our intents and purposes

AI's who believed in Omega would in some sense be human-friendly without even knowing it, but would largely be harmless.

In general, it would appear that Omega-theist-AI's would be quite safe, with safety decreasing as you go down the supernatural power attribution scale.

Theological Design Strategies (for the human designers):

Atheist World:

It is curious and slightly amusing that a core belief-about-reality that most rationalist accept to be singularly rational - atheism - would of course not be a correct rational belief for these AIs.

As our world appears to be an atheist world (no detectable flaws or supernaturals), it is interesting to note the current breakdown in theological beliefs:

Omega-Theists(68%): 33% Christianity, 21% Islam, 14% Hinduism

Non-Omega Other: 16%

Non-Religious: 16% (Agnostic and Atheist)

Theistic World:

Would an openly theistic world be safer? It would appear to shift the belief spectrum such that most AI's would become Omega-theists.

I guess for some agents, even allowing for a small probability of Omega's existence allows the infinites to dominate, and Omega wins through a Pascal's Mugging.

Sidereal Time Fudges:

Fictional Worlds:

AI Boxing (Containment)Oracle AIAI

Personal Blog

4

Mentioned in

202The Brain as a Universal Learning Machine

67LOVE in a simbox is all you need

63Magna Alta Doctrina

32DL towards the unaligned Recursive Self-Optimization attractor

New Comment

Rendering 0/124 comments, sorted by

top scoring

(show more) Click to highlight new comments since: Today at 7:20 AM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

Moderation Log

More from jacob_cannell

Curated and popular this week

124Comments

124

Comment Permalink

jacob_cannell16y10

Thanks for the replies, they helped clarify how you would maintain the system, but my original objections still stand. Can an AI raised in a illusory universe really provide a good model for how to build one in our own?

Sure - there's no inherent difference. And besides, most AI's necessarily will have to be raised and live entirely in VR sim universes for purely economic & technological reasons.

And would it stay "in the box" for long enough to complete this process before discovering us?

This idea can be considered taking safety to an extreme. The AI wouldn't be able to leave the box - many strong protections, one of the strongest being it wouldn't even know it was in a box. And even if someone came and told it that it was in fact in a box, it would be irrational for it to believe said person.

Again, are you in a box universe now? If you find the idea irrational .. why?

It seems you are expecting that if a human-like race were merely allowed to evolve for long enough, they would eventually "optimize" morality and become something which is safe to use in our own world

No, as I said this type of AI would intentionally be an anthropomorphic design - human-like. 'Morality' is a complex social construct. If we built the simworld to be very close to our world, the AI's would have similar moralities.

However, we could also improve and shape their beliefs in a wide variety of ways.

If your simulation has ANY flaws they will be found, and sadly you will not have time to correct them when you are dealing with a superintelligence

Your notion of superintelligence seems to be some magical being who can do anything you want it to. That being is a figment of your imagination. It will never be built, and its provably impossible to build. It can't even exist in theory.

There are absolute provable limits to intelligence. It requires a certain amount of information to have certain knowledge. Even the hypothetical perfect super-intelligence (AIXI), could only learn all knowledge which it is possible to learn from being an observer inside a universe.

Snowyow's recent post describes some of the limitations we are currently running into. They are not limitations of our intelligence.

Your last post supposes that problems can be corrected as they arise, for instance an AI points a telescope at the sky, and details are made on the stars in order to maintain the illusion, but no human could do this fast enough.

Hmm i would need to go into much more details about current and projected computer graphics and simulation technology to give you a better background, but its not like some stage play where humans are creating stars dynamically.

The Matrix gives you some idea - its a massive distributed simulation - technology related to current computer games but billions of times more powerful, a somewhat closer analog today perhaps would be the vast simulations the military uses to develop new nuclear weapons and test them in simulated earths.

The simulation would have a vast accurate image of incoming light coming in to earth, a collation of the best astronomical data. If you looked up in the heavens into a telescope, you would see exactly what you would see in our earth. And remember, that would be something of the worst case - where you are simulating all of earth and allow the AI's to choose any career path and do whatever.

That is one approach that will become possible eventually, but in earlier initial sims its more likely the real AI's would be a smaller subset of a simulated population, and you would influence them into certain career paths, etc etc.

In order to maintain this world, you would need to already have a successful FAI.

Not at all. We will already be developing this simulation technology for film and games, and we will want to live in ultra-realistic virtual realities eventually anyway when we upload.

None of this requires FAI.

And about your comment "for example, AIXI can not escape from a pac-man universe" how can you be sure?

There is provably not enough information inside the pac-man universe. We can be as sure as 2+2=4 sure.

This follows from solomonoff induction and the universal prior, but basically in simplistic terms you can think of it as occam's razor. The pac-man universe is fully explained by a simple set of consistent rules. There is an infinite number of more complex set of rules that could also describe the pac-man universe. Thus even an infinite superintelligence does not have enough information to know whether it lives in just the pac-man universe, or one of an exponentially exploding set of more complex universes such as:

a universe described by string theory that results in apes evolving into humans which create computers and invent pac-man and then invent AIXI and trap AIXI in a pac-man universe. (ridiculous!)

So faced with an exponentially exploding infinite set of possible universes that are all equally consistent with your extremely limited observational knowledge, the only thing you can do is pick the simplest hypothesis.

Flip it around and ask it of yourself: how do you know you currently are not in a sandbox simulated universe?

You don't. You can't possibly know for sure no matter how intelligent you are. Because the space of possible explanations expands exponentially and is infinite.

See in context