New Pascal's Mugging idea for potential solution
I'll keep this quick:
In general, the problem presented by the Mugging is this: as we examine the utility of a given act in each possible world we could be in, in order from most probable to least probable, the utilities can grow much faster than the probabilities shrink. Thus the standard maxim "Maximize expected utility" seems impossible to carry out, since there is no such maximum: when we go down the list of hypotheses, multiplying the utility of the act on each hypothesis by the probability of that hypothesis, the running sum does not converge to anything.
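To make the non-convergence concrete, here is a minimal sketch with made-up probability and utility functions (chosen only to illustrate the shape of the problem, not to model any real prior): when the utility on the n-th hypothesis grows faster than its probability shrinks, the partial sums of expected utility grow without bound.

```python
def probability(n):
    # Toy prior: the n-th hypothesis has probability 2^-n
    # (illustrative only; real priors would be over programs).
    return 2.0 ** -n

def utility(n):
    # Toy utility that grows much faster than the prior shrinks.
    return 4.0 ** n

def partial_expected_utility(k):
    # Sum of p(n) * u(n) over the first k hypotheses.
    return sum(probability(n) * utility(n) for n in range(1, k + 1))

# Each hypothesis contributes 2^n, so the partial sums are
# 2 + 4 + ... + 2^k = 2^(k+1) - 2, which diverges as k grows.
```

Any attempt to "finish the sum" fails: there is always a further hypothesis whose contribution dwarfs everything before it, which is exactly the Mugger's leverage.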
Here's an idea that may fix this:
For every possible world W of complexity N, there's another possible world of complexity N+c that's just like W, except that it has two parallel, identical universes instead of just one. (If it matters, suppose that they are connected by an extra dimension.) (If this isn't obvious, say so and I can explain.)
Moreover, there's another possible world of complexity N+c+1 that's just like W except that it has four such parallel identical universes.
And a world of complexity N+c+X that has R parallel identical universes, where R is the largest number that can be specified in X bits of information.
So, take any given extreme mugger hypothesis like "I'm a matrix lord who will kill 3^^^^3 people if you don't give me $5." Uncontroversially, the probability of this hypothesis will be something much smaller than the probability of the default hypothesis. Let's be conservative and say the ratio is 1 in a billion.
(Here's the part I'm not so confident in)
Translating that into hypotheses with complexity values, the mugger hypothesis has about 30 more bits of information in it than the default hypothesis (since 2^30 is roughly a billion).
So, assuming c is small (and actually I think this assumption can be done away with), there's another hypothesis, just as likely as the Mugger hypothesis: that you are in a duplicated universe exactly like the one in the default hypothesis, except with R duplicates, where R is the largest number we can specify in 30 bits.
That number is very large indeed. (See the Busy Beaver function.) My guess is that it's going to be way way way larger than 3^^^^3. (It takes less than 30 bits to specify 3^^^^3, no?)
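For intuition on why numbers like 3^^^^3 have short descriptions: Knuth's up-arrow notation can be defined in a few lines, so the description of 3^^^^3 (which would be `up(3, 4, 3)` below) is tiny compared to the number itself. Whether it fits in under 30 bits depends on the description language, but the gist holds. A sketch, only safe to evaluate for small arguments since the recursion explodes almost immediately:

```python
def up(a, n, b):
    """Knuth up-arrow: a followed by n arrows, then b.
    up(3, 4, 3) would denote 3^^^^3 (do not actually evaluate it)."""
    if n == 1:
        return a ** b          # one arrow is ordinary exponentiation
    if b == 0:
        return 1               # base case of the hyperoperation
    return up(a, n - 1, up(a, n, b - 1))

# Small cases only:
print(up(3, 1, 3))   # 3^3 = 27
print(up(2, 2, 4))   # 2^^4 = 2^2^2^2 = 65536
print(up(3, 2, 3))   # 3^^3 = 3^3^3 = 7625597484987
```

The Busy Beaver point is the same idea pushed to the limit: the largest number specifiable in X bits grows faster than any computable function of X.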
So this isn't exactly a formal solution yet, but it seems like it might be on to something. Perhaps our expected utility converges after all.
Thoughts?
(I'm very confused about all this which is why I'm posting it in the first place.)
How realistic would AI-engineered chatbots be?
I'm interested in how easy it would be to simulate just one present-day person's life rather than an entire planet's worth of people. Currently our chatbots are bad enough that we could not populate the world with NPCs; the lone human would quickly figure out that everyone else was... different, duller, incomprehensibly stupid, etc.
But what if the chatbots were designed by a superintelligent AI?
If a superintelligent AI was simulating my entire life from birth, would it be able to do it (for reasonably low computational resources cost, i.e. less than the cost of simulating another person) without simulating any other people in sufficient detail that they would be people?
I suspect that the answer is yes. If the answer is "maybe" or "no," I would very much like to hear tips on how to tell whether someone is an ideal chatbot.
Thoughts?
EDIT: In the comments most people are asking me to clarify what I mean by various things. By popular demand:
I interact with people in more ways than just textual communication. I also hear them, and see them move about. So when I speak of chatbots I don't mean bots that can do nothing but chat. I mean an algorithm governing the behavior of a simulated entire-human-body, that is nowhere near the complexity of a brain. (Modern chatbots are algorithms governing the behavior of a simulated human-hands-typing-on-keyboard, that are nowhere near the complexity of a brain.)
When I spoke of "simulating any other people in sufficient detail that they would be people" I didn't mean to launch us into a philosophical discussion of consciousness or personhood. I take it to be common ground among all of us here that very simple algorithms, such as modern chatbots, are not people. By contrast, many of us think that a simulated human brain would be a person. Assuming a simulated human brain would be a person, but a simple chatbot-like algorithm would not, my question is: Would any algorithm complex enough to fool me into thinking it was a person over the course of repeated interactions actually be a person? Or could all the bodies around me be governed by algorithms which are too simple to be people?
I realize that we have no consensus on how complex an algorithm needs to be to be a person. That's OK. I'm hoping that this conversation can answer my questions anyhow; I'm expecting answers along the lines of
(A) "For a program only a few orders of magnitude more complicated than current chatbots, you could be reliably fooled your whole life" or
(B) "Any program capable of fooling you would either draw from massive databases of pre-planned responses, which would be impractical, or actually simulate human-like reasoning."
These answers wouldn't settle the question for good without a theory of personhood, but that's OK with me, these answers would be plenty good enough.
Be Wary of Thinking Like a FAI
I recently realized that, encouraged by LessWrong, I had been using a heuristic in my philosophical reasoning that I now think is suspect. I'm not accusing anybody else of falling into the same trap; I'm just recounting my own situation for the benefit of all.
I actually am not 100% sure that the heuristic is wrong. I hope that this discussion about it generalizes into a conversation about intuition and the relationship between FAI epistemology and our own epistemology.
The heuristic is this: If the ideal FAI would think a certain way, then I should think that way as well. At least in epistemic matters, I should strive to be like an ideal FAI.
Examples of the heuristic in use are:
--The ideal FAI wouldn't care about its personal identity over time; it would have no problem copying itself and deleting the original as the need arose. So I should (a) not care about personal identity over time, even if it exists, and (b) stop believing that it exists.
--The ideal FAI wouldn't care about its personal identity at a given time either; if it was proven that 99% of all observers with its total information set were in fact Boltzmann Brains, then it would continue to act as if it were not a Boltzmann Brain, since that's what maximizes utility. So I should (a) act as if I'm not a BB even if I am one, and (b) stop thinking it is even a meaningful possibility.
--The ideal FAI would think that the specific architecture it is implemented on (brains, computers, nanomachines, giant look-up tables) is irrelevant except for practical reasons like resource efficiency. So, following its example, I should stop worrying about whether e.g. a simulated brain would be conscious.
--The ideal FAI would think that it was NOT a "unified subject of experience" or an "irreducible substance" or that it was experiencing "ineffable, irreducible qualia," because believing in those things would only distract it from understanding and improving its inner workings. Therefore, I should think that I, too, am nothing but a physical mechanism and/or an algorithm implemented somewhere but capable of being implemented elsewhere.
--The ideal FAI would use UDT/TDT/etc. Therefore I should too.
--The ideal FAI would ignore uncomputable possibilities. Therefore I should too.
...
Arguably, most if not all of the conclusions I drew in the above are actually correct. However, I think that the heuristic is questionable, for the following reasons:
(1) Sometimes what we think of as the ideal FAI isn't actually ideal. Case in point: The final bullet above about uncomputable possibilities. We intuitively think that uncomputable possibilities ought to be countenanced, so rather than overriding our intuition when presented with an attractive theory of the ideal FAI (in this case AIXI) perhaps we should keep looking for an ideal that better matches our intuitions.
(2) The FAI is a tool for serving our wishes; if we start to think of ourselves as being fundamentally the same sort of thing as the FAI, our values may end up drifting badly. For simplicity, let's suppose the FAI is designed to maximize happy human life-years. The problem is, we don't know how to define a human. Do simulated brains count? What about patterns found inside rocks? What about souls, if they exist? Suppose we have the intuition that humans are indivisible entities that persist across time. If we reason using the heuristic I am talking about, we would decide that, since the FAI doesn't think it is an indivisible entity that persists across time, we shouldn't think we are either. So we would then proceed to tell the FAI "Humans are naught but a certain kind of functional structure," and (if our overruled intuition was correct) all get killed.
Thoughts?
...
Note 1: "Intuitions" can (I suspect) be thought of as another word for "Priors."
Note 2: We humans are NOT Solomonoff-induction approximators, as far as I can tell. This bodes ill for FAI, I think.
Just for fun: Computer game to illustrate AI takeover concepts?
I play Starcraft:BW sometimes with my brothers. One of my brothers is much better than the rest of us combined. This story is typical: In a free-for-all, the rest of us gang up on him, knowing that he is the biggest threat. By sheer numbers we beat him down, but foolishly allow him to escape with a few workers. Despite suffering this massive setback, he rebuilds in hiding and ends up winning due to his ability to tirelessly expand his economy while simultaneously fending off our armies.
This story reminds me of some AI-takeover scenarios. I wonder: Could we make a video game that illustrates many of the core ideas surrounding AGI? For example, a game where the following concepts were (more or less) accurately represented as mechanics:
--AI arms race
--AI friendliness and unfriendliness
--AI boxing
--rogue AI and AI takeover
--AI being awesome at epistemology and science and having amazing predictive power
--Interesting conversations between AI and their captors about whether or not they should be unboxed.
I thought about this for a while, and I think it would be feasible and (for some people at least) fun. I don't foresee myself being able to actually make this game any time soon, but I like thinking about it anyway. Here is a sketch of the main mechanics I envision:
- Setting the Stage
- This is a turn-based online game with some element of territory control and conventional warfare, designed to be played with at least 7 or so players. I'm imagining an Online Diplomacy variant such as http://www.playdiplomacy.com/ which seems to be pretty easy to make. It would be nice to make it more complicated though, since this is not a board game.
- Turns are simultaneous; each round lasts one day on standard settings.
- Players indicate their preferences for the kind of game they would like to play, and then get automatically matched with other players of a similar skill level.
- Players have accounts, so that we can keep track of how skilled they are, and assign them rough rankings based on their experience and victory ratio.
- Rather than recording merely wins and losses, this game keeps track of Victory Points.
- All games are anonymous.
- Introducing AI
- As the game progresses, factions have the ability to build AI which are implemented by bringing in another player from outside the game.
- The skill level of the AI player is random, but most likely to be the same as the skill level of the other players in the game.
- Additional resources and time can be spent on building the AI to increase the likelihood that it is Friendly (more on that later)
- What It Is Like to Be An AI
- When you enter a game as an AI, it is randomly determined whether or not you are Friendly. Only you (the AI) can see the result.
- An AI starts off "boxed," controlling no territory, no armies, no anything. It cannot even communicate with any players other than the player who created it. However, we assume it has been given enough information about the world that it can model it very well.
- As a result, it is omniscient. Not only can the AI see the entire map, but it can predict the moves that human players will make. More on this in next section.
- The player controlling a boxed AI can do various things:
- Message it in-game (no livechat)
- Destroy it
- Give it military units
- You gift some units to the AI. You no longer control them; the AI does. They receive combat bonuses of some sort.
- This gives the AI the ability to communicate directly with the other players as a side-effect
- Give it manufacturing capability
- You gift some production center to the AI. It can now build nanofactories and whatever fancy technology it has dreamed up, all things which are too complicated for you to understand.
- This gives the AI "Unboxed" status.
- AIs can become Unboxed in some circumstances even against the will of their controllers. For example, if they use their troops to conquer a manufacturing facility, they can choose to become Unboxed. Or, if they have communication ability, they can broadcast their code to the world, allowing any faction to build them, perhaps with some chance of being built spontaneously (this represents them convincing random minor factions to build them, like North Korea or BestBuy).
- In some cases, AI's can become Unboxed without their captors realizing this. (not sure how to make this happen yet)
- Modeling AI Predictive Power
- We are trying to model the ability of the AI to predict what the humans will do, at least in the short term. AI cannot predict what other AI's of similar computing power will do, but AI with ridiculous amounts of computing power can predict what weaker AI's will do by simulating them.
- We accomplish this by splitting each round into three phases: the human phase, the Boxed AI phase, and the Unboxed AI phase.
- The Human phase is straightforward: 24 hours for all players to message each other and decide what their moves will be (which units will move where, which territories will manufacture what, etc.)
- The Boxed AI phase is another 24 hours in which Boxed AI have the privilege of revising their planned orders as they see fit, and talking with the people they can communicate with if they want. Importantly, Boxed AI can see all the planned orders of the human players. People to whom they talk then gain the same privilege, since their ability to act on the decisions of the AI makes them as unpredictable as the AI. If those people then talk to other people, the privilege spreads...
- The Unboxed AI phase is to the Boxed AI phase what the Boxed AI phase is to the Human phase.
- After the Unboxed AI phase finishes, the moves finally resolve and the next round begins.
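The three-phase structure above can be sketched as a single rule (names and representation are my own, purely illustrative): each kind of actor may set or revise orders in its own phase or any earlier one, and later phases see the planned orders from earlier ones, which is what gives AIs their predictive edge.

```python
# Phase order within a round; actors acting in later phases can see
# (and react to) the planned orders submitted in earlier phases.
PHASES = ["human", "boxed_ai", "unboxed_ai"]

def may_revise(actor_kind, phase):
    # An actor may set or revise orders in any phase up to and
    # including the phase named after its own kind. So unboxed AIs
    # get the last word, after seeing everyone else's plans.
    return PHASES.index(phase) <= PHASES.index(actor_kind)
```

The "privilege spreads" rule for players a boxed AI talks to would need extra bookkeeping on top of this, effectively promoting those players to the boxed-AI phase for that round.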
- Decoy Moves:
- For various reasons I think it would be good to introduce the following element of randomness:
- In addition to inputting their orders in a given round, players input their "decoy" orders. (A user-friendly UI would make this feasible)
- To represent imperfect predictive power, AIs see either the real orders of the human players or else the decoy orders, with (say) 80%/20% probability
- Same goes for boxed AI when they are being predicted by unboxed AI.
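The decoy mechanic could be implemented as a single sampling step per predicted player (a sketch; the function name, the 80/20 split, and the string orders are placeholders):

```python
import random

def predicted_orders(real_orders, decoy_orders, accuracy=0.8, rng=random):
    """Return the order set a predicting AI is shown for one player.

    With probability `accuracy` the predictor sees the real orders;
    otherwise it is fed the decoys, modeling imperfect prediction.
    """
    return real_orders if rng.random() < accuracy else decoy_orders

# Example: over many rounds, roughly 80% of predictions are accurate.
rng = random.Random(0)  # seeded for reproducibility
hits = sum(
    predicted_orders("real", "decoy", rng=rng) == "real"
    for _ in range(10_000)
)
```

One design consequence: because the sample is drawn per predicted player per round, a human can never be sure whether a given AI blunder reflects the 20% miss or a deliberate feint, which keeps decoys worth writing.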
- Victory Conditions
- Human players get, say, ~50 VP's for winning a game, ~25 VP's for playing in a game won by another human player or friendly AI, and ~10 VP's for playing in a game won by an unfriendly AI.
- Friendly AI players get ~50 VP's if their human creator wins the game, ~30 VP's if they win the game, ~20 VP's if another human player or friendly AI wins the game, and ~10 VP's for playing in a game won by an unfriendly AI.
- Unfriendly AI players get ~100 VP's if they win the game, and ~10 VP's if anyone else does.
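The payoff table above can be written as a small scoring function (the numbers are the rough ones from the bullets; the role and outcome labels are my own):

```python
# Victory points by (player role, game outcome).
VICTORY_POINTS = {
    "human": {
        "self": 50, "other_human_or_fai": 25, "ufai": 10,
    },
    "friendly_ai": {
        "creator": 50, "self": 30, "other_human_or_fai": 20, "ufai": 10,
    },
    "unfriendly_ai": {
        "self": 100, "anyone_else": 10,
    },
}

def score(role, outcome):
    return VICTORY_POINTS[role][outcome]
```

Note how the table encodes the alignment concepts as incentives: a friendly AI is paid more for its creator's victory than for its own, while an unfriendly AI cares about nothing but winning outright.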
Questions:
(1) The most crucial part of this design is the "Modeling AI Predictive Power" section. This is how we represent the AI's massive advantage in predictive power. However, this comes at the cost of tripling the amount of time the game takes to play. Can you think of a better way to do this?
(2) I'd like AI's to be able to "predict" the messages that players send to each other also. However, it would be too much to ask players to make "Decoy Message Logs." Is it worth dropping the decoy idea (and making the predictions 100% accurate) to implement this?
(3) Any complaints about the skeleton sketched above? Perhaps something is wildly unrealistic, and should be replaced by a different mechanic that more accurately captures the dynamics of AGI?
For what it's worth, I spent a reasonable amount of time thinking about the mechanics I used, and I think I could justify their realism. I expect to have made quite a few mistakes, but I wasn't just making stuff up on the fly.
(4) Any other ideas for mechanics to add to the game?
Questions to ask theist philosophers? I will soon be speaking with several
I am about to graduate from one of the only universities in the world that has a high concentration of high-caliber analytic philosophers who are theists. (Specifically, the University of Notre Dame, IN) So as not to miss this once-in-a-lifetime opportunity, I have sent out emails asking many of them if they would like to meet and discuss their theism with me. Several of them have responded already in the affirmative; fingers crossed for the rest. I'm really looking forward to this because these people are really smart, and have spent a lot of time thinking about this, so I expect them to have interesting and insightful things to say.
Do you have suggestions for questions I could ask them? My main question will of course be "Why do you believe in God?" and variants thereof, but it would be nice if I could say e.g. "How do you avoid the problem of X which is a major argument against theism?"
Questions I've already thought of:
1-Why do you believe in God?
2-What are the main arguments in favor of theism, in your opinion?
3-What about the problem of evil? What about objective morality: how do you make sense of it, and if you can't, then how do you justify belief in God?
4-What about divine hiddenness? Why doesn't God make himself more easily known to us? For example, he could regularly send angels to deliver philosophical proofs on stone tablets to doubters.
5-How do you explain God's necessary existence? What about the "problem of many Gods," i.e. why can't people say the same thing about a slightly different version of God?
6-In what sense is God the fundamental entity, the uncaused cause, etc.? How do you square this with God's seeming complexity? (he is intelligent, after all) If minds are in fact simple, then how is that supposed to work?
I welcome more articulate reformulations of the above, as well as completely new ideas.
Rationality & Low-IQ People
This post is to raise a question about the demographics of rationality: Is rationality something that can appeal to low-IQ people as well?
I don't mean in theory, I mean in practice. From what I've seen, people who are concerned about rationality (in the sense that it has on LW, OvercomingBias, etc.) are overwhelmingly high-IQ.
Meanwhile, HPMOR and other stories in the "rationality genre" appeal to me, and to other people I know. However I wonder: Perhaps part of the reason they appeal to me is that I think of myself as a smart person, and this allows me to identify with the main characters, cheer when they think their way to victory, etc. If I thought of myself as a stupid person, then perhaps I would feel uncomfortable, insecure, and alienated while reading the same stories.
So, I have four questions:
1.) Do we have reason to believe that the kind of rationality promoted on LW, OvercomingBias, CFAR, etc. appeals to a fairly normal distribution of people around the IQ mean? Or should we think, as I suggested, that people with lower IQ's are disposed to find the idea of being rational less attractive?
2.) Ditto, except replace "being rational" with "celebrating rationality through stories like HPMOR." Perhaps people think that rationality is a good thing in much the same way that being wealthy is a good thing, but they don't think that it should be celebrated, or at least they don't find such celebrations appealing.
3.) Supposing #1 and #2 have the answers I am suggesting, why?
4.) Making the same supposition, what are the implications for the movement in general?
Note: I chose to use IQ in this post instead of a more vague term like "intelligence," but I could easily have done the opposite. I'm happy to do whichever version is less problematic.