Comment author: chaosmage 04 July 2014 10:33:33AM *  2 points [-]

Are you trying to reach lots of people and convince them AI takeover is a real threat?

In that case, you'd want to make a simple, intuitive browser/app game, maybe something like Pandemic 2.

(I don't know whether that game really made people more wary of pandemics, but it did for me, and people do generalize from fictional evidence.)

Comment author: kokotajlod 04 July 2014 01:12:23PM 1 point [-]

This would be the ideal. Like I said though, I don't think I'll be able to make it anytime soon, or (honestly) anytime ever.

But yeah, I'm trying to design it to be simple enough to play in-browser or as an app, perhaps even as a Facebook game or something. It doesn't need good graphics or a detailed physics simulator, for example: it is essentially a board game in a computer, like Diplomacy or Risk (though more complicated than any board game could be).

I think that the game, as currently designed, would be an excellent source of fictional evidence for the notions of AI risk and AI arms races. Those notions are pretty important. :)

Just for fun: Computer game to illustrate AI takeover concepts?

12 kokotajlod 03 July 2014 07:30PM

I play Starcraft:BW sometimes with my brothers. One of my brothers is much better than the rest of us combined. This story is typical: In a free-for-all, the rest of us gang up on him, knowing that he is the biggest threat. By sheer numbers we beat him down, but foolishly allow him to escape with a few workers. Despite suffering this massive setback, he rebuilds in hiding and ends up winning due to his ability to tirelessly expand his economy while simultaneously fending off our armies.

This story reminds me of some AI-takeover scenarios. I wonder: Could we make a video game that illustrates many of the core ideas surrounding AGI? For example, a game where the following concepts were (more or less) accurately represented as mechanics:

--AI arms race

--AI friendliness and unfriendliness

--AI boxing

--rogue AI and AI takeover

--AI being awesome at epistemology and science and having amazing predictive power

--interesting conversations between AI and their captors about whether or not they should be unboxed

 

I thought about this for a while, and I think it would be feasible and (for some people at least) fun. I don't foresee myself being able to actually make this game any time soon, but I like thinking about it anyway. Here is a sketch of the main mechanics I envision:

  • Setting the Stage
    • This is a turn-based online game with some element of territory control and conventional warfare, designed for at least seven or so players. I'm imagining an online Diplomacy variant such as http://www.playdiplomacy.com/, which seems to be pretty easy to make. It would be nice to make it more complicated, though, since this is not a board game.
    • Turns are simultaneous; each round lasts one day on standard settings.
    • Players indicate their preferences for the kind of game they would like to play, and then get automatically matched with other players of a similar skill level.
    • Players have accounts, so that we can keep track of how skilled they are, and assign them rough rankings based on their experience and victory ratio.
    • Rather than recording merely wins and losses, this game keeps track of Victory Points.
    • All games are anonymous.
  • Introducing AI
    • As the game progresses, factions gain the ability to build AIs, which are implemented by bringing in another player from outside the game.
    • The skill level of the AI player is random, but most likely to be the same as the skill level of the other players in the game.
    • Additional resources and time can be spent on building the AI to increase the likelihood that it is Friendly (more on that later).
  • What It Is Like to Be An AI
    • When you enter a game as an AI, it is randomly determined whether or not you are Friendly. Only you (the AI) can see the result.
    • An AI starts off "boxed," controlling no territory, no armies, no anything. It cannot even communicate with any players other than the player who created it. However, we assume it has been given enough information about the world that it can model it very well.
    • As a result, it is omniscient. Not only can the AI see the entire map, but it can predict the moves that human players will make. More on this in the next section.
    • The player controlling a boxed AI can do various things:
      • Message it in-game (no livechat)
      • Destroy it
      • Give it military units
        • You gift some units to the AI. You no longer control them; the AI does. They receive combat bonuses of some sort.
        • This gives the AI the ability to communicate directly with the other players as a side-effect
      • Give it manufacturing capability
        • You gift some production center to the AI. It can now build nanofactories and whatever fancy technology it has dreamed up, all things which are too complicated for you to understand.
        • This gives the AI "Unboxed" status.
    • AIs can become Unboxed in some circumstances even against the will of their controllers. For example, if they use their troops to conquer a manufacturing facility, they could choose to become Unboxed. Or, if they have communication ability, they could broadcast their code to the world, allowing any faction to build them, perhaps with some chance of being built spontaneously (this represents them convincing random minor factions, like North Korea or BestBuy, to build them).
    • In some cases, AIs can become Unboxed without their captors realizing it. (Not sure how to make this happen yet.)
  • Modelling AI predictive power
    • We are trying to model the ability of the AI to predict what the humans will do, at least in the short term. An AI cannot predict what other AIs of similar computing power will do, but an AI with ridiculous amounts of computing power can predict what weaker AIs will do by simulating them.
    • We accomplish this by splitting each round into three phases: the human phase, the Boxed AI phase, and the Unboxed AI phase.
    • The Human phase is straightforward: 24 hours for all players to message each other and decide what their moves will be (which units will move where, which territories will manufacture what, etc.)
    • The Boxed AI phase is another 24 hours in which Boxed AI have the privilege of revising their planned orders as they see fit, and talking with the people they can communicate with if they want. Importantly, Boxed AI can see all the planned orders of the human players. People to whom they talk then gain the same privilege, since their ability to act on the decisions of the AI makes them as unpredictable as the AI. If those people then talk to other people, the privilege spreads...
    • The Unboxed AI phase is to the Boxed AI phase what the Boxed AI phase is to the Human phase.
    • After the Unboxed AI phase finishes, the moves finally resolve and the next round begins.
  • Decoy Moves:
    • For various reasons I think it would be good to introduce the following element of randomness:
    • In addition to inputting their orders in a given round, players input their "decoy" orders. (A user-friendly UI would make this feasible)
    • To represent imperfect predictive power, AIs see either the real orders of the human players or their decoy orders, with (say) 80%/20% probability.
    • The same goes for boxed AIs when they are being predicted by unboxed AIs.
  • Victory Conditions
    • Human players get, say, ~50 VPs for winning a game, ~25 VPs for playing in a game won by another human player or Friendly AI, and ~10 VPs for playing in a game won by an Unfriendly AI.
    • Friendly AI players get ~50 VPs if their human creator wins the game, ~30 VPs if they win the game themselves, ~20 VPs if another human player or Friendly AI wins, and ~10 VPs for playing in a game won by an Unfriendly AI.
    • Unfriendly AI players get ~100 VPs if they win the game, and ~10 VPs if anyone else does.
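The decoy-order mechanic and the victory-point payouts above are concrete enough to sketch in code. This is only a rough illustration: the function and role names are invented, and the numbers are the approximate figures from the outline.

```python
import random

REAL_ORDER_PROB = 0.8  # the outline's rough 80/20 split of real vs. decoy orders

def predicted_orders(real_orders, decoy_orders, rng=random):
    """The order set a predicting AI is shown for one weaker player."""
    return real_orders if rng.random() < REAL_ORDER_PROB else decoy_orders

def award(role, winner_role, you_won=False, your_creator_won=False):
    """Approximate VP payout for one player at game end.

    role / winner_role: "human", "friendly_ai", or "unfriendly_ai".
    """
    if role == "human":
        if you_won:
            return 50
        return 25 if winner_role in ("human", "friendly_ai") else 10
    if role == "friendly_ai":
        if your_creator_won:
            return 50
        if you_won:
            return 30
        return 20 if winner_role in ("human", "friendly_ai") else 10
    if role == "unfriendly_ai":
        return 100 if you_won else 10
    raise ValueError(f"unknown role: {role}")
```

Note how the payouts reward a Friendly AI most for its creator's victory, and an Unfriendly AI only for its own: that asymmetry is what gives the hidden Friendliness roll its tension.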

 

Questions:


(1) The most crucial part of this design is the "Modelling AI predictive power" section. This is how we represent the AI's massive advantage in predictive power. However, it comes at the cost of tripling the amount of time the game takes to play. Can you think of a better way to do this?
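One piece of that mechanic, the spreading revision privilege, is just a transitive closure over who talked to whom, so it is cheap to compute regardless of how the phases are structured. A minimal sketch, with invented names:

```python
def privileged_set(boxed_ais, conversations):
    """Players allowed to revise their orders during the boxed-AI phase.

    Starts with the boxed AIs themselves; anyone a privileged player talks
    to becomes privileged too, transitively, since they can now act on the
    AI's predictions. conversations is a list of (speaker, listener) pairs.
    """
    privileged = set(boxed_ais)
    changed = True
    while changed:
        changed = False
        for speaker, listener in conversations:
            if speaker in privileged and listener not in privileged:
                privileged.add(listener)
                changed = True
    return privileged
```

For example, `privileged_set({"AI-1"}, [("AI-1", "Alice"), ("Alice", "Bob"), ("Carol", "Dave")])` yields `{"AI-1", "Alice", "Bob"}`: the privilege spreads along the conversation chain, while Carol and Dave stay predictable.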

(2) I'd like AIs to be able to "predict" the messages that players send to each other as well. However, it would be too much to ask players to write "decoy message logs." Is it worth dropping the decoy idea (and making the predictions 100% accurate) in order to implement this?

(3) Any complaints about the skeleton sketched above? Perhaps something is wildly unrealistic, and should be replaced by a different mechanic that more accurately captures the dynamics of AGI?

For what it's worth, I spent a reasonable amount of time thinking about the mechanics I used, and I think I could justify their realism. I expect I've made quite a few mistakes, but I wasn't just making stuff up on the fly.

(4) Any other ideas for mechanics to add to the game?

Comment author: satt 25 June 2014 11:54:09PM 1 point [-]

So a priori I'd have expected to dislike this post, because I believe (1) the utility monster concept is iffy and confuses more than it clarifies, and (2) my intuitions skew risk averse and/or negative utilitarian, in the sense that I'd rather not create new sapient beings just to use them as utility pumps. But I quite like it for some reason and I can't put my finger on why.

Maybe because it takes a dubious premise (the utility monster concept) and derives a conclusion (make utility monsters to feed them) that seems less incoherent to me than the usual conclusion derived from the premise (utility monsters are awful, for some reason, even though by assumption they generate huge amounts of utility, oh dear!)?

Comment author: kokotajlod 28 June 2014 02:39:18PM 0 points [-]

(utility monsters are awful, for some reason, even though by assumption they generate huge amounts of utility, oh dear!)

Utility monsters are awful, possibly for no reason whatsoever. That's OK. Value is complex. Some things are just bad, not because they entail any bad thing but just because they themselves are bad.

Comment author: Furcas 16 June 2014 12:42:56AM 5 points [-]

I don't like 'Safe AGI' because it seems to include AIs that are Unfriendly but too stupid to be dangerous, for example.

Comment author: kokotajlod 16 June 2014 01:47:05PM 0 points [-]

That's not something the average person will think upon hearing the term, especially since "AGI" tends to connote something very intelligent. I don't think it is a strong reason not to use it.

Comment author: kokotajlod 25 May 2014 03:44:08PM 1 point [-]

It is nice to see people thinking about this stuff. Keep it up, and keep us posted!

Have you read the philosopher Derek Parfit? He is famous for arguing for pretty much exactly what you propose here, I think.

Doubt: Doesn’t this imply that anthropic probabilities depend on how big a boundary the mind draws around stuff it considers “I”? Self: Yes. Doubt: This seems to render probability useless.

I agree with Doubt. If I can make it 100% probable that I'll get superpowers tomorrow merely by convincing myself that only superpowered future-versions of me count as me, then sign me up for surgical brainwashing today!

If you take altruism into account, then it all adds up to normality. Or rather, it can all be made to add up to normality, if we suitably modify our utility function. But that's true of ANY theory.

My question is, would you apply the same principle to personal-identity-right-now? Forget the future and the past, and just worry about the question "what am I right now?" Would you say that the answer to this question is also mind-dependent, such that if I decide to draw the reference class for the word "I" in such a way as to exclude brains in vats, then I have 0% probability of being a brain in a vat?

Comment author: gwern 09 May 2014 11:04:54PM 0 points [-]

Yes, but they would have been dead wrong about everything they thought about the supernatural, not just the placebo effect. Thus, if anyone were to suggest:

Consider the corresponding theory about science. Maybe there is a Placebo Effect going on with the laws of nature and even engineering, whereby things work partly because we think they will work. How could this be? Well, we don't understand how the placebo effect could be either. God is a decent explanation--maybe airplanes are his way of rewarding us for spending so much time thinking rationally about the principles of flight. Maybe if we spent enough time thinking rationally about the principles of faster-than-light travel, he would change things behind the scenes so that it became possible.

The example of the placebo effect would work against this theory: "people were completely and totally wrong about beliefs affecting reality before and it turned out to be some artifacts of selection bias / regression to the mean / relaxation / evolutionarily-based allocation of bodily resources, and so I disbelieve in your suggestion even more than I would just on its merits because clearly people are not good at this sort of thinking".

Comment author: kokotajlod 11 May 2014 05:38:06AM -1 points [-]

No, the analogy I had in mind was this:

What People Saw: Acupuncture* being correlated with health, and [building things according to theories developed using the scientific method] being correlated with [having things that work very well]

What People Thought Happened: Acupuncture causing health and [building things according to theories developed using the scientific method] causing [having things that work very well]

What Actually Happened: Placebo effect and Placebo effect (in the former case, involving whatever mechanisms we think cause the placebo effect these days; in the latter case, involving e.g. God.)

people were completely and totally wrong about beliefs affecting reality before

Filtering out all the selection bias etc., the relaxation and evolutionarily-based allocation of bodily resources seem to work fine for my purposes. They are analogous to theism-based allocation of technological power.

Comment author: gwern 08 May 2014 11:45:57PM 0 points [-]
Comment author: kokotajlod 09 May 2014 04:07:26AM 0 points [-]

I didn't mean to imply that the placebo effect is a complete mystery. As you say, perhaps it is pretty well understood. But that doesn't touch my overall point which is that before modern medicine (and modern explanations for the placebo effect) people would have had plenty of evidence that e.g. faith healing worked, and that therefore spirits/gods/etc. existed.

Comment author: dougclow 01 May 2014 09:41:24AM 8 points [-]

To be fair to the medieval, their theories about how one can build large, beautiful buildings were pretty sound.

Comment author: kokotajlod 08 May 2014 04:09:38PM 0 points [-]

Similarly, modern theories about how to discover the habits of God in governing Creation (the Laws of Nature) are pretty sound as well. Or so theists say.

A better example than Amiens Cathedral would be the Placebo Effect. For most of human history, people with access to lots of data (but no notion of the Placebo Effect) had every reason to believe that e.g. witch doctors, faith healing, etc. was all correct.

Warning: Rampant speculation about a theory of low probability: Consider the corresponding theory about science. Maybe there is a Placebo Effect going on with the laws of nature and even engineering, whereby things work partly because we think they will work. How could this be? Well, we don't understand how the placebo effect could be either. God is a decent explanation--maybe airplanes are his way of rewarding us for spending so much time thinking rationally about the principles of flight. Maybe if we spent enough time thinking rationally about the principles of faster-than-light travel, he would change things behind the scenes so that it became possible.

Comment author: shminux 06 May 2014 03:56:56PM *  -1 points [-]

Could the loop be big enough to contain us already?

Note the reversibility issue. After a complete loop, you end up in exactly the same state you started in. All the dissipated energy and information must somehow return. Unless you are willing to wait for the Poincaré recurrence, this is not very likely to happen, and in that case the wait time is so large as to be practically infinite.

Comment author: kokotajlod 07 May 2014 12:02:20AM 0 points [-]

If we are in a time loop we won't be trying to escape it, but rather exploit it.

For example: Suppose I find out that the entire local universe-bubble is in a time loop, and there is a way to build a spaceship that will survive the big crunch in time for the next big bang. Or something like that.

Well, I go to my backyard and start digging, and sure enough I find a spaceship complete with cryo-chambers. I get in, wait till the end of the universe, and then after the big bang starts again I get out and seed the Earth with life. I go on to create a wonderful civilization that keeps to the shadows and avoids contact with "mainstream" humanity until, say, the year 2016. In the year 2014 of course, my doppelganger finds the machine I buried in his backyard...

I'm not saying this scenario is plausible, just that it is an example of exploiting time travel despite never breaking the loop. Or am I misunderstanding how this works?

Comment author: shminux 03 May 2014 05:01:39AM 2 points [-]

Is it at all possible, never mind how likely, that our own universe contains closed timelike curves?

Very, very, very unlikely. Hawking once wrote a paper called "Chronology protection conjecture," arguing that any time loop would self-destruct before forming. Even if they existed, it's not as if you could travel along them to do things. There is no entering or exiting the loop. Everything in the Groundhog Day loop has been there forever and will be there forever, because there is no difference between the first time through the loop and the n-th time through. This is all predicated on classical general relativity; quantum gravity may change things, but no one knows anything about quantum gravity.

Comment author: kokotajlod 06 May 2014 02:46:24PM 0 points [-]

Thanks for the info. Hmm. What do you mean by "there is no entering or exiting the loop"? Could the loop be big enough to contain us already?

I'm not concerned about traveling backwards in time to change the past; I just want to travel backwards in time. In fact, I hope that I wouldn't be able to change the past. Consistency of that sort can be massively exploited.
