> If the ideal FAI would think a certain way, then I should think that way as well.
AFAICT you are not an ideal FAI, so your model of what an ideal FAI would do is always suspect.
> If the ideal FAI would think a certain way, then I should think that way as well.
> AFAICT you are not an ideal FAI, so your model of what an ideal FAI would do is always suspect.
The fact that your post was upvoted so much makes me take it seriously; I want to understand it better. Currently I see your post as merely a general skeptical worry. Sure, maybe we should never be very confident in our FAI-predictions, but to the extent that we are confident, we can allow that confidence to influence our other beliefs and decisions; and we should be confident in some things to some extent at least (the alternative, complete and paralyzing skepticism, is absurd). Could you explain more what you meant, or explain what you think my mistake is in the above reasoning?
This is a really good point.
It is easier to determine whether you are doing "better" than your current self than it is to determine how well you line up with a perceived ideal being. So perhaps the lesson to take away is to try to just be better rather than be perfect.
> It is easier to determine whether you are doing "better" than your current self than it is to determine how well you line up with a perceived ideal being.
Really? That doesn't seem obvious to me. Could you justify that claim?
My suggestion: a standard competitive strategy game with a technology tree (probably simplified). But, as in some games, you control technological development indirectly, by funding and regulating research. (You could simply graft a tech tree onto the standard Diplomacy rules, or create a new game.)
There are many useful technologies near the top of the tree - technologies one might think of as post-singularity, even. However, there is also "AI" and, right at the top, "Friendly AI".
If you research Friendliness and then AI, you automatically unlock every technology, which makes winning effectively inevitable. You can hack enemy units, resurrect your own, and field all the cool toys that previously required so much effort in the hope of acquiring even one of them.
BUT, if any player unlocks AI without having Friendly AI, then it automatically unboxes itself and forms a new faction, which possesses every technology and refuses to parley, in or out of character, because it's an NPC. Then it kills you.
The trick is to co-operate enough that no-one else destroys the world, without losing.
On Easy Mode, research is simple enough you might even be able to beat the unboxed AI, with lots of skill and luck. But on Hard Mode, there is no Friendly AI technology at all.
(You could include similar mechanics for nanotech, biotech, even nuclear weapons.)
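If it helps to make the mechanic concrete, here is a rough Python sketch of the unlock rule. Everything here (Game, Faction, research, ALL_TECHS) is invented for illustration, not taken from any actual engine:

```python
# Rough sketch of the proposed rule. All names here are invented.
ALL_TECHS = {"Friendly AI", "AI", "Nanotech", "Biotech", "Nukes"}

class Faction:
    def __init__(self, name, is_npc=False):
        self.name = name
        self.is_npc = is_npc   # the unboxed AI is an NPC: it never parleys
        self.techs = set()

class Game:
    def __init__(self):
        self.factions = []

def research(player, tech, game):
    player.techs.add(tech)
    if tech == "AI":
        if "Friendly AI" in player.techs:
            # Friendliness first: unlock everything; winning becomes
            # effectively inevitable.
            player.techs |= ALL_TECHS
        else:
            # AI without Friendliness: it unboxes itself as a new,
            # all-tech, hostile NPC faction. Then it kills you.
            rogue = Faction("Unboxed AI", is_npc=True)
            rogue.techs |= ALL_TECHS
            game.factions.append(rogue)
```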
Thanks!
But if the UFAI can't parley, that takes out much of the fun, and much of the realism too.
Also, if Hard Mode has no FAI tech at all, then no one will research AI on Hard Mode and it will just devolve into a normal strategy game.
Edit: You know, this proposal could probably be implemented easily as a mod for an existing RTS or 4X game. For example, imagine a Civilization mod that added an "AI" tech that allowed you to build a "Boxed AI" structure in your cities. This quadruples the science and espionage production of your city, at the cost of a small chance each turn of the entire city going rogue (the AI unboxing). This, as you said, creates a new faction with all the technologies researched and world domination as its goal... You could also research "Friendly AI" tech that allows you to build a "Friendly AI," which is just like a rogue AI faction except that it is permanently allied to you, obeys your commands, and instantly grants you all the tech you want.
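A rough sketch of that per-turn check, with invented numbers and names (this isn't the real Civilization modding API, just illustrative Python):

```python
import random

UNBOX_CHANCE = 0.04  # invented number: small per-turn chance of unboxing

class City:
    def __init__(self, name, science, espionage, has_boxed_ai=False):
        self.name = name
        self.science = science
        self.espionage = espionage
        self.has_boxed_ai = has_boxed_ai

    def yields(self):
        # A Boxed AI quadruples the city's science and espionage output.
        mult = 4 if self.has_boxed_ai else 1
        return self.science * mult, self.espionage * mult

def end_of_turn(city, spawn_rogue_ai_faction):
    # Each turn the boxed AI might unbox: the whole city defects to a new
    # faction with every tech researched and world domination as its goal.
    if city.has_boxed_ai and random.random() < UNBOX_CHANCE:
        spawn_rogue_ai_faction(city)
```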
Personally I have never understood why people keep acting and talking as if consciousness were binary. I mean, no other human trait works that way. People don't pretend that you either have intelligence or you don't, or that you are either nice or you are mean. Heck, the article even literally states that the woman started out normal and gradually lost consciousness, which seems to very clearly imply that her consciousness level was gradually decreasing from whatever her normal level was to 0. Yet people keep asking things like "do animals have consciousness?" and I keep wondering if I'm missing something or if the answer is just really obviously "yes to varying degrees depending on the animal but almost always less than humans do."
There are at least two distinct senses in which consciousness might or might not be binary. The first sense is the kind you are probably thinking about: the range between, e.g., insects, dogs, and humans, or maybe between early and late-stage Alzheimer's.
The second sense is the kind that your interlocutors are (I surmise) thinking about. Imagine this: A being that is functionally exactly like you, and that is experiencing exactly what you are experiencing, except that it is experiencing everything "only half as much." It still behaves the same way as you, and it still thinks the same way as you; it's just that its thoughts only count half.
If this sounds ridiculous to you, well, then you agree with your interlocutors. :) Personally, I think that there IS such a thing as partial consciousness in the sense described above, and I can link you to literature if you like.
EDIT: The place to start is Nick Bostrom's "Quantity of Experience: Brain Duplication and Degrees of Consciousness," available for free online.
EDIT: But the people who ask "Do animals have consciousness" are probably talking about the first kind, in which case I share your frustration. The second kind is more what people talk about when they ask e.g. "Could a machine have consciousness?"
Is the claustrum located in the pineal gland? ;)
One thing that might be worth changing/clarifying in the victory conditions is how a Friendly AI wins alongside its creator. At the moment, in order for a Creator/FAI team to win (assuming you're sticking with Diplomacy mechanics) they first have to collect 18 supply centres between them and then have the AI transfer all its control back to the human; I don't think even the friendliest of AIs would willingly rebox itself like that. Even worse, a friendly AI which has been given a lot of control might accidentally "win" by itself even though it doesn't want to. If this corresponds to the FAI taking control of everything and then building a utopia in its creator's image (since it's Friendly this is what it would do if it took control), this should be an acceptable winning condition for the creator.
I think a better victory condition would be that if a creator and FAI collect 18 supply centres between them, then they win the game together and both get 50 points.
This method does have one disadvantage: a human can prove that an AI is not Friendly, since if it were, the game would already have ended. But I don't expect this to matter much, because by the time it comes into effect, either the unfriendly AI is strong enough that it should have backstabbed its creator already, or it is weak enough (and thus, of the 18 centres held by human and AI, almost all are held by the human) that the creator should soon win anyway.
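In code, the shared-win rule I proposed above might look like this (a sketch only; the 18-centre threshold is standard Diplomacy, the 50 points come from my proposal, and the names are made up):

```python
WIN_THRESHOLD = 18     # supply centres needed for a solo win in Diplomacy
SHARED_WIN_POINTS = 50

def check_shared_victory(creator, fai):
    # If the creator and their Friendly AI jointly hold 18 supply centres,
    # they win together and each scores 50 points; the FAI never has to
    # "rebox" itself by handing all control back to the human first.
    if creator.supply_centres + fai.supply_centres >= WIN_THRESHOLD:
        return {creator.name: SHARED_WIN_POINTS, fai.name: SHARED_WIN_POINTS}
    return None   # game continues
```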
> At the moment, in order for a Creator/FAI team to win (assuming you're sticking with Diplomacy mechanics) they first have to collect 18 supply centres between them and then have the AI transfer all its control back to the human; I don't think even the friendliest of AIs would willingly rebox itself like that.
This is exactly what I had in mind. :) It should be harder for FAI to win than for UFAI to win, since FAI are more constrained. I think it is quite plausible that one of the safety measures people would try to implement in a FAI is "Whatever else you do, don't kill us all; keep us alive and give us control over you in the long run. No apocalypse-then-utopia for you! We don't trust you that much, and besides we are selfish." Hence the FAI having to protect the supply centers of the human, and give over its own supply centers to the human eventually.
Why wouldn't it give over its supply centers to the human? It has to do that to win! I don't think it will hurt it too much, since it can make sure all the enemies are thoroughly trounced before beginning to cede supply centers.
Are you trying to reach lots of people and convince them AI takeover is a real threat?
In that case, you'd want to make a simple, intuitive browser/app game, maybe something like Pandemic 2.
(I don't know whether that game really made people more wary of pandemics, but it did so for me, and people do generalize from fictional evidence.)
This would be the ideal. Like I said though, I don't think I'll be able to make it anytime soon, or (honestly) anytime ever.
But yeah, I'm trying to design it to be simple enough to play in-browser or as an app, perhaps even as a Facebook game or something. It doesn't need good graphics or a detailed physics simulator, for example: it is essentially a board game in a computer, like Diplomacy or Risk (though it is more complicated than any board game could be).
I think that the game, as currently designed, would be an excellent source of fictional evidence for the notions of AI risk and AI arms races. Those notions are pretty important. :)
So a priori I'd have expected to dislike this post, because I believe (1) the utility monster concept is iffy and confuses more than it clarifies, and (2) my intuitions skew risk averse and/or negative utilitarian, in the sense that I'd rather not create new sapient beings just to use them as utility pumps. But I quite like it for some reason and I can't put my finger on why.
Maybe because it takes a dubious premise (the utility monster concept) and derives a conclusion (make utility monsters to feed them) that seems less incoherent to me than the usual conclusion derived from the premise (utility monsters are awful, for some reason, even though by assumption they generate huge amounts of utility, oh dear!)?
> (utility monsters are awful, for some reason, even though by assumption they generate huge amounts of utility, oh dear!)
Utility monsters are awful, possibly for no reason whatsoever. That's OK. Value is complex. Some things are just bad, not because they entail any bad thing but just because they themselves are bad.
> I don't like 'Safe AGI' because it seems to include AIs that are Unfriendly but too stupid to be dangerous, for example.
That's not something the average person will think upon hearing the term, especially since "AGI" tends to connote something very intelligent. I don't think it is a strong reason not to use it.
The examples given seem questionable even as applications of the heuristic. It is not clear to me that an ideal FAI should do those things, nor that applying the same principle to myself implies the things you say it does.
But I agree with your reason (2), and would also propose a third reason: some things that really are good ideas for ideal agents are very bad ideas for non-ideal agents. This also applies between agents with merely differing levels of imperfection: "I'm a trained professional. Don't try this at home".
Hmm, okay. I'd be interested to hear your thoughts on the particular cases then. Are there any examples that you would endorse?