Jonii comments on What if AI doesn't quite go FOOM? - Less Wrong
Comments (186)
This AI would prevent anyone, including SIAI, from developing any sort of an AI.
I share this confusion. The only reasonable interpretation I can see is that Mass Driver prefers an AI that ensures nobody programs another AI, ever.
Otherwise you'd have to build the AI to tell Friendly AIs in development from Unfriendly ones, which appears to be tantamount to programming Friendliness in the first place.
I suppose there are situations in which we might choose to take that option. Most obviously, if we have the tech to build such an AI, don't have the tech to build a full AI with CEV capabilities, and also know that some other fool is two months away from releasing a happy face maximizer. The options are then to kill the foolish AI developers or to release the anti-AI and accept an eternity of mediocrity.
Letting the happy face AI run is arguably the right decision in this situation, if stopping the programmers is absolutely impossible for some reason (although I doubt very much that the "no AI AI" is a possible construction; compare this with a "do nothing AI" - how do you specify its goals, and have it optimize the world, in such a way that nothing happens?).
I don't agree. I don't want to die. I would prefer to live in a world that relied on non-AI technology. I hope you and Rolf do not get in my way when I do what needs to be done.
And I want to live in a world that has maximal benefits for the largest average group of stakeholders, not just a group of elites like we have now. Unfortunately, non-AI-based governing systems are run by humans, and history shows that fair systems are unstable and eventually become usurped by those who place their own interests ahead of the rest of the population. Will an AI system be better than that? I don't know. Historically, both benevolent dictatorships and republics are reasonable systems, but the benevolent dictator eventually dies and republics ALWAYS transform into rule by the elite.
What I want is a long-lived benevolent dictator, whether it's a transhuman or an AI, but frankly I'd trust a benevolent AI ahead of an allegedly benevolent transhuman. Hell, I'm not sure that I'd even trust myself to be a 100% fair benevolent transhuman dictator, and I'm pretty reasonable. Power corrupts and all that.
This is a question of fact, not a fight between different preferences. I'm not certain either way, so I don't argue that UFAI is definitely the right choice, but that the opposite is not obviously the right choice. You should give the arguments (which at least Wei Dai, Carl Shulman and I take seriously) some consideration, irrespective of how absurd the conclusion sounds. There seems to be a deep similarity in the structure of disagreement between this idea and cryonics.
I am disagreeing on the question of fact. What we can do without an FAI is by far superior to any scraps we can expect a smiley-face maximiser to contribute due to exchanges. The greatest of the existential risks that not having an FAI entails is the threat of a uFAI. An anti-AI removes that. We do have some potential for survival based on other technologies within our grasp. SIAI would have to devote itself to solving other hard problems.
Wei mentioned a combinatorial explosion. He may have been applying it somewhat differently than I am, but I am claiming that an overwhelming number of the possible mind designs that Smiley is bargaining with are also bad for me. He is bargaining with a whole lot of Clippy's brothers and sisters. He is bargaining with a whole lot of released GAIs that care primarily about their own propagation. Even more importantly, the small proportion of FAIs that do exist are not friendly to things I care about. Almost none of them will result in me personally being alive.
This all assumes that the bargaining does in fact go ahead. I'm not certain either way, nor am I certain that in the specific case of Smiley one of his optimal trading partners will be an FAI which I happen to like.
All this means that I am comfortable with the assertion you quote. If you or Rolf did try to stop me from pressing that no-AI button then you would just be obstacles that needed to be eliminated, even if your motives are pure. My life and all that I hold dear is at stake!
I think that makes some sense. It's not clear to me that building a smiley-face maximizer that trades with AIs in other possible worlds would be better than having a no-AI future.
There is another possibility to consider though. Both we and the smiley-face maximizer would be better off if we did allow it to be built, and then it gives our preferences some control (enough for us to be better off than the no-AI future). It's not clear that this opportunity for trade can be realized, but we should spend some time thinking about it before ruling it out.
It seems like we really need a theory of games that tells us (human beings) how to play games with superintelligences. We can't depend on our FAIs to play the games for us, because we have to decide now what to do, including the above example, and also what kind of FAI to build.
Sounds like Drescher's bounded Newcomb. This perspective suddenly painted it FAI-complete.
Can you please elaborate? I looked up "FAI-complete", and found this but I still don't get your point.
That FAI is good for you is a property of the term "FAI". If it doesn't create value for you, it's not FAI, but something like Smileys and Paperclippers: a potential trade partner, but not your guy. Let's keep it simple.
"Friendly to their Creator AI", choose an acronym. Perhaps F<BabyEater>AI. Across the multiverse most civilizations that engage in successful AI efforts will produce an AI that is not friendly to me. AIs that are actually FAIs (which include by definition my own survival) are negligible.
Releasing a Smiley will make me die and destroy everything I care about. I will kill anyone who stops me preventing that disaster. That is as simple as I can make it.
Formal preference is a more abstract concept than survival in particular, and even though all else equal, in usual situations, survival is preferable to non-survival, there could be situations even better than "survival". It's not out of the question "by definition" (you know better than to invoke this argument pattern).
Formal preference is one particular thing. You can't specify additional details without changing the concept. If preference says that "survival" is a necessary component, that worlds without "survival" are equally worthless, then so be it. But it could say otherwise. You can't study something and already know the answer, you can't just assume to know that this property that intuitively appeals to you is unquestionably present. How do you know? I'd rather build on clear foundation, and remain in doubt about what I can't yet see.
Negligible, non-negligible, that's what the word means. It talks about specifically working for your preference, because of what the AI values and not because it needs to do so for trade. FAI could be impossible, for example; that doesn't change the concept. BabyEater's AI could be a UFAI, or it could be an FAI, depending on how well it serves your preference. It could turn out to be an FAI, if the sympathy aspect of their preference is strong enough to dole you a fair part of the world, more than you own by pure game-theoretic control.
FAI doesn't imply full control given to your preference (for example, here on Earth we have many people with at least somewhat different preferences, and all control likely won't be given to any single person). The term distinguishes AIs that optimize for you because of their own preference (and thus generate powerful control in the mathematical universe for your values, to a much more significant extent than you can do yourself), from AIs that optimize for you because of control pressure (in other terms, trade opportunity) from another AI (which is the case for "UFAI").
(I'm trying to factor the discussion into the more independent topics to not lose track of the structure of the argument.)
Please don't derail a civilized course of discussion, this makes clear communication more expensive in effort. This particular point was about a convention for using a word, and not about that other point you started talking sarcastically about here.
Also, speculating on the consequences of a conclusion (like the implication from it being correct to not release the UFAI, to you therefore having to destroy everything that stands in the way of preventing that event, an implication with which I more or less agree, if you don't forget to take into account the moral value of said murders) is not helpful in the course of arguing about which conclusion is the correct one.
Sadly, Vladimir, this failure to understand stakeholder theory is endemic in AI discussions. Friendly AI cannot possibly be defined as being "if it doesn't create value for you it's not FAI" because value is arbitrary. There are some people who want to die and others who want to live, being the stark example. Everyone being killed is thus value for some and not value for others, and vice versa.
What we end up with is having to define friendly as being "creating value for the largest possible number of human stakeholders even if some of them lose".
For example, someone who derives value from ordering people around or having everyone else be their personal slaves, such as Caligula or the ex-dictator Gaddafi, doesn't (didn't...) see value in self-rule for the people and thus fought hard to maintain the status quo, murdering many people in the process.
In any scenario whereby you consider the wants of those who seek most of the world's resources or domination over others, you're going to end up with an impossible conundrum for any putative FAI.
So given that scenario, what is really in all of our best interests if some of us aren't going to get what we want and there is only one Earth?
One answer I've seen is that the AI will create as many worlds as necessary in order to accommodate everyone's desires in a reasonably satisfactory fashion. So, Gaddafi will get a world of his own, populated by all the people who (for some reason) enjoy being oppressed. If an insufficient number of such people exist, the FAI will create a sufficient number of non-sentient bots to fill out the population.
The AI can do all this because, as a direct consequence of its ability to make itself smarter exponentially, it will quickly acquire quasi-godlike powers, by, er, using some kind of nanotechnology or something.
I understand that you don't want to die or lose the future, and I understand the ingrained thought that UFAI = total loss, but please try to look past that, consider that you may be wrong, see that being willing to 'eliminate' your allies over factual disagreements loses, and cooperate in the iterated epistemic prisoner's dilemma with your epistemic peers. You seem to be pretty obviously coming at this question from a highly emotional position, and should try to deal with that before arguing the object level.
That it's far superior is not obvious, both because it's not obvious how well we could reasonably expect to do without FAI (How likely would we be to successfully construct a singleton locking in our values? How efficiently could we use resources? Would the anti-AI interfere with human intelligence enhancement or uploading, either of which seems like it would destroy huge amounts of value?), and because our notional utility function might see steeply diminishing marginal returns to resources before using the entire future light cone (see this discussion).
I am, or at least was, considering the facts, including what was supplied in the links. I was also assuming for the sake of the argument that the kind of agent that the incompetent AI developers created would recursively improve to one that cooperated without communication with other universes.
Discussing the effects and implications of decisions in counterfactuals is not something that is at all emotional for me. It fascinates me. On the other hand the natural conclusion to counterfactuals (which are inevitably discussing extreme situations) is something that does seem to inspire emotional judgments, which is something that overrides my fascination.