FAI Research Constraints and AGI Side Effects
Ozzie Gooen and Justin Shovelain
Summary
Friendly artificial intelligence (FAI) researchers face at least two significant challenges. First, they must produce a significant amount of FAI research in a short amount of time. Second, they must do so without producing enough artificial general intelligence (AGI) research to result in the creation of an unfriendly artificial intelligence (UFAI). We estimate the requirements of both of these challenges using two simple models.
Our first model describes a friendliness ratio and a leakage ratio for FAI research projects. These provide limits on the allowable amount of AGI knowledge produced per unit of FAI knowledge in order for a project to be net beneficial.
Our second model studies a hypothetical FAI venture, which is responsible for ensuring FAI creation. We estimate the total FAI research the venture must produce per year and the leakage ratio of that research. This model demonstrates a trade-off between the speed of FAI research and the proportion of AGI research that can be revealed as part of it. If FAI research takes too long, then the acceptable leakage ratio may become so low that it would become nearly impossible to safely produce any new research.
Sequential Organization of Thinking: "Six Thinking Hats"
Many people move chaotically from thought to thought without explicit structure. Inappropriate structuring may leave blind spots or cause the gears of thought to grind to a halt, but the advantages of appropriate structuring are immense:
- Correct thought structuring ensures that you examine all relevant facets of an issue, idea, or fact.
- It ensures you know what to do next at every stage and are not frustrated or crippled by akrasia between moments of choice; the next action is always obvious.
- It minimizes the overhead of task switching: you are in control and do not dither between possibilities.
- It may be used in a social context so that potentially challenging issues and thoughts may be brought up in a non-threatening manner (let's look at the positive aspects, now let's focus purely on the negative...).
To illustrate thought structuring, I use the example of Edward de Bono's "six thinking hats" mnemonic. With this method you metaphorically put on various colored "hats" (perspectives) and switch "hats" depending on the task. I will use the somewhat controversial issue of cryonics as my running example.1
Meetup: Bay Area: Sunday, March 7th, 7pm
Eliezer Yudkowsky, Alicorn, and Michael Vassar will be present.
Some extra guests (Wei Dai, Stuart Armstrong, and Nick Tarleton) will also be there, following our short Decision Theory mini-workshop.
Intuitive supergoal uncertainty
There is a common intuition that our most fundamental goals may be uncertain in some sense. What causes this intuition? For this topic I need to be able to pick out one’s top-level goals (roughly, one’s context-insensitive utility function rather than some task-specific utility function) without implying that those top-level goals can be interpreted in the form of a utility function. Following Eliezer’s CFAI paper, I thus choose the word “supergoal” (sorry Eliezer, but I am fond of that old document and its tendency to coin new vocabulary). In what follows, I will naturalistically explore the intuition of supergoal uncertainty.
To posit a model: goal uncertainty (with supergoal uncertainty as an instance) means that you have a weighted distribution over a set of possible goals and a mechanism by which that weight may be redistributed. If we take away the distribution of weights, how can we choose actions coherently? How can we compare? If we take away the weight redistribution mechanism, we end up with a single goal whose state utilities may be defined as the weighted sum of the constituent goals’ utilities. The weight redistribution mechanism is thus necessary for goal uncertainty to be a distinct concept.
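The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the post's own formalism: the goal names, the utility values, and the particular redistribution rule (multiply each weight by a factor and renormalize) are all hypothetical choices made for the example.

```python
from typing import Callable, Dict

# A goal maps a world state (here just a label) to a utility.
Goal = Callable[[str], float]

def combined_utility(weights: Dict[str, float],
                     goals: Dict[str, Goal],
                     state: str) -> float:
    """Weighted sum of the constituent goals' utilities for one state."""
    return sum(weights[name] * goals[name](state) for name in goals)

def redistribute(weights: Dict[str, float],
                 factors: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical redistribution mechanism (an assumption, not from the post):
    scale each weight by a factor, then renormalize so the weights sum to 1."""
    scaled = {name: weights[name] * factors[name] for name in weights}
    total = sum(scaled.values())
    return {name: w / total for name, w in scaled.items()}

# Hypothetical goals and starting weights.
goals: Dict[str, Goal] = {
    "comfort": lambda s: 1.0 if s == "rest" else 0.0,
    "achievement": lambda s: 1.0 if s == "work" else 0.0,
}
weights = {"comfort": 0.5, "achievement": 0.5}

before = combined_utility(weights, goals, "work")

# Some experience shifts weight toward "achievement".
weights_after = redistribute(weights, {"comfort": 1.0, "achievement": 3.0})
after = combined_utility(weights_after, goals, "work")

print(before, after)
```

With the weights held fixed, `combined_utility` behaves exactly like a single goal, which is the collapse the paragraph describes. Only the call to `redistribute` makes the valuation of the same state change over time.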
Minneapolis Meetup: Survey of interest
Frank Adamek and I are going to host a Less Wrong/Overcoming Bias meetup, tentatively on Saturday, September 26 at 3pm in Coffman Memorial Union at the University of Minnesota (there is a coffee shop and a food court there). Frank is the president of the University of Minnesota transhumanist group, and some of its members may also attend. We'd like to gauge the level of interest, so please comment if you'd be likely to attend.
(P.S. If you have any time conflicts or would like to suggest a better venue, please comment.)
Causes of disagreements
You have a disagreement before you. How do you handle it?
Causes of fake disagreements:
Is the disagreement real? The trivial case is an apparent disagreement occurring over a noisy or low-information channel. Internet chat is especially liable to fail this way because of the lack of tone, body language, and relative location cues. People can also disagree through the use of differing definitions with corresponding denotations and connotations. Fortunately, once recognized, this cause of disagreement rarely produces problems; the topic at issue is rarely the definitions themselves. If there is a game-theoretic reason, the agents may also give the appearance of disagreement even though they might well agree in private. The agents could also disagree if they are victims of a man-in-the-middle attack, where someone is intercepting and altering the messages passed between the two parties. Finally, the agents could disagree simply because they are in different contexts. "Is the sun yellow?" I ask. "Yes," say you. "No," say the aliens at Eta Carinae.
Causes of disagreements about predictions:
Evidence
Assuming the disagreement is real, what does that give us? Most commonly the disagreement is about the facts that predicate our actions. To handle these we must first consider our relationship to the other person and how they think (a la superrationality); observations made by others may not be given the same weight we would give those observations if we had made them ourselves. After considering this, we must then merge their evidence with our own in a controlled way. With people this gets a bit tricky. Rarely do people give us information we can handle in a cleanly Bayesian way (a la Aumann's agreement theorem). Instead we must merge our explicit evidence sets along with vague, abstracted probabilistic intuitions that are half speculation and half partially forgotten memories.