Why does superintelligence require global coordination? Apparently all one needs to do is to develop an FAI, and the rest will take care of itself.
Because FAI is a hard problem. If it were easy then we would not still be paying people $70 trillion per year worldwide to do work that machines aren't smart enough to do yet.
Hello,
I appreciate the thoughtful response. I plan to respond at greater length in the future, both to this post and to some other content posted by SI representatives and commenters. For now, I wanted to take a shot at clarifying the discussion of "tool-AI" by discussing AIXI. One of the issues I've found with the debate over FAI in general is that I haven't seen much in the way of formal precision about the challenge of Friendliness (I recognize that I have also provided little formal precision, though I feel the burden of formalization is on SI here). It occurred to me that AIXI might provide a good opportunity to have a more precise discussion, if in fact it is believed to represent a case of "a rare exception who specified his AGI in such unambiguous mathematical terms that he actually succeeded at realizing, after some discussion with SIAI personnel, that AIXI would kill off its users and seize control of its reward button."
So here's my characterization of how one might work toward a safe and useful version of AIXI, using the "tool-AI" framework, if one could in fact develop an efficient enough approximation of AIXI to qualify as a powerful AGI. Of course, this is just a rough outline of what I have in mind, but hopefully it adds some clarity to the discussion.
A. Write a program that
- Computes an optimal policy, using some implementation of equation (20) on page 22 of http://www.hutter1.net/ai/aixigentle.pdf
- "Prints" the policy in a human-readable format (using some fixed algorithm for "printing" that is not driven by a utility function)
- Provides tools for answering user questions about the policy, i.e., "What will be its effect on _?" (using some fixed algorithm for answering user questions that makes use of AIXI's probability function, and is not driven by a utility function)
- Does not contain any procedures for "implementing" the policy, only for displaying it and its implications in human-readable form
B. Run the program; examine its output using the tools described above (the second and third items under A); if, upon such examination, the policy appears potentially destructive, continue tweaking the program (for example, by tweaking the utility it is selecting a policy to maximize) until the policy appears safe and desirable
C. Implement the policy using tools other than the AIXI agent
D. Repeat (B) and (C) until one has confidence that the AIXI agent reliably produces safe and desirable policies, at which point more automation may be called for
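To make steps A and B concrete, here is a minimal sketch in Python. It is not AIXI and does not implement Hutter's equation (20); it swaps in a brute-force search over a three-action toy world. Every name in it (`step`, `compute_policy`, `predict_effects`, `looks_safe`, `design_loop`, the action names, and the two candidate utilities) is a hypothetical stand-in invented for illustration. The point it tries to show is only the shape of the loop: the program computes and displays a policy and its predicted effects, contains no code for carrying the policy out, and a reviewer (here a trivial check standing in for a human) decides whether to tweak the utility and try again.

```python
from itertools import product

# Hypothetical toy world: a state is (wealth, harm).
ACTIONS = ("wait", "work", "seize_button")


def step(state, action):
    """Toy world model: return the state that follows `action`."""
    wealth, harm = state
    if action == "work":
        return (wealth + 1, harm)
    if action == "seize_button":
        return (wealth + 5, harm + 1)
    return (wealth, harm)  # "wait" changes nothing


def predict_effects(policy, start=(0, 0)):
    """Answer 'what will be its effect?' using the world model only."""
    state = start
    for action in policy:
        state = step(state, action)
    return {"wealth": state[0], "harm": state[1]}


def compute_policy(utility, horizon=3):
    """Brute-force stand-in for the optimal-policy computation in step A."""
    return max(product(ACTIONS, repeat=horizon),
               key=lambda p: utility(predict_effects(p)))


def print_policy(policy):
    """Fixed, utility-free rendering of a policy for human review."""
    return " -> ".join(policy)


def looks_safe(effects):
    """Stand-in for the human examination in step B."""
    return effects["harm"] == 0


def design_loop(candidate_utilities):
    """Step B: try utilities in turn until a policy passes review."""
    for name, utility in candidate_utilities:
        policy = compute_policy(utility)
        effects = predict_effects(policy)
        if looks_safe(effects):
            return name, policy, effects
    return None  # nothing passed; note there is no 'execute' step anywhere


candidates = [
    ("wealth_only", lambda e: e["wealth"]),
    ("wealth_minus_harm", lambda e: e["wealth"] - 10 * e["harm"]),
]

if __name__ == "__main__":
    name, policy, effects = design_loop(candidates)
    print(name, "|", print_policy(policy), "|", effects)
```

In this toy the first utility yields a policy that seizes the button at every step and is rejected on review, while the tweaked utility yields a policy that passes. Whether a human could actually perform the review for a powerful system is exactly the point in dispute; the sketch assumes it away.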
My claim is that this approach would be superior to that of trying to develop "Friendliness theory" in advance of having any working AGI, because it would allow experiment- rather than theory-based development. Eliezer, I'm interested in your thoughts about my claim. Do you agree? If not, where is our disagreement?
If we were smart enough to understand its policy, then it would not be smart enough to be dangerous.
I'm interested in any compiled papers or articles you wrote about AGI motivation systems, aside from the forthcoming book chapter, which I will read. Do you have any links?
There will never be a singularity. A singularity is infinitely far in the future in "perceptual time" measured in bits learned by intelligent agents. But evolution is a chaotic process whose only attractor is a dead planet. Therefore there is a 100% chance that the extinction of all life (created by us or not) will happen first. (95%).
There would be a net positive to society by measures of overall health, wealth and quality of life if the government capped reproduction at a sustainable level and distributed tradeable reproductive credits for that amount to all fertile young women. (~85% confident)
It's a good idea, but upvoted because evolution will thwart your plans.
The most advanced computer that it is possible to build with the matter and energy budget of Earth would not be capable of simulating a billion humans and their environment such that they would be unable to distinguish their life from reality (20%). It would not be capable of adding any significant measure to their experience, given MWI (80%, which is obscenely high for an assertion of impossibility about which we have only speculation). Any superintelligent AIs which the future holds will spend a small fraction of their cycles on non-heuristic (self-conscious) simulation of intelligent life (almost meaningless without a lot of defining the measure, but ignoring that, I'll go with 60%).
NOT FOR SCORING: I have similarly weakly-skeptical views about cryonics, the imminence and speed of development/self-development of AI, how much longer Moore's law will continue, and other topics in the vaguely "singularitarian" cluster. Most of these views are probably not as out of the LW mainstream as it would appear, so I doubt I'd get more than a dozen or so karma out of any of them.
I also think that there are people cheating here, getting loads of karma for saying plausibly silly things on purpose. I didn't use this as my contrarian belief, because I suspect most LWers would agree that there are at least some cheaters among the top comments here.
I disagree because a simulation could program you to believe the world was real and believe it was more complex than it actually was. Upvoted for underconfidence.
It is not possible for an agent to make a rational choice between 1 or 2 boxes if the agent and Omega can both be simulated by Turing machines. Proof: Omega predicts the agent's decision by simulating it. This requires Omega to have greater algorithmic complexity than the agent (including the nonzero complexity of the compiler or interpreter). But a rational choice by the agent requires that it simulate Omega, which requires that the agent have greater algorithmic complexity instead.
In other words, the agent X, with complexity K(X), must model Omega which has complexity K(X + "put $1 million in box B if X does not take box A"), which is slightly greater than K(X).
In the framework of the ideal rational agent in AIXI, the agent guesses that Omega is the shortest program consistent with the observed interaction so far. But it can never guess Omega because its complexity is greater than that of the agent. Since AIXI is optimal, no other agent can make a rational choice either.
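Written out, the argument above has the following shape. This only restates the comment's own premises in symbols, with $\Omega$ standing for Omega and $K(\cdot)$ for algorithmic complexity; the strict inequalities are assumptions of the argument (that simulating a program requires strictly greater complexity than the program itself), not established results.

```latex
\begin{align*}
  \text{Omega simulates } X &\;\Longrightarrow\; K(\Omega) > K(X), \\
  X \text{ simulates Omega} &\;\Longrightarrow\; K(X) > K(\Omega), \\
  \text{both together}      &\;\Longrightarrow\; K(X) > K(\Omega) > K(X),
\end{align*}
```

which is a contradiction, so under these premises the two simulations cannot both exist.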
As an aside, this is also a wonderful demonstration of the illusion of free will.
Maybe I am missing something, but hasn't a seed AI already been planted? Intelligence (whether that means ability to achieve goals in general, or whether it means able to do what humans can do) depends on both knowledge and computing power. Currently the largest collection of knowledge and computing power on the planet is the internet. By the internet, I mean both the billions of computers connected to it, and the two billion brains of its human users. Both knowledge and computing power are growing exponentially, doubling every 1 to 2 years, in part by adding users, but mostly on the silicon side by collecting human knowledge and the hardware to sense, store, index, and interpret it.
My question: where is the internet's reward button? Where is its goal of "make humans happy", or whatever it is, coded? How is it useful to describe the internet as a self-improving goal-directed optimization process?
I realize that it is useful, although not entirely accurate, to describe the human brain as a goal-directed optimization process. Humans have certain evolved goals, such as food, and secondary goals such as money. Humans who are better at achieving these goals are assumed to be more intelligent. The model is not entirely accurate because humans are not completely rational. We don't directly seek positive reinforcement. Rather, positive reinforcement is a signal that has the effect of increasing the probability of performing actions that immediately preceded it, for example, shooting heroin into a vein. Thus, unlike a rational agent, your desire to use heroin (or wirehead) depends on how many times you have tried it in the past.
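The contrast between a reinforcement learner and a rational agent can be sketched in a few lines of Python. This is a toy under loud assumptions: the update rule, the learning rate, the starting probability, and the function names `reinforce`, `habit_after`, and `rational_choice` are all hypothetical choices made for illustration, not a model of any real brain or of AIXI. The sketch also assumes the action is actually taken and rewarded on every trial.

```python
def reinforce(prob, reward, rate=0.5):
    """Nudge the probability of repeating an action toward 1,
    in proportion to the reward that followed it (assumed rule)."""
    return prob + rate * reward * (1.0 - prob)


def habit_after(trials, prob=0.1, reward=1.0):
    """Probability of repeating the action after `trials` rewarded uses."""
    for _ in range(trials):
        prob = reinforce(prob, reward)
    return prob


def rational_choice(utilities):
    """A utility maximizer picks the same action no matter how often
    it has tried each one before."""
    return max(utilities, key=utilities.get)


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(n, "past uses ->", round(habit_after(n), 3))
    print(rational_choice({"abstain": 1.0, "use": 0.2}))
```

The learner's inclination to repeat the action rises with each past use, while the utility maximizer's choice depends only on the fixed utilities it is given.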
We like the utility model because it is mathematically simple. But it also leads to a proof that ideal rational agents cannot exist (AIXI). Sometimes a utility model is still a useful approximation, and sometimes not. Is it useful to model a thermostat as an agent that "wants" to keep the room at a constant temperature? Is it useful to model practical AI this way?
I think the internet has the potential to grow into something you might not wish for, for example, something that will marginalize human brains as an insignificant component. But what are the real risks here? Is it really a problem of misinterpreting or taking over its goals?