More intuitive explanations!

22 XiXiDu 06 January 2012 06:10PM

The post on two easy-to-grasp explanations of Gödel's theorem and the Banach-Tarski paradox made me think of other explanations that I found easy or insightful and that I could share as well.

1) Here is a nice proof of the Pythagorean theorem:

2) An easy and concise explanation of expected utility calculations by Luke Muehlhauser:

Decision theory is about choosing among possible actions based on how much you desire the possible outcomes of those actions.

How does this work? We can describe what you want with something called a utility function, which assigns a number that expresses how much you desire each possible outcome (or “description of an entire possible future”). Perhaps a single scoop of ice cream has 40 “utils” for you, the death of your daughter has -⁠274,000 utils for you, and so on. This numerical representation of everything you care about is your utility function.

We can combine your probabilistic beliefs and your utility function to calculate the expected utility for any action under consideration. The expected utility of an action is the average utility of the action’s possible outcomes, weighted by the probability that each outcome occurs.

Suppose you’re walking along a freeway with your young daughter. You see an ice cream stand across the freeway, but you recently injured your leg and wouldn’t be able to move quickly across the freeway. Given what you know, if you send your daughter across the freeway to get you some ice cream, there’s a 60% chance you’ll get some ice cream, a 5% chance your child will be killed by speeding cars, and other probabilities for other outcomes.

To calculate the expected utility of sending your daughter across the freeway for ice cream, we multiply the utility of the first outcome by its probability: 0.6 × 40 = 24. Then, we add to this the product of the next outcome’s utility and its probability: 24 + (0.05 × -⁠274,000) = -⁠13,676. And suppose the sum of the products of the utilities and probabilities for other possible outcomes was 0. The expected utility of sending your daughter across the freeway for ice cream is thus very low (as we would expect from common sense). You should probably take one of the other actions available to you, for example the action of not sending your daughter across the freeway for ice cream — or, some action with even higher expected utility.

A rational agent aims to maximize its expected utility, because an agent that does so will on average get the most possible of what it wants, given its beliefs and desires.
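The arithmetic in the freeway example can be reproduced in a few lines. This is only a sketch: the probabilities and utilities are the ones quoted above, the remaining outcomes are assumed to sum to 0 as the passage stipulates, and the "stay put" action with utility 0 is a hypothetical alternative added here purely for comparison.

```python
from fractions import Fraction


def expected_utility(outcomes):
    """Sum of probability * utility over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


# Figures from the freeway example. All other outcomes are assumed to
# contribute 0 in total, as the quoted passage stipulates.
send_daughter = [
    (Fraction(60, 100), 40),        # you get some ice cream
    (Fraction(5, 100), -274_000),   # your daughter is killed
]

# Hypothetical alternative (not from the passage): nothing happens.
stay_put = [(Fraction(1), 0)]

eu_send = expected_utility(send_daughter)
eu_stay = expected_utility(stay_put)

# An expected utility maximizer picks the action with the larger value.
best = max([("send", eu_send), ("stay", eu_stay)], key=lambda t: t[1])[0]

print(eu_send)  # -13676
print(eu_stay)  # 0
print(best)     # stay
```

Exact fractions are used instead of floats so that the result matches the figure in the text without rounding noise.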

3) Micro- and macroevolution visualized.

4) Slopes of Perpendicular Lines.

5) Proof of Euler's formula using power series expansions.

6) Proof of the Chain Rule.

7) Multiplying Negatives Makes A Positive.

8) Completing the Square and Derivation of Quadratic Formula.

9) Quadratic factorization.

10) Remainder Theorem and Factor Theorem.

11) Combinations with repetitions.

12) Löb's theorem.


 

Explained: Gödel's theorem and the Banach-Tarski Paradox

10 XiXiDu 06 January 2012 05:23PM

I want to share the following explanations, which I came across recently and enjoyed very much. I can't tell whether they come close to conveying the original concepts, and I suspect they don't, but they are so easy to grasp that they are worth the time if you haven't already studied the extended formal versions of those concepts. In other words, by reading the following explanations your grasp of the matter will be less wrong than before but not necessarily correct.

World's shortest explanation of Gödel's theorem

by Raymond Smullyan, '5000 BC and Other Philosophical Fantasies' via Mark Dominus (ask me for the PDF of the book)

We have some sort of machine that prints out statements in some sort of language. It needn't be a statement-printing machine exactly; it could be some sort of technique for taking statements and deciding if they are true. But let's think of it as a machine that prints out statements.

In particular, some of the statements that the machine might (or might not) print look like these:

P*x (which means that the machine will print x)
NP*x (which means that the machine will never print x)
PR*x (which means that the machine will print xx)
NPR*x (which means that the machine will never print xx)

For example, NPR*FOO means that the machine will never print FOOFOO. NP*FOOFOO means the same thing. So far, so good.

Now, let's consider the statement NPR*NPR*. This statement asserts that the machine will never print NPR*NPR*.

Either the machine prints NPR*NPR*, or it never prints NPR*NPR*.

If the machine prints NPR*NPR*, it has printed a false statement. But if the machine never prints NPR*NPR*, then NPR*NPR* is a true statement that the machine never prints.

So either the machine sometimes prints false statements, or there are true statements that it never prints.

So any machine that prints only true statements must fail to print some true statements.

Or conversely, any machine that prints every possible true statement must print some false statements too.
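The two cases of the argument can be checked mechanically. The sketch below is an illustration under a strong simplifying assumption: the machine is modeled as nothing more than the set of strings it ever prints (the `printed` argument), and `is_true` is a hypothetical helper that reads the four statement forms listed above.

```python
def is_true(statement, printed):
    """Decide whether a statement about the machine is true.

    `printed` is the set of all strings the machine ever prints
    (a modeling assumption made for this sketch).
    """
    forms = [
        # (prefix, refers to xx instead of x, asserts "never prints")
        ("NPR*", True, True),
        ("NP*", False, True),
        ("PR*", True, False),
        ("P*", False, False),
    ]
    for prefix, doubled, negated in forms:
        if statement.startswith(prefix):
            x = statement[len(prefix):]
            target = x + x if doubled else x
            will_print = target in printed
            return (not will_print) if negated else will_print
    raise ValueError(f"not a statement about the machine: {statement!r}")


G = "NPR*NPR*"

# NPR*FOO and NP*FOOFOO say the same thing, whatever the machine prints.
for printed in (set(), {"FOOFOO"}):
    assert is_true("NPR*FOO", printed) == is_true("NP*FOOFOO", printed)

# Case 1: the machine prints G, so G is false and a falsehood was printed.
print(is_true(G, printed={G}))    # False

# Case 2: the machine never prints G, so G is true but never printed.
print(is_true(G, printed=set()))  # True
```

Whatever else is in `printed`, the truth of `NPR*NPR*` depends only on whether that very string is a member, which is the whole trick.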

Mark Dominus further writes,

The proof of Gödel's theorem shows that there are statements of pure arithmetic that essentially express NPR*NPR*; the trick is to find some way to express NPR*NPR* as a statement about arithmetic, and most of the technical details (and cleverness!) of Gödel's theorem are concerned with this trick. But once the trick is done, the argument can be applied to any machine or other method for producing statements about arithmetic.

The conclusion then translates directly: any machine or method that produces statements about arithmetic either sometimes produces false statements, or else there are true statements about arithmetic that it never produces. Because if it produces something like NPR*NPR* then it is wrong, but if it fails to produce NPR*NPR*, then that is a true statement that it has failed to produce.

So any machine or other method that produces only true statements about arithmetic must fail to produce some true statements.

The Banach-Tarski Paradox

by MarkCC

Suppose you have a sphere. You can take that sphere, and slice it into a finite number of pieces. Then you can take those pieces, and re-assemble them so that, without any gaps, you now have two spheres of the exact same size as the original.

[...]

How about this? Take the set of all natural numbers. Divide it into two sets: the set of even naturals, and the set of odd naturals. Now you have two infinite sets, the set {0, 2, 4, 6, 8, ...}, and the set {1, 3, 5, 7, 9, ...}. The size of both of those sets is ω - which is also the size of the original set you started with.

Now take the set of even numbers, and map it so that for any given value i, f(i) = i/2. Now you've got a copy of the set of natural numbers. Take the set of odd naturals, and map them with g(i) = (i-1)/2. Now you've got a second copy of the set of natural numbers. So you've created two identical copies of the set of natural numbers out of the original set of natural numbers.
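The two maps can be tried out on a finite prefix of the naturals. This is only an illustration: a program cannot hold an infinite set, so the prefix length `N` is an arbitrary choice made for this sketch, and each half of the prefix yields a copy of the first `N/2` naturals rather than of the whole set.

```python
N = 20  # length of the finite prefix we inspect (arbitrary, assumed even)

naturals = list(range(N))
evens = [i for i in naturals if i % 2 == 0]
odds = [i for i in naturals if i % 2 == 1]


def f(i):
    """Map an even natural to i/2."""
    return i // 2


def g(i):
    """Map an odd natural to (i-1)/2."""
    return (i - 1) // 2


copy_from_evens = [f(i) for i in evens]
copy_from_odds = [g(i) for i in odds]

print(copy_from_evens)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(copy_from_odds)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In the finite case each copy is half as long as the original list. Only for the infinite set do both copies have the same size as what you started with, which is where the apparent paradox comes from.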

[...] math doesn't have to follow conservation of mass [...]. A sphere doesn't have a mass. It's just an uncountably infinite set of points with a particular collection of topological relationships and geometric relationships.

Intuition and Mathematics

5 XiXiDu 31 December 2011 06:58PM

While reading the answer to the question 'What is it like to have an understanding of very advanced mathematics?' I became curious about the value of intuition in mathematics and why it might be useful.

It usually seems to be a bad idea to try to solve problems intuitively or use our intuition as evidence to judge issues that our evolutionary ancestors never encountered and therefore were never optimized to judge by natural selection.

And so it seems to be especially strange to suggest that intuition might be a good tool to make mathematical conjectures. Yet people like Fields Medalist Terence Tao seem to believe that intuition should not be disregarded when doing mathematics,

...“fuzzier” or “intuitive” thinking (such as heuristic reasoning, judicious extrapolation from examples, or analogies with other contexts such as physics) gets deprecated as “non-rigorous”. All too often, one ends up discarding one’s initial intuition and is only able to process mathematics at a formal level, thus getting stalled at the second stage of one’s mathematical education.

The point of rigour is not to destroy all intuition; instead, it should be used to destroy bad intuition while clarifying and elevating good intuition. It is only with a combination of both rigorous formalism and good intuition that one can tackle complex mathematical problems;

The author mentioned at the beginning also makes the case that intuition is an important tool,

You are often confident that something is true long before you have an airtight proof for it (this happens especially often in geometry). The main reason is that you have a large catalogue of connections between concepts, and you can quickly intuit that if X were to be false, that would create tensions with other things you know to be true, so you are inclined to believe X is probably true to maintain the harmony of the conceptual space. It's not so much that you can imagine the situation perfectly, but you can quickly imagine many other things that are logically connected to it.

But what do those people mean when they talk about 'intuition'? What exactly is its advantage? The author hints at an answer,

You go up in abstraction, "higher and higher". The main object of study yesterday becomes just an example or a tiny part of what you are considering today. For example, in calculus classes you think about functions or curves. In functional analysis or algebraic geometry, you think of spaces whose points are functions or curves -- that is, you "zoom out" so that every function is just a point in a space, surrounded by many other "nearby" functions. Using this kind of zooming out technique, you can say very complex things in short sentences -- things that, if unpacked and said at the zoomed-in level, would take up pages. Abstracting and compressing in this way allows you to consider extremely complicated issues while using your limited memory and processing power.

At this point I was reminded of something Scott Aaronson wrote in his essay 'Why Philosophers Should Care About Computational Complexity',

...even if computers were better than humans at factoring large numbers or at solving randomly-generated Sudoku puzzles, humans might still be better at search problems with “higher-level structure” or “semantics,” such as proving Fermat’s Last Theorem or (ironically) designing faster computer algorithms. Indeed, even in limited domains such as puzzle-solving, while computers can examine solutions millions of times faster, humans (for now) are vastly better at noticing global patterns or symmetries in the puzzle that make a solution either trivial or impossible. As an amusing example, consider the Pigeonhole Principle, which says that n+1 pigeons can’t be placed into n holes, with at most one pigeon per hole. It’s not hard to construct a propositional Boolean formula Φ that encodes the Pigeonhole Principle for some fixed value of n (say, 1000). However, if you then feed Φ to current Boolean satisfiability algorithms, they’ll assiduously set to work trying out possibilities: “let’s see, if I put this pigeon here, and that one there ... darn, it still doesn’t work!” And they’ll continue trying out possibilities for an exponential number of steps, oblivious to the “global” reason why the goal can never be achieved. Indeed, beginning in the 1980s, the field of proof complexity—a close cousin of computational complexity—has been able to show that large classes of algorithms require exponential time to prove the Pigeonhole Principle and similar propositional tautologies.
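The blind search described in the quote can be imitated with a toy brute-force loop. To be clear about the assumptions: this is not how a real Boolean satisfiability solver works, and `search_pigeonhole` is a hypothetical helper written for this sketch. It only shows how quickly the number of candidate placements grows when the "global" counting argument is ignored.

```python
from itertools import product


def search_pigeonhole(n):
    """Try every placement of n+1 pigeons into n holes.

    Returns (found, tried): whether some placement puts every pigeon
    in its own hole, and how many placements were examined.
    """
    pigeons = n + 1
    tried = 0
    for assignment in product(range(n), repeat=pigeons):
        tried += 1
        # A valid placement uses a different hole for every pigeon.
        if len(set(assignment)) == pigeons:
            return True, tried
    return False, tried


for n in range(1, 6):
    found, tried = search_pigeonhole(n)
    # The search always fails, after examining n ** (n + 1) placements.
    print(n, found, tried)
```

A human sees at once that n+1 distinct holes cannot be chosen from n, while the loop has to exhaust all n^(n+1) placements before it can report failure.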

Again back to the answer on 'what it is like to have an understanding of very advanced mathematics'. The author writes,

...you are good at modularizing a conceptual space and taking certain calculations or arguments you don't understand as "black boxes" and considering their implications anyway. You can sometimes make statements you know are true and have good intuition for, without understanding all the details. You can often detect where the delicate or interesting part of something is based on only a very high-level explanation.

Humans are good at 'zooming out' to detect global patterns. Humans can jump conceptual gaps by treating them as "black boxes". 

Intuition is a conceptual bird's-eye view that allows humans to draw inferences from high-level abstractions without having to systematically trace out each step. Intuition is a wormhole. Intuition allows us to get from here to there given limited computational resources.

If true, it also explains many of our shortcomings and biases. Intuition's greatest feature is also our biggest flaw.

The introduction of suitable abstractions is our only mental aid to organize and master complexity. — Edsger W. Dijkstra

Our computational limitations make it necessary to take shortcuts and view the world as a simplified model. That heuristic is naturally prone to error and introduces biases. We draw connections without establishing them systematically. We recognize patterns in random noise.

Many of our biases can be seen as a side effect of making judgments under computational restrictions: a trade-off between optimization power and resource use.

Is it possible to correct for the shortcomings of intuition other than by refining rationality and becoming aware of our biases? That depends on how optimization power scales with resources and whether there are more efficient algorithms that work under limited resources.

Should we discount extraordinary implications?

9 XiXiDu 29 December 2011 02:51PM

(Spawned by an exchange between Louie Helm and Holden Karnofsky.)

tl;dr:

The field of formal rationality is relatively new and I believe that we would be well-advised to discount some of its logical implications that advocate extraordinary actions.

Our current methods might turn out to be biased in new and unexpected ways. Pascal's mugging, the Lifespan Dilemma, blackmailing and the wrath of Löb's theorem are just a few examples of how an agent built according to our current understanding of rationality could fail.

Bayes’ Theorem, the expected utility formula, and Solomonoff induction are all reasonable heuristics. Yet those theories are not enough to build an agent that will be reliable in helping us to achieve our values, even if those values were thoroughly defined.

If we wouldn't trust a superhuman agent equipped with our current grasp of rationality to be reliable in extrapolating our volition, how can we trust ourselves to arrive at correct answers given what we know?

We should of course continue to use our best methods to decide what to do. But I believe that we should also draw a line somewhere when it comes to extraordinary implications.

Intuition, Rationality and Extraordinary Implications

It doesn't feel to me like 3^^^^3 lives are really at stake, even at very tiny probability.  I'd sooner question my grasp of "rationality" than give five dollars to a Pascal's Mugger because I thought it was "rational". — Eliezer Yudkowsky

Holden Karnofsky is suggesting that in some cases we should follow the simple rule that "extraordinary claims require extraordinary evidence".

I think that we should sometimes demand particular proof P; and if proof P is not available, then we should discount seemingly absurd or undesirable consequences even if our theories disagree.

I am not referring to the weirdness of the conclusions but to the foreseeable scope of the consequences of being wrong about them. We should be careful in using the implied scope of certain conclusions to outweigh their low probability. I feel we should put more weight on the consequences of our conclusions being wrong than on their being right.

As an example take the idea of quantum suicide and assume it would make sense under certain circumstances. I wouldn’t commit quantum suicide even given a high confidence in the many-worlds interpretation of quantum mechanics being true. Logical implications just don’t seem enough in some cases.

To be clear, extrapolations work and often are the best we can do. But since there are problems such as the above, that we perceive to be undesirable and that lead to absurd actions and their consequences, I think it is reasonable to ask for some upper and lower bounds regarding the use and scope of certain heuristics.

We are not going to stop pursuing whatever terminal goal we have chosen just because someone promises us even more utility if we do what that person wants. We are not going to stop loving our girlfriend just because there are other people who do not approve of our relationship and who together would experience more happiness if we separated than the combined happiness of us and our girlfriend being in love. Therefore we have already informally established some upper and lower bounds.

I have read about people who became very disturbed and depressed by taking ideas too seriously. That way madness lies, and I am not willing to choose that path yet.

Maybe I am simply biased and have been unable to overcome it yet. But my best guess right now is that we simply have to draw a lot of arbitrary lines and arbitrarily refuse some steps.

Taking into account considerations of vast utility or low probability quickly leads to chaos-theoretic considerations like the butterfly effect. As a computationally bounded and psychologically unstable agent I am unable to cope with that. Consequently I see no other way than to neglect the moral impossibility of extreme uncertainty.

Until the problems are resolved, or rationality is sufficiently established, I will continue to put vastly more weight on empirical evidence and my intuition than on logical implications, if only because I still lack the necessary educational background to trust my comprehension and judgement of the various underlying concepts and methods used to arrive at those implications.

Expected Utility Maximization and Complex Values

One problem with my current grasp of rationality that I perceive to be unacknowledged is the consequences of expected utility maximization with respect to human nature and our complex values.

I am still genuinely confused about what a person should do. I don't even know how much sense that concept makes. Does expected utility maximization have anything to do with being human?

Those people who take existential risks seriously and who are currently involved in their mitigation seem to be disregarding many other activities that humans usually deem valuable because the expected utility of saving the world does outweigh the pursuit of other goals. I do not disagree with that assessment but find it troubling.

The problem is, will there ever be anything but a single goal, a goal that can either be more effectively realized and optimized to yield the most utility or whose associated expected utility simply outweighs all other values?

Assume that humanity managed to create a friendly AI (FAI). Given the enormous amount of resources that each human is poised to consume until the dark era of the universe, wouldn't the same arguments that now suggest that we should contribute money to existential risk charities then suggest that we should donate our resources to the friendly AI? Our resources could enable it to find a way to either travel back in time, leave the universe or hack the matrix. Anything that could avert the end of the universe and allow the FAI to support many more agents has effectively infinite expected utility.

The sensible decision would be to concentrate on those scenarios with the highest expected utility now, e.g. solving friendly AI, and worry about those problems later. But not only does the same argument always work but the question is also relevant to the nature of friendly AI and our ultimate goals. Is expected utility maximization even compatible with our nature? Does expected utility maximization lead to world states in which wireheading is favored, either directly or indirectly by focusing solely on a single high-utility goal that does outweigh all other goals?

Conclusion

  1. Being able to prove something mathematically doesn't prove its relation to reality.
  2. Relativity is less wrong than Newtonian mechanics but it still breaks down in describing singularities including the very beginning of the universe.

It seems to me that our notion of rationality is not the last word on the topic and that we shouldn't act as if it were.

Q&A with Michael Littman on risks from AI

15 XiXiDu 19 December 2011 09:51AM

[Click here to see a list of all interviews]

Michael L. Littman is a computer scientist. He works mainly in reinforcement learning, but has done work in machine learning, game theory, computer networking, partially observable Markov decision process (POMDP) solving, computer solving of analogy problems and other areas. He is currently a professor of computer science and department chair at Rutgers University.

Homepage: cs.rutgers.edu/~mlittman/

Google Scholar: scholar.google.com/scholar?q=Michael+Littman

The Interview:

Michael Littman: A little background on me.  I've been an academic in AI for not-quite 25 years.  I work mainly on reinforcement learning, which I think is a key technology for human-level AI---understanding the algorithms behind motivated behavior.  I've also worked a bit on topics in statistical natural language processing (like the first human-level crossword solving program).  I carried out a similar sort of survey when I taught AI at Princeton in 2001 and got some interesting answers from my colleagues.  I think the survey says more about the mental state of researchers than it does about the reality of the predictions. 

In my case, my answers are colored by the fact that my group sometimes uses robots to demonstrate the learning algorithms we develop.  We do that because we find that non-technical people find it easier to understand and appreciate the idea of a learning robot than pages of equations and graphs.  But, after every demo, we get the same question: "Is this the first step toward Skynet?"  It's a "have you stopped beating your wife" type of question, and I find that it stops all useful and interesting discussion about the research.

Anyhow, here goes:

Q1: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of roughly human-level machine intelligence?

Michael Littman:

10%: 2050 (I also think P=NP in that year.)
50%: 2062
90%: 2112

Q2: What probability do you assign to the possibility of human extinction as a result of badly done AI?

Michael Littman: epsilon, assuming you mean: P(human extinction caused by badly done AI | badly done AI)

I think complete human extinction is unlikely, but, if society as we know it collapses, it'll be because people are being stupid (not because machines are being smart).

Q3: What probability do you assign to the possibility of a human level AGI to self-modify its way up to massive superhuman intelligence within a matter of hours/days/< 5 years?

Michael Littman: epsilon (essentially zero).  I'm not sure exactly what constitutes intelligence, but I don't think it's something that can be turbocharged by introspection, even superhuman introspection.  It involves experimenting with the world and seeing what works and what doesn't.  The world, as they say, is its best model.  Anything short of the real world is an approximation that is excellent for proposing possible solutions but not sufficient to evaluate them.

Q3-sub: P(superhuman intelligence within days | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?

Michael Littman: Ditto.

Q3-sub: P(superhuman intelligence within < 5 years | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?

Michael Littman: 1%. At least 5 years is enough for some experimentation.

Q4: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?

Michael Littman: No, I don't think it's possible.  I mean, seriously, humans aren't even provably friendly to us and we have thousands of years of practice negotiating with them.

Q5: Do possible risks from AI outweigh other possible existential risks, e.g. risks associated with the possibility of advanced nanotechnology?

Michael Littman: In terms of science risks (outside of human fundamentalism which is the only non-negligible risk I am aware of), I'm most afraid of high energy physics experiments, then biological agents, then, much lower, information technology related work like AI.

Q6: What is the current level of awareness of possible risks from AI, relative to the ideal level?

Michael Littman: I think people are currently hypersensitive.  As I said, every time I do a demo of any AI ideas, no matter how innocuous, I am asked whether it is the first step toward Skynet.  It's ridiculous.  Given the current state of AI, these questions come from a simple lack of knowledge about what the systems are doing and what they are capable of.  What society lacks is not awareness of risks but the technical understanding to *evaluate* risks.  It shouldn't just be the scientists assuring people everything is ok.  People should have enough background to ask intelligent questions about the dangers and promise of new ideas.

Q7: Can you think of any milestone such that if it were ever reached you would expect human‐level machine intelligence to be developed within five years thereafter?

Michael Littman: Slightly subhuman intelligence?  What we think of as human intelligence is layer upon layer of interacting subsystems.  Most of these subsystems are complex and hard to get right.  If we get them right, they will show very little improvement in the overall system, but will take us a step closer.  The last 5 years before human intelligence is demonstrated by a machine will be pretty boring, akin to the 5 years between the ages of 12 and 17 in a human's development.  Yes, there are milestones, but they will seem minor compared to the first few years of rapid improvement.

Question about timeless physics

3 XiXiDu 16 December 2011 01:09PM

Related to: lesswrong.com/lw/qp/timeless_physics/

Why do I find myself at this point in time, configuration space, rather than another point? In other words, why do I have certain expectations rather than others?

I don't expect the U.S. presidential elections to have happened but to happen next, where "to happen" and "to have happened" internally mark the sequential order of steps indexed by consecutive timestamps. But why do I find myself with that particular expectation rather than any other? What is it that privileges this point?

So you seem to remember Time proceeding along a single line.  You remember that the particle first went left, and then went right.  You ask, "Which way will the particle go this time?"

My question is why I find myself remembering that the particle went left and then right rather than left but not yet right.

But both branches, both future versions of you, just exist.  There is no fact of the matter as to "which branch you go down".  Different versions of you experience both branches.

Yes, but why does my version experience this point of my branch and not any other point of my branch?

I understand that if this universe were a giant simulation and it were to halt and then resume, after some indexical measure of causal steps used by those outside of it, then I wouldn't notice it. Therefore if you remove the notion of an outside world there ceases to be any measure of how many causal steps it took until I continued my relational measure of progression.

But that's not my question. Assume for a moment that my conscious experience is not a causal continuum but a discrete sequence of causal steps from 1, 2, 3, ... to N where N marks this point. Why do I find myself at N rather than 10 or N+1?

Q&A with Richard Carrier on risks from AI

16 XiXiDu 13 December 2011 10:00AM

[Click here to see a list of all interviews]

I am emailing experts in order to raise and estimate the academic awareness and perception of risks from AI.

Richard Carrier is a world-renowned author and speaker. As a professional historian, published philosopher, and prominent defender of the American freethought movement, Dr. Carrier has appeared across the country and on national television defending sound historical methods and the ethical worldview of secular naturalism. His books and articles have also received international attention. He holds a Ph.D. from Columbia University in ancient history, specializing in the intellectual history of Greece and Rome, particularly ancient philosophy, religion, and science, with emphasis on the origins of Christianity and the use and progress of science under the Roman empire. He is best known as the author of Sense and Goodness without God, Not the Impossible Faith, and Why I Am Not a Christian, and a major contributor to The Empty Tomb, The Christian Delusion, The End of Christianity, and Sources of the Jesus Tradition, as well as writer and editor-in-chief (now emeritus) for the Secular Web, and for his copious work in history and philosophy online and in print. He is currently working on his next books, Proving History: Bayes's Theorem and the Quest for the Historical Jesus, On the Historicity of Jesus Christ, The Scientist in the Early Roman Empire, and Science Education in the Early Roman Empire. To learn more about Dr. Carrier and his work follow the links below.

Homepage: richardcarrier.info

Blog: freethoughtblogs.com/carrier/ (old blog: richardcarrier.blogspot.com)

Selected articles:

The Interview:

Richard Carrier: Note that I follow and support the work of The Singularity Institute on precisely this issue, which you are writing for (if you are a correspondent for Less Wrong). And I believe all AI developers should (e.g. CALO). So my answers won't be too surprising (below). But also keep in mind what I say (not just on "singularity" claims) at:

http://richardcarrier.blogspot.com/2009/06/are-we-doomed.html

Q1: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of roughly human-level machine intelligence?

Richard Carrier: 2020/2040/2080

Explanatory remark to Q1:

P(human-level AI by (year) | no wars ∧ no disasters ∧ beneficial political and economic development) = 10%/50%/90%

Q2: What probability do you assign to the possibility of human extinction as a result of badly done AI?

Richard Carrier: Here the relative probability is much higher that human extinction will result from benevolent AI, i.e. eventually Homo sapiens will be self-evidently obsolete and we will voluntarily transition to Homo cyberneticus. In other words, we will extinguish the Homo sapiens species ourselves, voluntarily. If you asked for a 10%/50%/90% deadline for this I would say 2500/3000/4000.

However, perhaps you mean to ask regarding the extinction of all Homo, and their replacement with AI that did not originate as a human mind, i.e. the probability that some AI will kill us and just propagate itself.

The answer to that is dependent on what you mean by "badly done" AI: (a) AI that has more power than we think we gave it, causing us problems, or (b) AI that has so much more power than we think we gave it that it can prevent our taking its power away.

(a) is probably inevitable, or at any rate a high probability, and there will likely be deaths or other catastrophes, but like other tech failures (e.g. the Titanic, Three Mile Island, hijacking jumbo jets and using them as guided missiles) we will prevail, and very quickly from a historical perspective (e.g. there won't be another 9/11 using airplanes as missiles; we only got jacked by that unforeseen failure once). We would do well to prevent as many problems as possible by being as smart as we can be about implementing AI, and not underestimating its ability to outsmart us, or to develop while we aren't looking (e.g. Siri could go sentient on its own, if no one is managing it closely to ensure that doesn't happen).

(b) is very improbable because AI function is too dependent on human cooperation (e.g. power grid; physical servers that can be axed or bombed; an internet that can be shut down manually) and any move by AI to supplant that requirement would be too obvious and thus too easily stopped. In short, AI is infrastructure dependent, but it takes too much time and effort to build an infrastructure, and even more an infrastructure that is invulnerable to demolition. By the time AI has an independent infrastructure (e.g. its own robot population worldwide, its own power supplies, manufacturing plants, etc.) Homo sapiens will probably already be transitioning to Homo cyberneticus and there will be no effective difference between us and AI.

However, given no deadline, it's likely there will be scenarios like: "god" AIs run sims in which digitized humans live, and any given god AI could decide to delete the sim and stop running it (and likewise all comparable AI shepherding scenarios). So then we'd be asking how likely is it that a god AI would ever do that, and more specifically, that all would (since there won't be just one sim run by one AI, but many, so one going rogue would not mean extinction of humanity).

So setting aside AI that merely kills some people, and only focusing on total extinction of Homo sapiens, we have:

P(voluntary human extinction by replacement | any AGI at all) = 90%+

P(involuntary human extinction without replacement | badly done AGI type (a)) = < 10^-20

[and that's taking into account an infinite deadline, because the probability steeply declines with every year after first opportunity, e.g. AI that doesn't do it the first chance it gets is rapidly less likely to as time goes on, so the total probability has a limit even at infinite time, and I would put that limit somewhere as here assigned.]

P(involuntary human extinction without replacement | badly done AGI type (b)) = .33 to .67

However, P(badly done AGI type (b)) = < 10^-20
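As a rough check on how the figures above combine, here is a minimal sketch that adds up the two branches. It assumes (this is not stated in the answer) that types (a) and (b) do not overlap, reads every "<" figure as an upper bound, takes the high end of the .33 to .67 range, and pessimistically treats type (a) as certain to occur. The variable names are illustrative only.

```python
# Upper bounds taken from the answer above.
p_ext_given_a = 1e-20   # P(involuntary extinction | badly done AGI type (a))
p_ext_given_b = 0.67    # high end of the stated .33 to .67 range
p_b = 1e-20             # P(badly done AGI type (b))

# Assumption, not from the answer: type (a) happens with certainty,
# which is the most pessimistic choice for this branch.
p_a = 1.0

# Contribution of each branch, then their sum (branches assumed disjoint).
p_ext_via_a = p_ext_given_a * p_a
p_ext_via_b = p_ext_given_b * p_b
p_ext_upper = p_ext_via_a + p_ext_via_b

print(f"via (a): < {p_ext_via_a:.2e}")
print(f"via (b): < {p_ext_via_b:.2e}")
print(f"total:   < {p_ext_upper:.2e}")
```

On these figures the total stays below about 2 × 10^-20, and the type (b) branch, despite its high conditional probability, contributes less than the type (a) branch.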

Explanatory remark to Q2:

P(human extinction | badly done AI) = ?

(Where 'badly done' = AGI capable of self-modification that is not provably non-dangerous.)

Q3: What probability do you assign to the possibility of a human level AGI to self-modify its way up to massive superhuman intelligence within a matter of hours/days/< 5 years?

Richard Carrier: Depends on when it starts. For example, if we started a human-level AGI tomorrow, its ability to revise itself would be hugely limited by our slow and expensive infrastructure (e.g. manufacturing the new circuits, building the mainframe extensions, supplying them with power, debugging the system). In that context, "hours" and "days" have P --> 0, but 5 years has P = 33%+ if someone is funding the project, and likewise 10 years has P=67%+; and 25 years, P=90%+. However, suppose human-level AGI is first realized in fifty years when all these things can be done in a single room with relatively inexpensive automation and the power demands of any new system were not greater than are normally supplied to that room. Then P(days) = 90%+. And with massively more advanced tech, say such as we might have in 2500, then P(hours) = 90%+.

However...

Explanatory remark to Q3:

P(superhuman intelligence within hours | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?
P(superhuman intelligence within days | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?
P(superhuman intelligence within < 5 years | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?

Richard Carrier: Perhaps you are confusing intelligence with knowledge. Internet connection can make no difference to the former (since an AGI will have no more control over the internet than human operators do). That can only expand a mind's knowledge. As to how quickly, it will depend more on the rate of processed seconds in the AGI itself, i.e. if it can simulate human thought only at the same pace as non-AI, then it will not be able to learn any faster than a regular person, no matter what kind of internet connection it has. But if the AGI can process ten seconds time in one second of non-AI time, then it can learn ten times as fast, up to the limit of data access (and that is where internet connection speed will matter). That is a calculation I can't do. A computer science expert would have to be consulted to calculate reasonable estimates of what connection speed would be needed to learn at ten times normal human pace, assuming the learner can learn that fast (which a ten:one time processor could); likewise a hundred times, etc. And all that would tell you is how quickly that mind can learn. But learning in and of itself doesn't make you smarter. That would require software or circuit redesign, which would require testing and debugging. Otherwise once you had all relevant knowledge available to any human software/circuit design team, you would simply be no smarter than them, and further learning would not help you (thus humans already have that knowledge level: that's why we work in teams to begin with), thus AI is not likely to much exceed us in that ability. The only edge it can exploit is speed of a serial design thought process, but even that runs up against the time and resource expense of testing and debugging anything it designed, and that is where physical infrastructure slows the rate of development, and massive continuing human funding is needed. Hence my probabilities above.
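The relation described in this answer (learning speed scales with processing speed until the connection becomes the bottleneck) can be sketched as follows. This is only an illustration of the shape of the calculation, not the estimate the answer says an expert would be needed for: the human intake rate below is a hypothetical placeholder, and the function names are made up for this sketch.

```python
def learning_rate(speedup, human_intake_bps, link_bps):
    """Effective intake in bits per wall-clock second.

    The mind can absorb `speedup` times the human rate, but never more
    than the connection can deliver.
    """
    return min(speedup * human_intake_bps, link_bps)


def max_useful_speedup(human_intake_bps, link_bps):
    """Speedup beyond which the connection, not the mind, is the limit."""
    return link_bps / human_intake_bps


# Hypothetical placeholder: how many bits per second a human-speed mind
# can usefully take in. This number is an assumption, not a measurement.
HUMAN_INTAKE_BPS = 1e7

# The 100 Gigabit connection from the explanatory remark.
LINK_BPS = 100e9

print(learning_rate(10, HUMAN_INTAKE_BPS, LINK_BPS))      # mind-limited
print(learning_rate(1e6, HUMAN_INTAKE_BPS, LINK_BPS))     # link-limited
print(max_useful_speedup(HUMAN_INTAKE_BPS, LINK_BPS))
```

Whatever intake rate is plugged in, the result only describes how fast knowledge can be acquired; as the answer stresses, it says nothing about becoming smarter.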

Q4: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?

Richard Carrier: Yes. At the very least it is important to take the risks very seriously, and incorporate it as a concern within every project flow. I believe there should always be someone expert in the matter assigned to any AGI design team, who is monitoring everything being done and assessing its risks and ensuring safeguards are in place before implementation at each step. It already concerns me that this might not be a component of the management of Siri, and Siri achieving AGI is a low probability (but not vanishingly low; I'd say it could be as high as 1% in 10 years unless Siri's processing space is being deliberately limited so it cannot achieve a certain level of complexity, or in other ways its cognitive abilities being actively limited).

Explanatory remark to Q4:

How much money is currently required to mitigate possible risks from AI (to be instrumental in maximizing your personal long-term goals, e.g. surviving this century), less/no more/little more/much more/vastly more?

Richard Carrier: Required is not very much. A single expert monitoring Siri who has real power to implement safeguards would be sufficient, so with salary and benefits and collateral overhead, that's no more than $250,000/year, for a company that has billions in liquid capital. (Because safeguards are not expensive, e.g. capping Siri's processing space costs nothing in practical terms; likewise writing her software to limit what she can actually do no matter how sentient she became, e.g. imagine an army of human hackers hacked Siri at the source and could run Siri by a million direct terminals, what could they do? Answering that question will evoke obvious safeguards to put on Siri's physical access and software; the most obvious is making it impossible for Siri to rewrite her own core software.)

But what actually is being spent I don't know. I suspect "a little more" needs to be spent than is, only because I get the impression AI developers aren't taking this seriously, and yet the cost of monitoring is not that high.

And yet you may notice all this is separate from the question of making AGI "provably friendly" which is what you asked about (and even that is not the same as "provably safe" since friendly AGI poses risks as well, as the Singularity Institute has been pointing out).

This is because all we need do now is limit AGI's power at its nascence. Then we can explore how to make AGI friendly, and then provably friendly, and then provably safe. In fact I expect AGI will even help us with that. Once AGI exists, the need to invest heavily in making it safe will be universally obvious. Whereas before AGI exists there is little we can do to ascertain how to make it safe, since we don't have a working model to test. Think of trying to make a ship safe, without ever getting to build and test any vessel, nor having knowledge of any other vessels, and without knowing anything about the laws of buoyancy. There wouldn't be a lot you could do.

Nevertheless it would be worth some investment to explore how much we can now know, particularly as it can be cross-purposed with understanding human moral decision making better, and thus need not be sold as "just AI morality" research. How much more should we spend on this now? Much more than we are. But only because I see that money benefiting us directly, in understanding how to make ordinary people better, and detect bad people, and so on, which is of great value wholly apart from its application to AGI. Having it double as research on how to design moral thought processes unrestrained by human brain structure would then benefit any future AGI development.

Q5: Do possible risks from AI outweigh other possible existential risks, e.g. risks associated with the possibility of advanced nanotechnology?

Explanatory remark to Q5:

What existential risk (human extinction type event) is currently most likely to have the greatest negative impact on your personal long-term goals, under the condition that nothing is done to mitigate the risk?

Richard Carrier: All existential risks are of such vastly low probability it would be beyond human comprehension to rank them, and utterly pointless to anyway. And even if I were to rank them, extinction by comet, asteroid or cosmological gamma ray burst vastly outranks any manmade cause. Even extinction by supervolcano vastly outranks any manmade cause. So I don't concern myself with this (except to call for more investment in earth impactor detection, and the monitoring of supervolcano risks).

We should be concerned not with existential risks, but ordinary risks, e.g. small scale nuclear or biological terrorism, which won't kill the human race, and might not even take civilization into the Dark Ages, but can cause thousands or millions to die and have other bad repercussions. Because ordinary risks are billions upon billions of times more likely than extinction events, and as it happens, mitigating ordinary risks entails mitigating existential risks anyway (e.g. limiting the ability to go nuclear prevents small scale nuclear attacks just as well as nuclear annihilation events, in fact it makes the latter billions of times less likely than it already is).

Thus when it comes to AI, as an existential risk it just isn't one (P --> 0), but as a panoply of ordinary risks, it is (P --> 1). And it doesn't matter how it ranks, it should get full attention anyway, like all definite risks do. It thus doesn't need to be ranked against other risks, as if terrorism were such a great risk we should invest nothing in earthquake safety, or vice versa.

Q6: What is the current level of awareness of possible risks from AI, relative to the ideal level?

Richard Carrier: Very low. Even among AI developers it seems.

Q7: Can you think of any milestone such that if it were ever reached you would expect human‐level machine intelligence to be developed within five years thereafter?

Richard Carrier: There will not be "a" milestone like that, unless it is something wholly unexpected (like a massive breakthrough in circuit design that allows virtually infinite processing power on a desktop: which development would make P(AGI within five years) > 33%). But wholly unexpected discoveries have a very low probability. Sticking only with what we already expect to occur, the five-year milestone for AGI will be AHI, artificial higher intelligence, e.g. a robot cat that behaved exactly like a real cat. Or a Watson who can actively learn on its own without being programmed with data (but still can only answer questions, and not plan or reason out problems). The CALO project is likely to develop an increasingly sophisticated Siri-like AI that won't be AGI but will gradually become more and more like AGI, so that there won't be any point where someone can say "it will achieve AGI within 5 years." Rather it will achieve AGI gradually and unexpectedly, and people will even debate when or whether it had.

Basically, I'd say once we have "well-trained dog" level AI, the probability of human-level AI becomes:

P(< 5 years) = 10%
P(< 10 years) = 25%
P(< 20 years) = 50%
P(< 40 years) = 90%

Objections to Coherent Extrapolated Volition

11 XiXiDu 22 November 2011 10:32AM

In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

— Eliezer Yudkowsky, May 2004, Coherent Extrapolated Volition

Foragers versus industry era folks

Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new tribal chief, and a modern computer scientist who wants to determine if a “sufficiently large randomized Conway board could turn out to converge to a barren ‘all off’ state.”

The utility of success in hunting down animals or in proving abstract conjectures about cellular automata is largely determined by factors such as your education, culture and environmental circumstances. The same forager who cared to kill a lot of animals, to get the best ladies in his clan, might under different circumstances have turned out to be a vegetarian mathematician caring solely about his understanding of the nature of reality. Both sets of values are to some extent mutually exclusive or at least disjoint. Yet both sets of values are what the person wants, given the circumstances. Change the circumstances dramatically and you change the person’s values.

What do you really want?

You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn’t know it. But there is no set of values that a person “really” wants. Humans are largely defined by the circumstances they reside in.

  • If you already knew a movie, you wouldn’t watch it.
  • To be able to get your meat from the supermarket changes the value of hunting.

If “we knew more, thought faster, were more the people we wished we were, and had grown up farther together” then we would stop desiring what we had learnt, wish to think even faster, become different people again, and get bored of and rise up from the people similar to us.

A singleton is an attractor

A singleton will inevitably change everything by causing a feedback loop between itself as an attractor and humans and their values.

Many of our values and goals, what we want, are culturally induced or the result of our ignorance. Reduce our ignorance and you change our values. One trivial example is our intellectual curiosity. If we don’t need to figure out what we want on our own, our curiosity is impaired.

A singleton won’t extrapolate human volition but implement an artificial set of values as a result of abstract high-order contemplations about rational conduct.

With knowledge comes responsibility, with wisdom comes sorrow

Knowledge changes and introduces terminal goals. The toolkit that is called ‘rationality’, the rules and heuristics developed to help us achieve our terminal goals, is also altering and deleting them. A stone age hunter-gatherer seems to possess very different values than we do. Learning about rationality and various ethical theories such as Utilitarianism would alter those values considerably.

Rationality was meant to help us achieve our goals, e.g. become a better hunter. Rationality was designed to tell us what we ought to do (instrumental goals) to achieve what we want to do (terminal goals). Yet what actually happens is that we are told, that we will learn, what we ought to want.

If an agent becomes more knowledgeable and smarter then this does not leave its goal-reward-system intact if it is not especially designed to be stable. An agent who originally wanted to become a better hunter and feed his tribe would end up wanting to eliminate poverty in Obscureistan. The question is, how much of this new “wanting” is the result of using rationality to achieve terminal goals and how much is a side-effect of using rationality, how much is left of the original values versus the values induced by a feedback loop between the toolkit and its user?

Take for example an agent that is facing the Prisoner’s dilemma. Such an agent might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff. Was it rational for the agent to learn about game theory, in the sense that it helped the agent to achieve its goal, or in the sense that it deleted one of its goals in exchange for an allegedly more “valuable” goal?
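To make the tension concrete, here is a minimal sketch of a one-shot Prisoner’s dilemma. The payoff numbers are the usual textbook illustration and are an assumption, not something taken from this post; the function names are likewise made up for the example.

```python
# Illustrative payoffs (assumed): (my_move, their_move) -> my payoff.
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def best_response(their_move):
    """Move that maximizes my own payoff against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, their_move)])


def joint_payoff(move_a, move_b):
    """Sum of both players' payoffs for a pair of moves."""
    return PAYOFF[(move_a, move_b)] + PAYOFF[(move_b, move_a)]


for their_move in "CD":
    print(f"opponent plays {their_move}: best response is {best_response(their_move)}")

print("joint payoff, both cooperate:", joint_payoff("C", "C"))
print("joint payoff, both defect:   ", joint_payoff("D", "D"))
```

With these numbers, defecting is the best response whatever the other player does, yet mutual cooperation leaves both better off than mutual defection. Whether learning this counts as progress depends on whether the agent's goal was its own payoff or the cooperative outcome, which is exactly the question raised above.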

Beware rationality as a purpose in and of itself

It seems to me that becoming more knowledgeable and smarter is gradually altering our utility functions. But what is it that we are approaching if the extrapolation of our volition becomes a purpose in and of itself? Extrapolating our coherent volition will distort or alter what we really value by installing a new cognitive toolkit designed to achieve an equilibrium between us and other agents with the same toolkit.

Would a singleton be a tool that we can use to get what we want, or would the tool use us to do what it does? Would we be modeled, or would it create models? Would we be extrapolating our volition, or rather following our extrapolations?

(This post is a write-up of a previous comment, intended to receive feedback from a larger audience.)

OPERA Confirms: Neutrinos Travel Faster Than Light

10 XiXiDu 18 November 2011 09:58AM

New high-precision tests carried out by the OPERA collaboration in Italy broadly confirm its claim, made in September, to have detected neutrinos travelling at faster than the speed of light. The collaboration today submitted its results to a journal, but some members continue to insist that further checks are needed before the result can be considered sound.

Link: nextbigfuture.com/2011/11/faster-than-light-neutrinos-opera.html

The OPERA Collaboration sent to the Cornell Arxiv an updated version of their preprint today, where they summarize the results of their analysis, expanded with additional statistical tests, and including the check performed with 20 additional neutrino interactions they collected in the last few weeks. These few extra timing measurements crucially allow the ruling out of some potential unaccounted sources of systematic uncertainty, notably ones connected to the knowledge of the proton spill time distribution.

[...]

So what does OPERA find? Their main result, based on the 15,233 neutrino interactions collected in three years of data taking, is unchanged from the September result. The most interesting part of the new publication is instead that they find that the 20 new neutrino events (where neutrino speeds are individually measured, as opposed to the combined measurement done with the three-year data published in September) confirm the earlier result: the arrival times appear to occur about 60 nanoseconds before they are expected.

Link: science20.com/quantum_diaries_survivor/opera_confirms_neutrinos_travel_faster_light-84763

Paper: kruel.co/paper-neutrino-velocity-JHEP.pdf

Previously on LW: lesswrong.com/lw/7rc/particles_break_lightspeed_limit/

Why an Intelligence Explosion might be a Low-Priority Global Risk

3 XiXiDu 14 November 2011 11:40AM

(The following is a summary of some of my previous submissions that I originally created for my personal blog.)

As we know,
There are known knowns.
There are things
We know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don’t know
We don’t know.

— Donald Rumsfeld, Feb. 12, 2002, Department of Defense news briefing

Intelligence, a cornucopia?

It seems to me that those who believe in the possibility of catastrophic risks from artificial intelligence act on the unquestioned assumption that intelligence is a kind of black box, a cornucopia that can sprout an abundance of novelty. But this implicitly assumes that if you increase intelligence you also decrease the distance between discoveries.

Intelligence is no solution in itself; it is merely an effective searchlight for unknown unknowns, and who knows whether the brightness of the light increases proportionally with the distance between unknown unknowns? To enable an intelligence explosion the light would have to reach out much farther with each increase in intelligence than the increase of the distance between unknown unknowns. I just don’t see that to be a reasonable assumption.

Intelligence amplification, is it worth it?

It seems that if you increase intelligence you also increase the computational cost of its further improvement and the distance to the discovery of some unknown unknown that could enable another quantum leap. It seems that you need to apply a lot more energy to get a bit more complexity.

If any increase in intelligence is vastly outweighed by its computational cost and the expenditure of time needed to discover it then it might not be instrumental for a perfectly rational agent (such as an artificial general intelligence), as imagined by game theorists, to increase its intelligence as opposed to using its existing intelligence to pursue its terminal goals directly or to invest its given resources to acquire other means of self-improvement, e.g. more efficient sensors.

What evidence do we have that the payoff of intelligent, goal-oriented experimentation yields enormous advantages (enough to enable an intelligence explosion) over evolutionary discovery relative to its cost?

We simply don’t know if intelligence is instrumental or quickly hits diminishing returns.
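The difference between the two cases can be illustrated with a toy recurrence. This is a sketch under strong assumptions, not a model of any real system: each round of self-improvement adds `gain * level ** returns_exponent` to the current level, and the exponent is a free parameter that neither this post nor any evidence cited here pins down. All names and default values are hypothetical.

```python
def steps_to_reach(target, returns_exponent, gain=0.1, start=1.0, max_steps=10_000):
    """Count improvement rounds until `level` reaches `target`.

    returns_exponent > 1: each gain makes the next gain disproportionately larger.
    returns_exponent = 1: plain compound growth.
    returns_exponent < 1: diminishing returns on existing intelligence.
    Returns None if the target is not reached within `max_steps`.
    """
    level = start
    for step in range(1, max_steps + 1):
        level += gain * level ** returns_exponent
        if level >= target:
            return step
    return None


for exponent in (1.5, 1.0, 0.5):
    rounds = steps_to_reach(target=1000.0, returns_exponent=exponent)
    print(f"returns exponent {exponent}: {rounds} rounds to a 1000-fold increase")
```

The number of rounds differs by more than an order of magnitude between the extreme cases, so the whole argument turns on which regime applies, and that is precisely what we do not know.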

Can intelligence be effectively applied to itself at all? How do we know that any given level of intelligence is capable of handling its own complexity efficiently? Many humans are not even capable of handling the complexity of the brain of a worm.

Humans and the importance of discovery

There is a significant difference between intelligence and evolution if you apply intelligence to the improvement of evolutionary designs:

  • Intelligence is goal-oriented.
  • Intelligence can think ahead.
  • Intelligence can jump fitness gaps.
  • Intelligence can engage in direct experimentation.
  • Intelligence can observe and incorporate solutions of other optimizing agents.

But when it comes to unknown unknowns, what difference is there between intelligence and evolution? The critical similarity is that both rely on dumb luck when it comes to genuine novelty. And where, if not in the dramatic improvement of intelligence itself, does it take the discovery of novel unknown unknowns?

We have no idea about the nature of discovery and its importance when it comes to what is necessary to reach a level of intelligence above our own, by ourselves. How much of what we know was actually the result of people thinking quantitatively and attending to scope, probability, and marginal impacts? How much of what we know today is the result of dumb luck versus goal-oriented, intelligent problem solving?

Our “irrationality” and the patchwork-architecture of the human brain might constitute an actual feature. The noisiness and patchwork architecture of the human brain might play a significant role in the discovery of unknown unknowns because it allows us to become distracted, to leave the path of evidence based exploration.

A lot of discoveries were made by people who were not explicitly trying to maximize expected utility. A lot of progress is due to luck, in the form of the discovery of unknown unknowns.

A basic argument in support of risks from superhuman intelligence is that we don’t know what it could possibly come up with. That is also why it is called a “Singularity”. But why does nobody ask how a superhuman intelligence knows what it could possibly come up with?

It is not intelligence in and of itself that allows humans to accomplish great feats. Even people like Einstein, geniuses who were apparently able to come up with great insights on their own, were simply lucky to be born into the right circumstances: the time was ripe for great discoveries, thanks to previous discoveries of unknown unknowns.

Evolution versus Intelligence

It is argued that the mind-design space must be large if evolution could stumble upon general intelligence, and that there are low-hanging fruits that are much more efficient at general intelligence than humans are; evolution simply went with the first that came along. It is further argued that evolution is not limitlessly creative, since each step must increase the fitness of its host, and that therefore there are artificial mind designs that can do what no product of natural selection could accomplish.

I agree with the above, yet given all of the apparent disadvantages of the blind idiot God, evolution was able to come up with altruism, something that works two levels above the individual and one level above society. So far we haven’t been able to show such ingenuity by incorporating successes that are not evident from an individual or even societal position.

The example of altruism provides evidence that intelligence isn’t many levels above evolution. Therefore the crucial question is, how great is the performance advantage? Is it large enough to justify the conclusion that the probability of an intelligence explosion is easily larger than 1%? I don’t think so. To answer this definitively we would have to fathom the significance of the discovery (“random mutations”) of unknown unknowns in the dramatic amplification of intelligence versus the invention (goal-oriented “research and development”) of an improvement within known conceptual bounds.

Another example is flight. Artificial flight is not even close to the energy efficiency and maneuverability of birds or insects. We didn’t go straight from no artificial flight to flight that is generally superior to the natural flight produced by biological evolution.

Take for example a dragonfly. Even if we were handed the design for a perfect artificial dragonfly, minus the design for the flight of a dragonfly, we wouldn’t be able to build a dragonfly that can take over the world of dragonflies, all else equal, by means of superior flight characteristics.

It is true that a Harpy Eagle can lift more than three-quarters of its body weight while the Boeing 747 Large Cargo Freighter has a maximum take-off weight of almost double its operating empty weight (I suspect that insects can do better). My whole point is that we never reached artificial flight that is strongly above the level of natural flight. An eagle can after all catch its cargo under various circumstances like the slope of a mountain or from beneath the sea, thanks to its superior maneuverability.

Humans are biased and irrational

It is obviously true that our expert systems are better than we are at their narrow range of expertise. But that expert systems are better at certain tasks does not imply that you can effectively and efficiently combine them into a coherent agency.

The noisiness of the human brain might be one of the important features that allows it to exhibit general intelligence. Yet the same noise might be the reason that each task a human can accomplish is not put into execution with maximal efficiency. An expert system that features a single stand-alone ability is able to reach the unique equilibrium for that ability. Whereas systems that have not fully relaxed to equilibrium feature the necessary characteristics that are required to exhibit general intelligence. In this sense a decrease in efficiency is a side-effect of general intelligence. If you externalize a certain ability into a coherent framework of agency, you decrease its efficiency dramatically. That is the difference between a tool and the ability of the agent that uses the tool.

In the above sense, our tendency to be biased and act irrationally might partly be a trade-off between plasticity, efficiency and the necessity of goal-stability.

Embodied cognition and the environment

Another problem is that general intelligence is largely a result of an interaction between an agent and its environment. It might in principle be possible to arrive at various capabilities by means of induction, but it is only a theoretical possibility given unlimited computational resources. To achieve real world efficiency you need to rely on slow environmental feedback and make decisions under uncertainty.

AIXI is often quoted as a proof of concept that it is possible for a simple algorithm to improve itself to such an extent that it could in principle reach superhuman intelligence. AIXI proves that there is a general theory of intelligence. But there is a minor problem: AIXI is as far from real world human-level general intelligence as an abstract notion of a Turing machine with an infinite tape is from a supercomputer with the computational capacity of the human brain. An abstract notion of intelligence doesn’t get you anywhere in terms of real-world general intelligence, just as you won’t be able to upload yourself to a non-biological substrate merely because you showed that in some abstract sense you can simulate every physical process.

Just imagine you emulated a grown up human mind and it wanted to become a pick up artist, how would it do that with an Internet connection? It would need some sort of avatar, at least, and then wait for the environment to provide a lot of feedback.

Therefore even if we’re talking about the emulation of a grown up mind, it will be really hard to acquire some capabilities. Then how is the emulation of a human toddler going to acquire those skills? Even worse, how is some sort of abstract AGI that lacks all of the hard-coded capabilities of a human toddler going to do it?

Can we even attempt to imagine what is wrong with a boxed emulation of a human toddler that makes it unable to become a master of social engineering in a very short time?

Can we imagine what is missing that would enable one of the existing expert systems to quickly evolve vastly superhuman capabilities in its narrow area of expertise? Why haven’t we seen a learning algorithm teaching itself chess intelligence starting with nothing but the rules?

In a sense an intelligent agent is similar to a stone rolling down a hill: both are moving towards a sort of equilibrium. The difference is that intelligence is following more complex trajectories as its ability to read and respond to environmental cues is vastly greater than that of a stone. Yet intelligent or not, the environment in which an agent is embedded plays a crucial role. There exists a fundamental dependency on unintelligent processes. Our environment is structured in such a way that we use information within it as an extension of our minds. The environment enables us to learn and improve our predictions by providing a testbed and a constant stream of data.

Necessary resources for an intelligence explosion

If artificial general intelligence is unable to seize the resources necessary to undergo explosive recursive self-improvement then the ability and cognitive flexibility of superhuman intelligence in and of itself, as characteristics alone, would have to be sufficient to self-modify its way up to massive superhuman intelligence within a very short time.

Without advanced real-world nanotechnology it will be considerably more difficult for an AGI to undergo quick self-improvement. It will have to make use of existing infrastructure, e.g. buy stocks of chip manufacturers and get them to create more or better CPUs. It will have to rely on puny humans for a lot of tasks. It won’t be able to create new computational substrate without the whole economy of the world supporting it. It won’t be able to create an army of robot drones overnight without it either.

In doing so, it would have to make use of considerable amounts of social engineering without its creators noticing. But, more importantly, it would have to use its existing intelligence to do all of that. The AGI would have to acquire new resources slowly, as it couldn’t just self-improve to come up with faster and more efficient solutions. In other words, self-improvement would demand resources. The AGI could not use its ability to self-improve to speed up the acquisition of the very resources it needs in order to self-improve in the first place.

Therefore the absence of advanced nanotechnology constitutes an immense blow to the possibility of explosive recursive self-improvement and risks from AI in general.

One might argue that an AGI will solve nanotechnology on its own and find some way to trick humans into manufacturing a molecular assembler and granting it access to it. But this might be very difficult.

There is a strong interdependence of resources and manufacturers. The AGI won’t be able to simply trick some humans into building a high-end factory to create computational substrate, let alone a molecular assembler. People will ask questions and soon become suspicious. Remember, it won’t be able to coordinate a world conspiracy: it hasn’t been able to self-improve to that point yet, because it is still trying to acquire enough resources, which it has to do the hard way without nanotech.

Anyhow, you’d probably need a brain the size of the moon to effectively run and coordinate a whole world of irrational humans by intercepting their communications and altering them on the fly without anyone freaking out.

People associated with the SIAI would at this point claim that if the AI can’t make use of nanotechnology it might make use of something we haven’t even thought about. But what, magic?

Artificial general intelligence, a single breakthrough?

Another point to consider when talking about risks from AI is how quickly the invention of artificial general intelligence will take place. What evidence do we have that there is some principle that, once discovered, allows us to grow superhuman intelligence overnight?

If the development of AGI takes place slowly, as a gradual and controllable development, we might be able to learn from small-scale mistakes while having to face other risks in the meantime. This might for example be the case if intelligence cannot be captured by a discrete algorithm, or is modular, and therefore never allows us to reach a point where we can suddenly build the smartest thing ever that simply extends itself indefinitely.

To me it doesn’t look like we will come up with artificial general intelligence quickly, but rather that we will have to painstakingly optimize our expert systems step by step over long periods of time.

Paperclip maximizers

It is claimed that an artificial general intelligence might wipe us out inadvertently while undergoing explosive recursive self-improvement to more effectively pursue its terminal goals. I think it is unlikely that most AI designs will fail to hold.

I agree with the argument that any AGI that isn’t made to care about humans won’t care about humans. But I also think that the same argument applies to spatio-temporal scope boundaries and resource limits. Even if the AGI is not told to hold, e.g. if it is simply told to compute as many digits of Pi as possible, I consider it a far-fetched assumption that any AGI intrinsically cares to take over the universe as fast as possible in order to compute as many digits of Pi as possible. Sure, if all of that is presupposed then it will happen, but I don’t see that most AGI designs are like that. Most designs that have the potential for superhuman intelligence, but that are given simple goals, will in my opinion just bob up and down as slowly as possible.

Complex goals need complex optimization parameters (the design specifications of the subject of the optimization process against which it will measure its success of self-improvement).

Even the creation of paperclips is a much more complex goal than telling an AI to compute as many digits of Pi as possible.

For an AGI that was designed to design paperclips to pose an existential risk, its creators would have to be capable enough to enable it to take over the universe on its own, yet forget, or fail, to define time, space and energy bounds as part of its optimization parameters. Therefore, given the large number of restrictions that are inevitably part of any advanced general intelligence, the nonhazardous subset of all possible outcomes might be much larger than the subset in which the AGI works perfectly yet fails to hold before it can wreak havoc.

Fermi paradox

The Fermi paradox allows for and provides the only conclusions and data we can analyze that amount to empirical criticism of concepts like the paperclip maximizer, and of general risks from superhuman AIs with non-human values, without working directly on AGI to test those hypotheses ourselves.

If you accept the premise that life is not unique and special, then one other technological civilisation in the observable universe should be sufficient to leave potentially observable traces of technological tinkering.

Due to the absence of any signs of intelligence out there, especially paper-clippers burning the cosmic commons, we might conclude that unfriendly AI could not be the most dangerous existential risk that we should worry about.

Summary

In principle we could build antimatter weapons capable of destroying worlds, but in practice that is much harder to accomplish.

There are many question marks when it comes to the possibility of superhuman intelligence, and many more about the possibility of recursive self-improvement. Most of the arguments in favor of those possibilities solely derive their appeal from being vague.

Further reading
