Facing the Intelligence Explosion discussion page

18 lukeprog 26 November 2011 08:05AM

I've created a new website for my ebook Facing the Intelligence Explosion:

 

Sometime this century, machines will surpass human levels of intelligence and ability, and the human era will be over. This will be the most important event in Earth’s history, and navigating it wisely may be the most important thing we can ever do.

Luminaries from Alan Turing and Jack Good to Bill Joy and Stephen Hawking have warned us about this. Why do I think they’re right, and what can we do about it?

Facing the Intelligence Explosion is my attempt to answer those questions.

 

 

This page is the dedicated discussion page for Facing the Intelligence Explosion.

If you'd like to comment on a particular chapter, please give the chapter name at the top of your comment so that others can more easily follow it. For example:

Re: From Skepticism to Technical Rationality

Here, Luke neglects to mention that...

I Stand by the Sequences

13 Grognor 15 May 2012 10:21AM

Edit, May 21, 2012: Read this comment by Yvain.

Forming your own opinion is no more necessary than building your own furniture.

- Peter de Blanc

There's been a lot of talk here lately about how we need better contrarians. I don't agree. I think the Sequences got everything right and I agree with them completely. (This of course makes me a deranged, non-thinking, Eliezer-worshiping fanatic for whom the singularity is a substitute religion. Now that I have admitted this, you don't have to point it out a dozen times in the comments.) Even the controversial things, like:

  • I think the many-worlds interpretation of quantum mechanics is the closest to correct and you're dreaming if you think the true answer will have no splitting (or I simply do not know enough physics to know why Eliezer is wrong, which I think is pretty unlikely but not totally discountable).
  • I think cryonics is a swell idea and an obvious thing to sign up for if you value staying alive and have enough money and can tolerate the social costs.
  • I think mainstream science is too slow and we mere mortals can do better with Bayes.
  • I am a utilitarian consequentialist and think that if you allow someone to die through inaction, you're just as culpable as a murderer.
    • I completely accept the conclusion that it is worse to put dust specks in 3^^^3 people's eyes than to torture one person for fifty years. I came up with it independently, so maybe it doesn't count; whatever.
  • I tentatively accept Eliezer's metaethics, considering how unlikely it is that there will be a better one (maybe morality is in the gluons?)
  • "People are crazy, the world is mad," is sufficient for explaining most human failure, even to curious people, so long as they know the heuristics and biases literature.
  • Edit, May 27, 2012: You know what? I forgot one: Gödel, Escher, Bach is the best.

There are two tiny notes of discord on which I disagree with Eliezer Yudkowsky. One is that I'm not so sure as he is that a rationalist is only made when a person breaks with the world and starts seeing everybody else as crazy, and two is that I don't share his objection to creating conscious entities in the form of an FAI or within an FAI. I could explain, but no one ever discusses these things, and they don't affect any important conclusions. I also think the sequences are badly-organized and you should just read them chronologically instead of trying to lump them into categories and sub-categories, but I digress.

Furthermore, I agree with every essay I've ever read by Yvain, I use "believe whatever gwern believes" as a heuristic/algorithm for generating true beliefs, and don't disagree with anything I've ever seen written by Vladimir Nesov, Kaj Sotala, Luke Muehlhauser, komponisto, or even Wei Dai; policy debates should not appear one-sided, so it's good that they don't.

I write this because I'm feeling more and more lonely, in this regard. If you also stand by the sequences, feel free to say that. If you don't, feel free to say that too, but please don't substantiate it. I don't want this thread to be a low-level rehash of tired debates, though it will surely have some of that in spite of my sincerest wishes.

Holden Karnofsky  said:

I believe I have read the vast majority of the Sequences, including the AI-foom debate, and that this content - while interesting and enjoyable - does not have much relevance for the arguments I've made.

I can't understand this. How could the sequences not be relevant? Half of them were created when Eliezer was thinking about AI problems.

So I say this, hoping others will as well:
I stand by the sequences.

And with that, I tap out.  I have found the answer, so I am leaving the conversation.

Even though I am not important here, I don't want you to interpret my silence from now on as indicating compliance.

After some degree of thought and nearly 200 comment replies on this article, I regret writing it. I was insufficiently careful, didn't think enough about how it might alter the social dynamics here, and didn't spend enough time clarifying, especially regarding the third bullet point. I also dearly hope that I have not entrenched anyone's positions, turning them into allied soldiers to be defended, especially not my own. I'm sorry.

Just another day in utopia

73 Stuart_Armstrong 25 December 2011 09:37AM

(Reposted from discussion at a commenter's suggestion)

Thinking of Eliezer's fun theory and the challenge of creating actual utopias where people would like to live, I tried to write a light utopia for my friends around Christmas, and thought it might be worth sharing. It's a techno-utopia, but (considering my audience) it's only a short inferential distance from normality.

 

 

Just another day in Utopia

Ishtar went to sleep in the arms of her lover Ted, and awoke locked in a safe, in a cargo hold of a triplane spiralling towards a collision with the reconstructed temple of Solomon.

 

Again! Sometimes she wished that a whole week would go by without something like that happening. But then, she had chosen a high excitement existence (not maximal excitement, of course – that was for complete masochists), so she couldn’t complain. She closed her eyes for a moment and let the thrill and the adrenaline warp her limbs and mind, until she felt transformed, yet again, into a demi-goddess of adventure. Drugs couldn’t have that effect on her, she knew; only real danger and challenge could do that.

 

Evaluating the feasibility of SI's plan

24 JoshuaFox 10 January 2013 08:17AM

(With Kaj Sotala)

SI's current R&D plan seems to go as follows: 

1. Develop the perfect theory.
2. Implement this as a safe, working, Artificial General Intelligence -- and do so before anyone else builds an AGI.

The Singularity Institute is almost the only group working on friendliness theory (although with very few researchers). So, they have the lead on Friendliness. But there is no reason to think that they will be ahead of anyone else on the implementation.

The few AGI designs we can look at today, like OpenCog, are big, messy systems which intentionally attempt to exploit various cognitive dynamics that might combine in unexpected and unanticipated ways, and which have various human-like drives rather than the sort of supergoal-driven, utility-maximizing goal hierarchies that Eliezer talks about, or which a mathematical abstraction like AIXI employs.

A team which is ready to adopt a variety of imperfect heuristic techniques will have a decisive lead on approaches based on pure theory. Without the constraint of safety, one of them will beat SI in the race to AGI. SI cannot ignore this. Real-world, imperfect, safety measures for real-world, imperfect AGIs are needed.  These may involve mechanisms for ensuring that we can avoid undesirable dynamics in heuristic systems,  or AI-boxing toolkits usable in the pre-explosion stage, or something else entirely. 

SI’s hoped-for theory will include a reflexively consistent decision theory, something like a greatly refined Timeless Decision Theory.  It will also describe human value as formally as possible, or at least describe a way to pin it down precisely, something like an improved Coherent Extrapolated Volition.

The hoped-for theory is intended to  provide not only safety features, but also a description of the implementation, as some sort of ideal Bayesian mechanism, a theoretically perfect intelligence.

SI-ers have said to me that SI's design will have a decisive implementation advantage. The idea is that because strap-on safety can’t work, Friendliness research necessarily involves more fundamental architectural design decisions, which also happen to be general AGI design decisions that some other AGI builder could grab and save themselves a lot of effort. The assumption seems to be that all other designs are based on hopelessly misguided design principles. SI-ers, the idea seems to go, are so smart that they'll build AGI far before anyone else. Others will succeed only when hardware capabilities allow crude near-brute-force methods to work.

Yet even if the Friendliness theory provides the basis for intelligence, the nitty-gritty of SI’s implementation will still be far away, and will involve real-world heuristics and other compromises.

We can compare SI’s future AI design to AIXI, another mathematically perfect AI formalism (though it has some critical reflexivity issues). Schmidhuber, Hutter, and colleagues think that their AIXI can be scaled down into a feasible implementation, and have implemented some toy systems. Similarly, any actual AGI based on SI's future theories will have to stray far from its mathematically perfected origins.

Moreover, SI's future friendliness proof may simply be wrong. Eliezer writes a lot about logical uncertainty, the idea that you must treat even purely mathematical ideas with the same probabilistic techniques as any ordinary uncertain belief. He pursues this mostly so that his AI can reason about itself, but the same principle applies to Friendliness proofs as well.

Perhaps Eliezer thinks that a heuristic AGI is absolutely doomed to failure; that a hard takeoff soon after the creation of the first AGI is so overwhelmingly likely that a mathematically designed AGI is the only one that could stay Friendly. In that case, we have to work on a pure-theory approach, even if it has a low chance of being finished first. Otherwise we'll be dead anyway. If an embryonic AGI will necessarily undergo an intelligence explosion, we have no choice but to "shut up and do the impossible."

I am all in favor of gung-ho knife-between-the-teeth projects. But when you think that your strategy is impossible, then you should also look for a strategy which is possible, if only as a fallback. Thinking about safety theory until drops of blood appear on your forehead (as Eliezer puts it, quoting Gene Fowler) is all well and good. But if there is only a 10% chance of achieving 100% safety (not that there really is any such thing), then I'd rather go for a strategy that provides only a 40% promise of safety, but with a 40% chance of achieving it. OpenCog and the like are going to be developed regardless, and probably before SI's own provably friendly AGI. So, even an imperfect safety measure is better than nothing.
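The comparison in the paragraph above can be made concrete with a little arithmetic. The sketch below assumes, as one possible reading of the text, that a strategy's overall prospect is the chance of completing it multiplied by the level of safety it then delivers. The function name and this multiplicative model are illustrative assumptions, not something the post specifies.

```python
def expected_safety(p_achieve: float, safety_if_achieved: float) -> float:
    """Chance of completing a strategy times the safety it then delivers.

    Assumption: the two factors combine multiplicatively.
    """
    return p_achieve * safety_if_achieved


# Figures taken from the paragraph above.
perfect_theory = expected_safety(0.10, 1.00)     # 10% chance of 100% safety
imperfect_measure = expected_safety(0.40, 0.40)  # 40% chance of 40% safety

print(f"perfect theory:    {perfect_theory:.2f}")
print(f"imperfect measure: {imperfect_measure:.2f}")
```

Under this reading, the imperfect measure scores 0.16 against 0.10 for the perfect theory, which is the sense in which the fallback can look preferable. A different way of combining the two numbers could change the ranking.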

If heuristic approaches really have a 99% chance of an immediate unfriendly explosion, then pursuing them might be wrong. But SI, better than anyone, should know that any intuition-based probability estimate of “99%” really means “70%”. Even if other approaches are long-shots, we should not put all our eggs in one basket. Theoretical perfection and stopgap safety measures can be developed in parallel.

Given what we know about human overconfidence and the general reliability of predictions, the actual outcome will to a large extent be something that none of us ever expected or could have predicted. No matter what happens, progress on safety mechanisms for heuristic AGI will improve our chances if something entirely unexpected happens.

What impossible thing should SI be shutting up and doing? For Eliezer, it’s Friendliness theory. To him, safety for heuristic AGI is impossible, and we shouldn't direct our efforts in that direction. But why shouldn't safety for heuristic AGI be another impossible thing to do?

(Two impossible things before breakfast … and maybe a few more? Eliezer seems to be rebuilding logic, set theory, ontology, epistemology, axiology, decision theory, and more, mostly from scratch. That's a lot of impossibles.)

And even if safety for heuristic AGIs is really impossible for us to figure out now, there is some chance of an extended soft takeoff that will allow us to develop heuristic AGIs which will help in figuring out AGI safety, whether because we can use them for our tests, or because they can help by applying their embryonic general intelligence to the problem. Goertzel and Pitt have urged this approach.

Yet resources are limited. Perhaps the folks who are actually building their own heuristic AGIs are in a better position than SI to develop safety mechanisms for them, while SI is the only organization which is really working on a formal theory on Friendliness, and so should concentrate on that. It could be better to focus SI's resources on areas in which it has a relative advantage, or which have a greater expected impact.

Even if so, SI should evangelize AGI safety to other researchers, not only as a general principle, but also by offering theoretical insights that may help them as they work on their own safety mechanisms.

In summary:

1. AGI development which is unconstrained by a friendliness requirement is likely to beat a provably-friendly design in a race to implementation, and some effort should be expended on dealing with this scenario.

2. Pursuing a provably-friendly AGI, even if very unlikely to succeed, could still be the right thing to do if it was certain that we’ll have a hard takeoff very soon after the creation of the first AGIs. However, we do not know whether or not this is true.

3. Even the provably friendly design will face real-world compromises and errors in its  implementation, so the implementation will not itself be provably friendly. Thus, safety protections of the sort needed for heuristic design are needed even for a theoretically Friendly design.

Harry Potter and the Methods of Rationality discussion thread, part 18, chapter 87

4 Alsadius 22 December 2012 07:55AM

This is a new thread to discuss Eliezer Yudkowsky’s Harry Potter and the Methods of Rationality and anything related to it. This thread is intended for discussing chapter 87. The previous thread has passed 500 comments.

There is now a site dedicated to the story at hpmor.com, which is now the place to go to find the author’s notes and all sorts of other goodies. AdeleneDawner has kept an archive of Author’s Notes. (This goes up to the notes for chapter 76, and is no longer updating. The author’s notes from chapter 77 onwards are on hpmor.com.)

The first 5 discussion threads are on the main page under the harry_potter tag. Threads 6 and on (including this one) are in the discussion section using its separate tag system. Also: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.

Spoiler Warning: this thread is full of spoilers. With few exceptions, spoilers for MOR and canon are fair game to post, without warning or rot13. More specifically:

You do not need to rot13 anything about HP:MoR or the original Harry Potter series unless you are posting insider information from Eliezer Yudkowsky which is not supposed to be publicly available (which includes public statements by Eliezer that have been retracted).

If there is evidence for X in MOR and/or canon then it’s fine to post about X without rot13, even if you also have heard privately from Eliezer that X is true. But you should not post that “Eliezer said X is true” unless you use rot13.

Train Philosophers with Pearl and Kahneman, not Plato and Kant

61 lukeprog 06 December 2012 12:42AM

Part of the sequence: Rationality and Philosophy

Hitherto the people attracted to philosophy have been mostly those who loved the big generalizations, which were all wrong, so that few people with exact minds have taken up the subject.

Bertrand Russell

 

I've complained before that philosophy is a diseased discipline which spends far too much of its time debating definitions, ignoring relevant scientific results, and endlessly re-interpreting old dead guys who didn't know the slightest bit of 20th century science. Is that still the case?

You bet. There's some good philosophy out there, but much of it is bad enough to make CMU philosopher Clark Glymour suggest that on tight university budgets, philosophy departments could be defunded unless their work is useful to (cited by) scientists and engineers — just as his own work on causal Bayes nets is now widely used in artificial intelligence and other fields.

How did philosophy get this way? Russell's hypothesis is not too shabby. Check the syllabi of the undergraduate "intro to philosophy" classes at the world's top 5 U.S. philosophy departments (NYU, Rutgers, Princeton, Michigan Ann Arbor, and Harvard) and you'll find that they spend a lot of time with (1) old dead guys who were wrong about almost everything because they knew nothing of modern logic, probability theory, or science, and with (2) 20th century philosophers who were way too enamored with cogsci-ignorant armchair philosophy. (I say more about the reasons for philosophy's degenerate state here.)

As the CEO of a philosophy/math/compsci research institute, I think many philosophical problems are important. But the field of philosophy doesn't seem to be very good at answering them. What can we do?

Why, come up with better philosophical methods, of course!

Scientific methods have improved over time, and so can philosophical methods. Here is the first of my recommendations...

[link] Interview with Anders Sandberg on how to make a difference through research and how to choose a research topic

8 Pablo_Stafforini 01 December 2012 06:25PM

Here. Some excerpts:

What do you think are some good heuristics for doing high impact research?

One idea is to go for under-researched fields. Progress in a field is typically a very convex learning curve: rapid progress at first when the low-hanging fruits get picked by the pioneers, followed by slowing progress as the problems get harder and it takes longer to learn the necessary skills to get to them. So the same amount of effort might produce far more progress in a little studied field than in a big one. [...]

It can also help to turn the question around: what aspects of human life matter? Looking at human life, we sleep about a third of the time, and there’s very little research into how to enhance sleep. Understanding the health effects of what we eat is probably worth billions of pounds per year. But there are no financial incentives here. Maybe a simple approach for finding high impact research areas might be to look at the most common google searches: you can get a pretty good idea of what human behaviour matters a lot!

Do you think it’s better to be a generalist and get a broad understanding of a lot of things, or to specialise early and really focus on a single area you think is high impact?

Over the history of my academic career my most useful courses have been linear algebra, all the statistics and probability theory I’ve been able to pick up, some basic computer science, and a course on natural disasters. [...]

Even if you do focus on one field, knowing enough about other fields is good as you can recognise when you need the help of someone from another department.

What other barriers are there to doing important research?

Looking at some of these under-researched fields, the problem is that a lot of them don’t even exist as fields. Typically you’re unlikely to get funding in unknown fields as well: unless you’re a really good salesman! So one heuristic would be to look at the topics you know, do a matrix and look at the interactions: which areas do you see that have nobody doing anything in?

When I went to a computational neuroscience conference last year, I was slightly depressed as I saw a poster which was exactly the same research topic as my last poster! It was pretty clear the young grad student had reached the same conclusion I did, and had never heard of my research which was published 6 years ago! Many fields have this problem that they don’t have very much of a memory, which affects progress.

Fields like AI are struggling because there’s no good way of comparing progress. How much smarter are current general AI programs than some of the classics? Nobody knows, and you can’t test the older programs because the source code and everything has been lost except a few bizarre papers from the early 70s.

LessWrong help desk - free paper downloads and more

35 jsalvatier 07 October 2012 11:45PM

Over the last year, VincentYu, gwern, myself and others have provided 132 academic papers for the LessWrong community (out of 152 requests, an 87% success rate) through the Free research, editing and articles thread. We originally intended to provide editing, research and general troubleshooting help, but article downloads are by far the most requested service.

If you're doing a LessWrong relevant project we want to help you. If you need help accessing a journal article or academic book chapter, we can get it for you. If you need some research or writing help, we can help there too.

Turnaround time for articles published in the last 20 years or so is usually less than a day. Older articles often take a couple of days.

Please make new article requests in the comment section of this thread.

If you would like to help out with finding papers, please monitor this thread for requests. If you want to monitor via RSS like I do, Google Reader will give you the comment feed if you give it the URL for this thread (or use this link directly). 

If you have some special skills you want to volunteer, mention them in the comment section.

A summary of the Hanson-Yudkowsky FOOM debate

22 Kaj_Sotala 15 November 2012 07:25AM

In late spring this year, Luke tasked me with writing a summary and analysis of the Hanson-Yudkowsky FOOM debate, with the intention of having it eventually published somewhere. Due to other priorities, this project was put on hold for the time being. Because it doesn't look like it will be finished in the near future, and because Curiouskid asked to see it, we thought that we might as well share the thing.

I have reorganized the debate, presenting it by topic rather than in chronological order: I start by providing some brief conceptual background that's useful for understanding Eliezer's optimization power argument, after which I present his argument. Robin's various objections follow, after which there is a summary of Robin's view of what the Singularity will be like, together with Eliezer's objections to that view. Hopefully, this should make the debate easier to follow. This summary also incorporates material from the 90-minute live debate on the topic that they had in 2011. The full table of contents:

  1. Introduction
  2. Overview
  3. The optimization power argument
    1. Conceptual background
    2. The argument: Yudkowsky
    3. Recursive self-improvement
    4. Hard takeoff
    5. Questioning optimization power: the question of abstractions
    6. Questioning optimization power: the historical record
    7. Questioning optimization power: the UberTool question
  4. Hanson's Singularity scenario
    1. Architecture vs. content, sharing of information
    2. Modularity of knowledge
    3. Local or global singularity?
  5. Wrap-up
  6. Conclusions
  7. References

Here's the link to the current draft; any feedback is welcome. Feel free to comment if you know of useful references, if you think I've misinterpreted something that was said, or if you think there's any other problem. I'd also be curious to hear to what extent people think that this outline is easier to follow than the original debate, or whether it's just as confusing.

Things philosophers have debated

4 Eliezer_Yudkowsky 31 October 2012 05:09AM

Straight from Wikipedia.

I just had to stare at this a while.  We can have papers published about this, we really ought to be able to get papers published about Friendly AI subproblems.

My favorite part is at the very end.


Trivialism is the theory that every proposition is true. A consequence of trivialism is that all statements, including all contradictions of the form "p and not p" (that something both 'is' and 'isn't' at the same time), are true.[1]

See also

References

  1. Graham Priest; John Woods (2007). "Paraconsistency and Dialetheism". The Many Valued and Nonmonotonic Turn in Logic. Elsevier. p. 131. ISBN 978-0-444-51623-7.

Further reading
