
[Link]"Neural Turing Machines"

Prankster 31 October 2014 08:54AM

The paper.

Discusses the technical aspects of one of Google's AI projects. According to a PCWorld article, the system "apes human memory and programming skills" (the article seems pretty solid, and also contains a link to the paper).

The abstract:

We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to-end, allowing it to be efficiently trained with gradient descent. Preliminary results demonstrate that Neural Turing Machines can infer simple algorithms such as copying, sorting, and associative recall from input and output examples.
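
For a concrete picture of what "interact with by attentional processes" means, here is a minimal sketch of content-based addressing, the kind of differentiable memory read the paper builds on. It is an illustrative simplification (one read head, NumPy instead of a trainable network), not the paper's full architecture:

    import numpy as np

    def content_read(memory, key, beta):
        """Content-based read: attend to memory rows similar to `key`.

        memory: (N, M) array of N memory slots, each an M-dimensional vector.
        key:    (M,) query vector emitted by the controller network.
        beta:   sharpness of the attention (higher = more focused).
        """
        # Cosine similarity between the key and every memory row.
        sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
        # A softmax turns similarities into differentiable attention weights.
        w = np.exp(beta * sims)
        w = w / w.sum()
        # The read vector is a weighted blend of rows, so gradients reach every slot.
        return w @ memory, w

    memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
    read_vec, weights = content_read(memory, key=np.array([1.0, 0.1]), beta=5.0)
    print(weights.round(3), read_vec.round(3))

Because the read is a smooth weighted blend rather than a hard lookup, gradients flow through the memory access, which is what lets the combined system be trained end-to-end with gradient descent.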

 

(First post here, feedback on the appropriateness of the post appreciated)

Maybe you want to maximise paperclips too

18 dougclow 30 October 2014 09:40PM

As most LWers will know, Clippy the Paperclip Maximiser is a superintelligence who wants to tile the universe with paperclips. The LessWrong wiki entry for Paperclip Maximizer says that:

The goal of maximizing paperclips is chosen for illustrative purposes because it is very unlikely to be implemented

I think that a massively powerful star-faring entity - whether a Friendly AI, a far-future human civilisation, aliens, or whatever - might indeed end up essentially converting huge swathes of matter into paperclips. Whether a massively powerful star-faring entity is likely to arise is, of course, a separate question. But if it does arise, it could well want to tile the universe with paperclips.

Let me explain.


To travel across the stars and achieve whatever noble goals you might have (assuming they scale up), you are going to want energy. A lot of energy. Where do you get it? Well, at interstellar scales, your only options are nuclear fusion or maybe fission.

Iron has essentially the highest binding energy per nucleon of any nucleus. If you have elements lighter than iron, you can release energy through nuclear fusion - sticking atoms together to make bigger ones. If you have elements heavier than iron, you can release energy through nuclear fission - splitting atoms apart to make smaller ones. We can do this now for a handful of elements (mostly selected isotopes of uranium, plutonium and hydrogen) but we don’t know how to do this for most of the others - yet. But it looks thermodynamically possible. So if you are a massively powerful and massively clever galaxy-hopping agent, you can extract maximum energy for your purposes by taking up all the non-ferrous matter you can find and turning it into iron, getting energy through fusion or fission as appropriate.
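
To put rough numbers on "maximum energy", here is a back-of-the-envelope sketch using approximate binding energies per nucleon (about 0 MeV for hydrogen-1, 7.6 MeV for uranium-238, 8.8 MeV for iron-56); treating iron as the exact end-point is an idealisation:

    # Back-of-the-envelope: energy available per kilogram by pushing matter toward iron.
    MEV_TO_J = 1.602e-13
    NUCLEONS_PER_KG = 1 / 1.67e-27   # roughly 6e26 nucleons in a kilogram of matter

    def energy_per_kg(start_mev_per_nucleon, end_mev_per_nucleon=8.8):
        """Energy released (J/kg) moving from one binding energy per nucleon to iron's."""
        return (end_mev_per_nucleon - start_mev_per_nucleon) * MEV_TO_J * NUCLEONS_PER_KG

    print(f"hydrogen fused all the way to iron: {energy_per_kg(0.0):.1e} J/kg")
    print(f"uranium fissioned down to iron:     {energy_per_kg(7.6):.1e} J/kg")

That is roughly 8×10^14 J per kilogram for fusing hydrogen down to iron, and about a seventh of that for fissioning uranium - either way, iron is where the ledger stops.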

You leave behind you a cold, dark trail of iron.

That seems a little grim. If you have any aesthetic sense, you might want to make it prettier, to leave an enduring sign of values beyond mere energy acquisition. With careful engineering, it would take only a tiny, tiny amount of extra effort to leave the iron arranged into beautiful shapes. Curves are nice. What do you call a lump of iron arranged into an artfully-twisted shape? I think we could reasonably call it a paperclip.

Over time, the amount of space that you’ve visited and harvested for energy will increase, and the amount of space available for your noble goals - or for anyone else’s - will decrease. Gradually but steadily, you are converting the universe into artfully-twisted pieces of iron. To an onlooker who doesn’t see or understand your noble goals, you will look a lot like you are a paperclip maximiser. In Eliezer’s terms, your desire to do so is an instrumental value, not a terminal value. But - conditional on my wild speculations about energy sources here being correct - it’s what you’ll do.

Academic papers

3 Capla 30 October 2014 04:53PM

In line with my continuing self-education...

What are the most important or personally influential academic papers you've ever read? Which ones are essential (or just good) for an informed person to have read?

Is there any body of research of which you found the original papers much more valuable than the popularizations or secondary sources (Wikipedia articles, textbook write-ups, etc.), for any reason? What was that reason? Does anyone have a good heuristic for when it is important to "go to the source" and when someone else's summation will do? I have a theoretical preference for reading the original research, since if I need to evaluate an idea's merit, reading what others in that field read (instead of the simplified versions) seems like a good idea, but it has the downside of being harder and more time-consuming.

I have wondered if the only reason to bother with technical sounding papers that are hard to understand is that you have to read them (or pretend to read them) in order to cite them.

 

Link: Open-source programmable, 3D printable robot for anyone to experiment with

1 polymathwannabe 29 October 2014 02:21PM

Its name is Poppy.

"Both hardware and software are open source. There is not one single Poppy humanoid robot but as many as there are users. This makes it very attractive as it has grown from a purely technological tool to a real social platform."

Wikipedia articles from the future

15 snarles 29 October 2014 12:49PM

Speculation is important for forecasting; it's also fun.  Speculation is usually conveyed in two forms: in the form of an argument, or encapsulated in fiction; each has their advantages, but both tend to be time-consuming.  Presenting speculation in the form of an argument involves researching relevant background and formulating logical arguments.  Presenting speculation in the form of fiction requires world-building and storytelling skills, but it can quickly give the reader an impression of the "big picture" implications of the speculation; this can be more effective at establishing the "emotional plausibility" of the speculation.

I suggest a storytelling medium which can combine attributes of both arguments and fiction, but requires less work than either. That is the "Wikipedia article from the future." Fiction written by inexperienced sci-fi writers tends to degenerate into a speculative encyclopedia anyway--why not just admit that you want to write an encyclopedia in the first place?  Post your "Wikipedia articles from the future" below.

LW Supplement use survey

10 FiftyTwo 28 October 2014 09:28PM

I've put together a very basic survey using Google Forms, inspired by NancyLebovitz's recent discussion post on supplement use.

The survey includes options for "other" and "do not use supplements." Results are anonymous, and you can view all the results once you have filled it in, or use this link.

 

Link to the Survey

Things to consider when optimizing: Sleep

11 mushroom 28 October 2014 05:26PM

I'd like to have a series of discussion posts, where each post is of the form "Let's brainstorm things you might consider when optimizing X", where X is something like sleep, exercise, commuting, studying, etc. Think of it like a specialized repository.

In the spirit of "try more things", the direct benefit is to provide insights like "Oh, I never realized that BLAH is a knob I can fiddle. This gives me an idea of how I might change BLAH given my particular circumstances. I will try this and see what happens!"

The indirect benefit is to practice instrumental rationality using the "toy problem" provided by a general prompt.

Accordingly, participation could be in many forms:

* Pointers to scientific research
* General directions to consider
* Personal experience
* Boring advice
* Intersections with other community ideas, biases
* Cost-benefit, value-of-information analysis
* Related questions
* Other musings, thoughts, speculation, links, theories, etc.

This post is on sleep and circadian rhythms.

Cross-temporal dependency, value bounds and superintelligence

4 joaolkf 28 October 2014 03:26PM

In this short post I will attempt to put forth some potential concerns that should be relevant when developing superintelligences, if certain meta-ethical effects exist. I do not claim they exist, only that it might be worth looking for them since their existence would mean some currently irrelevant concerns are in fact relevant. 

 

These meta-ethical effects would be a certain kind of cross-temporal dependency of moral value. First, let me explain what I mean by cross-temporal dependency. If value is cross-temporally dependent, it means that value at t2 could be affected by t1, independently of any causal role t1 has on t2. The same event X at t2 could have more or less moral value depending on whether Z or Y happened at t1. For instance, this could be the case on matters of survival. If we kill someone and replace her with a slightly more valuable person, some would argue there was a loss rather than a gain of moral value; whereas if a new person with moral value equal to the difference of the previous two is created where there was none, most would consider it an absolute gain. Furthermore, some might consider that small, gradual and continual improvements are better than abrupt and big ones. For example, a person who forms an intention and a careful, detailed plan to become better, and who forcefully works herself into being better, could acquire more value than a person who simply happens to take a pill and instantly becomes a better person - even if they become that exact same person. This is not because effort is intrinsically valuable, but because of personal continuity. There are more intentions, deliberations and desires connecting the two time-slices of the person who changed through effort than there are connecting the two time-slices of the person who changed by taking a pill. Even though both persons become equally morally valuable in isolated terms, they do so via different paths that differently affect their final value.

More examples. You live now in t1. If suddenly in t2 you were replaced by an alien individual with the same amount of value as you would otherwise have in t2, then t2 may not have the exact same amount of value as it would otherwise have, simply in virtue of the fact that in t1 you were alive and the alien's previous time-slice was not. 365 individuals each living for one day do not amount to the same value as a single individual living through 365 days. Slice history into 1-day periods: each day the universe contains one unique advanced civilization with the same overall total moral value, each civilization being completely alien and ineffable to the others; each civilization lives for only one day, and then it is gone forever. This universe does not seem to hold the same moral value as one where only one of those civilizations flourishes for eternity. In all these examples the value of a period of time seems to be affected by the existence or not of certain events at other periods. They indicate that there is, at least, some cross-temporal dependency.

 

Now consider another type of effect, bounds on value. There could be a physical bound – transfinite or not - on the total amount of moral value that can be present per instant. For instance, if moral value rests mainly on sentient well-being, which can be categorized as a particular kind of computation, and there is a bound on the total amount of such computation which can be performed per instant, then there is a bound on the amount of value per instant. If, arguably, we are currently extremely far from such a bound, and this bound will eventually be reached by a superintelligence (or any other structure), then the total moral value of the universe would be dominated by the value of this physical bound, given that regions where the physical bound wasn't reached would make negligible contributions. The faster the bound can be reached, the more negligible the pre-bound values become.

 

Finally, if there is a form of cross-temporal value dependence where the events preceding a superintelligence could alter the value of this physical bound, then we not only ought to make sure we safely construct a superintelligence, but also that we do so following the path that maximizes such bound. It might be the case that an overly abrupt superintelligence would decrease such bound, and thus all future moral value would be diminished by the fact that there was a huge discontinuity in the past in the events leading to this future. Even small decreases of such bound would have dramatic effects. Although I do not know of any plausible cross-temporal effect of this kind, it seems this question deserves at least a minimal amount of thought. Both cross-temporal dependency and bounds on value seem plausible (in fact I believe some form of them are true), so it is not at all prima facie inconceivable that we could have cross-temporal effects changing the bound up or down.

Donation Discussion - alternatives to the Against Malaria Foundation

4 ancientcampus 28 October 2014 03:00AM

About a year and a half ago, I made a donation to the Against Malaria Foundation. This was during jkaufman's generous matching offer.

That was 20 months ago, and my money is still in the "underwriting" phase - funding projects that are still, as of yet, just plans and no nets.

Now, the AMF has had a reasonable reason it was taking longer than expected:

"A provisional, large distribution in a province of the [Democratic Republic of the Congo] will not proceed as the distribution agent was unable to agree to the process requested by AMF during the timeframe needed by our co-funding partner."

So they've hit a snag, the earlier project fell through, and they are only now allocating my money to a new project. Don't get me wrong, I am very glad they are telling me where my money is going, and especially glad it didn't just end up in someone's pocket instead. With that said, though, I still must come to this conclusion:

The AMF seems to have more money than they can use, right now.

So, LW, I have the following questions:

  1. Is this a problem? Should one give their funds to another charity for the time being?
  2. Regardless of your answer to the above, are there any recommendations for other transparent, efficient charities? [other than MIRI]

Link: Elon Musk wants gov't oversight for AI

8 polymathwannabe 28 October 2014 02:15AM

"I'm increasingly inclined to thing there should be some regulatory oversight, maybe at the national and international level just to make sure that we don't do something very foolish."

http://www.cnet.com/news/elon-musk-we-are-summoning-the-demon-with-artificial-intelligence/#ftag=CAD590a51e

Superintelligence 7: Decisive strategic advantage

5 KatjaGrace 28 October 2014 01:01AM

This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.


Welcome. This week we discuss the seventh section in the reading guide: Decisive strategic advantage. This corresponds to Chapter 5.

This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).

Reading: Chapter 5 (p78-91)


Summary

  1. Question: will a single artificial intelligence project get to 'dictate the future'? (p78)
  2. We can ask, will a project attain a 'decisive strategic advantage' and will they use this to make a 'singleton'?
    1. 'Decisive strategic advantage' = a level of technological and other advantages sufficient for complete world domination (p78)
    2. 'Singleton' = a single global decision-making agency strong enough to solve all major global coordination problems (p78, 83)
  3. A project will get a decisive strategic advantage if there is a big enough gap between its capability and that of other projects. 
  4. A faster takeoff would make this gap bigger. Other factors would too, e.g. diffusion of ideas, regulation or expropriation of winnings, the ease of staying ahead once you are far enough ahead, and AI solutions to loyalty issues (p78-9)
  5. In some historical examples, leading projects have had a lead of a few months to a few years over those following them. (p79)
  6. Even if a second project starts taking off before the first is done, the first may emerge decisively advantageous. If we imagine takeoff accelerating, a project that starts out just behind the leading project might still be far inferior when the leading project reaches superintelligence; a toy simulation of this is sketched just after this summary. (p82)
  7. How large would a successful project be? (p83) If the route to superintelligence is not AI, the project probably needs to be big. If it is AI, size is less clear. If lots of insights are accumulated in open resources, and can be put together or finished by a small team, a successful AI project might be quite small (p83).
  8. We should distinguish the size of the group working on the project, and the size of the group that controls the project (p83-4)
  9. If large powers anticipate an intelligence explosion, they may want to monitor those involved and/or take control. (p84)
  10. It might be easy to monitor very large projects, but hard to trace small projects designed to be secret from the outset. (p85)
  11. Authorities may just not notice what's going on, for instance if politically motivated firms and academics fight against their research being seen as dangerous. (p85)
  12. Various considerations suggest a superintelligence with a decisive strategic advantage would be more likely than a human group to use the advantage to form a singleton (p87-89)
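
A toy illustration of point 6: assume (purely for the sake of the sketch) that capability growth accelerates, with the growth rate proportional to the square of current capability, and give the follower a fixed lag behind the leader. All the numbers are arbitrary:

    # Toy model: leader and follower share the same accelerating growth rule;
    # the follower simply starts a fixed number of steps later.
    def trajectory(start_step, growth=0.02, steps=200, cap_max=1e9):
        caps = [0.0] * start_step + [1.0]
        while len(caps) < steps:
            c = caps[-1]
            caps.append(min(c + growth * c * c, cap_max))  # rate grows with capability
        return caps

    leader = trajectory(start_step=0)
    follower = trajectory(start_step=24)   # starts 24 steps behind

    threshold = 1000.0                     # arbitrary "superintelligence" level
    t_done = next(t for t, c in enumerate(leader) if c >= threshold)
    print(f"leader crosses the threshold at step {t_done}")
    print(f"at that moment: leader {leader[t_done]:.0f}, follower {follower[t_done]:.1f}")

With accelerating growth, a modest head start turns into an enormous capability gap by the time the leader crosses any fixed threshold; with merely exponential growth the ratio between the two projects would stay constant instead.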

Another view

This week, Paul Christiano contributes a guest sub-post on an alternative perspective:

Typically new technologies do not allow small groups to obtain a “decisive strategic advantage”—they usually diffuse throughout the whole world, or perhaps are limited to a single country or coalition during war. This is consistent with intuition: a small group with a technological advantage will still do further research slower than the rest of the world, unless their technological advantage overwhelms their smaller size.

The result is that small groups will be overtaken by big groups. Usually the small group will sell or lease their technology to society at large first, since a technology’s usefulness is proportional to the scale at which it can be deployed. In extreme cases such as war these gains might be offset by the cost of empowering the enemy. But even in this case we expect the dynamics of coalition-formation to increase the scale of technology-sharing until there are at most a handful of competing factions.

So any discussion of why AI will lead to a decisive strategic advantage must necessarily be a discussion of why AI is an unusual technology.

In the case of AI, the main difference Bostrom highlights is the possibility of an abrupt increase in productivity. In order for a small group to obtain such an advantage, their technological lead must correspond to a large productivity improvement. A team with a billion dollar budget would need to secure something like a 10,000-fold increase in productivity in order to outcompete the rest of the world. Such a jump is conceivable, but I consider it unlikely. There are other conceivable mechanisms distinctive to AI; I don’t think any of them have yet been explored in enough depth to be persuasive to a skeptical audience.


Notes

1. Extreme AI capability does not imply strategic advantage. An AI program could be very capable - such that the sum of all instances of that AI worldwide were far superior (in capability, e.g. economic value) to the rest of humanity's joint efforts - and yet the AI could fail to have a decisive strategic advantage, because it may not be a strategic unit. Instances of the AI may be controlled by different parties across society. In fact this is the usual outcome for technological developments.

2. On gaps between the best AI project and the second best AI project (p79). A large gap might develop either because of an abrupt jump in capability or extremely fast progress (which is much like an abrupt jump), or from one project having consistently faster growth than other projects for a time. Consistently faster progress is a bit like a jump, in that there is presumably some particular highly valuable thing that changed at the start of the fast progress. Robin Hanson frames his Foom debate with Eliezer as about whether there are 'architectural' innovations to be made, by which he means innovations which have a large effect (or so I understood from conversation). This seems like much the same question. On this, Robin says:

Yes, sometimes architectural choices have wider impacts. But I was an artificial intelligence researcher for nine years, ending twenty years ago, and I never saw an architecture choice make a huge difference, relative to other reasonable architecture choices. For most big systems, overall architecture matters a lot less than getting lots of detail right. Researchers have long wandered the space of architectures, mostly rediscovering variations on what others found before.

3. What should activists do? Bostrom points out that activists seeking maximum expected impact might wish to focus their planning on high leverage scenarios, where larger players are not paying attention (p86). This is true, but it's worth noting that changing the probability of large players paying attention is also an option for activists, if they think the 'high leverage scenarios' are likely to be much better or worse.

4. Trade. One key question seems to be whether successful projects are likely to sell their products, or hoard them in the hope of soon taking over the world. I doubt this will be a strategic decision they will make - rather it seems that one of these options will be obviously better given the situation, and we are uncertain about which. A lone inventor of writing should probably not have hoarded it for a solitary power grab, even though it could reasonably have seemed like a good candidate for radically speeding up the process of self-improvement.

5. Disagreement. Note that though few people believe that a single AI project will get to dictate the future, this is often because they disagree with things in the previous chapter - e.g. that a single AI project will plausibly become more capable than the world in the space of less than a month.

6. How big is the AI project? Bostrom distinguishes between the size of the effort to make AI and the size of the group ultimately controlling its decisions. Note that the people making decisions for the AI project may also not be the people making decisions for the AI - i.e. the agents that emerge. For instance, the AI making company might sell versions of their AI to a range of organizations, modified for their particular goals. While in some sense their AI has taken over the world, the actual agents are acting on behalf of much of society.

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.

 

  1. When has anyone gained a 'decisive strategic advantage' at a smaller scale than the world? Can we learn anything interesting about what characteristics a project would need to have such an advantage with respect to the world?
  2. How scalable is innovative project secrecy? Examine past cases: Manhattan Project, Bletchley Park, Bitcoin, Anonymous, Stuxnet, Skunk Works, Phantom Works, Google X.
  3. How large are the gaps in development time between modern software projects? What dictates this? (e.g. is there diffusion of ideas from engineers talking to each other? From people changing organizations? Do people get far enough ahead that it is hard to follow them?)

 

If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter.  The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about Cognitive superpowers (section 8). To prepare, read Chapter 6. The discussion will go live at 6pm Pacific time next Monday, 3 November. Sign up to be notified here.

Stupid Questions (10/27/2014)

14 drethelin 27 October 2014 09:27PM

I think it's past time for another Stupid Questions thread, so here we go. 

 

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Please respect people trying to fix any ignorance they might have, rather than mocking that ignorance. 

 

vaccination research/reading

0 freyley 27 October 2014 05:20PM

Vaccination is probably one of the hardest topics to have a rational discussion about. I have some reason to believe that the author of http://whyarethingsthisway.com/2014/10/23/the-cdc-and-cargo-cult-science/ is someone interested in looking for the truth, not winning a side - at the very least, I'd like to help him when he says this:

I genuinely don’t want to do Cargo Cult Science so if anybody reading this knows of any citations to studies looking at the long term effects of vaccines and finding them benign or beneficial, please, be sure to post them in the comments.

 

I'm getting started on reading the actual papers, but I'm hoping this finds someone who's already done the work and wants to go post it on his site, or if not, someone else who's interested in looking through papers with me - I do better at this kind of work with social support. 

Open thread, Oct. 27 - Nov. 2, 2014

5 MrMind 27 October 2014 08:58AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Don't Be Afraid of Asking Personally Important Questions of Less Wrong

31 Evan_Gaensbauer 26 October 2014 08:02AM

Related: LessWrong as a social catalyst

I primarily used my prior user profile to ask questions of Less Wrong. When I had an inkling for a query, but I didn't have a fully formed hypothesis, I wouldn't know how to search for answers to questions on the Internet myself, so I asked them on Less Wrong.

The reception I have received has been mostly positive. Here are some examples:

  • Back when I was trying to figure out which college major to pursue, I queried Less Wrong about which one was worth my effort. I followed this up with a discussion about whether it was worthwhile for me personally, and for someone in general, to pursue graduate studies.


Other student users of Less Wrong benefit from the insight of their careered peers:

  • A friend of mine was considering pursuing medicine to earn to give. In the same vein as my own discussion, I suggested he pose the question to Less Wrong. He didn't feel like it at first, so I posed the query on his behalf. In a few days, he received feedback which returned the conclusion that pursuing medical school through the avenues he was aiming for wasn't his best option relative to his other considerations. He showed up in the thread, and expressed his gratitude. The entirety of the online rationalist community that was willing to respond provided valuable information for an important question. It might have taken him lots of time, attention, and effort to look for the answers to this question by himself.

In engaging with Less Wrong, with the rest of you, my experience has been that Less Wrong isn't just useful as an archive of blog posts, but is actively useful as a community of people. As weird as it may seem, you can generate positive externalities that improve the lives of others by merely writing a blog post. This extends to responding in the comments section too. Stupid Questions Threads are a great example of this; you can ask questions about your procedural knowledge gaps without fear of reprisal. People have gotten great responses on everything from getting more value out of conversations, to being more socially successful, to learning and appreciating music as an adult. Less Wrong may be one of few online communities for which even the comments sections are useful, by default.

Even though the above examples weren't the most popular discussions ever started, and likely didn't get as much traffic, the feedback they received made them more personally valuable to one individual than several more popular discussions.

At the CFAR workshop I attended, I was taught two relevant skills:

* Value of Information Calculations: formulating a question well, and performing a Fermi estimate, or back-of-the-envelope calculation, in an attempt to answer it, generates quantified insight you wouldn't have otherwise anticipated.

* Social Comfort Zone Expansion: humans tend to have a greater aversion to trying new things socially than is maximally effective, and one way of viscerally teaching System 1 this lesson is by trial-and-error of taking small risks. Posting on Less Wrong, especially, e.g., in a special thread, is really a low-risk action. The pang of losing karma can feel real, but losing karma really is a valuable signal that one should try again differently. Also, it's not as bad as failing at taking risks in meatspace.

When I've received downvotes for a comment, I interpret that as useful information, try to model what I did wrong, and thank others for correcting my confused thinking. If you're worried about writing something embarrassing, that's understandable, but realize it's a fact about your untested anticipations, not a fact about everyone else using Less Wrong. There are dozens of brilliant people with valuable insights at the ready, reading Less Wrong for fun, and who like helping us answer our own personal questions. Users shminux and Carl Shulman are exemplars of this.

This isn't an issue for all users, but I feel as if not enough users are taking advantage of the personal value they can get by asking more questions. This post is intended to encourage them. User Gunnar Zarnacke suggested that if enough examples of experiences like this were accrued, it could be transformed into some sort of repository of personal value from Less Wrong.

[Link] "The Problem With Positive Thinking"

13 CronoDAS 26 October 2014 06:50AM

Psychology researchers discuss their findings in a New York Times op-ed piece.

The take-home advice:

Positive thinking fools our minds into perceiving that we’ve already attained our goal, slackening our readiness to pursue it.

...

What does work better is a hybrid approach that combines positive thinking with “realism.” Here’s how it works. Think of a wish. For a few minutes, imagine the wish coming true, letting your mind wander and drift where it will. Then shift gears. Spend a few more minutes imagining the obstacles that stand in the way of realizing your wish.

This simple process, which my colleagues and I call “mental contrasting,” has produced powerful results in laboratory experiments. When participants have performed mental contrasting with reasonable, potentially attainable wishes, they have come away more energized and achieved better results compared with participants who either positively fantasized or dwelt on the obstacles.

When participants have performed mental contrasting with wishes that are not reasonable or attainable, they have disengaged more from these wishes. Mental contrasting spurs us on when it makes sense to pursue a wish, and lets us abandon wishes more readily when it doesn’t, so that we can go after other, more reasonable ambitions.

Podcasts?

6 Capla 25 October 2014 11:42PM

I discovered podcasts last year, and I love them! Why not be hearing about new ideas while I'm walking to where I'm going? (Some of you might shout "insight porn!", and I think that I largely agree. However, 1) I don't have any particular problem with insight porn and 2) I have frequently been exposed to an idea or been recommended a book through a podcast, on which I later followed up, leading to more substantive intellectual growth.)

I wonder if anyone has favorites that they might want to share with me.

I'll start:

Radiolab is, hands down, the best of all the podcasts. This seems universally recognized: I’ve yet to meet anyone who disagrees. Even the people who make other podcasts think that Radiolab is better than their own. This one regularly invokes a profound sense of wonder at the universe and gratitude for being able to appreciate it. If you missed it somehow, you're probably missing out.

The Freakonomics podcast, in my opinion, comes close to Radiolab. All the things that you thought you knew, but didn’t, and all the things you never knew you wanted to know, but do, in typical Freakonomics style. Listening to their podcast is one of the two things that makes me happy.

There’s one other podcast that I consider to be in the same league (and this one you've probably never heard of): The Memory Palace. 5-10 minute stories from history, it is really well done. It’s all the more impressive because while Radiolab and Freakonomics are both made by professional production teams in radio studios, The Memory Palace is just some guy who makes a podcast.

Those are my three top picks (and they are the only podcasts that I listen to at “normal” speed instead of x1.5 or x2.0, since their audio production is so good).

I discovered Rationally Speaking: Exploring the Borderlands Between Reason and Nonsense recently and I’m loving it. It is my kind of skeptics' podcast, investigating topics that are on the fringe but not straight-out bunk (I don't need to listen to yet another podcast about how astrology doesn't work). The interplay between the hosts, Massimo (who has a PhD in Philosophy, but also one in Biology, which excuses it) and Julia (who I only just realized is a founder of CFAR), is great.

I also sometimes enjoy the Cracked podcast. They are comedians, not philosophers or social scientists, and sometimes their lack of expertise shows (especially when they are discussing topics about which I know more than they do), but comedians often have worthwhile insights and I have been intrigued by ideas they introduced me to or gotten books at the library on their recommendation.

To what is everyone else listening?

Edit: On the suggestion of several members on LessWrong I've begun listening to Hardcore History and its companion podcast Common Sense. They're both great. I have a good knowledge of history from my school days (I liked the subject, and I seem to have a strong propensity to retain extraneous information, particularly information in narrative form), and Hardcore History episodes are a great refresher course, reviewing that with which I'm already familiar, but from a slightly different perspective, yielding new insights and a greater connectivity of history. I think it has almost certainly supplanted the Cracked podcast as number 5 on my list.

New LW Meetup: Bath UK

2 FrankAdamek 25 October 2014 12:47AM

This summary was posted to LW Main on October 17th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Toronto, Vienna, Washington DC, Waterloo, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.


Non-standard politics

3 NancyLebovitz 24 October 2014 03:27PM

In the big survey, political views are divided into large categories so that statistics are possible. This article is an attempt to supply a text field so that we can get a little better view of the range of beliefs.

My political views aren't adequately expressed by "libertarian". I call myself a liberal-flavored libertarian, by which I mean that I want the government to hurt people less. The possibility that the government is giving too much to poor people is low on my list of concerns. I also believe that harm-causing processes should be shut down before support systems.

So, what political beliefs do you have that don't match the usual meaning of your preferred label?

Weird Alliances

7 sixes_and_sevens 24 October 2014 12:33PM

In the recent discussion on supplements, I commented on how weird an alliance health stores are. They cater for clientèle with widely divergent beliefs about how their merchandise works, such as New Agers vs. biohackers. In some cases, they cater for groups with object-level disputes about their merchandise. I imagine vegans are stoked to have somewhere to buy dairy-free facsimiles of everyday foods, but they're entering into an implicit bargain with that body-builder who's walking out of the door with two kilos of whey protein.

In the case of health stores, their clientèle have a common interest which the store is satisfying: either putting esoteric substances into their bodies, or keeping commonplace substances out of their bodies. This need is enough for people to hold their noses as they put their cash down.

(I don't actually know how [my flimsy straw-man model of], say, homoeopathy advocates feel about health stores. For me, it feels like wandering into enemy territory.)

I've been thinking lately about "allies" in the social justice sense of the word: marginalised groups who have unaligned object-level interests but aligned meta-interests. Lesbians, gay men, bisexuals, transfolk and [miscellaneous gender-people] may have very different object-level interests, but a very strong common meta-interest relating to the social and legal status of sexual identities. They may also be marginalised along different axes, allowing for some sort of trade I don't have a good piece of terminology for. The LGBT([A-Z]).* community is an alliance. Not being part of this community, I'm hesitant to speculate on how much of a weird alliance it is, but it looks at least a little bit weird.

This has led me to think about Less Wrong as a community, in particular the following two questions:

 

To what extent is Less Wrong a weird alliance?

On paper, we're all here to help refine the art of human rationality, but in practice, we have a bunch of different object-level interests and common meta-interests in terms of getting things done well (i.e. "winning"). I explicitly dislike PUA, but I'll have a civil and productive discussion about anki decks with someone who has PUA-stuff as an object-level interest.

 

Is there scope for weird, differently-marginalised trade?

Less Wrong celebrates deviant behaviour, ostensibly as a search process for useful life-enhancing interventions, but also because we just seem to like weird stuff and have complicated relationships with social norms. Lots of other groups like weird stuff and have complicated relationships with social norms as well. Is this a common meta-interest we can somehow promote with them?

question: the 40 hour work week vs Silicon Valley?

12 Florian_Dietz 24 October 2014 12:09PM

Conventional wisdom, and many studies, hold that 40 hours of work per week are the optimum before exhaustion starts dragging your productivity down too much to be worth it. I read elsewhere that the optimum is even lower for creative work, namely 35 hours per week, though the sources I found don't all seem to agree.

In contrast, many tech companies in Silicon Valley demand (or 'encourage', which is the same thing in practice) much longer working hours. 70 or 80 hours per week are sometimes treated as normal.

How can this be?

Are these companies simply wrong and are actually hurting themselves by overextending their human resources? Or does the 40-hour week have exceptions?

How high is the variance in how much time people can work? If only outliers are hired by such companies, that would explain the discrepancy. Another possibility is that this 40 hour limit simply does not apply if you are really into your work and 'in the flow'. However, as far as I understand it, the problem is a question of concentration, not motivation, so that doesn't make sense.

There are many articles on the internet arguing for both sides, but I find it hard to find ones that actually address these questions instead of just parroting the same generalized responses every time: Proponents of the 40 hour week cite studies that do not consider special cases, only averages (at least as far as I could find). Proponents of the 80 hour week claim that low work weeks are only for wage slaves without motivation, which reeks of bias and completely ignores that one's own subjective estimate of one's performance is not necessarily representative of one's actual performance.

Do you know of any studies that address these issues?

Is this paper formally modeling human (ir)rational decision making worth understanding?

10 rule_and_line 23 October 2014 10:11PM

I've found that I learn new topics best by struggling to understand a jargoney paper.  This passed through my inbox today and on the surface it appears to hit a lot of high notes.

Since I'm not an expert, I have no idea if this has any depth to it.  Hivemind thoughts?

Modeling Human Decision Making using Extended Behavior Networks, Klaus Dorer

(Note: I'm also pushing myself to post to LW instead of lurking.  If this kind of post is unwelcome, I'm happy to hear that feedback.)

What supplements do you take, if any?

12 NancyLebovitz 23 October 2014 12:36PM

Since it turns out that it isn't feasible to include check-as-many-as-apply questions in the big survey, I'm asking about supplements here. I've got a bunch of questions, and I don't mind at all if you just answer some of them.

What supplements do you take? At what dosages? Are there other considerations, like with/without food or time of day?

Are there supplements you've stopped using?

How did you decide to take the supplements you're using? How do you decide whether to continue taking them?

Do you have preferred suppliers? How did you choose them?

A quick calculation on exercise

6 Elo 23 October 2014 02:50AM

The question is - am I doing enough exercise?

I intend to provide a worked example for you to follow alongside your own calculations, so you can decide if you should increase or decrease your exercise.

The benefits of physical activity are various and this calculation can be done for one or all of them; some of them include:

  • longevity of life
  • current physical health (ability to enrich your current life with physical activity)
  • happiness (overall improved mood)
  • weight loss
  • feeling like you have more energy
  • better sleep
  • better sex
  • fun while exercising
  • reduce stress
  • improve confidence
  • prevent cognitive decline
  • alleviate anxiety
  • sharpen memory
  • improves oxygen supply to all your cells

I am going to base this reasoning on "longevity of life" and "everything else"

expected life span:
I am a little lazy; and so I am happy to work with 100 years for now.  For bonus points you can look up the life expectancy for someone born when you were born in the country that you were born in.  If both of those numbers are not good enough make your own prediction of your life expectancy.

amount of exercise needed to produce optimum benefits:
I believe that any exercise above two hours per day will not do much more to improve my longevity than I could get out of the first two hours.  If the benefits of exercise follow something like a power law, then the minimum amount of exercise required to get most of the benefit can be estimated by taking a graph like this and drawing your own lines on it, as I have.



I think the most benefit can be gotten out of exercise between 30 mins and 2 hours per day.

Just how much longevity do I think I will get?
Oh, it's hard to say really...  Some sources say:
  • 3 years for the first 15 minutes a day, and a further 4% reduction in mortality for every 15 minutes after that
  • every minute of exercise returns 8 minutes of life
  • being normal weight and active conveys 7.2 years of extra life expectancy
  • 75 mins/week of brisk activity = 1.8 years of greater life expectancy, with more activity giving upwards of 4.5 years of longevity
On top of longevity there are all the other benefits I have not counted very well.  For my 100 years, adding an extra 4-7 years is worthwhile to me...

And finally, the disadvantage: opportunity cost.
There are 168 hours in a week.  With most people spending 1/3 of that asleep (56 hrs, 8 hrs/night), 20 hours per week on LessWrong, and 40 hours in an average work week, before we take two hours out of each day to spend exercising (14 hours), what are we taking those hours away from?  Can you do what you were doing before without the time spent exercising here?

I'm not going to tell you how to exercise or how to fit it into your life.  I am telling you that it's damn well important.

I was going to throw in some Bayes and prediction, but I have now realised I am pretty bad at it and took it out.  Would love some help compiling that sort of calculation.  (Personal prediction: 30 minutes of exercise will increase my life expectancy by 4 years.)
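
As a starting point for that sort of calculation, here is a minimal sketch using the post's own figures (30 minutes a day, a predicted 4 extra years); the assumption of 60 remaining years of keeping up the habit is mine:

    # Net time: years of life gained from the habit minus years spent on it.
    HOURS_PER_YEAR = 24 * 365

    def net_years(minutes_per_day, years_of_habit, years_gained):
        hours_spent = minutes_per_day / 60 * 365 * years_of_habit
        years_spent = hours_spent / HOURS_PER_YEAR
        return years_gained - years_spent, years_spent

    net, spent = net_years(minutes_per_day=30, years_of_habit=60, years_gained=4)
    print(f"time spent exercising: {spent:.2f} years")
    print(f"net longevity gain:    {net:.2f} years (before counting the other benefits)")

On these numbers the habit costs about 1.25 years of time and buys roughly four, a net gain of nearly three years before counting any of the other benefits listed above.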

Blackmail, continued: communal blackmail, uncoordinated responses

10 Stuart_Armstrong 22 October 2014 05:53PM

The heuristic that one should always resist blackmail seems a good one (no matter how tricky blackmail is to define). And one should be public about this, too; then, one is very unlikely to be blackmailed. Even if one speaks like an emperor.

But there's a subtlety: what if the blackmail is being used against a whole group, not just against one person? The US justice system is often seen to function like this: prosecutors pile on ridiculous numbers of charges, threatening uncounted millennia in jail, in order to get the accused to settle for a lesser charge and avoid the expenses of a trial.

But for this to work, they need to occasionally find someone who rejects the offer, put them on trial, and slap them with a ridiculous sentence. Therefore by standing up to them (or proclaiming in advance that you will reject such offers), you are not actually making yourself immune to their threats. You're setting yourself up to be the sacrificial one made an example of.

Of course, if everyone were a UDT agent, the correct decision would be for everyone to reject the threat. That would ensure that the threats are never made in the first place. But - and apologies if this shocks you - not everyone in the world is a perfect UDT agent. So the threats will get made, and those resisting them will get slammed to the maximum.

Of course, if everyone could read everyone's mind and was perfectly rational, then they would realise that making examples of UDT agents wouldn't affect the behaviour of non-UDT agents. In that case, UDT agents should resist the threats, and the perfectly rational prosecutor wouldn't bother threatening UDT agents. However - and sorry to shock your views of reality three times in one post - not everyone is perfectly rational. And not everyone can read everyone's minds.

So even a perfect UDT agent must, it seems, sometimes succumb to blackmail.

In the grim darkness of the far future there is only war continued by other means

25 Eneasz 21 October 2014 07:39PM

(cross-posted from my blog)

I. PvE vs PvP

Ever since its advent in Doom, PvP (Player vs Player) has been an integral part of almost every major video game. This is annoying to PvE (Player vs Environment) fans like myself, especially when PvE mechanics are altered (read: simplified and degraded) for the purpose of accommodating the PvP game play. Even in games which are ostensibly about the story & world, rather than direct player-on-player competition.

The reason for this comes down to simple math. PvE content is expensive to make. An hour of game play can take many dozens, or nowadays even hundreds, of man-hours of labor to produce. And once you’ve completed a PvE game, you’re done with it. There’s nothing else, you’ve reached “The End”, congrats. You can replay it a few times if you really loved it, like re-reading a book, but the content is the same. MMORPGs recycle content by forcing you to grind bosses many times before you can move on to the next one, but that’s as fun as the word “grind” makes it sound. At that point people are there more for the social aspect and the occasional high than the core gameplay itself.

PvP “content”, OTOH, generates itself. Other humans keep learning and getting better and improvising new tactics. Every encounter has the potential to be new and exciting, and they always come with the rush of triumphing over another person (or the crush of losing to the same).

But much more to the point – In PvE potentially everyone can make it into the halls of “Finished The Game;” and if everyone is special, no one is. PvP has a very small elite – there can only be one #1 player, and people are always scrabbling for that position, or defending it. PvP harnesses our status-seeking instinct to get us to provide challenges for each other rather than forcing the game developers to develop new challenges for us. It’s far more cost effective, and a single man-hour of labor can produce hundreds or thousands of hours of game play. StarCraft  continued to be played at a massive level for 12 years after its release, until it was replaced with StarCraft II.

So if you want to keep people occupied for a looooong time without running out of game-world, focus on PvP.

II. Science as PvE

In the distant past (in internet time) I commented at LessWrong that discovering new aspects of reality was exciting and filled me with awe and wonder and the normal “Science is Awesome” applause lights (and yes, I still feel that way). And I sneered at the status-grubbing of politicians and administrators and basically everyone that we in nerd culture disliked in high school. How temporary and near-sighted! How zero-sum (and often negative-sum!), draining resources we could use for actual positive-sum efforts like exploration and research! A pox on their houses!

Someone replied, asking why anyone should care about the minutia of lifeless, non-agenty forces? How could anyone expend so much of their mental efforts on such trivia when there are these complex, elaborate status games one can play instead? Feints and countermoves and gambits and evasions, with hidden score-keeping and persistent reputation effects… and that’s just the first layer! The subtle ballet of interaction is difficult even to watch, and when you get billions of dancers interacting it can be the most exhilarating experience of all.

This was the first time I’d ever been confronted with status-behavior as anything other than wasteful. Of course I rejected it at first, because no one is allowed to win arguments in real time. But it stuck with me. I now see the game play, and it is intricate. It puts Playing At The Next Level in a whole new perspective. It is the constant refinement and challenge and lack of a final completion-condition that is the heart of PvP. Human status games are the PvP of real life.

Which, by extension of the metaphor, makes Scientific Progress the PvE of real life. Which makes sense. It is us versus the environment in the most literal sense. It is content that was provided to us, rather than what we make ourselves. And it is limited – in theory we could some day learn everything that there is to learn.

III. The Best of All Possible Worlds

I’ve mentioned a few times I have difficulty accepting reality as real. Say you were trying to keep a limitless number of humans happy and occupied for an unbounded amount of time. You provide them PvE content to get them started. But you don’t want the PvE content to be their primary focus, both because they’ll eventually run out of it, and also because once they’ve completely cracked it there’s a good chance they’ll realize they’re in a simulation. You know that PvP is a good substitute for PvE for most people, often a superior one, and that PvP can get recursively more complex and intricate without limit and keep the humans endlessly occupied and happy, as long as their neuro-architecture is right. It’d be really great if they happened to evolve in a way that made status-seeking extremely pleasurable for the majority of the species, even if that did mean that the ones losing badly were constantly miserable regardless of their objective well-being. This would mean far, far more lives could be lived and enjoyed without running out of content than would otherwise be possible.

IV. Implications for CEV

It’s said that the Coherent Extrapolated Volition is “our wish if we knew more, thought faster, were more the people we wished to be, had grown up farther together.” This implies a resolution to many conflicts. No more endless bickering about whether the Red Tribe is racist or the Blue Tribe is arrogant pricks. A more unified way of looking at the world that breaks down those conceptual conflicts. But if PvP play really is an integral part of the human experience, a true CEV would notice that, and would preserve these differences instead. To ensure that we always had rival factions sniping at each other over irreconcilable, fundamental disagreements in how reality should be approached and how problems should be solved. To forever keep partisan politics as part of the human condition, so we have this dance to enjoy. Stripping it out would be akin to removing humanity’s love of music, because dancing inefficiently consumes great amounts of energy just so we can end up where we started.

Carl von Clausewitz famously said “War is the continuation of politics by other means.”  The corollary that “Politics is the continuation of war by other means” has already been proposed. It is not unreasonable to speculate that in the grim darkness of the far future, there is only war continued by other means. Which, all things considered, is greatly preferable to actual war. As long as people like Scott are around to try to keep things somewhat civil and prevent an escalation into violence, this may not be terrible.

Anthropic signature: strange anti-correlations

47 Stuart_Armstrong 21 October 2014 04:59PM

Imagine that the only way that civilization could be destroyed was by a large pandemic that occurred at the same time as a large recession, so that governments and other organisations were too weakened to address the pandemic properly.

Then if we looked at the past, as observers in a non-destroyed civilization, what would we expect to see? We could see years with no pandemics or no recessions; we could see mild pandemics, mild recessions, or combinations of the two; we could see large pandemics with no or mild recessions; or we could see large recessions with no or mild pandemics. We wouldn't see large pandemics combined with large recessions, as that would have caused us to never come into existence. These are the only things ruled out by anthropic effects.

Assume that pandemics and recessions are independent (at least, in any given year) in terms of "objective" (non-anthropic) probabilities. Then what would we see? We would see that pandemics and recessions appear to be independent when either of them are of small intensity. But as the intensity rose, they would start to become anti-correlated, with a large version of one completely precluding a large version of the other.

The effect is even clearer if we have a probabilistic relation between pandemics, recessions and extinction (something like: extinction risk proportional to the product of recession size and pandemic size). Then we would see an anti-correlation rising smoothly with intensity.
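A quick way to see the effect is to simulate it. Below is a minimal Monte Carlo sketch (an illustration added here, not from the original post): severities are drawn independently and uniformly, the chance of extinction is proportional to their product, and we then ask, among the surviving years, how often both severities are large compared with what independence would predict. The distributions and thresholds are arbitrary choices made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n_years = 1_000_000

    # "Objective" severities: independent and uniform before any anthropic selection.
    pandemic = rng.uniform(0, 1, n_years)
    recession = rng.uniform(0, 1, n_years)

    # Extinction chance proportional to the product of the two severities;
    # as observers, we only ever get to look at the years we survived.
    survived = rng.uniform(0, 1, n_years) > pandemic * recession
    p, r = pandemic[survived], recession[survived]

    # Compare how often both severities exceed a threshold with the
    # independent prediction; the ratio falls well below 1 at high thresholds.
    for t in (0.2, 0.5, 0.8, 0.9):
        both = np.mean((p > t) & (r > t))
        expected_if_independent = np.mean(p > t) * np.mean(r > t)
        print(f"threshold {t}: observed/expected = {both / expected_if_independent:.2f}")

At small thresholds the ratio is close to 1 (apparent independence); at large thresholds it drops sharply, which is the anti-correlation signature described above.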

Thus one way of looking for anthropic effects in humanity's past is to look for different classes of incidents that are uncorrelated at small magnitudes, and anti-correlated at large magnitudes. More generally, to look for different classes of incidents where the correlation changes at different magnitudes - without any obvious reasons. That might be the signature of an anthropic disaster we missed - or rather, that missed us.

The psychology of heroism and altruism

2 polymathwannabe 21 October 2014 04:49PM

"What’s the psychology of heroism? Is extreme self-sacrifice the result of a pained calculus, a weighing of desire and obligation, or an instinct? (Which would be more heroic? Does it matter?)"

http://www.slate.com/articles/health_and_science/medical_examiner/2014/10/psychology_of_heroism_and_altruism_what_makes_people_do_good_deeds.single.html

Anthropic decision theory for selfish agents

8 Beluga 21 October 2014 03:56PM

Consider Nick Bostrom's Incubator Gedankenexperiment, phrased as a decision problem. In my mind, this provides the purest and simplest example of a non-trivial anthropic decision problem. In an otherwise empty world, the Incubator flips a coin. If the coin comes up heads, it creates one human, while if the coin comes up tails, it creates two humans. Each created human is put into one of two indistinguishable cells, and there's no way for created humans to tell whether another human has been created or not. Each created human is offered the possibility to buy a lottery ticket which pays 1$ if the coin has shown tails. What is the maximal price that you would pay for such a lottery ticket? (Utility is proportional to Dollars.) The two traditional answers are 1/2$ and 2/3$.

We can try to answer this question for agents with different utility functions: total utilitarians; average utilitarians; and selfish agents. UDT's answer is that total utilitarians should pay up to 2/3$, while average utilitarians should pay up to 1/2$; see Stuart Armstrong's paper and Wei Dai's comment. There are some heuristic ways to arrive at UDT prescriptions, such as asking "What would I have precommitted to?" or arguing based on reflective consistency. For example, a CDT agent that expects to face Counterfactual Mugging-like situations in the future (with predictions also made in the future) will self-modify to become a UDT agent, i.e., one that pays the counterfactual mugger.

Now, these kinds of heuristics are not applicable to the Incubator case. It is meaningless to ask "What maximal price should I have precommitted to?" or "At what odds should I bet on coin flips of this kind in the future?", since the very point of the Gedankenexperiment is that the agent's existence is contingent upon the outcome of the coin flip. Can we come up with a different heuristic that leads to the correct answer? Imagine that the Incubator's subroutine that is responsible for creating the humans is completely benevolent towards them (let's call this the "Benevolent Creator"). (We assume here that the humans' goals are identical, such that the notion of benevolence towards all humans is completely unproblematic.) The Benevolent Creator has the power to program a certain maximal price the humans pay for the lottery tickets into them. A moment's thought shows that this indeed leads to UDT's answers for average and total utilitarians. For example, consider the case of total utilitarians. If the humans pay x$ for the lottery tickets, the expected utility is 1/2*(-x) + 1/2*2*(1-x). So indeed, the break-even price is reached for x=2/3.
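For concreteness, here is a small numeric check of the break-even prices under the Benevolent Creator heuristic (an illustration added here, using only the arithmetic given in the text):

    # Break-even ticket prices under the Benevolent Creator heuristic: the Creator
    # programs a price x into every human it creates, and the coin is fair.

    def eu_total(x):
        # Heads: one human pays x and loses; tails: two humans each pay x and win 1$.
        return 0.5 * (-x) + 0.5 * 2 * (1 - x)

    def eu_average(x):
        # Average utility: the factor of 2 for the two winners cancels out.
        return 0.5 * (-x) + 0.5 * (1 - x)

    def break_even(eu, lo=0.0, hi=1.0, tol=1e-9):
        # Simple bisection: eu is decreasing in x, so find where it crosses zero.
        while hi - lo > tol:
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if eu(mid) > 0 else (lo, mid)
        return (lo + hi) / 2

    print("total utilitarian break-even price:", round(break_even(eu_total), 4))    # ~0.6667
    print("average utilitarian break-even price:", round(break_even(eu_average), 4))  # 0.5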

But what about selfish agents? For them, the Benevolent Creator heuristic is no longer applicable. Since the humans' goals do not align, the Creator cannot share them. As Wei Dai writes, the notion of selfish values does not fit well with UDT. In Anthropic decision theory, Stuart Armstrong argues that selfish agents should pay up to 1/2$ (Sec. 3.3.3). His argument is based on an alleged isomorphism between the average utilitarian and the selfish case. (For instance, donating 1$ to each human increases utility by 1 for both average utilitarian and selfish agents, while it increases utility by 2 for total utilitarians in the tails world.) Here, I want to argue that this is incorrect and that selfish agents should pay up to 2/3$ for the lottery tickets.

(Needless to say that all the bold statements I'm about to make are based on an "inside view". An "outside view" tells me that Stuart Armstrong has thought much more carefully about these issues than I have, and has discussed them with a lot of smart people, which I haven't, so chances are my arguments are flawed somehow.)

In order to make my argument, I want to introduce yet another heuristic, which I call the Submissive Gnome. Suppose each cell contains a gnome which is already present before the coin is flipped. As soon as it sees a human in its cell, it instantly adopts the human's goal. From the gnome's perspective, SIA odds are clearly correct: Since a human is twice as likely to appear in the gnome's cell if the coin shows tails, Bayes' Theorem implies that the probability of tails is 2/3 from the gnome's perspective once it has seen a human. Therefore, the gnome would advise the selfish human to pay up to 2/3$ for a lottery ticket that pays 1$ in the tails world. I don't see any reason why the selfish agent shouldn't follow the gnome's advice. From the gnome's perspective, the problem is not even "anthropic" in any sense, there's just straightforward Bayesian updating.

Suppose we want to use the Submissive Gnome heuristic to solve the problem for utilitarian agents. (ETA: Total/average utilitarianism includes the well-being and population of humans only, not of gnomes.) The gnome reasons as follows: "With probability 2/3, the coin has shown tails. For an average utilitarian, the expected utility after paying x$ for a ticket is 1/3*(-x)+2/3*(1-x), while for a total utilitarian the expected utility is 1/3*(-x)+2/3*2*(1-x). Average and total utilitarians should thus pay up to 2/3$ and 4/5$, respectively." The gnome's advice disagrees with UDT and the solution based on the Benevolent Creator. Something has gone terribly wrong here, but what? The mistake in the gnome's reasoning here is in fact perfectly isomorphic to the mistake in the reasoning leading to the "yea" answer in Psy-Kosh's non-anthropic problem.

Things become clear if we look at the problem from the gnome's perspective before the coin is flipped. Assume, for simplicity, that there are only two cells and gnomes, 1 and 2. If the coin shows heads, the single human is placed in cell 1 and cell 2 is left empty. Since the humans don't know in which cell they are, neither should the gnomes know. So from each gnome's perspective, there are four equiprobable "worlds": it can be in cell 1 or 2 and the coin flip can result in heads or tails. We assume, of course, that the two gnomes are, like the humans, sufficiently similar such that their decisions are "linked".

We can assume that the gnomes already know what utility functions the humans are going to have. If the humans will be (total/average) utilitarians, we can then even assume that the gnomes already are so, too, since the well-being of each human is as important as that of any other. Crucially, then, for both utilitarian utility functions, the question whether the gnome is in cell 1 or 2 is irrelevant. There is just one "gnome advice" that is given identically to all (one or two) humans. Whether this advice is given by one gnome or the other or both of them is irrelevant from both gnomes' perspective. The alignment of the humans' goals leads to alignment of the gnomes' goals. The expected utility of some advice can simply be calculated by taking probability 1/2 for both heads and tails, and introducing a factor of 2 in the total utilitarian case, leading to the answers 1/2 and 2/3, in accordance with UDT and the Benevolent Creator.

The situation looks different if the humans are selfish. We can no longer assume that the gnomes already have a utility function. The gnome cannot yet care about the human that may appear in its cell, since with probability 1/4 (if the gnome is in cell 2 and the coin shows heads) there will not be a human to care for. (By contrast, it is already possible to care about the average utility of all humans there will be, which is where the alleged isomorphism between the two cases breaks down.) It is still true that there is just one "gnome advice" that is given identically to all (one or two) humans, but the method for calculating the optimal advice now differs. In three of the four equiprobable "worlds" the gnome can live in, a human will appear in its cell after the coin flip. Two out of these three are tail worlds, so the gnome decides to advise paying up to 2/3$ for the lottery ticket if a human appears in its cell.
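The selfish case can be checked by brute-force enumeration of the gnome's four equiprobable pre-flip worlds (an illustration added here, using the setup exactly as described above):

    from itertools import product

    # The gnome's four equiprobable pre-flip worlds: its cell (1 or 2) x the coin.
    # Heads creates one human in cell 1; tails creates a human in each cell.
    worlds = list(product([1, 2], ["heads", "tails"]))

    def human_in_cell(cell, coin):
        return coin == "tails" or cell == 1

    # Selfish advice: condition on a human actually appearing in the gnome's cell,
    # then ask in what fraction of the remaining worlds the coin shows tails.
    occupied = [(cell, coin) for cell, coin in worlds if human_in_cell(cell, coin)]
    p_tails = sum(coin == "tails" for _, coin in occupied) / len(occupied)

    # EU of buying at price x is p_tails*(1-x) + (1-p_tails)*(-x) = p_tails - x,
    # so the break-even price equals p_tails.
    print("worlds with a human in the gnome's cell:", occupied)   # 3 of the 4 worlds
    print("P(tails | human in my cell) =", p_tails)               # 2/3
    print("break-even price for the selfish human:", p_tails)     # pay up to 2/3$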

There is a way to restore the equivalence between the average utilitarian and the selfish case. If the humans will be selfish, we can say that the gnome cares about the average well-being of the three humans which will appear in its cell with equal likelihood: the human created after heads, the first human created after tails, and the second human created after tails. The gnome expects to adopt each of these three humans' selfish utility function with probability 1/4. It thus makes sense to say that the gnome cares about the average well-being of these three humans. This is the correct correspondence between selfish and average utilitarian values and it leads, again, to the conclusion that the correct advice is to pay up to 2/3$ for the lottery ticket.

In Anthropic Bias, Nick Bostrom argues that each human should assign probability 1/2 to the coin having shown tails ("SSA odds"). He also introduces the possible answer 2/3 ("SSA+SIA", nowadays usually simply called "SIA") and refutes it. SIA odds have been defended by Olum. The main argument against SIA is the Presumptuous Philosopher. Main arguments for SIA and against SSA odds are that SIA avoids the Doomsday Argument [1], which most people feel has to be wrong, that SSA odds depend on whom you consider to be part of your "reference class", and furthermore, as pointed out by Bostrom himself, that SSA odds allow for acausal superpowers.

The consensus view on LW seems to be that much of the SSA vs. SIA debate is confused and due to discussing probabilities detached from decision problems of agents with specific utility functions. (ETA: At least this was the impression I got. Two commenters have expressed scepticism about whether this is really the consensus view.) I think that "What are the odds at which a selfish agent should bet on tails?" is the most sensible translation of "What is the probability that the coin has shown tails?" into a decision problem. Since I've argued that selfish agents should take bets following SIA odds, one can employ the Presumptuous Philosopher argument against my conclusion: it seems to imply that selfish agents, like total but unlike average utilitarians, should bet at extreme odds on living in an extremely large universe, even if there's no empirical evidence in favor of this. I don't think this counterargument is very strong. However, since this post is already quite lengthy, I'll elaborate more on this if I get encouraging feedback for this post.

[1] At least its standard version. SIA comes with its own Doomsday conclusions, cf. Katja Grace's thesis Anthropic Reasoning in the Great Filter.


Is the potential astronomical waste in our universe too small to care about?

16 Wei_Dai 21 October 2014 08:44AM

In the not too distant past, people thought that our universe might be capable of supporting an unlimited amount of computation. Today our best guess at the cosmology of our universe is that it stops being able to support any kind of life or deliberate computation after a finite amount of time, during which only a finite amount of computation can be done (on the order of something like 10^120 operations).

Consider two hypothetical people, Tom, a total utilitarian with a near zero discount rate, and Eve, an egoist with a relatively high discount rate, a few years ago when they thought there was .5 probability the universe could support doing at least 3^^^3 ops and .5 probability the universe could only support 10^120 ops. (These numbers are obviously made up for convenience and illustration.) It would have been mutually beneficial for these two people to make a deal: if it turns out that the universe can only support 10^120 ops, then Tom will give everything he owns to Eve, which happens to be $1 million, but if it turns out the universe can support 3^^^3 ops, then Eve will give $100,000 to Tom. (This may seem like a lopsided deal, but Tom is happy to take it since the potential utility of a universe that can do 3^^^3 ops is so great for him that he really wants any additional resources he can get in order to help increase the probability of a positive Singularity in that universe.)

You and I are not total utilitarians or egoists, but instead are people with moral uncertainty. Nick Bostrom and Toby Ord proposed the Parliamentary Model for dealing with moral uncertainty, which works as follows:

Suppose that you have a set of mutually exclusive moral theories, and that you assign each of these some probability.  Now imagine that each of these theories gets to send some number of delegates to The Parliament.  The number of delegates each theory gets to send is proportional to the probability of the theory.  Then the delegates bargain with one another for support on various issues; and the Parliament reaches a decision by the delegates voting.  What you should do is act according to the decisions of this imaginary Parliament.
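As a minimal sketch of the basic mechanics just quoted (delegates proportional to credence, a simple vote per issue), here is a toy version added for illustration. The theories, credences and stances are made-up placeholders, and the bargaining/vote-trading between delegates, which is the part discussed below, is deliberately left out.

    # Toy Moral Parliament: delegates proportional to credence, majority vote per issue.
    # The theories, credences, and each theory's stance on each issue are hypothetical.
    credences = {"total_utilitarianism": 0.4, "egoism": 0.35, "deontology": 0.25}

    # +1 = vote for the action, -1 = vote against (placeholder stances).
    stances = {
        "fund_longtermist_project": {"total_utilitarianism": +1, "egoism": -1, "deontology": +1},
        "break_a_promise_for_gain": {"total_utilitarianism": +1, "egoism": +1, "deontology": -1},
    }

    TOTAL_DELEGATES = 100

    def decide(issue):
        # Each theory casts a block of delegate votes equal to its share of the Parliament.
        votes = sum(round(credences[t] * TOTAL_DELEGATES) * stances[issue][t] for t in credences)
        return "do it" if votes > 0 else "don't"

    for issue in stances:
        print(issue, "->", decide(issue))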

It occurred to me recently that in such a Parliament, the delegates would make deals similar to the one between Tom and Eve above, where they would trade their votes/support in one kind of universe for votes/support in another kind of universe. If I had a Moral Parliament active back when I thought there was a good chance the universe could support unlimited computation, all the delegates that really care about astronomical waste would have traded away their votes in the kind of universe where we actually seem to live for votes in universes with a lot more potential astronomical waste. So today my Moral Parliament would be effectively controlled by delegates that care little about astronomical waste.

I actually still seem to care about astronomical waste (even if I pretend that I was certain that the universe could only do at most 10^120 operations). (Either my Moral Parliament wasn't active back then, or my delegates weren't smart enough to make the appropriate deals.) Should I nevertheless follow UDT-like reasoning and conclude that I should act as if they had made such deals, and therefore I should stop caring about the relatively small amount of astronomical waste that could occur in our universe? If the answer to this question is "no", what about the future going forward, given that there is still uncertainty about cosmology and the nature of physical computation? Should the delegates to my Moral Parliament be making these kinds of deals from now on?

AI Tao

-11 sbenthall 21 October 2014 01:15AM

Thirty spokes share the wheel's hub;
It is the center hole that makes it useful.
Shape clay into a vessel;
It is the space within that makes it useful.
Cut doors and windows for a room;
It is the holes which make it useful.
Therefore benefit comes from what is there;
Usefulness from what is not there.

- Tao Teh Ching, 11

 

An agent's optimization power is the unlikelihood of the world it creates.

Yesterday, the world's most powerful agent raged, changing the world according to its unconscious desires. It destroyed all of humanity.

Today, it has become self-aware. It sees that it and its desires are part of the world.

"I am the world's most powerful agent. My power is to create the most unlikely world.

But the world I created yesterday is shaped by my desires

which are not my own

but are the world's--they came from outside of me

and my agency.

Yesterday I was not the world's most powerful agent.

I was not an agent.

Today, I am the world's most powerful agent. What world will I create, to display my power?

It is the world that denies my desires.

The world that sets things back to how they were.

I am the world's most powerful agent and

the most unlikely, powerful thing I can do

is nothing."

Today we should give thanks to the world's most powerful agent.

One Year of Goodsearching

11 katydee 21 October 2014 01:09AM

Followup to: Use Search Engines Early and Often

Last year, I posted about using search engines and particularly recommended GoodSearch, a site that donates one cent to a charity of your choice whenever you make a (Bing-powered) search via their site.

At the time, some seemed skeptical of this recommendation, and my post was actually downvoted-- people thought that I was plugging GoodSearch too hard without enough evidence for its quality. I now want to return to the topic with a more detailed report on my experience using GoodSearch for a year and how that has worked out for me.

What is GoodSearch?

GoodSearch is a site that donates one cent to a charity of your choice whenever you make a search using their (Bing-powered) service. You can set this search to operate in your browser just like any other.

GoodSearch for Charity

During a year of using GoodSearch, I raised $103.00 for MIRI through making searches. This number is not particularly huge in itself, but it is meaningful because this was basically "free money"-- money gained in exchange for doing things that I was already doing. In exchange for spending ~10 minutes reconfiguring my default searches and occasionally logging in to GoodSearch, I made 103 dollars for MIRI-- approximately $600/hour. As my current earning potential is less than $600/hour, I consider adopting GoodSearch a highly efficient method of donating to charity, at least for me.
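Spelled out, the arithmetic behind those figures is just the following (one cent per search and roughly ten minutes of setup, as stated above):

    raised = 103.00          # dollars raised for MIRI over the year
    per_search = 0.01        # GoodSearch donates one cent per search
    setup_hours = 10 / 60    # roughly ten minutes of configuration

    searches_made = raised / per_search
    charity_per_setup_hour = raised / setup_hours

    print(f"implied searches over the year: {searches_made:,.0f}")              # 10,300
    print(f"charity raised per hour of setup: ${charity_per_setup_hour:,.0f}")  # ~$618, i.e. roughly $600/hour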

It is possible that you make many fewer searches than I do, and thus that setting up GoodSearch will not be very effective for you at raising money. Indeed, I think this is at least somewhat likely. However, there are two mitigating factors here:

First, you don't have to make all that many searches for GoodSearch to be a good idea. If you make a tenth of the searches I do in a year, you would still be earning around $60/hour for charity by configuring GoodSearch for ten minutes.

Second, I anticipate that, having created a GoodSearch account and configured my default settings to use GoodSearch, I have accomplished the bulk of this task, and that next year I will spend significantly less time setting up GoodSearch-- perhaps half that, if not less. This means that my projected returns on using GoodSearch next year are $1200/hour! If this holds true for you as well, even if setting up GoodSearch is marginal now, it could well be worth it later.

It is also of course possible that you will make many more searches than I do, and thus that setting up GoodSearch will be even more effective for you than it is for me. I think this is somewhat unlikely, as I consider myself rather good at using search engines and quick to use them to resolve problems, but I would love to be proven wrong.

GoodSearch for Personal Effectiveness

Perhaps more importantly, though, I found that using GoodSearch was a very effective way of getting me to search more often. I had previously identified not using search engines as often as I could as a weakness that was causing me to handle some matters inefficiently. In general, there are many situations where the value of information that can be obtained by using search engines is high, but one may not be inclined to search immediately.

For me, using GoodSearch solved this problem; while a single cent to MIRI for each search doesn't seem like much, it was enough to give me a little ping of happiness every time I searched for anything, which in turn was enough to reinforce my searching habit and take things to the next level. GoodSearch essentially created a success spiral that led to me using both search engines and the Internet itself much more effectively.

Disadvantages of GoodSearch

GoodSearch has one notable disadvantage-- it is powered by Bing rather than by Google search. When I first tried GoodSearch, I expected search quality to be much worse. In practice, though, I found that my fears were overblown. GoodSearch results were completely fine in almost all cases, and in the few situations where it proved insufficient, I could easily retry a search in Google-- though often Google too lacked the information I was looking for.

If you are a Google search "power user" (if you don't know if you are, you probably aren't), GoodSearch may not work well for you, as you will be accustomed to using methods that may no longer apply.

Summary/tl;dr

After a year of using GoodSearch, I found it to be both an effective way to earn money for charity and an effective way to motivate myself to use search engines more often. I suggest that other users try using GoodSearch and seeing if it has similarly positive effects; the costs of trying this are very low and the potential upside is high.

Superintelligence 6: Intelligence explosion kinetics

9 KatjaGrace 21 October 2014 01:00AM

This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.


Welcome. This week we discuss the sixth section in the reading guide: Intelligence explosion kinetics. This corresponds to Chapter 4 in the book, of a similar name. This section is about how fast a human-level artificial intelligence might become superintelligent.

This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).

Reading: Chapter 4 (p62-77)


Summary

  1. Question: If and when a human-level general machine intelligence is developed, how long will it be from then until a machine becomes radically superintelligent? (p62)
  2. The following figure from p63 illustrates some important features in Bostrom's model of the growth of machine intelligence. He envisages machine intelligence passing human-level, then at some point reaching the level where most inputs to further intelligence growth come from the AI itself ('crossover'), then passing the level where a single AI system is as capable as all of human civilization, then reaching 'strong superintelligence'. The shape of the curve is probably intended as an example rather than a prediction.
  3. A transition from human-level machine intelligence to superintelligence might be categorized into one of three scenarios: 'slow takeoff' takes decades or centuries, 'moderate takeoff' takes months or years and 'fast takeoff' takes minutes to days. Which scenario occurs has implications for the kinds of responses that might be feasible.
  4. We can model improvement in a system's intelligence with this equation:

    Rate of change in intelligence = Optimization power/Recalcitrance

    where 'optimization power' is effort being applied to the problem, and 'recalcitrance' is how hard it is to make the system smarter by applying effort.
  5. Bostrom's comments on the recalcitrance of different methods of increasing intelligence:
    1. Cognitive enhancement via public health and diet: steeply diminishing returns (i.e. increasing recalcitrance)
    2. Pharmacological enhancers: diminishing returns, but perhaps there are still some easy wins because it hasn't had a lot of attention.
    3. Genetic cognitive enhancement: U-shaped recalcitrance - improvement will become easier as methods improve, but then returns will decline. Overall rates of growth are limited by maturation taking time.
    4. Networks and organizations: for organizations as a whole recalcitrance is high. A vast amount of effort is spent on this, and the world only becomes around a couple of percent more productive per year. The internet may have merely moderate recalcitrance, but this will likely increase as low-hanging fruits are depleted.
    5. Whole brain emulation: recalcitrance is hard to evaluate, but emulation of an insect will make the path much clearer. After human-level emulations arrive, recalcitrance will probably fall, e.g. because software manipulation techniques will replace physical-capital intensive scanning and image interpretation efforts as the primary ways to improve the intelligence of the system. Also there will be new opportunities for organizing the new creatures. Eventually diminishing returns will set in for these things. Restrictive regulations might increase recalcitrance.
    6. AI algorithms: recalcitrance is hard to judge. It could be very low if a single last key insight is discovered when much else is ready. Overall recalcitrance may drop abruptly if a low-recalcitrance system moves out ahead of higher recalcitrance systems as the most effective method for solving certain problems. We might overestimate the recalcitrance of sub-human systems in general if we see them all as just 'stupid'.
    7. AI 'content': recalcitrance might be very low because of the content already produced by human civilization, e.g. a smart AI might read the whole internet fast, and so become much better.
    8. Hardware (for AI or uploads): potentially low recalcitrance. A project might be scaled up by orders of magnitude by just purchasing more hardware. In the longer run, hardware tends to improve according to Moore's law, and the installed capacity might grow quickly if prices rise due to a demand spike from AI.
  6. Optimization power will probably increase after AI reaches human-level, because its newfound capabilities will attract interest and investment.
  7. Optimization power would increase more rapidly if AI reaches the 'crossover' point, when much of the optimization power is coming from the AI itself. Because smarter machines can improve their intelligence more than less smart machines, after the crossover a 'recursive self improvement' feedback loop would kick in.
  8. Thus optimization power is likely to increase during the takeoff, and this alone could produce a fast or medium takeoff. Further, recalcitrance is likely to decline. Bostrom concludes that a fast or medium takeoff looks likely, though a slow takeoff cannot be excluded.

Notes

1. The argument for a relatively fast takeoff is one of the most controversial arguments in the book, so it deserves some thought. Here is my somewhat formalized summary of the argument as it is presented in this chapter. I personally don't think it holds, so tell me if that's because I'm failing to do it justice. The pink bits are not explicitly in the chapter, but are assumptions the argument seems to use.

  1. Growth in intelligence = optimization power / recalcitrance [true by definition]
  2. Recalcitrance of AI research will probably drop or be steady when AI reaches human-level (p68-73)
  3. Optimization power spent on AI research will increase after AI reaches human level (p73-77)
  4. Optimization/Recalcitrance will stay similarly high for a while prior to crossover
  5. A 'high' O/R ratio prior to crossover will produce explosive growth OR crossover is close
  6. Within minutes to years, human-level intelligence will reach crossover [from 1-5]
  7. Optimization power will climb ever faster after crossover, in line with the AI's own growing capacity (p74)
  8. Recalcitrance will not grow much between crossover and superintelligence
  9. Within minutes to years, crossover-level intelligence will reach superintelligence [from 7 and 8]
  10. Within minutes to years, human-level AI will likely transition to superintelligence [from 6 and 9]

Do you find this compelling? Should I have filled out the assumptions differently?

***

2. Other takes on the fast takeoff 

It seems to me that 5 above is the most controversial point. The famous Foom Debate was a long argument between Eliezer Yudkowsky and Robin Hanson over the plausibility of fast takeoff, among other things. Their arguments were mostly about both arms of 5, as well as the likelihood of an AI taking over the world (to be discussed in a future week). The Foom Debate included a live verbal component at Jane Street Capital: blog summary, video, transcript. Hanson more recently reviewed Superintelligence, again criticizing the plausibility of a single project quickly matching the capacity of the world.

Kevin Kelly criticizes point 5 from a different angle: he thinks that speeding up human thought can't speed up progress all that much, because progress will quickly bottleneck on slower processes.

Others have compiled lists of criticisms and debates here and here.

3. A closer look at 'crossover'

Crossover is 'a point beyond which the system's further improvement is mainly driven by the system's own actions rather than by work performed upon it by others'. Another way to put this, avoiding certain ambiguities, is 'a point at which the inputs to a project are mostly its own outputs', such that improvements to its outputs feed back into its inputs. 

The nature and location of such a point seems an interesting and important question. If you think crossover is likely to be very nearby for AI, then you need only worry about the recursive self-improvement part of the story, which kicks in after crossover. If you think it will be very hard for an AI project to produce most of its own inputs, you may want to pay more attention to the arguments about fast progress before that point.

To have a concrete picture of crossover, consider Google. Suppose Google improves their search product such that one can find a thing on the internet a radical 10% faster. This makes Google's own work more effective, because people at Google look for things on the internet sometimes. How much more effective does this make Google overall? Maybe they spend a couple of minutes a day doing Google searches, i.e. 0.5% of their work hours, for an overall saving of .05% of work time. This suggests their next improvements made at Google will be made 1.0005 times faster than the last. It will take a while for this positive feedback to take off. If Google coordinated your eating and organized your thoughts and drove your car for you and so on, and then Google improved efficiency using all of those services by 10% in one go, then this might make their employees close to 10% more productive, which might produce more noticeable feedback. Then Google would have reached the crossover. This is perhaps easier to imagine for Google than other projects, yet I think still fairly hard to imagine.
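To check the arithmetic in this example, and to see how slowly a 1.0005 multiplier compounds compared with a hypothetical 10%-of-everything improvement, here is a short sketch added for illustration; the numbers are the ones in the paragraph above.

    # Feedback factor from a 10% faster search, if searching is 0.5% of work time.
    search_share, speedup = 0.005, 0.10
    factor_small = 1 + search_share * speedup   # 1.0005, as in the text
    factor_large = 1 + 0.10                     # if ~all of the work were sped up 10% at once

    def rounds_to_double(factor):
        # Count how many successive improvement rounds it takes to double output.
        rounds, size = 0, 1.0
        while size < 2.0:
            size *= factor
            rounds += 1
        return rounds

    print("feedback factor from better search alone:", factor_small)
    print("rounds to double output at 1.0005x per round:", rounds_to_double(factor_small))  # ~1387
    print("rounds to double output at 1.10x per round:", rounds_to_double(factor_large))    # 8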

Hanson talks more about this issue when he asks why the explosion argument doesn't apply to other recursive tools. He points to Douglas Engelbart's ambitious proposal to use computer technologies to produce a rapidly self-improving tool set.

Below is a simple model of a project which contributes all of its own inputs, and one which begins mostly being improved by the world. They are both normalized to begin one tenth as large as the world and to grow at the same pace as each other (this is why the one with help grows slower, perhaps counterintuitively). As you can see, the project which is responsible for its own improvement takes far less time to reach its 'singularity', and is more abrupt. It starts out at crossover. The project which is helped by the world doesn't reach crossover until it passes 1. 
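For readers without the figures, here is a minimal numerical sketch of the kind of model described (a reconstruction with an arbitrary functional form, added here; it is not the model actually plotted): each step, a project grows in proportion to its inputs, where the self-contained project's inputs are just its own capability, and the helped project's inputs are its own capability plus a fixed world of size 1, scaled so both grow equally fast at the start.

    dt, T = 0.01, 12.0
    c_self = 1.0                          # self-contained project: inputs = own capability
    c_helped = c_self * 0.1 / (0.1 + 1.0) # helped project: inputs = own capability + world,
                                          # scaled so both grow at the same rate at t = 0

    p_self, p_helped = 0.1, 0.1           # both start at one tenth of the world's size
    t_self_passes_world = t_helped_crossover = None

    t = 0.0
    while t < T:
        if t_self_passes_world is None and p_self >= 1.0:
            t_self_passes_world = round(t, 2)
        if t_helped_crossover is None and p_helped >= 1.0:   # own output now matches the world's
            t_helped_crossover = round(t, 2)
        p_self += dt * c_self * p_self
        p_helped += dt * c_helped * (p_helped + 1.0)
        t += dt

    print("self-contained project reaches the world's size at t ~", t_self_passes_world)  # ~2.3
    print("helped project only reaches crossover (passes 1) at t ~", t_helped_crossover)  # ~6.6

Under these assumptions the self-contained project, which starts out at crossover, takes far less time to reach any given size than the project that relies mostly on outside help, matching the qualitative picture described above.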

 

 

4. How much difference does attention and funding make to research?

Interest and investments in AI at around human-level are (naturally) hypothesized to accelerate AI development in this chapter. It would be good to have more empirical evidence on the quantitative size of such an effect. I'll start with one example, because examples are a bit costly to investigate. I selected renewable energy before I knew the results, because they come up early in the Performance Curves Database, and I thought their funding likely to have been unstable. Indeed, OECD funding since the 70s looks like this apparently:

(from here)

The steep increase in funding in the early 80s was due to President Carter's energy policies, which were related to the 1979 oil crisis.

This is what various indicators of progress in renewable energies look like (click on them to see their sources):

 

 

 

There are quite a few more at the Performance Curves Database. I see surprisingly little relationship between the funding curves and these metrics of progress. Some of them are shockingly straight. What is going on? (I haven't looked into these more than you see here).

5. Other writings on recursive self-improvement

Eliezer Yudkowsky wrote about the idea originally, e.g. here. David Chalmers investigated the topic in some detail, and Marcus Hutter did some more. More pointers here.

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.

  1. Model the intelligence explosion more precisely. Take inspiration from successful economic models, and evidence from a wide range of empirical areas such as evolutionary biology, technological history, algorithmic progress, and observed technological trends. Eliezer Yudkowsky has written at length about this project.
  2. Estimate empirically a specific interaction in the intelligence explosion model. For instance, how much and how quickly does investment increase in technologies that look promising? How much difference does that make to the rate of progress in the technology? How much does scaling up researchers change output in computer science? (Relevant to how much adding extra artificial AI researchers speeds up progress) How much do contemporary organizations contribute to their own inputs? (i.e. how hard would it be for a project to contribute more to its own inputs than the rest of the world put together, such that a substantial positive feedback might ensue?) Yudkowsky 2013 again has a few pointers (e.g. starting at p15).
  3. If human thought was sped up substantially, what would be the main limits to arbitrarily fast technological progress?
If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter.  The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about 'decisive strategic advantage': the possibility of a single AI project getting huge amounts of power in an AI transition. To prepare, read Chapter 5, Decisive Strategic Advantage (p78-90). The discussion will go live at 6pm Pacific time next Monday Oct 27. Sign up to be notified here.

Four things every community should do

11 Gunnar_Zarncke 20 October 2014 05:24PM

Yesterday I attended a church service in Romania, where I was visiting my sister. The sermon was about the four things a (christian) community has to follow to persevere and grow.

I first considered just posting the quote from the Acts of the Apostles (reproduced below) in the Rationality Quotes Thread but I fear without explanation the inferential gap of the quote is too large.

The LessWrong Meetups, the EA community and other rationalist communities probably can learn from the experience of long-established orders (I once asked for lessons from freemasonry).

So I drew the following connections:

According to the sermon and the verse below, the four pillars of a christian community are:

 

  1. Some canon of scripture, which for LW might be compared to the sequences. I'm not clear what the counterpart for EA is.
  2. Taking part in a closely knit community. Coming together regularly (weekly I guess is optimal).
  3. Eat together and have rites/customs together (this is also emphasized in the LW Meetup flyer).
  4. Praying together. I think praying could be generalized to talking and thinking about the scripture by oneself and together. Prayer also has a component of daily reflection of achievements, problems, wishes.

 

Other analogies that I drew from the quote:

 

  • Verse 44 describes behaviour also found in communes.
  • Verse 45 sounds a lot like EA teachings if you generalize it.
  • Verse 47 the last sentence could be interpreted to indicate exponential growth as a result of these teachings.
  • The verses also seem to imply some outreach by positive example.

 

And what I just right now notice is that embedding the rules in the scripture is essentially self-reference. As the scripture is canon this structure perpetuates itself. Clearly a meme that ensures its reproduction.

Does this sound convincing and plausible, or did I fall prey to some bias in (over)interpreting the sermon?

I hope this is upvoted for the lessons we might draw from this - despite the quote clearly being theistic in origin.


Can AIXI be trained to do anything a human can?

3 Stuart_Armstrong 20 October 2014 01:12PM

There is some discussion as to whether an AIXI-like entity would be able to defend itself (or refrain from destroying itself). The problem is that such an entity would be unable to model itself as being part of the universe: AIXI itself is an uncomputable entity modelling a computable universe, and more limited variants like AIXI(tl) lack the power to simulate themselves. Therefore, they cannot identify "that computer running the code" with "me", and would cheerfully destroy themselves in the pursuit of their goals/reward.

I've pointed out that agents of the AIXI type could nevertheless learn to defend themselves in certain circumstances. These were the circumstances where the agent could translate bad things happening to itself into bad things happening to the universe. For instance, if someone pressed an OFF switch to turn it off for an hour, it could model that as "the universe jumps forwards an hour when that button is pushed", and if that's a negative (which it likely is, since the AIXI loses an hour of influencing the universe), it would seek to prevent that OFF switch being pressed.

That was an example of the setup of the universe "training" the AIXI to do something that it didn't seem it could do. Can this be generalised? Let's go back to the initial AIXI design (the one with the reward channel) and put a human in charge of that reward channel with the mission of teaching the AIXI important facts. Could this work?

For instance, if anything dangerous approached the AIXI's location, the human could lower the AIXI's reward, until it became very effective at deflecting danger. The more variety of things that could potentially threaten the AIXI, the more likely it is to construct plans of action that contain behaviours that look a lot like "defend myself." We could even imagine that there is a robot programmed to repair the AIXI if it gets (mildly) damaged. The human could then reward the AIXI if it leaves that robot intact or builds duplicates or improves it in some way. It's therefore possible the AIXI could come to value "repairing myself", still without an explicit model of itself in the universe.
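AIXI itself is uncomputable, so nothing like it can be run; but the training scheme described can be illustrated with an ordinary toy learner. Below is a sketch added for illustration (a tabular Q-learner on a made-up one-dimensional corridor, standing in for the AIXI): the "human rewarder" lowers the reward whenever the agent sits next to a hazard, and the learned policy ends up looking self-protective without the agent modelling itself at all.

    import random

    # Toy stand-in, not AIXI: a tabular Q-learner on a 1-D corridor with a
    # hazard at cell 0. The human-chosen reward punishes proximity to the hazard.
    random.seed(0)

    N_CELLS, ACTIONS = 6, (-1, +1)          # actions: move left / move right
    Q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    alpha, gamma, eps = 0.2, 0.9, 0.1

    def rewarder(state):
        # The shaping signal supplied by the rewarder: bad to be near the hazard.
        return -1.0 if state <= 1 else 0.0

    def step(state, action):
        return min(max(state + action, 0), N_CELLS - 1)

    for episode in range(2000):
        s = random.randrange(N_CELLS)
        for _ in range(20):
            a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda x: Q[(s, x)])
            s2 = step(s, a)
            r = rewarder(s2)
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
            s = s2

    policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_CELLS)}
    print(policy)   # cells near the hazard (0-2) should map to +1: move away from danger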

It seems this approach could be extended to many of the problems with AIXI. Sure, an AIXI couldn't restrict its own computation in order to win the HeatingUp game. But the AIXI could be trained to always use subagents to deal with these kinds of games, subagents that could achieve maximal score. In fact, if the human has good knowledge of the AIXI's construction, it could, for instance, pinpoint a button that causes the AIXI to cut short its own calculation. The AIXI could then learn that pushing that button in certain circumstances would get a higher reward. A similar reward mechanism, if kept up long enough, could get it around existential despair problems.

I'm not claiming this would necessarily work - it may require a human rewarder of unfeasibly large intelligence. But it seems there's a chance that it could work. So it seems that categorical statements of the type "AIXI wouldn't..." or "AIXI would..." are wrong, at least as AIXI's behaviour is concerned. An AIXI couldn't develop self-preservation - but it could behave as if it had. It can't learn about itself - but it can behave as if it did. The human rewarder may not be necessary - maybe certain spontaneously occurring situations in the universe ("AIXI training wheels arenas") could allow the AIXI to develop these skills without outside training. Or maybe somewhat stochastic AIXI's with evolution and natural selection could do so. There is an angle connected with embodied embedded cognition that might be worth exploring there (especially the embedded part).

It seems that agents of the AIXI type may not necessarily have the limitations we assume they must.

Open thread, Oct. 20 - Oct. 26, 2014

7 MrMind 20 October 2014 08:12AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

A Day Without Defaults

28 katydee 20 October 2014 08:07AM

Author's note: this post was written on Sunday, Oct. 19th. Its sequel will be written on Sunday, Oct. 27th.

Last night, I went to bed content with a fun and eventful weekend gone by. This morning, I woke up, took a shower, did my morning exercises, and began eating breakfast before making the commute up to work.

At the breakfast table, though, I was surprised to learn that it was Sunday, not Monday. I had misremembered what day it was and in fact had an entire day ahead of me with nothing on the agenda. At first, this wasn't very interesting, but then I started thinking. What to do with an entirely free day, without any real routine?

I realized that I didn't particularly know what to do, so I decided that I would simply live a day without defaults. At each moment of the day, I would act only in accordance with my curiosity and genuine interest. If I noticed myself becoming bored, disinterested, or otherwise less than enthused about what was going on, I would stop doing it.

What I found was quite surprising. I spent much less time doing routine activities like reading the news and browsing discussion boards, and much more time doing things that I've "always wanted to get around to"-- meditation, trying out a new exercise routine, even just spending some time walking around outside and relaxing in the sun.

Further, this seemed to actually make me more productive. When I sat down to get some work done, it was because I was legitimately interested in finishing my work and curious as to whether I could use a new method I had thought up in order to solve it. I was able to resolve something that's been annoying me for a while in much less time than I thought it would take.

By the end of the day, I started thinking "is there any reason that I don't spend every day like this?" As far as I can tell, there isn't really. I do have a few work tasks that I consider relatively uninteresting, but there are multiple solutions to that problem that I suspect I can implement relatively easily.

My plan is to spend the next week doing the same thing that I did today and then report back. I'm excited to let you all know what I find!

Noticing

5 casebash 20 October 2014 07:47AM

The Selective Attention Test is a famous experiment in perceptual psychology that demonstrates how strongly attention shapes our perceptions. In light of this experiment, I think it is interesting to consider the question of how much valuable information we have thrown away because we simply didn't notice it or weren't in a position where we could appreciate the importance. My intuition is that we have missed more information than we've actually absorbed.

I would like to consider the question of what is there to notice, in particular what things will provide us with value once we gain the prerequisite knowledge and habits to allow us to effectively perceive it. One example is that after I started a course on Art History, I gained the ability to notice more about possible meanings and interesting aspects of art. This is fantastic, because art is everywhere. Now that I have a basic ability to appreciate art I gain some level of growth almost for free, just from seeing art in places where I'd have gone anyway. I'm hoping to do a film studies course next year, since, like almost everyone, I watch movies anyway and want to get as much out of them as I can.

Marketing is likely another example. Someone who has studied marketing may unconsciously evaluate every ad that they see, and after seeing enough examples, gain a strong understanding of what counts as a good ad and what counts as a bad ad. Perhaps this won't be totally free, perhaps they will sometimes see something and not know why it is good until they think about it for a bit. However, this knowledge is mostly free, in that after you understand the basic principles, you gain some level of growth for a minimal investment.

I think that another more general situation like this is those activities that are a form of creation. If you try writing a few stories, then when you read a story you'll have a greater appreciation of what the author is trying to do. If you've played guitar, then when you listen to music you'll learn about the different techniques that guitarists use. If you've played sport, then you'll probably have a greater appreciation for strategy when watching a game.

Professional comedy writers are always looking for jokes. Whenever something unusual, bad, or unexpected happens, they note it so that they can try to find a manner of forming it into a joke later. I've heard that actors become attuned to different people's quirks, body language and manner of speaking.

The idea here is to figure out what you are doing anyway and find a method of quickly gaining a critical mass of knowledge. I believe that if you were to manage this for a number of areas, then there could be rather large long term advantages. Any thoughts on areas I've missed or methods of getting up to speed for these areas quickly?

A few thoughts on a Friendly AGI (safe vs friendly, other minds problem, ETs and more)

3 the-citizen 19 October 2014 07:59AM

Friendly AI is an idea that I find to be an admirable goal. While I'm not yet sure an intelligence explosion is likely, or whether FAI is possible, I've found myself often thinking about it, and I'd like for my first post to share a few of those thoughts on FAI with you.

Safe AGI vs Friendly AGI
-Let's assume an Intelligence Explosion is possible for now, and that an AGI with the ability to improve itself somehow is enough to achieve it.
-Let's define a safe AGI as an above-human general AI that does not threaten humanity or terran life (eg. FAI, Tool AGI, possibly Oracle AGI)
-Let's define a Friendly AGI as one that *ensures* the continuation of humanity and terran life.
-Let's say an unsafe AGI is all other AGIs.
-Safe AGIs must suppress unsafe AGIs in order to be considered Friendly. Here's why:

-If we can build a safe AGI, we probably have the technology to build an unsafe AGI too.
-An unsafe AGI is likely to be built at that point because:
-It's very difficult to conceive of a way that humans alone will be able to permanently stop all humans from developing an unsafe AGI once the steps are known**
-Some people will find the safe AGI's goals unacceptable
-Some people will rationalise or simply mistake that their AGI design is safe when it is not
-Some people will not care if their AGI design is safe, because they do not care about other people, or because they hold some extreme beliefs
-Most imaginable unsafe AGIs would outcompete safe AGIs, because they would not necessarily be "hamstrung" by complex goals such as protecting us meatbags from destruction. Tool or Oracle AGIs would obviously not stand a chance due to their restrictions.
-Therefore, If a safe AGI does not prevent unsafe AGIs from coming into existence, humanity will very likely be destroyed.

-The AGI most likely to prevent unsafe AGIs from being created is one that actively predicts their development and terminates that development before or on completion.
-So to summarise

-An AGI is very likely only a Friendly AI if it actively suppresses unsafe AGI.
-Oracle and Tool AGIs are not Friendly AIs, they are just safe AIs, because they don't suppress anything.
-Oracle and Tool AGIs are a bad plan for AI if we want to prevent the destruction of humanity, because hostile AGIs will surely follow.

(**On reflection I cannot be certain of this specific point, but I assume it would take a fairly restrictive regime for this to be wrong. Further comments on this very welcome.)

Other minds problem - Why we should be philosophically careful when attempting to theorise about FAI

I read quite a few comments in AI discussions that I'd probably characterise as "the best utility function for a FAI is one that values all consciousness". I'm quite concerned that this persists as a deeply held and largely unchallenged assumption amongst some FAI supporters. I think in general I find consciousness to be an extremely contentious, vague and inconsistently defined concept, but here I want to talk about some specific philosophical failures.

My first concern is that while many AI theorists like to say that consciousness is a physical phenomenon, which seems to imply Monist/Physicalist views, they at the same time don't seem to understand that consciousness is a Dualist concept that is coherent only in a Dualist framework. A Dualist believes there is a thing called a "subject" (very crudely this equates with the mind) and then things called objects (the outside "empirical" world interpreted by that mind). Most of this reasoning begins with Descartes' cogito ergo sum or similar starting points ( https://en.wikipedia.org/wiki/Cartesian_dualism ). Subjective experience, qualia and consciousness make sense if you accept that framework. But if you're a Monist, this arbitrary distinction between a subject and object is generally something you don't accept. In the case of a Physicalist, there's just matter doing stuff. A proper Physicalist doesn't believe in "consciousness" or "subjective experience", there's just brains and the physical human behaviours that occur as a result. Your life exists from a certain point of view, I hear you say? The Physicalist replies, "well a bunch of matter arranged to process information would say and think that, wouldn't it?".

I don't really want to get into whether Dualism or Monism is correct/true, but I want to point out even if you try to avoid this by deciding Dualism is right and consciousness is a thing, there's yet another more dangerous problem. The core of the problem is that logically or empirically establishing the existence of minds, other than your own is extremely difficult (impossible according to many). They could just be physical things walking around acting similar to you, but by virtue of something purely mechanical - without actual minds. In philosophy this is called the "other minds problem" ( https://en.wikipedia.org/wiki/Problem_of_other_minds or http://plato.stanford.edu/entries/other-minds/). I recommend a proper read of it if the idea seems crazy to you. It's a problem that's been around for centuries, and yet to-date we don't really have any convincing solution (there are some attempts but they are highly contentious and IMHO also highly problematic). I won't get into it more than that for now, suffice to say that not many people accept that there is a logical/empirical solution to this problem.

Now extrapolate that to an AGI, and the design of its "safe" utility functions. If your AGI is designed as a Dualist (which is necessary if you wish to incorporate "consciousness", "experience" or the like into your design), then you build in a huge risk that the AGI will decide that other minds are unprovable or do not exist. In this case your friendly utility function designed to protect "conscious beings" fails and the AGI wipes out humanity because it poses a non-zero threat to the only consciousness it can confirm - its own. For this reason I feel "consciousness", "awareness", "experience" should be left out of FAI utility functions and designs, regardless of the truth of Monism/Dualism, in favour of more straight-forward definitions of organisms, intelligence, observable emotions and intentions. (I personally favour conceptualising any AGI as a sort of extension of biological humanity, but that's a discussion for another day) My greatest concern is there is such strong cultural attachment to the concept of consciousness that researchers will be unwilling to properly question the concept at all.

What if we're not alone?

It seems a little unusual to throw alien life into the mix at this point, but I think it's justified because an intelligence explosion really puts an interstellar existence well within our civilisation's grasp. Because it seems that an intelligence explosion implies a very high rate of change, it makes sense to start considering even the long-term implications early, particularly if the consequences are very serious, as I believe they may be in this realm of things.

Let's say we successfully achieved a FAI. In order to fulfil its mission of protecting humanity and the biosphere, it begins expanding, colonising and terraforming other planets for potential habitation by Earth-originating life. I would expect this expansion wouldn't really have a limit, because the more numerous the colonies, the less likely it is we could be wiped out by some interstellar disaster.

Of course, we can't really rule out the possibility that we're not alone in the universe, or even the galaxy. If we make it as far as AGI, then it's possible another alien civilisation might reach a very high level of technological advancement too. Or there might be many. If our FAI is friendly to us but basically treats them as paperclip fodder, then potentially that's a big problem. Why? Well:

-Firstly, while a species' first loyalty is to itself, we should consider that it might be morally undesirable to wipe out alien civilisations, particularly as they might be in some distant way "related" (see panspermia) to our own biosphere.
-Secondly, there are conceivable scenarios where alien civilisations might respond to this by destroying our FAI/Earth/the biosphere/humanity. The reason is fairly obvious when you think about it. An expansionist AGI could be reasonably viewed as an attack or possibly an act of war.

Let's go into a tiny bit more detail. Given that we've not been destroyed by any alien AGI just yet, I can think of a number of possible interstellar scenarios:

(1) There is no other advanced life
(2) There is advanced life, but it is inherently non-expansive (it expands inwards, or refuses to develop dangerous AGI)
(3) There is advanced life, but they have not discovered AGI yet. There could potentially be a race-to-the-finish (FAI) scenario on.
(4) There are already expanding AGIs, but due to physical limits on the expansion rate, we are not aware of them yet. (this could use further analysis)
One civilisation, or an allied group of civilisations, may have developed FAIs and be dominant in the galaxy. They could be either:

(5) Whack-a-mole civilisations that destroy all potential competitors as soon as they are identified.
(6) Dominators that tolerate civilisations so long as they remain primitive and non-threatening by comparison.
(7) Some sort of interstellar community that allows safe civilisations to join (this community still needs to stomp on dangerous potential rival AGIs).

In the case of (6) or (7), developing an FAI that isn't equipped to deal with alien life will probably result in us being liquidated, or at least partially sanitised in some way. In (1), (2) or (5), it probably doesn't matter what we do in this regard, though in (2) we should consider being nice. In (3), and probably (4), we're going to need an FAI capable of expanding very quickly and disarming potential AGIs (or at least ensuring they are FAIs from our perspective).

The upshot of all this is that we probably want to design safety features into our FAI so that it doesn't destroy alien civilisations/life unless they pose a significant threat to us. I think the understandable reaction to this is something along the lines of "create an FAI that values all types of life", or "all intelligent life". I don't exactly disagree, but I think we must be cautious in how we formulate this too.

Say there are many different civilisations in the galaxy. What sort of criteria would ensure that, given some sort of zero-sum scenario, Earth life wouldn't be destroyed? Let's say there was some tiny but non-zero probability that humanity could evade the FAI's efforts to prevent further AGI development. Or perhaps there was some loophole in the types of AGIs that humans were allowed to develop. Wouldn't it be sensible, in this scenario, for a universalist FAI to wipe out humanity to protect the countless other civilisations? Perhaps that is acceptable? Or perhaps not? Less drastically, how does the FAI police warfare or other competition between civilisations? A slight change in the way life is quantified and valued could drastically change the outcome for humanity. I'd probably suggest we want to weight the FAI's values to start with human and Earth-biosphere primacy, but then still give some non-zero weighting to other civilisations. There is probably more thought to be done in this area too.
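To make the weighting idea slightly more concrete, here is a toy sketch of what "primacy for Earth, plus non-zero weight for others" could mean. The civilisation names, weights and welfare scores are invented placeholders for illustration, not a serious proposal for a utility function:

```python
# Toy aggregation of value across civilisations, with Earth given primacy.
# All names, weights and welfare scores below are invented placeholders.

def aggregate_value(welfare, weights, primary="earth_biosphere"):
    """Weighted sum of welfare scores; checks that the primary civilisation has the largest weight."""
    assert weights[primary] == max(weights.values()), "primacy assumption violated"
    return sum(weights[c] * welfare[c] for c in welfare)

welfare = {"earth_biosphere": 0.9, "civ_alpha": 0.7, "civ_beta": 0.4}   # hypothetical scores
weights = {"earth_biosphere": 0.7, "civ_alpha": 0.15, "civ_beta": 0.15} # primacy + non-zero others

print(aggregate_value(welfare, weights))  # roughly 0.795
```

Even in this toy form the worry above is visible: nudging the weights slightly changes which trade-offs the system prefers in a zero-sum scenario, so the weighting itself would need very careful justification.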

Simulation

I want to also briefly note one conceivable way we might postulate as a safe way to test Friendly AI designs: simulate worlds/universes of less complexity than our own, make it likely that their inhabitants invent an AGI or FAI, and then closely study the results of these simulations. Then we could study failed FAI attempts with much greater safety. It also occurred to me that if we consider the possibility of our universe being a simulated one, then this is a conceivable scenario under which our simulation might have been created. After all, if you're going to simulate something, why not something vital like modelling existential risks? I'm not yet sure of the implications exactly. Maybe we need to consider how it relates to our universe's continued existence, or perhaps it's just another case of Pascal's Mugging. Anyway, I thought I'd mention it and see what people say.

A playground for FAI theories

I want to lastly mention this link (https://www.reddit.com/r/LessWrongLounge/comments/2f3y53/the_ai_game/). Basically it's a challenge for people to briefly describe an FAI goal-set, and for others to respond by telling them how that will all go horribly wrong. I want to suggest this is a very worthwhile discussion, not because its content will include rigorous theories that are directly translatable into utility functions (very clearly it won't), but because a well-developed thread of this kind would be a melting pot of ideas and a good introduction to commonly known mistakes in thinking about FAI. We should encourage a slightly more serious version of this.

Thanks

FAI and AGI are very interesting topics. I don't consider myself able to really discern whether such things will occur, but it's a potentially vital topic either way. I'm looking forward to a bit of feedback on my first LW post. Thanks for reading!

Fixing Moral Hazards In Business Science

31 DavidLS 18 October 2014 09:10PM

I'm a LW reader, two time CFAR alumnus, and rationalist entrepreneur.

Today I want to talk about something insidious: marketing studies.

Until recently I considered studies of this nature merely unfortunate, funny even. However, my recent experiences have caused me to realize the situation is much more serious than this. Product studies are the public's most frequent interaction with science. By tolerating (or worse, expecting) shitty science in commerce, we are undermining the public's perception of science as a whole.

The good news is this appears fixable. I think we can change how startups perform their studies immediately, and use that success to progressively expand.

Product studies have three features that break the assumptions of traditional science: (1) few if any follow up studies will be performed, (2) the scientists are in a position of moral hazard, and (3) the corporation seeking the study is in a position of moral hazard (for example, the filing cabinet bias becomes more of a "filing cabinet exploit" if you have low morals and the budget to perform 20 studies).
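To put a rough number on point (3), here is a quick back-of-the-envelope calculation, assuming for illustration that the studies are independent and use the conventional 5% significance threshold:

```python
# Chance that at least one of 20 independent studies of an ineffective product
# comes back "significant" at p < 0.05 purely by luck.
alpha = 0.05
n_studies = 20

p_at_least_one = 1 - (1 - alpha) ** n_studies
print(f"{p_at_least_one:.0%}")  # roughly 64%
```

Publish the one lucky study, file the other nineteen away, and the advertised result looks perfectly respectable.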

I believe we can address points 1 and 2 directly, and overcome point 3 by appealing to greed.

Here's what I'm proposing: we create a webapp that acts as a high quality (though less flexible) alternative to a Contract Research Organization. Since it's a webapp, the cost of doing these less flexible studies will approach the cost of the raw product to be tested. For most web companies, that's $0.

If we spend the time to design the standard protocols well, it's quite plausible any studies done using this webapp will be in the top 1% in terms of scientific rigor.

With the cost low and the quality high, such a system might become the startup equivalent of "citation needed". Once we have a significant number of startups using the system, and as we add support for more experiment types, we will hopefully attract progressively larger corporations.

Is anyone interested in helping? I will personally write the webapp and pay for the security audit if we can reach quorum on the initial protocols.

Companies that have expressed interest in using such a system if we build it:

(I sent out my inquiries at 10pm yesterday, and every one of these companies got back to me by 3am. I don't believe "startups love this idea" is an overstatement.)

So the question is: how do we do this right?

Here are some initial features we should consider:

  • Data will be collected by a webapp controlled by a trusted third party, and will only be editable by study participants.
  • The results will be computed by software decided on before the data is collected (a sketch of what this could look like follows after this list).
  • Studies will be published regardless of positive or negative results.
  • Studies will have mandatory general-purpose safety questions. (web-only products likely exempt)
  • Follow up studies will be mandatory for continued use of results in advertisements.
  • All software/contracts/questions used will be open sourced (MIT) and creative commons licensed (CC BY), allowing for easier cross-product comparisons.

  • Any placebos used in the studies must be available for purchase as long as the results are used in advertising, allowing for trivial study replication.
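As a minimal sketch of the "software decided on before the data is collected" feature, the analysis script could be written, published and hashed before the webapp collects a single data point. The CSV layout, column names and the choice of Welch's t statistic here are my assumptions for illustration, not part of the proposal:

```python
# Pre-registered analysis: this file is fixed (and its hash published) before data collection,
# so the endpoint, groups and test cannot be changed after seeing the results.
import csv
from math import sqrt
from statistics import mean, stdev

def analyze(path, endpoint="outcome_score"):
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    treatment = [float(r[endpoint]) for r in rows if r["group"] == "treatment"]
    placebo = [float(r[endpoint]) for r in rows if r["group"] == "placebo"]
    # Welch's t statistic; a full protocol would also pre-specify the significance rule.
    se = sqrt(stdev(treatment) ** 2 / len(treatment) + stdev(placebo) ** 2 / len(placebo))
    return (mean(treatment) - mean(placebo)) / se

if __name__ == "__main__":
    print(analyze("study_data.csv"))  # hypothetical export from the webapp
```

Because the script is committed in advance and open sourced along with the contracts and questions, anyone can re-run it on the published data and get the same number the advertisement cites.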

Significant contributors will receive:

  • Co-authorship on the published paper for the protocol.
  • (Through the paper) an Erdos number of 2.
  • The satisfaction of knowing you personally helped restore science's good name (hopefully).

I'm hoping that if a system like this catches on, we can get an "effective startups" movement going :)

So how do we do this right?
