
New forum for MIRI research: Intelligent Agent Foundations Forum

36 orthonormal 20 March 2015 12:35AM

Today, the Machine Intelligence Research Institute is launching a new forum for research discussion: the Intelligent Agent Foundations Forum! It's already been seeded with a bunch of new work on MIRI topics from the last few months.

We've covered most of the (what, why, how) subjects on the forum's new welcome post and the How to Contribute page, but this post is an easy place to comment if you have further questions (or if, maths forbid, there are technical issues with the forum instead of on it).

But before that, go ahead and check it out!

(Major thanks to Benja Fallenstein, Alice Monday, and Elliott Jin for their work on the forum code, and to all the contributors so far!)

EDIT 3/22: Jessica Taylor, Benja Fallenstein, and I wrote forum digest posts summarizing and linking to recent work (on the IAFF and elsewhere) on reflective oracle machines, on corrigibility, utility indifference, and related control ideas, and on updateless decision theory and the logic of provability, respectively! These are pretty excellent resources for reading up on those topics, in my biased opinion.

Estimating the cost-effectiveness of research

18 owencb 11 December 2014 11:55AM

At a societal level, how much money should we put into medical research, or into fusion research? For individual donors seeking out the best opportunities, how can we compare the expected cost-effectiveness of research projects with more direct interventions?

Over the past few months I've been researching this area for the Global Priorities Project. We've written a variety of articles which focus on different parts of the question. Estimating the cost-effectiveness of research is the central example here, but a lot of the methodology is also applicable to other one-off projects with unknown difficulty (perhaps including political lobbying). I don't think it's all solved, but I do think we've made substantial progress.

I think people here might be interested, so I wanted to share our work. To help you navigate and find the most appropriate pieces, here I collect them, summarise what's contained in each, and explain how they fit together.

  • I gave an overview of my thinking at the Good Done Right conference, held in Oxford in July 2014. The slides and audio of my talk are available; I have developed more sophisticated models for some parts of the area since then.
  • How to treat problems of unknown difficulty introduces the problem: we need to make decisions about when to work more on problems such as research into fusion where we don't know how difficult it will be. It builds some models which allow principled reasoning about how we should act. These models are quite crude but easy to work with: they are intended to lower the bar for Fermi estimates and similar, and provide a starting point for building more sophisticated models.
  • Estimating cost-effectiveness for problems of unknown difficulty picks up from the models in the above post, and asks what they mean for the expected cost-effectiveness of work on the problems. This involves building a model of the counterfactual impact, as solvable research problems are likely to be solved eventually, so the main effect is to move the solution forwards. This post includes several explicit formulae that you can use to produce estimates; it also explains analogies between the explicit model we derive and the qualitative 'three factor' model that GiveWell and 80,000 Hours have used for cause selection.
  • Estimating the cost-effectiveness of research into neglected diseases is an investigation by Max Dalton, which uses the techniques for estimating cost-effectiveness to provide ballpark figures for how valuable we should expect research into vaccines or treatments for neglected diseases to be. The estimates suggest that, if carefully targeted, such research could be more cost-effective than the best direct health interventions currently available for funding.
  • The law of logarithmic returns discusses the question of returns to resources invested in a field, rather than in a single problem. With some examples, it suggests that as a first approximation it is often reasonable to assume that diminishing marginal returns take a logarithmic form (a toy numerical sketch of this model appears after this list).
  • Theory behind logarithmic returns explains how some simple generating mechanisms can produce roughly logarithmic returns. This is a complement to the above article: we think having both empirical and theoretical justification for the rule helps us to have higher confidence in it, and to better understand when it's appropriate to generalise to new contexts. In this piece I also highlight areas for further research on the theoretical side, into when the approximation will break down, and what we might want to use instead in these cases.
  • How valuable is medical research?, written with Giving What We Can, applies the logarithmic returns model together with counterfactual reasoning to produce an estimate of the cost-effectiveness of medical research as a whole.
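
As a very rough illustration of how the logarithmic-returns assumption can drive a Fermi estimate, here is a minimal sketch in Python. The functional form follows the posts above, but the function name, the calibration via a "value per doubling", and every number below are placeholder assumptions of mine, not figures from the linked articles.

```python
import math

def marginal_value(current_resources, extra_resources, value_per_doubling):
    """Toy log-returns model: cumulative value ~ c * ln(total resources spent),
    so each doubling of cumulative resources adds `value_per_doubling`.
    Every parameter, unit, and number used here is an illustrative assumption."""
    c = value_per_doubling / math.log(2)
    return c * math.log((current_resources + extra_resources) / current_resources)

# Hypothetical example: a field with $10B cumulative spend, an extra $100M grant,
# and a made-up assumption that each doubling of spend is worth 1,000 benefit units.
print(marginal_value(10e9, 100e6, 1000))  # ~14.4 units from the extra $100M
```

A fuller estimate would also discount this by the counterfactual consideration from the posts above: solvable problems are likely to be solved eventually, so extra funding mostly brings the solution forward in time.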

I've also made a thread in LessWrong Discussion for people to discuss applications of one of the simpler versions of the cost-effectiveness models, to get Fermi estimates for the value of different areas.

The silos of expertise: beyond heuristics and biases

28 Stuart_Armstrong 26 June 2014 01:13PM

Separate silos of expertise

I've been doing a lot of work on expertise recently, on the issue of measuring it and assessing it. The academic research out there is fascinating, though rather messy. Like many areas in the social sciences, it often suffers from small samples and overgeneralising from narrow examples. More disturbingly, the research projects seem to be grouped into various "silos" that don't communicate much with each other, each silo continuing with its own pet projects.

The main four silos I've identified are:

There may be more silos than this - many people working in expertise studies haven't heard of all of these (for instance, I was ignorant of Cooke's research until it was pointed out to me by someone who hadn't heard of Shanteau or Klein). The division into silos isn't perfect; Shanteau, for instance, has addressed the biases literature at least once (Shanteau, James. "Decision making by experts: The GNAHM effect." Decision Science and Technology. Springer US, 1999. 105-130), and Kahneman and Klein have authored a paper together (Kahneman, Daniel, and Gary Klein. "Conditions for intuitive expertise: a failure to disagree." American Psychologist 64.6 (2009): 515). But in general the mutual ignoring (or mutual ignorance) seems pretty strong between the silos.


Lifestyle interventions to increase longevity

121 RomeoStevens 28 February 2014 06:28AM

There is a lot of bad science and controversy in the realm of how to have a healthy lifestyle. Every week we are bombarded with new studies conflicting with older studies, telling us X is good or Y is bad. Eventually we reach our psychological limit, throw up our hands, and give up. I used to do this a lot. I knew exercise was good, I knew flossing was good, and I wanted to eat better. But I never acted on any of that knowledge. I would feel guilty when I thought about this stuff and go back to what I was doing. Unsurprisingly, this didn't really cause me to make any positive lifestyle changes.

Instead of vaguely guilt-tripping you with potentially unreliable science news, this post aims to provide an overview of lifestyle interventions that have very strong evidence behind them and concrete ways to implement them.


Productivity as a function of ability in theoretical fields

14 Stefan_Schubert 26 January 2014 01:16PM

I argued in this post that the differences in capability between different researchers are vast (Kaj Sotala provided me with some interesting empirical evidence that backs up this claim). Einstein's contributions to physics or John von Neumann's contributions to mathematics (and a number of other disciplines) are arguably at least hundreds of times greater than those of an average physicist or mathematician.

At the same time, Yudkowsky argues that "in the space of brain designs" the difference between the village idiot and Einstein is tiny. Their brains are extremely similar, with the exception of some "minor genetic tweaks". Hence we get the following picture:


The picture I am painting is rather something like this:

It would seem that these pictures are incompatible - something that would be a problem for my picture, since I think that Yudkowsky's picture is right. So how can they both be true? The answer is, obviously, that they are measuring different things. The first is measuring something like difference in brain design that is relevant for intelligence. The second is rather measuring the difference in capability to come up with physical theories that are of use for mankind. Here the village idiot is on par with the chimp and the mouse - all of whom have no such capability whatsoever. The average physicist has some such capability, but it's just a fraction of Einstein's.

Why is this? Well it is not because the village idiot has no capability at all to come up with physical theories. In fact, a primitive physical theory that is quite useful is hard-wired into our brains. Rather, the reason is that the village idiot has no capability to come up with a physical theory that is not already well-known.

Problems in theoretical physics and mathematics are typically problems that are so complex that they are hard to solve for some of the world's smartest people. This means that unless you're quite smart, your chances of contributing anything at all to these disciplines are very slim. But, if you are but a tiny bit smarter than everyone else, you'll be able to spot solutions to problem after problem that others have struggled with - these problems being problems precisely because they were hard to solve for people with a certain level of intelligence. Thus we get something like the following relationship between cognitive ability, in Yudkowsky's sense, and ability to come up with useful physical theories, i.e. productivity - what I'm talking about:



It is for this reason that people like von Neumann and Einstein are so vastly much more productive than the average mathematician or physicist. The difference in intelligence is tiny on Yudkowsky's scale - obviously much smaller than that between Einstein and the village idiot - but this tiny difference allowed von Neumann and Einstein to solve lots of problems that were just too hard for other mathematicians and physicists. (It follows that an artificial intelligence just a tiny bit smarter than Einstein and von Neumann would be as much more productive than them as they are in relation to other mathematicians and physicists.)
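
To make the threshold effect concrete, here is a toy simulation (my own illustrative sketch, not anything from Yudkowsky's post; the ability scale, the distribution of problem difficulties, and all numbers are arbitrary assumptions):

```python
import random

random.seed(0)

# Toy assumption: the difficulties of *open* problems cluster just above what
# typical researchers can handle, because everything easier has already been
# solved. A researcher "solves" every open problem below their ability level,
# so productivity is simply the count of such problems.
open_problems = [random.gauss(100.0, 2.0) for _ in range(10_000)]

def productivity(ability):
    return sum(1 for difficulty in open_problems if difficulty < ability)

for ability in (90, 98, 100, 102, 104, 110):
    print(f"ability {ability}: solves {productivity(ability)} open problems")

# Far below the frontier (ability 90) output is essentially zero; small
# differences near the frontier (100 vs. 104) produce huge differences in output.
```

On this toy picture, the village idiot and the chimp both sit far to the left of the frontier and produce nothing, while the gap between ability 100 and 104 - tiny on the scale of brain designs - accounts for most of the variation in output.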

(Obviously other characteristics besides intelligence are very important in these fields - e.g.  work ethic. I put that complication aside here, though.)

The same pattern holds in many other fields - e.g. sports. In a sense, the difference in ability between Rafael Nadal and No. 300 on the ATP rankings is very small - e.g. they hit the ball roughly as hard, and are roughly as good at, say, hitting the ball within half a metre of the baseline when not under pressure - but this small difference in ability makes for a huge difference in productivity (in the sense that lots of people want to watch Nadal - which means that his games generate a lot of utility - but few people want to watch No. 300).

But there are also fields where you have an entirely different pattern. The difference in productivity between the world's best cleaner and the average cleaner is, I'd guess, tiny. Similarly, if Peter is twice as strong as Paul, he will be able to fetch as much water as needed in half the time Paul needs - neither more, nor less. In other words, the relationship between ability and productivity in these fields is linear:
You get approximately this linear pattern in many physical jobs, but also in some intellectual jobs. Assume, for instance, that there is an intellectual field where the only thing that determines your productivity is your ability to acquire and memorize factual information. Say also that this field is neatly separated into small problems, so that your knowledge of one problem doesn't affect your ability to solve other problems. In this case, a twice as good capacity to acquire and memorize factual information will mean that you'll be able to solve twice as many of these problems - neither more nor less. Now there is obviously no intellectual field where you have exactly this pattern, but there are fields - the more "descriptive", as opposed to theoretical, social sciences come to mind - which at least approach it, and where the differences in productivity are hence much smaller than they are in theoretical physics or mathematics. (Of course, there are other patterns besides these; for instance, in some jobs, what's important is that you meet some minimum level of ability, beyond which more ability translates into very little additional productivity.)

Because different academic disciplines have more or less the same pay structure, and are governed by similar rules and social institutions, these large differences between them are seldom noted. This contributes to our inability to see how huge the differences in productivity between different scientists are in some disciplines.

The difference between these two patterns is due to the fact that the first kinds of jobs are more "social" than the latter kinds in a particular way. The usefulness of your work in theoretical physics is dependent on how good others are at theoretical physics in a way the usefulness of your water fetching isn't. Even if you're weak, you'll still contribute something by carrying a small amount of water to put out a fire, but if you're not above a certain level of cognitive ability, your work in theoretical physics will have no value whatsoever.

I suppose that economists must have written on this phenomenon - what I term other-dependent productivity. If so I'd be interested in that and in adopting their terminology.

I think one reason why people have trouble accepting Yudkowsky's picture is that they note how vastly much more productive Einstein was than an average physicist (let alone the village idiot...) and then infer that this difference must be due to a vast difference in intelligence. Hence pointing out that the difference in productivity could be vast even though the difference in intelligence is not, due to the fact that productivity in theoretical physics is strongly other-dependent, should make people more disposed to accept Yudkowsky's picture.

It would be interesting to discuss what the relationship between ability and productivity is in different jobs and intellectual fields. I leave that for later, though. Obviously, the question of how ability is to be defined is relevant here. This question was extensively discussed in the comments to Yudkowsky's post, but I have avoided discussing it for two reasons: firstly, because I think it is possible to get an intuitive grasp of the phenomena I'm discussing without a precise definition of ability, and, secondly, because an extensive discussion of this notion would have made the post far too long and complicated.

Edit: Here is a relevant article I just found on Marginal Revolution on "winner-take-all economies" where "small differences in skills can mean large differences in returns". It also has some useful tips for further reading. 

Singleton: the risks and benefits of one world governments

1 Stuart_Armstrong 05 July 2013 02:05PM

Many thanks to all those whose conversations have contributed to forming these ideas.

Will the singleton save us?

For most of the large existential risks that we deal with here, the situation would be improved with a single world government (a singleton), or at least greater global coordination. The risk of nuclear war would fade, and pandemics would be met with a comprehensive global strategy rather than a mess of national priorities. Workable regulations for technology risks - such as synthetic biology or AI - would become at least conceivable. All in all, a great improvement in safety...

...with one important exception. A stable tyrannical one-world government, empowered by future mass surveillance, is itself an existential risk (it might not destroy humanity, but it would “permanently and drastically curtail its potential”). So to decide whether to oppose or advocate for more global coordination, we need to assess how likely such a despotic government would be.

This is the kind of research I would love to do if I had the time to develop the relevant domain skills. In the meantime, I’ll just take all my thoughts on the subject and form them into a “proto-research project plan”, in the hopes that someone could make use of them in a real research project. Please contact me if you would want to do research on this, and would fancy a chat.

Defining “acceptable”

Before we can talk about the likelihood of a good outcome, we need to define what a good outcome actually is. For this analysis, I will take the definition that:

  • A singleton regime is acceptable if it is at least as good as any developed democratic government of today.

Original Research on Less Wrong

21 lukeprog 29 October 2012 10:50PM

Hundreds of Less Wrong posts summarize or repackage work previously published in professional books and journals, but Less Wrong also hosts lots of original research in philosophy, decision theory, mathematical logic, and other fields. This post serves as a curated index of Less Wrong posts containing significant original research.

Obviously, there is much fuzziness about what counts as "significant" or "original." I'll be making lots of subjective judgment calls about which suggestions to add to this post. One clear rule is: I won't be linking anything that merely summarizes previous work (e.g. Stuart's summary of his earlier work on utility indifference).

Update 09/20/2013: Added Notes on logical priors from the MIRI workshop, Cooperating with agents with different ideas of fairness, while resisting exploitation, Do Earths with slower economic growth have a better chance at FAI?

Update 11/03/2013: Added Bayesian probability as an approximate theory of uncertainty?, On the importance of taking limits: Infinite Spheres of Utility, Of all the SIA-doomsdays in the all the worlds...

Update 01/22/2014: Added Change the labels, undo infinitely good, Reduced impact AI: no back channels, International cooperation vs. AI arms race, Naturalistic trust among AIs: The parable of the thesis advisor’s theorem


General philosophy


How about testing our ideas?

31 [deleted] 14 September 2012 10:28AM

Related to: Science: Do It Yourself, How To Fix Science, Rationality and Science posts from this sequence, Cargo Cult Science, "citizen science"

You think you have a good map; what you really have is a working hypothesis

You did some thinking about human rationality, perhaps spurred by intuition or personal experience. Building it up, you did your homework and stood on the shoulders of other people's work, giving proper weight to expert opinion. You write an article on LessWrong; it gets upvoted, debated, and perhaps accepted and promoted as part of a "sequence". But now you'd like to do that thing that's been nagging you since the start: you don't want to be one of those insight junkies consuming fun, plausible ideas and never getting around to testing them. Let's see how the predictions made by your model hold up! You dive into the literature in search of experiments that have conveniently already tested your idea.

It is possible there simply isn't any such experimental material, or that it is unavailable. Don't get me wrong: if I had to bet on it, I would say it is more likely than not that there is at least something similar to what you need. I would also bet that some things we wish were done haven't been done so far, and are unlikely to be for a long time. In the past I've wondered whether we can expect CFAR or LessWrong to eventually do experimental work to test many of the hypotheses we've come up with based on fresh but unreliable insight, anecdotal evidence, and long, fragile chains of reasoning. This will not happen on its own.

With the mention of CFAR, the mind jumps to them running expensive experiments or giving long questionnaires to small samples of students and then publishing papers, like everyone else does. It is the respectable thing to do, and it is something that may or may not be worth their effort. It seems doable. The idea of LWers getting into the habit of testing their ideas on human rationality beyond the anecdotal seems utterly impractical. Or is it?

That ordinary people can band together to rapidly produce new knowledge is anything but a trifle

How useful would it be if we had a site visited by thousands or tens of thousands of people, filling out forms or participating in experiments submitted by LessWrong posters or CFAR researchers? Something like this site. How useful would it be if we made such a data set publicly available? What if, in addition to this, we could mine data on how people use apps or an online rationality class? At this point you might be asking yourself whether building knowledge this way is even possible in fields that take years to study. A fair question, especially for tasks that require technical competence; the answer is yes.

I'm sure many, at this point, have started wondering what kinds of problems biased samples might create for us. It is important to keep in mind what kind of sample of people you get to participate in the experiment or fill out your form, since this influences how confident you are allowed to be about generalizations. Learning things about very specific kinds of people is useful too. Recall that this is hardly a unique problem; you can't really get away from it in the social sciences. WEIRD samples aren't weird in academia. And I didn't say the thousands or tens of thousands of people would need to come from our own little corner of the internet; indeed, they probably couldn't. There are many approaches to getting them and making the sample as good as we can. Sites like yourmorals.org have tried a variety of approaches, and we could learn from them. Even doing something like hiring people from Amazon Mechanical Turk can work out surprisingly well.
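
As a side note on the sample-bias worry, even a crude post-stratification reweighting can partly correct for a skewed sample. Here is a minimal sketch; the demographic split, the population shares, and the responses are all made-up assumptions for illustration, not a recommendation of any particular weighting scheme.

```python
# Reweight survey responses so the sample's demographic mix matches an assumed
# target population. Every category, share, and response below is invented.
population_share = {"student": 0.15, "non_student": 0.85}

# (group, answered_yes) pairs from a hypothetical, student-heavy online sample
responses = ([("student", True)] * 70 + [("student", False)] * 30
             + [("non_student", True)] * 40 + [("non_student", False)] * 60)

def weighted_yes_rate(responses, population_share):
    n = len(responses)
    sample_share = {g: sum(1 for grp, _ in responses if grp == g) / n
                    for g in population_share}
    weight = {g: population_share[g] / sample_share[g] for g in population_share}
    total = sum(weight[g] for g, _ in responses)
    yes = sum(weight[g] for g, answer in responses if answer)
    return yes / total

raw = sum(1 for _, answer in responses if answer) / len(responses)
print(f"raw yes-rate: {raw:.2f}, reweighted: {weighted_yes_rate(responses, population_share):.2f}")
```

The point is only that sample composition can be measured and partially adjusted for, which is one reason skewed online samples are still worth collecting.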

LessWrong Science: We do what we must because we can

The harder question is whether the resulting data would be used at all. As we currently are? I don't think so. There are many publicly available data sets and plenty of opportunities to mine data online, yet we see little if any original analysis based on them here. Either we don't have norms encouraging this, or we don't have enough people comfortable with statistics doing it. Problems like this aren't immutable. The Neglected Virtue of Scholarship noticeably changed our community in a similarly profound way, with positive results. Feeling that more is possible, I think it is time for us to move in this direction.

Perhaps just creating a way to get the data will attract the right crowd; the quantified-self people are not out of place here. Perhaps LessWrong should become less of a site and more of a blogosphere. I'm not sure how, and I think for now the question is a distraction anyway. What clearly can be useful is to create a list of models and ideas we've already assimilated that haven't really been tested, or are based on research that still awaits replication. At the very least this will help us be ready to update if relevant future studies show up. But I think that identifying low-hanging fruit, designing some experiments or replication attempts, and then going out there and performing them can get us much more. If people have enough pull to get this done inside academia without community help, great; if not, we should seek alternatives.

Rationality Market Research

59 Raemon 14 July 2011 07:41PM

Several weeks ago, the NYC Rationality Meetup Group began discussing outreach, both for rationality in general and the group in particular. A lot of interesting problems were brought up. Should we be targeting the average person, or sticking to the cluster of personality-types that Less Wrong already attracts? How quickly should we introduce people to our community? What are the most effective ways to spread the idea of rationality, and what are the most effective ways of actually encouraging people to undertake rational actions?

Those are all complex questions with complex answers, which are beyond the scope of this post. I ended up focusing on the question: "Is 'Rationality' the word we want to use when we're pitching ourselves?" I do not think it's worthwhile to try to change the central meme of the Less Wrong community, but it's not obvious that the new realspace communities now forming need to use the same central meme.

This begat a simpler question: "What does the average person think of when they hear the word 'Rationality'? What positive or negative connotations does it have?" Do they think of straw Vulcans and robots? Do they think of effective programmers or businessmen? Armed with this knowledge, we can craft a rationalist pitch that is likely to be effective with the average person, either by challenging their conception of rationality or by bypassing keywords that might set off memetic immune systems.


The trouble with teamwork

10 Swimmer963 23 March 2011 06:05PM

I've hated group projects since about Grade 3. I grew up assuming that at some point, working in groups would stop being a series of trials and tribulations, and turn into normal, sane people working just like they would normally, except on a shared problem. Either they would change, or I would change, because I am incredibly bad at teamwork, at least the kind that gets doled out in the classroom. I don’t have the requisite people skills to lead a group, but I’m too much of a control freak to meekly follow along when the group wants to do a B+ project and I would like an A+. Drama inevitably ensues.

I would like to not have this problem. An inability to work in teams seems like a serious handicap. There are very few jobs that don’t involve teamwork, and my choice of future career, nursing, involves a lot.

My first experiences in the workplace, as a lifeguard, made me feel a little better about this. There was a lot less drama and a lot more just getting the work done. I think it has a lot to do with a) the fact that we’re paid to do a job that’s generally pretty easy, and b) the requirements of that job are pretty simple, if not easy. There is drama, but it rarely involves guard rotations or who’s going to hose the deck, and I can safely ignore it. Rescues do involve teamwork, but it’s a specific sort of teamwork where the roles are all laid out in advance, and that’s what we spent most of our training learning. Example: in a three-guard scenario, the guard to notice an unconscious swimmer in the water becomes guard #1: they make a whistle signal to the others and jump in, while guard #2 calls 911 and guard #3 clears the pool and does crowd control. There isn’t a lot of room for drama, and there isn’t much point because there is one right way to do things, everyone knows the right way to do things, and there isn’t time to fight about it anyway.

I’m hoping that working as a nurse in a hospital will be more this and less like the school-project variety of group work. The roles are defined and laid out; they’re what we’re learning right now in our theory classes. There’s less of a time crunch, but there’s still, usually, an obviously right way to do things. Maybe it gets more complicated when you have to approach a colleague for, say, not following the hand-hygiene rules, or when the rules the hospital management enforces are obviously not the best way to do things, but those are add-ons to the job, not its backbone.

But that’s for bedside nursing. Research is a different matter, and unfortunately, it’s a lot more like school. I’m taking a class about research right now, and something like 30% or 40% of our mark is on a group project. We have to design a study from beginning to end: problem, hypothesis, type of research, research proposal, population and sample, methods of measurement, methods of analysis, etc. My excuse that “I dislike this because it has absolutely no real-world relevance” is downright wrong, because we’re doing exactly what real researchers would do, only with far fewer resources and much less time, and I do like research and would like to work in that milieu someday.

Conflict with my group members usually arises because I’m more invested in the outcome than the others are. I have more motivation to spend time on it, and a higher standard for "good enough". Even if I think the assignment is stupid, I want to do it properly, partly for grades and partly because I hate not doing things properly. I don’t want to lead the group, because I know I’m terrible at it, but no one else wants to either, because they don’t care either way. I end up feeling like a slave driver who isn’t very good at her job.

This time I had a new sort of problem. A group asked me to join them because they thought I was smart and would be a good worker. They set a personal deadline to have the project finished nearly a month before it was due. They had a group meeting, which I couldn’t go to because I was at work, and assigned sections, and sent out an email with an outline. I skimmed the email and put it aside for later, since it seemed less than urgent to me. ...And all of a sudden, at our next meeting, the project was nearly finished. No one had hounded me; they had just gone ahead and done it. Maybe they had a schema in their heads that hounding the non-productive members of the team would lead to drama, but I was offended, because I felt that in my case it wouldn’t have. I would have overridden my policy of doing my work at the last minute, and just gotten it done. It’s not like I didn’t care about our final grade.  

My pride was hurt (the way my classmate told me was by looking at my computer screen in the library, where I’d started to do the part assigned to me in the outline, and saying “you might as well not save that, I already did it.”) I didn’t feel like fighting about it, so I emailed the prof and asked if I could do the project on my own instead of with a team. She seemed confused that I wanted to do extra work, but assented.

I didn’t want to do extra work. I wanted to avoid the work of team meetings, team discussions, team drama... But that’s not how real-world research works. Refusing to play their game means I lose an opportunity to improve my teamwork skills, and I’m going to need those someday, and not just the skills acquired through lifeguarding. Either I need to turn off my control-freak need to have things my way, or I need to become charismatic and good at leading groups, and to do either of those things, I need a venue to practice.

Does anyone else here have the same problem I do? Has anyone solved it? Does anyone have tips for ways to improve?

Edit: reply to comment by jwendy, concerning my 'other' kind of problem. 

"I probably didn't say enough about it in the article, if you thought it seemed glossed over, but I thought a lot about why this happened at the time, and I was pretty upset (more than I should have been, really, over a school project) and that's why I left the group...because unlike type#2 team members, I actually cared a lotabout making a fair contribution and felt like shit when I hadn't. I never consciously decided to procrastinate, either...I just had a lot of other things on my plate, which is pretty much inevitable during the school year, and all of a sudden, foom!, my part of the project is done because one of the girls was bored on the weekend and had nothing better to do. (Huh? When does this ever happen?)

So I guess I'm like a team #2 member in that I procrastinate when I can get away with it, but like a team #1 member in that I do want to turn in quality work and get an A+. And I want it to be my quality work, not someone else's with my name on it."

I think I was justified in being surprised when this new kind of problem happened to me. If I'm more involved/engaged than all the students I've worked with in the past, that doesn't mean I'm the most engaged, but it does mean I have a schema in my brain for 'no one has their work finished until a week after they say they will'.

 
