Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

[Link] Asgardia - The Space Kingdom

0 morganism 18 November 2017 08:17PM

[Link] Ethical priorities for neurotech and AI - Nature

0 morganism 15 November 2017 09:08AM

[Link] Artificial intelligence and the stability of markets

1 fortyeridania 15 November 2017 02:17AM

[Link] Military AI as a Convergent Goal of Self-Improving AI

0 turchin 13 November 2017 11:25AM

Fables grow around missed natural experiments

1 MaryCh 10 November 2017 09:42PM

So I read Think like a Freak, and then glanced through a well-intentioned collection of "Reading Comprehension Tests for Schoolchildren" (in Ukrainian). I was appalled at how easily the latter book dismissed the simple observation of natural experiments, which it makes only a token effort to describe, in favour of drawing the moral.

There was the story of "the Mowgli Children": two girls who were kidnapped and raised by wildlife, then found and taken back to live as humans. (So what if it is hardly true? When I Googled "feral children", the other stories were too similar to this one in the ways that matter.) It says they never learned to talk, did not live long after capture (no longer than 12 years, if I recall right), and never truly became part of human society. The moral is that children need interaction with other people to develop normally, "and the tale of Mowgli is just that, a beautiful tale".

Well yes, it kind of seems like just a beautiful tale right from the point when the wolves start talking. I don't know what kind of kid would miss that before the Reading Comprehension Test but stop believing it afterwards. But anyhow:

What did they die of?

Who answered them when they howled?

Were ever dogs afraid of them?

They did not master human language, but how did they communicate with people? They had to, somehow, or they would not have lived even that long.

And lastly: how can people weigh the sheer impossibility of two little kids surviving on their own so lightly, while holding with iron certainty that they could never reintegrate into human society? If the reader is expected to take the survival on faith, how can one be anything but amazed that it was possible at all? When I read about other feral children, being found and taken back never seems to mean good news for them, or for anybody else.

I haven't ever read or heard of "the Mowgli Children" in any other context. Only in this one, about three or four times, and yet it was always presented as an "anecdote of science", although everybody understands it leads nowhere (it can't ever lead anywhere, because ethics forbids recreating the experiment's conditions) and hardly signifies anything.


What other missed natural experiments do you know of?

Less Wrong Lacks Representatives and Paths Forward

1 curi 08 November 2017 07:00PM

In my understanding, there’s no one who speaks for LW, as its representative, and is *responsible* for addressing questions and criticisms. LW, as a school of thought, has no agents, no representatives – or at least none who are open to discussion.


The people I’ve found interested in discussion on the website and slack have diverse views which disagree with LW on various points. None claim LW is true. They all admit it has some weaknesses, some unanswered criticisms. They have their own personal views which aren’t written down, and which they don’t claim to be correct anyway.


This is problematic. Suppose I wrote some criticisms of the sequences, or some Bayesian book. Who will answer me? Who will fix the mistakes I point out, or canonically address my criticisms with counter-arguments? No one. This makes it hard to learn LW’s ideas in addition to making it hard to improve them.


My school of thought (Fallible Ideas – FI – https://fallibleideas.com) has representatives and claims to be correct as far as is known (like LW, it’s fallibilist, so of course we may discover flaws and improve it in the future). It claims to be the best current knowledge, which is currently non-refuted, and has refutations of its rivals. There are other schools of thought which say the same thing – they actually think they’re right and have people who will address challenges. But LW just has individuals who individually chat about whatever interests them without there being any organized school of thought to engage with. No one is responsible for defining an LW school of thought and dealing with intellectual challenges.


So how is progress to be made? Suppose LW, vaguely defined as it may be, is mistaken on some major points. E.g. Karl Popper refuted induction. How will LW find out about its mistake and change? FI has a forum where its representatives take responsibility for seeing challenges addressed, and have done so continuously for over 20 years (as some representatives stopped being available, others stepped up).


Which challenges are addressed? *All of them*. You can’t just ignore a challenge because it could be correct. If you misjudge something and then ignore it, you will stay wrong. Silence doesn’t facilitate error correction. For information on this methodology, which I call Paths Forward, see: https://curi.us/1898-paths-forward-short-summary BTW if you want to take this challenge seriously, you’ll need to click the link; I don’t repeat all of it. In general, having much knowledge is incompatible with saying all of it (even on one topic) upfront in forum posts without using references.


My criticism of LW as a whole is that it lacks Paths Forward (and lacks some alternative of its own to fulfill the same purpose). In that context, my criticisms regarding specific points don’t really matter (or aren’t yet ready to be discussed) because there’s no mechanism for them to be rationally resolved.


One thing FI has done, which is part of Paths Forward, is it has surveyed and addressed other schools of thought. LW hasn’t done this comparably – LW has no answer to Critical Rationalism (CR). People who chat at LW have individually made some non-canonical arguments on the matter that LW doesn’t take responsibility for (and which often involve conceding LW is wrong on some points). And they have told me that CR has critics – true. But which criticism(s) of CR does LW claim are correct and take responsibility for the correctness of? (Taking responsibility for something involves doing some major rethinking if it’s refuted – addressing criticism of it and fixing your beliefs if you can’t. Which criticisms of CR would LW be shocked to discover are mistaken, and then be eager to reevaluate the whole matter?) There is no answer to this, and there’s no way for it to be answered because LW has no representatives who can speak for it and who are participating in discussion and who consider it their responsibility to see that issues like this are addressed. CR is well known, relevant, and makes some clear LW-contradicting claims like that induction doesn’t work, so if LW had representatives surveying and responding to rival ideas, they would have addressed CR.


BTW I’m not asking for all this stuff to be perfectly organized. I’m just asking for it to exist at all so that progress can be made.


Anecdotally, I’ve found substantial opposition to discussing/considering methodology from LW people so far. I think that’s a mistake, because we use methods whenever we discuss or do anything else. I’ve also found substantial resistance to the use of references (including to my own material) – but why should I rewrite a new version of something that’s already written? Text is text, and should be treated the same whether it was written in the past or today, and whether it was written by someone else or by me (either way, I’m taking responsibility for it). I think that’s something people don’t understand; they’re used to people throwing references around vaguely and irresponsibly – but they haven’t pointed out any instance where I made that mistake. Ideas should be judged by the idea, not by attributes of the source (reference or non-reference).


The Paths Forward methodology is also what I think individuals should personally do – it works the same for a school of thought or an individual. Figure out what you think is true *and take responsibility for it*. For parts that are already written down, endorse that and take responsibility for it. If you use something to speak for you, then if it’s mistaken *you* are mistaken – you need to treat that the same as your own writing being refuted. For stuff that isn’t written down adequately by anyone (in your opinion), it’s your responsibility to write it (either from scratch or using existing material plus your commentary/improvements). This writing needs to be put in public and exposed to criticism, and the criticism needs to actually get addressed (not silently ignored) so there are good Paths Forward. I hoped to find a person using this method, or interested in it, at LW; so far I haven’t. Nor have I found someone who suggested a superior method (or even *any* alternative method to address the same issues) or pointed out a reason Paths Forward doesn’t work.


Some people I talked with at LW seem to still be developing as intellectuals. For lots of issues, they just haven’t thought about them yet. That’s totally understandable. However, I was hoping to find some developed thought which could point out any mistakes in FI or change its mind. I’m seeking primarily peer discussion. (If anyone wants to learn from me, btw, they are welcome to come to my forum. It can also be used to criticize FI. http://fallibleideas.com/discussion-info) Some people also indicated they thought it’d be too much effort to learn about and address rival ideas like CR. But if no one has done that (so there’s no answer to CR they can endorse), then how do they know CR is mistaken? If CR is correct, it’s worth the effort to study! If CR is incorrect, someone had better write that down in public (so CR people can learn about their errors and reform, and so perhaps they could improve CR to no longer be mistaken, or point out errors in the criticism of CR).


One of the issues related to this dispute is I believe we can always proceed with non-refuted ideas (there is a long answer for how this works, but I don’t know how to give a short answer that I expect LW people to understand – especially in the context of the currently-unresolved methodology dispute about Paths Forward). In contrast, LW people typically seem to accept mistakes as just something to put up with, rather than something to try to always fix. So I disagree with ignoring some *known* mistakes, whereas LW people seem to take it for granted that they’re mistaken in known ways. Part of the point of Paths Forward is not to be mistaken in known ways.


Paths Forward is a methodology for organizing schools of thought, ideas, discussion, etc, to allow for unbounded error correction (as opposed to typical things people do like putting bounds on discussions, with discussion of the bounds themselves being out of bounds). I believe the lack of Paths Forward at LW is preventing the resolution of other issues like about the correctness of induction, the right approach to AGI, and the solution to the fundamental problem of epistemology (how new knowledge can be created).

[Link] The Little Dragon is Dead

0 SquirrelInHell 06 November 2017 09:24PM

[Link] AGI

0 curi 05 November 2017 08:20PM

[Link] Kialo -- an online discussion platform that attempts to support reasonable debates

2 mirefek 05 November 2017 12:48PM

[Link] Intent of Experimenters; Halting Procedures; Frequentists vs. Bayesians

1 curi 04 November 2017 07:13PM

[Link] Intercellular competition and the inevitability of multicellular aging

1 Gunnar_Zarncke 04 November 2017 12:32PM

Announcing the AI Alignment Prize

6 cousin_it 03 November 2017 03:45PM

Smarter-than-human artificial intelligence would be dangerous to humanity. It is vital that any such intelligence's goals be aligned with humanity's goals. Maximizing the chance that this happens is a difficult, important and under-studied problem.

To encourage more and better work on this important problem, we (Zvi Mowshowitz and Vladimir Slepnev) are announcing a $5000 prize for publicly posted work advancing understanding of AI alignment, funded by Paul Christiano.

This prize will be awarded based on entries gathered over the next two months. If the prize is successful, we will award further prizes in the future.

This prize is not backed by or affiliated with any organization.


Your entry must be published online for the first time between November 3 and December 31, 2017, and contain novel ideas about AI alignment. Entries have no minimum or maximum size. Important ideas can be short!

Your entry must be written by you, and submitted before 9pm Pacific Time on December 31, 2017. Submit your entries either as URLs in the comments below, or by email to apply@ai-alignment.com. We may provide feedback on early entries to allow improvement.

We will award $5000 to between one and five winners. The first place winner will get at least $2500. The second place winner will get at least $1000. Other winners will get at least $500.

Entries will be judged subjectively. Final judgment will be by Paul Christiano. Prizes will be awarded on or before January 15, 2018.

What kind of work are we looking for?

AI Alignment focuses on ways to ensure that future smarter-than-human intelligence will have goals aligned with the goals of humanity. Many approaches to AI Alignment deserve attention. This includes technical and philosophical topics, as well as strategic research about related social, economic or political issues. A non-exhaustive list of technical and other topics can be found here.

We are not interested in research dealing with the dangers of existing machine learning systems, commonly called AI, that do not have smarter-than-human intelligence. These concerns are also under-studied, but they are not the subject of this prize except in the context of future smarter-than-human intelligence. We are also not interested in general AI research. We care about AI Alignment, which may or may not also advance the cause of general AI research.

Problems as dragons and papercuts

1 Elo 03 November 2017 01:41AM

Original post: http://bearlamp.com.au/problems-as-dragons-and-papercuts/

When I started trying to become the kind of person that can give advice, I went looking for dragons.

I figured if I didn't know the answers that meant the answers were hard, they were big monsters with hidden weak spots that you have to find. "Problem solving is hard", I thought.

Problem solving is not something everyone is good at, because problems are hard, beastly things. Right?

For all my searching for problems, I keep coming back to that just not being accurate. Problems are all easy, dumb, simple things. Winning at life is not about taking on the right dragon and finding its weak spots.

Problem solving is about getting the basics down and dealing with every single, "when I was little I imprinted on not liking chocolate and now I have been an anti-chocolate campaigner for so long for reasons that I have no idea about and now it's time to change that".

It seems like the more I look for dragons and beasts, the less I find, and the more problems seem like paper cuts. But it's paper cuts all the way down. Paper cuts that caused you to argue with your best friend in sixth grade, paper cuts that caused you to sneak midnight snacks when no one was looking, and eat yourself fat and be mad at yourself. Paper cuts.

I feel like a superhero all dressed up and prepared to fight crime but all the criminals are petty thieves and opportunists that got caught on a bad day. Nothing coordinated, nothing super-villain, and no dragons.

When I was in high school (male with long hair) I used to wear my hair in a pony tail.  For about 4 years.  Every time I would wake up or my hair would dry I would put my hair in a pony tail.  I just did.  That's what I would do.  One day.  One day a girl (who I had not spoken to ever) came up to me and asked me why I did it.  To which I did not have an answer.  From that day forward I realised I was doing a thing I did not need to do.  It's been over 10 years since then and I have that one conversation to thank for changing the way I do that one thing.  I never told her.

That one thing.  That one thing that is irrelevant, and only really meaningful to you because someone said this one thing, this one time. but from the outside it feels like, "so what".  That's what problems are like, and that's what it's like to solve problems.  But.  If you want to be good at solving problems you need to avoid feeling like "so what" and engage the "curiosity", search for the feeling of confusion.  Appeal to the need for understanding.  Get into it.

Meta: this has been an idle musing for weeks now. Actually writing took about an hour.

Cross posted to https://www.lesserwrong.com/posts/MWoxdGwMHBSqNPPKK/problems-as-dragons-and-papercuts

November 2017 Media Thread

1 ArisKatsaris 02 November 2017 12:35AM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.


  • Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
  • If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
  • Please post only under one of the already created subthreads, and never directly under the parent media thread.
  • Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
  • Use the "Meta" thread if you want to discuss about the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

[Link] Simple refutation of the ‘Bayesian’ philosophy of science

1 curi 01 November 2017 06:54AM

[Link] Why Competition in The Politics Industry Is Failing America -pdf

0 morganism 31 October 2017 11:26PM

Questions about AGI's Importance

0 curi 31 October 2017 08:50PM

Why expect AGIs to be better at thinking than human beings? Is there some argument that human thinking problems are primarily due to hardware constraints? Has anyone here put much thought into parenting/educating AGIs?

Cutting edge technology

2 Elo 31 October 2017 06:00AM

Original post: http://bearlamp.com.au/cutting-edge-technology/

When the microscope was invented, in a very short period of time we discovered the cell and the concept of microbiology.  That one invention allowed us to open up entire fields of biology and medicine.  Suddenly we could see the microbes!  We could see the activity that had been going on under our noses for so long.

When we started to improve our ability to refine pure materials, we could finally make furnace bricks with specific compositions. Specific compositions could then be used to make bricks that were able to reach higher temperatures without breaking. Higher temperatures meant better refining of materials. Better refining meant higher quality bricks, and so on, until we now have some very pure technological processes for making materials. But it's something we didn't have before the prior technology on the skill tree.

Before we had refrigeration and food packaging, it was difficult to get fresh food home still fresh. Now, with production lines, it's very simple. For all his decadence, Caesar would probably have had trouble ordering a cheeseburger for $2 and having it ready in under 5 minutes. We've come a long way since Caesar. We've built a lot of things that help us stand on the shoulders of those who came before us.

Technology enables further progress. That seems obvious. But did that seem obvious before looking down the microscope? Could we have predicted what bricks we could make with purely refined materials? Could Caesar have envisioned every citizen in his kingdom watching TV for relatively little cost to those people? It would have been hard to foresee these things back then.

With the idea that technology enables future growth in mind, I ask: "What technology is currently under-utilised?" Would you be able to spot it when it happens? Touch screens revolutionised phone technology. Bitcoin - we are still watching, but it's here to stay.

"What technology is currently under-utilised?"

For example, "AI has the power to change everything (it's almost too big to talk about)." But that's a big thing. It's like saying "the internet has the power to change everything" - great, but could you have predicted Google, Facebook and Uber from a couple of connected computers? I am hoping for some more specific ideas about which specific technology will change life in what way.

Here are some ideas in ROT13 (chrome addon d3coder):

  • Pbzchgre hfr jvyy punatr jura jr ohvyq gur arkg guvat gb ercynpr "xrlobneqf"
  • Genafcbeg grpuabybtl jvyy punatr vs onggrel be "raretl fgbentr" grpuabybtl vzcebirf.
  • Nhgbzngvba jvyy punatr cebqhpgvba naq qryvirel bs tbbqf naq freivprf. Naq riraghnyyl oevat nobhg cbfg-fpnepvgl rpbabzvpf
  • Vs IE penpxf orggre pbybhe naq fbhaq grpuabybtl (guvax, abg whfg PZLX ohg nyy gur bgure pbybhef abg ba gur YRQ fcrpgehz), jr zvtug whfg frr IE rkcybqr.
  • Znpuvar yrneavat naq fgngvfgvpf unir gur cbjre gb punatr zrqvpvar
  • PEVFCE naq trar rqvgvat jvyy punatr sbbq cebqhpgvba
  • Dhnaghz pbzchgvat jvyy punatr trar rqvgvat ol pnyphyngvat guvatf yvxr cebgrva sbyqvat va fvtavsvpnagyl yrff gvzr.
  • Dhnaghz pbzchgvat (juvyr vg'f fgvyy abg pbafhzre tenqr) jvyy nyfb punatr frphevgl.
  • V jbhyq unir fnvq 3Q cevagvat jbhyq punatr ybpxfzvguvat ohg abj V nz abg fb fher.
    3Q cevagvat unf birenyy qbar n cbbe wbo bs punatvat nalguvat.
  • vs gur pbafgehpgvba vaqhfgel pna nhgbzngr gung jvyy punatr gur jnl jr ohvyq ubhfvat.

As much as these don't all follow the rule of being consumer-grade developments that might revolutionise the world, I'd like to encourage others to aim for consumer viable ideas.  

This matters because this is how you see opportunity.  This is how you find value.  If you can take one thing on my list or your own list and make it happen sooner, you can probably pocket a pretty penny in the process.  So what's on your list?  Do you have two minutes to think about what's coming soon?

Cross posted to lesserwrong: https://www.lesserwrong.com/posts/3GP3j7zgKbnaZCDbp/cutting-edge-technology

Open thread, October 30 - November 5, 2017

0 Elo 30 October 2017 11:37PM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the option "Notify me of new top-level comments on this article".

[Link] Reason and Morality: Philosophy Outline with Links for Details

0 curi 30 October 2017 11:33PM

[Link] Should we be spending no less on alternate foods than AI now?

2 denkenberger 30 October 2017 12:13AM

Interactive model knob-turning

3 Gust 28 October 2017 07:42PM

(Please discuss on LessWrong 2.0)

(Cross-posted from my medium channel)

When you are trying to understand something by yourself, a useful skill to check your grasp on the subject is to try out the moving parts of your model and see if you can simulate the resulting changes.

Suppose you want to learn how a rocket works. At the bare minimum, you should be able to calculate the speed of the rocket given the time past launch. But can you tell what happens if Earth gravity was stronger? Weaker? What if the atmosphere had no oxygen? What if we replaced the fuel with Diet Coke and Mentos?
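As a toy illustration of such knob-turning (not part of the original post), here is a minimal sketch using the Tsiolkovsky rocket equation with a crude gravity-loss term. The function name and every parameter value are invented for illustration, and drag is ignored; the point is only that each "knob" (gravity, exhaust velocity, burn rate) visibly feeds the prediction:

```python
import math

def rocket_speed(t, m0=500_000.0, mdot=2_000.0, v_exhaust=3_000.0, g=9.81):
    """Idealized speed (m/s) of a rocket t seconds after launch.

    Tsiolkovsky's equation with a simple gravity-loss term:
        v(t) = v_exhaust * ln(m0 / m(t)) - g * t
    where m(t) = m0 - mdot * t. All default values are made up.
    """
    m = m0 - mdot * t  # mass remaining after burning propellant
    if m <= 0:
        raise ValueError("all propellant (and more) has been burned")
    return v_exhaust * math.log(m0 / m) - g * t

# Turning the knobs: same model, different worlds.
baseline = rocket_speed(60)                        # Earth gravity
weaker_gravity = rocket_speed(60, g=1.62)          # Moon-like gravity
better_fuel = rocket_speed(60, v_exhaust=4_500.0)  # faster exhaust
```

A model like this is exactly what the knob-turning test asks for: weaker gravity and faster exhaust should both raise the speed, and if the code disagreed with that intuition, either the model or the intuition would need fixing.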

To really understand something, it's not enough to be able to predict the future in a normal, expected, ceteris paribus scenario. You should also be able to predict what happens when several variables are changed in several ways, or, at least, point to which calculations need to be run to arrive at such a prediction.

Douglas Hofstadter and Daniel Dennett call that "turning the knobs". Imagine your model as a box with several knobs, where each knob controls one aspect of the modeled system. You don't have to be able to turn all the possible knobs to all possible values and still get a sensible, testable and correct answer, but the more, the better.

Doug and Dan apply this approach to thought experiments and intuition pumps, as a way to explore possible answers to philosophical questions. In my experience, this skill is also effective when applied to real world problems, notably when trying to understand something that is being explained by someone else.

In this case, you can run this knob-turning check interactively with the other person, which makes it way more powerful. If someone says “X+Y = Z” and “X+W = Z+A”, it’s not enough to mentally turn the knobs and calculate “X+Y+W = Z+A+B”. You should do that, then actually ask the explainer “Hey, let me see if I get what you mean: for example, X+Y+W would be Z+A+B”?

This interactive model knob-turning has been useful to me in many walks of life, but the most common and mundane application is helping out people at work. In that context, I identify six effects which make it helpful:

1) Communication check: maybe you misunderstood and actually X+W = Z-A

This is useful overall, but very important if someone uses metaphor. Some metaphors are clearly vague and people will know that and avoid them in technical explanations. But some metaphors seem really crisp for some people but hazy to others, or worse, very crisp to both people, but with different meanings! So take every metaphor as an invitation to interactive knob-turning.

To focus on communication check, try rephrasing their statements, using different words or, if necessary, very different metaphors. You can also apply a theory in different contexts, to see if the metaphors still apply.

For example, if a person talks about a computer system as if it were a person, I might try to explain the same thing in terms of a group of trained animals, or a board of directors, or dominoes falling.

2) Self-check: correct your own reasoning (maybe you understood the correct premises, but made a logical mistake during knob turning)

This is useful because humans are fallible, and two (competent) heads are less likely to miss a step in the reasoning dance than one.

Also, when someone comes up and asks something, you’ll probably be doing a context-switch, and will be more likely to get confused along the way. The person asking usually has more local context than you in the specific problem they are trying to solve, even if you have more context on the surrounding matters, so they might be able to spot your error more quickly than yourself.

Focusing on self-check means double-checking any intuitive leaps or tricky reasoning you used. Parts of your model that do not have a clear step-by-step explanation have priority, and should be tested against another brain. Try to phrase the question in a way that makes your intuitive answer look less obvious.

For example: “I’m not sure if this could happen, and it looks like all these messages should arrive in order, but do you know how we can guarantee that?”

3) Other-check: help the other person to correct inferential errors they might have made

The converse of self-checking. Sometimes fresh eyes with some global context can see reasoning errors that are hidden to people who are very focused on a task for too long.

To focus on other-check, ask about conclusions that follow from your model of the situation but seem unintuitive to you, or required tricky reasoning. It’s possible that your friend also found them unintuitive, and that might have led them to jump in the opposite direction.

For example, I could ask: “For this system to work correctly, it seems that the clocks have to be closely synchronized, right? If the clocks are off by much, we could have a difference around midnight.”

Perhaps you successfully understand what was said, and the model you built in your head fits the communicated data. But that doesn’t mean it is the same model that the other person has in mind! In that case, your knob-turning will get you a result that’s inconsistent with what they expect.

4) Alternative hypothesis generation: If they cannot refute your conclusions, you have shown them a possible model they had not yet considered, in which case it will also point in the direction of more research to be made

This doesn't happen that much when someone is looking for help with something. Usually the context they are trying to explain is the pre-existing system which they will build upon, and if they’ve done their homework (i.e. read the docs and/or code) they should have a very good understanding of that already. One exception here is people who are very new to the job, who are learning while doing.

On the other hand, this is incredibly relevant when someone asks for help debugging. If they can’t find the root cause of a bug, it must be because they are missing something. Either they have derived a mistaken conclusion from the data, or they’ve made an inferential error from those conclusions. The first case is where proposing a new model helps (the second is solved by other-checking).

Maybe they read the logs, saw that a request was sent, and assumed it was received, but perhaps it wasn’t. In that case, you can tell them to check for a log on the receiver system, or the absence of such a log.

To boost this effect, look for data that you strongly expect to exist and confirm your model, where the absence of such data might be caused by relative lack of global context, skill or experience by the other person.

For example: “Ok, so if the database went down, we should’ve seen all requests failing in that time range; but if it was a network instability, we should have random requests failing and others succeeding. Which one was it?”

5) Filling gaps in context: If they show you data that contradicts your model, well, you get more data and improve your understanding

This is very important when you have much less context than the other person. The larger the difference in context, the more likely that there’s some important piece of information that you don’t have, but that they take for granted.

The point here isn’t that there’s something you don’t know. There are lots and lots of things you don’t know, and the same goes for your colleague. And if there’s something they know that you don’t, they’ll probably fill you in when asking the question.

The point is that they will tell you something only if they realize you don’t know it yet. But people will expect short inferential distances, underestimate the difference in context, and forget to tell you stuff because it’s just obvious to them that you know.

Focusing on filling gaps means asking about the parts of your model you are most uncertain about, to find out if they can help you build a clearer picture. You can also extrapolate and make a wild guess, which you don’t really expect to be right.

For example: “How does the network work in this datacenter? Do we have a single switch so that, if it fails, all connections go down? Or are those network interfaces all virtualized anyway?”

6) Finding new ideas: If everybody understands one another, and the models are correct, knob-turning will lead to new conclusions (if they hadn’t turned those specific knobs on the problem yet)

This is the whole point of having the conversation: to help someone figure something out they haven’t already. But even if the specific new conclusion you arrive at when knob-turning isn’t directly relevant to the current question, it may end up shining light on some part of the other person’s model that they couldn’t see yet.

This effect is general and will happen gradually as both your and the other person's models improve and converge. The goal is to get all obstacles out of the way so you can just move forward and find new ideas and solutions.

The more global context and skill your colleague has, the lower the chance that they missed some crucial piece of data and have a mistaken model (or, if they do, you probably won't be able to figure that out without putting in serious effort). So when talking to more skilled or experienced people, you can focus more on replicating the model from their mind to yours (communication check and self-check).

Conversely, when talking to less skilled people, you should focus more on errors they might have made, or models they might not have considered, or data they may need to collect (other-check and alternative hypothesis generation).

Filling gaps depends more on differences of communication style and local context, so I don't have a person-based heuristic.

NYC Solstice and Megameetup Funding Reminder

1 wearsshoes 27 October 2017 02:14PM

Hey all, we're coming up on the final weekend of our Kickstarter. Details in previous post, and a couple updates here:

  • Megameetup has 16 confirmed attendees. This is shaping up to be a really good chance to have productive conversations and form friendships with other rationalists.

  • Solstice is currently at $2,740 (54% funded), with 65% of funding window elapsed. Please contribute - even a little helps.

  • To clarify for people buying multiple tickets, sponsors at $70+ automatically receive two tickets.

  • There will be additional tickets for purchase post-Kickstarter, conditional on meeting our goal, of course.

  • We're offering these incredibly cute stickers above certain backer levels!

Both are only open until Monday, Oct 30th - please give if you can to the Kickstarter, and we're excited to see you at the Megameetup!
Solstice Kickstarter page
Megameetup registration and details



I Want to Review FDT; Are my Criticisms Legitimate?

0 DragonGod 25 October 2017 05:28AM

I'm going to write a review of functional decision theory, drawing on the two papers.
It's going to be around as long as the papers themselves, and coupled with school work, I'm not sure when I'll finish writing.
Before I start it, I want to be sure my criticisms are legitimate; is anyone willing to go over my criticisms with me?
My main points of criticism are:
Functional decision theory is actually algorithmic decision theory. It has an algorithmic view of decision theories. It relies on algorithmic equivalence and not functional equivalence.
Quick sort, merge sort, heap sort, insertion sort, selection sort, bubble sort, etc are mutually algorithmically dissimilar, but are all functionally equivalent.
If two decision algorithms are functionally equivalent, but algorithmically dissimilar, you'd want a decision theory that recognises this.
Causal dependence is a subset of algorithmic dependence which is a subset of functional dependence.
So, I specify what an actual functional decision theory would look like.
I then go on to show that even functional dependence is "impoverished".
Imagine a greedy algorithm that gets 95% of problems correct.
Let's call this greedy algorithm f'.
Let's call a correct algorithm f.
f and f' are functionally correlated, but not functionally equivalent.
FDT does not recognise this.
If f is your decision algorithm, and f' is your predictor's decision algorithm, then FDT doesn't recommend one boxing on Newcomb's problem.
EDT can deal with functional correlations.
EDT doesn't distinguish functional correlations from spurious correlations, while FDT doesn't recognise functional correlations.
I use this to specify EFDT (evidential functional decision theory), which considers P(f(π) = f'(π)) instead of P(f = f').
I specify the requirements for a full Implementation of FDT and EFDT.
I'll publish the first draft of the paper here after I'm done.
The paper would be long, because I specify a framework for evaluating decision theories in the paper.
Using this framework I show that EFDT > FDT > ADT > CDT.
I also show that EFDT > EDT.
This framework is basically a hierarchy of decision theories.
A > B means that the set of problems that B correctly decides is a subset of the set of problems that A correctly decides.
The dependence hierarchy is why CDT < ADT < FDT.
EFDT > FDT because EFDT can recognise functional correlations.
EFDT > EDT because EFDT can distinguish functional correlations from spurious correlations.
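The ordering defined above (A > B iff the set of problems B correctly decides is a strict subset of the set A correctly decides) can be sketched as plain set inclusion. The problem sets below are purely illustrative placeholders, not claims about which theory solves which real decision problem:

```python
def dominates(a, b, solved):
    """A > B: the set of problems B decides correctly is a strict
    subset of the set of problems A decides correctly."""
    return solved[b] < solved[a]  # Python's strict-subset test on sets

# Placeholder problem sets, invented only to illustrate the ordering.
solved = {
    "CDT":  {"P1"},
    "ADT":  {"P1", "P2"},
    "FDT":  {"P1", "P2", "P3"},
    "EFDT": {"P1", "P2", "P3", "P4"},
}

# The claimed chain EFDT > FDT > ADT > CDT, checked pairwise.
chain = ["EFDT", "FDT", "ADT", "CDT"]
assert all(dominates(a, b, solved) for a, b in zip(chain, chain[1:]))
```

Note that strict inclusion makes the ordering partial: two theories whose correctly-decided sets merely overlap would be incomparable under it.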
I plan to write the paper as best as I can, and if I think it's good enough, I'll try submitting it.

Pitting national health care systems against one another

1 michael_b 24 October 2017 09:34PM

I'm about to have a baby.  Any minute now.  Well, my partner is.  I'm just sitting here not growing a baby wondering what to do with myself.

Maybe I can get a jump on our approach to medical care for the new kiddo.

One thing that sticks out at me is that children in the US get a lot of vaccinations.  At my quick count it's something like 37 shots by the time they're 5.

I grew up in the US in the 80s and I don't remember getting nearly this many.  Is my memory faulty?  I'm pretty sure it was more like 12 back in those days.  Is this all really necessary? Nobody likes getting shots, especially not children.  What changed, anyway?

Now, I'm not an expert on immunology or epidemiology so I expect diving into the literature isn't going to be fruitful; I won't be able to ante up decades of education and experience fast enough.  Presumably this is what we pay people at the US CDC and Department of Health for.

But can you *really* trust them?  Aren't all of these vaccinations really convenient for the pharmaceutical industry?  Aren't there seemingly constant allegations/lawsuits about the over-prescription of drug interventions in the US?

The health care systems in major world countries have access to all of the same literature, and they're presumably staffed by educated, expert people too, so they should all come to the same conclusions as the US system, right? Not so!

Here's how many shots each nation's health care system recommends by the time children turn 5.

37 US

25 UK

25 Germany

16 Sweden

16 Denmark

The intersection of the recommended vaccines is TDAP, MMR, Polio, HIB and PCV.

In the US we also recommend: Hep A, Hep B, Rotavirus, Meningococcus, Varicella, and yearly flu shots (for babies and children).

Can we explain the variance?  I can think of a few reasons they would vary.

1. Cultural bias.  This can be big.  A psychiatrist in the UK told me that they're not as pharma heavy as, say, psychiatrists in Germany because of a WW2 era bias: lots of the big pharma companies are German.

2. Cultural and environmental differences.  Some diseases are a bigger deal in some countries than others.  Japan (not included above) recommends immunization against diseases (TB, Japanese encephalitis) that none of the systems above are too concerned with.

3. Undue industry influence.  Run-of-the-mill corruption.

4. Quality of health care systems and social safety nets vary.

When it comes to cultural and environment differences I have a hard time imagining that the orthodoxy varies because Hep A is a much bigger deal in the US.  I presume the calculus changes based on your geographic neighbors, but is it a meaningful difference?  Or is it a counterproductive cultural bias?  For example, in the US we may spend more time thinking about diseases people in central America suffer from than the people in Denmark might, but do the neighbors in this case meaningfully translate to a higher disease risk?  Or are we vaccinating against unfounded fears?

Do the other nations vaccinate less than the US because their health care systems are worse?   Annoyingly (if you're an American) all of their health care outcomes rank better.

Is the US health care system more corruptible by industry influence?

Is the story a lot simpler and less sinister?  That the US vaccinates more than the rest of these countries because the balance of the US's health care system (access to treatment, quality of treatment) is worse?  Or is it because having to stay home with a kid that's sick with chicken pox (varicella) is not so big a deal in, say, Denmark, because the social contract is more forgiving of parents who miss work?

Does the poorer quality of health care in the US (going by international rankings) and the lower tolerance for parents missing work combine poorly with the undue influence of industry and therefore lead to more vaccinations?

On the flip side of this argument: so what if we vaccinate kids against more diseases than other countries?  Well, they're not free.  They cost money to administer, and cost tears because kids hate getting shots.  The health risks from vaccines aren't zero, either.  Vaccines have side-effects, and sometimes they're serious.  Those other nations (presumably) ran cost-benefit analyses too and came to different conclusions.  It would be nice if each country showed their work.  

When it comes to needles to stick my new kiddo with, I'm not really being persuaded to do more than the intersection of vaccinations between similar nations.  The fear that a doctor is about to stick my kid with a needle because there was a meeting in a shady room between a pharma rep and a CDC official is pretty powerful.  It doesn't seem like a strictly irrational concern, either.

[Link] Time to Exit the Sandbox

3 SquirrelInHell 24 October 2017 08:04AM

[Link] Absent Minded Gambler

0 DragonGod 23 October 2017 02:42PM

Introducing Goalclaw, personal goal tracker

1 Nic_Smith 21 October 2017 08:10PM

Quite a while ago, I wrote that there should be more software tools to assist with instrumental rationality. My recent attempt to create such a tool, GOALCLAW, is now available. GOALCLAW is a general goal-tracking webapp which currently shows, for the tags you enter on day-to-day events, their average effect on your goals, with plans to make more tag-based metrics and projections available in the near future.

  • GOALCLAW is new:
    • A few editing features are missing and should be added in the next few months
    • The built-in analysis needs to be expanded from averages
    • I'm very interested in feedback on how to make this a more useful goal-tracker
  • The general idea is to make patterns in what's going on around you and what you're doing a bit more obvious, so you can then investigate, verify/experiment, and act to achieve your goals
  • You can download information entered for importing into spreadsheets, stats program, etc.

Halloween costume: Paperclipperer

5 Elo 21 October 2017 06:32AM

Original post: http://bearlamp.com.au/halloween-costume-paperclipperer/

Guidelines for becoming a paperclipperer for halloween.


  • Paperclips (some as a prop, make your life easier by buying some, but show effort by making your own)
  • pliers (extra pairs for extra effect)
  • metal wire (can get colourful for novelty) (Florist wire)
  • crazy hat (for character)
  • Paperclip props.  Think glasses frame, phone case, gloves, cufflinks, shoes, belt, jewellery...
  • if going to a party - consider a gift that is suspiciously paperclip-like.  Example - paperclip coasters, paperclip vase, paperclip party-snack-bowl
  • Epic commitment - make fortune cookies with paperclips in them.  The possibilities are endless.
  • Epic: paperclip tattoo on the heart.  Slightly less epic, draw paperclips on yourself.


While at the party, use the pliers and wire to make paperclips.  When people are not watching, try to attach them to objects around the house (for example, on light fittings, on the toilet paper roll, under the soap).  When people are watching you, try to give them paperclips to wear.  Also wear them on the edges of your clothing.

When people ask about it, offer to teach them to make paperclips.  Exclaim that it's really fun!  Be confused, bewildered or distant when you insist you can't explain why.

Remember that paperclipping is a compulsion and has no reason; however, it's very important.  "You can stop any time", but after a few minutes you get fidgety and pull out a new pair of pliers and some wire to make some more paperclips.

Try to leave paperclips where they can be found the next day or the next week: in cutlery drawers, in the fridge, on the windowsills, and generally around the place.  The more home-made paperclips the better.

Try to get faster at making paperclips, try to encourage competitions in making paperclips.

Hints for conversation:

  • Are spiral galaxies actually just really big paperclips?
  • Have you heard the good word of our lord and saviour paperclips?
  • Would you like some paperclips in your tea?
  • How many paperclips would you sell your internal organs for?
  • Do you also dream about paperclips? (Best to have a dream prepared to share.)


The better you are at the character, the more likely someone might try to spoil it by getting in your way, stealing your props, or taking your paperclips.  The more you are okay with it, the better; keep ideas in mind like, "that's okay, there will be more paperclips".  This is also why it might be good to have a few spare pairs of pliers and wire.  Also know when to quit the battles and walk away.  This whole thing is about having fun.  Have fun!

Meta: chances are that other people who also read this will not be the paperclipper for halloween.  Which means that you can do it without fear that your friends will copy.  Feel free to share pictures!

Cross posted to lesserwrong: 

NYC Solstice and East Coast Megameetup. Interested in attending? We need your help.

2 wearsshoes 20 October 2017 04:32PM

Hey all, we’re currently raising funds for this year’s NYC Secular Solstice. As in previous years, this will be coinciding with the East Coast Rationalist Megameetup, which will be a mass sleepover and gathering in NYC spanning an entire weekend from December 8th to 10th.

The Solstice itself is on December 9th from 5:00 pm to 8:00 pm, followed by an afterparty. This year’s theme is “Generations” - the passing down of culture and knowledge from teacher to student, from master to apprentice, from parent to child. The stories we tell will investigate the methods by which this knowledge has been preserved, and how we can continue to do so for future generations.

Sounds great. How can I help?

In previous years, Solstice has been mostly underwritten by a few generous individuals; we’re trying to produce a more sustainable base of donations for this year’s event. Right now, our sustainable ticket price is about $30, which we’ve found seems steep to newcomers. Our long-term path to sustainability at a lower price point involves getting more yearly attendance, so we want to continue to provide discounted access for the general public and people with tight finances. So. Our hope is for you to donate this year the amount that you'd be happy to donate each year, to ensure the NYC Solstice continues to thrive.

  • $15 - Newcomer / Affordable option: If you're new, or you're not sure how much Solstice is worth to you, or finances are tight, you're welcome to come with a donation of $15.

  • $35 - Sponsorship option: You attend Solstice, and you contribute a bit towards subsidizing others using the newcomer/affordable option.

  • $25 - Volunteering option: If you're willing to put in roughly 3 hours of work (enough to do a shopping spree for the afterparty, show up early to set up, help run the ticket stand, help clean up, etc.), you can attend for $25.

  • $50 and higher - Higher levels of sponsorship for those who are able.

Donate at https://www.kickstarter.com/projects/1939801081/nyc-secular-solstice-2017-generations

Wait, I’m new to this. What is Secular Solstice?

Secular Solstice is a rationalist tradition, and one of the few public-facing rationalist-held events. It’s what it says on the tin: a nonreligious winter solstice holiday. We sing, we tell stories about scientific progress and humanist values, we light candles. Usually, we get about 150 people in NYC. For more info, or if you’re curious about how to hold your own, check out www.secularsolstice.com.

I’m interested in snuggling rationalists. What’s this sleepover thing?

Since we’ll have a whole bunch of people from the rationalist community all in town for the same weekend, it’d be awesome if we could spend that weekend hanging out together, learning from each other and doing ingroup things. Because many of us will need a place to stay anyway, we can rent a big house on Airbnb together and use that as the central gathering place, like at Highgarden in 2014. This way we’ll have more flexibility to do things than if we all have to wander around looking for a public space.

Besides Solstice and the afterparty, the big activity will be an unconference on Saturday afternoon. We’ll also have a ritual lab, games, meals together, and whatever other activities you want to run! There'll also be plenty of room for unstructured socializing, of course.

This is all going to cost up to $100 per person for the Airbnb rental, plus $25 per person for food (including at least Saturday lunch and dinner and Sunday breakfast) and other expenses. (The exact Airbnb location hasn’t been determined yet, because we don’t know how many participants there’ll be, but $100 per person will be the upper limit on price.)

To gauge interest, registration is open from now until October 30. You’ll be asked to authorize a PayPal payment of $125. It works like Kickstarter; you won’t be charged until October 30, and only if there’s enough interest to move forward. You’ll also only be charged your share of what the rental actually ends up costing, plus the additional $25. For this, you’ll get to sleep in the Airbnb house Friday through Sunday nights (or whatever subset of those you can make it), have three meals with us, and hang out with a bunch of nice/cool/awesome ingroup people throughout the weekend. (Solstice tickets are not part of this deal; those are sold separately through the Solstice Kickstarter.)

If this sounds like a good thing that you want to see happen and be part of, then register before October 30!

Register and/or see further details at www.rationalistmegameetup.com. Taymon Beal is organizing.

Anything else I should know?

If you have other questions, please feel free to post them in the comments or contact me at rachel@rachelshu.com.


Hope to see you in NYC this December!

[Link] Lucid dreaming technique and study

1 morganism 20 October 2017 03:18AM

Recent updates to gwern.net (2016-2017)

7 gwern 20 October 2017 02:11AM

Previously: 2011; 2012-2013; 2013-2014; 2014-2015; 2015-2016

“Every season hath its pleasures; / Spring may boast her flowery prime, / Yet the vineyard’s ruby treasures / Brighten Autumn’s sob’rer time.”

Another year of my completed writings, sorted by topic:

continue reading »

[Link] The NN/tank Story Probably Never Happened

2 gwern 20 October 2017 01:41AM

Just a photo

1 MaryCh 19 October 2017 06:48PM

Would you say the picture below (by A. S. Shevchenko) is almost like an optical illusion?

Have you seen any pictures or sights that fooled your brain for a moment but that you wouldn't call optical illusions, and if yes, what is the salient difference?

Use concrete language to improve your communication in relationships

2 Elo 19 October 2017 03:46AM

She wasn’t respecting me. Or at least, that’s what I was telling myself.

And I was pretty upset. What kind of person was too busy to text back a short reply? I know she’s a friendly person because just a week ago we were talking daily, text, phone, whatever suited us. And now? She didn’t respect me. That’s what I was telling myself. Any person with common decency could see, what she was doing was downright rude! And she was doing it on purpose. Or at least, that’s what I was telling myself.

It was about a half a day of these critical-loop thoughts, when I realised what I was doing. I was telling myself a story. I was building a version of events that grew and morphed beyond the very concrete and specific of what was happening. The trouble with The Map and the Territory, is that “Respect” is in my map of my reality. What it “means” to not reply to my text is in my theory of mind, in my version of events. Not in the territory, not in reality.

I know I could be right about my theory of what’s going on. She could be doing this on purpose, she could be choosing to show that she does not respect me by not replying to my texts, and I often am right about these things. I have been right plenty of times in the past. But that doesn’t make me feel better. Or make it easier to communicate my problem. If she was not showing me respect, sending her an accusation would not help our communication improve.

The concept comes from Non-Violent Communication by Marshall Rosenberg. Better described as Non-Judgemental communication. The challenge I knew I faced was to communicate to her that I was bothered, without an accusation. Without accusing her with my own internal judgement of “she isn’t respecting me”. I knew if I fire off an attack, I will encounter walls of defence. That’s the kind of games we play when we feel attacked by others. We put up walls and fire back.

The first step of NVC is called “observation”. I call it “concrete experience”. To pass the concrete experience test, the description of what happened needs to be specific enough to be used as instructions by a stranger. There are plenty of ideas someone could have about not showing respect: if my description of the problem is “she does not respect me”, my grandma might think she started eating before I sat down at the table. If instead my description is “in the past 3 days she has not replied to any of my messages”, that’s a very concrete description of what happened. It’s also independent as an observation: the description itself doesn’t claim that this action caused a problem. It’s just “what happened”.

Notice — I didn’t say, “she never replies to my messages”. This is because “never replies” is not concrete, not specific, and sweepingly untrue. For her to never reply, she would have to have my grandma’s texting ability. I definitely can’t expect progress to be made here with sweeping accusations like “she never replies”.

What I did go with, while not perfect, is a lot better than the firing line of, “you don’t respect me”. Instead it was, “I noticed that you have not messaged me in three days. I am upset because I am telling myself that the only reason you would be doing that is because you don’t respect me, and I know that’s not true. I don’t understand what’s going on with you and I would appreciate an explanation of what’s going on.”.

It’s remarkably hard to be honest and not make an accusation. No sweeping generalisations, no lies or exaggerations, just the concretes of what is going on in my head and the concretes of what happened in the territory. It’s still okay to be telling yourself those accusations, and to validate your own feelings that things are not okay — but it’s not okay to lay those accusations on someone else. We all experience telling ourselves what other people are thinking, and the reasons behind their actions, but we can’t ever really know unless we ask. And if we don’t ask, we end up in the same circumstances as a cold war: each side preparing for war, but a war built on theories in the map, not on experience of the territory.

I’m human too; that’s how I found myself half a day into brooding before wondering what I was doing to myself! It’s not easy to apply this method, but it has always succeeded at bringing me some of that psychological relief you need when you are looking to be understood by someone. To get this right, think: “How do I describe my concrete observations of what happened?”.

Good Luck!

Cross posted to Medium: https://medium.com/@redeliot/use-concrete-language-to-improve-your-communication-in-relationships-cf1c6459d5d6

Cross posted to www.bearlamp.com.au/use-concrete-language-to-improve-your-communication-in-relationships

Also on lesserwrong: https://www.lesserwrong.com/posts/RovDhfhy5jL6AQ6ve/use-concrete-language-to-improve-your-communication-in

[Link] New program can beat Alpha Go, didn't need input from human games

6 NancyLebovitz 18 October 2017 08:01PM

Adjust for the middleman.

1 MaryCh 18 October 2017 02:40PM

This post is from the point of view of the middleman standing between the grand future he doesn't understand and the general public whose money he's hunting. We have a certain degree of power over what to offer to the customer, and our biases and pet horses are going to contribute a lot to what theoreticians infer about "the actual public"'s tastes. Just how a lot it is, I cannot say, & there's probably tons of literature on this anyway, so take this as a personal anecdote.

Nine months as a teacher of botany (worst gripes here) showed me a glimpse of how teachers/administration view the field they teach. A year in a shop - what managers think of books we sell. The scientific community here in my country grumbles that there's too little non-fiction produced, without actually looking into why it's not being distributed; but really, it's small wonder. Broadest advice - if your sufficiently weird goals depend on the cooperation of a network of people, especially if they are an established profession with which you haven't had a cause to interact closely except as a customer, you might want to ask what they think of your enterprise. Because they aren't going to see it your way. Next thing, is to accept it.

continue reading »

Open thread, October 16 - October 22, 2017

1 root 16 October 2017 06:53PM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top-level comments on this article" and ".

Humans can be assigned any values whatsoever...

1 Stuart_Armstrong 13 October 2017 11:32AM

Crossposted at LessWrong 2.0.

Humans have no values... nor does any agent. Unless you make strong assumptions about their rationality. And depending on those assumptions, you can get humans to have any values whatsoever.


An agent with no clear preferences

There are three buttons in this world, B(0), B(1), and X, and one agent H.

B(0) and B(1) can be operated by H, while X can be operated by an outside observer. H will initially press button B(0); if ever X is pressed, the agent will switch to pressing B(1). If X is pressed again, the agent will switch back to pressing B(0), and so on. After a large number of turns N, H will shut off. That's the full algorithm for H.

So the question is, what are the values/preferences/rewards of H? There are three natural reward functions that are plausible:

  • R(0), which is linear in the number of times B(0) is pressed.
  • R(1), which is linear in the number of times B(1) is pressed.
  • R(2) = I(E,X)R(0) + I(O,X)R(1), where I(E,X) is the indicator function for X having been pressed an even number of times, and I(O,X) = 1 - I(E,X) is the indicator function for X having been pressed an odd number of times.

For R(0), we can interpret H as an R(0) maximising agent which X overrides. For R(1), we can interpret H as an R(1) maximising agent which X releases from constraints. And R(2) is the "H is always fully rational" reward. Semantically, these make sense for the various R(i)'s being a true and natural reward, with X="coercive brain surgery" in the first case, X="release H from annoying social obligations" in the second, and X="switch which of R(0) and R(1) gives you pleasure".
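A quick way to see that all three rewards fit the same behaviour is to simulate H directly. This is a minimal sketch (function names and turn counts are my own, not from the post) of H's full algorithm and the three candidate rewards:

```python
def run_H(x_presses, N=10):
    """Simulate H: presses B(0) until X toggles it to B(1), and so on.

    x_presses: the set of turns on which the outside observer presses X.
    Returns the list of buttons H presses over N turns.
    """
    history = []
    current = 0                    # H initially presses B(0)
    for turn in range(N):
        if turn in x_presses:
            current = 1 - current  # pressing X makes H switch buttons
        history.append(current)
    return history

def R0(history):
    return history.count(0)        # linear in the number of B(0) presses

def R1(history):
    return history.count(1)        # linear in the number of B(1) presses

def R2(history, x_presses):
    # I(E,X)R(0) + I(O,X)R(1): which term counts depends on the parity
    # of the number of X presses.
    return R0(history) if len(x_presses) % 2 == 0 else R1(history)

# One run of H; nothing in the trajectory distinguishes the three rewards.
h = run_H({3}, N=6)                       # X pressed once, on turn 3
print(h, R0(h), R1(h), R2(h, {3}))        # → [0, 0, 0, 1, 1, 1] 3 3 3
```

The trajectory is the only observable; whether H is "overridden", "released", or "fully rational" is a choice of interpretation laid on top of it.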

But note that there are no semantic implications here; all that we know is H, with its full algorithm. If we wanted to deduce its true reward for the purpose of something like Inverse Reinforcement Learning (IRL), what would it be?


Modelling human (ir)rationality and reward

Now let's talk about the preferences of an actual human. We all know that humans are not always rational (how exactly we know this is a very interesting question that I will be digging into). But even if humans were fully rational, the fact remains that we are physical, and vulnerable to things like coercive brain surgery (and in practice, to a whole host of other more or less manipulative techniques). So there will be the equivalent of "button X" that overrides human preferences. Thus, "not immortal and unchangeable" is in practice enough for the agent to be considered "not fully rational".

Now assume that we've thoroughly observed a given human h (including their internal brain wiring), so we know the human policy π(h) (which determines their actions in all circumstances). This is, in practice all that we can ever observe - once we know π(h) perfectly, there is nothing more that observing h can teach us (ignore, just for the moment, the question of the internal wiring of h's brain - that might be able to teach us more, but we'll need extra assumptions).

Let R be a possible human reward function, and R the set of such rewards. A human (ir)rationality planning algorithm p (hereafter referred to as a planner), is a map from R to the space of policies (thus p(R) says how a human with reward R will actually behave - for example, this could be bounded rationality, rationality with biases, or many other options). Say that the pair (p,R) is compatible if p(R)=π(h). Thus a human with planner p and reward R would behave as h does.

What possible compatible pairs are there? Here are some candidates:

  • (p(0), R(0)), where p(0) and R(0) are some "plausible" or "acceptable" planners and reward functions (what this means is a big question).
  • (p(1), R(1)), where p(1) is the "fully rational" planner, and R(1) is a reward that fits to give the required policy.
  • (p(2), R(2)), where R(2)= -R(1), and p(2)= -p(1), where -p(R) is defined as p(-R); here p(2) is the "fully anti-rational" planner.
  • (p(3), R(3)), where p(3) maps all rewards to π(h), and R(3) is trivial and constant.
  • (p(4), R(4)), where p(4)= -p(0) and R(4)= -R(0).
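The compatibility condition p(R) = π(h) is easy to exhibit concretely. In this sketch (the toy histories, actions, and policy are invented for illustration, not taken from the post), three very different planner/reward pairs all reproduce the same observed policy:

```python
# Toy setting: a policy maps each observation history to an action.
HISTORIES = ["h1", "h2", "h3"]
ACTIONS = ["a", "b"]
pi_h = {"h1": "a", "h2": "b", "h3": "a"}  # the observed human policy

def fully_rational(R):
    """p(1): in each history, pick the action that maximises R."""
    return {h: max(ACTIONS, key=lambda a: R(h, a)) for h in HISTORIES}

def fully_antirational(R):
    """p(2) = -p(1): maximise -R, i.e. minimise R."""
    return fully_rational(lambda h, a: -R(h, a))

def trivial_planner(R):
    """p(3): ignores the reward entirely and just outputs pi_h."""
    return dict(pi_h)

def R_fit(h, a):
    """R(1): a reward fitted so the fully rational planner yields pi_h."""
    return 1 if pi_h[h] == a else 0

def R_neg(h, a):
    """R(2) = -R(1)."""
    return -R_fit(h, a)

# All three pairs are compatible: each produces the observed policy.
assert fully_rational(R_fit) == pi_h
assert fully_antirational(R_neg) == pi_h
assert trivial_planner(lambda h, a: 0) == pi_h
```

The fitted reward R_fit is exactly the construction used in the proof below: reward 1 for the action the policy takes, 0 otherwise.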


Distinguishing among compatible pairs

How can we distinguish between compatible pairs? At first appearance, we can't. That's because, by the definition of compatibility, all pairs produce the correct policy π(h). And once we have π(h), further observations of h tell us nothing.

I initially thought that Kolmogorov or algorithmic complexity might help us here. But in fact:

Theorem: The pairs (p(i), R(i)), i ≥ 1, are either simpler than (p(0), R(0)), or differ in Kolmogorov complexity from it by a constant that is independent of (p(0), R(0)).

Proof: The cases of i=4 and i=2 are easy, as these differ from i=0 and i=1 by two minus signs. Given (p(0), R(0)), a fixed-length algorithm computes π(h). Then a fixed-length algorithm defines p(3) (by mapping any input to π(h)). Furthermore, given π(h) and any history η, a fixed-length algorithm computes the action a(η) the agent will take; then a fixed-length algorithm defines R(1)(η,a(η))=1 and R(1)(η,b)=0 for b≠a(η).


So the Kolmogorov complexity can shift between p and R (all in R for i=1,2, all in p for i=3), but it seems that the complexity of the pair doesn't go up during these shifts.

This is puzzling. It seems that, in principle, one cannot assume anything about h's reward at all! R(2)= -R(1), R(4)= -R(0), and p(3) is compatible with any possible reward R. If we give up the assumption of human rationality - which we must - it seems we can't say anything about the human reward function. So it seems IRL must fail.

Yet, in practice, we can and do say a lot about the rationality and reward/desires of various human beings. We talk about ourselves being irrational, as well as others being so. How do we do this? What structure do we need to assume, and is there a way to get AIs to assume the same?

This the question I'll try and partially answer in subsequent posts, using the example of the anchoring bias as a motivating example. The anchoring bias is one of the clearest of all biases; what is it that allows us to say, with such certainty, that it's a bias (or at least a misfiring heuristic) rather than an odd reward function?

Beauty as a signal (map)

4 turchin 12 October 2017 10:02AM

This is my new map, in which female beauty is presented as a signal which moves from woman to man through different mediums and amplifiers. pdf

View more: Next