27 May 2016

LINK: Performing a Failure Autopsy

In which I discuss the beginnings of a technique for learning from certain kinds of failures more effectively:

"What follows is an edited version of an exercise I performed about a month ago following an embarrassing error cascade. I call it a ‘failure autopsy’, and on one level it’s basically the same thing as an NFL player taping his games and analyzing them later, looking for places to improve.

But the aspiring rationalist wishing to do something similar faces a more difficult problem, for a couple of reasons:

First, the movements of a mind can’t be seen in the same way the movements of a body can, meaning a different approach must be taken when doing granular analysis of mistaken cognition.

Second, learning to control the mind is simply much harder than learning to control the body.

And third, to my knowledge, nobody has really even tried to develop a framework for doing with rationality what an NFL player does with football, so someone like me has to pretty much invent the technique from scratch on the fly.  

I took a stab at doing that, and I think the result provides some tantalizing hints at what more mature, more powerful versions of this technique might look like. Further, I think it illustrates the need for what I’ve been calling a “Dictionary of Internal Events”, or a better vocabulary for describing what happens between your ears."

LINK: Quora brainstorms strategies for containing AI risk

26 May 2016

In case you haven't seen it yet, Quora hosted an interesting discussion of different strategies for containing / mitigating AI risk, boosted by a $500 prize for the best answer. It attracted sci-fi author David Brin, U. Michigan professor Igor Markov, and several people with PhDs in machine learning, neuroscience, or artificial intelligence. Most people from LessWrong will disagree with most of the answers, but I think the article is useful as a quick overview of the variety of opinions that ordinary smart people have about AI risk.


Iterated Gambles and Expected Utility Theory

25 May 2016

The Setup

I'm about a third of the way through Stanovich's Decision Making and Rationality in the Modern World.  Basically, I've gotten through some of the more basic axioms of decision theory (Dominance, Transitivity, etc).


As I went through the material, I noted that there were a lot of these:

Decision 5. Which of the following options do you prefer (choose one)?

A. A sure gain of $240

B. 25% chance to gain $1,000 and 75% chance to gain nothing


The text goes on to show how most people tend to make irrational choices when confronted with decisions like this; most striking was how often irrelevant context and framing affected people's decisions.


But I understand the decision theory bit; my question is a little more complicated.


When I was choosing these options myself, I did what I've been taught by the rationalist community to do in situations where I am given nice, concrete numbers: I shut up and I multiplied, and at each decision chose the option with the highest expected utility.


Granted, I equated dollars with utility, which Stanovich notes humans don't do well (see Prospect Theory).



The Problem

In the above decision, option B clearly has the higher expected utility, so I chose it.  But there was still a nagging doubt in my mind, some part of me that thought, if I were really given this option, in real life, I'd choose A.
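A quick sketch of the arithmetic behind that choice, assuming (as above) that dollars stand in for utility. The helper name `expected_value` is only an illustrative label, not something from Stanovich's text.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Expected dollar value of a gamble given (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# Decision 5 from the text, written as (probability, payoff) pairs.
option_a = [(Fraction(1), 240)]
option_b = [(Fraction(1, 4), 1000), (Fraction(3, 4), 0)]

print(expected_value(option_a))  # 240
print(expected_value(option_b))  # 250
```

The gap is only $10 per play, which is part of why the distribution of outcomes, and not just the mean, matters in what follows.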


So I asked myself: why would I choose A?  Is this an emotion that isn't well-calibrated?  Am I being risk-averse for gains but risk-taking for losses?


What exactly is going on?


And then I remembered the Prisoner's Dilemma.



A Tangent That Led Me to an Idea

Now, I'll assume that anyone reading this has a basic understanding of the concept, so I'll get straight to the point.


In classical decision theory, the choice to defect (rat the other guy out) is strictly superior to the choice to cooperate (keep your mouth shut).  No matter what your partner in crime does, you get a better deal if you defect.
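The dominance claim can be checked mechanically. The sketch below uses made-up prison sentences (the post gives no numbers), so the `YEARS` table and the function name are assumptions chosen only to have the usual Prisoner's Dilemma shape.

```python
# Hypothetical sentences in years (lower is better); not taken from the post.
# YEARS[(my_move, partner_move)] is my own sentence.
YEARS = {
    ("cooperate", "cooperate"): 1,
    ("cooperate", "defect"): 3,
    ("defect", "cooperate"): 0,
    ("defect", "defect"): 2,
}

def strictly_dominates(move, other):
    """True if `move` gives me a shorter sentence than `other` whatever my partner does."""
    return all(
        YEARS[(move, partner)] < YEARS[(other, partner)]
        for partner in ("cooperate", "defect")
    )

print(strictly_dominates("defect", "cooperate"))  # True
print(strictly_dominates("cooperate", "defect"))  # False
```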


Now, I haven't studied the higher branches of decision theory yet (I have a feeling that Eliezer, for example, would find a way to cooperate and make his partner in crime cooperate as well; after all, rationalists should win.)


Where I've seen the Prisoner's Dilemma resolved is, oddly enough, in Dawkins's The Selfish Gene, which is where I was first introduced to the idea of an Iterated Prisoner's Dilemma.


The interesting idea here is that, if you know you'll be in the Prisoner's Dilemma with the same person multiple times, certain kinds of strategies become available that weren't possible in a single instance of the Dilemma.  Partners in crime can be punished for defecting by future defections on your own part.
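As a rough illustration of punishing defection with future defection, here is a small simulation. Tit-for-tat is one well-known strategy of this kind; the post does not name a specific one, and the payoff numbers are again hypothetical.

```python
# Hypothetical sentences in years for (player A, player B); lower is better.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Defect every round regardless of history."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Return total years served by each player over the given number of rounds."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        years_a, years_b = YEARS[(a, b)]
        total_a += years_a
        total_b += years_b
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat, 10))      # (10, 10)
print(play(always_defect, always_defect, 10))  # (20, 20)
print(play(tit_for_tat, always_defect, 10))    # (21, 18)
```

With these numbers, two retaliating cooperators each serve half the time that two unconditional defectors do, which is the kind of option a one-shot game never offers.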


The key idea here is that I might have a different response to the gamble if I knew I could take it again.


The Math

Let's put on our probability hats and actually crunch the numbers:

Format -  Probability: $Amount of Money | Probability: $Amount of Money

Assuming one picks A over and over again, or B over and over again.

Iteration 1: A = $240; B = 1/4: $1,000 | 3/4: $0

Iteration 2: A = $480; B = 1/16: $2,000 | 6/16: $1,000 | 9/16: $0

Iteration 3: A = $720; B = 1/64: $3,000 | 9/64: $2,000 | 27/64: $1,000 | 27/64: $0

Iteration 4: A = $960; B = 1/256: $4,000 | 12/256: $3,000 | 54/256: $2,000 | 108/256: $1,000 | 81/256: $0

Iteration 5: A = $1,200; B = 1/1024: $5,000 | 15/1024: $4,000 | 90/1024: $3,000 | 270/1024: $2,000 | 405/1024: $1,000 | 243/1024: $0

And so on. (If I've made a mistake, please let me know.)
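One way to check rows like these is to regenerate the option B distribution from the binomial formula. The sketch below treats each play as drawing one of four equally likely tickets, exactly one of which pays $1,000; `option_b_ways` is a name invented for this example.

```python
from math import comb

def option_b_ways(n, prize=1000):
    """Map total winnings -> number of ways, out of 4**n equally likely outcomes."""
    return {
        k * prize: comb(n, k) * 3 ** (n - k)  # k winning plays, n - k losing plays
        for k in range(n, -1, -1)
    }

for n in range(1, 6):
    row = " | ".join(
        f"{ways}/{4 ** n}: ${total:,}" for total, ways in option_b_ways(n).items()
    )
    print(f"Iteration {n}: A = ${240 * n:,}; B = {row}")
```

The counts for five plays come out as 1, 15, 90, 270, 405 and 243 out of 1024.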


The Analysis

It is certainly true that, in terms of expected money, option B outperforms option A no matter how many times one takes the gamble, but instead, let's think in terms of anticipated experience - what we actually expect to happen should we take each bet.


The first time we take option B, we note that there is a 75% chance that we walk away disappointed.  That is, if one person chooses option A, and four people choose option B, on average three out of those four people will underperform the person who chose option A.  And it probably won't come as much consolation to the three losers that the winner won significantly bigger than the person who chose A.


And since nothing unusual ever happens, we should expect that, in the typical case, having taken option B, we'd wind up underperforming option A.


Now let's look at further iterations.  In the second iteration, having taken option B twice, we're more likely to have nothing (9/16, about 56%) than to have anything.


In the third iteration, there's about a 57.8% chance that we'll have outperformed the person who chose option A the whole time, and a 42.2% chance that we'll have nothing.


In the fourth iteration, there's a 73.8% chance that we'll have matched or done worse than the person who chose option A four times (I'm rounding a bit; $1,000 isn't that much better than $960).


In the fifth iteration, the above percentage drops to 63.3%.
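These percentages can be reproduced exactly. The sketch below computes the chance that n plays of option B total strictly more than n plays of option A; the function name is an assumption of this example. Because the comparison is strict, it counts $1,000 as beating $960 at the fourth iteration, whereas the text above treats those two as roughly a match.

```python
from fractions import Fraction
from math import comb

def prob_b_beats_a(n, sure=240, prize=1000):
    """Exact chance that n plays of option B total strictly more than n plays of A."""
    winning_ways = sum(
        comb(n, k) * 3 ** (n - k)      # ways to win exactly k of n plays
        for k in range(n + 1)
        if k * prize > n * sure        # B's total strictly exceeds A's total
    )
    return Fraction(winning_ways, 4 ** n)

for n in range(1, 6):
    p = prob_b_beats_a(n)
    print(n, p, f"{float(p):.1%}")
```

For the fifth iteration this gives 376/1024, the complement of the 63.3% figure.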


Now, without doing a longer analysis, I can tell that option B will eventually win.  That was obvious from the beginning.


But for most of the first five iterations (the third being the exception), there's still a better than even chance that picking option B leaves you with about the same as, or less than, picking option A.




If we act to maximize expected utility, we should choose option B, at least so long as I hold that dollars=utility.  And yet it seems that one would have to take option B a fair number of times before it becomes likely that any given person, taking the iterated gamble, will outperform a different person repeatedly taking option A.
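To get a feel for how long "a fair number of times" might be, one can scan further iteration counts. This is only a sketch under the same dollars-as-utility assumption; the 30-play horizon and the function names are arbitrary choices for the example.

```python
from fractions import Fraction
from math import comb

def prob_b_beats_a(n, sure=240, prize=1000):
    """Exact chance that n plays of option B total strictly more than n plays of A."""
    winning_ways = sum(
        comb(n, k) * 3 ** (n - k)
        for k in range(n + 1)
        if k * prize > n * sure
    )
    return Fraction(winning_ways, 4 ** n)

def unfavourable_iterations(horizon):
    """Iteration counts up to `horizon` where B is no better than a coin flip to beat A."""
    return [
        n for n in range(1, horizon + 1)
        if prob_b_beats_a(n) <= Fraction(1, 2)
    ]

print(unfavourable_iterations(30))
```

The list is not a single unbroken run: each time option A's running total crosses another multiple of $1,000, option B needs one more win to stay ahead, so its chances dip again.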


In other words, of the 1025 people taking the iterated gamble:

we expect 1 to walk away with $1,200 (from taking option A five times),

we expect 376 to walk away with more than $1,200, casting smug glances at the scaredy-cat who took option A the whole time,

and we expect 648 to walk away muttering to themselves about how the whole thing was rigged, casting dirty glances at the other 377 people.


After all the calculations, I still think that, if this gamble were really offered to me, I'd take option A, unless I knew for a fact that I could retake the gamble quite a few times.  How do I interpret this in terms of expected utility?


Am I not really treating dollars as equal to utility, and discounting the marginal utility of the additional thousands of dollars that the 376 win?


What mistakes am I making?


Also, a quick trip to Google confirms my intuition that there is plenty of work on iterated decisions; does anyone know a good primer on them?


I'd like to leave you with this:


If you were actually offered this gamble in real life, which option would you take?

The AI in Mary's room

24 May 2016

In the Mary's room thought experiment, Mary is a brilliant scientist in a black-and-white room who has never seen any colour. She can investigate the outside world through a black-and-white television, and has piles of textbooks on physics, optics, the eye, and the brain (and everything else of relevance to her condition). Through this she knows everything intellectually there is to know about colours and how humans react to them, but she hasn't seen any colours at all.

After that, when she steps out of the room and sees red (or blue), does she learn anything? It seems that she does. Even if she doesn't technically learn something, she experiences things she hadn't ever before, and her brain certainly changes in new ways.

The argument was intended as a defence of qualia against certain forms of materialism. It's interesting, and I don't intend to solve it fully here. But just like I extended Searle's Chinese room argument from the perspective of an AI, it seems this argument can also be considered from an AI's perspective.

Consider an RL agent with a reward channel, but which currently receives nothing from that channel. The agent can know everything there is to know about itself and the world. It can know about all sorts of other RL agents, and their reward channels. It can observe them getting their own rewards. Maybe it could even interrupt or increase their rewards. But all this knowledge will not get it any reward. As long as its own channel doesn't send it the signal, knowledge of other agents' rewards - even of identical agents getting rewards - does not give this agent any reward. Ceci n'est pas une récompense.
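A toy sketch of that separation, not a real RL setup: the class and attribute names below are invented for illustration, and the only point is that recording facts about another agent's rewards never touches the observer's own reward total.

```python
class RewardChannelAgent:
    """Toy agent: reward only counts when it arrives on the agent's own channel."""

    def __init__(self, name):
        self.name = name
        self.total_reward = 0.0
        self.knowledge = []  # facts observed about other agents' rewards

    def receive(self, reward):
        """A signal on this agent's own reward channel."""
        self.total_reward += reward

    def observe(self, other, reward):
        """Learn that another agent was rewarded; this adds knowledge, not reward."""
        self.knowledge.append((other.name, reward))

watcher = RewardChannelAgent("watcher")
twin = RewardChannelAgent("twin")  # an identical agent that does get rewarded

for _ in range(3):
    twin.receive(1.0)
    watcher.observe(twin, 1.0)

print(len(watcher.knowledge), watcher.total_reward)  # 3 0.0
print(twin.total_reward)                             # 3.0
```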

This seems to mirror Mary's situation quite well - knowing everything about the world is no substitute for actually getting the reward/seeing red. Now, an RL agent's reward seems closer to pleasure than qualia - this would correspond to a Mary brought up in a puritanical, pleasure-hating environment.

Closer to the original experiment, we could imagine the AI is programmed to enter into certain specific subroutines when presented with certain stimuli. The only way for the AI to start these subroutines is if the stimulus is presented to it. Then, upon seeing red, the AI enters a completely new mental state, with new subroutines. The AI could know everything about its programming, and about the stimulus, and, intellectually, what would change about itself if it saw red. But until it did, it would not enter that mental state.

If we use ⬜ to (informally) denote "knowing all about", then ⬜(X→Y) does not imply Y. Here X and Y could be "seeing red" and "the mental experience of seeing red". I could have simplified that by saying that ⬜Y does not imply Y. Knowing about a mental state, even perfectly, does not put you in that mental state.

This closely resembles the original Mary's room experiment. And it seems that if anyone insists that certain features are necessary to the intuition behind Mary's room, then these features could be added to this model as well.

Mary's room is fascinating, but it doesn't seem to be talking about humans exclusively, or even about conscious entities.

Open Thread May 23 - May 29, 2016

May 23 - May 29, 2016

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

20 May 2016

This summary was posted to LW Main on May 20th.

Knowledge Dump: Pomodoros

19 May 2016

After our recent LW Dojo in Berlin we had a conversation on our mailing list about pomodoros.

How do we handle it if the bell rings but we are in flow? Is it good to honor the bell and take a pause or is it more effective to continue working to stay in flow?

The original setting of 25 minutes came from the 25 minutes that Francesco Cirillo's tomato-shaped timer had naturally. The LW Study Hall seems to use 32 minutes of work with an 8-minute pause. If you have experimented with different lengths, what worked for you?

Did you come to any surprising conclusions about pomodoros while working with them, that might be interesting to other people?

How do you learn Solomonoff Induction?

17 May 2016

I read about a fascinating technique described on Wikipedia as a mathematically formalized combination of Occam's razor and the Principle of Multiple Explanations. I want to add this to my toolbox. I'm dreaming of a concise set of actionable instructions for using Solomonoff induction. I realize this wish might be overly idealistic. I'm willing to peruse a much more convoluted tome and will consider making time for any background knowledge or prerequisites involved.

If anyone knows of a good book on this, or can tell me what set of information I need to acquire, please let me know. It would be much appreciated!

Welcome to Less Wrong! (9th thread, May 2016)

May 2016

Hi, do you read the LessWrong website, but haven't commented yet (or not very much)? Are you a bit scared of the harsh community, or do you feel that questions which are new and interesting for you could be old and boring for the older members?

This is the place for the new members to become courageous and ask what they wanted to ask. Or just to say hi.

The older members are strongly encouraged to be gentle and patient (or just skip the entire discussion if they can't).

Newbies, welcome!


The long version:


If you've recently joined the Less Wrong community, please leave a comment here and introduce yourself. We'd love to know who you are, what you're doing, what you value, how you came to identify as an aspiring rationalist or how you found us. You can skip right to that if you like; the rest of this post consists of a few things you might find helpful. More can be found at the FAQ.


A few notes about the site mechanics

To post your first comment, you must have carried out the e-mail confirmation: When you signed up to create your account, an e-mail was sent to the address you provided with a link that you need to follow to confirm your e-mail address. You must do this before you can post!

Less Wrong comments are threaded for easy following of multiple conversations. To respond to any comment, click the "Reply" link at the bottom of that comment's box. Within the comment box, links and formatting are achieved via Markdown syntax (you can click the "Help" link below the text box to bring up a primer).

You may have noticed that all the posts and comments on this site have buttons to vote them up or down, and all the users have "karma" scores which come from the sum of all their comments and posts. This immediate easy feedback mechanism helps keep arguments from turning into flamewars and helps make the best posts more visible; it's part of what makes discussions on Less Wrong look different from those anywhere else on the Internet.

However, it can feel really irritating to get downvoted, especially if one doesn't know why. It happens to all of us sometimes, and it's perfectly acceptable to ask for an explanation. (Sometimes it's the unwritten LW etiquette; we have different norms than other forums.) Take note when you're downvoted a lot on one topic, as it often means that several members of the community think you're missing an important point or making a mistake in reasoning, not just that they disagree with you! If you have any questions about karma or voting, please feel free to ask here.

Replies to your comments across the site, plus private messages from other users, will show up in your inbox. You can reach it via the little mail icon beneath your karma score on the upper right of most pages. When you have a new reply or message, it glows red. You can also click on any user's name to view all of their comments and posts.

All recent posts (from both Main and Discussion) are available here. At the same time, it's definitely worth your time commenting on old posts; veteran users look through the recent comments thread quite often (there's a separate recent comments thread for the Discussion section, for whatever reason), and a conversation begun anywhere will pick up contributors that way.  There's also a succession of open comment threads for discussion of anything remotely related to rationality.

Discussions on Less Wrong tend to end differently than in most other forums; a surprising number end when one participant changes their mind, or when multiple people clarify their views enough and reach agreement. More commonly, though, people will just stop when they've better identified their deeper disagreements, or simply "tap out" of a discussion that's stopped being productive. (Seriously, you can just write "I'm tapping out of this thread.") This is absolutely OK, and it's one good way to avoid the flamewars that plague many sites.

There's actually more than meets the eye here: look near the top of the page for the "WIKI", "DISCUSSION" and "SEQUENCES" links.
LW WIKI: This is our attempt to make searching by topic feasible, as well as to store information like common abbreviations and idioms. It's a good place to look if someone's speaking Greek to you.
LW DISCUSSION: This is a forum just like the top-level one, with two key differences: in the top-level forum, posts require the author to have 20 karma in order to publish, and any upvotes or downvotes on the post are multiplied by 10. Thus there's a lot more informal dialogue in the Discussion section, including some of the more fun conversations here.
SEQUENCES: A huge corpus of material mostly written by Eliezer Yudkowsky in his days of blogging at Overcoming Bias, before Less Wrong was started. Much of the discussion here will casually depend on or refer to ideas brought up in those posts, so reading them can really help with present discussions. Besides which, they're pretty engrossing in my opinion. They are also available in a book form.

A few notes about the community

If you've come to Less Wrong to discuss a particular topic, this thread would be a great place to start the conversation. By commenting here, and checking the responses, you'll probably get a good read on what, if anything, has already been said here on that topic, what's widely understood and what you might still need to take some time explaining.

If your welcome comment starts a huge discussion, then please move to the next step and create a LW Discussion post to continue the conversation; we can fit many more welcomes onto each thread if fewer of them sprout 400+ comments. (To do this: click "Create new article" in the upper right corner next to your username, then write the article, then at the bottom take the menu "Post to" and change it from "Drafts" to "Less Wrong Discussion". Then click "Submit". When you edit a published post, clicking "Save and continue" does correctly update the post.)

If you want to write a post about a LW-relevant topic, awesome! I highly recommend you submit your first post to Less Wrong Discussion; don't worry, you can later promote it from there to the main page if it's well-received. (It's much better to get some feedback before every vote counts for 10 karma—honestly, you don't know what you don't know about the community norms here.)

Alternatively, if you're still unsure where to submit a post, whether to submit it at all, would like some feedback before submitting, or want to gauge interest, you can ask / provide your draft / summarize your submission in the latest open comment thread. In fact, Open Threads are intended for anything 'worth saying, but not worth its own post', so please do dive in! Informally, there is also the unofficial Less Wrong IRC chat room, and you might also like to take a look at some of the other regular special threads; they're a great way to get involved with the community!

If you'd like to connect with other LWers in real life, we have meetups in various parts of the world. Check the wiki page for places with regular meetups, or the upcoming (irregular) meetups page. There's also a Facebook group. If you have your own blog or other online presence, please feel free to link it.

If English is not your first language, don't let that make you afraid to post or comment. You can get English help on Discussion- or Main-level posts by sending a PM to one of the following users (use the "send message" link on the upper right of their user page). Either put the text of the post in the PM, or just say that you'd like English help and you'll get a response with an email address.
* Normal_Anomaly
* Randaly
* shokwave
* Barry Cotter

A note for theists: you will find the Less Wrong community to be predominantly atheist, though not completely so, and most of us are genuinely respectful of religious people who keep the usual community norms. It's worth saying that we might think religion is off-topic in some places where you think it's on-topic, so be thoughtful about where and how you start explicitly talking about it; some of us are happy to talk about religion, some of us aren't interested. Bear in mind that many of us really, truly have given full consideration to theistic claims and found them to be false, so starting with the most common arguments is pretty likely just to annoy people. Anyhow, it's absolutely OK to mention that you're religious in your welcome post and to invite a discussion there.

A list of some posts that are pretty awesome

I recommend the major sequences to everybody, but I realize how daunting they look at first. So for purposes of immediate gratification, the following posts are particularly interesting/illuminating/provocative and don't require any previous reading:

More suggestions are welcome! Or just check out the top-rated posts from the history of Less Wrong. Most posts at +50 or more are well worth your time.

Welcome to Less Wrong, and we look forward to hearing from you throughout the site!

