The Outside View isn't magic

The planning fallacy is an almost perfect example of the strength of using the outside view. When asked to predict the time taken for a project that they are involved in, people tend to underestimate the time needed (in fact, they tend to predict as if the question was how long things would take if everything went perfectly).

Simply telling people about the planning fallacy doesn't seem to make it go away. So the outside view argument is that you need to put your project into the "reference class" of other projects, and expect time overruns as compared to your usual, "inside view" estimates (which focus on the details you know about the project).

So, for the outside view, what is the best way of estimating the time of a project? Well, to find the right reference class for it: the right category of projects to compare it with. You can compare the project with others that have similar features - number of people, budget, objective desired, incentive structure, inside view estimate of time taken etc... - and then derive a time estimate for the project that way.

That's the outside view. But to me, it looks a lot like... induction. In fact, it looks a lot like the elements of a linear (or non-linear) regression. We can put those features (at least the quantifiable ones) into a linear regression with a lot of data about projects, shake it all about, and come up with regression coefficients.
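To make the regression idea concrete, here is a minimal sketch with a single feature: the inside-view estimate. The project numbers are invented purely for illustration, and `fit_line` and `predict` are hypothetical helper names, not anything from the planning fallacy literature. A real model would use many more projects and features.

```python
# Toy data, invented for illustration only:
# (inside-view estimate, actual duration), both in weeks.
projects = [(4, 6), (10, 15), (6, 9), (8, 12), (12, 18)]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x, one predictor."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(projects)


def predict(inside_estimate):
    """Outside-view prediction of the actual duration for a new project."""
    return intercept + slope * inside_estimate


# A slope above 1 means past projects ran longer than their inside-view estimates.
print(f"slope = {slope:.2f}, predicted duration for a 20-week estimate = {predict(20):.1f}")
```

In this toy data every project took one and a half times its estimate, so the fitted slope simply recovers that overrun factor.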

At that point, we are left with a decent project timeline prediction model, and another example of human bias. The fact that humans often perform badly in prediction tasks is not exactly new - see for instance my short review on the academic research on expertise.

So what exactly is the outside view doing in all this?

The role of the outside view: incomplete model and human bias

The main use of the outside view, for humans, seems to be to point out either an incompleteness in the model or a human bias. The planning fallacy has both of these: if you did a linear regression comparing your project with all projects with similar features, you'd notice your inside estimate was more optimistic than the regression - your inside model is incomplete. And if you also compared each person's initial estimate with the ultimate duration of their project, you'd notice a systematically optimistic bias - you'd notice the planning fallacy.

The first type of error tends to go away with time, if the situation is encountered regularly, as people refine models, add variables, and test them on the data. But the second type remains, as human biases are rarely cleared by mere data.

Reference class tennis

If use of the outside view is disputed, it often develops into a case of reference class tennis - where people on opposing sides insist or deny that a certain example belongs in the reference class (similarly to how, in politics, anything positive is claimed for your side and anything negative assigned to the other side).

But once the phenomenon you're addressing has an explanatory model, there are no issues of reference class tennis any more. Consider for instance Goodhart's law: "When a measure becomes a target, it ceases to be a good measure". A law that should be remembered by any minister of education wanting to reward schools according to improvements to their test scores.

This is a typical use of the outside view: if you'd just thought about the system in terms of inside facts - tests are correlated with child performance; schools can improve child performance; we can mandate that test results go up - then you'd have missed several crucial facts.

But notice that nothing mysterious is going on. We understand exactly what's happening here: schools have ways of upping test scores without upping child performance, and so they decided to do that, weakening the correlation between score and performance. Similar things happen in the failures of command economies; but again, once our model is broad enough to encompass enough factors, we get decent explanations, and there's no need for further outside views.

In fact, we know enough that we can show when Goodhart's law fails: when no-one with incentives to game the measure has control of the measure. This is one of the reasons central bank interest rate setting has been so successful. If you order a thousand factories to produce shoes, and reward the managers of each factory for the number of shoes produced, you're heading to disaster. But consider GDP. Say the central bank wants to increase GDP by a certain amount, by fiddling with interest rates. Now, as a shoe factory manager, I might have preferences about the direction of interest rates, and my sales are a contributor to GDP. But they are a tiny contributor. It is not in my interest to manipulate my sales figures, in the vague hope that, aggregated across the economy, this will falsify GDP and change the central bank's policy. The reward is too diluted, and would require coordination with many other agents (and coordination is hard).

Thus if you're engaging in reference class tennis, remember the objective is to find a model with enough variables, and enough data, so that there is no more room for the outside view - a fully understood Goodhart's law rather than just a law.

In the absence of a successful model

Sometimes you can have a strong trend without a compelling model. Take Moore's law, for instance. It is extremely strong, going back decades, and surviving multiple changes in chip technology. But it has no clear cause.

A few explanations have been proposed. Maybe it's a consequence of its own success, of chip companies using it to set their goals. Maybe there's some natural exponential rate of improvement in any low-friction feature of a market economy. Exponential-type growth in the short term is no surprise - that just means growth is proportional to investment - so maybe it was an amalgamation of various short term trends.

Do those explanations sound unlikely? Possibly, but there is a huge trend in computer chips going back decades that needs to be explained. They are unlikely, but they have to be weighed against the unlikeliness of the situation. The most plausible explanation is a combination of the above and maybe some factors we haven't thought of yet.

But here's an explanation that is implausible: little time-travelling angels modify the chips so that they follow Moore's law. It's a silly example, but it shows that not all explanations are created equal, even for phenomena that are not fully understood. In fact there are four broad categories of explanations for putative phenomena that don't have a compelling model:

1. Unlikely but somewhat plausible explanations.
2. We don't have an explanation yet, but we think it's likely that there is an explanation.
3. The phenomenon is a coincidence.
4. Any explanation would go against stuff that we do know, and would be less likely than coincidence.

The explanations I've presented for Moore's law fall into category 1. Even if we hadn't thought of those explanations, Moore's law would fall into category 2, because of the depth of evidence for Moore's law and because a "medium length regular technology trend within a broad but specific category" is something that is intrinsically likely to have an explanation.

Compare with Kurzweil's "law of time and chaos" (a generalisation of his "law of accelerating returns") and Robin Hanson's model where the development of human brains, hunting, agriculture and the industrial revolution are all points on a trend leading to uploads. I discussed these in a previous post, but I can now better articulate the problem with them.

Firstly, they rely on very few data points (the more recent part of Kurzweil's law, the part about recent technological trends, has a lot of data, but the earlier part does not). This raises the probability that they are a mere coincidence (we should also consider selection bias in choosing the data points, which increases the probability of coincidence). Secondly, we have strong reasons to suspect that there won't be any explanation that ties together things like the early evolution of life on Earth, human brain evolution, the agricultural revolution, the industrial revolution, and future technology development. These phenomena have decent local explanations that we already roughly understand (local in time and space to the phenomena described), and these run counter to any explanation that would tie them together.

Human biases and predictions

There is one area where the outside view can still function for multiple phenomena across different eras: when it comes to pointing out human biases. For example, we know that doctors have been authoritative, educated, informed, and useless for most of human history (or possibly much worse than useless). Hence authoritative, educated, and informed statements or people are not to be considered of any value, unless there is some evidence the statement or person is truth tracking. We now have things like expertise research, some primitive betting markets, and track records to try and estimate their experience; these can provide good "outside views".

And the authors of the models of the previous section have some valid points where bias is concerned. Kurzweil's point that (paraphrasing) "things can happen a lot faster than some people think" is valid: we can compare predictions with outcomes. Robin has similar valid points in defense of the possibility of the em scenario.

The reason these explanations are more likely valid is that they have a very probable underlying model/explanation: humans are biased.

Conclusions

• The outside view is a good reminder for anyone who may be using too narrow a model.
• If the model explains the data well, then there is no need for further outside views.
• If there is a phenomenon with data but no convincing model, we need to decide if it's a coincidence or there is an underlying explanation.
• Some phenomena have features that make it likely that there is an explanation, even if we haven't found it yet.
• Some phenomena have features that make it unlikely that there is an explanation, no matter how much we look.
• Outside view arguments that point at human prediction biases, however, can be generally valid, as they only require the explanation that humans are biased in that particular way.

Instrumental Rationality Sequence Finished! (w/ caveats)

Hey everyone,

Back in April, I said I was going to start writing an instrumental rationality sequence.

It's...sort of done.

I ended up collecting the essays into a sort of e-book. It's mainly content that I've put here (Starting Advice, Planning 101, Habits 101, etc.), but there's also quite a bit of new content.

It clocks in at about 150 pages and 30,000 words, about 15,000 of which I wrote after the April announcement post. (Which beats my estimate of 10,000 words before burnout!!!)

However, the LW 1.0 editor isn't making it easy to port the stuff here from my Google Drive.

As LW 2.0 enters actual open beta, I'll repost / edit the essays and host them there.

In the meantime, if you want to read the whole compiled book, the direct Google Doc link is here. That's where the real-time updates will happen, so it's what I'd recommend using to read it for now.

(There's also an online version on my blog if for some reason you want to read it there.)

It's my hope that this sequence becomes a useful reference for newcomers looking to learn more about instrumental rationality, which is more specialized than The Sequences (which really are more for epistemics).

Unfortunately, I didn't manage to write the book/sequence I set out to write. The actual book as it is now is about 10% as good as what I actually wanted. There's stuff I didn't get to write, more nuances I'd have liked to cover, more pictures I wanted to make, etc.

After putting in many hours of research and writing, I think I've learned more about the sort of effort that would need to go into making the actual project I'd outlined at the start.

There'll be a postmortem essay analyzing my expectations vs reality coming soon.

As a result of this project and a few other things, I'm feeling burned out. There probably won't be any major projects from me for a little bit, while I rest up.

What is Rational?

Eliezer defines rationality as such:

Instrumental rationality: systematically achieving your values.

....

Instrumental rationality, on the other hand, is about steering reality— sending the future where you want it to go. It’s the art of choosing actions that lead to outcomes ranked higher in your preferences. I sometimes call this “winning.”

Extrapolating from the above definition, we can conclude that an act is rational, if it causes you to achieve your goals/win. The issue with this definition is that we cannot evaluate the rationality of an act, until after observing the consequences of that action. We cannot determine if an act is rational without first carrying out the act. This is not a very useful definition, as one may want to use the rationality of an act as a guide.

Another definition of rationality is the one used in AI when talking about rational agents:

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

A percept sequence is basically the sequence of all perceptions the agent has had from inception to the moment of action. The above definition is useful, but I don't think it is without issue; what is rational for two different agents A and B, with the exact same goals, in the exact same circumstances differs. Suppose A intends to cross a road, and A checks both sides of the road, ensures it's clear and then attempts to cross. However, a meteorite strikes at that exact moment, and A is killed. A is not irrational for attempting to cross the road, given that they did not know of the meteorite (and thus could not have accounted for it). Suppose B has more knowledge than A, and thus knows that there is a substantial delay between meteor strikes in the vicinity, and so crosses after A and crosses safely. We cannot reasonably say B is more rational than A.

The above scenario doesn't break our intuitions of what is rational, but what about in other scenarios? What about the gambler who knows not of the gambler's fallacy, and believes that because the die hasn't rolled an odd number for the past $n$ turns, it will definitely roll odd this time (after all, the probability of not rolling odd $n$ times is $2^{-n}$)? Are they then rational for betting the majority of their funds on the die rolling odd? Letting what's rational depend on the knowledge of the agent involved leads to a very broad (and possibly useless) notion of rationality. It may lead to what I call "folk rationality" (doing what you think would lead to success). Barring a few exceptions (extremes of emotion, compromised mental states, etc), most humans are folk rational. However, this folk rationality isn't what I refer to when I say "rational".
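To see exactly where the gambler goes wrong, here is a small sketch that enumerates every sequence of rolls of a fair six-sided die (the fair die and the function names are my assumptions for illustration). The streak itself is indeed unlikely, with probability $2^{-n}$, but the chance of odd on the next roll, given the streak, is still one half.

```python
from fractions import Fraction
from itertools import product


def prob_even_streak(n):
    """P(n rolls in a row are all even) for a fair die: (1/2)^n."""
    return Fraction(1, 2) ** n


def prob_odd_after_even_streak(n):
    """Exact P(roll n+1 is odd | rolls 1..n were all even), by enumeration."""
    faces = range(1, 7)
    streak = 0           # sequences whose first n rolls are all even
    streak_then_odd = 0  # ...and whose final roll is odd
    for rolls in product(faces, repeat=n + 1):
        if all(r % 2 == 0 for r in rolls[:n]):
            streak += 1
            if rolls[n] % 2 == 1:
                streak_then_odd += 1
    return Fraction(streak_then_odd, streak)


for n in (1, 2, 3, 4):
    print(n, prob_even_streak(n), prob_odd_after_even_streak(n))
```

The gambler is conflating the first column (the probability of the whole streak) with the second (the probability of the next roll), which never moves.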

How then do we define what is rational to avoid the two issues I highlighted above?

[Link] Habits 101: Techniques and Research

[Link] Bridging the Intention-Action Gap (aka Akrasia)

[Link] Ignorant, irrelevant, and inscrutable (rationalism critiques)

Lesswrong Sydney Rationality Dojo on zen koans

Short post here.

Lesswrong Sydney runs a rationality dojo once a month. We usually cover 1-2 topics for an hour or less each. Our regular attendance is 10-20 people.

This month's topics were:

1. Captain awkward advice
2. Goal factoring (CFAR)
3. Understanding zen koans

I only recorded the section on zen koans. Feedback welcome.

[Link] Recovering from Failure

3 11 June 2017 04:05AM

Rationality as A Value Decider

1 05 June 2017 03:21AM

A Different Concept of Instrumental Rationality

Eliezer Yudkowsky defines instrumental rationality as “systematically achieving your values” and goes on to say: “Instrumental rationality, on the other hand, is about steering reality—sending the future where you want it to go. It’s the art of choosing actions that lead to outcomes ranked higher in your preferences. I sometimes call this ‘winning.’” [1]

I agree with Yudkowsky’s concept of rationality as a method for systematised winning. It is why I decided to pursue rationality: that I may win. However, I personally disagree with the notion of “systematically achieving your values” simply because I think it is too vague. What are my values? Happiness and personal satisfaction? I find that you can maximise this by joining a religious organisation; in fact, I think I was happiest in a time before I discovered the Way. But that isn’t the most relevant point: “maximising your values” simply isn’t specific enough for my taste.

“Likewise, decision theory defines what action I should take based on my beliefs. For any consistent set of beliefs and preferences I could have about Bob, there is a decision-theoretic answer to how I should then act in order to satisfy my preferences.” [2]

This implies that instrumental rationality is specific; from the above statement, I infer:
“For any decision problem to any rational agent with a specified psyche, there is only one correct choice to make.”

However, if we only seek to systematically achieve our values, I believe that instrumental rationality fails to be specific: it is possible that there’s more than one solution to a problem in which we merely seek to maximise our values. I cherish the specificity of rationality; there is a certain comfort in knowing that there is a single correct solution to any problem, a right decision to make for any game; one merely need find it. As such, I sought a definition of rationality that I personally agree with; one that satisfies my criteria for specificity; one that satisfies my criteria for winning. The answer I arrived at was: “Rationality is systematically achieving your goals.”

I love the above definition; it is specific—gone is the vagueness and uncertainty of achieving values. It is simple—gone is the worry over whether value X should be an instrumental value or a terminal value. Above all, it is useful—I know whether or not I have achieved my goals, and I can motivate myself more to achieve them. Rather than thinking about vague values I think about my life in terms of goals:
“I have goal X; how do I achieve it?”
If necessary, I can specify sub goals and sub goals for those sub goals. I find that thinking about your life in terms of goals to be achieved is a more conducive model for problem solving, a more efficient model—a useful model. I am many things, and above them all I am a utilitist—the worth of any entity is determined by its utility to me. I find the model of rationality as a goal enabler a more useful model.

Goals and values are not always aligned. For example, consider the problem below:

Jane is the captain of a boat carrying 100 people. The ship is about to capsize, and will unless ten people are sacrificed. Jane’s goal is to save as many people as possible. Jane’s values hold human lives sacred. Sacrificing ten people has a 100% chance of saving 90 people, while sacrificing no one and going with plan delta has a 10% chance to save the 100, and a 90% chance for everyone to die.

The sanctity of human life is a terminal value for Jane. Jane, when seeking to actualise her values, may well choose to go with plan delta, which has a 90% chance of preventing her from achieving her goals.
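The numbers in Jane's problem can be worked through directly. The sketch below assumes that “save as many people as possible” is read as maximising the expected number of lives saved; that reading, and the names used, are my assumptions. The probabilities and head counts come from the problem statement.

```python
from fractions import Fraction

# Each plan is a list of (probability, lives saved) outcomes from the problem statement.
plans = {
    "sacrifice ten": [(Fraction(1), 90)],
    "plan delta": [(Fraction(1, 10), 100), (Fraction(9, 10), 0)],
}


def expected_lives_saved(outcomes):
    """Probability-weighted number of people saved under a plan."""
    return sum(p * saved for p, saved in outcomes)


expected = {name: expected_lives_saved(outcomes) for name, outcomes in plans.items()}

for name, value in expected.items():
    print(f"{name}: expected lives saved = {value}")
```

Under that reading, sacrificing ten saves 90 people in expectation against 10 for plan delta, which is why the value-driven choice works against the goal.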

Values may be misaligned with goals; values may inhibit the achievement of our goals. Winning isn’t achieving your values; winning is achieving your goals.

Goals

I feel it is apt to define goals at this juncture, lest the definition be perverted and only goals aligned with values be considered “true/good goals”.

Goals are any objectives a self aware agent consciously assigns itself to accomplish.

There are no true goals, no false goals, no good goals, no bad goals, no worthy goals, no worthless goals; there are just goals.

I do not consider goals something that “exist to affirm/achieve values"—you may assign yourself goals that affirm your values, or goals that run contrary to them—the difference is irrelevant, we work to achieve those goals you have specified.

The Psyche

The Psyche is an objective map that describes a self-aware agent that functions as a decision maker—rational or not. The sum total of an individual’s beliefs—all knowledge is counted as belief—values and goals form their psyche. The psyche is unique to each individual. The psyche is not a subjective evaluation of an individual by themselves, but an objective evaluation of the individual as they would appear to an omniscient observer. An individual’s psyche includes the totality of their map. The psyche is— among other things—a map that describes a map so to speak.

When a decision problem is considered, the optimum solution to such a problem cannot be considered without considering the psyche of that individual. The values that individual holds, the goals they seek to achieve and their mental map of the world.

Eliezer Yudkowsky seems to believe that we have an extremely limited ability to alter our psyche. He posits that we can’t choose to believe the sky is green at will. I never really bought this, especially given personal anecdotal evidence. I’ll come back to altering beliefs later.

Yudkowsky describes the human psyche as: “a lens that sees its own flaws”. [3] I personally would extend this definition; we are not merely “a lens that sees its own flaws”, we are also “a lens that corrects itself”—the self-aware AI that can alter its own code. The psyche can be altered at will—or so I argue.

I shall start with values. Values are neither permanent nor immutable. I’ve had a slew of values over the years; while Christian, I valued faith, now I adhere to Thomas Huxley’s maxim:

Scepticism is the highest of duties; blind faith the one unpardonable sin.

Another one: prior to my enlightenment I held emotional reasoning in high esteem, and could be persuaded by emotional arguments; after my enlightenment I upheld rational reasoning. Okay, that isn’t entirely true; my answer to the boat problem had always been to sacrifice the ten people, so that doesn’t exactly work, but I was more emotional then, and could be swayed by emotional arguments. Before I discovered the Way earlier this year (when I was fumbling around in the dark searching for rationality) I viewed all emotion as irrational, and my values held logic and reason above all. Back then, I was a true apath, and completely unfeeling. I later read arguments for the utility of emotions, and readjusted my values accordingly. I have readjusted my values several times along the journey of life; just recently, I repressed my values relating to pleasure from feeding, to aid my current routine of intermittent fasting. I similarly repressed my values of sexual arousal/pleasure, as I felt it would make me more competent. Values can be altered, and I suspect many of us have done it at least once in our lives: we are the lens that corrects itself.

Getting back to belief (whether you can choose to believe the sky is green at will) I argue that you can, it is just a little more complicated than altering your values. Changing your beliefs—changing your actual anticipation controllers—truly redrawing the map, would require certain alterations to your psyche in order for it to retain a semblance of consistency. In order to be able to believe the sky is green, you would have to:

• Repress your values that make you desire true beliefs.
• Repress your values that make you give priority to empirical evidence.
• Repress your values that make you sceptical.
• Create (or grow if you already have one) a new value that supports blind faith.

Optional:

• Repress your values that support curiosity.
• Create (or grow if you already have one) a new value that supports ignorance.

By the time you’ve done the ‘edits’ listed above, you would be able to freely believe that the sky is green, or snow is black, or that the earth rests on the back of a giant turtle, or a teapot floats in the asteroid belt. I’m warning you though, by the time you’ve successfully accomplished the edits above, your psyche would be completely different from now, and you will be (I argue) a different person. If any of you were worried that the happiness of stupidity was forever closed to you, then fear not; it is open to you again, if you truly desire it. Be forewarned; the “you” that would embrace it would be different from the “you” now, and not one I’m sure I’d want to associate with. The psyche is alterable; we are the masters of our own mind, the lens that corrects itself.

I do not posit that we can alter all of our psyche (I suspect that there are aspects of cognitive machinery that are unalterable; “hardcoded” so to speak). However, my neuroscience is non-existent; as such I shall leave this issue to those more equipped to comment on it.

Values as Tools

In my conception of instrumental rationality, values are no longer put on a pedestal, they are no longer sacred; there are no more terminal values anymore—only instrumental. Values aren’t the masters anymore, they’re slaves—they’re tools.

The notion of values as tools may seem disturbing for some, but I find it to be quite a useful model, and as such I shall keep it.

Take the ship problem Jane was presented with above: had Jane deleted her value which held human life as sacred, she would have been able to make the decision with the highest probability of achieving her goals. She could even add a value that suppressed empathy, to assist her in similar situations, though some might feel that is overkill. I once asked a question on a particular subreddit:
“Is altruism rational?”
The reply I received was quick and dismissive:
“Rationality doesn’t tell you what values to have, it only tells you how to achieve them.”

What are your goals?

If your goals are to have a net positive effect on the world (do good so to speak) then altruism may be a rational value to have. If your goals are far more selfish, then altruism may only serve as a hindrance.

The utility of “Values as Tools” isn’t just that some values may harm your goals, nay it does much more. The payoff of a decision is determined by two things:

1. How much closer it brings you to the realisation of your goals.
2. How much it aligns with your values.

Choosing values that are doubly correlated with your current goals (you actualise your values when you make goal conducive decisions, and you run opposite to your values when you make goal deleterious decisions) exaggerates the positive payoff of goal conducive decisions, and the negative payoff of goal deleterious decisions. This aggrandising of the payoffs of decisions serves as a strong motivator towards making goal conducive decisions— large rewards, large punishment—a perfect propulsion system so to speak.

The utility of the “Values as Tools” approach is that it serves as a strong motivator towards goal conducive decision making.
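As a toy illustration of this amplification, suppose the payoff is simply the sum of a goal term and a value term. The additive form, the unit-sized goal term, and the function names are all assumptions of mine for the sketch; the argument above only says that both factors matter.

```python
def payoff(goal_progress, value_alignment):
    """Assumed additive payoff: goal term plus value term (illustrative only)."""
    return goal_progress + value_alignment


def payoff_gap(value_weight):
    """Gap between a goal-conducive and a goal-deleterious decision.

    value_weight is how strongly the agent's values agree with its goals:
    0 means the values are indifferent, larger means more tightly aligned.
    """
    conducive = payoff(goal_progress=1.0, value_alignment=+value_weight)
    deleterious = payoff(goal_progress=-1.0, value_alignment=-value_weight)
    return conducive - deleterious


indifferent_gap = payoff_gap(0.0)  # values play no role in the payoff
aligned_gap = payoff_gap(1.0)      # values reinforce the goal

print(f"gap with indifferent values: {indifferent_gap}")
print(f"gap with aligned values:     {aligned_gap}")
```

With values chosen to track the goal, the difference between the good and bad decision doubles in this toy model, which is the “large rewards, large punishment” effect described above.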

Conclusion

It has been brought to my attention that a life such as the one I describe may be “an unsatisfying life” and “a life not worth living”. I may reply that I do not seek to maximise happiness, but that may be dodging the issue; I first conceived rationality as a value decider when thinking about how I would design an AI—it goes without saying that humans are not computers.

References:

[1] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 7, 2015, MIRI, California.
[2] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 203, 2015, MIRI, California.
[3] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 40, 2015, MIRI, California.

Instrumental Rationality Sequence Update (Drive Link to Drafts)

Hey all,

Following my post on my planned Instrumental Rationality sequence, I thought it'd be good to give the LW community an update of where I am.

1) Currently collecting papers on habits. Planning to go through a massive sprint of the papers tomorrow. The papers I'm using are available in the Drive folder linked below.

2) I have a publicly viewable Drive folder here of all relevant articles and drafts and things related to this project, if you're curious to see what I've been writing. Feel free to peek around everywhere, but the most relevant docs are this one which is an outline of where I want to go for the sequence and this one which is the compilation of currently sorta-decent posts in a book-like format (although it's quite short right now at only 16 pages).

Anyway, yep, that's where things are at right now.

There is No Akrasia

17 30 April 2017 03:33PM

I don’t think akrasia exists.

This is a fairly strong claim. I’m also not going to try and argue it.

What I’m really here to argue are the two weaker claims that:

a) Akrasia is often treated as a “thing” by people in the rationality community, and this can lead to problems, even though akrasia is a sorta-coherent concept.

b) If we want to move forward and solve the problems that fall under the akrasia-umbrella, it’s better to taboo the term akrasia altogether and instead employ a more reductionist approach that favors specificity.

But that’s a lot less catchy, and I think we can 80/20 it with the statement that “akrasia doesn’t exist”, hence the title and the opening sentence.

First off, I do think that akrasia is a term that resonates with a lot of people. When I’ve described this concept to friends (n = 3), they’ve all had varying degrees of reactions along the lines of “Aha! This term perfectly encapsulates something I feel!” On LW, it seems to have garnered acceptance as a concept, evidenced by the posts / wiki on it.

It does seem, then, that this concept of “want-want vs want” or “being unable to do what you ‘want’ to do” points at a phenomenologically real group of things in the world.

However, I think that this is actually bad.

Once people learn the term akrasia and what it represents, they can now pattern-match it to their own associated experiences. I think that, once you’ve reified akrasia, i.e. turned it into a “thing” inside your ontology, problems occur:

First off, treating akrasia as a real thing gives it additional weight and power over you:

Once you start to notice the patterns, it’s harder to see things again as mere apparent chaos. In the case of akrasia, I think this means that people may try less hard because they suddenly realize they’re in the grip of this terrible monster called akrasia.

I think this sort of worldview ends up reinforcing some unhelpful attitudes towards solving the problems akrasia represents. As an example, here are two paraphrased things I’ve overheard about akrasia which I think illustrate this. (Happy to remove these if you would prefer not to be mentioned.)

“Akrasia has mutant healing powers…Thus you can’t fight it, you can only keep switching tactics for a time until they stop working…”

“I have massive akrasia…so if you could just give me some more high-powered tools to defeat it, that’d be great…”

Both of these quotes seem to have taken the akrasia hypothesis a little too far. As I’ll later argue, “akrasia” seems to be dealt with better when you see the problem as a collection of more isolated disparate failures of different parts of your ability to get things done, rather than as an umbrella term.

I think that the current akrasia framing actually makes the problem more intractable.

I see potential failure modes where people come into the community, hear about akrasia (and all the related scary stories of how hard it is to defeat), and end up using it as an excuse (perhaps not an explicit belief, but as an alief) that impacts their ability to do work.

This was certainly the case for me, where improved introspection and metacognition on certain patterns in my mental behaviors actually removed a lot of my willpower which had served me well in the past. I may be getting slightly tangential here, but my point is that giving people models, useful as they might be for things like classification, may not always be net-positive.

Having new things in your ontology can harm you.

So just giving people some of these patterns and saying, “Hey, all these pieces represent a Thing called akrasia that’s hard to defeat,” doesn’t seem like the best idea.

How can we make the akrasia problem more tractable, then?

I claimed earlier that akrasia does seem to be a real thing, as it seems to be relatable to many people. I think this may actually be because akrasia maps onto too many things. It’s an umbrella term for lots of different problems in motivation and efficacy that could be quite disparate. The typical akrasia framing lumps problems like temporal discounting with motivation problems like internal disagreements or ugh fields, and more.

Those are all very different problems with very different-looking solutions!

I think the above quotes about akrasia are examples of mixing up the class with its members. Instead of treating akrasia as an abstraction that unifies a class of self-imposed problems that share the property of acting as obstacles towards our goals, we treat it as a problem unto itself.

Saying you want to “solve akrasia” makes about as much sense as directly asking for ways to “solve cognitive bias”. Clearly, cognitive biases are merely a class for a wide range of errors our brains make in our thinking. The exercises you’d go through to solve overconfidence look very different than the ones you might use to solve scope neglect, for example.

Under this framing, I think we can be less surprised when there is no direct solution to fighting akrasia—because there isn’t one.

I think the solution here is to be specific about the problem you are currently facing. It’s easy to just say you “have akrasia” and feel the smooth comfort of a catch-all term that doesn’t provide much in the way of insight. It’s another thing to go deep into your ugly problem and actually, honestly say what the problem is.

The important thing here is to identify which subset of the huge akrasia-umbrella your individual problem falls under and try to solve that specific thing instead of throwing generalized “anti-akrasia” weapons at it.

Is your problem one of remembering to do tasks? Then set up a Getting Things Done system.

Is your problem one of hyperbolic discounting, of favoring short-term gains? Then figure out a way to recalibrate the way you weigh outcomes. Maybe look into precommitting to certain courses of action.

Is your problem one of insufficient motivation to pursue things in the first place? Then look into why you care in the first place. If it turns out you really don’t care, then don’t worry about it. Else, find ways to source more motivation.

The basic (and obvious) technique I propose, then, looks like:

1. Identify the akratic thing.

2. Figure out what’s happening when this thing happens. Break it down into moving parts and how you’re reacting to the situation.

3. Think of ways to solve those individual parts.

4. Try solving them. See what happens.

5. Iterate.

Potential questions to be asking yourself throughout this process:

• What is causing your problem? (EX: Do you have the desire but just aren’t remembering? Are you lacking motivation?)

• How does this akratic problem feel? (EX: What parts of yourself is your current approach doing a good job of satisfying? Which parts are not being satisfied?)

• Is this really a problem? (EX: Do you actually want to do better? How realistic would it be to see the improvements you’re expecting? How much better do you think you could be doing?)

Here’s an example of a reductionist approach I did:

“I suffer from akrasia.

More specifically, though, I suffer from a problem where I end up not actually having planned things out in advance. This leads me to do things like browse the internet without having a concrete plan of what I’d like to do next. In some ways, this feels good because I actually like having the novelty of a little unpredictability in life.

However, at the end of the day when I’m looking back at what I’ve done, I have a lot of regret over having not taken key opportunities to actually act on my goals. So it looks like I do care (or meta-care) about the things I do every day, but, in the moment, it can be hard to remember.”

Now that I’ve far more clearly laid out the problem above, it seems easier to see that the problem I need to deal with is a combination of:

• Reminding myself of the stuff I would like to do (maybe via a schedule or to-do list).

• Finding a way to shift my in-the-moment preferences a little more towards the things I’ve laid out (perhaps with a break that allows for some meditation).

I think that once you apply a reductionist viewpoint and specifically say exactly what it is that is causing your problems, the problem is already half-solved. (Having well-specified problems seems to be half the battle.)

Remember, there is no akrasia! There are only problems that have yet to be unpacked and solved!

0 28 April 2017 01:01AM

Introducing the Instrumental Rationality Sequence

29 26 April 2017 09:53PM

What is this project?

I am going to be writing a new sequence of articles on instrumental rationality. The end goal is to have a compiled ebook of all the essays, so the articles themselves are intended to be chapters in the finalized book. There will also be pictures.

I intend for the majority of the articles to be backed by somewhat rigorous research, similar in quality to Planning 101 (with perhaps a few fewer citations). Broadly speaking, the plan is to introduce a topic, summarize the research on it, give some models and mechanisms, and finish off with some techniques to leverage the models.

The rest of the sequence will be interspersed with general essays on dealing with these concepts, similar to In Defense of the Obvious. Lastly, there will be a few experimental essays on my attempt to synthesize existing models into useful-but-likely-wrong models of my own, like Attractor Theory.

I will likely also recycle / cannibalize some of my older writings for this new project, but I obviously won’t post the repeated material here again as new stuff.

What topics will I cover?

Here is a broad overview of the three main topics I hope to go over:

(Ordering is not set.)

Overconfidence in Planning: I’ll be stealing stuff from Planning 101 and rewriting a bit for clarity, so not much will be changed. I’ll likely add more on the actual models of how overconfidence creeps into our plans.

Motivation: I’ll try to go over procrastination, akrasia, and behavioral economics (hyperbolic discounting, decision instability, precommitment, etc.).

Habituation: This will try to cover what habits are, conditioning, incentives, and ways to take the above areas and habituate them, i.e. actually putting instrumental rationality techniques into practice.

Other areas I may want to cover:

Assorted Object-Level Things: The Boring Advice Repository has a whole bunch of assorted ways to improve life that I think might be useful to reiterate in some fashion.

Aversions and Ugh Fields: I don’t know too much about these things from a domain knowledge perspective, but it’s my impression that being able to debug these sorts of internal sticky situations is a very powerful skill. If I were to write this section, I’d try to focus on Focusing and some assorted S1/S2 communication things. And maybe also epistemics.

Ultimately, the point here isn’t to offer polished rationality techniques people can immediately apply, but rather to give people an overview of the relevant fields with enough techniques that they get the hang of what it means to start making their own rationality.

Why am I doing this?

Niche Role: On LessWrong, there currently doesn’t appear to be a good in-depth series on instrumental rationality. Rationality: From AI to Zombies seems very strong for giving people a worldview that enables things like deeper analysis, but it leans very much into the epistemic side of things.

It’s my opinion that, aside from perhaps Nate Soares’s series on Replacing Guilt (which I would be somewhat hesitant to recommend to everyone), there is no in-depth repository/sequence that ties together these ideas of motivation, planning, procrastination, etc.

Granted, there have been many excellent posts here on several areas, but they've been fairly directed. Luke's stuff on beating procrastination, for example, is fantastic. I'm aiming for a broader overview that hits the current models and research on different things.

I think this means that creating this sequence could add a lot of value, especially to people trying to create their own techniques.

Open-Sourcing Rationality: It’s clear that work is being done on furthering rationality by groups like Leverage and CFAR. However, for various reasons, the work they do is not always available to the public. I’d like to give people who are interested but unable to directly work with these organizations something they can use to jump-start their own investigations.

I’d like this to become a similar Schelling Point that we could direct people to if they want to get started.

I don’t mean to imply that what I’ll produce is the same caliber, but I do think it makes sense to have some sort of pipeline to get rationalists up to speed with the areas that (in my mind) tie into figuring out instrumental rationality. When I first began looking into this field, there was a lot of information that was scattered in many places.

I’d like to create something cohesive that people can point to when newcomers want to get started with instrumental rationality that similarly gives them a high level overview of the many tools at their disposal.

Revitalizing LessWrong: It’s my impression that independent essays on instrumental rationality have slowed over the years. (But also, as I mentioned above, this doesn’t mean stuff hasn’t happened. CFAR’s been hard at work iterating their own techniques, for example.) As LW 2.0 is being talked about, this seems like an opportune time to provide some new content and help with our reorientation towards LW becoming once again a discussion hub for rationality.

Where does LW fit in?

Crowd-sourcing Content: I fully expect that many other people will have fantastic ideas that they want to contribute. I think that’s a good idea. Given some basic things like formatting / roughly consistent writing style throughout, I think it’d be great if other potential writers see this post as an invitation to start thinking about things they’d like to write / research about instrumental rationality.

Feedback: I’ll be doing all this writing on a public Google Doc with posts that feature chapters once they’re done, so hopefully there’s ample room to improve and take in constructive criticism. Feedback on LW is often high-quality, and I expect that to definitely improve what I will be writing.

Other Help: I probably can’t comb through every single research paper out there, so if you see relevant information I didn’t or want to help with the research process, let me know! Likewise, if you think there are other cool ways you can contribute, feel free to either send me a PM or leave a comment below.

Why am I the best person to do this?

I’m probably not the best person to be doing this project, obviously.

But, as a student, I have a lot of time on my hands, and time appears to be a major limiting reactant in this whole process.

Additionally, I’ve been somewhat involved with CFAR, so I have some mental models about their flavor of instrumental rationality; I hope this translates into meaning I'm writing about stuff that isn't just a direct rehash of their workshop content.

Lastly, I’m very excited about this project, so you can expect me to put in about 10,000 words (~40 pages) before I take some minor breaks to reset. My short-term goals (for the next month) will be on note-taking and finding research for habits, specifically, and outlining more of the sequence.

1 19 April 2017 05:13AM

4 31 March 2017 09:45PM

[I recently made a post in the OT about this, but I figured it might be good as a top-level post for add'l attention.]

After writing Planning 101, I realized that there was no automated tool online for Murphyjitsu, the CFAR technique of problem-proofing plans. (I explain Murphyjitsu in more detail about halfway down the Planning 101 post.)

I was also trying to learn some web-dev at the same time, so I decided to code up this little tool, Plan-Bot, that walks you through a series of planning prompts and displays your answers to the questions.

In short, you type in what you want to do, it asks you what the steps are, and when you're done, it asks you to evaluate potential ways things can go wrong.
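To make that flow concrete, here is a minimal command-line sketch of the same sequence of prompts: goal, then steps, then a check on how each step could go wrong. This is not Plan-Bot's actual code (the real tool is a web page); the names `Plan`, `run_planning_prompts`, and `format_plan` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Plan:
    """Answers collected from one pass through the planning prompts."""
    goal: str
    steps: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)  # risks[i] belongs to steps[i]


def run_planning_prompts(ask: Callable[[str], str] = input) -> Plan:
    """Ask for a goal, then its steps, then how each step might fail."""
    plan = Plan(goal=ask("What do you want to do? ").strip())

    # Collect steps until the user submits a blank line.
    while True:
        step = ask(f"Step {len(plan.steps) + 1} (blank to finish): ").strip()
        if not step:
            break
        plan.steps.append(step)

    # Murphyjitsu-style pass: one "what could go wrong?" prompt per step.
    for step in plan.steps:
        plan.risks.append(ask(f"How could '{step}' go wrong? ").strip())

    return plan


def format_plan(plan: Plan) -> str:
    """Render the collected answers as plain text."""
    lines = [f"Goal: {plan.goal}"]
    for number, (step, risk) in enumerate(zip(plan.steps, plan.risks), start=1):
        lines.append(f"{number}. {step}")
        lines.append(f"   Could go wrong: {risk}")
    return "\n".join(lines)


# Scripted answers stand in for a person typing, so the example is repeatable.
scripted = iter(["Write the essay", "Outline", "Draft", "",
                 "I never start", "I get distracted"])
example_plan = run_planning_prompts(lambda prompt: next(scripted))
print(format_plan(example_plan))
```

Calling `run_planning_prompts()` with no argument would read the answers from the keyboard instead. Passing the prompt function in as a parameter is just a design convenience that keeps the sketch easy to test.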

I set it as my homepage, and I've been getting some use out of it. Hopefully it ends up being helpful for other people as well.

You can try it out here.

And here it is on GitHub.

I'm still trying to learn web-dev, so feel free to give suggestions for improvements, and I'll try to incorporate them.

Deriving techniques on the fly

3 31 March 2017 10:03AM

Original post:  http://bearlamp.com.au/deriving-techniques-on-the-fly/

Last year Lachlan Cannon came back from a CFAR reunion and commented that instead of just having the CFAR skills we need the derivative skills.  The skills that say, "I need a technique for this problem" and let you derive a technique, system, strategy, plan, idea for solving the problem on the spot.

By analogy to an old classic,

Give a man a fish and he will eat for a day. Teach a man to fish and he will never go hungry again.

This concept always felt off to me until I met Anna, an American who used to live in Alaska, where they have enough fish in a river that any time you go fishing you catch a fish, and a big enough one to eat. In contrast, I had been fishing several times when I was little (in Australia) and never caught things, or only caught fish that were too small to feed one person, let alone many people.

Silly fishing misunderstandings aside, I think the old classic speaks to something interesting but misses a point. To that effect I want to add something.

Teach a man to derive the skill of fishing when he needs it, and he will never stop growing.

We need to go more meta than that?  I am afraid it's turtles all the way down.

Noticing

To help you derive, you need to start by noticing when there is a need. There are three parts to noticing:

1. triggers
2. introspection
3. What next

But before I fail to do it justice, agentyduck has written about this: Art of noticing, What it's like to notice things, and How to train noticing.

The Art Of Noticing goes like this:

1. Answer the question, "What's my first possible clue that I'm about to encounter the problem?" If your problem is "I don't respond productively to being confused," then the first sign a crucial moment is coming might be "a fleeting twinge of surprise". Whatever that feels like in real time from the inside of your mind, that's your trigger.

2. Whenever you notice your trigger, make a precise physical gesture. Snap your fingers, tap your foot, touch your pinky finger with your thumb - whatever feels comfortable. Do it every time you notice that fleeting twinge of surprise.

1. I guess. I remember or imagine a few specific instances of encountering weak contrary evidence (such as when I thought my friend wasn't attracted to me, but when I made eye contact with him across the room at a party he smiled widely). On the basis of those simulations, I make a prediction about what it will feel like, in terms of immediate subjective experience, to encounter weak contrary evidence in the future. The prediction is a tentative trigger. For me, this would be "I feel a sort of matching up with one of my beliefs, there's a bit of dissonance, a tiny bit of fear, and maybe a small impulse to direct my attention away from these sensations and away from thoughts about the observation causing all of this".
2. I test my guess. I keep a search going on in the background for anything in the neighborhood of the experience I predicted. Odds are good I'll miss several instances of weak contrary evidence, but as soon as I realize I've encountered one, I go into reflective attention so I'm aware of as many details of my immediate subjective experience as possible. I pay attention to what's going on in my mind right now, and also what's still looping in my very short-term memory of a few moments before I noticed. Then I compare those results to my prediction, noting anything I got wrong, and I feed that information into a new prediction for next time. (I might have gotten something wrong that caused the trigger to go off at the wrong time, which probably means I need to narrow my prediction.) The new prediction is the new trigger.
3. I repeat the test until my trigger seems to be accurate and precise. Now I've got a good trigger to match a good action.

Derivations (as above) are a "what next" action.

My derivations come from asking myself that question or other similar questions, then attempting to answer them:

• What should I do next?
• How do I solve this problem?
• Why don't other people have this problem?
• Can I make this problem go away?
• How do I design a system to make this not matter any more?

(you may notice this is stimulating introspection - this is what it is)

Meta:

The post that led me to post on derivations is this post on How to present a problem hopefully to be published tomorrow.

This post took ~1 hour to write.

Cross posted to lesswrong

Act into Uncertainty

6 24 March 2017 09:28PM

It’s only been recently that I’ve been thinking about epistemics in the context of figuring out my behavior and debiasing. Aside from trying to figure out how I actually behave (as opposed to what I merely profess I believe), I’ve been thinking about how to confront uncertainty—and what it feels like.

For many areas of life, I think we shy away from confronting uncertainty and instead flee into the comforting non-falsifiability of vagueness.

Consider these examples:

1) You want to get things done today. You know that writing things down can help you finish more things. However, it feels aversive to write down what you specifically want to do. So instead, you don’t write things down and instead just keep a hazy notion of “I will do things today”.

2) You try to make a confidence interval for a prediction where money is on the line. You notice yourself feeling uncomfortable, no matter what your bounds are; it feels bad to set down any number at all, which is accompanied by a dread feeling of finality.

3) You’re trying to find solutions to a complex, entangled problem. Coming up with specific solutions feels bad because none of them seem to completely solve the problem. So instead you decide to create a meta-framework that produces solutions, or argue in favor of some abstract process like a “democratized system that focuses on holistic workarounds”.

In each of the above examples, it feels like we move away from making specific claims because that opens us up to specific criticism. But instead of trying to improve the strengths of specific claims, we retreat to fuzzily-defined notions that allow us to incorporate any criticism without having to really update.

I think there’s a sense in which, in some areas of life, we’re embracing shoddy epistemology (e.g. not wanting to validate or falsify our beliefs) because of a fear of failing / having to put in the effort to update. I think this fear is what fuels this feeling of aversion.

It seems useful to face this feeling of badness or aversion with the understanding that this is what confronting uncertainty feels like. The best action doesn’t always feel comfortable and easy; it can just as easily feel aversive and final.

Look for situations where you might be flinching away from making specific claims and replacing them with vacuous claims that are compatible with any evidence you might see.

If you never put your beliefs to the test with specific claims, then you can never verify them in the real world. And if your beliefs don’t map well onto the real world, they don’t seem very useful to even have in the first place.

5 Project Hufflepuff Suggestions for the Rationality Community

In the spirit of Project Hufflepuff, I’m listing out some ideas for things I would like to see in the rationality community, which seem like perhaps useful things to have. I dunno if all of these are actually good ideas, but it seems better to throw some things out there and iterate.

Ideas:

Idea 1) A more coherent summary of all the different ideas that are happening across all the rationalist blogs. I know LessWrong is trying to become more of a Schelling point, but I think a central forum is still suboptimal for what I want. I’d like something that just takes the best ideas everyone’s been brewing and centralizes them in one place so I can quickly browse them all and dive deep if something looks interesting.

Suggestions:

A) A bi-weekly (or some other period) newsletter where rationalists can summarize their best insights of the past weeks in 100 words or less, with links to their content.

B) An actual section of LessWrong that does the above, so people can comment / respond to the ideas.

Thoughts:

This seems straightforward and doable, conditional on commitment from 5-10 people in the community. If other people are also excited, I’m happy to reach out and get this thing started.

Idea 2) A general tool/app for being able to coordinate. I’d be happy to lend some fraction of my time/effort in order to help solve coordination problems. It’s likely other people feel the same way. I’d like a way to both pledge my commitment and stay updated on things that I might be able to plausibly Cooperate on.

Suggestions:

A) An app that is managed by someone, which sends out broadcasts for action every so often. I’m aware that similar things / platforms already exist, so maybe we could just leverage an existing one for this purpose.

Thoughts:

In the abstract, this seems good. Wondering what others think / what sorts of coordination problems this would be good for. The main value here is being confident in *actually* getting coordination from the X people who’ve signed up.

Idea 3) More rationality materials that aren’t blogs. The rationality community seems fairly saturated with blogs. Maybe we could do with more webcomics, videos, or something else?

Suggestions:

A) Brainstorm good content from other mediums, benefits / drawbacks, and see why we might want content in other forms.

B) Convince people who already make such mediums to touch on rationalist ideas, sort of like what SMBC does.

Thoughts:

I’d be willing to start up either a webcomic or a video series, conditional on funding. Anyone interested in sponsoring? Happy to have a discussion below.

EDIT:

Links to things I've done for additional evidence:

Idea 4) More systematic tools to master rationality techniques. To my knowledge, only a small handful of people have really tried to apply Systemization to learning rationality, of whom Malcolm and Brienne are the most visible. I’d like to see some more attempts at Actually Trying to learn techniques.

Suggestions:

A) General meeting place to discuss the learning / practice.

B) Accountability partners + Skype check-ins.

C) List of examples of actually using the techniques + quantified self to get stats.

Thoughts:

I think finding more optimal ways to do this is very important. There is a big step between knowing how techniques work and actually finding ways to do them. I'd be excited to talk more about this Idea.

Idea 5) More online tools that facilitate rationality-things. A lot of rationality techniques seem like they could be operationalized to plausibly provide value.

Suggestions:

A) An online site for Double Cruxing, where people can search for someone to DC with, look at other ongoing DC’s, or propose topics to DC on.

B) Chatbots that integrate things like Murphyjitsu or ask debugging questions.

Thoughts:

I’m working on building a Murphyjitsu chatbot for building up my coding skill. The Double Crux site sounds really cool, and I’d be happy to do some visual mockups if that would help people’s internal picture of how that might work out. I am unsure of my ability to do the actual coding, though.

Conclusion:

Those are the ideas I currently have. Very excited to hear what other people think of them, and how we might be able to get the awesome ones into place. Also, feel free to comment on the FB post, too, if you want to signal boost.

CFAR Workshop Review: February 2017

[A somewhat extensive review of a recent CFAR workshop, with recommendations at the end for those interested in attending one.]

I recently mentored at a CFAR workshop, and this is a review of the actual experience. In broad strokes, this review will cover the physical experience (atmosphere, living, eating, etc.), classes (which ones were good, which ones weren’t), and recommendations (regrets, suggestions, ways to optimize your experience). I’m not officially affiliated with CFAR, and this review represents my own thoughts only.

A little about me: my name is Owen, and I’m here in the Bay Area. This was actually my first real workshop, but I’ve had a fair amount of exposure to CFAR materials from EuroSPARC, private conversations, and LessWrong. So do keep in mind that I’m someone who came into the workshop with a rationalist’s eye.

I’m also happy to answer any questions people might have about the workshop. (Via PM or in the comments below.)

Physical Experience:

Sleeping / Food / Living:

(This section is venue-dependent, so keep that in mind.)

Despite the hefty $3000 plus price tag, the workshop accommodations aren’t exactly plush. You get a bed, and that’s about it. In my workshop, there were always free bathrooms, so that part wasn’t a problem. There was always enough food at meals, and my impression was that dietary restrictions were handled well. For example, one staff member went out and bought someone lunch when one meal didn’t work. Other than that, there’s ample snacks between meals, usually a mix of chips, fruits, and chocolate. Also, hot tea and a surprisingly wide variety of drinks. Atmosphere / Social: (The participants I worked with were perhaps not representative of the general “CFAR participant”, so also take caution here.) People generally seemed excited and engaged. Given that everyone hopefully voluntarily decided to show up, this was perhaps to be expected. Anyway, there’s a really low amount of friction when it comes to joining and exiting conversations. By that, I mean it felt very easy, socially speaking, to just randomly join a conversation. Staff and participants all seemed quite approachable for chatting. I don’t have the actual participant stats, but my impression is that a good amount of people came from quantitative (math/CS) backgrounds, so there were discussions on more technical things, too. It also seemed like a majority of people were familiar with rationality or EA prior to coming to the workshop. There were a few people for whom the material didn’t seem to “resonate” well, but the majority people seemed to be “with the program”. Class Schedule: (The schedule and classes are also in a state of flux, so bear that in mind too.) Classes start at around 9:30 am in the morning and end at about 9:00 pm at night. In between, there are 20 minute breaks between every hour of classes. Lunch is about 90 minutes, while dinner is around 60 minutes. Most of the actual classes were a little under 60 minutes, except for the flash classes, which were only about 20 minutes. 
Some classes had extended periods for practicing the techniques. You’re put into a group of around 8 people, which switches every day, that you go to classes with. So there’s a few rotating classes that are happening, where you might go to them in a different order. Classes Whose Content I Enjoyed: As I was already familiar with most of the below material, this reflects more a general sense of classes which I think are useful, rather than ones which were taught exceptionally well at the workshop. TAPs: Kaj Sotala already has a great write-up of TAPs here, and I think that they’re a helpful way of building small-scale habits. I also think the “click-whirr” mindset TAPs are built off can be a helpful way to model minds. The most helpful TAP for me is the Quick Focusing TAP I mention about a quarter down the page here. Pair Debugging: Pair Debugging is about having someone else help you work through a problem. I think this is explored to some extent in places like psychiatry (actually, I’m unsure about this) as well as close friendships, but I like how CFAR turned this into a more explicit social norm / general thing to do. When I do this, I often notice a lot of interesting inconsistencies, like when I give someone good-sounding advice—except that I myself don’t follow it. The Strategic Level: The Strategic Level is where you, after having made a mistake, ask yourself, “What sort of general principles would I have had noticed in order to not make a mistake of this class in the future?” This is opposed to merely saying “Well, that mistake was bad” (first level thinking) or “I won’t make that mistake again” (second level thinking). There were also some ideas about how the CFAR techniques can recurse upon themselves in interesting ways, like how you can use Murphyjitsu (middle of the page) on your ability to use Murphyjitsu. This was a flash class, and I would have liked it if we could have spent more time on these ideas. 
Tutoring Wheel: Less a class and more a pedagogical activity, Tutoring Wheel was where everyone picked a specific rationality class to teach and then rotated, teaching others and being taught. I thought this was a really strong way to help people understand the techniques during the workshop.

Focusing / Internal Double Crux / Mundanification: All three of these classes address different things, but in my mind I thought they were similar in the sense of looking into yourself. Focusing is Gendlin’s self-directed therapy technique, where people try to look into themselves to get a “felt shift”. Internal Double Crux is about resolving internal disagreements, often between S1 and S2 (but not necessarily). Mundanification is about facing the truth, even when you flinch from it, via Litany of Gendlin-type things. I find this general class of techniques that deal with resolving internal feelings of “ugh” to be incredibly helpful, and it may very well be the highest value thing I got out of the class curriculum.

Classes Whose Teaching/Content I Did Not Enjoy:

These were classes that I felt were not useful and/or not explained well. This differs from the above, because I let the actual teaching part color my opinions.

Taste / Shaping: I thought an earlier iteration of this class was clearer (when it was called Inner Dashboard). Here, I wasn’t exactly sure what the practical purpose of the class was, let alone the general thing it was pointing at. To the best of my knowledge, Taste is about how we have subtle “yuck” and “yum” senses towards things, and there can be a way to reframe negative affects in a more positive way, like how “difficult” and “challenging” can be two sides of the same coin. Shaping is about…something. I’m really unclear about this one.

Pedagogical Content Knowledge (PCK): PCK is, I think, about how the process of teaching a skill differs from the process of learning it. And you need a good understanding of how a beginner is learning something, what that experience feels like, in order to teach it well. I get that part, but this class seemed removed from the other classes, and the activity we did (asking other people how they did math in their head) didn’t seem useful.

Flash Class Structure: I didn’t like the 20 minute “flash classes”. I felt like they were too quick to really give people ideas that stuck in their head. In general, I am in support of fewer classes and extended time to really practice the techniques, and I think having little to no flash classes would be good.

Suggestions for Future Classes:

This is my personal opinion only. CFAR has iterated their classes over lots of workshops, so it’s safe to assume that they have reasons for choosing what they teach. Nevertheless, I’m going to be bold and suggest some improvements which I think could make things better.

Opening Session: CFAR starts off every workshop with a class called Opening Session that tries to get everyone in the right mindset for learning, with a few core principles. Because of limited time, they can’t include everything, but there were a few lessons I thought might have helped as the participants went forward:

In Defense of the Obvious: There’s a sense where a lot of what CFAR says might not be revolutionary, but it’s useful. I don’t blame them; much of what they do is draw boundaries around fairly-universal mental notions and draw attention to them. I think they could spend more time highlighting how obvious advice can still be practical.

Mental Habits are Procedural: Rationality techniques feel like things you know, but it’s really about things you do. Focusing on this distinction could be very useful to make sure people see that actually practicing the skills is very important.

Record / Take Notes: I find it really hard to remember concrete takeaways if I don’t write them down. During the workshop, it seemed like maybe only about half of the people were taking notes. In general, I think it’s at least good to remind people to journal their insights at the end of the day, if they’re not taking notes at every class.

Turbocharging + Overlearning: Turbocharging is a theory in learning put forth by Valentine Smith which, briefly speaking, says that you get better at what you practice. Similarly, Overlearning is about using a skill excessively over a short period to get it ingrained. It feels like the two skills are based off similar ideas, but their connection to one another wasn’t emphasized. Also, they were several days apart; I think they could be taught closer together.

General Increased Cohesion: Similarly, I think that having additional discussion on how these techniques relate to one another, be it through concept maps or some theorizing, might be good to give people a more unified rationality toolkit.

Mental Updates / Concrete Takeaways:

This ended up being really long. If you’re interested, see my 5-part series on the topic here.

Suggestions / Recommendations:

This is a series of things that I would have liked to do (looking back) at the workshop, but that I didn’t manage to do at the time. If you’re considering going, this list may prove useful to you when you go. (You may want to consider bookmarking this.)

Write Things Down: Have a good idea? Write it down. Hear something cool? Write it down. Writing things down (or typing, voice recording, etc.) is all really important so you can remember it later! Really, make sure to record your insights!

Build Scaffolding: Whenever you have an opportunity to shape your future trajectory, take it. Whether this means sending yourself emails, setting up reminders, or just taking a 30 minute chunk to really practice a certain technique, I think it’s useful to capitalize on the unique workshop environment to, not just learn new things, but also just do things you otherwise probably “wouldn’t have had the time for”.

Record Things to Remember Them: Here’s a poster I made that has a bunch of suggestions: Do ALL The Things!

Don’t Be Afraid to Ask for Help: Everyone at the workshop, on some level, has self-growth as a goal. As such, it’s a really good idea to ask people for help. If you don’t understand something, feel weird for some reason, or have anything going on, don’t be afraid to use the people around you to the fullest (if they’re available, of course).

Conclusion:

Of course, perhaps the biggest question is “Is the workshop worth the hefty price?” Assuming you’re coming from a tech-based position (apologies to everyone else, I’m just doing a quick ballpark with what seems to be the most common place CFAR participants come from), the average hourly wage is something like $40. At ~$4,000, the workshop would need to save you about 100 hours to break even.
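The break-even figure is just the workshop price divided by the hourly wage. A minimal sketch of that ballpark, assuming the rough $40/hour and $4,000 numbers quoted above (the function name `break_even_hours` is purely illustrative):

```python
def break_even_hours(workshop_cost: float, hourly_wage: float) -> float:
    """Hours of saved time needed for the workshop to pay for itself."""
    return workshop_cost / hourly_wage


# Ballpark figures from the text, not measured data.
hours = break_even_hours(workshop_cost=4000, hourly_wage=40)
print(hours)  # 100.0
```

Plugging in a different wage shows how sensitive the estimate is: at half the wage, the workshop would need to save twice as many hours.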

If you want rigorous quantitative data, you may want to check out CFAR’s own study on their participants. I don’t think I have a good way of quantifying these sorts of personal benefits myself, so everything below is pretty qualitative.

Things that I do think CFAR provides:

1) A unique training / learning environment for certain types of rationality skills that would probably be hard to learn elsewhere. Several of these techniques, including TAPs, Resolve Cycles, and Focusing, have become fairly ingrained in my daily life, and I believe they’ve increased my quality of life.

Learning rationality is the main point of the workshop, so the majority of the value probably comes out of learning these techniques. Also, though, CFAR gives you the space and time to start thinking about a lot of things you might have otherwise put off forever. (Granted, this can be achieved by other means, like just blocking out time every week for review, but I thought this counterfactual benefit was still probably good to mention.)

2) Connections to other like-minded people. Because a CFAR workshop is a Schelling point for rationality, you’ll meet people there who share similar values / goals. If you’re looking to make new friends or meet others, this is another benefit. (Although it does seem costly and inefficient if that’s your main goal.)

3) Upgraded mindset: As I wrote about here, I think that learning CFAR-type rationality can really level up the way you look at your brain, which seems to have some potential flow-through effects. The post explains it better, but in short, if you have not-so-good mental models, then CFAR could be a really good choice for improving your picture of how your mind works.

There are probably other things, but those are the main ones. I hope this helps inform your decision. CFAR is currently hosting a major sprint of workshops, so this would be a good time to sign up for one, if you've been considering attending.

Concrete Takeaways Post-CFAR

24 February 2017 06:31PM

Concrete Takeaways:

[So I recently volunteered at a CFAR workshop. This is part five of a five-part series on how I changed my mind. It's split into 3 sections: TAPs, Heuristics, and Concepts. They get progressively more abstract. It's also quite long at around 3,000 words, so feel free to just skip around and see what looks interesting.]

(I didn't post Part 3 and Part 4 on LW, as they're more speculative and arguably less interesting, but I've linked to them on my blog if anyone's interested.)

This is a collection of TAPs, heuristics, and concepts that I’ve been thinking about recently. Many of them were inspired by my time at the CFAR workshop, but there’s not really an underlying theme behind it all. It’s just a collection of ideas that are either practical or interesting.

TAPs:

TAPs, or Trigger Action Planning, is a CFAR technique that is used to build habits. The basic idea is you pair a strong, concrete sensory “trigger” (e.g. “when I hear my alarm go off”) with a “plan”—the thing you want to do (e.g. “I will put on my running shoes”).

If you’re good at noticing internal states, TAPs can also use your feelings or other internal things as a trigger, but it’s best to try this with something concrete first to get the sense of it.

Some of the more helpful TAPs I’ve recently been thinking about are below:

Ask for Examples TAP:

[Notice you have no mental picture of what the other person is saying. → Ask for examples.]

Examples are good. Examples are god. I really, really like them.

In conversations about abstract topics, it can be easy to understand the meaning of the words that someone said, yet still miss the mental intuition of what they’re pointing at. Asking for an example clarifies what they mean and helps you understand things better.

The trigger for this TAP is noticing that what someone said gave you no mental picture.

I may be extrapolating too far from too little data here, but it seems like people do try to “follow along” with things in their head when listening. And if this mental narrative, simulation, or whatever internal thing you’re doing comes up blank when someone’s speaking, then this may be a sign that what they said was unclear.

Once you notice this, you ask for an example of what gave you no mental picture. Ideally, the other person can then respond with a more concrete statement or clarification.

Quick Focusing TAP:

[Notice you feel aversive towards something → Be curious and try to source the aversion.]

Aversion Factoring, Internal Double Crux, and Focusing are all techniques CFAR teaches to help deal with internal feelings of badness.

While there are definite nuances between all three techniques, I’ve sort of abstracted from the general core of “figuring out why you feel bad” to create an in-the-moment TAP I can use to help debug myself.

The trigger is noticing a mental flinch or an ugh field, where I instinctively shy away from looking too hard.

After I notice the feeling, my first step is to cultivate a sense of curiosity. There’s no sense of needing to solve it; I’m just interested in why I’m feeling this way.

Once I’ve directed my attention to the mental pain, I try to source the discomfort. Using some backtracking and checking multiple threads (e.g. “is it because I feel scared?”) allows me to figure out why. This whole process takes maybe half a minute.

When I’ve figured out the reason why, a sort of shift happens, similar to the felt shift in Focusing. In a similar way, I’m trying to “ground” the nebulous, uncertain discomfort, forcing it to take shape.

I’d recommend trying some Focusing before trying this TAP, as it’s basically an expedited version of it, hence the name.

Rule of Reflexivity TAP:

[Notice you’re judging someone → Recall an instance where you did something similar / construct a plausible internal narrative]

[Notice you’re making an excuse → Recall times where others used this excuse and update on how you react in the future.]

This is a TAP that was born out of my observation that our excuses seem way more self-consistent when we’re the ones saying them. (Oh, why hello there, Fundamental Attribution Error!) The point of practicing the Rule of Reflexivity is to build empathy.

The Rule of Reflexivity goes both ways. In the first case, you want to notice if you’re judging someone. This might feel like ascribing a value judgment to something they did, e.g. “This person is stupid and made a bad move.”

The response is to recall times where either you did something similar or (if you think you’re perfect) think of a plausible set of events that might have caused them to act in this way. Remember that most people don’t think they’re acting stupidly; they’re just doing what seems like a good idea from their perspective.

In the second case, you want to notice when you’re trying to justify your own actions. If the excuses you yourself make suspiciously sound like things you’ve heard others say before, then you may want to be less likely to immediately dismiss them in the future.

Keep Calm TAP:

[Notice you’re starting to get angry → Take a deep breath → Speak softer and slower]

Okay, so this TAP is probably not easy to do because you’re working against a biological response. But I’ve found it useful in several instances where otherwise I would have gotten into a deeper argument.

The trigger, of course, is noticing that you’re angry. For me, this feels like an increased tightness in my chest and a desire to raise my voice. I may feel like a cherished belief of mine is being attacked.

Once I notice these signs, I remember that I have this TAP which is about staying calm. I think something like, “Ah yes, I’m getting angry now. But I previously already made the decision that it’d be a better idea to not yell.”

After that, I take a deep breath, and I try to open up my stance. Then I remember to speak in a slower and quieter tone than previously. I find this TAP especially helpful in arguments—ahem, collaborative searches for the truth—where things get a little too excited on both sides.

Heuristics:

Heuristics are algorithm-like things you can do to help get better results. I think that it’d be possible to turn many of the heuristics below into TAPs, but there’s a sense of deliberately thinking things out that separates these from just the “mindless” actions above.

As more formal procedures, these heuristics do require you to remember to Take Time to do them well. However, I think that the sorts of benefits you get from them make it worth the slight investment in time.

Modified Murphyjitsu: The Time Travel Reframe:

(If you haven’t read up on Murphyjitsu yet, it’d probably be good to do that first.)

Murphyjitsu is based off the idea of a premortem, where you imagine that your project failed and you’re looking back. I’ve always found this to be a weird temporal framing, and I realized there’s a potentially easier way to describe things:

Say you’re sitting at your desk, getting ready to write a report on intertemporal travel. You’re confident you can finish before the hour is over. What could go wrong? Closing Facebook, you begin typing.

Suddenly, you hear a loud CRACK! A burst of light floods your room as a figure pops into existence, dark and silhouetted by the brightness behind it. The light recedes, and the figure crumples to the ground. Floating in the air is a whirring gizmo, filled with turning gears. Strangely enough, your attention is drawn from the gizmo to the person on the ground:

The figure has a familiar sort of shape. You approach, tentatively, and find the spitting image of yourself! The person stirs and speaks.

“I’m you from one week into the future,” your future self croaks. Your future self tries to get up, but sinks down again.

“Oh,” you say.

“I came from the future to tell you…” your temporal clone says in a scratchy voice.

“To tell me what?” you ask. Already, you can see the whispers of a scenario forming in your head…

Future You slowly says, “To tell you… that the report on intertemporal travel that you were going to write… won’t go as planned at all. Your best-case estimate failed.”

“Oh no!” you say.

Somehow, though, you aren’t surprised…

At this point, what plausible reasons for your failure come to mind?

I hypothesize that the time-travel reframe I provide here for Murphyjitsu engages similar parts of your brain as a premortem, but is 100% more exciting to use. In all seriousness, I think this is a reframe that is easier to grasp compared to the twisted “imagine you’re in the future looking back into the past, which by the way happens to be you in the present” framing normal Murphyjitsu uses.

The actual (non-dramatized) wording of the heuristic, by the way, is, “Imagine that Future You from one week into the future comes back telling you that the plan you are about to embark on will fail: Why?”

Low on Time? Power On!

Often, when I find myself low on time, I feel less compelled to try. This seems sort of like an instance of failing with abandon, where I think something like, “Oh well, I can’t possibly get anything done in the remaining time between event X and event Y”.

And then I find myself doing quite little as a response.

As a result, I’ve decided to internalize the idea that being low on time doesn’t mean I can’t make meaningful progress on my problems.

This is a very Resolve-esque technique. The idea is that even if I have only 5 minutes, that’s enough to get things done. There are lots of useful things I can pack into small time chunks, like thinking, brainstorming, or doing some Quick Focusing.

I’m hoping to combat the sense of apathy / listlessness that creeps in when time draws to a close.

Supercharge Motivation by Propagating Emotional Bonds:

[Disclaimer: I suspect that this isn’t an optimal motivation strategy, and I’m sure there are people who will object to having bonds based on others rather than themselves. That’s okay. I think this technique is effective, I use it, and I’d like to share it. But if you don’t think it’s right for you, feel free to just move along to the next thing.]

CFAR used to teach a skill called Propagating Urges. It’s now been largely subsumed by Internal Double Crux, but I still find Propagating Urges to be a powerful concept.

In short, Propagating Urges hypothesizes that motivation problems arise because the implicit parts of ourselves don’t see how the boring things we do (e.g. filing taxes) causally relate to things we care about (e.g. not going to jail). The actual technique involves walking through the causal chain in your mind and using some visceral imagery every step of the way to get the implicit part of yourself on board.

I’ve taken the same general principle, but I’ve focused it entirely on the relationships I have with other people. If all the parts of me realize that doing something would greatly hurt those I care about, this becomes a stronger motivation than most external incentives.

For example, I walked through an elaborate internal simulation where I wanted to stop doing a Thing. I imagined someone I cared deeply for finding out about my Thing-habit and being absolutely deeply disappointed. I focused on the sheer emotional weight that such disappointment would cause (facial expressions, what they’d feel inside, the whole deal).

I now have a deep injunction against doing the Thing, and all the parts of me are in agreement because we agree that such a Thing would hurt other people and that’s obviously bad.

The basic steps for Propagating Emotional Bonds look like:

• Figure out what thing you want to do more of or stop doing.

• Imagine what someone you care about would think or say.

• Really focus on how visceral that feeling would be.

• Rehearse the chain of reasoning (“If I do this, then X will feel bad, and I don’t want X to feel bad, so I won’t do it”) a few times.

Take Time in Social Contexts:

Often, in social situations, when people ask me questions, I feel an underlying pressure to answer quickly. It feels like if I don’t answer in the next ten seconds, something’s wrong with me. (School may have contributed to this). I don’t exactly know why, but it just feels like it’s expected.

I also think that being forced to hurry isn’t good for thinking well. As a result, something helpful I’ve found, when someone asks something like “Is that all? Anything else?”, is to Take Time.

My response is something like, “Okay, wait, let me actually take a few minutes.” At which point, I, uh, actually take a few minutes to think things through. After saying this, it feels like it’s now socially permissible for me to take some time thinking.

This has proven useful in several contexts where, had I not Taken Time, I would have forgotten to bring up important things or missed key failure-modes.

Ground Mental Notions in Reality not by Platonics:

One of the proposed reasons that people suck at planning is that we don’t actually think about the details behind our plans. We end up thinking about them in vague black-box-style concepts that hide all the scary unknown unknowns. What we’re left with is just the concept of our task, rather than a deep understanding of what our task entails.

In fact, this seems fairly similar to the “prototype model” that occurs in scope insensitivity.

I find this is especially problematic for tasks which look nothing like their concepts. For example, my mental representation of “doing math” conjures images of great mathematicians, intricate connections, and fantastic concepts like uncountable sets.

Of course, actually doing math looks more like writing stuff on paper, slogging through textbooks, and banging your head on the table.

My brain doesn’t differentiate well between doing a task and the affect associated with the task. Thus I think it can be useful to try and notice when our brains are doing this sort of black-boxing and instead “unpack” the concepts.

This means getting better correspondences between our mental conceptions of tasks and the tasks themselves, so that we can hopefully actually choose better.

3 Conversation Tips:

I often forget what it means to be having a good conversation with someone. I think I miss opportunities to learn from others when talking with them. This is my handy 3-step list of Conversation Tips to get more value out of conversations:

1) "Steal their Magic": Figure out what other people are really good at, and then get inspired by their awesomeness and think of ways you can become more like that. Learn from what other people are doing well.

2) "Find the LCD"/"Intellectually Escalate": Figure out where your intelligence matches theirs, and learn something new. Focus on Actually Trying to bridge those inferential distances. In conversations, this means focusing on the limits of either what you know or what the other person knows.

3) "Convince or Be Convinced": (This is a John Salvatier idea, and it also follows from the above.) Focus on maximizing your persuasive ability to convince them of something. Or be convinced of something. Either way, focus on updating beliefs, be it your own or the other party’s.

Be The Noodly Appendages of the Superintelligence You Wish To See in the World:

CFAR co-founder Anna Salamon has this awesome reframe similar to IAT which asks, “Say a superintelligence exists and is trying to take over the world. However, you are its only agent. What do you do?”

I’ll admit I haven’t used this one, but it’s super cool and not something I’d thought of, so I’m including it here.

Concepts:

Concepts are just things in the world I’ve identified and drawn some boundaries around. They are farthest from the pipeline that goes from ideas to TAPs, as concepts are just ideas. Still, I do think these concepts “bottom out” at some point into practicality, and I think playing around with them could yield interesting results.

Paperspace =/= Mindspace:

I tend to write things down because I want to remember them. Recently, though, I’ve noticed that rather than treating the things I write down as an extension of my brain, I seem to treat them as no longer in my own head. As in, if I write something down, it’s not necessarily easier for me to recall it later.

It’s as if by “offloading” the thoughts onto paper, I’ve cleared them out of my brain. This seems suboptimal, because a big reason I write things down is to cement them more deeply within my head.

I can still access the thoughts if I’m asking myself questions like, “What did I write down yesterday?” but only if I’m specifically sorting for things I write down.

The point is, I want stuff I write down on paper to be, not where I store things, but merely a sign of what’s stored inside my brain.

Outreach: Focus on Your Target’s Target:

One interesting idea I got from the CFAR workshop was that of thinking about yourself as a radioactive vampire. Um, I mean, thinking about yourself as a memetic vector for rationality (the vampire thing was an actual metaphor they used, though).

The interesting thing they mentioned was to think, not about who you’re directly influencing, but who your targets themselves influence.

This means that not only do you have to care about the fidelity of your transmission, but you need to think of ways to ensure that your target also does a passable job of passing it on to their friends.

I’ve always thought about outreach / memetics in terms of the people I directly influence, so looking at two degrees of separation is a pretty cool thing I hadn’t thought about in the past.

I guess that if I took this advice to heart, I’d probably have to change the way that I explain things. For example, I might want to try giving more salient examples that can be easily passed on or focusing on getting the intuitions behind the ideas across.

Build in Blank Time:

Professor Barbara Oakley distinguishes between focused and diffuse modes of thinking. Her claim is that time spent in a thoughtless activity allows your brain to continue working on problems without conscious input. This is the basis of diffuse mode.

In my experience, I’ve found that I get interesting ideas or remember important ideas when I’m doing laundry or something else similarly mindless.

I’ve found this to be helpful enough that I’m considering building in “Blank Time” in my schedules.

My intuitions here are something like, “My brain is a thought-generator, and it’s particularly active if I can pay attention to it. But I need to be doing something that doesn’t require much of my executive function to even pay attention to my brain. So maybe having more Blank Time would be good if I want to get more ideas.”

There’s also the additional point that meta-level thinking can’t be done if you’re always in the moment, stuck in a task. This means that, cool ideas aside, if I just want to reorient or survey my current state, Blank Time can be helpful.

The 99/1 Rule: Few of Your Thoughts are Insights:

The 99/1 Rule says that the vast majority of your thoughts every day are pretty boring and that only about one percent of them are insightful.

This was generally true for my life…and then I went to the CFAR workshop and this rule sort of stopped being appropriate. (Other exceptions to this rule were EuroSPARC [now ESPR] and EAG)

Note:

I bulldozed through a bunch of ideas here, some of which could have probably garnered a longer post. I’ll probably explore some of these ideas later on, but if you want to talk more about any one of them, feel free to leave a comment / PM me.

Levers, Emotions, and Lazy Evaluators:

20 February 2017 11:00PM

Levers, Emotions, and Lazy Evaluators: Post-CFAR 2

[This is a trio of topics following from the first post that all use the idea of ontologies in the mental sense as a bouncing off point. I examine why naming concepts can be helpful, listening to your emotions, and humans as lazy evaluators. I think this post may also be of interest to people here. Posts 3 and 4 are less so, so I'll probably skip those, unless someone expresses interest. Lastly, the below expressed views are my own and don’t reflect CFAR’s in any way.]

Levers:

When I was at the CFAR workshop, someone mentioned that something like 90% of the curriculum was just making up fancy new names for things they already sort of did. This got some laughs, but I think it’s worth exploring why even just naming things can be powerful.

Our minds do lots of things; they carry many thoughts, and we can recall many memories. Some of these phenomena may be more helpful for our goals, and we may want to name them.

When we name a phenomenon, like Focusing, we’re essentially drawing a boundary around the thing, drawing attention to it. We’ve made it conceptually discrete. This transformation, in turn, allows us to more concretely identify which things among the sea of our mental activity correspond to Focusing.

Focusing can then become a concept that floats in our understanding of things our minds can do. We’ve taken a mental action and packaged it into a “thing”. This can be especially helpful if we’ve identified a phenomenon that consists of several steps which usually aren’t found together.

By drawing certain patterns around a thing with a name, we can hopefully help others recognize them and perhaps do the same for other mental motions, which seems to be one more way that we find new rationality techniques.

This then means that we’ve created a new action that is explicitly available to our ontology. This notion of “actions I can take” is what I think forms the idea of levers in our mind. When CFAR teaches a rationality technique, the technique itself seems to be pointing at a sequence of things that happen in our brain. Last post, I mentioned that I think CFAR techniques upgrade people’s mindsets by changing their sense of what is possible.

I think that levers are a core part of this because they give us the feeling of, “Oh wow! That thing I sometimes do has a name! Now I can refer to it and think about it in a much nicer way. I can call it ‘focusing’, rather than ‘that thing I sometimes do when I try to figure out why I’m feeling sad that involves looking into myself’.”

For example, once you understand that a large part of habituation is simply "if-then" loops (ala TAPs, aka Trigger Action Plans), you’ve now not only understood what it means to learn something as a habit, but you’ve internalized the very concept of habituation itself. You’ve gone one meta-level up, and you can now reason about this abstract mental process in a far more explicit way.
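As a rough sketch of what the "if-then" framing looks like when taken literally, a TAP can be written as a lookup from a trigger to its planned actions. The table layout and the names `TAPS` and `fire` are illustrative assumptions, not CFAR material; the entries paraphrase TAPs from the Concrete Takeaways post.

```python
from typing import Dict, List, Optional

# Hypothetical trigger -> actions table; the structure is only an illustration.
TAPS: Dict[str, List[str]] = {
    "I hear my alarm go off": ["put on my running shoes"],
    "I notice I'm starting to get angry": [
        "take a deep breath",
        "speak softer and slower",
    ],
}


def fire(trigger: str) -> Optional[List[str]]:
    """Return the planned actions for a trigger, or None if no TAP is installed."""
    return TAPS.get(trigger)


alarm_plan = fire("I hear my alarm go off")  # an installed TAP
unknown_plan = fire("I see a squirrel")      # no TAP for this trigger
```

Once the habit is a named entry in a table, it becomes something you can inspect, add to, or remove, which is roughly the sense in which naming a mental motion gives you a lever.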

Names have power in the same way that abstraction barriers have power in a programming language: they change how you think about the phenomenon itself, and this in turn can affect your behavior.

Emotions:

CFAR teaches a class called “Understanding Shoulds”, which is about seeing your “shoulds”, the parts of yourself that feel like obligations, as data about things you might care about. This is a little different from Nate Soares’s Replacing Guilt series, which tries to move past guilt-based motivation.

In further conversations with staff, I’ve seen the even deeper view that all emotions should be considered information.

The basic premise seems to be based off the understanding that different parts of us may need different things to function. Our conscious understanding of our own needs may sometimes be limited. Thus, our implicit emotions (and other S1 processes) can serve as a way to inform ourselves about what we’re missing.

In this way, all emotions seem to be channels where information can be passed on from implicit parts of you to the forefront of “meta-you”. This idea of “emotions as a data trove” is yet another ontology that produces different rationality techniques, as it’s operating on, once again, a mental model that is built out of a different type of abstraction.

Many of the skills based on this ontology focus on communication between different pieces of the self.

I’m very sympathetic to this viewpoint, as it forms the basis of the Internal Double Crux (IDC) technique, one of my favorite CFAR skills. In short, IDC assumes that akrasia-esque problems are caused by a disagreement between different parts of you, some of which might be in the implicit parts of your brain.

By “disagreement”, I mean that some part of you endorses an action for some well-meaning reasons, but some other part of you is against the action and also has justifications. To resolve the problem, IDC has us “dialogue” between the conflicting parts of ourselves, treating both sides as valid. If done right, without “rigging” the dialogue to bias one side, IDC can be a powerful way to source internal motivation for our tasks.

While I do seem to do some communication between my emotions, I haven’t fully integrated them as internal advisors in the IFS sense. I’m not ready to adopt a worldview that might potentially hand over executive control to all the parts of me. Meta-me still deems some of my implicit desires as “foolish”, like the part of me that craves video games, for example. In order to avoid slippery slopes, I have a blanket precommitment on certain things in life.

For the meantime, I’m fine sticking with these precommitments. The modern world is filled with superstimuli, from milkshakes to insight porn (and the normal kind) to mobile games, that can hijack our well-meaning reward systems.

Lastly, I believe that without certain mental prerequisites, some ontologies can be actively harmful. Nate’s Replacing Guilt series can leave people without additional motivation for their actions; guilt can be a useful motivator. Similarly, Nihilism is another example of an ontology that can be crippling unless paired with ideas like humanism.

Lazy Evaluators:

In “In Defense of the Obvious”, I gave a practical argument as to why obvious advice was very good. I brought this point up several times during the workshop, and people seemed to like the point.

While that essay focused on listening to obvious advice, there appears to be a similar thing where merely asking someone, “Did you do all the obvious things?” will often uncover helpful solutions they have yet to do.

My current hypothesis for this (apart from “humans are programs that wrote themselves on computers made of meat”, which is a great workshop quote) is that people tend to be lazy evaluators. In programming, lazy evaluation is a way of putting off computing the value of an expression until the last minute, when the answer is actually needed.
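For readers who haven't met the programming term, here is a minimal sketch of lazy evaluation using a Python generator. The function names and the `log` list are hypothetical, chosen only to make visible when each "question" actually gets evaluated.

```python
from typing import Iterator, List


def think_about(log: List[str], question: str) -> str:
    """Stand-in for an effortful thought; records that it actually ran."""
    log.append(question)
    return f"answer to: {question}"


def lazy_questions(log: List[str]) -> Iterator[str]:
    """Generator: each question is evaluated only when a caller asks for it."""
    for question in (
        "What are multiple ways I could accomplish this?",
        "Do I actually want to do this thing?",
    ):
        yield think_about(log, question)


log: List[str] = []
questions = lazy_questions(log)  # creating the generator evaluates nothing
evaluated_before = len(log)      # still 0
first = next(questions)          # forces only the first question
evaluated_after = len(log)       # now 1; the second question never ran
```

If nothing ever calls `next`, no question is ever evaluated, which is the analogy being drawn: the thinking only happens when something forces it.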

It seems like something similar happens in people’s heads, where we simply don’t ask ourselves questions like “What are multiple ways I could accomplish this?” or “Do I actually want to do this thing?” until we need to…Except that most of the time, we never need to. Life putters on, whether or not we’re winning at it.

I think this is part of what makes “pair debugging”, a CFAR activity where a group of people try to help one person with their “bugs”, effective. When we have someone else taking an outside view asking us these questions, it may even be the first time we see these questions ourselves.

Therefore, it looks like a helpful skill is to constantly ask ourselves questions and cultivate a sense of curiosity about how things are. Anna Salamon refers to this skill as “boggling”. I think boggling can help with both counteracting lazy evaluation and actually doing obvious actions.

Looking at why obvious advice is obvious, like “What the heck does ‘obvious’ even mean?” can help break the immediate dismissive veneer our brain puts on obvious information.

EX: “If I want to learn more about coding, it probably makes sense to ask some coder friends what good resources are.”

“Nah, that’s so obvious; I should instead just stick to this abstruse book that basically no one’s heard of—wait, I just rejected something that felt obvious.”

“Huh…I wonder why that thought felt obvious…what does it even mean for something to be dubbed ‘obvious’?”

“Well…obvious thoughts seem to have a generally ‘self-evident’ tag on them. If they aren’t outright tautological or circularly defined, then there’s a sense where the obvious things seem to be the shortest paths to the goal. Like, I could fold my clothes or I could build a Rube Goldberg machine to fold my clothes. But the first option seems so much more ‘obvious’…”

“Aside from that, there also seems to be a sense where if I search my brain for ‘obvious’ things, I’m using a ‘faster’ mode of thinking (à la System 1). Besides favoring simpler solutions, it also seems to be influenced by social norms (what do people ‘typically’ do). And my ‘obvious action generator’ seems to also be built off my understanding of the world, like, I’m thinking about things in terms of causal chains that actually exist in the world. As in, when I’m thinking about ‘obvious’ ways to get a job, for instance, I’m thinking about actions I could take in the real world that might plausibly actually get me there…”

“Whoa…that means that obvious advice is so much more than some sort of self-evident tag. There’s a huge amount of information that’s being compressed when I look at it from the surface… ‘Obvious’ really means something like ‘that which my brain quickly dismisses because it is simple, complies with social norms, and/or runs off my internal model of how the universe works’.”

The goal is to reduce the sort of “acclimation” that happens with obvious advice by peering deeper into it. Ideally, if you’re boggling at your own actions, you can force yourself to evaluate earlier. Otherwise, it can hopefully at least make obvious advice more appealing.

I’ll end with a quote of mine from the workshop:

“You still yet fail to grasp the weight of the Obvious.”

Planning 101: Debiasing and Research

Murphyjitsu:

Conclusion:

80,000 Hours: EA and Highly Political Causes

this post is now crossposted to the EA forum

80,000 hours is a well known Effective Altruism organisation which does "in-depth research alongside academics at Oxford into how graduates can make the biggest difference possible with their careers".

They recently posted a guide to donating which aims, in their words, to (my emphasis)

use evidence and careful reasoning to work out how to best promote the wellbeing of all. To find the highest-impact charities this giving season ... We ... summed up the main recommendations by area below

Looking below, we find a section on the problem area of criminal justice (US-focused). An area where the aim is outlined as follows: (quoting from the Open Philanthropy "problem area" page)

investing in criminal justice policy and practice reforms to substantially reduce incarceration while maintaining public safety.

Reducing incarceration whilst maintaining public safety seems like a reasonable EA cause, if we interpret "public safety" in a broad sense - that is, keep fewer people in prison whilst still getting almost all of the benefits of incarceration such as deterrent effects, prevention of crime, etc.

So what are the recommended charities? (my emphasis below)

"The Alliance for Safety and Justice is a US organization that aims to reduce incarceration and racial disparities in incarceration in states across the country, and replace mass incarceration with new safety priorities that prioritize prevention and protect low-income communities of color."

They promote an article on their site called "black wounds matter", as well as how you can "Apply for VOCA Funding: A Toolkit for Organizations Working With Crime Survivors in Communities of Color and Other Underserved Communities"

2. Cosecha - (note that their url is www.lahuelga.com, which means "the strike" in Spanish) (my emphasis below)

"Cosecha is a group organizing undocumented immigrants in 50-60 cities around the country. Its goal is to build mass popular support for undocumented immigrants, in resistance to incarceration/detention, deportation, denigration of rights, and discrimination. The group has become especially active since the Presidential election, given the immediate threat of mass incarceration and deportation of millions of people."

Cosecha have a footprint in the news, for example this article:

They have the ultimate goal of launching massive civil resistance and non-cooperation to show this country it depends on us ...  if they wage a general strike of five to eight million workers for seven days, we think the economy of this country would not be able to sustain itself

The article quotes Carlos Saavedra, who is directly mentioned by Open Philanthropy's Chloe Cockburn:

Carlos Saavedra, who leads Cosecha, stands out as an organizer who is devoted to testing and improving his methods, ... Cosecha can do a lot of good to prevent mass deportations and incarceration, I think his work is a good fit for likely readers of this post."

They mention other charities elsewhere on their site and in their writeup on the subject, such as the conservative Center for Criminal Justice Reform, but Cosecha and the Alliance for Safety and Justice are the ones that were chosen as "highest impact" and featured in the guide to donating.

Sometimes one has to be blunt: 80,000 hours is promoting the financial support of some extremely hot-button political causes, which may not be a good idea. Traditionalists/conservatives and those who are uninitiated to Social Justice ideology might look at The Alliance for Safety and Justice and Cosecha and label them as racists and criminals, and thereby be turned off by Effective Altruism, or even by the rationality movement as a whole.

There are standard arguments, for example this one by Robin Hanson from 10 years ago, about why it is not smart or "effective" to get into these political tugs-of-war if one wants to make a genuine difference in the world.

One could also argue that 80,000 hours' charities go beyond the usual folly of political tugs-of-war. In addition to supporting extremely political causes, 80,000 hours could be accused of being somewhat intellectually dishonest about what the goal they are trying to further actually is.

Consider The Alliance for Safety and Justice. 80,000 Hours state that the goal of their work in the criminal justice problem area is to "substantially reduce incarceration while maintaining public safety". This is an abstract goal that has very broad appeal and one that I am sure almost everyone agrees to. But then their more concrete policy in this area is to fund a charity that wants to "reduce racial disparities in incarceration" and "protect low-income communities of color". The latter is significantly different to the former - it isn't even close to being the same thing - and the difference is highly political. One could object that reducing racial disparities in incarceration is merely a means to the end of substantially reducing incarceration while maintaining public safety, since many people in prison in the US are "of color". However this line of argument is a very politicized one and it might be wrong, or at least I don't see strong support for it. "Selectively release people of color and make society safer - endorsed by effective altruists!" struggles against known facts about recidivism rates across races, as well as an objection about the implicit conflation of equality of outcome and equality of opportunity (and I do not want this to be interpreted as a claim of moral superiority of one race over others - merely a necessary exercise in coming to terms with facts and debunking implicit assumptions). Men are incarcerated much more than women, so what about reducing gender disparities in incarceration, whilst also maintaining public safety? Again, this is all highly political, laden with politicized implicit assumptions and language.

Cosecha is worse! They are actively planning potentially illegal activities like helping illegal immigrants evade the law (though IANAL), as well as activities which potentially harm the majority of US citizens such as a seven day nationwide strike whose intent is to damage the economy. Their URL is "The Strike" in Spanish.

Again, the abstract goal is extremely attractive to almost anyone, but the concrete implementation is highly divisive. If some conservative altruist signed up to financially or morally support the abstract goal of "substantially reducing incarceration while maintaining public safety" and EA organisations that are pursuing that goal without reading the details, and then at a later point they saw the details of Cosecha and The Alliance for Safety and Justice, they would rightly feel cheated. And to the objection that conservative altruists should read the description rather than just the heading - what are we doing writing headings so misleading that you'd feel cheated if you relied on them as summaries of the activity they are meant to summarize?

One possibility would be for 80,000 hours to be much more upfront about what they are trying to achieve here - maybe they like left-wing social justice causes, and want to help like-minded people donate money to such causes and help the particular groups who are favored in those circles. There's almost a nod and a wink to this when Chloe Cockburn says (my paraphrase of Saavedra, and emphasis, below)

I think his [A man who wants to lead a general strike of five to eight million workers for seven days so that the economy of the USA would not be able to sustain itself, in order to help illegal immigrants] work is a good fit for likely readers of this post

Alternatively, they could try to reinvigorate the idea that their "criminal justice" problem area is politically neutral and beneficial to everyone; the Open Philanthropy issue writeup talks about "conservative interest in what has traditionally been a solely liberal cause" after all. I would advise considering dropping The Alliance for Safety and Justice and Cosecha if they intend to do this. There may not be politically neutral charities in this area, or there may not be enough high quality conservative charities to present a politically balanced set of recommendations. Setting up a growing donor advised fund or a prize for nonpartisan progress that genuinely intends to benefit everyone including conservatives, people opposed to illegal immigration and people who are not "of color" might be an option to consider.

We could examine 80,000 hours' choice to back these organisations from a more overall-utilitarian/overall-effectiveness point of view, rather than limiting the analysis to the specific problem area. These two charities don't pass the smell test for altruistic consequentialism, pulling sideways on ropes, finding hidden levers that others are ignoring, etc. Is the best thing you can do with your smart EA money helping a charity that wants to get stuck into the culture war about which skin color is most over-represented in prisons? What about a second charity that wants to help people illegally immigrate at a time when immigration is the most divisive political topic in the western world?

Furthermore, Cosecha's plans for a nationwide strike and potential civil disobedience/showdown with Trump & co could push an already volatile situation in the US into something extremely ugly. The vast majority of people in the world (present and future) are not the specific group that Cosecha aims to help, but the set of people who could be harmed by the uglier versions of a violent and calamitous showdown in the US is basically the whole world. That means that even if P(Cosecha persuades Trump to do a U-turn on illegals) is 10 or 100 times greater than P(Cosecha precipitates a violent crisis in the USA), they may still be net-negative from an expected utility point of view. EA doesn't usually fund causes whose outcome distribution is heavily left-skewed so this argument is a bit unusual to have to make, but there it is.
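The shape of that expected-utility claim can be checked with a few lines of arithmetic. This is only a sketch: every number below is made up purely to show the structure of the argument and is not an estimate of any real probability or payoff. The function and variable names are likewise hypothetical.

```python
def expected_net_utility(p_success, benefit, p_crisis, harm):
    """Expected benefit minus expected harm, in arbitrary utility units."""
    return p_success * benefit - p_crisis * harm


# Made-up inputs: success is 100 times more likely than the bad outcome.
p_crisis = 0.001
p_success = 100 * p_crisis

benefit = 1.0  # benefit if the campaign succeeds (affects a smaller group)

# Two hypothetical harm sizes, either side of the break-even point.
net_when_harm_small = expected_net_utility(p_success, benefit, p_crisis, harm=50.0)
net_when_harm_large = expected_net_utility(p_success, benefit, p_crisis, harm=500.0)
```

With a 100-to-1 probability ratio, the sign flips once the harm is more than 100 times the benefit. So the conclusion depends entirely on how large one thinks the downside is relative to the upside, which the code cannot tell you.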

Not only is Cosecha a cause that is (a) mind-killing and culture war-ish and (b) very tangentially related to the actual problem area it is advertised under by 80,000 hours, but it might also (c) be an anti-charity that produces net disutility (in expectation), in the form of a higher probability of a US civil war, with the money that you donate to it.

Back on the topic of criminal justice and incarceration: opposition to reform often comes from conservative voters and politicians, so it might seem unlikely to a careful thinker that extra money on the left-wing side is going to be highly effective. Some intellectual judo is required; make conservatives think that it was their idea all along. So promoting the Center for Criminal Justice Reform sounds like the kind of smart, against-the-grain idea that might be highly effective! Well done, Open Philanthropy! Also in favor of this org: they don't copiously mention which races or person-categories they think are most important in their articles about criminal justice reform, the only culture war item I could find on them is the word "conservative" (and given the intellectual judo argument above, this counts as a plus), and they're not planning a national strike or other action with a heavy tail risk. But that's the one that didn't make the cut for the 80,000 hours guide to donating!

The fact that they let Cosecha (and to a lesser extent The Alliance for Safety and Justice) through reduces my confidence in 80,000 hours and the EA movement as a whole. Who thought it would be a good idea to get EA into the culture war with these causes, and also thought that they were plausibly among the most effective things you can do with money? Are they taking effectiveness seriously? What does the political diversity of meetings at 80,000 hours look like? Were there no conservative altruists present in discussions surrounding The Alliance for Safety and Justice and Cosecha, and the promotion of them as "beneficial for everyone" and "effective"?

Before we finish, I want to emphasize that this post is not intended to start an object-level discussion about which race, gender, political movement or sexual orientation is cooler, and I would encourage moderators to temp-ban people who try to have that kind of argument in the comments of this post.

I also want to emphasize that criticism of professional altruists is a necessary evil; in an ideal world the only thing I would ever want to say to people who dedicate their lives to helping others (Chloe Cockburn in particular, since I mentioned her name above)  is "thank you, you're amazing". Other than that, comments and criticism are welcome, especially anything pointing out any inaccuracies or misunderstandings in this post. Comments from anyone involved in 80,000 hours or Open Philanthropy are welcome.

First impressions...

7 24 January 2017 03:14PM

... of LW: a while ago, a former boss and friend of mine said that rationality is irrational because you never have sufficient computational power to evaluate everything rationally. I thought he was missing the point - but after two posts on LW, I am inclined to agree with him.

It's kind of funny - every post gets broken down into its tiniest constituents, and these get overanalysed and then people go on tangents only marginally relevant to the intent of the original article.

This would be fine if the original questions of the post were answered; but when I asked for metrics to evaluate a presidency, few people actually provided any - most started debating the validity of metrics, and one subthread went off to discuss the appropriateness of the term "gender equality".

I am new here, and I don't want to be overly critical of a culture I do not yet understand. But I just want to point out - rationality is a great tool to solve problems; if it becomes overly abstract, it kind of misses its point I think.

Instrumental Rationality: Overriding Defaults

2 20 January 2017 05:14AM

[I'd previously posted this essay as a link. From now on, I'll be cross-posting blog posts here instead of linking them, to keep the discussions LW central. This is the first in an in-progress sequence of articles that'll focus on identifying instrumental rationality techniques and cataloging my attempt to integrate them into my life with examples and insight from habit research.]

[Epistemic Status: Pretty sure. The stuff on habits being situation-response links seems fairly robust. I'll be writing something later with the actual research. I'm basically just retooling existing theory into an optimizational framework for improving life.]

I’m interested in how rationality can help us make better decisions.

Many of these decisions seem to involve split-second choices where it’s hard to sit down and search a handbook for the relevant bits of information—you want to quickly react in the correct way, else the moment passes and you’ve lost. On a very general level, it seems to be about reacting in the right way once the situation provides a cue.

Consider these situation-reaction pairs:

• You are having an argument with someone. As you begin to notice the signs of yourself getting heated, you remember to calm down and talk civilly. Maybe also some deep breaths.
• You are giving yourself a deadline or making a schedule for a task, and you write down the time you expect to finish. Quickly, though, you remember to actually check if it took you that long last time, and you adjust accordingly.
• You feel yourself slipping towards doing something some part of you doesn’t want to do. Say you are reneging on a previous commitment. As you give in to temptation, you remember to pause and really let the two sides of yourself communicate.
• You think about doing something, but you feel aversive / flinch-y to it. As you shy away from the mental pain, rather than just quickly thinking about something else, you also feel curious as to why you feel that way. You query your brain and try to pick apart the “ugh” feeling.

Two things seem key to the above scenarios:

One, each situation above involves taking an action that is different from our keyed-in defaults.

Two, the situation-reaction pair paradigm is pretty much CFAR’s Trigger Action Plan (TAP) model, paired with a multi-step plan.

Also, knowing about biases isn’t enough to make good decisions. Even memorizing a mantra like “Notice signs of aversion and query them!” probably isn’t going to be clear enough to be translated into something actionable. It sounds nice enough on the conceptual level, but when, in the moment, you remember such a mantra, you still need to figure out how to “notice signs of aversion and query them”.

What we want is a series of explicit steps that turn the abstract mantra into small, actionable steps. Then, we want to quickly deploy the steps at the first sign of the situation we’re looking out for, like a new cached response.
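One way to picture "a cached response" is as a plain lookup table from a trigger to a short list of steps. The sketch below is only an illustration of that idea: the trigger names and the wording of the steps are my own paraphrases of the scenarios above, not an official CFAR format.

```python
# Hypothetical trigger names and steps, paraphrased from the scenarios above.
TRIGGER_ACTION_PLANS = {
    "notice_aversion": [
        "Pause instead of switching to another thought.",
        "Ask: what exactly feels bad about this?",
        "Name the 'ugh' feeling in one sentence.",
    ],
    "writing_down_deadline": [
        "Recall how long the task took last time.",
        "Adjust the estimate toward that figure.",
    ],
}


def respond(trigger):
    """Return the cached steps for a trigger, or an empty list if none is stored."""
    return TRIGGER_ACTION_PLANS.get(trigger, [])


steps = respond("notice_aversion")
```

The lookup itself is trivial. The hard parts, which the code does not capture, are noticing the trigger in the moment and having rehearsed the steps enough that they fire without deliberation.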

This looks like a problem that a combination of focused habit-building and a breakdown of the 5-second level can help solve.

In short, the goal looks to be to combine triggers with clear algorithms to quickly optimize in the moment. Reference class information from habit studies can also help give good estimates on how long the whole process will take to internalize (on average 66 days, according to Lally et al.).

But these Trigger Action Plan-type plans don’t seem to directly cover the willpower related problems with akrasia.

Sure, TAPs can help alert you to the presence of an internal problem, like in the above example where you notice aversion. And the actual internal conversation can probably be operationalized to some extent, like how CFAR has described the process of Double Crux.

But most of the Overriding Default Habit actions seem to be ones I’d be happy to do anytime—I just need a reminder—whereas akrasia-related problems are centrally related to me trying to debug my motivational system. For that reason, I think it helps to separate the two. Also, it makes the outside-seeming TAP algorithms complementary, rather than at odds, with the inside-seeming internal debugging techniques.

Loosely speaking, then, I think it still makes quite a bit of sense to divide the things rationality helps with into two categories:

• Overriding Default Habits:

These are the situation-reaction pairs I’ve covered above. Here, you’re substituting a modified action instead of your “default action”. But the cue serves as mainly a reminder/trigger. It’s less about diagnosing internal disagreement.

• Akrasia / Willpower Problems:

Here we’re talking about problems that might require you to precommit (although precommitment might not be all you need to do), perhaps because of decision instability. The “action-intention gap” caused by akrasia, where you (sort of) want to do something but you also don’t want to, goes in here too.

Still, it’s easy to point to lots of other things that fall in the bounds of rationality that my approach doesn’t cover: epistemology, meta-levels, VNM rationality, and many other concepts are conspicuously absent. Part of this is because I’ve been focusing on instrumental rationality, while a lot of those ideas are more in the epistemic camp.

Ideas like meta-levels do seem to have some place in informing other ideas and skills. Even as declarative knowledge, they do chain together in a way that results in useful real world heuristics.  Meta-levels, for example, can help you keep track of the ultimate direction in a conversation. Then, it can help you table conversations that don’t seem immediately useful/relevant and not get sucked into the object-level discussion.

At some point, useful information about how the world works should actually help you make better decisions in the real world. For an especially pragmatic approach, it may be useful to ask yourself, each time you learn something new, “What do I see myself doing as a result of learning this information?”

There’s definitely more to mine from the related fields of learning theory, habits, and debiasing, but I think I’ll have more than enough skills to practice if I just focus on the immediately practical ones.

16 12 January 2017 09:26PM

1 12 January 2017 07:26PM

0 11 January 2017 07:07PM

9 08 January 2017 10:36AM

Why you should be very careful about trying to openly seek truth in any political discussion

1. Rationality considered harmful for Scott Aaronson in the great gender debate

In 2015, complexity theorist and rationalist Scott Aaronson was foolhardy enough to step into the Gender Politics war on his blog, with a comment stating that the extreme feminism he had bought into made him hate himself and seek ways to chemically castrate himself. The feminist blogosphere got hold of this and crucified him for it, and he has written a few followup blog posts about it. Recently I saw this comment by him on his blog:

As the comment 171 affair blew up last year, one of my female colleagues in quantum computing remarked to me that the real issue had nothing to do with gender politics; it was really just about the commitment to truth regardless of the social costs—a quality that many of the people attacking me (who were overwhelmingly from outside the hard sciences) had perhaps never encountered before in their lives. That remark cheered me more than anything else at the time

2. Rationality considered harmful for Sam Harris in the islamophobia war

I recently heard a very angry, exasperated 2-hour podcast by the new atheist and political commentator Sam Harris about how badly he has been straw-manned, misrepresented and trash-talked by his intellectual rivals (whom he collectively refers to as the "regressive left"). Sam Harris likes to tackle hard questions such as when torture is justified, which religions are more or less harmful than others, defence of freedom of speech, etc. Several times, Harris goes to the meta-level and sees clearly what is happening:

Rather than a searching and beautiful exercise in human reason to have conversations on these topics [ethics of torture, military intervention, Islam, etc], people are making it just politically so toxic, reputationally so toxic to even raise these issues that smart people, smarter than me, are smart enough not to go near these topics

Everyone on the left at the moment seems to be a mind reader.. no matter how much you try to take their foot out of your mouth, the mere effort itself is going to be counted against you - you're someone who's in denial, or you don't even understand how racist you are, etc

3. Rationality considered harmful when talking to your left-wing friends about genetic modification

In the SlateStarCodex comments I posted complaining that many left-wing people were responding very personally (and negatively) to my political views.

One long term friend openly and pointedly asked whether we should still be friends over the subject of eugenics and genetic engineering, for example altering the human germ-line via genetic engineering to permanently cure a genetic disease. This friend responded to a rational argument about why some modifications of the human germ line may in fact be a good thing by saying that "(s)he was beginning to wonder whether we should still be friends".

A large comment thread ensued, but the best comment I got was this one:

One of the useful things I have found when confused by something my brain does is to ask what it is *for*. For example: I get angry, the anger is counterproductive, but recognizing that doesn’t make it go away. What is anger *for*? Maybe it is to cause me to plausibly signal violence by making my body ready for violence or some such.

Similarly, when I ask myself what moral/political discourse among friends is *for* I get back something like “signal what sort of ally you would be/broadcast what sort of people you want to ally with.” This makes disagreements more sensible. They are trying to signal things about distribution of resources, I am trying to signal things about truth value, others are trying to signal things about what the tribe should hold sacred etc. Feeling strong emotions is just a way of signaling strong precommitments to these positions (i.e. I will follow the morality I am signaling now because I will be wracked by guilt if I do not. I am a reliable/predictable ally.) They aren’t mad at your positions. They are mad that you are signaling that you would defect when push came to shove about things they think are important.

Let me repeat that last one: moral/political discourse among friends is for “signalling what sort of ally you would be/broadcast what sort of people you want to ally with”. Moral/political discourse probably activates specially evolved brainware in human beings; that brainware has a purpose and it isn't truthseeking. Politics is not about policy.

4. Takeaways

This post is already getting too long so I deleted the section on lessons to be learned, but if there is interest I'll do a followup. Let me know what you think in the comments!

[Link] Applied Rationality Exercises

5 06 January 2017 06:50AM

[I first posted this as a link to my blog post, but I'm reposting as a focused article here that trims some fat of the original post, which was less accessible]

I think a lot about heuristics and biases, and I admit that many of my ideas on rationality and debiasing get lost in the sea of my own thoughts.  They’re accessible, if I’m specifically thinking about rationality-esque things, but often invisible otherwise.

That seems highly sub-optimal, considering that the whole point of having usable mental models isn’t to write fancy posts about them, but to, you know, actually use them.

To that end, I’ve been thinking about finding some sort of systematic way to integrate all of these ideas into my actual life.

(If you’re curious, here’s the actual picture of what my internal “concept-verse” (w/ associated LW and CFAR memes) looks like)

So I have all of these ideas, all of which look really great on paper and in thought experiments.  Some of them even have some sort of experimental backing.  Given this, how do I put them together into a kind of coherent notion?

Equivalently, what does it look like if I successfully implement these mental models?  What sorts of changes might I expect to see?  Then, knowing the end product, what kind of process can get me there?

One way of looking at it would be to say that if I implemented techniques well, then I’d be better able to tackle my goals and get things done.  Maybe my productivity would go up.  That sort of makes sense.  But this tells us nothing about how I’d actually go about using such skills.

We want to know how to implement these skills and then actually utilize them.

Yudkowsky gives a highly useful abstraction when he talks about the five-second level.  He gives some great tips on breaking down mental techniques into their component mental motions.  It’s a step-by-step approach that really goes into the details of what it feels like to undergo one of the LessWrong epistemological techniques.  We’d like our mental techniques to be actual heuristics that we can use in the moment, so having an in-depth breakdown makes sense.

Here’s my attempt at a 5-second-level breakdown for Going Meta, or "popping" out of one's head to stay mindful of the moment:

1. Notice the feeling that you are being mentally “dragged” towards continuing an action.
   1. (It can feel like an urge, or your mind automatically making a plan to do something.  Notice your brain simulating you taking an action without much conscious input.)
2. Remember that you have a 5-second-level series of steps to do something about it.
3. Feel aversive towards continuing the loop.  Mentally shudder at the part of you that tries to continue.
4. Close your eyes.  Take in a breath.
5. Think about what 1-second action you could take to instantly cut off the stimulus from whatever loop you’re stuck in. (EX: Turning off the display, closing the window, moving to somewhere else).
6. Tense your muscles and clench, actually doing said action.
7. Run a search through your head, looking for an action labeled “productive”.  Try to remember things you’ve told yourself you “should probably do” lately.
   1. (If you can’t find anything, pattern-match to find something that seems “productive-ish”.)
8. Take note of what time it is.  Write it down.
9. Do the new thing.  Finish.
10. Note the end time.  Calculate how long you did work.
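Steps 8 to 10 amount to simple bookkeeping. Here is a minimal sketch, with made-up start and end times standing in for whatever you actually wrote down; the function name is hypothetical.

```python
from datetime import datetime


def minutes_worked(start, end):
    """Length of a work session in minutes, given its start and end times."""
    return (end - start).total_seconds() / 60


# Example timestamps only; in practice, use the times you noted in steps 8 and 10.
start = datetime(2017, 1, 6, 14, 5)
end = datetime(2017, 1, 6, 14, 50)

session_minutes = minutes_worked(start, end)
```

Keeping these numbers around is also what later lets you check "did it really take that long last time?" when making a new estimate.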

Next, the other part is actually accessing the heuristic in the situations where you want it.  We want it to be habitual.

After doing some quick searches on the existing research on habits, it appears that many of the links go to Charles Duhigg, author of The Power of Habit, or B J Fogg of Tiny Habits. Both models focus on two things: Identifying the Thing you want to do.  Then setting triggers so you actually do It.  (There’s some similarity to CFAR’s Trigger Action Plans.)

B J’s approach focuses on scaffolding new habits into existing routines, like brushing your teeth, which are already automatic.  Duhigg appears to be focused more on reinforcement and rewards, with several nods to Skinner.  CFAR views actions as self-reinforcing, so the reward isn’t even necessary— they see repetition as building automation.

Overlearning the material also seems to be useful in some contexts, for skills like acquiring procedural knowledge.  And mental motions do seem to be more like procedural knowledge.

For these mental skills specifically, we’d want them to go off irrespective of time, so anchoring them to an existing routine might not be best.  Having them fire in response to an internal state (EX: “When I notice myself being ‘dragged’ into a spiral, or automatically making plans to do a thing”) may be more useful.

(Follow-up post forthcoming on concretely trying to apply habit research to implementing heuristics.)

[Link] Rationality 101 (An Intro Post to the Rationalist-Sphere for Friends: Kahneman, LW, etc.)

[Link] Ozy's Thoughts on CFAR's Mission Statement

[Link] Take the Rationality Test to determine your rational thinking style

Measuring the Sanity Waterline

4 06 December 2016 08:38PM

I've always appreciated the motto, "Raising the sanity waterline." Intentionally raising the ambient level of rationality in our civilization strikes me as a very inspiring and important goal.

It occurred to me some time ago that the "sanity waterline" could be more than just a metaphor, that it could be quantified. What gets measured gets managed. If we have metrics to aim at, we can talk concretely about strategies to effectively promulgate rationality by improving those metrics. A "rationality intervention" that effectively improves a targeted metric can be said to be effective.

It is relatively easy to concoct or discover second-order metrics. You would expect a variety of metrics to respond to the state of ambient sanity. For example, I would expect that, all things being equal, preventable deaths should decrease when overall sanity increases, because a sane society acts to effectively prevent the kinds of things that lead to preventable deaths. But of course other factors may also cause these contingent measures to fluctuate whichever way, so it's important to remember that these are only indirect measures of sanity.

The UN collects a lot of different types of data. Perusing their database, it becomes obvious that there are a lot of things that are probably worth caring about but which have only a very indirect relationship with what we could call "sanity". For example, one imagines that GDP would increase under conditions of high sanity, but that'd be a pretty noisy measure.

Take five minutes to think about how one might measure global sanity, and maybe brainstorm some potential metrics. Part of the prompt, of course, is to consider what we could mean by "sanity" in the first place.

~~~ THINK ABOUT THE PROBLEM FOR FIVE MINUTES ~~~

This is my first pass at brainstorming metrics which may more-or-less directly indicate the level of civilizational sanity:

• (+) Literacy rate
• (+) Enrollment rates in primary/secondary/tertiary education
• (-) Deaths due to preventable disease
• (-) QALYs lost due to preventable causes
• (+) Median level of awareness about world events
• (-) Religiosity rate
• (-) Fundamentalist religiosity rate
• (-) Per-capita spending on medical treatments that have not been proven to work
• (-) Per-capita spending on medical treatments that have been proven not to work
• (-) Adolescent fertility rate
• (+) Human development index
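
As a rough illustration of how the (+)/(-) signs above could be used, here is a minimal sketch that folds a few signed metrics into a single score. Everything in it is an assumption made for illustration: the metric names, the idea that each metric has already been scaled to the range 0 to 1, the equal weighting, and the example values are all hypothetical and not drawn from any real dataset.

```python
# Sign convention borrowed from the list above:
# +1 means "higher is saner", -1 means "lower is saner".
# The metric names and all values used here are hypothetical.
SIGNS = {
    "literacy_rate": +1,
    "preventable_disease_deaths": -1,
    "adolescent_fertility_rate": -1,
}


def sanity_index(normalized):
    """Equal-weight average of signed metrics, each pre-scaled to [0, 1]."""
    total = 0.0
    for name, sign in SIGNS.items():
        value = normalized[name]
        # Flip "(-)" metrics so that 1.0 always means "saner".
        total += value if sign > 0 else 1.0 - value
    return total / len(SIGNS)


# Made-up inputs, purely to show the call shape.
example = {
    "literacy_rate": 0.9,
    "preventable_disease_deaths": 0.2,
    "adolescent_fertility_rate": 0.1,
}
print(round(sanity_index(example), 3))
```

Equal weighting is the simplest possible choice; any real attempt would have to argue for its weights and for how each metric is normalized, and the result would still be only an indirect measure.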

It's potentially more productive (and probably more practically difficult) to talk concretely about how best to improve one or two of these metrics via specific rationality interventions, than it is to talk about popularizing abstract rationality concepts.

Sidebar: The CFAR approach may yield something like "trickle down rationality", where the top 0.0000001% of rational people are selected and taught to be even more rational, and maybe eventually good thinking habits will infect everybody in the world from the top down. But I wouldn't bet on that being the most efficient path to raising the global sanity waterline.

As to the question of the meaning of "sanity", it seems to me that this indicates a certain basic package of rationality.

In Eliezer's original post on the topic, he seems to suggest a platform that boils down to a comprehensive embrace of probability-based reasoning and reductionism, with enough caveats and asterisks applied to that summary that you might as well go back and read his original post to get his full point. The idea was that with a high enough sanity waterline, obvious irrationalities like religion would eventually "go underwater" and cease to be viable. I see no problem with any of the "curricula" Eliezer lists in his post.

It has become popular within the rationalsphere to push back against reductionism, positivism, Bayesianism, etc. While such critiques of "extreme rationality" have an important place in the discourse, I think for the sake of this discussion, we should remember that the median human being really would benefit from more rationality in their thinking, and that human societies would benefit from having more rational citizens. Maybe we can all agree on that, even if we continue to disagree on, e.g., the finer points of positivism.

"Sanity" shouldn't require dogmatic adherence to a particular description of rationality, but it must include at least a basic inoculation of rationality to be worthy of the name. The type of sanity that I would advocate for promoting is this more "basic" kind, where religion ends up underwater, but people are still socially allowed to be contrarian in certain regards. After all, a sane society is aware of the power of conformity, and should actively promote some level of contrarianism within its population to promote a diversity of ideas and therefore avoid letting itself become stuck on local maxima.

Epistemic Effort

Epistemic Effort: Thought seriously for 5 minutes about it. Thought a bit about how to test it empirically. Spelled out my model a little bit. I'm >80% confident this is worth trying and seeing what happens. Spent 45 min writing post.

I've been pleased to see "Epistemic Status" hit a critical mass of adoption - I think it's a good habit for us to have. In addition to letting you know how seriously to take an individual post, it sends a signal about what sort of discussion you want to have, and helps remind other people to think about their own thinking.

I have a suggestion for an evolution of it - "Epistemic Effort" instead of status. Instead of "how confident you are", it's more of a measure of "what steps did you actually take to make sure this was accurate?" with some examples including:

• Thought about it musingly
• Made a 5 minute timer and thought seriously about possible flaws or refinements
• Had a conversation with other people you epistemically respect and who helped refine it
• Thought about how to do an empirical test
• Thought about how to build a model that would let you make predictions about the thing
• Did some kind of empirical test
• Did a review of relevant literature
• Ran a Randomized Control Trial
[Edit: the intention with these examples is for it to start with things that are fairly easy to do to get people in the habit of thinking about how to think better, but to have it quickly escalate to "empirical tests, hard to fake evidence and exposure to falsifiability"]

A few reasons I think this (most of these reasons are "things that seem likely to me" but which I haven't made any formal effort to test - they come from some background in game design and reading some books on habit formation, most of which weren't very well cited):
• People are more likely to put effort into being rational if there's a relatively straightforward, understandable path to do so
• People are more likely to put effort into being rational if they see other people doing it
• People are more likely to put effort into being rational if they are rewarded (socially or otherwise) for doing so.
• It's not obvious that people will get _especially_ socially rewarded for doing something like "Epistemic Effort" (or "Epistemic Status") but there are mild social rewards just for doing something you see other people doing, and a mild personal reward simply for doing something you believe to be virtuous (I wanted to say "dopamine" reward but then realized I honestly don't know if that's the mechanism, but "small internal brain happy feeling")
• Less Wrong etc is a more valuable project if more people involved are putting more effort into thinking and communicating "rationally" (i.e. making an effort to make sure their beliefs align with the truth, and making sure to communicate so other people's beliefs align with the truth)
• People range in their ability / time to put a lot of epistemic effort into things, but if there are easily achievable, well established "low end" efforts that are easy to remember and do, this reduces the barrier for newcomers to start building good habits. Having a nice range of recommended actions can provide a pseudo-gamified structure where there's always another slightly harder step available to you.
• In the process of writing this very post, I actually went from planning a quick, 2 paragraph post to the current version, when I realized I should really eat my own dogfood and make a minimal effort to increase my epistemic effort here. I didn't have that much time so I did a couple simpler techniques. But even that I think provided a lot of value.
Results of thinking about it for 5 minutes.

• It occurred to me that explicitly demonstrating the results of putting epistemic effort into something might be motivational both for me and for anyone else thinking about doing this, hence this entire section. (This is sort of stream of conscious-y because I didn't want to force myself to do so much that I ended up going 'ugh I don't have time for this right now I'll do it later.')
• One failure mode is that people end up putting minimal, token effort into things (e.g. randomly trying something on a couple of double-blinded people and calling it a Randomized Control Trial).
• Another is that people might end up defaulting to whatever the "common" sample efforts are, instead of thinking more creatively about how to refine their ideas. I think the benefit of providing a clear path to people who weren't thinking about this at all outweighs the cost of people who might end up being less agenty about their epistemology, but it seems like something to be aware of.
• I don't think it's worth the effort to run a "serious" empirical test of this, but I do think it'd be worth the effort, if a number of people started doing this on their posts, to run a followup informal survey asking "Did you do this? Did it work out for you? Do you have feedback?"
• A neat nice-to-have, if people actually started adopting this and it proved useful, might be for it to automatically appear at the top of new posts, along with a link to a wiki entry that explained what the deal was.

Next actions, if you found this post persuasive:

Next time you're writing any kind of post intended to communicate an idea (whether on Less Wrong, Tumblr or Facebook), try adding "Epistemic Effort: " to the beginning of it. If it was intended to be a quick, lightweight post, just write it in its quick, lightweight form.

After the quick, lightweight post is complete, think about whether it'd be worth doing something as simple as "set a 5 minute timer and think about how to refine/refute the idea". If not, just write "thought about it musingly" after "Epistemic Effort: ". If so, start thinking about it more seriously and see where it leads.

While thinking about it for 5 minutes, some questions worth asking yourself:
• If this were wrong, how would I know?
• What actually led me to believe this was a good idea? Can I spell that out? In how much detail?
• Where might I check to see if this idea has already been tried/discussed?
• What pieces of the idea might you peel away or refine to make the idea stronger? Are there individual premises you might be wrong about? Do they invalidate the idea? Does removing them lead to a different idea?
