## The fairness of the Sleeping Beauty

0 07 July 2015 08:25AM

This post attempts (yet another) analysis of the Sleeping Beauty problem, in terms of Jaynes' framework of "probability as extended logic" (aka objective Bayesianism).

TL;DR: The Sleeping Beauty problem reduces to interpreting the sentence “a fair coin is tossed”: it can mean either that neither result of the toss is favoured, or that the coin toss is not influenced by anthropic information, but not both at the same time. Fairness is a property in the mind of the observer that must be further clarified: the two meanings cannot be confused.

What I hope to show is that the two standard solutions, 1/3 and 1/2 (the 'thirder' and the 'halfer' solutions), are both consistent and correct, and the confusion lies only in the incorrect specification of the sentence "a fair coin is tossed".

The setup is given both in the LessWrong wiki and in Wikipedia, so I will not repeat it here.

I'm going to symbolize the events in the following way:

- It's Monday = Mon
- It's Tuesday = Tue
- The coin landed head = H
- The coin landed tail = T
- statement "A and B" = A & B
- statement "not A" = ~A

1)    H = ~T (the coin can land only on head or tail)

2)    Mon = ~Tue (if it's Tuesday, it cannot be Monday, and vice versa)

The setup also fixes two conditional probabilities:

3)    P(Mon|H) = 1 (upon learning that the coin landed head, the sleeping beauty knows that it’s Monday)

4)    P(T|Tue) = 1 (upon learning that it’s Tuesday, the sleeping beauty knows that the coin landed tail)

Using the indifference principle, we can also derive another equation.

Let's say that the Sleeping Beauty is awakened and told that the coin landed tail, but nothing else. Since she has no information useful to distinguish between Monday and Tuesday, she should assign both events equal probability. That is:

5)    P(Mon|T) = P(Tue|T)

Which gives

6)    P(Mon & T) = P(Mon|T)P(T) = P(Tue|T)P(T) = P(Tue & T)

It's here that the "thirder" and "halfer" analyses start to diverge.

The Wikipedia article says "Guided by the objective chance of heads landing being equal to the chance of tails landing, it should therefore hold that". We know however that there's no such thing as 'the objective chance'.

Thus, "a fair coin will be tossed", in this context, will mean different things for different people.

The thirders interpret the sentence to mean that beauty learns no new facts about the coin upon learning that it is Monday.

They thus make the assumption:

(TA) P(T|Mon) = P(H|Mon)

So:

7)    P(Mon & H) = P(H|Mon)P(Mon) = P(T|Mon)P(Mon) = P(Mon & T)

From 6) and 7) we have:

8)    P(Mon & H) = P(Mon & T) = P(Tue & T)

And since those three events are mutually exclusive and exhaustive, their probabilities sum to 1, so P(Mon & H) = 1/3.

And indeed from 8) and 3):

9)    1/3 =  P(Mon & H) = P(Mon|H)P(H) = P(H)

So that, under TA, P(H) = 1/3 and P(T) = 2/3.

Notice also that, since on Monday the coin landed either head or tail and TA makes the two equally likely, P(H|Mon) = 1/2.
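The thirder numbers can be checked mechanically. The sketch below is an illustration, not part of the original argument: it assumes the three awakenings (Mon & H, Mon & T, Tue & T) are the only possibilities and, following 8), gives them equal weight. The names `joint` and `prob` are made up for this example.

```python
from fractions import Fraction

# The three possible (day, coin) situations; (Tue, H) never occurs.
events = [("Mon", "H"), ("Mon", "T"), ("Tue", "T")]

# Equation 8): all three joint probabilities are equal, and they sum to 1.
joint = {e: Fraction(1, len(events)) for e in events}


def prob(pred):
    """Total probability of the situations satisfying the predicate."""
    return sum(p for e, p in joint.items() if pred(e))


p_h = prob(lambda e: e[1] == "H")
p_t = prob(lambda e: e[1] == "T")
p_mon = prob(lambda e: e[0] == "Mon")
p_h_given_mon = joint[("Mon", "H")] / p_mon

print(p_h, p_t, p_h_given_mon)  # prints: 1/3 2/3 1/2
```

Exact fractions are used so that the comparison with 1/3 and 1/2 is not blurred by floating-point rounding.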

The thirder analysis of the Sleeping Beauty problem is thus one in which "a fair coin is tossed" means "Sleeping Beauty receives no information about the coin from anthropic information".

There is however another way to interpret the sentence, that is the halfer analysis:

(HA) P(T) = P(H)

Here, a fair coin is tossed means simply that we assign no preference to either side of the coin.

Obviously from 1:

10)  P(T) + P(H) = 1

So that, from 10) and HA)

11) P(H) = 1/2, P(T) = 1/2

But let’s not stop here, let’s calculate P(H|Mon).

First of all, from 3) and 11)

12) P(H & Mon) = P(H|Mon)P(Mon) = P(Mon|H)P(H) = 1/2

From 5) and 11) also

13) P(Mon & T) = 1/4

But from 12) and 13) we get

14) P(Mon) = P(Mon & T) + P(Mon & H) = 1/4 + 1/2 = 3/4

So that, from 12) and 14)

15) P(H|Mon) = P(H & Mon) / P(Mon) = (1/2) / (3/4) = 2/3
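The halfer numbers can be rebuilt the same way. This is only a sketch under the assumptions already stated: HA fixes P(H) = 1/2, relation 3) puts all of the heads probability on Monday, and relation 5) splits the tails probability evenly between the two days. The variable names are illustrative.

```python
from fractions import Fraction

p_h = Fraction(1, 2)  # HA together with 10)
p_t = 1 - p_h

joint = {
    ("Mon", "H"): p_h,      # 3): P(Mon|H) = 1
    ("Mon", "T"): p_t / 2,  # 5): P(Mon|T) = P(Tue|T)
    ("Tue", "T"): p_t / 2,
}

p_mon = joint[("Mon", "H")] + joint[("Mon", "T")]
p_h_given_mon = joint[("Mon", "H")] / p_mon

print(p_mon, p_h_given_mon)  # prints: 3/4 2/3
```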

We have seen that either P(H) = 1/2 and P(H|Mon) = 2/3 (under HA), or P(H) = 1/3 and P(H|Mon) = 1/2 (under TA).

Nick Bostrom is correct in saying that self-locating information changes the probability distribution, but this is true in both interpretations.

The Sleeping Beauty problem reduces to interpreting the sentence “a fair coin is tossed”: it can mean either that neither result of the toss is favoured, or that the coin toss is not influenced by anthropic information. That is, you can attribute the fairness of the coin to the prior or to the posterior distribution.

Either P(H)=P(T) or P(H|Mon)=P(T|Mon), but both at the same time is not possible.

If probability were a physical property of the coin, then so would be its fairness. But since the causal interactions of the coin possess both kinds of indifference (balance and independence from the future), that would make the two probabilities equivalent.

That such is not the case just means that fairness is a property in the mind of the observer that must be further clarified, since the two meanings cannot be confused.

## Presidents, asteroids, natural categories, and reduced impact

0 06 July 2015 05:44PM

A putative new idea for AI control; index here.

This post attempts to use the ideas developed about natural categories in order to get high impact from reduced impact AIs.

## Extending niceness/reduced impact

I recently presented the problem of extending AI "niceness" given some fact X, to niceness given ¬X, choosing X to be something pretty significant but not overwhelmingly so - the death of a president. By assumption we had a successfully programmed niceness, but no good definition (this was meant to be "reduced impact" in a slight disguise).

This problem turned out to be much harder than expected. It seems that the only way to do so is to require the AI to define values dependent on a set of various (boolean) random variables Zj that did not include X/¬X. Then as long as the random variables represented natural categories, given X, the niceness should extend.

What did we mean by natural categories? Informally, it means that X should not appear in the definitions of these random variables. For instance, nuclear war is a natural category; "nuclear war XOR X" is not. Actually defining this was quite subtle; diverting through the grue and bleen problem, it seems that we had to define how we update X and the Zj given the evidence we expected to find. This was put in equation as picking Zj's that minimize

• Variance{log[ P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E)) ]}

where E is the random variable denoting the evidence we expected to find. Note that if we interchange X and ¬X, the ratio inverts, the log changes sign - but this makes no difference to the variance. So we can equally well talk about extending niceness given X to ¬X, or niceness given ¬X to X.

## Perfect and imperfect extensions

The above definition would work for a "perfectly nice AI". That could be an AI that would be nice, given any combination of estimates of X and Zj. In practice, because we can't consider every edge case, we would only have an "expectedly nice AI". That means that the AI can fail to be nice in certain unusual and unlikely edge cases, in certain strange sets of values of Zj that almost never come up...

...or at least, that almost never come up, given X. Since the "expected niceness" was calibrated given X, such an expectedly nice AI may fail to be nice if ¬X results in a substantial change in the probability of the Zj (see the second failure mode in this post; some of the Zj may be so tightly coupled to the value of X that an expected niceness AI considers them fixed, and this results in problems if ¬X happens and their values change).

One way of fixing this is to require that the "swing" of the Zj be small upon changing X to ¬X or vice versa. Something like: for all values of {aj}, the ratio P({Zj=aj} | X) / P({Zj=aj} | ¬X) lies between 1/100 and 100. This means that a reasonably good "expected niceness" calibrated on the Zj will transfer from X to ¬X (though the error may grow). This approach has some other advantages, as we'll see in the next section.
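As a rough illustration of the swing restriction, here is a minimal sketch that checks the ratio bound for every joint assignment of the Zj. The function name `swing_ok`, the dictionary representation, and the toy numbers are all assumptions made for this example; only the bound of 100 comes from the text.

```python
def swing_ok(p_given_x, p_given_not_x, bound=100.0):
    """Check that P(Z=a | X) / P(Z=a | not X) lies in [1/bound, bound] for all a.

    Both arguments map a tuple of Zj values to a probability.
    An assignment that is possible under one condition but impossible
    under the other has an unbounded swing, so it fails the check.
    """
    for a in p_given_x.keys() | p_given_not_x.keys():
        px = p_given_x.get(a, 0.0)
        pn = p_given_not_x.get(a, 0.0)
        if px == 0.0 and pn == 0.0:
            continue  # impossible either way: no swing to measure
        if px == 0.0 or pn == 0.0:
            return False
        if not (1.0 / bound <= px / pn <= bound):
            return False
    return True


# A single boolean Z whose probability moves moderately with X.
mild = swing_ok({(True,): 0.1, (False,): 0.9},
                {(True,): 0.5, (False,): 0.5})

# A Z so tightly coupled to X that it is nearly fixed given X.
tight = swing_ok({(True,): 0.0001, (False,): 0.9999},
                 {(True,): 0.5, (False,): 0.5})

print(mild, tight)  # prints: True False
```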

Of course, problems arise if the Zj are defined by smashing together events relatively independent of X with some that are very strongly dependent on X, into the same variable. Thus we'll further require that Zj cannot be decomposed into natural subvariables whose values swing more under changes from X to ¬X.

## High impact from reduced impact

With this, we are finally in a position to define a reduced impact AI design that can have a large impact. Imagine that an asteroid is coming to hit the Earth. Two AIs are charged with targeting a laser at it. The laser isn't perfectly effective, and the asteroid isn't perfectly certain to hit the Earth anyway: let's say aiming the laser correctly will reduce the odds of impact by a factor of 10 or so (well within the probability swing restrictions of the previous section).

In the original formulation of this, the AIs were tasked with outputting the x and y coordinates to aim the laser. Here, instead, they'll be tasked with constructing the motivation of a robotic subagent that will move around for a while, see things and interact, before physically aiming the laser in the right direction. This task is harder.

Then the top level AI charged with aiming the y coordinates is given the following task:

• It is motivated to cause the correct y-coordinate to be outputted.
• If the x-coordinate AI is not turned on (call this event ¬X), it is motivated to have reduced impact. This motivation is sufficiently strong that it will not want to have the correct y-coordinate outputted.
• It must phrase the robotic subagent's utility function in terms of the Zj that obey all the properties above.
• The evidence E considered for the equation governing the naturalness of Zj is considered (for example) to be a random sampling of space at the time both AIs are turned on. Most importantly, this is defined to be a theoretical process that the AI cannot affect physically.

So what will the AI do? It will produce a robot that will aim to set the y-coordinate of the laser correctly, given ¬X, and not expand dangerously. However, we've set things up so that X is virtually certain, so the robot will see another, x-aiming robot, wander around. The chances of this happening, given ¬X, are minuscule, so the swing of the probability from ¬X to X is huge. So, by the restrictions we've put on the Zj, the robot will not be motivated to change its behaviour just because it sees the other robot.

And if the laser is fired, the impact on humanity will, conversely, not be that large, so well within the parameters of the Zj. Thus the "general reduced impact" portion of the robot's motivation will be preserved.

And, of course, the x-robot will have the mirror image motivations, so they will both aim correctly, it seems. This should evade the problem described here.

## Failure modes

There are three ways the approach could fail: one bad way, and two graceful ways. The bad way is if the "no decomposition" requirements (one of them is presented above, the other is in this post) fail. In this case, the niceness or reduced impact are not guaranteed to be extended to ¬X, and the AI's behaviour can become undefined. Note that this is not bad optimisation pressure, just an absence of optimisation pressure - but that could still be potentially dangerous.

The first graceful failure happens if the AI is not able to create a reduced impact sub-agent given the restrictions on the Zj above. In that case, the AI does nothing. The second graceful failure happens if the AI evades our attempts to increase its impact, given ¬X. In that case, it simply becomes a reduced impact AI that does little. Not ideal, but not deadly.

Overall status: I'm not sure the idea is sound, at least not yet. Critiques welcome.

## Grue, Bleen, and natural categories

0 06 July 2015 01:47PM

A putative new idea for AI control; index here.

In a previous post, I looked at unnatural concepts such as grue (green if X was true, blue if it was false) and bleen. This was to enable one to construct the natural categories that extend AI behaviour, something that seemed surprisingly difficult to do.

The basic idea discussed in the grue post was that the naturalness of grue and bleen seemed dependent on features of our universe - mostly, that it was easy to tell whether an object was "currently green" without knowing what time it was, but we could not know whether the object was "currently grue" without knowing the time.

So the naturalness of the category depended on the type of evidence we expected to find. Furthermore, it seemed easier to discuss whether a category is natural "given X", rather than whether that category is natural in general. However, we know the relevant X in the AI problems considered so far, so this is not a problem.

## Natural category, probability flows

Fix a boolean random variable X, and assume we want to check whether the boolean random variable Z is a natural category, given X.

If Z is natural (for instance, it could be the colour of an object, while X might be the brightness), then we expect to uncover two types of evidence:

• those that change our estimate of X; this causes probability to "flow" as follows (or in the opposite directions):

• ...and those that change our estimate of Z:

Or we might discover something that changes our estimates of X and Z simultaneously. If the probability flows to X and Z in the same proportions, we might get:

What is an example of an unnatural category? Well, if Z is some sort of grue/bleen-like object given X, then we can have Z = X XOR Z', for Z' actually a natural category. This sets up the following probability flows, which we would not want to see:

More generally, Z might be constructed so that X∧Z, X∧¬Z, ¬X∧Z and ¬X∧¬Z are completely distinct categories; in that case, there are more forbidden probability flows:

and

In fact, there are only really three "linearly independent" probability flows, as we shall see.

## Less pictures, more math

Let's represent the four possible states of affairs by four weights (not probabilities):

Since everything is easier when it's linear, let's set w11 = log(P(X∧Z)) and similarly for the other weights (we neglect cases where some events have zero probability). Two sets of weights correspond to the same probabilities iff you get from one set to the other by multiplying by a strictly positive number. For logarithms, this corresponds to adding the same constant to all the log-weights. So we can normalise our log-weights (select a single set of representative log-weights for each possible set of probabilities) by choosing the w such that

w11 + w12 + w21 + w22 = 0.

Thus the probability "flows" correspond to adding together two such normalised 2x2 matrices, one for the prior and one for the update. Composing two flows means adding two change matrices to the prior.

Four variables, one constraint: the set of possible log-weights is three dimensional. We know we have two allowable probability flows, given naturalness: those caused by changes to P(X), independent of P(Z), and vice versa. Thus we are looking for a single extra constraint to keep Z natural given X.

A little thought reveals that we want to keep constant the quantity:

w11 + w22 - w12 - w21.

This preserves all the allowed probability flows and rules out all the forbidden ones. Translating this back to the general case, let "e" be the evidence we find. Then if Z is a natural category given X and the evidence e, the following quantity is the same for all possible values of e:

log[P(X∧Z|e)*P(¬X∧¬Z|e) / (P(X∧¬Z|e)*P(¬X∧Z|e))].
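To see why w11 + w22 - w12 - w21 is the right quantity to hold constant, here is a short worked check. It assumes the log-weights are laid out with rows indexing Z/¬Z and columns indexing X/¬X (the layout used for the multi-valued case later on); the shift sizes a, b and c are symbols introduced only for this illustration.

```latex
\begin{align*}
Q &= w_{11} + w_{22} - w_{12} - w_{21} \\[4pt]
\text{pure change in } X:\quad
Q' &= (w_{11}+a) + (w_{22}-a) - (w_{12}-a) - (w_{21}+a) = Q \\[4pt]
\text{pure change in } Z:\quad
Q' &= (w_{11}+b) + (w_{22}-b) - (w_{12}+b) - (w_{21}-b) = Q \\[4pt]
\text{XOR-type flow}:\quad
Q' &= (w_{11}+c) + (w_{22}+c) - (w_{12}-c) - (w_{21}-c) = Q + 4c
\end{align*}
```

Each of the three shifts sums to zero over the four weights, so all of them respect the normalisation; only the XOR-type flow moves Q, which is why fixing Q forbids exactly that direction.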

If E is a random variable representing the possible values of e, this means that we want

log[P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E))]

to be constant, or, equivalently, seeing the posterior probabilities as random variables dependent on E:

• Variance{log[ P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E)) ]} = 0.

Call that variance the XE-naturalness measure. If it is zero, then Z defines an XE-natural category. Note that this does not imply that Z and X are independent, or independent conditional on E. Just that they are, in some sense, "equally (in)dependent whatever E is".
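A small numerical sketch may make the measure concrete. Everything here is a toy assumption: two equally likely pieces of evidence, posteriors in which X and a natural variable Z' are independent, and a grue-like variable built as X XOR Z'. The function names are invented for this example.

```python
import math


def log_odds_ratio(post):
    """post maps (x, z) booleans to P(x and z | e); all four must be > 0."""
    return math.log(post[(True, True)] * post[(False, False)]
                    / (post[(True, False)] * post[(False, True)]))


def xe_naturalness(evidence):
    """Variance of the log odds ratio over evidence.

    evidence is a list of (P(e), posterior) pairs with the P(e) summing to 1.
    """
    mean = sum(pe * log_odds_ratio(post) for pe, post in evidence)
    return sum(pe * (log_odds_ratio(post) - mean) ** 2 for pe, post in evidence)


def independent(px, pz):
    """Posterior in which X and Z are independent."""
    return {(x, z): (px if x else 1 - px) * (pz if z else 1 - pz)
            for x in (True, False) for z in (True, False)}


def xor_with_x(post):
    """Relabel Z' as Z = X XOR Z' (a grue-like category)."""
    return {(x, x != z): p for (x, z), p in post.items()}


# Two pieces of evidence that move the estimates of X and Z' separately.
natural_evidence = [(0.5, independent(0.3, 0.8)),
                    (0.5, independent(0.6, 0.4))]
grue_evidence = [(pe, xor_with_x(post)) for pe, post in natural_evidence]

natural_score = xe_naturalness(natural_evidence)
grue_score = xe_naturalness(grue_evidence)
print(natural_score, grue_score)
```

In this toy setup the natural variable scores zero (up to rounding), while the XOR-built variable scores well above zero, because its log odds ratio moves whenever the estimate of Z' moves.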

## Almost natural category

The advantage of that last formulation becomes visible when we consider that the evidence which we uncover is not, in the real world, going to perfectly mark Z as natural, given X. To return to the grue example, though most evidence we uncover about an object is going to be the colour or the time rather than some weird combination, there is going to be somebody who will write things like "either the object is green, and the sun has not yet set in the west; or instead perchance, those two statements are both alike in falsity". Upon reading that evidence, if we believe it in the slightest, the variance can no longer be zero.

Thus we cannot expect that the above XE-naturalness be perfectly zero, but we can demand that it be low. How low? There seems no principled way of deciding this, but we can make one attempt: that we cannot lower it by decomposing Z.

What do we mean by that? Well, assume that Z is a natural category, given X and the expected evidence, but Z' is not. Then we can define a new boolean category Y to be Z with high probability, and Z' otherwise. This will still have a low XE-naturalness measure (as Z does) but is obviously not ideal.

Reversing this idea, we say Z defines an "XE-almost natural category" if there is no "more XE-natural" category that extends X∧Z (and the other conjunctions). Technically, if

X∧Z = X∧Y,

then Y must have an XE-naturalness measure equal to or greater than that of Z. And similarly for X∧¬Z, ¬X∧Z, and ¬X∧¬Z.

Note: I am somewhat unsure about this last definition; the concept I want to capture is clear (Z is not the combination of more XE-natural subvariables), but I'm not certain the definition does it.

## Beyond boolean

What if Z takes n values, rather than being a boolean? This can be treated simply.

If we set the wjk to be log-weights as before, there are 2n free variables. The normalisation constraint is that they all sum to a constant. The "permissible" probability flows are given by flows from X to ¬X (adding a constant to the first column, subtracting it from the second) and pure changes in Z (adding constants to various rows, summing to 0). There are 1+ (n-1) linearly independent ways of doing this.

Therefore we are looking for (2n-1) - (1+(n-1)) = n-1 independent constraints to forbid non-natural updating of X and Z. One basis set for these constraints could be to keep constant the values of

wj1 + w(j+1)2 - wj2 - w(j+1)1,

where j ranges between 1 and n-1.

This translates to variance constraints of the type:

• Variance{log[ P(X∧{Z=j}|E)*P(¬X∧{Z=j+1}|E) / (P(X∧{Z=j+1}|E)*P(¬X∧{Z=j}|E)) ]} = 0.

But those are n-1 different possible variances. What is the best global measure of XE-naturalness? It seems it could simply be

• Maxjk Variance{log[ P(X∧{Z=j}|E)*P(¬X∧{Z=k}|E) / (P(X∧{Z=k}|E)*P(¬X∧{Z=j}|E)) ]}.

If this quantity is zero, it naturally sends all variances to zero, and, when not zero, is a good candidate for the degree of XE-naturalness of Z.
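The maximum-over-pairs measure can be sketched for an n-valued Z as follows. The table layout (one row per value of Z, column 0 for X and column 1 for ¬X), the helper names, and the toy numbers are assumptions made for this illustration.

```python
import math
from itertools import combinations


def weighted_variance(values, weights):
    mean = sum(w * v for v, w in zip(values, weights))
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights))


def max_pairwise_naturalness(evidence, n):
    """Largest variance of the log odds ratio over all pairs j < k of Z values.

    evidence is a list of (P(e), table) pairs, where table[j][0] is
    P(X and Z=j | e) and table[j][1] is P(not X and Z=j | e).
    """
    weights = [pe for pe, _ in evidence]
    worst = 0.0
    for j, k in combinations(range(n), 2):
        ratios = [math.log(t[j][0] * t[k][1] / (t[k][0] * t[j][1]))
                  for _, t in evidence]
        worst = max(worst, weighted_variance(ratios, weights))
    return worst


def product_table(px, pz):
    """Posterior table in which X and Z are independent."""
    return [[px * p, (1 - px) * p] for p in pz]


# Evidence that only ever moves X and Z separately.
natural = [(0.5, product_table(0.5, [0.2, 0.3, 0.5])),
           (0.5, product_table(0.7, [0.6, 0.1, 0.3]))]

# Same first posterior, but the second one couples Z to X.
coupled = [(0.5, product_table(0.5, [0.2, 0.3, 0.5])),
           (0.5, [[0.30, 0.05], [0.10, 0.15], [0.10, 0.30]])]

natural_score = max_pairwise_naturalness(natural, 3)
coupled_score = max_pairwise_naturalness(coupled, 3)
print(natural_score, coupled_score)
```

Swapping j and k only flips the sign of the log ratio, which leaves the variance unchanged, so iterating over unordered pairs is enough.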

The extension to the case where X takes multiple values is straightforward:

• Maxjklm Variance{log[ P({X=l}∧{Z=j}|E)*P({X=m}∧{Z=k}|E) / (P({X=l}∧{Z=k}|E)*P({X=m}∧{Z=j}|E)) ]}.

Note: if ever we need to compare the XE-naturalness of random variables taking different numbers of values, it may become necessary to divide these quantities by the number of variables involved, or maybe substitute a more complicated expression that contains all the different possible variances, rather than simply the maximum.

## And in practice?

In the next post, I'll look at using this in practice for an AI, to evade presidential deaths and deflect asteroids.

## Zooming your mind in and out

6 06 July 2015 12:30PM

I recently noticed I had two mental processes opposing one another in an interesting way.

The first mental process was instilled by reading Daniel Kahneman on the focusing illusion and Paul Graham on procrastination.  This process encourages me to "zoom out" when engaging in low-value activities so I can see they don't deliver much value in the grand scheme of things.

The second mental process was instilled by reading about the importance of just trying things.  (These articles could be seen as steelmanning Mark Friedenbach's recent Less Wrong critique.)  This mental process encourages me to "zoom in" and get my hands dirty through experimentation.

Both these processes seem useful.  Instead of spending long stretches of time in either the "zoomed in" or "zoomed out" state, I think I'd do better flip-flopping between them.  For example, if I'm wandering down internet rabbit holes, I'm spending too much time zoomed in.  Asking "why" repeatedly could help me realize I'm doing something low value.  If I'm daydreaming or planning lots with little doing, I'm spending too much time zoomed out.  Asking "how" repeatedly could help me identify a first step.

This fits in with construal level theory, aka "near/far theory" as discussed by Robin Hanson.  (I recommend the reviews Hanson links to; they gave me a different view of the concept than his standard presentation.)  To be more effective, maybe one should increase cross communication between the "near" and "far" modes, so the parts work together harmoniously instead of being at odds.

If Hanson's view is right, maybe the reason people become uncomfortable when they realize they are procrastinating (or not Just Trying It) is that this maps to getting caught red-handed in an act of hypocrisy in the ancestral environment.  You're pursuing near interests (watching YouTube videos) instead of working towards far ideals (doing your homework)?  For shame!

(Possible cure: Tell yourself that there's nothing to be ashamed of if you get stuck zoomed in; it happens to everyone.  Just zoom out.)

Part of me is reluctant to make this post, because I just had this idea and it feels like I should test it out more before writing about it.  So here are my excuses:

1. If I wait until I develop expertise in everything, it may be too late to pass it on.

2. In order to see if this idea is useful, I'll need to pay attention to it.  And writing about it publicly is a good way to help myself pay attention to it, since it will become part of my identity and I'll be interested to see how people respond.

There might be activities people already do on a regular basis that consist of repeated zooming in and out.  If so, engaging in them could be a good way to build this mental muscle.  Can anyone think of something like this?

## Green Emeralds, Grue Diamonds

7 06 July 2015 11:27AM

A putative new idea for AI control; index here.

When posing his "New Riddle of Induction", Goodman introduced the concepts of "grue" and "bleen" to show some of the problems with the conventional understanding of induction.

I've somewhat modified those concepts. Let T be a set of intervals in time, and we'll use the boolean X to designate the fact that the current time t belongs to T (with ¬X equivalent to t∉T). We'll define an object to be:

• Grue if it is green given X (ie whenever t∈T), and blue given ¬X (ie whenever t∉T).
• Bleen if it is blue given X, and green given ¬X.

At this point, people are tempted to point out the ridiculousness of the concepts, dismissing them because of their strange disjunctive definitions. However, this doesn't really solve the problem; if we take grue and bleen as fundamental concepts, then we have the disjunctively defined green and blue; an object is:

• Green if it is grue given X, and bleen given ¬X.
• Blue if it is bleen given X, and grue given ¬X.

Still, the categories green and blue are clearly more fundamental than grue and bleen. There must be something we can whack them with to get this - maybe Kolmogorov complexity or stuff like that? Sure, someone on Earth could make a grue or bleen object (a screen with a timer, maybe?), but it would be completely artificial. Note that though grue and bleen are unnatural, "currently grue" (colour=green XOR ¬X) or "currently bleen" (colour=blue XOR ¬X) make perfect sense (though they require knowing X, an important point for later on).
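The symmetry between the two pairs of definitions, and the XOR forms just mentioned, can be checked exhaustively. This sketch assumes a two-colour world (green or blue only); the function names are made up for the illustration.

```python
def currently_grue(colour, x):
    # Grue: green given X, blue given not-X.
    return colour == ("green" if x else "blue")


def currently_bleen(colour, x):
    # Bleen: blue given X, green given not-X.
    return colour == ("blue" if x else "green")


checks = []
for colour in ("green", "blue"):
    for x in (True, False):
        grue = currently_grue(colour, x)
        bleen = currently_bleen(colour, x)
        # XOR forms: for booleans, a != b is a XOR b.
        checks.append(grue == ((colour == "green") != (not x)))
        checks.append(bleen == ((colour == "blue") != (not x)))
        # Green and blue recovered from grue and bleen taken as primitive:
        # green is grue given X and bleen given not-X, and conversely for blue.
        checks.append((colour == "green") == (grue if x else bleen))
        checks.append((colour == "blue") == (bleen if x else grue))

print(all(checks))  # prints: True
```

Neither pair of definitions is privileged by the logic alone, which is the point of the passage: both translations need to know X.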

But before that... are we so sure the grue and bleen categories are unnatural? Relative to what?

## Welcome to Chiron Beta Prime

Chiron Beta Prime, apart from having its own issues with low-intelligence AIs, is noted for having many suns: one large sun that glows mainly in the blue spectrum, and multiple smaller ones glowing mainly in the green spectrum. They all emit in the totality of the spectrum, but they are stronger in those colours.

Because of the way the orbits are locked to each other, the green suns are always visible from everywhere. The blue sun rises and sets on a regular schedule; define T to be time when the blue sun is risen (so X="Blue sun visible, some green suns visible" and ¬X="Blue sun not visible, some green suns visible").

Now "green" is a well defined concept in this world. Emeralds are green; they glow green under the green suns, and do the same when the blue sun is risen. "Blue" is also a well-defined concept. Sapphires are blue. They glow blue under the blue sun and continue to do so (albeit less intensely) when it is set.

But "grue" is also a well defined concept. Diamonds are grue. They glow green when the green suns are the only ones visible, but glow blue under the glare of the blue sun.

Green, blue, and grue (which we would insist on calling green, blue and white) are thus well understood and fundamental concepts, that people of this world use regularly to compactly convey useful information to each other. They match up easily to fundamental properties of the objects in question (eg frequency of light reflected).

Bleen, on the other hand - don't be ridiculous. Sure, someone on Chiron Beta Prime could make a bleen object (a screen with a timer, maybe?), but it would be completely artificial.

In contrast, the inhabitants of Pholus Delta Secundus, who have a major green sun and many minor blue suns (coincidentally with exactly the same orbital cycles), feel that green, blue and bleen are the natural categories...

## Natural relative to the (current) universe

We've shown that some categories that we see as disjunctive or artificial can seem perfectly natural and fundamental to beings in different circumstances. Here's another example:

A philosopher proposes, as a thought experiment, to define a certain concept for every object. It's the weighted sum of the inverse of the height of an object (from the centre of the Earth), and its speed (squared, because why not?), and its temperature (but only on an "absolute" scale), and some complicated thing involving its composition and shape, and another term involving its composition only. And maybe we can add another piece for its total mass.

And then that philosopher proposes, to great derision, that this whole messy sum be given a single name, "Energy", and that we start talking about it as if it was a single thing. Faced with such an artificially bizarre definition, sensible people who want to use induction properly have no choice... but to embrace energy as one of the fundamental useful facts of the universe.

What these examples show is that green, blue, grue, bleen, and energy are not natural or non-natural categories in some abstract sense, but relative to the universe we inhabit. For instance, if we had some strange energy' which used the inverse of the height cubed, then we'd have a useless category - unless we lived in five spatial dimensions.

## You're grue, what time is it?

So how can we say that green and blue are natural categories in our universe, while grue and bleen are not? A very valid explanation seems to be the dependence on X - on the time of day. On our Earth, we can tell whether objects are green or blue without knowing anything about the time. Certainly we can get combined information about an object's colour and the time of day (for instance by looking at emeralds out in the open). But we also expect to get information about the colour (by looking at an object in a lit basement) and the time (by looking at a clock). And we expect these pieces of information to be independent of each other.

In contrast, we never expect to get information about an object being currently grue or currently bleen without knowing the time (or the colour, for that matter). And information about the time can completely change our assessment as to whether an object is grue versus bleen. It would be a very contrived set of circumstances where we would be able to assert "I'm pretty sure that object is currently grue, but I have no idea about its colour or about the current time".

Again, this is a feature of our world and the evidence we see in it, not some fundamental feature of the categories of grue and bleen. We just don't generally see green objects change into blue objects, nor do we typically learn about disjunctive statements of the type "colour=green XOR time=night" without learning about the colour and the time separately.

What about the grue objects on Chiron Beta Prime? There, people do see objects change colour regularly, and, upon investigation, they can detect whether an object is grue without knowing either the time or the apparent colour of the object. For instance, they know that diamond is grue, so they can detect some grue objects by a simple hardness test.

But what's happening is that the Chiron Beta Primers have correctly identified a fundamental category - the one we call white, or, more technically "prone to reflect light both in the blue and green parts of the spectrum" - that has different features on their planet than on ours. From the macroscopic perspective, it's as if we and they live in a different universe, hence grue means something to them and not to us. But the same laws of physics underlie both our worlds, so fundamentally the concepts converge - our white, their grue, mean the same things at the microscopic level.

## Definitions open to manipulation

In the next post, I'll look at whether we can formalise "expect independent information about colour and time", and "we don't expect change to the time information to change our colour assessment."

But be warned. The naturalness of these categories is dependent on facts about the universe, and these facts could be changed. A demented human (or a powerful AI) could go through the universe, hiding everything in boxes, smashing clocks, and putting "current bleen detectors" all over the place, so that it suddenly becomes very easy to know statements like "colour=blue XOR time=night", but very hard to know about colour (or time) independently from this. So it would be easy to say "this object is currently bleen", but hard to say "this object is blue". Thus the "natural" categories may be natural now, but this could well change, so we must take care when using these definitions to program an AI.

## A question and a tail

2 06 July 2015 09:34AM

This is a rambling post, and I will appreciate your criticism to help dry it or delete it altogether.

It seems that however small a question I research by reviewing [botanical] literature, there is always a much more complex question, rather difficult to put rigorously, that I have to ask for the first one to be meaningful. The second answer (or tier of answers) doesn't add much to the information I will build upon, but it might - just might! - add uncertainty to the result or allow predictions in advance. How do we use it in advance? We don't apply formal reasoning, usually, and yet somehow we use it!

1.

Consider: a certain invasive plant has a host of adaptations beneficial to its success. (They probably wouldn't be sufficient if there were some actual effort to manage manmade ecosystems, but duh.) A trait many invasive plants share is the ability to increase their ploidy - from 2 to 3, 4, 6, 8 or even 10 sets of homologous chromosomes, etc. (Polyploidization sometimes happens even in single cells in somatic (= non-reproductive) tissues, so it's really a heavily used shortcut.)

Now, suppose I want to see how a different specific property of the species behaves abroad. I will have to check the ploidy level, of course! Quick, what does the literature say, how many chromosomes can it have?

...but wait. Make no mistake, I do have to count them; but what if there is a continent-wide study showing that it generally has 4n in Eastern Europe?.. That would allow me to at least expect 4n, or whatever amount they found, and see if there is any research specifically dealing with this situation within its native range.

...but wait. Of course, those findings will be useful in discussion if I find 4n, but if I don't, they will be just a point in the overall space of possibilities. Still relevant, but not worth putting much explanatory weight on.

Something in my brain evaluated the usefulness of a piece of data other people have found, which I myself have yet to look up, of whose exact composition I have no idea - perhaps there are simply no other reports! - and placed it in context of what I really expect to do.

2.

Okay, if I can think so about other people's writings without even reading them, then maybe I can compile a dummy set of data I expect right now and compare them to those I will find in the literature. And later, to actual data. Here's a simplified problem that doesn't approach labwork on any scale (I don't want to add too many qualifiers).

Let us 'measure' 8 parameters, and check if there have been studies that have found correlations between at least some of them (and maybe with some other ones), and then try to see if our expectations based on knowledge of study area and casual surveys fit our expectations based on published research in any specific way. We are not ready to put forth any causal structure - no real data yet - though we strongly suspect (80%) that all the parameters are in some way linked to each other.

The following table is rough and repetitive, but I think useful as an illustration of how things brew in a not-very-clever student's head [my own]. The numbers are 'dimensionless', distributions are normal, the total number of studies measuring each parameter is 7 or less, and all correlations are no less than 0.8.

| Parameter | Total range | Our expected data ±SE | Reported data range* | Our imaginary correlations | Reported correlations |
|---|---|---|---|---|---|
| A | 1-12 | 8±1 | 4-10 | A&F, A&H | A&D, A&F, A doesn't correlate with anything if nothing else correlates with anything |
| B | 1-5 | 2±1 | 1-4 | B&C, B&E, B&G, B&H | B doesn't correlate with E if F&H |
| C | 1-100 | 35±20 | 80±7 (only one other study) | C&B, C&F, C&H | Unknown |
| D | 1-28 | 6±2 | 2-18 | D&F | D&G (and then E&F) |
| E | 1-500 | 200±46 | 150-480 | E&B, E&G | E&F if D&G |
| F | 1-50 | 47±8 | 8-45 | F&A, F&C, F&D, F&H if A&H | F&A, F&H (and then B doesn't correlate with E) |
| G | 1-25 | 18±2 | 11-20 | G&B, G&E | G&D (and then E&F) |
| H | 1-40 | 23±10 | 1-40 | H&A, H&B, H&C, H&F (and then H&A) | H&F (and then B doesn't correlate with E) |

*as in, 'for this species, out of 1-12 that are altogether possible, only 4-10 have been so far observed. It might mean that 4-10 is the actual range, but the prior for that is about 60% due to differences in methodologies used by various researchers and to the fact that only a part of the species's habitats have been studied' etc.
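One crude way to make the comparison explicit is to check, parameter by parameter, whether the expected mean falls inside the reported range. The sketch below uses only the numbers from the table; the dictionary layout and function name are my own, and parameter C is left out because its single reported value is given as 80±7 and not as a range.

```python
# name: (expected mean, expected SE, reported low, reported high), copied from the table.
EXPECTED_VS_REPORTED = {
    "A": (8, 1, 4, 10),
    "B": (2, 1, 1, 4),
    "D": (6, 2, 2, 18),
    "E": (200, 46, 150, 480),
    "F": (47, 8, 8, 45),
    "G": (18, 2, 11, 20),
    "H": (23, 10, 1, 40),
}


def mean_outside_reported(rows):
    """Return the parameters whose expected mean lies outside the reported range."""
    return [
        name
        for name, (mean, _se, low, high) in rows.items()
        if not low <= mean <= high
    ]


surprises = mean_outside_reported(EXPECTED_VS_REPORTED)
```

This is only a first filter for where to look; it ignores the standard errors and says nothing about the correlation columns.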

Now I understand that this is hardly the most profitable presentation method and statistics has advanced much since Pearson and everything. It is just that I find it difficult to compare graphs with diagrams with clouds along axes as they are published in different papers. I only want to guesstimate if my data fit a pattern, to discuss them qualitatively. To stratify the parameters in such a way that I will place explanative weight on some of them, and report the others to give a full picture. I have to do this explicitly, because I know I am doing this implicitly – it's a feeling I get, of brain working and deciding and not showing me what it has.

I cannot speak about A, only that maybe A, H and F do have something in common – perhaps I haven't measured it. B looks rather suspicious; I will need to reread that other report. C is intriguing, but ultimately belongs to the 'lower value stratum', and maybe those correlations I found are spurious; if only there was a way to reduce the variability... but it won't be cost-efficient. E, F, D and G also might be worth discussing together. F by itself doesn't seem very meaningful, unless there is a causal connection to the others; too bad one can imagine many plausible explanations for that. I will probably start discussion with H, since it probably has been studied for other plants and at least something has already been proposed.

Now when I have my own data I will see where they deviate from my expectations, and that will be some knowledge I can put into words, and I will hopefully start calibrating myself on these matters. And on matters of Discussion structuring:)

## Open Thread, Jul. 6 - Jul. 12, 2015

4 06 July 2015 07:31AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

## There is no such thing as strength: a parody

14 05 July 2015 11:44PM

The concept of strength is ubiquitous in our culture. It is commonplace to hear one person described as "stronger" or "weaker" than another. And yet the notion of strength is a pernicious myth which reinforces many of our social ills and should be abandoned wholesale.

1. Just what is strength, exactly? Few of the people who use the word can provide an exact definition.

On first try, many people would say that strength is the ability to lift heavy objects. But this completely ignores the strength necessary to push or pull on objects; to run long distances without exhausting oneself; to throw objects with great speed; to balance oneself on a tightrope, and so forth.

When this is pointed out, people often try to incorporate all of these aspects into the definition of strength, with a result that is long, unwieldy, ad-hoc, and still missing some acts commonly considered to be manifestations of strength.

Attempts to solve the problem by referring to the supposed cause of strength -- for example, by saying that strength is just a measure of muscle mass -- do not help. A person with a large amount of muscle mass may be quite weak on any of the conventional measures of strength if, for example, they cannot lift objects due to injuries or illness.

2. The concept of strength has an ugly history. Indeed, strength is implicated in both sexism and racism. Women have long been held to be the "weaker sex," consequently needing protection from the "stronger" males, resulting in centuries of structural oppression. Myths about racialist differences in strength have informed pernicious stereotypes and buttressed inequality.

3. There is no consistent way of grouping people into strong and weak. Indeed, what are we to make of the fact that some people are good at running but bad at lifting and vice versa?

One might think that we can talk about different strengths - the strength in one's arms and one's legs for example. But what, then, should we make of the person who is good at arm-wrestling but poor at lifting? Arms can move in many ways; what will we make of someone who can move arms one way with great force, but not another? It is not hard to see that potential concepts such as "arm strength" or "leg strength" are problematic as well.

4. When people are grouped into strong and weak according to any number of criteria, the amount of variation within each group is far larger than the amount of variation between groups.

5. Strength is a social construct. Thus no one is inherently weak or strong. Scientifically, anthropologically, we are only human.

6. Scientists are rapidly starting to understand the illusory nature of strength, and one needs only to glance at any of the popular scientific periodicals to encounter refutations of this notion.

In one experiment, respondents from two different cultures were asked to lift a heavy object as much as they could. In one of the cultures, the respondents lifted the object higher. Furthermore, the manner in which the respondents attempted to lift the object depended on the culture. This shows that tests of strength cannot be considered culture-free and that there may be no such thing as a universal test of strength.

7. Indeed, to even ask "what is strength?" is to assume that there is a quality, or essence, of humans with essential, immutable qualities. Asking the question begins the process of reifying strength... (see page 22 here).

---------------------------------------

For a serious statement of what the point of this was supposed to be, see this comment.

## A Year of Spaced Repetition Software in the Classroom

73 04 July 2015 10:30PM

Last year, I asked LW for some advice about spaced repetition software (SRS) that might be useful to me as a high school teacher. With said advice came a request to write a follow-up after I had accumulated some experience using SRS in the classroom. This is my report.

Please note that this was not a scientific experiment to determine whether SRS "works." Prior studies are already pretty convincing on this point and I couldn't think of a practical way to run a control group or "blind" myself. What follows is more of an informal debriefing for how I used SRS during the 2014-15 school year, my insights for others who might want to try it, and how the experience is changing how I teach.

## Summary

SRS can raise student achievement even with students who won't use the software on their own, and even with frequent disruptions to the study schedule. Gains are most apparent with the already high-performing students, but are also meaningful for the lowest students. Deliberate efforts are needed to get student buy-in, and getting the most out of SRS may require changes in course design.

### The software

After looking into various programs, including the game-like Memrise, and even writing my own simple SRS, I ultimately went with Anki for its multi-platform availability, cloud sync, and ease-of-use. I also wanted a program that could act as an impromptu catch-all bin for the 2,000+ cards I would be producing on the fly throughout the year. (Memrise, in contrast, really needs clearly defined units packaged in advance).

### The students

I teach 9th and 10th grade English at an above-average suburban American public high school in a below-average state. Mine are the lower "required level" students at a school with high enrollment in honors and Advanced Placement classes. Generally speaking, this means my students are mostly not self-motivated, are only very weakly motivated by grades, and will not do anything school-related outside of class no matter how much it would be in their interest to do so. There are, of course, plenty of exceptions, and my students span an extremely wide range of ability and apathy levels.

### The procedure

First, what I did not do. I did not make Anki decks, assign them to my students to study independently, and then quiz them on the content. With honors classes I taught in previous years I think that might have worked, but I know my current students too well. Only about 10% of them would have done it, and the rest would have blamed me for their failing grades—with some justification, in my opinion.

Instead, we did Anki together, as a class, nearly every day.

As initial setup, I created a separate Anki profile for each class period. With a third-party add-on for Anki called Zoom, I enlarged the display font sizes to be clearly legible on the interactive whiteboard at the front of my room.

Nightly, I wrote up cards to reinforce new material and integrated them into the deck in time for the next day's classes. This averaged about 7 new cards per lesson period. These cards came in many varieties, but the three main types were:

1. concepts and terms, often with reversed companion cards, sometimes supplemented with "what is this an example of" scenario cards.
2. vocabulary, 3 cards per word: word/def, reverse, and fill-in-the-blank example sentence
3. grammar, usually in the form of "What change(s), if any, does this sentence need?" Alternative cards had different permutations of the sentence.

Weekly, I uploaded the updated deck to the cloud for self-motivated students wishing to study on their own.

Daily, I led each class in an Anki review of new and due cards for an average of 8 minutes per study day, usually as our first activity, at a rate of about 3.5 cards per minute. As each card appeared on the interactive whiteboard, I would read it out loud while students willing to share the answer raised their hands. Depending on the card, I might offer additional time to think before calling on someone to answer. Depending on their answer, and my impressions of the class as a whole, I might elaborate or offer some reminders, mnemonics, etc. I would then quickly poll the class on how they felt about the card by having them show a color by way of a small piece of card-stock divided into green, red, yellow, and white quadrants. Based on my own judgment (informed only partly by the poll), I would choose and press a response button in Anki, determining when we should see that card again.

[Data shown is from one of my five classes. We didn't start using Anki until a couple weeks into the school year.]
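As a back-of-the-envelope check on those figures, the averages quoted above (8 minutes per study day, about 3.5 cards per minute, about 7 new cards per lesson period) imply the following daily budget. The variable and function names are mine, and the split between new and due cards is an inference from the averages, not a measured number.

```python
MINUTES_PER_SESSION = 8      # average review time per study day
CARDS_PER_MINUTE = 3.5       # average whole-class review rate
NEW_CARDS_PER_LESSON = 7     # average number of new cards written per lesson period


def cards_per_session(minutes: float, rate: float) -> float:
    """Total cards the class can get through in one review session."""
    return minutes * rate


total_cards = cards_per_session(MINUTES_PER_SESSION, CARDS_PER_MINUTE)

# Whatever is not spent on new cards is left for cards that are already due.
slots_for_due_cards = total_cards - NEW_CARDS_PER_LESSON
```

With these averages, a session covers about 28 cards, leaving roughly 21 slots for due cards on a day when all 7 new cards are introduced.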

### Opportunity costs

8 minutes is a significant portion of a 55 minute class period, especially for a teacher like me who fills every one of those minutes. Something had to give. For me, I entirely cut some varieties of written vocab reinforcement, and reduced the time we spent playing the team-based vocab/term review game I wrote for our interactive whiteboards some years ago. To a lesser extent, I also cut back on some oral reading comprehension spot-checks that accompany my whole-class reading sessions. On balance, I think Anki was a much better way to spend the time, but it's complicated. Keep reading.

### Whole-class SRS not ideal

Every student is different, and would get the most out of having a personal Anki profile determine when they should see each card. Also, most individuals could study many more cards per minute on their own than we averaged doing it together. (To be fair, a small handful of my students did use the software independently, judging from Ankiweb download stats.)

Before we started using SRS I tried to sell my students on it with a heartfelt, over-prepared 20 minute presentation on how it works and the superpowers to be gained from it. It might have been a waste of time. It might have changed someone's life. Hard to say.

As for the daily class review, I induced engagement partly through participation points that were part of the final semester grade, and which students knew I tracked closely. Raising a hand could earn a kind of bonus currency, but was never required—unlike looking up front and showing colors during polls, which I insisted on. When I thought students were just reflexively holding up the same color and zoning out, I would sometimes spot check them on the last card we did and penalize them if warranted.

But because I know my students are not strongly motivated by grades, I think the most important influence was my attitude. I made it a point to really turn up the charm during review and play the part of the engaging game show host. Positive feedback. Coaxing out the lurkers. Keeping that energy up. Being ready to kill and joke about bad cards. Reminding classes how awesome they did on tests and assignments because they knew their Anki stuff.

(This is a good time to point out that the average review time per class period stabilized at about 8 minutes because I tried to end reviews before student engagement tapered off too much, which typically started happening at around the 6-7 minute mark. Occasional short end-of-class reviews mostly account for the difference.)

I also got my students more on the Anki bandwagon by showing them how this was directly linked to reduced note-taking requirements. If I could trust that they would remember something through Anki alone, why waste time waiting for them to write it down? They were unlikely to study from those notes anyway. And if they aren't looking down at their paper, they'll be paying more attention to me. I better come up with more cool things to tell them!

### Making memories

Everything I had read about spaced repetition suggested it was a great reinforcement tool but not a good way to introduce new material. With that in mind, I tried hard to find or create memorable images, examples, mnemonics, and anecdotes that my Anki cards could become hooks for, and to get those cards into circulation as soon as possible. I even gave this method a mantra: "vivid memory, card ready".

When a student during review raised their hand, gave me a pained look, and said, "like that time when...." or "I can see that picture of..." as they struggled to remember, I knew I had done well. (And I would always wait a moment, because they would usually get it.)

### Baby cards need immediate love

Unfortunately, if the card wasn't introduced quickly enough—within a day or two of the lesson—the entire memory often vanished and had to be recreated, killing the momentum of our review. This happened far too often—not because I didn't write the card soon enough (I stayed really on top of that), but because it didn't always come up for study soon enough. There were a few reasons for this:

1. We often had too many due cards to get through in one session, and by default Anki puts new cards behind due ones.
2. By default, Anki only introduces 20 new cards in one session (I soon uncapped this).
3. Some cards were in categories that I gave lower priority to.

Two obvious cures for this problem:

1. Make fewer cards. (I did get more selective as the year went on.)
2. Have all cards prepped ahead of time and introduce new ones at the end of the class period they go with. (For practical reasons, not the least of which was the fact that I didn't always know what cards I was making until after the lesson, I did not do this. I might be able to next year.)

### Days off suck

SRS is meant to be used every day. When you take weekends off, you get a backlog of due cards. Not only do my students take every weekend and major holiday off (slackers), they have a few 1-2 week vacations built into the calendar. Coming back from a week's vacation means a 9-day backlog (due to the weekends bookending it). There's no good workaround for students that won't study on their own. The best I could do was run longer or multiple Anki sessions on return days to try to catch up with the backlog. It wasn't enough. The "caught up" condition was not normal for most classes at most points during the year, but rather something to aspire to and occasionally applaud ourselves for reaching. Some cards spent weeks or months on the bottom of the stack. Memories died. Baby cards emerged stillborn. Learning was lost.
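The 9-day figure is just calendar arithmetic: five weekdays plus the two weekends around them. A small sketch, using hypothetical dates chosen only because they fall on a Friday and the Monday ten days later:

```python
from datetime import date


def missed_days(last_session: date, next_session: date) -> int:
    """Number of calendar days with no review between two class sessions."""
    return (next_session - last_session).days - 1


# Hypothetical dates: a Friday, the following Monday, and the Monday after a week off.
friday = date(2015, 3, 13)
monday_after_weekend = date(2015, 3, 16)
monday_after_vacation = date(2015, 3, 23)

weekend_backlog = missed_days(friday, monday_after_weekend)
vacation_backlog = missed_days(friday, monday_after_vacation)
```

An ordinary weekend costs 2 days of reviews; a one-week vacation costs 9, which is why a single longer session on the return day rarely cleared the pile.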

Needless to say, the last weeks of the school year also had a certain silliness to them. When the class will never see the card again, it doesn't matter whether I push the button that says 11 days or the one that says 8 months. (So I reduced polling and accelerated our cards/minute rate.)

Never before SRS did I fully appreciate the loss of learning that must happen every summer break.

### Triage

I kept each course's master deck divided into a few large subdecks. This was initially for organizational reasons, but I eventually started using it as a prioritizing tool. This happened after a curse-worthy discovery: if you tell Anki to review a deck made from subdecks, due cards from subdecks higher up in the stack are shown before cards from decks listed below, no matter how overdue they might be. From that point, on days when we were backlogged (most days) I would specifically review the concept/terminology subdeck for the current semester before any other subdecks, as these were my highest priority.

On a couple of occasions, I also used Anki's study deck tools to create temporary decks of especially high-priority cards.
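The ordering problem and the workaround can be illustrated with a toy model. This is not Anki's scheduler code and uses no Anki API; it only mimics the behaviour described above, and the subdeck names, their listing order, and the overdue counts are all made up.

```python
from typing import List, NamedTuple


class DueCard(NamedTuple):
    subdeck: str
    days_overdue: int


# Hypothetical subdeck names, in the order they are listed under the master deck.
SUBDECK_ORDER = ["grammar", "terms", "vocab"]


def by_subdeck_position(cards: List[DueCard]) -> List[DueCard]:
    """Behaviour as described: subdeck position decides order, overdue time is ignored."""
    return sorted(cards, key=lambda c: SUBDECK_ORDER.index(c.subdeck))


def chosen_subdeck_first(cards: List[DueCard], first: str) -> List[DueCard]:
    """The workaround: review one high-priority subdeck before everything else."""
    return sorted(cards, key=lambda c: c.subdeck != first)


cards = [DueCard("vocab", 30), DueCard("grammar", 1), DueCard("terms", 12)]
default_order = by_subdeck_position(cards)
triaged_order = chosen_subdeck_first(cards, "terms")
```

In the default order, the card that is 30 days overdue comes last simply because its subdeck is listed last; in the triaged order, the chosen subdeck is guaranteed to be seen even if the session ends early.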

### Seizing those moments

Veteran teachers start acquiring a sense of when it might be a good time to go off book and teach something that isn't in the unit, and maybe not even in the curriculum. Maybe it's teaching exactly the right word to describe a vivid situation you're reading about, or maybe it's advice on what to do in a certain type of emergency that nearly happened. As the year progressed, I found myself humoring my instincts more often because of a new confidence that I can turn an impressionable moment into a strong memory and lock it down with a new Anki card. I don't even care if it will ever be on a test. This insight has me questioning a great deal of what I thought I knew about organizing a curriculum. And I like it.

### A lifeline for low performers

An accidental discovery came from having written some cards that were, it was immediately obvious to me, much too easy. I was embarrassed to even be reading them out loud. Then I saw which hands were coming up.

In any class you'll get some small number of extremely low performers who never seem to be doing anything that we're doing, and, when confronted, deny that they have any ability whatsoever. Some of the hands I was seeing were attached to these students. And you better believe I called on them.

It turns out that easy cards are really important because they can give wins to students who desperately need them. Knowing a 6th grade level card in a 10th grade class is no great achievement, of course, but the action takes what had been negative morale and nudges it upward. And it can trend. I can build on it. A few of these students started making Anki the thing they did in class, even if they ignored everything else. I can confidently name one student I'm sure passed my class only because of Anki. Don't get me wrong: he just barely passed. Most cards remained over his head. Anki was no miracle cure here, but it gave him and me something to work with that we didn't have when he failed my class the year before.

### A springboard for high achievers

It's not even fair. The lowest students got something important out of Anki, but the highest achievers drank it up and used it for rocket fuel. When people ask who's widening the achievement gap, I guess I get to raise my hand now.

I refuse to feel bad for this. Smart kids are badly underserved in American public schools thanks to policies that encourage staff to focus on that slice of students near (but not at) the bottom—the ones who might just barely be able to pass the state test, given enough attention.

Where my bright students might have been used to high Bs and low As on tests, they were now breaking my scales. You could see it in the multiple choice, but it was most obvious in their writing: they were skillfully working in terminology at an unprecedented rate, and making way more attempts to use new vocabulary—attempts that were, for the most part, successful.

Given the seemingly objective nature of Anki it might seem counterintuitive that the benefits would be more obvious in writing than in multiple choice, but it actually makes sense when I consider that even without SRS these students probably would have known the terms and the vocab well enough to get multiple choice questions right, but might have lacked the confidence to use them on their own initiative. Anki gave them that extra confidence.

### A wash for the apathetic middle?

I'm confident that about a third of my students got very little out of our Anki review. They were either really good at faking involvement while they zoned out, or didn't even try to pretend and just took the hit to their participation grade day after day, no matter what I did or who I contacted.

These weren't even necessarily failing students—just the apathetic middle that's smart enough to remember some fraction of what they hear and regurgitate some fraction of that at the appropriate times. Review of any kind holds no interest for them. It's a rerun. They don't really know the material, but they tell themselves that they do, and they don't care if they're wrong.

On the one hand, these students are no worse off with Anki than they would have been with the activities it replaced, and nobody cries when average kids get average grades. On the other hand, I'm not ok with this... but so far I don't like any of my ideas for what to do about it.

### Putting up numbers: a case study

For unplanned reasons, I taught a unit at the start of a quarter that I didn't formally test them on until the end of said quarter. Historically, this would have been a disaster. In this case, it worked out well. For five weeks, Anki was the only ongoing exposure they were getting to that unit, but it proved to be enough. Because I had given the same test as a pre-test early in the unit, I have some numbers to back it up. The test was all multiple choice, with two sections: the first was on general terminology and concepts related to the unit. The second was a much harder reading comprehension section.

As expected, scores did not go up much on the reading comprehension section. Overall reading levels are very difficult to boost in the short term and I would not expect any one unit or quarter to make a significant difference. The average score there rose by 4 percentage points, from 48 to 52%.

Scores in the terminology and concept section were more encouraging. For material we had not covered until after the pre-test, the average score rose by 22 percentage points, from 53 to 75%. No surprise there either, though; it's hard to say how much credit we should give to SRS for that.

But there were also a number of questions about material we had already covered before the pretest. Being the earliest material, I might have expected some degradation in performance on the second test. Instead, the already strong average score in that section rose by an additional 3 percentage points, from 82 to 85%. (These numbers are less reliable because of the smaller number of questions, but they tell me Anki at least "locked in" the older knowledge, and may have strengthened it.)
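For readers who want the three comparisons side by side, the gains can be tabulated directly from the averages quoted above. The section labels and the dictionary layout are my own; the percentages are the ones reported in the text.

```python
# section: (pre-test average %, post-test average %), as reported above.
SCORES = {
    "reading comprehension": (48, 52),
    "terminology, covered after the pre-test": (53, 75),
    "terminology, covered before the pre-test": (82, 85),
}


def gains(scores):
    """Change in average score, in percentage points, for each test section."""
    return {section: post - pre for section, (pre, post) in scores.items()}


gain_by_section = gains(SCORES)
```

The third row is the interesting one for SRS: material that received no new teaching after the pre-test did not decay over the five weeks, and moved up by 3 points.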

Some other time, I might try reserving a section of content that I teach before the pre-test but don't make any Anki cards for. This would give me a way to compare Anki to an alternative review exercise.

### What about formal standardized tests?

I don't know yet. The scores aren't back. I'll probably be shown some "value added" analysis numbers at some point that tell me whether my students beat expectations, but I don't know how much that will tell me. My students were consistently beating expectations before Anki, and the state gave an entirely different test this year because of legislative changes. I'll go back and revise this paragraph if I learn anything useful.

### Those discussions...

If I'm trying to acquire a new skill, one of the first things I try to do is listen to skilled practitioners of that skill talk about it to each other. What are the terms-of-art? How do they use them? What does this tell me about how they see their craft? Their shorthand is a treasure trove of crystallized concepts; once I can use it the same way they do, I find I'm working at a level of abstraction much closer to theirs.

Similarly, I was hoping Anki could help make my students more fluent in the subject-specific lexicon that helps you score well in analytical essays. After introducing a new term and making the Anki card for it, I made extra efforts to use it conversationally. I used to shy away from that because so many students would have forgotten it immediately and tuned me out for not making any sense. Not this year. Once we'd seen the card, I used the term freely, with only the occasional reminder of what it meant. I started using multiple terms in the same sentence. I started talking about writing and analysis the way my fellow experts do, and so invited them into that world.

Even though I was already seeing written evidence that some of my high performers had assimilated the lexicon, the high quality discussions of these same students caught me off guard. You see, I usually dread whole-class discussions with non-honors classes because good comments are so rare that I end up dejectedly spouting all the insights I had hoped they could find. But by the end of the year, my students had stepped up.

I think what happened here was, as with the writing, as much a boost in confidence as a boost in fluency. Whatever it was, they got into some good discussions where they used the terminology and built on it to say smarter stuff.

Don't get me wrong. Most of my students never got to that point. But on average even small groups without smart kids had a noticeably higher level of discourse than I am used to hearing when I break up the class for smaller discussions.

### Limitations

SRS is inherently weak when it comes to the abstract and complex. No card I've devised enables a student to develop a distinctive authorial voice, or write essay openings that reveal just enough to make the reader curious. Yes, you can make cards about strategies for this sort of thing, but these were consistently my worst cards—the overly difficult "leeches" that I eventually suspended from my decks.

A less obvious limitation of SRS is that students with a very strong grasp of a concept often fail to apply that knowledge in more authentic situations. For instance, they may know perfectly well the difference between "there", "their", and "they're", but never pause to think carefully about whether they're using the right one in a sentence. I am very open to suggestions about how I might train my students' autonomous "System 1" brains to have "interrupts" for that sort of thing... or even just a reflex to go back and check after finishing a draft.

### Moving forward

I absolutely intend to continue using SRS in the classroom. Here's what I intend to do differently this coming school year:

• Reduce the number of cards by about 20%, to maybe 850-950 for the year in a given course, mostly by reducing the number of variations on some overexposed concepts.
• Be more willing to add extra Anki study sessions to stay better caught-up with the deck, even if this means my lesson content doesn't line up with class periods as neatly.
• Be more willing to press the red button on cards we need to re-learn. I think I was too hesitant here because we were rarely caught up as it was.
• Rework underperforming cards to be simpler and more fun.
• Use more simple cloze deletion cards. I only had a few of these, but they worked better than I expected for structured idea sets like, "characteristics of a tragic hero".
• Take a less linear and more opportunistic approach to introducing terms and concepts.
• Allow for more impromptu discussions where we bring up older concepts in relevant situations and build on them.
• Shape more of my lessons around the "vivid memory, card ready" philosophy.
• Continue to reduce needless student note-taking.
• Keep a close eye on 10th grade students who had me for 9th grade last year. I wonder how much they retained over the summer, and I can't wait to see what a second year of SRS will do for them.

## The Pre-Historical Fallacy

12 03 July 2015 08:21PM

One fallacy that I see frequently in works of popular science -- and also here on LessWrong -- is the belief that we have strong evidence of the way things were in pre-history, particularly when one is giving evidence that we can explain various aspects of our culture, psychology, or personal experience because we evolved in a certain way. Moreover, it is held implicit that because we have this 'strong evidence', it must be relevant to the topic at hand. While it is true that the environment did affect our evolution and thus the way we are today, evolution and the anthropology of pre-historic societies are emphasized to a much greater extent than rational thought would indicate is appropriate.

As a matter of course, you should remember these points whenever you hear a claim about prehistory:

• Most of what we know (or guess) is based on less data than you would expect, and the publish or perish mentality is alive and well in the field of anthropology.
• Most of the information is limited and technical, which means that anyone writing for a popular audience will have strong motivation to generalize and simplify.
• It has been found time and time again that for any statement we can make about human culture and behavior, there is (or was) a society somewhere that will serve as a counterexample.
• Very rarely do anthropologists or members of related fields have finely tuned critical thinking skills or a strong background in the philosophy of science, and they are often highly motivated to come up with interpretations of results that match their previous theories and expectations.

Results that you should have reasonable levels of confidence in should be framed in generalities, not absolutes. E.g., "The great majority of human cultures that we have observed have distinct and strong religious traditions", and not "humans evolved to have religion". It may be true that we have areas in our brain that evolved not only 'consistent with holding religion', but actually evolved 'specifically for the purpose of experiencing religion'... but it would be very hard to prove this second statement, and anyone who makes it should be highly suspect.

Perhaps more importantly, these statements are almost always a red herring. It may make you feel better that humans evolved to be violent, to fit in with the tribe, to eat meat, to be spiritual, to die at the age of thirty.... But rarely do we see these claims in a context where the stated purpose is to make you feel better. Instead they are couched in language indicating that they are making a normative statement -- that this is the way things in some way should be. (This is specifically the argumentum ad antiquitatem or appeal to tradition, and should not be confused with the historical fallacy, but it is certainly a fallacy).

It is fine to identify, for example, that your fear of flying has an evolutionary basis. However, it is foolish to therefore refuse to fly because it is unnatural, or to undertake gene therapy to correct the fear. Whether or not the explanation is valid, it is not meaningful.

Obviously, this doesn't mean that we shouldn't study evolution or the effects evolution has on behavior. However, any time you hear someone refer to this information in order to support any argument outside the fields of biology or anthropology, you should look carefully at why they are taking the time to distract you from the practical implications of the matter under discussion.

## Weekly LW Meetups

3 03 July 2015 06:05PM

This summary was posted to LW Main on June 26th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.

## A Federal Judge on Biases in the Criminal Justice System.

20 03 July 2015 03:17AM

A well-known American federal appellate judge, Alex Kozinski, has written a commentary on systemic biases and institutional myths in the criminal justice system.

The basic thrust of his criticism will be familiar to readers of the sequences and rationalists generally. Lots about cognitive biases (but some specific criticisms of fingerprints and DNA evidence as well). Still, it's interesting that a prominent federal judge -- the youngest when appointed, and later chief of the Ninth Circuit -- would treat some sacred cows of the judiciary so ruthlessly.

This is specifically a criticism of U.S. criminal justice, but, ceteris paribus, much of it applies not only to other areas of U.S. law, but to legal practices throughout the world as well.

## MIRI needs an Office Manager (aka Force Multiplier)

16 03 July 2015 01:10AM

(Cross-posted from MIRI's blog.)

MIRI's looking for a full-time office manager to support our growing team. It’s a big job that requires organization, initiative, technical chops, and superlative communication skills. You’ll develop, improve, and manage the processes and systems that make us a super-effective organization. You’ll obsess over our processes (faster! easier!) and our systems (simplify! simplify!). Essentially, it’s your job to ensure that everyone at MIRI, including you, is able to focus on their work and Get Sh*t Done.

That’s a super-brief intro to what you’ll be working on. But first, you need to know if you’ll even like working here.

We’re a research nonprofit working on the critically important problem of superintelligence alignment: how to bring smarter-than-human artificial intelligence into alignment with human values.[1] Superintelligence alignment is a burgeoning field, and arguably the most important and under-funded research problem in the world. Experts largely agree that AI is likely to exceed human levels of capability on most cognitive tasks in this century, but it’s not clear when, and we aren’t doing a very good job of preparing for the possibility. Given how disruptive smarter-than-human AI would be, we need to start thinking now about AI’s global impact. Over the past year, a number of leaders in science and industry have voiced their support for prioritizing this endeavor:

People are starting to discuss these issues in a more serious way, and MIRI is well-positioned to be a thought leader in this important space. As interest in AI safety grows, we’re growing too—we’ve gone from a single full-time researcher in 2013 to what will likely be a half-dozen research fellows by the end of 2015, and intend to continue growing in 2016.

All of which is to say: we really need an office manager who will support our efforts to hack away at the problem of superintelligence alignment!

If our overall mission seems important to you, and you love running well-oiled machines, you’ll probably fit right in. If that’s the case, we can’t wait to hear from you.

## What it’s like to work at MIRI

We try really hard to make working at MIRI an amazing experience. We have a team full of truly exceptional people—the kind you’ll be excited to work with. Here’s how we operate:

### Flexible Hours

We do not have strict office hours. Simply ensure you’re here enough to be available to the team when needed, and to fulfill all of your duties and responsibilities.

### Modern Work Spaces

Many of us have adjustable standing desks with multiple large external monitors. We consider workspace ergonomics important, and try to rig up work stations to be as comfortable as possible.

### Living in the Bay Area

We’re located in downtown Berkeley, California. Berkeley’s monthly average temperature ranges from 60°F in the winter to 75°F in the summer. From our office you’re:

• A 10-second walk to the roof of our building, from which you can view the Berkeley Hills, the Golden Gate Bridge, and San Francisco.
• A 30-second walk to the BART (Bay Area Rapid Transit), which can get you around the Bay Area.
• A 3-minute walk to UC Berkeley Campus.
• A 5-minute walk to dozens of restaurants (including ones in Berkeley’s well-known Gourmet Ghetto).
• A 30-minute BART ride to downtown San Francisco.
• A 30-minute drive to the beautiful west coast.
• A 3-hour drive to Yosemite National Park.

### Vacation Policy

Our vacation policy is that we don’t have a vacation policy. That is, take the vacations you need to be a happy, healthy, productive human. There are checks in place to ensure this policy isn’t abused, but we haven’t actually run into any problems since initiating the policy.

We consider our work important, and we care about whether it gets done well, not about how many total hours you log each week. We’d much rather you take a day off than extend work tasks just to fill that extra day.

### Regular Team Dinners and Hangouts

We get the whole team together every few months, order a bunch of food, and have a great time.

### Top-Notch Benefits

We provide top-notch health and dental benefits. We care about our team’s health, and we want you to be able to get health care with as little effort and annoyance as possible.

### Agile Methodologies

Our ops team follows standard Agile best practices, meeting regularly to plan, as a team, the tasks and priorities over the coming weeks. If the thought of being part of an effective, well-functioning operation gets you really excited, that’s a promising sign!

### Other Tidbits

• Moving to the Bay Area? We’ll cover up to $3,500 in travel expenses.
• Use public transit to get to work? You get a transit pass with a large monthly allowance.
• All the snacks and drinks you could want at the office.
• You’ll get a smartphone and full plan.
• This is a salaried position. (That is, your job is not to sit at a desk for 40 hours a week. Your job is to get your important work done, even if this occasionally means working on a weekend or after hours.)

It can also be surprisingly motivating to realize that your day job is helping people explore the frontiers of human understanding, mitigate global catastrophic risk, etc., etc. At MIRI, we try to tackle the very largest problems facing humanity, and that can be a pretty satisfying feeling.

## What an office manager does and why it matters

Our ops team and researchers (and collection of remote contractors) are swamped making progress on the huge task we’ve taken on as an organization.

That’s where you come in. An office manager is the oil that keeps the engine running. They’re indispensable. Office managers are force multipliers: a good one doesn’t merely improve their own effectiveness—they make the entire organization better.

We need you to build, oversee, and improve all the “behind-the-scenes” things that ensure MIRI runs smoothly and effortlessly. You will devote your full attention to looking at the big picture and the small details and making sense of it all. You’ll turn all of that into actionable information and tools that make the whole team better. That’s the job.

Sometimes this looks like researching and testing out new and exciting services. Other times this looks like stocking the fridge with drinks, sorting through piles of mail, lugging bags of groceries, or spending time on the phone on hold with our internet provider. But don’t think that the more tedious tasks are low-value. If the hard tasks don’t get done, none of MIRI’s work is possible. Moreover, you’re actively encouraged to find creative ways to make the boring stuff more efficient—making an awesome spreadsheet, writing a script, training a contractor to take on the task—so that you can spend more time on what you find most exciting.

We’re small, but we’re growing, and this is an opportunity for you to grow too. There’s room for advancement at MIRI (if that interests you), based on your interests and performance.

You’ll have a wide variety of responsibilities, including, but not necessarily limited to, the following:

• Orienting and training new staff.
• Onboarding and offboarding staff and contractors.
• Managing employee benefits and services, like transit passes and health care.
• Payroll management; handling staff questions.
• Championing our internal policies and procedures wiki—keeping everything up to date, keeping everything accessible, and keeping staff aware of relevant information.
• Managing various services and accounts (e.g., internet, phone, insurance).
• Championing our work space, with the goal of making the MIRI office a fantastic place to work.
• Running onsite logistics for introductory workshops.
• Processing all incoming mail packages.
• Researching and implementing better systems and procedures.

Your “value-add” comes from taking responsibility for making all of these things happen. Having a competent individual in charge of this diverse set of tasks at MIRI is extremely valuable!

### A Day in the Life

A typical day in the life of MIRI’s office manager may look something like this:

• Come in.
• Process email inbox.
• Process any incoming mail, scanning/shredding/dealing-with as needed.
• Stock the fridge, review any low-stocked items, and place an order online for whatever’s missing.
• Onboard a new contractor.
• Spend some time thinking of a faster/easier way to onboard contractors. Implement any hacks you come up with.
• Notice that you’ve spent a few hours per week the last few weeks doing xyz. Spend some time figuring out whether you can eliminate the task completely, automate it in some way, outsource it to a service, or otherwise simplify the process.
• Review the latest post drafts on the wiki. Polish drafts as needed and move them to the appropriate location.
• Process email.
• Go home.

### You’re the one we’re looking for if:

• You are authorized to work in the US. (Prospects for obtaining an employment-based visa for this type of position are slim; sorry!)
• You can solve problems for yourself in new domains; you find that you don’t generally need to be told what to do.
• You love organizing information. (There’s a lot of it, and it needs to be super-accessible.)
• Your life is organized and structured.
• You enjoy trying things you haven’t done before. (How else will you learn which things work?)
• You’re way more excited at the thought of being the jack-of-all-trades than at the thought of being the specialist.
• You are good with people—good at talking about things that are going great, as well as things that aren’t.
• People thank you when you deliver difficult news. You’re that good.
• You can notice all the subtle and wondrous ways processes can be automated, simplified, streamlined… while still keeping the fridge stocked in the meantime.
• You know your way around a computer really well.
• Really, really well.
• You enjoy eliminating unnecessary work, automating automatable work, outsourcing outsourceable work, and executing on everything else.
• You want to do what it takes to help all other MIRI employees focus on their jobs.
• You’re the sort of person who sees the world, organizations, and teams as systems that can be observed, understood, and optimized.
• You think Sam is the real hero in Lord of the Rings.
• You have the strong ability to take real responsibility for an issue or task, and ensure it gets done. (This doesn’t mean it has to get done by you; but it has to get done somehow.)
• You celebrate excellence and relentlessly pursue improvement.

### Bonus Points:

• Your technical chops are really strong. (Dabbled in scripting? HTML/CSS? Automator?)
• Involvement in the Effective Altruism space.
• Involvement in the broader AI-risk space.
• Previous experience as an office manager.

### Experience & Education Requirements

• Let us know about anything that’s evidence that you’ll fit the bill.

## How to Apply

by July 31, 2015!

P.S. Share the love! If you know someone who might be a perfect fit, we’d really appreciate it if you pass this along!

1. More details on our About page.

## A Roadmap: How to Survive the End of the Universe

5 02 July 2015 11:01AM

In a sense, this plan needs to be perceived with irony because it is almost irrelevant: we have very small chances of surviving even the next 1000 years, and if we do, we have a lot of things to do before it becomes reality. And even afterwards, our successors will have completely different plans.

There is one important exception: there are suggestions that collider experiments may lead to a vacuum phase transition, which begins at one point and spreads across the visible universe. In that case we could destroy ourselves and our universe in this century, but it would happen so quickly that we would not have time to notice it. (The term "universe" hereafter refers to the observable universe, that is, the three-dimensional world around us, resulting from the Big Bang.)

We may also be able to solve this problem in the next century if we create superintelligence.

The purpose of this plan is to show that actual immortality is possible: that we have an opportunity to live not just billions and trillions of years, but an unlimited duration. My hope is that the plan will encourage us to invest more in life extension and prevention of global catastrophic risks. Our life could be eternal and thus have meaning forever.

Anyway, the end of the observable universe is not an absolute end: it's just one more problem on which the future human race will be able to work. And even at the negligible level of knowledge about the universe that we have today, we are still able to offer more than 50 ideas on how to prevent its end.

In fact, to assemble and come up with these 50 ideas I spent about 200 working hours, and if I had spent more time on it, I'm sure I would have found many new ideas.  In the distant future we can find more ideas; choose the best of them; prove them, and prepare for their implementation.

First of all, we need to understand exactly what kind of end to the universe we should expect in the natural course of things. There are many hypotheses on this subject, which can be divided into two large groups:

1. The universe is expected to have a relatively quick and abrupt end, known as the Big Crunch or Big Rip (accelerating expansion of the universe causes it to break apart), or the decay of the false vacuum. Vacuum decay can occur at any time; a Big Rip could happen in about 10-30 billion years, and the Big Crunch has a timescale of hundreds of billions of years.

2. Another scenario assumes an infinitely long existence of an empty, flat and cold universe which would experience so-called "heat death", that is, a gradual halting of all processes and then the disappearance of all matter.

The choice between these scenarios depends on the geometry of the universe, which is determined by the equations of general relativity and, above all, the behavior of an almost unknown parameter: dark energy.

The recent discovery of dark energy has made the Big Rip the most likely scenario, but it is clear that the picture of the end of the universe will change several times.

You can find more at: http://en.wikipedia.org/wiki/Ultimate_fate_of_the_universe

There are five general approaches to solving the end-of-the-universe problem. Each of them includes many subtypes, shown in the map:

1.     Surf the Wave: Utilize the nature of the process which is ending the universe. (The best known solution of this type is the Omega Point by Tipler, where the universe's energy collapse is used to make infinite calculations.)

2.     Go to a parallel world

3.     Prevent the end of the universe

4.     Survive the end of the universe

5.     Dissolve the problem

Some of the ideas are on the level of the wildest possible speculations and I hope you will enjoy them.

The new feature of this map is that, in many cases, the ideas mentioned are linked to the corresponding wiki pages in the pdf.

11 02 July 2015 01:55AM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This week we discuss Part D: Mysterious Answers (pp. 117-191). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

### C. Noticing Confusion

30. Fake Explanations - People think that fake explanations use words like "magic," while real explanations use scientific words like "heat conduction." But being a real explanation isn't a matter of literary genre. Scientific-sounding words aren't enough. Real explanations constrain anticipation. Ideally, you could explain only the observations that actually happened. Fake explanations could just as well "explain" the opposite of what you observed.

In schools, "education" often consists of having students memorize answers to specific questions (i.e., the "teacher's password"), rather than learning a predictive model that says what is and isn't likely to happen. Thus, students incorrectly learn to guess at passwords in the face of strange observations rather than admit their confusion. Don't do that: any explanation you give should have a predictive model behind it. If your explanation lacks such a model, start from a recognition of your own confusion and surprise at seeing the result.

You don't understand the phrase "because of evolution" unless it constrains your anticipations. Otherwise, you are using it as attire to identify yourself with the "scientific" tribe. Similarly, it isn't scientific to reject strongly superhuman AI only because it sounds like science fiction. A scientific rejection would require a theoretical model that bounds possible intelligences. If your proud beliefs don't constrain anticipation, they are probably just passwords or attire.

33. Fake Causality - It is very easy for a human being to think that a theory predicts a phenomenon, when in fact it was fitted to a phenomenon. Properly designed reasoning systems (GAIs) would be able to avoid this mistake with our knowledge of probability theory, but humans have to write down a prediction in advance in order to ensure that our reasoning about causality is correct.

There are certain words and phrases that act as "stopsigns" to thinking. They aren't actually explanations, nor do they help to resolve the actual issue at hand, but they act as a marker saying "don't ask any questions."

The theory of vitalism was developed before the idea of biochemistry. It stated that the mysterious properties of living matter, compared to nonliving matter, were due to an "elan vital". This explanation acts as a curiosity-stopper, and leaves the phenomenon just as mysterious and inexplicable as it was before the answer was given. It feels like an explanation, though it fails to constrain anticipation.

The theory of "emergence" has become very popular, but is just a mysterious answer to a mysterious question. After learning that a property is emergent, you aren't able to make any new predictions.

The concept of complexity isn't meaningless, but too often people assume that adding complexity to a system they don't understand will improve it. If you don't know how to solve a problem, adding complexity won't help; better to say "I have no idea" than to say "complexity" and think you've reached an answer.

Positive bias is the tendency to look for evidence that confirms a hypothesis, rather than disconfirming evidence.

Facing a random scenario, the correct solution is really not to behave randomly. Faced with an irrational universe, throwing away your rationality won't help.

Traditional rationality (without Bayes' Theorem) allows you to formulate hypotheses without a reason to prefer them to the status quo, as long as they are falsifiable. Even following all the rules of traditional rationality, you can waste a lot of time. It takes a lot of rationality to avoid making mistakes; a moderate level of rationality will just lead you to make new and different mistakes.

There are no inherently mysterious phenomena, but every phenomenon seems mysterious, right up until the moment that science explains it. It seems to us now that biology, chemistry, and astronomy are naturally the realm of science, but if we had lived through their discoveries, and watched them reduced from mysterious to mundane, we would be more reluctant to believe the next phenomenon is inherently mysterious.

It's easy not to take the lessons of history seriously; our brains aren't well-equipped to translate dry facts into experiences. But imagine living through the whole of human history - imagine watching mysteries be explained, watching civilizations rise and fall, being surprised over and over again - and you'll be less shocked by the strangeness of the next era.

When you encounter something you don't understand, you have three options: to seek an explanation, knowing that that explanation will itself require an explanation; to avoid thinking about the mystery at all; or to embrace the mysteriousness of the world and worship your confusion.

Although science does have explanations for phenomena, it is not enough to simply say that "Science!" is responsible for how something works -- nor is it enough to appeal to something more specific like "electricity" or "conduction". Yet for many people, simply noting that "Science has an answer" is enough to make them no longer curious about how it works. In that respect, "Science" is no different from more blatant curiosity-stoppers like "God did it!" But you shouldn't let your interest die simply because someone else knows the answer (which is a rather strange heuristic anyway): You should only be satisfied with a predictive model, and how a given phenomenon fits into that model.

Any time you believe you've learned something, you should ask yourself, "Could I re-generate this knowledge if it were somehow deleted from my mind, and how would I do so?" If the supposed knowledge is just empty buzzwords, you will recognize that you can't, and therefore that you haven't learned anything. But if it's an actual model of reality, this method will reinforce how the knowledge is entangled with the rest of the world, enabling you to apply it to other domains, and know when you need to update those beliefs. It will have become "truly part of you", growing and changing with the rest of your knowledge.

Interlude: The Simple Truth

This has been a collection of notes on the assigned sequence for this week. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part E: Overly Convenient Excuses (pp. 211-252). The discussion will go live on Wednesday, 15 July 2015 at or around 6 p.m. PDT, right here on the discussion forum of LessWrong.

5 01 July 2015 09:15PM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.

Rules:

• Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
• If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
• Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
• Use the "Meta" thread if you want to discuss the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

## Analogical Reasoning and Creativity

20 01 July 2015 08:38PM

This article explores analogism and creativity, starting with a detailed investigation into IQ-test style analogy problems and how both the brain and some new artificial neural networks solve them.  Next we analyze concept map formation in the cortex and the role of the hippocampal complex in establishing novel semantic connections: the neural basis of creative insights.  From there we move into learning strategies, and finally conclude with speculations on how a grounded understanding of analogical creative reasoning could be applied towards advancing the art of rationality.

1. Introduction
2. Under the Hood
3. Conceptual Abstractions and Cortical Maps
4. The Hippocampal Association Engine
5. Cultivate memetic heterogeneity and heterozygosity
6. Construct and maintain clean conceptual taxonomies
7. Conclusion

#### Introduction

The computer is like a bicycle for the mind.

-- Steve Jobs

The kingdom of heaven is like a mustard seed, the smallest of all seeds, but when it falls on prepared soil, it produces a large plant and becomes a shelter for the birds of the sky.

-- Jesus

Sigmoidal neural networks are like multi-layered logistic regression.

-- various

The threat of superintelligence is like a tribe of sparrows who find a large egg to hatch and raise.  It grows up into a great owl which devours them all.

-- Nick Bostrom (see this video)

Analogical reasoning is one of the key foundational mechanisms underlying human intelligence, and perhaps a key missing ingredient in machine intelligence.  For some - such as Douglas Hofstadter - analogy is the essence of cognition itself.[1]

Steve Jobs's bicycle analogy is clever because it encapsulates the whole cybernetic idea of computers as extensions of the nervous system into a single memorable sentence using everyday terms.

A large chunk of Jesus's known sayings are parables about the 'Kingdom of Heaven': a complex enigmatic concept that he explains indirectly through various analogies, of which the mustard seed is perhaps the most memorable.  It conveys the notions of exponential/sigmoidal growth of ideas and social movements (see also the Parable of the Leaven), while also hinting at greater future purpose.

In a number of fields, including the technical, analogical reasoning is key to creativity: most new insights come from establishing mappings between or with concepts from other fields or domains, or from generalizing existing insights/concepts (which is closely related).  These abilities all depend on deep, wide, and well organized internal conceptual maps.

In a previous post, I presented a high level working hypothesis of the brain as a biological implementation of a universal learning machine, using various familiar computational concepts as analogies to explain brain subsystems.  In my last post, I used the conceptions of unfriendly superintelligence and value alignment as analogies for market mechanism design and the healthcare problem (and vice versa).

A clever analogy is like a sophisticated conceptual compressor that helps maximize knowledge transmission.  Coming up with good novel analogies is hard because it requires compressing a complex large body of knowledge into a succinct message that heavily exploits the recipient's existing knowledgebase.  Due to the deep connections between compression, inference, intelligence and creativity, a deeper investigation of analogical reasoning is useful from a variety of angles.

It is the hard task of coming up with novel analogical connections that can lead to creative insights, but to understand that process we should start first with the mechanics of recognition.

#### Under the Hood

You can think of the development of IQ tests as a search for simple tests which have high predictive power for g-factor in humans, while being relatively insensitive to specific domain knowledge.  That search process resulted in a number of problem categories, many of which are based on verbal and mathematical analogies.

The image to the right is an example of a simple geometric analogy problem.  As an experiment, start a timer before having a go at it.  For bonus points, attempt to introspect on your mental algorithm.

Solving this problem requires first reducing the images to simpler compact abstract representations.  The first rows of images then become something like sentences describing relations or constraints (Z is to ? as A is to B and C is to D).  The solution to the query sentence can then be found by finding the image which best satisfies the likely analogous relations.

Imagine watching a human subject (such as your previous self) solve this problem while hooked up to a future high resolution brain imaging device.  Viewed in slow motion, you would see the subject move their eyes from location to location through a series of saccades, while various vectors or mental variable maps flowed through their brain modules.  Each fixation lasts about 300ms[2], which gives enough time for one complete feedforward pass through the ventral vision stream and perhaps one backwards sweep.

The output of the ventral stream in inferior temporal cortex (TE on the bottom) results in abstract encodings which end up in working memory buffers in prefrontal cortex.  From there some sort of learned 'mental program' implements the actual analogy evaluations, probably involving several more steps in PFC, cingulate cortex, and various other cortical modules (coordinated by the basal ganglia and PFC).  Meanwhile the frontal eye fields and various related modules are computing the next saccade decision every 300ms or so.

If we assume that visual parsing requires one fixation on each object and 50ms saccades, this suggests that solving this problem would take a typical brain a minimum of about 4 seconds (and much longer on average).  The minimum estimate assumes - probably unrealistically - that the subject can perform the analogy checks or mental rotations near instantly without any backtracking to help prime working memory.  Of course faster times are also theoretically possible - but not dramatically faster.
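The arithmetic behind that estimate can be sketched in a few lines.  The 300ms fixation and 50ms saccade figures come from the text above; the object counts of 11 and 12 are assumptions, chosen only because they bracket the quoted minimum of about 4 seconds.

```python
FIXATION_MS = 300  # per-fixation time quoted above
SACCADE_MS = 50    # per-saccade time assumed above


def min_solve_time_ms(n_objects):
    """Lower bound: one fixation plus one saccade per object, no backtracking."""
    return n_objects * (FIXATION_MS + SACCADE_MS)


# Hypothetical object counts (not taken from the original image).
for n in (11, 12):
    print(n, "objects ->", min_solve_time_ms(n) / 1000, "seconds")
```

Read the other way around, a 4 second minimum at 350ms per object implies roughly eleven to twelve fixated objects.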

These types of visual analogy problems test a wide set of cognitive operations, which by itself can explain much of the correlation with IQ or g-factor: speed and efficiency of neural processing, working memory, module communication, etc.

However once we lay all of that aside, there remains a core dependency on the ability for conceptual abstraction.  The mapping between these simple visual images and their compact internal encodings is ambiguous, as is the predictive relationship.  Solving these problems requires the ability to find efficient and useful abstractions - a general pattern recognition ability which we can relate to efficient encoding, representation learning, and nonlinear dimension reduction: the very essence of learning in both man and machine[3].

The machine learning perspective can help make these connections more concrete when we look into state of the art programs for IQ tests in general and analogy problems in particular.  Many of the specific problem subtypes used in IQ tests can be solved by relatively simple programs.  In 2003, Sanghi and Dowe created a simple Perl program (less than 1000 lines of code) that can solve several specific subtypes of common IQ problems[4] - but not analogies.  It scored an IQ of a little over 100, simply by excelling in a few categories and making random guesses for the remaining harder problem types.  Thus its score is highly dependent on the test's particular mix of subproblems, but that is also true for humans to some extent.

The IQ test sub-problems that remain hard for computers are those that require pattern recognition combined with analogical reasoning and/or inductive inference.  Precise mathematical inductive inference is easier for machines, whereas humans excel at natural reasoning - inference problems involving huge numbers of variables that can only be solved by scalable approximations.

For natural language tasks, neural networks have recently been used to learn vector embeddings which map words or sentences to abstract conceptual spaces encoded as vectors (typically of dimensionality 100 to 1000).  Combining word vector embeddings with some new techniques for handling multiple word senses, Wang and Gao et al just recently trained a system that can solve typical verbal reasoning problems from IQ tests (or the GRE) at upper human level - including verbal analogies[5].

The word vector embedding is learned as a component of an ANN trained via backprop on a large corpus of text data - Wikipedia.  This particular model is rather complex: it combines a multi-sense word embedding, a local sliding window prediction objective, task-specific geometric objectives, and relational regularization constraints.  Unlike the recent crop of general linguistic modeling RNNs, this particular system doesn't model full sentence structure or longer term dependencies - as those aren't necessary for answering these specific questions.  Surprisingly all it takes to solve the verbal analogy problems typical of IQ/SAT/GRE style tests are very simple geometric operations in the word vector space - once the appropriate embedding is learned.

As a trivial example: "Uncle is to Aunt as King is to ?" literally reduces to:

Uncle + X = Aunt, King + X = ?, and thus X = Aunt-Uncle, and:

? = King + (Aunt-Uncle).

The (Aunt-Uncle) expression encapsulates the concept of 'femaleness', which can be combined with any male version of a word to get the female version.  This is perhaps the simplest example, but more complex transformations build on this same principle.  The embedded concept space allows for easy mixing and transforms of memetic sub-features to get new concepts.
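As a minimal sketch of that vector arithmetic, the toy below uses hand-made 3-dimensional embeddings.  The vectors, their axes, and the function names are all hypothetical; a real system learns embeddings with hundreds of dimensions from a text corpus.

```python
import math

# Hypothetical embeddings with hand-picked axes: (royalty, femaleness, kinship).
EMBEDDINGS = {
    "uncle": (0.0, 0.0, 1.0),
    "aunt": (0.0, 1.0, 1.0),
    "king": (1.0, 0.0, 0.0),
    "queen": (1.0, 1.0, 0.0),
    "prince": (0.8, 0.0, 0.3),
    "princess": (0.8, 1.0, 0.3),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via ? = c + (b - a)."""
    va, vb, vc = EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c]
    target = tuple(z + (y - x) for x, y, z in zip(va, vb, vc))
    # Pick the nearest remaining word, excluding the three query words.
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))


print(solve_analogy("uncle", "aunt", "king"))
```

With these toy vectors the (aunt - uncle) difference is exactly the 'femaleness' axis, so adding it to king lands on queen.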

#### Conceptual Abstractions and Cortical Maps

The success of these simplistic geometric transforms operating on word vector embeddings should not come as a huge surprise to one familiar with the structure of the brain.  The brain is extraordinarily slow, so it must learn to solve complex problems via extremely simple and short mental programs operating on huge wide vectors.  Humans (and now convolutional neural networks) can perform complex visual recognition tasks in just 10-15 individual computational steps (150 ms), or 'cortical clock cycles'.  The entire program that you used to solve the earlier visual analogy problem probably took on the order of a few thousand cycles (assuming it took you a few dozen seconds).  Einstein solved general relativity in - very roughly - around 10 billion low level cortical cycles.
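The cycle counts above follow from simple arithmetic, sketched here.  The 150 ms and 10-15 step figures are from the text; the 30 second solve time is an assumed stand-in for "a few dozen seconds", and the conversion to years assumes continuous processing at the faster rate.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def cycle_rate_hz(task_ms, steps):
    """Cortical 'clock rate' implied by doing `steps` steps in `task_ms`."""
    return steps * 1000 / task_ms


fast = cycle_rate_hz(150, 15)  # 15 steps in 150 ms
slow = cycle_rate_hz(150, 10)  # 10 steps in 150 ms

# Assumed 30 second solve time for the visual analogy problem.
analogy_cycles = 30 * fast

# Time needed for 10 billion cycles at the faster rate.
years_for_1e10_cycles = 1e10 / fast / SECONDS_PER_YEAR

print(round(slow), "to", round(fast), "Hz")
print(round(analogy_cycles), "cycles;", round(years_for_1e10_cycles, 1), "years")
```

So the implied rate is somewhere around 67 to 100 Hz, which gives a few thousand cycles for the analogy problem and on the order of three years of uninterrupted thought for 10 billion cycles.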

The core principle behind word vector embeddings, convolutional neural networks, and the cortex itself is the same: learning to represent the statistical structure of the world by an efficient low complexity linear algebra program (consisting of local matrix vector products and per-element non-linearities).  The local wiring structure within each cortical module is equivalent to a matrix with sparse local connectivity, optimized heavily for wiring and computation such that semantically related concepts cluster close together.

(Concept mapping the cortex, from this research page)

The image above is from the paper "A Continuous Semantic Space Describes the Representation of Thousands of Object and Action Categories across the Human Brain" by Huth et al.[6]  They used fMRI to record activity across the cortex while subjects watched annotated video clips, and then used that data to find out roughly what types of concepts each voxel of cortex responds to.  It correctly identifies the FFA region as specializing in people-face things and the PPA as specializing in man-made objects and buildings.  A limitation of the above image visualizations is that they don't show response variance or breadth, so the voxel colors are especially misleading for lower level cortical regions that represent generic local features (such as gabor edges in V1).

The power of analogical reasoning depends entirely on the formation of efficient conceptual maps that carve reality at the joints.  The visual pathway learns a conceptual hierarchy that builds up objects from their parts: a series of hierarchical has-a relationships encoded in the connections between V1, V2, V4 and so on.  Meanwhile the semantic clustering within individual cortical maps allows for fast computations of is-a relationships through simple local pooling filters.

An individual person can be encoded as a specific active subnetwork in the face region, and simple pooling over a local cluster of neurons across the face region can then compute the presence of a face in general.  Smaller local pooling filters with more specific shapes can then compute the presence of a female or male face, and so on - all starting from the full specific feature encoding.

The pooling filter concept has been extensively studied in the lower levels of the visual system, where 'complex' cells higher up in V1 pool over 'simple' cell features: abstracting away gabor edges at specific positions to get edges OR'd over a range of positions (CNNs use this same technique to gain invariance to small local translations).
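A minimal sketch of that pooling idea, using a one-dimensional strip of 'simple cell' responses.  The response values, the window size, and the function name are illustrative assumptions, not a model of actual V1 circuitry.

```python
def max_pool(responses, window):
    """Each 'complex cell' reports the max (an OR, for 0/1 inputs)
    over a non-overlapping window of 'simple cell' responses."""
    return [max(responses[i:i + window])
            for i in range(0, len(responses), window)]


# The same edge detected at two nearby positions (hypothetical responses).
edge_at_1 = [0, 1, 0, 0, 0, 0, 0, 0]
edge_at_2 = [0, 0, 1, 0, 0, 0, 0, 0]
# The same edge shifted far enough to fall into the next window.
edge_at_5 = [0, 0, 0, 0, 0, 1, 0, 0]

print(max_pool(edge_at_1, 4), max_pool(edge_at_2, 4), max_pool(edge_at_5, 4))
```

The first two inputs produce the same pooled output, which is the invariance to small translations; the third does not, since the shift crosses a window boundary.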

This key semantic organization principle is used throughout the cortex: is-a relations and more general abstractions/invariances are computed through fast local intramodule connections that exploit the physical semantic clustering on the cortical surface, and more complex has-a relations and arbitrary transforms (ex: mapping between an eye centered coordinate basis and a body centered coordinate basis) are computed through intermodule connections (which also exploit physical clustering).

#### The Hippocampal Association Engine

The Hippocampus is a tubular seahorse shaped module located in the center of the brain, to the exterior side of the central structures (basal ganglia, thalamus).  It is the brain's associative database and search engine responsible for storing, retrieving, and consolidating patterns and declarative memories (those which we are consciously aware of and can verbally declare) over long time scales beyond the reach of short term memory in the cortex itself.

A human (or animal) unfortunate enough to suffer complete loss of hippocampal functionality basically loses the ability to form and consolidate new long term episodic and semantic memories.  They also lose more recent memories that have not yet been consolidated down the cortical hierarchy.  In rats and humans, problems in the hippocampal complex can also lead to spatial navigation impairments (forgetting current location or recent path), as the HC is used to compute and retrieve spatial map information associated with current sensory impressions (a specific instance of the HC's more general function).

In terms of module connectivity, the hippocampal complex sits on top of the cortical sensory hierarchy.  It receives inputs from a number of cortical modules, largely in the nearby associative cortex, which collectively provide a summary of the recent sensory stream and overall brain state.  The HC then has several sub circuits which further compress the mental summary into something like a compact key which is then sent into a hetero-auto-associative memory circuit to find suitable matches.

If a good match is found, it can then cause retrieval: reactivation of the cortical subnetworks that originally formed the memory.  As the hippocampus can't know for sure which memories will be useful in the future, it tends to store everything with emphasis on the recent, perhaps as a sort of slow exponentially fading stream.  Each memory retrieval involves a new decoding and encoding to drive learning in the cortex through distillation/consolidation/retraining (this also helps prevent ontological crisis).  The amygdala is a little cap on the edge of the hippocampus which connects to the various emotion subsystems and helps estimate the importance of current memories for prioritization in the HC.

A very strong retrieval of an episodic memory causes the inner experience of reliving the past (or imagining the future), but more typical weaker retrievals (those which load information into the cortex without overriding much of the existing context) are a crucial component in general higher cognition.

In short the computation that the HC performs is that of dynamic association between the current mental pattern/state loaded into short term memory across the cortex and some previous mental pattern/state.  This is the very essence of creative insight.

Associative recall can be viewed as a type of pattern recognition with the attendant familiar tradeoffs between precision/recall or sensitivity/specificity.  At the extreme of low recall high precision the network is very conservative and risk averse: it only returns high confidence associations, maximizing precision at the expense of recall (few associations found, many potentially useful matches are lost).  At the other extreme is the over-confident crazy network which maximizes recall at the expense of precision (many associations are made, most of which are poor).  This can also be viewed in terms of the exploitation vs exploration tradeoff.
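The tradeoff can be made concrete with a toy retrieval threshold.  The match scores and usefulness labels below are invented purely for illustration, and treating the hippocampus as a single scalar threshold is a large simplification.

```python
# Hypothetical candidate associations: (match score, actually useful?).
CANDIDATES = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, True), (0.30, False), (0.20, False),
]


def precision_recall(threshold):
    """Return (precision, recall) when only matches scoring at or
    above `threshold` are retrieved."""
    returned = [useful for score, useful in CANDIDATES if score >= threshold]
    hits = sum(returned)
    total_useful = sum(useful for _, useful in CANDIDATES)
    # Convention used here: retrieving nothing counts as perfect precision.
    precision = hits / len(returned) if returned else 1.0
    recall = hits / total_useful
    return precision, recall


print("conservative:", precision_recall(0.85))
print("loose:", precision_recall(0.25))
```

The conservative setting returns only good associations but misses half of the useful ones; the loose setting finds all of them at the cost of sifting through poor matches.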

This general analogy or framework - although oversimplified - also provides a useful perspective for understanding both schizotypy and hallucinogenic drugs.  There is a large body of accumulated evidence in the form of use cases or trip reports, with a general consensus that hallucinogens can provide occasional flashes of creative insight at the expense of pushing one farther towards madness.

From a skeptical stance, using hallucinogenic drugs in an attempt to improve the mind is like doing surgery with butter-knives.  Nonetheless, careful exploration of the sanity border can help one understand more on how the mind works from the inside.

Cannabis in particular is believed - by many of its users - to enhance creativity via occasional flashes of insight.  Most of its main mental effects (time dilation, random associations, memory impairment, spatial navigation impairment, etc.) appear to involve the hippocampus.  We could explain much of this as a general shift in the precision/recall tradeoff to make the hippocampus less selective.  Mainly that makes the HC just work less effectively, but it also can occasionally lead to atypical creative insights, and appears to elevate some related low level measures such as schizotypy and divergent thinking[7].  The tradeoff is that one must be willing to first sift through a pile of low value random associations.

#### Cultivate memetic heterogeneity and heterozygosity

Fluid intelligence is obviously important, but in many endeavors net creativity is even more important.

Of all the components underlying creativity, improving the efficiency of learning, the quality of knowledge learned, and the organizational efficiency of one's internal cortical maps are probably the most profitable dimensions of improvement: the low hanging fruits.

Our learning process is largely automatic and subconscious: we do not need to teach children how to perceive the world.  But this just means it takes some extra work to analyze the underlying machinery and understand how to best utilize it.

Over long time scales humanity has learned a great deal on how to improve on natural innate learning: education is more or less learning-engineering.  The first obvious lesson from education is the need for curriculum: acquiring concepts in stages of escalating complexity and order-dependency (which of course is already now increasingly a thing in machine learning).

In most competitive creative domains, formal education can only train you up to the starting gate.  This of course is to be expected, for the creation of novel and useful ideas requires uncommon insights.

Memetic evolution is similar to genetic evolution in that novelty comes more from recombination than mutation.  We can draw some additional practical lessons from this analogy: cultivate memetic heterogeneity and heterozygosity.

The first part - cultivate memetic heterogeneity - should be straightforward, but it is worth examining some examples.  If you possess only the same baseline memetic population as your peers, then the chances of your mind evolving truly novel creative combinations are substantially diminished.  You have no edge - your insights are likely to be common.

To illustrate this point, let us consider a few examples:

Geoffrey Hinton is one of the most successful researchers in machine learning - which itself is a diverse field.  He first formally studied psychology, and then artificial intelligence.  His roughly 200 research publications integrate ideas from statistics, neuroscience and physics.  His work on Boltzmann machines and variants in particular imports concepts from statistical physics whole cloth.

Before founding DeepMind (now one of the premier DL research groups in the world), Demis Hassabis studied the brain and hippocampus in particular at the Gatsby Computational Neuroscience Unit, and before that he worked for years in the video game industry after studying computer science.

Before the Annus Mirabilis, Einstein worked at the patent office for four years, during which time he was exposed to a large variety of ideas relating to the transmission of electric signals and electrical-mechanical synchronization of time, core concepts which show up in his later thought experiments.[8]

Creative people also tend to have a diverse social circle of creative friends to share and exchange ideas across fields.

Genetic heterozygosity is the quality of having two different alleles at a gene locus; summed over the organism this leads to a different but related concept of diversity.

Within developing fields of knowledge we often find key questions or subdomains for which there are multiple competing hypotheses or approaches.  Good old fashioned AI vs Connectionism, Ray tracing vs Rasterization, and so on.

In these scenarios, it is almost always better to understand both viewpoints or knowledge clusters - at least to some degree.  Each cluster is likely to have some unique ideas which are useful for understanding the greater truth or at the very least for later recombination.

This then is memetic heterozygosity.  It invokes the Jain version of the blind men and the elephant.

#### Construct and maintain clean conceptual taxonomies

Formal education has developed various methods and rituals which have been found to be effective through a long process of experimentation.  Some of these techniques are still quite useful for autodidacts.

When one sets out to learn, it is best to start with a clear goal.  The goal of high school is just to provide a generalist background.  In college one then chooses a major suitable for a particular goal cluster: do you want to become a computer programmer? a physicist? a biologist? etc.  A significant amount of work then goes into structuring a learning curriculum most suitable for these goal types.

Once out of the educational system we all end up creating our own curriculums, whether intentionally or not.  It can be helpful to think strategically as if planning a curriculum to suit one's longer term goals.

For example, about four years ago I decided to learn how the brain works and how AGI could be built in particular.  When starting on this journey, I had a background mainly in computer graphics, simulation, and game related programming.  I decided to focus about equally on mainstream AI, machine learning, computational neuroscience, and the AGI literature.  I quickly discovered that my statistics background was a little weak, so I had to shore that up.  Doing it all over again I might have started with a statistics book.  Instead I started with AI: A Modern Approach (of course I mostly learn from the online research literature).

Learning works best when it is applied.  Education exploits this principle and it is just as important for autodidactic learning.  The best way to learn many math or programming concepts is learning by doing, where you create reasonable subtasks or subgoals for yourself along the way.

For general knowledge, application can take the form of writing about what you have learned.  Academics are doing this all the time as they write papers and textbooks, but the same idea applies outside of academia.

In particular a good exercise is to imagine that you need to communicate all that you have learned about the domain.  Imagine that you are writing a textbook or survey paper for example, and then you need to compress all that knowledge into a summary chapter or paper, and then all of that again down into an abstract.  Then actually do write up a summary - at least in the form of a blog post (even if you don't show it to anybody).

The same ideas apply on some level to giving oral presentations or just discussing what you have learned informally - all of which are also features of the academic learning environment.

Early on, your first attempts to distill what you have learned into written form will be ... poor.  But doing this process forces you to attempt to compress what you have learned, and thus it helps encourage the formation of well structured concept maps in the cortex.

A well structured conceptual map can be thought of as a memetic taxonomy.  The point of a taxonomy is to organize all the invariances and 'is-a' relationships between objects so that higher level inferences and transformations can generalize well across categories.

Explicitly asking questions which probe the conceptual taxonomy can help force said structure to take form.  For example in computer science/programming the question: "what is the greater generalization of this algorithm?" is a powerful tool.

In some domains, it may even be possible to semi-automate or at least guide the creative process using a structured method.

For example consider sci-fi/fantasy genre novels.  Many of the great works have a general analogical structure based on real history ported over into a more exotic setting.  The Foundation series uses the model of the fall of the Roman Empire.  Dune is like Lawrence of Arabia in space.  Stranger in a Strange Land is like the Mormon version of Jesus the space alien, but from Mars instead of Kolob.  A Song of Ice and Fire is partly a fantasy port of the Wars of the Roses.  And so on.

One could probably find some new ideas for novels just by creating and exploring a sufficiently large table of historical events and figures and comparing it to a map of the currently colonized space of ideas.  Obviously having an idea for a novel is just the tiniest tip of the iceberg in the process, but a semi-formal method is interesting nonetheless for brainstorming and applies across domains (others have proposed similar techniques for generating startup ideas, for example).

#### Conclusion

We are born equipped with sophisticated learning machinery and yet lack innate knowledge on how to use it effectively - for this too we must learn.

The greatest constraint on creative ability is the quality of conceptual maps in the cortex.  Understanding how these maps form doesn't automagically increase creativity, but it does help ground our intuitions and knowledge about learning, and could pave the way for future improved techniques.

In the meantime: cultivate memetic heterogeneity and heterozygosity, create a learning strategy, develop and test your conceptual taxonomy, continuously compress what you learn by writing and summarizing, and find ways to apply what you learn as you go.

## Stupid Questions July 2015

4 01 July 2015 07:13PM

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Don't be shy, everyone has holes in their knowledge, though the fewer and the smaller we can make them, the better.

Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

## [link] FLI's recommended project grants for AI safety research announced

17 01 July 2015 03:27PM

http://futureoflife.org/misc/2015awardees

You may recognize several familiar names there, such as Paul Christiano, Benja Fallenstein, Katja Grace, Nick Bostrom, Anna Salamon, Jacob Steinhardt, Stuart Russell... and me. (the \$20,000 for my project was the smallest grant that they gave out, but hey, I'm definitely not complaining. ^^)

3 01 July 2015 12:45PM

Has anyone tried advertising existential risk?

Bostrom's "End of Humanity" talk for instance.

It costs about 0.2 \$ per view for a video ad on YouTube, so if 0.2% of viewers give an average of 100 \$ it would break even. Hopefully people would give more than that.

You can target ads to groups likely to give much by the way, like the highly educated.

I posted this suggestion in the open thread as well, before I had the karma to make a thread. That okay?

## Rational Discussion of Controversial Topics

8 01 July 2015 11:15AM

Two months ago we began testing an experimental website for Rational Discussion of Politics.  Our main goal was to create a platform that would allow high quality discussion of controversial topics without resorting to any forms of censorship. The website is now ready and new members are welcome to join the discussions.

Many thanks to all the LessWrong members who have been taking part in this project.

P.S.  A note to new users.  A key feature of the new website is the automated recommendation system which evaluates all comments and articles based on their potential interest for each user. The recommendation system has passed the initial calibration, but its ongoing performance is sensitive to the number of user ratings per comment/article. So rating posts that you read is highly encouraged.

## Solving sleep: just a toe-dipping

33 30 June 2015 07:38PM
[For the past few months I’ve been undertaking a mostly independent study of sleep, and looking to build a coherent model of what sleep does and find ways to optimize it. I’d like to write a series of posts outlining my findings and hypotheses. I’m not sure if this is the best venue for such a project, and I’d like to gauge community interest. This first post is a brief overview of one important aspect of sleep, with a few related points of recommendation, to provide some background knowledge.]

In the quest to become more effective and productive, sleep is an enormously important process to optimize. Most of us spend (or at least think we should spend) 7.5 to 8.5 hours in bed every night, a third of a 24 hour day. Not sleeping well and not sleeping sufficiently have known and large drawbacks, including decreased attention, greater irritability, depressed immune function, and generally weakened cognitive ability. If you’re looking for more time, either for subjective life-extension, or so that you can get more done in a day, taking steps to sleep most efficiently, so as to not spend more than the required amount of time in bed and to get the full benefit of the rest, is of high value.

Understanding the inner mechanisms of this process can let us work around them. Sleep, baffling as it is (and it is extremely baffling), is not a black box. Knowing how it works, you can organize your behavior to accommodate the world as it is, just as taking advantage of the principles of aerodynamics, thrust, and lift enables one to build an airplane.

The most important thing to know about sleep and wakefulness is that it is the result of a dual process: how alert a person feels is determined by two different and opposite functions. The first is termed the homeostatic sleep drive (also, homeostatic drive, sleep load, sleep pressure, and process S), which is determined solely by how long it has been since an individual last slept fully. The longer they have been awake, the greater their sleep drive. It is the brain's biological need to sleep. Just as sufficient need for calories produces hunger, sufficient sleep drive produces sleepiness. Sleeping decreases sleep drive, and sleep drive drops faster (when sleeping) than it rises (when awake).
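To make the "drops faster than it rises" point concrete, here is a toy model of sleep drive. The linear rates, the arbitrary units, and the 16 hours awake / 8 hours asleep split are all simplifying assumptions for illustration, not measured physiology.

```python
def simulate_sleep_drive(schedule, rise_per_hour, fall_per_hour, start=0.0):
    """Track sleep drive through a list of (state, hours) blocks,
    where state is 'awake' or 'asleep'. Drive never goes below zero."""
    drive = start
    for state, hours in schedule:
        if state == "awake":
            drive += rise_per_hour * hours
        else:
            drive = max(0.0, drive - fall_per_hour * hours)
    return drive


# Assumed rates: drive falls twice as fast asleep as it rises awake.
RISE, FALL = 1.0, 2.0

normal_day = [("awake", 16), ("asleep", 8)]
all_nighter = [("awake", 40), ("asleep", 8)]

print(simulate_sleep_drive(normal_day, RISE, FALL))
print(simulate_sleep_drive(all_nighter, RISE, FALL))
```

Under these assumptions, 8 hours of sleep exactly clears 16 hours of accumulated drive, which is only possible if the fall rate is twice the rise rate; after an all-nighter, a single 8 hour sleep leaves a residual debt.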

Neuroscience is complicated, but it seems the chemical correlate of sleep drive is the build-up of adenosine in the basal forebrain, which the brain uses as its internal measure of how badly one needs sleep.1 (Caffeine makes us feel alert by competing with adenosine for binding sites on its receptors, thereby blocking its signal.)

This is only half the story, however. Adenosine levels are much higher (and sleep drive correspondingly higher) in the evening, when one has been awake for a while, than in the middle of the night, when one has just slept for several hours. If sleepiness were determined only by sleep drive, you would have much more fragmented sleep: sleeping several times during the day, and waking up several times during the night. Instead, humans typically stay awake through the day, and sleep through the whole night. This is due to the second influence on wakefulness: the circadian alerting signal.

For most of human history, there was little that could be done at night. Darkness made it much more difficult to hunt or gather than it was during the day. Given that the brain requires spending some fraction of the nychthemeron (meaning a 24-hour period) asleep, it is evolutionarily preferable to concentrate that fraction of the nychthemeron in the nighttime, freeing the day to do other things. For this reason, there is also a cyclical component to one's alertness: independent of how long it has been since an individual has slept, there will be times in the nychthemeron when they will feel more or less tired.

Roughly, the circadian alerting signal (also known as process C) counters the sleep drive, so that as sleep drive builds up during the day, alertness stays constant, and as sleep drive decreases over the course of the night, the individual stays asleep.

The alerting signal is synchronized to circadian rhythms, which are in turn attuned to light exposure. The circadian clock is set so that the alerting signal begins to increase again (after a night of sleep) at the time when the optic nerve is first exposed to light in the morning (or rather, when the optic nerve has habitually been first exposed to light, since it takes up to a week to reset circadian rhythms), and increases with the sleep drive until about 14 hours later (from the point that the alerting signal started rising).

This is why if you pull an “all-nighter” you might find it difficult to fall asleep during the following day, even if you feel exhausted. Your sleep drive is high, but the alerting signal is triggering wakefulness, which makes it hard to fall asleep.

For unknown reasons, there is a dip in the circadian alerting signal about 8 hours after the beginning of the cycle. This is why people sometimes experience that “2:30 feeling.” This is also the time at which biphasic cultures typically have an afternoon siesta. This is useful to know, because it is the best time to take a nap if you want to make up sleep missed the night before.

The neurochemistry of the circadian alerting signal is more complex than that of the sleep drive, but one of the key chemicals of process C is melatonin, which is secreted by the pineal gland about 12 hours after the start of the circadian cycle (two hours before habitual bedtime). It is mildly sleep-inducing.

This is why taking melatonin tablets before bed is recommended by gwern and others. I second this recommendation. Though not FDA-approved, they seem to have little in the way of negative side effects, and they make it much easier to fall asleep.

The natural release of melatonin is inhibited by light, and in particular blue light (which is why it is beneficial to use applications that red-shift the light of your computer screen, like f.lux or Redshift, or to wear red-tinted goggles before bed). By limiting light exposure in the late evening you allow natural melatonin secretion, which both stimulates sleep and prevents the circadian clock from shifting (which would make it even more difficult to fall asleep the following night). Recent studies have shown that bright screens at night do demonstrably disrupt sleep.2

## Effective Altruism vs Missionaries? Advice Requested from a Newly-Built Crowdfunding Platform.

3 30 June 2015 05:39PM

Hi, I'm developing a next-generation crowdfunding platform for non-profit fundraising. From what we have seen, it is an effective tool; more about it below. I'm working with two other cofounders, both of whom are evangelical Christians. We get along well in general, except that I strongly believe in effective altruism and they do not.

We will launch a second pilot fundraising campaign in 2-3 weeks. The organization my co-founders have arranged for us to fundraise for is a "church planting" missionary organization. This is so opposed to my belief in effective altruism that I feel uncomfortable using our effective tool to funnel donors' dollars in THIS of all directions. This is not the reason I got involved in this project.

My argument with them is that we should charge more to ineffective nonprofits such as colleges and religious or political organizations, and use that extra revenue to subsidize the campaign and money-processing costs of the effective non-profits. I think this is logically consistent with earning to give. But I am being outvoted two-to-one by people who believe saving lives and saving souls are nearly equally important.

So I have two requests:

1. If anyone has advice on how to navigate this (including any especially well-written articles that would appeal to evangelical Christians, or experience negotiating with start-up cofounders), I would like to hear it.

2. If anyone has personal connections with effective or effective-ish non-profits, I would much prefer to fundraise for them than for my co-founders' church connections. Caveat: the org must have US non-profit legal status.

About the platform: the gist of our concept is that we're using a lot of psychology, biases, and altruism research to nudge more people towards giving and also nudge them towards a sustained involvement with the nonprofit in question. We're using some of the tricks that made the ice bucket challenge so successful (but with added accountability to ensure that visible involvement actually leads to monetary donations). Users can pledge money contingent on their friends' involvement, which motivates people in the same way that matching donations motivate people. Giving is very visible, and people are more likely to give if they see friends giving. Friends are making the request for funding, which creates a sense of personal connection. Each person's mini-campaign has an involvement goal and a time limit (3 friends in 3 days) to create a sense of urgency. The money your friends donate visibly increases your impact, so it also feels like getting money from nothing - a \$20 pledge can become hundreds of dollars. We nudge people towards smaller, automated, monthly recurring gifts. We try to minimize the number of barriers to making a donation (fewer steps, fewer fields).

## Selecting vs. grooming

5 30 June 2015 10:48AM

Content warning: meta-political, with hopefully low mind-killer factor.

Epistemic status: proposal for brain-storming.

- Representative democracies select political leaders. Monarchies and aristocracies groom political leaders for the job from childhood. (Also, to a certain extent they breed them for the job.)

- Capitalistic competition selects economic elites. Heritable landowning aristocracies groom economic elites from childhood. (Again, they also breed them.)

- A capitalist employer selects an accountant from a pool of 100 applicants. A feudal lord would groom a serf boy who has a knack for horses into the job of the adult stable man.

It seems a lot like selecting is better than grooming. After all, it is the modern way, and hardly anyone would argue that capitalism doesn't have a higher economic output than feudalism, and so on.

But since this was such a hugely important difference throughout history, perhaps one of the things that really defined the modern world, because it determines the whole social structure of societies past and present, I think it deserves some investigation. There may be something more interesting lurking here than just saying selection/testing won over grooming, period.

1) Can aspects of grooming as opposed to selecting/testing be steelmanned, are there corner cases when it could be better?

2) A pre-modern, medievalish society that nevertheless used a lot of selection/testing was China - I am thinking about the famous mandarin exams. Does this seem to have had any positive effect on China compared to other similar societies? I.e. is it even likely that it is a big factor in the general outcomes of 2015 West vs. 1515 West? Comparing old China with similar medievalish but not selectionist (but inheritance based) societies would be useful for isolating this factor, right?

3) Why exactly does selecting and testing work better than grooming (and breeding) ?

4) Is it possible it works better because people do the breeding (intelligent people tend to marry intelligent people etc.) and grooming (a child of doctors will have an entirely different upbringing than a child of manual laborers) on their own, thus the social system does not have to do it, it is enough / better for the social system to do the selection, to do the testing of the success of the at-home grooming?

5) Any other interesting insight or reference?

Note: this is NOT about meritocracy vs. aristocracy. It is about two different kinds of meritocracy - where you either select, test people for merit (through market competition or elections) but you don't care much how to _build_ people who  will have merit vs. an aristocratic meritocracy where you largely focus on breeding and grooming people into the kinds who will have merit, and don't focus on selecting and testing so much.

Note 2: is it even possible that this is a false dichotomy? One could argue that Western society is chock full of features for breeding and grooming people: there are dating sites for specific groups of people, there are tons of helping resources parents can draw on, kids spend 15-20 years at school and so on, so the breeding and grooming is done all right, and I am just being misled here by mere names. Such as the name democracy: it is a selection process, but who wins depends on breeding and grooming. Such as market competition: those best bred and groomed have the highest chance. Is it simply that selection is more noticeable than grooming, that it gets more limelight, but we actually do both? If yes, why does selection get more limelight than grooming? Why do we talk about elections more than about how to groom a child into being a politician, or why do we talk about market competition more than how to groom a child into the entrepreneur who aces competition? If modern society uses both, why is selection in the public spotlight while grooming is just something happening at home and school and not so noticeable? (To be fair, on LW, we talk more about how to test hypotheses than how to formulate them. Is this potentially related? Are people just more interested in testing than building, be that hypotheses or people?)

## Top 9+1 myths about AI risk

37 29 June 2015 08:41PM

Following some somewhat misleading articles quoting me, I thought I'd present the top 10 myths about the AI risk thesis:

1. That we’re certain AI will doom us. Certainly not. It’s very hard to be certain of anything involving a technology that doesn’t exist; we’re just claiming that the probability of AI going bad isn’t low enough that we can ignore it.
2. That humanity will survive, because we’ve always survived before. Many groups of humans haven’t survived contact with more powerful intelligent agents. In the past, those agents were other humans; but they need not be. The universe does not owe us a destiny. In the future, something will survive; it need not be us.
3. That uncertainty means that you’re safe. If you’re claiming that AI is impossible, or that it will take countless decades, or that it’ll be safe... you’re not being uncertain, you’re being extremely specific about the future. “No AI risk” is certain; “Possible AI risk” is where we stand.
4. That Terminator robots will be involved. Please? The threat from AI comes from its potential intelligence, not from its ability to clank around slowly with an Austrian accent.
5. That we’re assuming the AI is too dumb to know what we’re asking it. No. A powerful AI will know what we meant to program it to do. But why should it care? And if we could figure out how to program “care about what we meant to ask”, well, then we’d have safe AI.
6. That there’s one simple trick that can solve the whole problem. Many people have proposed that one trick. Some of them could even help (see Holden’s tool AI idea). None of them reduce the risk enough to relax – and many of the tricks contradict each other (you can’t design an AI that’s both a tool and socialising with humans!).
7. That we want to stop AI research. We don’t. Current AI research is very far from the risky areas and abilities. And it’s risk aware AI researchers that are most likely to figure out how to make safe AI.
8. That AIs will be more intelligent than us, hence more moral. It’s pretty clear that in humans, high intelligence is no guarantee of morality. Are you really willing to bet the whole future of humanity on the idea that AIs might be different? That in the billions of possible minds out there, there is none that is both dangerous and very intelligent?
9. That science fiction or spiritual ideas are useful ways of understanding AI risk. Science fiction and spirituality are full of human concepts, created by humans, for humans, to communicate human ideas. They need not apply to AI at all, as these could be minds far removed from human concepts, possibly without a body, possibly with no emotions or consciousness, possibly with many new emotions and a different type of consciousness, etc... Anthropomorphising the AIs could lead us completely astray.
10. That AIs have to be evil to be dangerous. The majority of the risk comes from indifferent or partially nice AIs: those that have some goal to follow, with humanity and its desires just getting in the way – using resources, trying to oppose it, or just not being perfectly efficient for its goal.

## Parenting Technique: Increase Your Child’s Working Memory

13 29 June 2015 07:51PM

I continually train my ten-year-old son’s working memory, and urge parents of other young children to do likewise.  While I have succeeded in at least temporarily improving his working memory, I accept that this change might not be permanent and could end a few months after he stops training.  But I also believe that while his working memory is boosted so too is his learning capacity.

I have a horrible working memory that greatly hindered my academic achievement.  I was so bad at spelling that they stopped counting it against me in school.  In technical classes I had trouble remembering what variables stood for.  My son, in contrast, has a fantastic memory.  He twice won his school’s spelling bee, and just recently I wrote twenty symbols (letters, numbers, and shapes) in rows of five.  After a few minutes he memorized the symbols and then (without looking) repeated them forward, backwards, forwards, and then by columns.

My son and I have been learning different programming languages through Codecademy.  While I struggle to remember the required syntax of different languages, he quickly gets this and can focus on higher level understanding.  When we do math learning together his strong working memory also lets him concentrate on higher order issues rather than on remembering the details of the problem and the relevant formulas.

You can easily train a child’s working memory.  It requires just a few minutes of time a day, can be very low tech or done on a computer, can be optimized for your child to get him in flow, and easily lends itself to a reward system.  Here is some of the training we have done:

• I write down a sequence and have him repeat it.
• I say a sequence and have him repeat it.
• He repeats the sequence backwards.
• He repeats the sequence with slight changes such as adding one to each number and “subtracting” one from each letter.
• He repeats while doing some task like touching his head every time he says an even number and touching his knee every time he says an odd one.
• Before repeating a memorized sequence he must play repeat after me where I say a random string.
• I draw a picture and have him redraw it.
• He plays N-back games.
• He does mental math requiring keeping track of numbers (e.g. 42 times 37).
• I assign numerical values to letters and ask him math operation questions (e.g. A*B+C).

The key is to keep changing how you train your kid so you have more hope of improving general working memory rather than the very specific task you are doing.  So, for example, if you say a sequence and have your kid repeat it back to you, vary the speed at which you talk on different days and don’t just use one class of symbols in your exercises.
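For parents who prefer the computer-based route, the sequence drills listed above could be generated by a short script. This is only a minimal sketch, not a tested training tool: the function names (`make_sequence`, `shift_sequence`), the symbol pools, the default length, and the choice to wrap around at the ends (9 becomes 0, A becomes Z) are all assumptions introduced for illustration.

```python
import random
import string

DIGITS = "0123456789"
LETTERS = string.ascii_uppercase


def make_sequence(length=6, rng=None):
    """Return a random list of symbols mixing digits and letters."""
    rng = rng or random.Random()
    pool = DIGITS + LETTERS
    return [rng.choice(pool) for _ in range(length)]


def shift_sequence(seq):
    """Add one to each digit and 'subtract' one from each letter.

    Wrapping (9 -> 0, A -> Z) is an assumption made for this sketch.
    """
    shifted = []
    for symbol in seq:
        if symbol in DIGITS:
            shifted.append(str((int(symbol) + 1) % 10))
        else:
            index = (LETTERS.index(symbol) - 1) % len(LETTERS)
            shifted.append(LETTERS[index])
    return shifted


if __name__ == "__main__":
    seq = make_sequence()
    print("Read aloud:    ", " ".join(seq))
    print("Backwards:     ", " ".join(reversed(seq)))
    print("Shifted answer:", " ".join(shift_sequence(seq)))
```

Changing the pools or the length from day to day would be one way to follow the advice about varying the exercise.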

## Two-boxing, smoking and chewing gum in Medical Newcomb problems

13 29 June 2015 10:35AM

I am currently learning about the basics of decision theory, most of which is common knowledge on LW. I have a question, related to why EDT is said not to work.

Consider the following Newcomblike problem: A study shows that most people who two-box in Newcomblike problems such as the following have a certain gene (and one-boxers don't have the gene). Now, Omega could put you into something like Newcomb's original problem, but instead of having run a simulation of you, Omega has only looked at your DNA: If you don't have the "two-boxing gene", Omega puts \$1M into box B, otherwise box B is empty. And there is \$1K in box A, as usual. Would you one-box (take only box B) or two-box (take box A and B)? Here's a causal diagram for the problem:

Since Omega does not do much other than translating your genes into money under a box, it does not seem to hurt to leave it out:

I presume that most LWers would one-box. (And as I understand it, not only CDT but also TDT would two-box, am I wrong?)

Now, how does this problem differ from the smoking lesion or Yudkowsky's (2010, p.67) chewing gum problem? Chewing Gum (or smoking) seems to be like taking box A to get at least/additional \$1K, the two-boxing gene is like the CGTA gene, the illness itself (the abscess or lung cancer) is like not having \$1M in box B. Here's another causal diagram, this time for the chewing gum problem:

As far as I can tell, the difference between the two problems is some additional, unstated intuition in the classic medical Newcomb problems. Maybe, the additional assumption is that the actual evidence lies in the "tickle", or that knowing and thinking about the study results causes some complications. In EDT terms: The intuition is that neither smoking nor chewing gum gives the agent additional information.
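The contrast between evidential and causal reasoning in the genetic Newcomb problem can be made concrete with a small expected-value calculation. The sketch below rests on assumptions not given in the post: the study only says "most", so `accuracy=0.9` is a made-up correlation between the action and the gene, and `p_no_gene=0.5` is a made-up prior for the causal calculation. The function names are hypothetical as well.

```python
PRIZE_B = 1_000_000  # box B contents when the agent lacks the two-boxing gene
PRIZE_A = 1_000      # box A always holds this amount


def edt_value(action, accuracy=0.9):
    """Expected payoff when the action is treated as evidence about the gene.

    `accuracy` is a hypothetical P(no gene | one-box) = P(gene | two-box).
    """
    if action == "one-box":
        return accuracy * PRIZE_B
    return (1 - accuracy) * PRIZE_B + PRIZE_A


def cdt_value(action, p_no_gene=0.5):
    """Expected payoff when the gene is held fixed regardless of the action.

    `p_no_gene` is a hypothetical prior probability that box B is full.
    """
    base = p_no_gene * PRIZE_B
    return base + (PRIZE_A if action == "two-box" else 0)


if __name__ == "__main__":
    for action in ("one-box", "two-box"):
        print(action, edt_value(action), cdt_value(action))
```

Under these assumed numbers the evidential calculation favours one-boxing, while the causal one favours two-boxing by the contents of box A whatever prior is chosen. Since the post maps chewing gum onto taking box A, the same two functions would apply to the chewing gum problem with the outcomes relabeled.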

## Open Thread, Jun. 29 - Jul. 5, 2015

5 29 June 2015 12:14AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

## Is this evidence for the Simulation hypothesis?

1 28 June 2015 11:45PM

I haven't come across this particular argument before, so I hope I'm not just rehashing a well-known problem.

"The universe displays some very strong signs that it is a simulation.

As has been mentioned in some other answers, one way to efficiently achieve a high fidelity simulation is to design it in such a way that you only need to compute as much detail as is needed. If someone takes a cursory glance at something you should only compute its rough details and only when someone looks at it closely, with a microscope say, do you need to fill in the details.

This puts a big constraint on the kind of physics you can have in a simulation. You need this property: suppose some physical system starts in state x. The system evolves over time to a new state y which is now observed to accuracy ε. As the simulation only needs to display the system to accuracy ε the implementor doesn't want to have to compute x to arbitrary precision. They'd like to only have to compute x to some limited degree of accuracy. In other words, demanding y to some limited degree of accuracy should only require computing x to a limited degree of accuracy.

Let's spell this out. Write y as a function of x, y = f(x). We want that for all ε > 0 there is a δ > 0 such that for all x' with x-δ<x'<x+δ, |f(x')-f(x)|<ε. This is just a restatement in mathematical notation of what I said in English. But do you recognise it?

It's the standard textbook definition of a continuous function. We humans invented the notion of continuity because it was a ubiquitous property of functions in the physical world. But it's precisely the property you need to implement a simulation with demand-driven level of detail. All of our fundamental physics is based on equations that evolve continuously over time and so are optimised for demand-driven implementation.

One way of looking at this is that if y=f(x), then if you want to compute n digits of y you only need a finite number of digits of x. This has another amazing advantage: if you only ever display things to a given accuracy you only ever need to compute your real numbers to a finite accuracy. Nature could have chosen to use any number of arbitrarily complicated functions on the reals. But in fact we only find functions with the special property that they need only be computed to finite precision. This is precisely what a smart programmer would have implemented.

(This also helps motivate the use of real numbers. The basic operations on real numbers such as addition and multiplication are continuous and require only finite precision in their arguments to compute their values to finite precision. So real numbers give a really neat way to allow inhabitants to find ever more detail within a simulation without putting an undue burden on its implementation.)

But you can go one step further. As Gregory Benford says in Timescape: "nature seemed to like equations stated in covariant differential forms". Our fundamental physical quantities aren't just continuous, they're differentiable. Differentiability means that if y=f(x) then once you zoom in closely enough, y depends linearly on x. This means that one more digit of y requires precisely one more digit of x. In other words our hypothetical programmer has arranged things so that after some initial finite length segment they can know in advance exactly how much data they are going to need.

After all that, I don't see how we can know we're not in a simulation. Nature seems cleverly designed to make a demand-driven simulation of it as efficient as possible."
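The quoted claim that a continuous f lets you obtain f(x) to accuracy ε from an input known only to some finite accuracy δ can be probed numerically. The following is a rough sketch, not part of the quoted argument: `find_delta` is a hypothetical helper that halves δ until a finite set of sample points passes the ε test, so it illustrates the definition of continuity rather than proving anything.

```python
import math


def find_delta(f, x, eps, start=1.0, samples=200, max_halvings=60):
    """Heuristically search for a delta with |f(x') - f(x)| < eps on (x - delta, x + delta).

    Only finitely many sample points are checked, so a returned delta is
    evidence for the property, not a proof of it.
    """
    fx = f(x)
    delta = start
    for _ in range(max_halvings):
        # Sample points strictly inside the open interval (x - delta, x + delta).
        points = (
            x + delta * (2 * k / (samples + 1) - 1)
            for k in range(1, samples + 1)
        )
        if all(abs(f(p) - fx) < eps for p in points):
            return delta
        delta /= 2
    return None  # no suitable delta found within the search budget


if __name__ == "__main__":
    # For a smooth function, each extra digit of output accuracy should cost
    # roughly a fixed number of extra halvings of delta.
    for digits in range(1, 6):
        eps = 10.0 ** -digits
        print(digits, find_delta(math.sin, 1.0, eps))
```

For a function with a jump at x, such as a step function evaluated at the step, the search exhausts its budget and returns `None`, which is the demand-driven implementor's worst case: no finite input accuracy is enough.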

## Goal setting journal (GSJ) - 28/06/15 -> 05/07/15

4 28 June 2015 06:24AM

Inspired by the group rationality diary and open thread, this is the inaugural weekly goal setting journal (GSJ) thread.

If you have goals worth setting that are not worth their own post (even in Discussion), then it goes here.

Notes for future GSJ posters:

2. Check if there is an active GSJ thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. GSJ Threads should be posted in Discussion, and not Main.

4. GSJ Threads should run for no longer than 1 week, but you may set goals, subgoals and tasks for as distant into the future as you please.

5. No one is in charge of posting these threads. If it's time for a new thread, and you want a new thread, just create it.

## Praising the Constitution

-6 27 June 2015 04:55PM

I am sure the majority of the discussion surrounding the United States' recent Supreme Court ruling will be on the topic of same-sex marriage and marriage equality. And while there is a lot of good discussion to be had, I thought I would take the opportunity to bring up another topic that often seems to be glossed over, yet is very important to the discussion. That is the idea in the USA of praising the United States Constitution and holding it to an often unquestioning level of devotion.

Before I really get going I would like to take a quick moment to say I do support the US Constitution and think it is important to have a very strong document that provides rights for the people and guidelines for government. The entire structure of the government is defined by the Constitution, and some form of constitution or charter is necessary for the establishment of any type of governing body. Also, in the arguments I use as examples I am not in any way saying which side I am on. I am simply using them as examples, and no attempt should be made to infer my political stances from how I treat the arguments themselves.

But now the other way. I often hear in political discussions people, particularly Libertarians, trying to tie their position back to being based on the Constitution. The buck stops there. The Constitution says it, therefore it must be right. End of discussion. To me this often sounds eerily similar to arguing the semantics of a religious text to support your position.

A great example is in the debate over gun control laws. Without espousing one side or the other, I can fairly safely and definitively say the US Constitution does support citizens' rights to own guns. For many a Libertarian, the discussion ends there. This is not something only Libertarians are guilty of. The other side of the debate often resorts to arguing context and semantics in an attempt to make the Constitution support their side. This clearly is just a case of people trying to win the argument rather than discuss and discover the best solution.

Similarly, on the topic of marriage equality, a lot of the discussion has been focused on whether or not the US Supreme Court ruling was, in fact, constitutional. Extending that further, the topic goes on to "does the Constitution give the federal government the right to demand that the fifty states all allow same-sex marriage?" To me, this is not the true question that needs answering. Or at least, the answer to that question does not determine a certain action or inaction on the part of the federal government. (E.g., if it was decided that it was unconstitutional, that STILL DOESN'T NECESSARILY mean that the federal government shouldn't do it. I know, shocking.)

The Constitution was written by a bunch of men over two hundred years ago. Fallible, albeit brilliant, men. It isn't perfect. (It's damn good, else the country wouldn't have survived this long.) But it is still just a heuristic for finding the best course of action in what resembles a reasonable amount of time (insert your favorite 'inefficiency of bureaucracy' joke here). But heuristics can be wrong. So perhaps we should more often consider the question of whether or not what the Constitution says is actually the right thing. Certainly, departures from the heuristic of the Constitution should be taken with extreme caution and consideration. But we cannot discard the idea and simply argue based on the Constitution.

At the heart of the marriage equality and Supreme Court ruling debate are the ideas of freedom, equality, and states' rights. All three of those are heuristics I use that usually point to what I think is best. I usually support states' rights, and consider departure from that as negative expected utility. However, there are many times when that consideration is completely blown away by other considerations.

The best example I can think of off the top of my head is slavery. Before the Emancipation Proclamation some states ruled slavery illegal, some legal. The question that tore our nation apart was whether or not the federal government had the right to impose abolition of slavery on all the states. I usually side with states' rights. But slavery is such an abominable practice that in that case I would have considered the constitutional rights of the federal government a non-issue when weighed against the continuation of slavery in the US for a single day more. If the Constitution had specifically supported the legality of slavery, then that would have shown it was time to burn it and try again.

Any federal proclamation infringes on states' rights, something I usually side with. And as more and more states were legalizing same-sex marriage it seemed that the states were deciding by themselves to promote marriage equality. The Supreme Court decision certainly speeds things up, but is it worth the infringement of states' rights? To me that is the important question. Not whether or not it is constitutional, but whether or not it is right. I am not answering that question here, just attempting to point out that the discussion of constitutionality may be the wrong question. And certainly an argument could be made for why states' rights should not be used as a heuristic at all.

## 4 days left in Giving What We Can's 2015 fundraiser - £34k to go

5 27 June 2015 02:16AM

We at Giving What We Can have been running a fundraiser to raise £150,000 by the end of June, so that we can make our budget through the end of 2015. We are really keen to keep the team focussed on their job of growing the movement behind effective giving, and ensure they aren't distracted worrying about fundraising and paying the bills.

With 4 days to go, we are now short just £34,000!

We also still have £6,000 worth of matching funds available for those who haven't given more than £1,000 to GWWC before and donate £1,000-£5,000 before next Tuesday! (For those who are asking, 2 of the matchers I think wouldn't have given otherwise and 2 I would guess would have.)

If you've been one of those holding out to see if we would easily reach the goal, now's the time to pitch in to ensure Giving What We Can can continue to achieve its vision of making effective giving the societal default and move millions more to GiveWell-recommended and other high impact organisations.

So please give now or email me for our bank details: robert [dot] wiblin [at] centreforeffectivealtruism [dot] org.

If you want to learn more, please see this more complete explanation for why we might be the highest impact place you can donate. This fundraiser has also been discussed on LessWrong before, as well as the Effective Altruist forum.

Thanks so much!

## New LW Meetups: Maine, San Antonio

2 26 June 2015 02:59PM

This summary was posted to LW Main on June 19th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.

## [link] Essay on AI Safety

11 26 June 2015 07:42AM

I recently wrote an essay about AI risk, targeted at other academics:

Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems

I think it might be interesting to some of you, so I am sharing it here. I would appreciate any feedback any of you have, especially from others who do AI / machine learning research.

## Can You Give Support or Feedback for My Program to Alleviate Poverty?

9 25 June 2015 11:18PM

Hi LessWrong,

Two years ago, when I travelled to Belize, I came up with an idea for a self-sufficient scalable program to address poverty. I saw how many people in Belize were unemployed or getting paid very low wages, but I also saw how skilled they were, a result of English being the national language and a mandatory education system. Many Belizeans have a secondary/high school education, and the vast majority have at least a primary school education and can speak English. I thought to myself, "it's too bad I can't teleport Belizeans to the United States, because in the U.S., they would automatically be able to earn many times the minimum wage in Belize with their existing skills."

But I knew there was a way to do it: "virtual teleportation." My solution involves using computer and internet access in conjunction with training and support to connect the poor with high paying international work opportunities. My tests of virtual employment using Upwork and Amazon Mechanical Turk show that it is possible to earn at least twice the minimum wage in Belize, around \$3 an hour, working with flexible hours. This solution is scalable because there is a consistent international demand for very low wage work (relatively speaking) from competent English speakers, and in other countries around the world like South Africa, many people matching that description can be found and lifted out of poverty. The solution could become self-sufficient because running a virtual employment enterprise or taking a cut of the earnings of members using virtual employment services (as bad as that sounds) can generate enough income to pay for the relatively low costs of monthly internet and the one-time costs of technology upgrades.

If you have any feedback, comments, suggestions, I would love to hear about it in the comments section. Feedback on my fundraising campaign at igg.me/at/bvep is also greatly appreciated.

If you are thinking about supporting the idea, my team and I need your help to make this possible. It may be difficult for us to reach our goal, but every contribution greatly increases the chances our fundraiser and our program will be successful, especially in the early stages. All donations are tax-deductible, and if you’d like, you can also opt-in for perks like flash drives and t-shirts. It only takes a moment to make a great difference: igg.me/at/bvep.

## GiveWell event for SF Bay Area EAs

3 25 June 2015 08:27PM

Passing this announcement along from GiveWell:

GiveWell is holding an event at our offices in San Francisco for Bay Area residents who are interested in Effective Altruism. The evening will be similar to the research events we hold periodically for GiveWell donors: it will include presentations and discussion about GiveWell’s top charity work and the Open Philanthropy Project, as well as a light dinner and time for mingling. We’re tentatively planning to hold the event in the evening of Tuesday July 7th or Wednesday July 8th.

We hope to be able to accommodate everyone who is interested, but may have to limit places depending on demand. If you would be interested in attending, please fill out this form.
We hope to see you there!

15 25 June 2015 12:06PM

Summary: Utilitarianism is often ill-defined by supporters and critics alike, preference utilitarianism even more so. I briefly examine some of the axes of utilitarianism common to all popular forms, then look at some axes unique but essential to preference utilitarianism, which seem to have received little to no discussion – at least not this side of a paywall. This way I hope to clarify future discussions between hedonistic and preference utilitarians and perhaps to clarify things for their critics too, though I’m aiming the discussion primarily at utilitarians and utilitarian-sympathisers.

http://valence-utilitarianism.com/?p=8

I like this essay particularly for the way it breaks down different forms of utilitarianism into various axes, which have rarely been discussed on LW.

For utilitarianism in general:

Many of these axes are well discussed, pertinent to almost any form of utilitarianism, and at least reasonably well understood, and I don’t propose to discuss them here beyond highlighting their salience. These include but probably aren’t restricted to the following:

• What is utility? (for the sake of easy reference, I’ll give each axis a simple title – for this, the utility axis); eg happiness, fulfilled preferences, beauty, information (PDF)
• How drastically are we trying to adjust it?, aka what if any is the criterion for ‘right’ness? (sufficiency axis); eg satisficing, maximising[2], scalar
• How do we balance tradeoffs between positive and negative utility? (weighting axis); eg, negative, negative-leaning, positive (as in fully discounting negative utility – I don’t think anyone actually holds this), ‘middling’ ie ‘normal’ (often called positive, but it would benefit from a distinct adjective)
• What’s our primary mentality toward it? (mentality axis); eg act, rule, two-level, global
• How do we deal with changing populations? (population axis); eg average, total
• To what extent do we discount future utility? (discounting axis); eg zero discount, >0 discount
• How do we pinpoint the net zero utility point? (balancing axis); eg Tännsjö’s test, experience tradeoffs
• What is a utilon? (utilon axis) [3] – I don’t know of any examples of serious discussion on this (other than generic dismissals of the question), but it’s ultimately a question utilitarians will need to answer if they wish to formalise their system.

For preference utilitarianism in particular:

Here then, are the six most salient dependent axes of preference utilitarianism, ie those that describe what could count as utility for PUs. I’ll refer to the poles on each axis as (axis)0 and (axis)1, where any intermediate view will be (axis)X. We can then formally refer to subtypes, and also exclude them, eg ~(F0)R1PU, or ~(F0 v R1)PU etc, or represent a range, eg C0..XPU.

How do we process misinformed preferences? (information axis F)

(F0 no adjustment / F1 adjust to what it would have been had the person been fully informed / FX somewhere in between)

How do we process irrational preferences? (rationality axis R)

(R0 no adjustment / R1 adjust to what it would have been had the person been fully rational / RX somewhere in between)

How do we process malformed preferences? (malformation axes M)

(M0 Ignore them / MF1 adjust to fully informed / MFR1 adjust to fully informed and rational (shorthand for MF1R1) / MFxRx adjust to somewhere in between)

How long is a preference relevant? (duration axis D)

(D0 During its expression only / DF1 During and future / DPF1 During, future and past (shorthand for DP1F1) / DPxFx Somewhere in between)

What constitutes a preference? (constitution axis C)

(C0 Phenomenal experience only / C1 Behaviour only / CX A combination of the two)

What resolves a preference? (resolution axis S)

(S0 Phenomenal experience only / S1 External circumstances only / SX A combination of the two)
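
The axis notation above can be read as a small predicate language over positions. The following is only a rough sketch of that reading, not anything taken from the essay: the dictionary encoding and the helper names `label` and `matches` are hypothetical, and ranges such as C0..XPU are left out.

```python
# Hypothetical encoding (not from the essay): a view maps an axis letter
# to "0" or "1" (the two poles) or "X" (any intermediate position).

def label(view):
    """Render a view such as {"F": "0", "R": "1"} as the string "F0R1PU"."""
    return "".join(f"{axis}{pole}" for axis, pole in view.items()) + "PU"

def matches(view, required=(), excluded=()):
    """True if the view holds every required (axis, pole) pair
    and none of the excluded (axis, pole) pairs."""
    holds = lambda pair: view.get(pair[0]) == pair[1]
    return all(holds(p) for p in required) and not any(holds(p) for p in excluded)

view = {"F": "X", "R": "1", "C": "0", "S": "X"}
print(label(view))  # FXR1C0SXPU

# ~(F0)R1PU: anything that is R1 but not F0
print(matches(view, required=[("R", "1")], excluded=[("F", "0")]))  # True

# ~(F0 v R1)PU: anything that is neither F0 nor R1
print(matches(view, excluded=[("F", "0"), ("R", "1")]))  # False
```

Treating an exclusion as "holds none of these pairs" is one plausible interpretation of the ~( ... v ... ) shorthand; the essay itself may intend something more refined.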

What distinguishes these categorisations is that each category, as far as I can perceive, has no analogous axis within hedonistic utilitarianism. In other words, to a hedonistic utilitarian, such axes would either be meaningless or have only one logical answer. But any well-defined and consistent form of preference utilitarianism must sit at some point on every one of these axes.

See the article for more detailed discussion about each of the axes of preference utilitarianism, and more.

## Cryonics: peace of mind vs. immortality

3 24 June 2015 07:10AM

I wrote a blog post arguing that people sign up for cryo more for peace of mind than for immortality. This suggests that cryo organizations should market towards the former desire rather than the latter (you can think of it as marketing to near mode rather than far mode, in Hansonian terms).

Perhaps we've been selling cryonics wrong. I'm signed up and feel like the reason I should have for signing up is that cryonics buys me a small but non-zero chance at living forever. However, for years this "should" didn't actually result in me signing up. Recently, though, after being made aware of this dissonance between my words and actions, I finally signed up. I'm now very glad that I did. But it's not because I now have a shot at everlasting life.

For those signed up already, does peace-of-mind resonate as a benefit of your membership?

If you are not a cryonics member, what would make you decide that it is a good idea?

## My recent thoughts on consciousness

-1 24 June 2015 12:37AM

I have lately come to seriously consider the view that the everyday notion of consciousness doesn’t refer to anything that exists out there in the world but is rather a confused (but useful) projection made by purely physical minds onto their depiction of themselves in the world. The main influences on my thinking are Dan Dennett (I assume most of you are familiar with him) and, to a lesser extent, Yudkowsky (1) and Tomasik (2). To use Dennett’s line of thought: we say that honey is sweet, that metal is solid or that a falling tree makes a sound, but the character tag of sweetness and sound is not in the world but in the brain’s internal model of it. Sweetness is not an inherent property of the glucose molecule; instead, we are wired by evolution to perceive it as sweet, to reward us for calorie intake in our ancestral environment. Nor is there any need for non-physical sweetness-juice in the brain – no, it's coded (3). We can talk about sweetness and sound as if they were out there in the world, but in reality they are a useful fiction of sorts that we are "projecting" out into the world. The default model of our surroundings and ourselves we use in our daily lives (the manifest image, or ‘umwelt’) is puzzling to reconcile with the scientific perspective of gluons and quarks. We can use this insight to look critically at how we perceive a very familiar part of the world: ourselves. It might be that we are projecting useful fictions onto our model of ourselves as well. Our normal perception of consciousness is perhaps like the sweetness of honey: something we think exists in the world, when it is in fact a judgement about the world made (unconsciously) by the mind.

What we are pointing at with the judgement “I am conscious” is perhaps the competence that we have to access states about the world, form expectations about those states and judge their value to us, coded in by evolution. That is, under this view, equivalent to saying that sugar is made of glucose molecules, not sweetness-magic. In everyday language we can talk about sugar as sweet and consciousness as “something-to-be-like-ness” or “having qualia”, which is useful and probably necessary for us to function, but that is a somewhat misleading projection made by our world-accessing and assessing consciousness that really exists in the world. That notion of consciousness is not subject to the Hard Problem. It may not be an easy problem to figure out how consciousness works, but it does not appear impossible to explain it scientifically as pure matter like anything else in the natural world, at least in theory. I’m pretty confident that we will solve consciousness, if by consciousness we mean the competence of a biological system to access states about the world, make judgements and form expectations. That is, however, not what most people mean when they say consciousness. Just as “real” magic refers to the magic that isn’t real, while the magic that is real, that can be performed in the world, is not “real magic”, “real” consciousness turns out to be a useful but misleading assessment (4). We should perhaps keep the word consciousness but adjust what we mean when we use it, for diplomacy.

Another difficulty I confront is why, e.g., colors and sounds look and sound the way they do, or why they have any quality at all, under this explanation. Where do they come from if they’re only labels my brain uses to distinguish inputs from the senses? Where does the yellowness of yellow come from? Maybe it’s not a sensible question, but only the murmuring of a confused primate. Then again, where does anything come from? If we can learn to shut up our bafflement about consciousness and sensibly reduce it down to physics – fair enough, but where does physics come from? That mystery remains, and will possibly always be out of reach, at least before the arrival of advanced superintelligent philosophers. For now, understanding how a physical computational system represents the world and creates judgements and expectations from perception presents enough of a challenge. It seems to be a good starting point to explore anyway (7).

I did not really put forth any particularly new ideas here; these are just some of my thoughts and repetitions of what I have read and heard others say, so I'm not sure if this post adds any value. My hope is that someone will at least find some of my references useful, and that it can provide a starting point for discussion. Bear in mind that this is my first post here; I would be very grateful to receive input and criticism! :-)

1. Check out Eliezer's hilarious tear down of philosophical zombies if you haven't already
2. http://reducing-suffering.org/hard-problem-consciousness/
3. [Video] TED talk by Dan Dennett http://www.ted.com/talks/dan_dennett_cute_sexy_sweet_funny
4. http://ase.tufts.edu/cogstud/dennett/papers/explainingmagic.pdf
5. Reading “The Moral Landscape” by Sam Harris increased my confidence in moral realism. Whether moral realism is true or false can obviously have implications for approaches to the value learning problem in AI alignment, and for the factual accuracy of the orthogonality thesis.
6. http://www.lehigh.edu/~mhb0/Dennett-WhereAmI.pdf
7. For anyone interested in getting a grasp of this scientific challenge I strongly recommend the book “A User’s Guide to Thought and Meaning” by Ray Jackendoff.

## Is Greed Stupid?

-7 23 June 2015 08:38PM

I just finished reading a fantastic Wait But Why post: How Tesla Will Change The World. One of the things it notes is that people in the auto and oil industries are trying to delay the introduction of Electric Vehicles (EVs) so they can make more money.

The post also explains how important it is that we become less reliant on oil.

1. Because we're going to run out relatively soon.
2. Because it's causing global warming.
So, from the perspective of these moneybag guys, here is how I see the cost-benefit of delaying the introduction of EVs:
• Make some more money, which gives them and their families a marginally more comfortable life.
• Not get a sense of purpose out of your career.
• Probably feel some sort of guilt about what you do.
• Avoid the short-term discomfort of changing jobs/careers.
This probably makes my opinions pretty clear:
• Because of diminishing marginal utility, I doubt that the extra money is making them much happier. I'm sure they're pretty well off to begin with. It could be the case that they're so used to their lifestyle that they really do need the extra money to be happy, but I doubt it.
• Autonomy, mastery and purpose are three of the most important things to get out of your career. There seems to be a huge opportunity cost to not working somewhere that provides you with a sense of purpose.
• To continue that thought, I'm sure they feel some sort of guilt for what they're doing. Or maybe not. But if they do, that seems like a relatively large cost.
• I understand that there's probably a decent amount of social pressure on them to conform. I'm sure that they surround themselves with people who are pro-oil and anti-electric. I'm sure that their companies put pressure on them to perform. I'm sure that they have families and all of that, and starting something new might be difficult. But these don't seem to be large enough costs to make their choices worthwhile. A big reason why I get this impression is that these costs are so short-term.
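
To make the diminishing-marginal-utility point in the first bullet concrete, here is a minimal sketch. It assumes a logarithmic utility of income, which is a common textbook choice and not something claimed in this post, and the income figures are made up purely for illustration.

```python
import math

def log_utility(income):
    """Assumed utility function: natural log of income (illustrative only)."""
    return math.log(income)

def gain_from_raise(income, raise_amount):
    """Utility gained from adding raise_amount to a given income."""
    return log_utility(income + raise_amount) - log_utility(income)

# Hypothetical figures: the same raise at two very different income levels.
modest = gain_from_raise(50_000, 100_000)
wealthy = gain_from_raise(5_000_000, 100_000)

# Under log utility, the same raise is worth far less to the richer person.
print(modest > wealthy)  # True
```

Other concave utility functions would give the same qualitative result; the logarithm is just the simplest to write down.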
I've been talking specifically about those in the auto and oil industries, but the same logic seems to apply to other greedy people (e.g. in finance). I get the impression that greed is stupid: that it doesn't make you happy, and that it isn't instrumentally rational. But I'd like to get the opinions of others.
