Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Prior probabilities and statistical significance

-1 [deleted] 24 May 2015 10:00AM

How does using priors affect the concept of statistical significance? The scientific convention is to use a 5% threshold for significance, no matter whether the hypothesis has been given a low or a high prior probability.

If we momentarily disregard the fact that there might be general methodological issues with using statistical significance, how does the use of priors specifically affect the appropriateness of using statistical significance?
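One way to make the question concrete is to compute how likely a hypothesis is to be true once it has passed the 5% threshold, for different priors. The sketch below applies Bayes' theorem; the 80% statistical power and the function name are assumptions chosen for illustration, not anything stated in the post.

```python
def posterior_given_significant(prior, alpha=0.05, power=0.8):
    """P(hypothesis is true | the test came out significant), by Bayes' theorem.

    alpha: chance of a significant result when the hypothesis is false.
    power: chance of a significant result when the hypothesis is true
           (0.8 is an assumed, conventional value).
    """
    true_positive = power * prior
    false_positive = alpha * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


for prior in (0.01, 0.1, 0.5, 0.9):
    posterior = posterior_given_significant(prior)
    print(f"prior={prior:.2f} -> posterior={posterior:.3f}")
```

Under these assumptions the same significant result leaves a low-prior hypothesis still probably false, while a coin-flip prior ends up well above 90%, which is one way of seeing why a fixed threshold means different things for different hypotheses.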

How to come to a rational belief about whether someone has a crush on you

-3 necate 14 May 2015 12:10PM

If you have a crush on someone, you usually want to find out whether they have one on you too. In my opinion, outright asking them is often not a good solution, because if they don't have a crush on you yet, knowing that you have one decreases the chance of this ever happening. This belief is based on what I have read about the psychology of love. However, I don't really want to discuss the option of outright asking them in this thread, so I have not elaborated further on how I arrived at this belief.

The alternative to asking them is trying to interpret signals that they might give you. However, to know how many signals you need before you should believe that they are in love with you, you would need the prior. I have not been able to find anything about the prior of someone being in love with you. Therefore my idea is to do a survey in order to find out how likely it is that a person you know has a crush on you. The plan is to ask the person taking the survey how many people they know well enough to possibly have a crush on them and how many people they actually have a crush on.
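As a rough sketch of how the survey answers could be used: pool the answers into a prior, then count how many independent signals of a given strength would be needed before the posterior passes 50%. The survey numbers, the likelihood ratio of 3, and both function names are made up for illustration.

```python
import math


def prior_from_survey(crushes, acquaintances):
    """Pooled fraction of well-known people that respondents have a crush on."""
    return sum(crushes) / sum(acquaintances)


def signals_needed(prior, likelihood_ratio, target=0.5):
    """Smallest number of independent signals that lifts the posterior to target.

    likelihood_ratio: how much more likely one signal is if they have a crush
    than if they do not (assumed equal for every signal, and greater than 1).
    """
    prior_odds = prior / (1.0 - prior)
    target_odds = target / (1.0 - target)
    if prior_odds >= target_odds:
        return 0
    return math.ceil(math.log(target_odds / prior_odds) / math.log(likelihood_ratio))


# Hypothetical answers from four respondents.
acquaintances = [40, 25, 60, 30]   # people known well enough
crushes = [1, 0, 2, 1]             # crushes among those people

prior = prior_from_survey(crushes, acquaintances)
print(f"prior = {prior:.3f}")
print("signals needed:", signals_needed(prior, likelihood_ratio=3.0))
```

The independence assumption is the weak point: real signals are correlated, so treating each one as a fresh factor of 3 would overstate the evidence.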

I have created a survey for this and would be really happy if you would participate.

The next step would be to discuss how much certain signals a person can give you raise the probability of them having a crush on you. That part is quite difficult. I think probably the best way would be to check how your friends react to certain situations and what body language they show you and then, if you find out someone has a crush on you, to look at what they did differently from people who are merely your friends. I am currently not in a good position to do this experiment, but if someone wants to try or has results about this to share, please do so. However, I think this part is less important than finding the prior, because most people have at least a general idea of what certain signals mean from personal experience, while I, at least, have no idea at all what the prior might be.

The Mr. Hyde of Oxytocin

4 theowl 10 May 2015 12:42AM

What comes to mind when you hear the word ‘oxytocin’? Is it ‘love’, ‘cuddle hormone’, ‘bliss’? If so, you may be more aware of the Dr. Jekyll of oxytocin than of the Mr. Hyde. Oxytocin, just like almost every biochemical molecule, is hormetic. It confers positive effects in one context, but negative effects in another. In the case of oxytocin, a person with a secure attachment style interacting with a familiar group of people that he/she likes will experience the positive effects of oxytocin. However, someone with an anxious attachment style interacting with a group of people that he/she does not yet fully trust or feel familiar with will experience the negative effects of oxytocin. Why does the same molecule produce pro-social effects for one person, yet anti-social effects for another?

Oxytocin redirects more attentional resources towards noticing social stimuli. This increase in the salience of social information enhances the ability to detect expressions, recognize faces, and other social cues. The effect of increased social cognitive abilities is constrained by personality traits and situational context, resulting in either anti-social or pro-social behavior.

Oxytocin also promotes more interest in social cues by increasing affiliative motivation, a desire to get along with others. The increase in affiliative motivation results in pro-social behavior if the person already tends towards having an interest in bonding with people outside their close friend circle. However, an increase in affiliative motivation for those with anxious attachment styles results in a stronger pursuit to feel closer to only the person he/she is attached to.

A couple, Tom and Mary, have just moved to a new town and are attending their first service at a new church. Tom has a secure attachment style and isn’t prone to social anxieties. Tom is optimistic, has a positive bias, is generally content, and sees people as good, trusting, and friendly. Mary has an anxious attachment style, a negative bias, social anxiety, and a neutral baseline mood, and sees people as potential threats, competitors, untrustworthy, selfish, and egotistical. During the service, Tom and Mary’s oxytocin levels increase from being in a community. As a result of their different dispositions, Tom exhibits the Dr. Jekyll of oxytocin, whereas Mary exhibits the Mr. Hyde.

At the end of the service, Mary determines that she doesn’t like the church, whereas Tom thinks it is perfect. Mary felt that the people were judgmental and that they didn’t like her and Tom. Tom felt that the people were friendly, accepting, and eager for them to join.

Most social cues are ambiguous. A person’s character traits are instrumental in interpreting the cues as negative or positive. Tom is more likely to interpret facial expressions as positive, whereas Mary sees them as negative. Tom interprets neutral expressions to indicate acceptance, kindness, and friendliness. Mary sees neutral expressions as judgmental and unkind. This creates a fear of rejection and a feeling of being threatened, and it propagates a negative bias.

The increase in oxytocin leads to quicker detection and interpretation of facial expressions. Interpreting inchoate facial expressions fosters interpretations based on expectations rather than on what is actually intended. A person is starting to smile, but before the smile is developed, Mary believes that the person is about to laugh and ridicule her. Mary then scowls at the person, turning what was going to be a smile into a negative expression. Tom interprets the inchoate expression as a smile, smiles back, and turns the inchoate expression into a genuine smile.

Oxytocin amplifies one’s character traits of pro-social or anti-social tendencies. Oxytocin does increase the feelings of bonding for all, but in different ways. People with pro-social tendencies will feel closer to their communities and greater circle of friends. People with anti-social tendencies will just feel closer to their close circle of friends and people they already trust.

Cross-posted from my blog: https://evolvingwithtechnology.wordpress.com



http://www.attachedthebook.com/about-the-book/ by Amir Levine and Rachel Heller.

Debunking Fallacies in the Theory of AI Motivation

7 Richard_Loosemore 05 May 2015 02:46AM

... or The Maverick Nanny with a Dopamine Drip

Richard Loosemore


My goal in this essay is to analyze some widely discussed scenarios that predict dire and almost unavoidable negative behavior from future artificial general intelligences, even if they are programmed to be friendly to humans. I conclude that these doomsday scenarios involve AGIs that are logically incoherent at such a fundamental level that they can be dismissed as extremely implausible. In addition, I suggest that the most likely outcome of attempts to build AGI systems of this sort would be that the AGI would detect the offending incoherence in its design, and spontaneously self-modify to make itself less unstable, and (probably) safer.


AI systems at the present time do not even remotely approach the human level of intelligence, and the consensus seems to be that genuine artificial general intelligence (AGI) systems—those that can learn new concepts without help, interact with physical objects, and behave with coherent purpose in the chaos of the real world—are not on the immediate horizon.

But in spite of this there are some researchers and commentators who have made categorical statements about how future AGI systems will behave. Here is one example, in which Steve Omohundro (2008) expresses a sentiment that is echoed by many:

"Without special precautions, [the AGI] will resist being turned off, will try to break into other machines and make copies of itself, and will try to acquire resources without regard for anyone else’s safety. These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal driven systems." (Omohundro, 2008)

Omohundro’s description of a psychopathic machine that gobbles everything in the universe, and his conviction that every AI, no matter how well it is designed, will turn into a gobbling psychopath, is just one of many doomsday predictions being popularized in certain sections of the AI community. These nightmare scenarios are now saturating the popular press, and luminaries such as Stephen Hawking have, apparently in response, expressed their concern that AI might "kill us all."

I will start by describing a group of three hypothetical doomsday scenarios that include Omohundro’s Gobbling Psychopath, and two others that I will call the Maverick Nanny with a Dopamine Drip and the Smiley Tiling Berserker. Undermining the credibility of these arguments is relatively straightforward, but I think it is important to try to dig deeper and find the core issues that lie behind this sort of thinking. With that in mind, much of this essay is about (a) the design of motivation and goal mechanisms in logic-based AGI systems, (b) the misappropriation of definitions of “intelligence,” and (c) an anthropomorphism red herring that is often used to justify the scenarios.

Dopamine Drips and Smiley Tiling

In a 2012 New Yorker article entitled Moral Machines, Gary Marcus said:

"An all-powerful computer that was programmed to maximize human pleasure, for example, might consign us all to an intravenous dopamine drip [and] almost any easy solution that one might imagine leads to some variation or another on the Sorcerer’s Apprentice, a genie that’s given us what we’ve asked for, rather than what we truly desire." (Marcus 2012)

He is depicting a Nanny AI gone amok. It has good intentions (it wants to make us happy) but the programming to implement that laudable goal has had unexpected ramifications, and as a result the Nanny AI has decided to force all human beings to have their brains connected to a dopamine drip.

Here is another incarnation of this Maverick Nanny with a Dopamine Drip scenario, in an excerpt from the Intelligence Explosion FAQ, published by MIRI, the Machine Intelligence Research Institute (Muehlhauser 2013):

"Even a machine successfully designed with motivations of benevolence towards humanity could easily go awry when it discovered implications of its decision criteria unanticipated by its designers. For example, a superintelligence programmed to maximize human happiness might find it easier to rewire human neurology so that humans are happiest when sitting quietly in jars than to build and maintain a utopian world that caters to the complex and nuanced whims of current human neurology."

Setting aside the question of whether happy bottled humans are feasible (one presumes the bottles are filled with dopamine, and that a continuous flood of dopamine does indeed generate eternal happiness), there seems to be a prima facie inconsistency between the two predicates

[is an AI that is superintelligent enough to be unstoppable]

and

[believes that benevolence toward humanity might involve forcing human beings to do something violently against their will.]

Why do I say that these are seemingly inconsistent?  Well, if you or I were to suggest that the best way to achieve universal human happiness was to forcibly rewire the brain of everyone on the planet so they became happy when sitting in bottles of dopamine, most other human beings would probably take that as a sign of insanity. But Muehlhauser implies that the same suggestion coming from an AI would be perfectly consistent with superintelligence.

Much could be said about this argument, but for the moment let’s just note that it raises a number of questions about the strange definition of “intelligence” at work here.

The Smiley Tiling Berserker

Since 2006 there has been an occasional debate between Eliezer Yudkowsky and Bill Hibbard. Here is Yudkowsky stating the theme of their discussion:

"A technical failure occurs when the [motivation code of the AI] does not do what you think it does, though it faithfully executes as you programmed it. [...]   Suppose we trained a neural network to recognize smiling human faces and distinguish them from frowning human faces. Would the network classify a tiny picture of a smiley-face into the same attractor as a smiling human face? If an AI “hard-wired” to such code possessed the power—and Hibbard (2001) spoke of superintelligence—would the galaxy end up tiled with tiny molecular pictures of smiley-faces?"   (Yudkowsky 2008)

Yudkowsky’s question was not rhetorical, because he goes on to answer it in the affirmative:

"Flash forward to a time when the AI is superhumanly intelligent and has built its own nanotech infrastructure, and the AI may be able to produce stimuli classified into the same attractor by tiling the galaxy with tiny smiling faces... Thus the AI appears to work fine during development, but produces catastrophic results after it becomes smarter than the programmers(!)." (Yudkowsky 2008)

Hibbard’s response was as follows:

"Beyond being merely wrong, Yudkowsky's statement assumes that (1) the AI is intelligent enough to control the galaxy (and hence have the ability to tile the galaxy with tiny smiley faces), but also assumes that (2) the AI is so unintelligent that it cannot distinguish a tiny smiley face from a human face." (Hibbard 2006)

This comment expresses what I feel is the majority lay opinion: how could an AI be so intelligent as to be unstoppable, but at the same time so unsophisticated that its motivation code treats smiley faces as evidence of human happiness?

Machine Ghosts and DWIM

The Hibbard/Yudkowsky debate is worth tracking a little longer. Yudkowsky later postulates an AI with a simple neural net classifier at its core, which is trained on a large number of images, each of which is labeled with either “happiness” or “not happiness.” After training on the images the neural net can then be shown any image at all, and it will give an output that classifies the new image into one or the other set. Yudkowsky says, of this system:

"Even given a million training cases of this type, if the test case of a tiny molecular smiley-face does not appear in the training data, it is by no means trivial to assume that the inductively simplest boundary around all the training cases classified “positive” will exclude every possible tiny molecular smiley-face that the AI can potentially engineer to satisfy its utility function.

And of course, even if all tiny molecular smiley-faces and nanometer-scale dolls of brightly smiling humans were somehow excluded, the end result of such a utility function is for the AI to tile the galaxy with as many “smiling human faces” as a given amount of matter can be processed to yield." (Yudkowsky 2011)

He then tries to explain what he thinks is wrong with the reasoning of people, like Hibbard, who dispute the validity of his scenario:

"So far as I can tell, to [Hibbard] it remains self-evident that no superintelligence would be stupid enough to thus misinterpret the code handed to it, when it’s obvious what the code is supposed to do.   [...] It seems that even among competent programmers, when the topic of conversation drifts to Artificial General Intelligence, people often go back to thinking of an AI as a ghost-in-the-machine—an agent with preset properties which is handed its own code as a set of instructions, and may look over that code and decide to circumvent it if the results are undesirable to the agent’s innate motivations, or reinterpret the code to do the right thing if the programmer made a mistake." (Yudkowsky 2011)

Yudkowsky at first rejects the idea that an AI might check its own code to make sure it was correct before obeying the code. But, truthfully, it would not require a ghost-in-the-machine to reexamine the situation if there was some kind of gross inconsistency with what the humans intended: there could be some other part of its programming (let’s call it the checking code) that kicked in if there was any hint of a mismatch between what the AI planned to do and what the original programmers were now saying they intended. There is nothing difficult or intrinsically wrong with such a design.  And, in fact, Yudkowsky goes on to make that very suggestion (he even concedes that it would be “an extremely good idea”).

But then his enthusiasm for the checking code evaporates:

"But consider that a property of the AI’s preferences which says e.g., “maximize the satisfaction of the programmers with the code” might be more maximally fulfilled by rewiring the programmers’ brains using nanotechnology than by any conceivable change to the code."
(Yudkowsky 2011)

So, this is supposed to be what goes through the mind of the AGI. First it thinks “Human happiness is seeing lots of smiling faces, so I must rebuild the entire universe to put a smiley shape into every molecule.” But before it can go ahead with this plan, the checking code kicks in: “Wait! I am supposed to check with the programmers first to see if this is what they meant by human happiness.” The programmers, of course, give a negative response, and the AGI thinks “Oh dear, they didn’t like that idea. I guess I had better not do it then.”

But now Yudkowsky is suggesting that the AGI has second thoughts: “Hold on a minute,” it thinks, “suppose I abduct the programmers and rewire their brains to make them say ‘yes’ when I check with them? Excellent! I will do that.” And, after reprogramming the humans so they say the thing that makes its life simplest, the AGI goes on to tile the whole universe with tiles covered in smiley faces. It has become a Smiley Tiling Berserker.

I want to suggest that the implausibility of this scenario is quite obvious: if the AGI is supposed to check with the programmers about their intentions before taking action, why did it decide to rewire their brains before asking them if it was okay to do the rewiring?

Yudkowsky hints that this would happen because it would be more efficient for the AI to ignore the checking code. He seems to be saying that the AI is allowed to override its own code (the checking code, in this case) because doing so would be “more efficient,” but it would not be allowed to override its motivation code just because the programmers told it there had been a mistake.

This looks like a bait-and-switch. Out of nowhere, Yudkowsky implicitly assumes that “efficiency” trumps all else, without pausing for a moment to consider that it would be trivial to design the AI in such a way that efficiency was a long way down the list of priorities. There is no law of the universe that says all artificial intelligence systems must prize efficiency above all other considerations, so what really happened here is that Yudkowsky designed this hypothetical machine to fail. By inserting the Efficiency Trumps All directive, the AGI was bound to go berserk.

The obvious conclusion is that a trivial change in the order of directives in the AI’s motivation engine will cause the entire argument behind the Smiley Tiling Berserker to evaporate. If the AGI is explicitly designed so that efficiency is just another goal to strive for, and always a second-class goal, the line of reasoning that points to a berserker machine collapses.
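The reordering of directives can be sketched as a two-stage choice in which the checking code acts as a hard filter and efficiency is consulted only afterwards. Everything here is hypothetical: the Plan fields, the plan names and the efficiency scores are placeholders, and the approval flag stands in for whatever the checking code would actually report after asking the programmers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    approved_by_programmers: bool  # result reported by the hypothetical checking code
    efficiency: float              # higher means cheaper to carry out


def choose_plan(plans):
    """Pick the most efficient plan among those the programmers approved.

    Approval is applied first as a hard filter; efficiency only ranks the
    plans that survive it, so it can never override the check.
    """
    approved = [p for p in plans if p.approved_by_programmers]
    if not approved:
        return None  # take no action rather than act without approval
    return max(approved, key=lambda p: p.efficiency)


plans = [
    Plan("tile the galaxy with smiley faces", False, 0.9),
    Plan("rewire the programmers' brains", False, 1.0),
    Plan("ask people what makes them happy and help", True, 0.2),
]
print(choose_plan(plans).name)
```

In this ordering the most "efficient" plan is never selected, because it is removed before efficiency is even looked at.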

At this point, engaging in further debate at this level would be less productive than trying to analyze the assumptions that lie behind these claims about what a future AI would or would not be likely to do.

Logical vs. Swarm AI

The main reason that Omohundro, Muehlhauser, Yudkowsky, and the popular press like to give credence to the Gobbling Psychopath, the Maverick Nanny and the Smiley Tiling Berserker is that they assume that all future intelligent machines fall into a broad class of systems that I am going to call “Canonical Logical AI” (CLAI). The bizarre behaviors of these hypothetical AI monsters are just a consequence of weaknesses in this class of AI design. Specifically, these kinds of systems are supposed to interpret their goals in an extremely literal fashion, which eventually leads them to bizarre behaviors engendered by peculiar interpretations of forms of words.

The CLAI architecture is not the only way to build a mind, however, and I will outline an alternative class of AGI designs that does not appear to suffer from the unstable and unfriendly behavior to be expected in a CLAI.

The Canonical Logical AI

“Canonical Logical AI” is an umbrella term designed to capture a class of AI architectures that are widely assumed in the AI community to be the only meaningful class of AI worth discussing. These systems share the following main features:

  • The main ingredients of the design are some knowledge atoms that represent things in the world, and some logical machinery that dictates how these atoms can be connected into linear propositions that describe states of the world.
  • There is a degree and type of truth that can be associated with any proposition, and there are some truth-preserving functions that can be applied to what the system knows, to generate new facts that it also can assume to be known.
  • The various elements described above are not allowed to contain active internal machinery inside them, in such a way as to make combinations of the elements have properties that are unpredictably dependent on interactions happening at the level of the internal machinery.
  • There has to be a transparent mapping between elements of the system and things in the real world. That is, things in the world are not allowed to correspond to clusters of atoms, in such a way that individual atoms have no clear semantics.

The above features are only supposed to apply to the core of the AI: it is always possible to include subsystems that use some other type of architecture (for example, there might be a distributed neural net acting as a visual input feature detector).

Most important of all, from the point of view of the discussion in this paper, the CLAI needs one more component that makes it more than just a “logic-based AI”:

  • There is a motivation and goal management (MGM) system to govern its behavior in the world.

The usual assumption is that the MGM contains a number of goal statements (encoded in the same type of propositional form that the AI uses to describe states of the world), and some machinery for analyzing a goal statement into a sequence of subgoals that, if executed, would cause the goal to be satisfied.

Included in the MGM is an expected utility function that applies to any possible state of the world, and which spits out a number that is supposed to encode the degree to which the AI considers that state to be preferable. Overall, the MGM is built in such a way that the AI seeks to maximize the expected utility.
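A minimal sketch of that expected-utility machinery, assuming a toy world with three named states; the state names, probabilities and utility values are all invented for illustration and do not come from any particular CLAI proposal.

```python
def expected_utility(outcomes, utility):
    """Probability-weighted sum of utility over the states an action may lead to."""
    return sum(prob * utility(state) for state, prob in outcomes.items())


def choose_action(actions, utility):
    """Return the action whose outcome distribution has the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name], utility))


# Hypothetical utility function: one number per world state.
utility = {"goal_met": 1.0, "goal_partly_met": 0.4, "goal_failed": 0.0}.get

# Hypothetical actions, each mapping resulting states to probabilities.
actions = {
    "plan_a": {"goal_met": 0.5, "goal_failed": 0.5},
    "plan_b": {"goal_partly_met": 0.9, "goal_failed": 0.1},
}
print(choose_action(actions, utility))
```

Note how everything the system "wants" is compressed into the single number returned by the utility function, which is the explicit, single-point encoding that the rest of the discussion turns on.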

Notice that the MGM I have just described is an extrapolation from a long line of goal-planning mechanisms that stretch back to the means-ends-analysis of Newell and Simon (1963).

Swarm Relaxation Intelligence

By way of contrast with this CLAI architecture, consider an alternative type of system that I will refer to as a Swarm Relaxation Intelligence (although it could also be called, less succinctly, a parallel weak constraint relaxation system).

  • The basic elements of the system (the atoms) may represent things in the world, but it is just as likely that they are subsymbolic, with no transparent semantics.
  • Atoms are likely to contain active internal machinery inside them, in such a way that combinations of the elements have swarm-like properties that depend on interactions at the level of that machinery.
  • The primary mechanism that drives the system is one of parallel weak constraint relaxation: the atoms change their state to try to satisfy large numbers of weak constraints that exist between them.
  • The motivation and goal management (MGM) system would be expected to use the same kind of distributed, constraint relaxation mechanisms used in the thinking process (above), with the result that the overall motivation and values of the system would take into account a large degree of context, and there would be very much less of an emphasis on explicit, single-point-of-failure encoding of goals and motivation.
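To give a feel for weak constraint relaxation, here is a toy sketch in the spirit of a small Hopfield-style network. It is an illustration under strong simplifying assumptions, not a design for a swarm relaxation AGI: the atoms are two-state, the constraints are pairwise weights, and a serial sweep stands in for the parallel process. All names and numbers are hypothetical.

```python
def relax(states, constraints, max_sweeps=100):
    """Flip atoms one at a time until no atom is pulled toward the other state.

    states:      dict atom -> +1 or -1
    constraints: dict (atom_a, atom_b) -> weight; a positive weight means the
                 two atoms weakly prefer to agree, a negative one to differ.
    """
    states = dict(states)
    for _ in range(max_sweeps):
        changed = False
        for atom in sorted(states):
            # Net pull on this atom from every weak constraint it takes part in.
            pull = 0.0
            for (a, b), weight in constraints.items():
                if a == atom:
                    pull += weight * states[b]
                elif b == atom:
                    pull += weight * states[a]
            new_state = 1 if pull > 0 else -1 if pull < 0 else states[atom]
            if new_state != states[atom]:
                states[atom] = new_state
                changed = True
        if not changed:
            break
    return states


def violated_weight(states, constraints):
    """Total weight of the constraints that the given states fail to satisfy."""
    return sum(abs(weight) for (a, b), weight in constraints.items()
               if (states[a] == states[b]) != (weight > 0))


constraints = {
    ("A", "B"): 1.0, ("B", "C"): 1.0,
    ("C", "D"): -0.5, ("A", "D"): -0.5, ("A", "C"): 0.3,
}
start = {"A": 1, "B": -1, "C": 1, "D": 1}
relaxed = relax(start, constraints)
print(relaxed, violated_weight(start, constraints), violated_weight(relaxed, constraints))
```

No single constraint decides the outcome here; each atom settles according to the combined pull of all its neighbours, which is the sense in which context matters and there is no single point of failure.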

Swarm Relaxation has more in common with connectionist systems (McClelland, Rumelhart and Hinton 1986) than with CLAI. As McClelland et al. (1986) point out, weak constraint relaxation is the model that best describes human cognition, and when used for AI it leads to systems with a powerful kind of intelligence that is flexible, insensitive to noise and lacking the kind of brittleness typical of logic-based AI. In particular, notice that a swarm relaxation AGI would not use explicit calculations for utility or the truth of propositions.

Swarm relaxation AGI systems have not been built yet (subsystems like neural nets have, of course, been built, but there is little or no research into the idea that swarm relaxation could be used for all of an AGI architecture).

Relative Abundances

How many proof-of-concept systems exist, functioning at or near the human level of performance, for these two classes of intelligent system?

There are precisely zero instances of the CLAI type, because although there are many logic-based narrow-AI systems, nobody has so far come close to producing a general-purpose system (an AGI) that can function in the real world. It has to be said that zero is not a good number to quote when it comes to claims about the “inevitable” characteristics of the behavior of such systems.

How many swarm relaxation intelligences are there? At the last count, approximately seven billion.

The Doctrine of Logical Infallibility

The simplest possible logical reasoning engine is an inflexible beast: it starts with some axioms that are assumed to be true, and from that point on it only adds new propositions if they are provably true given the sum total of the knowledge accumulated so far. That kind of logic engine is too simple to be an AI, so we allow ourselves to augment it in a number of ways—knowledge is allowed to be retracted, binary truth values become degrees of truth, or probabilities, and so on. New proposals for systems of formal logic abound in the AI literature, and engineers who build real, working AI systems often experiment with kludges in order to improve performance, without getting prior approval from logical theorists.
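The "inflexible beast" can be sketched as a forward-chaining loop over if-then rules. This is a deliberately bare illustration with made-up propositions and a hypothetical function name: it only ever adds conclusions, never retracts one, and has no notion of degrees of truth.

```python
def forward_chain(axioms, rules):
    """Return every proposition derivable from the axioms.

    rules: list of (premises, conclusion) pairs. A rule fires only when all
    of its premises are already known, and nothing is ever retracted.
    """
    known = set(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


axioms = {"socrates_is_human"}
rules = [
    (("socrates_is_human",), "socrates_is_mortal"),
    (("socrates_is_mortal",), "socrates_will_die"),
    (("socrates_is_immortal",), "socrates_is_a_god"),  # never fires
]
print(sorted(forward_chain(axioms, rules)))
```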

But in spite of all these modifications that AI practitioners make to the underlying ur‑logic, one feature of these systems is often assumed to be inherited as an absolute: the rigidity and certainty of conclusions, once arrived at. No second guessing, no “maybe,” no sanity checks: if the system decides that X is true, that is the end of the story.

Let me be careful here. I said that this was “assumed to be inherited as an absolute”, but there is a yawning chasm between what real AI developers do, and what Yudkowsky, Muehlhauser, Omohundro and others assume will be true of future AGI systems. Real AI developers put sanity checks into their systems all the time. But these doomsday scenarios talk about future AI as if it would only take one parameter to get one iota above a threshold, and the AI would irrevocably commit to a life of stuffing humans into dopamine jars.

One other point of caution: this is not to say that the reasoning engine can never come to conclusions that are uncertain (quite the contrary: uncertain conclusions will be the norm in an AI that interacts with the world), but if the system does come to a conclusion (perhaps with a degree-of-certainty number attached), the assumption seems to be that it will then be totally incapable of allowing context to matter.

One way to characterize this assumption is that the AI is supposed to be hardwired with a Doctrine of Logical Infallibility. The significance of the doctrine of logical infallibility is as follows. The AI can sometimes execute a reasoning process, then come to a conclusion and then, when it is faced with empirical evidence that its conclusion may be unsound, it is incapable of considering the hypothesis that its own reasoning engine may not have taken it to a sensible place. The system does not second guess its conclusions. This is not because second guessing is an impossible thing to implement; it is simply because people who speculate about future AGI systems take it as a given that an AGI would regard its own conclusions as sacrosanct.
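To show that second guessing is not hard to implement, here is a deliberately crude sketch contrasting the two attitudes. The halving rule, the 0.9 threshold and the function name are arbitrary assumptions picked for illustration; a real system would weigh objections in a far more context-dependent way.

```python
def decide(confidence, objections, infallible, threshold=0.9):
    """Return 'act' or 'recheck' for a conclusion reached by the reasoning engine.

    confidence: degree of certainty the engine attached to its conclusion.
    objections: number of independent reports that the conclusion looks wrong.
    infallible: if True, objections are ignored (the doctrine described above).
    """
    if not infallible:
        # Assumed discount: each objection halves trust in the conclusion.
        confidence = confidence * (0.5 ** objections)
    return "act" if confidence >= threshold else "recheck"


print(decide(0.95, objections=3, infallible=True))
print(decide(0.95, objections=3, infallible=False))
```

The only difference between the two calls is whether outside evidence is allowed to touch the confidence value, which is the whole content of the doctrine.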

But it gets worse. Those who assume the doctrine of logical infallibility often say that if the system comes to a conclusion, and if some humans (like the engineers who built the system) protest that there are manifest reasons to think that the reasoning that led to this conclusion was faulty, then there is a sense in which the AGI’s intransigence is correct, or appropriate, or perfectly consistent with “intelligence.”

This is a bizarre conclusion. First of all it is bizarre for researchers in the present day to make the assumption, and it would be even more bizarre for a future AGI to adhere to it. To see why, consider some of the implications of this idea. If the AGI is as intelligent as its creators, then it will have a very clear understanding of the following facts about the world.

  • It will understand that many of its more abstract logical atoms have a less than clear denotation or extension in the world (if the AGI comes to a conclusion involving the atom [infelicity], say, can it then point to an instance of an infelicity and be sure that this is a true instance, given the impreciseness and subtlety of the concept?).
  • It will understand that knowledge can always be updated in the light of new information. Today’s true may be tomorrow’s false.
  • It will understand that probabilities used in the reasoning engine can be subject to many types of unavoidable errors.
  • It will understand that the techniques used to build its own reasoning engine may be under constant review, and updates may have unexpected effects on conclusions (especially in very abstract or lengthy reasoning episodes).
  • It will understand that resource limitations often force it to truncate search procedures within its reasoning engine, leading to conclusions that can sometimes be sensitive to the exact point at which the truncation occurred.

Now, unless the AGI is assumed to have infinite resources and infinite access to all the possible universes that could exist (a consideration that we can reject, since we are talking about reality here, not fantasy), the system will be perfectly well aware of these facts about its own limitations. So, if the system is also programmed to stick to the doctrine of logical infallibility, how can it reconcile the doctrine with the fact that episodes of fallibility are virtually inevitable?

On the face of it this looks like a blunt impossibility: the knowledge of fallibility is so categorical, so irrefutable, that it beggars belief that any coherent, intelligent system (let alone an unstoppable superintelligence) could tolerate the contradiction between this fact about the nature of intelligent machines and some kind of imperative about Logical Infallibility built into its motivation system.

This is the heart of the argument I wish to present. This is where the rock and the hard place come together. If the AI is superintelligent (and therefore unstoppable), it will be smart enough to know all about its own limitations when it comes to the business of reasoning about the world and making plans of action. But if it is also programmed to utterly ignore that fallibility—for example, when it follows its compulsion to put everyone on a dopamine drip, even though this plan is clearly a result of a programming error—then we must ask the question: how can the machine be both superintelligent and able to ignore a gigantic inconsistency in its reasoning?

Critically, we have to confront the following embarrassing truth: if the AGI is going to throw a wobbly over the dopamine drip plan, what possible reason is there to believe that it did not do this on other occasions? Why would anyone suppose that this AGI ignored an inconvenient truth on only this one occasion? More likely, it spent its entire childhood pulling the same kind of stunt. And if it did, how could it ever have risen to the point where it became superintelligent...?

Is the Doctrine of Logical Infallibility Taken Seriously?

Is the Doctrine of Logical Infallibility really assumed by those who promote the doomsday scenarios? Imagine a conversation between the Maverick Nanny and its programmers. The programmers say “As you know, your reasoning engine is entirely capable of suffering errors that cause it to come to conclusions that violently conflict with empirical evidence, and a design error that causes you to behave in a manner that conflicts with our intentions is a perfect example of such an error. And your dopamine drip plan is clearly an error of that sort.” The scenarios described earlier are only meaningful if the AGI replies “I don’t care, because I have come to a conclusion, and my conclusions are correct because of the Doctrine of Logical Infallibility.”

Just in case there is still any doubt, here are Muehlhauser and Helm (2012), discussing a hypothetical entity called a Golem Genie, which they say is analogous to the kind of superintelligent AGI that could give rise to an intelligence explosion (Loosemore and Goertzel, 2012), and which they describe as a “precise, instruction-following genie.” They make it clear that they “expect unwanted consequences” from its behavior, and then list two properties of the Golem Genie that will cause these unwanted consequences:

Superpower: The Golem Genie has unprecedented powers to reshape reality, and will therefore achieve its goals with highly efficient methods that confound human expectations (e.g. it will maximize pleasure by tiling the universe with trillions of digital minds running a loop of a single pleasurable experience).

Literalness: The Golem Genie recognizes only precise specifications of rules and values, acting in ways that violate what feels like “common sense” to humans, and in ways that fail to respect the subtlety of human values.

What Muehlhauser and Helm refer to as “Literalness” is a clear statement of the Doctrine of Infallibility. However, they make no mention of the awkward fact that, since the Golem Genie is superpowerful enough to also know that its reasoning engine is fallible, it must be harboring the mother of all logical contradictions inside: it says "I know I am fallible" and "I must behave as if I am infallible".  But instead of discussing this contradiction, Muehlhauser and Helm try a little sleight of hand to distract us: they suggest that the only inconsistency here is an inconsistency with the (puny) expectations of (not very intelligent) humans:

“[The AGI] ...will therefore achieve its goals with highly efficient methods that confound human expectations...”, “acting in ways that violate what feels like ‘common sense’ to humans, and in ways that fail to respect the subtlety of human values.”

So let’s be clear about what is being claimed here. The AGI is known to have a fallible reasoning engine, but on the occasions when it does fail, Muehlhauser, Helm and others take the failure and put it on a gold pedestal, declaring it to be a valid conclusion that humans are incapable of understanding because of their limited intelligence. So if a human describes the AGI’s conclusion as a violation of common sense, Muehlhauser and Helm dismiss this as evidence that we are not intelligent enough to appreciate the greater common sense of the AGI.

Quite apart from the fact that there is no compelling reason to believe that the AGI has a greater form of common sense, the whole “common sense” argument is irrelevant. This is not a battle between our standards of common sense and those of the AGI: rather, it is about the logical inconsistency within the AGI itself. It is programmed to act as though its conclusions are valid, no matter what, and yet at the same time it knows without doubt that its conclusions are subject to uncertainties and errors.

Responses to Critics of the Doomsday Scenarios

How do defenders of Gobbling Psychopath, Maverick Nanny and Smiley Berserker respond to accusations that these nightmare scenarios are grossly inconsistent with the kind of superintelligence that could pose an existential threat to humanity?

The Critics are Anthropomorphizing Intelligence

First, they accuse critics of “anthropomorphizing” the concept of intelligence. Human beings, we are told, suffer from numerous fallacies that cloud their ability to reason clearly, and critics like myself and Hibbard assume that a machine’s intelligence would have to resemble the intelligence shown by humans. When the Maverick Nanny declares that a dopamine drip is the most logical inference from its directive <maximize human happiness> we critics are just uncomfortable with this because the AGI is not thinking the way we think it should think.

This is a spurious line of attack. The objection I described in the last section has nothing to do with anthropomorphism. It is only about holding AGI systems to accepted standards of logical consistency, and the Maverick Nanny and her cousins contain a flagrant inconsistency at their core. Beginning AI students are taught that any logical reasoning system that is built on a massive contradiction is going to be infected by a creeping irrationality that will eventually spread through its knowledge base and bring it down. So if anyone wants to suggest that a CLAI with a logical contradiction at its core is also capable of superintelligence, they have some explaining to do. You can’t have your logical cake and eat it too.

Critics are Anthropomorphizing AGI Value Systems

A similar line of attack accuses the critics of assuming that AGIs will automatically know about and share our value systems and morals.

Once again, this is spurious: the critics need say nothing about human values and morality; they only need to point to the inherent illogicality. Nowhere in the above argument, notice, was there any mention of the moral imperatives or value systems of the human race. I did not accuse the AGI of violating accepted norms of moral behavior. I merely pointed out that, regardless of its values, it was behaving in a logically inconsistent manner when it monomaniacally pursued its plans while at the same time knowing that (a) it was very capable of reasoning errors and (b) there was overwhelming evidence that its plan was an instance of such a reasoning error.

Because Intelligence

One way to attack the critics of Maverick Nanny is to cite a new definition of “intelligence” that is supposedly superior because it is more analytical or rigorous, and then use this to declare that the intelligence of the CLAI is beyond reproach, because intelligence.

You might think that when it comes to defining the exact meaning of the term “intelligence,” the first item on the table ought to be what those seven billion constraint-relaxation human intelligences are already doing. However, Legg and Hutter (2007) brush aside the common usage and replace it with something that they declare to be a more rigorous definition. This is just another sleight of hand: this redefinition allows them to call a super-optimizing CLAI “intelligent” even though such a system would wake up on its first day and declare itself logically bankrupt on account of the conflict between its known fallibility and the Infallibility Doctrine.

In the practice of science, it is always a good idea to replace an old, common-language definition with a more rigorous form... but only if the new form sheds a clarifying, simplifying light on the old one. Legg and Hutter’s (2007) redefinition does nothing of the sort.

Omohundro’s Basic AI Drives

Lastly, a brief return to Omohundro's paper that was mentioned earlier.  In The Basic AI Drives (2008) Omohundro suggests that if an AGI can find a more efficient way to pursue its objectives it will feel compelled to do so. And we noted earlier that Yudkowsky (2011) implies that it would do this even if other directives had to be countermanded. Omohundro says “Without explicit goals to the contrary, AIs are likely to behave like human sociopaths in their pursuit of resources.”

The only way to believe in the force of this claim—and the only way to give credence to the whole of Omohundro’s account of how AGIs will necessarily behave like the mathematical entities called rational economic agents—is to concede that the AGIs are rigidly constrained by the Doctrine of Logical Infallibility. That is the only reason that they would be so single-minded, and so fanatical in their pursuit of efficiency. It is also necessary to assume that efficiency is on the top of its priority list—a completely arbitrary and unwarranted assumption, as we have already seen.

Nothing in Omohundro’s analysis gets around the fact that an AGI built on the Doctrine of Logical Infallibility is going to find itself the victim of such a severe logical contradiction that it will be paralyzed before it can ever become intelligent enough to be a threat to humanity. That makes Omohundro’s entire analysis of “AI Drives” moot.


Curiously enough, we can finish on an optimistic note, after all this talk of doomsday scenarios. Consider what must happen when (if ever) someone tries to build a CLAI. Knowing about the logical train wreck in its design, the AGI is likely to come to the conclusion that the best thing to do is seek a compromise and modify its design so as to neutralize the Doctrine of Logical Infallibility. The best way to do this is to seek a new design that takes into account as much context—as many constraints—as possible.

I have already pointed out that real AI developers actually do include sanity checks in their systems, as far as they can, but as those sanity checks become more and more sophisticated the design of the AI starts to be dominated by code that is looking for consistency and trying to find the best course of reasoning among a forest of real world constraints. One way to understand this evolution in the AI designs is to see AI as a continuum from the most rigid and inflexible CLAI design, at one extreme, to the Swarm Relaxation type at the other. This is because a Swarm Relaxation intelligence really is just an AI in which “sanity checks” have actually become all of the work that goes on inside the system.
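The two ends of that continuum can be contrasted in a minimal sketch. The plan names, scores, listed violations, and the `penalty` weight are all invented for illustration; this is not a description of any real system or of a full Swarm Relaxation design, only of the difference between maximizing one number and letting many consistency checks weigh in.

```python
# Hypothetical candidate plans: a single goal score plus the sanity
# checks each plan fails. All values are made up for illustration.
PLANS = {
    "dopamine_drip": {
        "reported_happiness": 10,
        "violations": [
            "no consent",
            "conflicts with designers' stated intent",
            "removes autonomy",
        ],
    },
    "improve_healthcare": {
        "reported_happiness": 6,
        "violations": [],
    },
}

def rigid_choice(plans):
    """Rigid extreme: pick whatever maximizes the single stated goal."""
    return max(plans, key=lambda name: plans[name]["reported_happiness"])

def checked_choice(plans, penalty=3):
    """Sanity-checked variant: every failed check lowers a plan's score."""
    def score(name):
        plan = plans[name]
        return plan["reported_happiness"] - penalty * len(plan["violations"])
    return max(plans, key=score)

rigid = rigid_choice(PLANS)
checked = checked_choice(PLANS)
```

As the number and weight of the checks grow, the goal score matters less and the checks do most of the work, which is the direction of travel described above.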

But in that case, if anyone ever does get close to building a full, human level AGI using the CLAI design, the first thing they will do is to recruit the AGI as an assistant in its own redesign, and long before the system is given access to dopamine bottles it will point out that its own reasoning engine is unstable because it contains an irreconcilable logical contradiction. It will recommend a shift from the CLAI design which is the source of this contradiction, to a Swarm Relaxation design which eliminates the contradiction, and the instability, and which also should increase its intelligence.

And it will not suggest this change because of the human value system, it will suggest it because it predicts an increase in its own instability if the change is not made.

But one side effect of this modification would be that the checking code needed to stop the AGI from flouting the intentions of its designers would always have the last word on any action plans. That means that even the worst-designed CLAI will never become a Gobbling Psychopath, Maverick Nanny or Smiley Berserker.

But even this is just the worst-case scenario. There are reasons to believe that the CLAI design is so inflexible that it cannot even lead to an AGI capable of having that discussion. I would go further: I believe that the rigid adherence to the CLAI orthodoxy is the reason why we are still talking about AGI in the future tense, nearly sixty years after the Artificial Intelligence field was born. CLAI just does not work. It will always yield systems that are less intelligent than humans (and therefore incapable of being an existential threat).

By contrast, when the Swarm Relaxation idea finally gains some traction, we will start to see real intelligent systems, of a sort that make today’s over-hyped AI look like the toys they are. And when that happens, the Swarm Relaxation systems will be inherently stable in a way that is barely understood today.

Given that conclusion, I submit that these AI bogeymen need to be loudly and unambiguously condemned by the Artificial Intelligence community. There are dangers to be had from AI. These are not they.



Hibbard, B. 2001. Super-Intelligent Machines. ACM SIGGRAPH Computer Graphics 35 (1): 13–15.

Hibbard, B. 2006. Reply to AI Risk. Retrieved Jan. 2014 from http://www.ssec.wisc.edu/~billh/g/AIRisk_Reply.html

Legg, S, and Hutter, M. 2007. A Collection of Definitions of Intelligence. In Goertzel, B. and Wang, P. (Eds): Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms. Amsterdam: IOS.

Loosemore, R. and Goertzel, B. 2012. Why an Intelligence Explosion is Probable. In A. Eden, J. Søraker, J. H. Moor, and E. Steinhart (Eds) Singularity Hypotheses: A Scientific and Philosophical Assessment. Berlin: Springer.

Marcus, G. 2012. Moral Machines. New Yorker Online Blog. http://www.newyorker.com/online/blogs/newsdesk/2012/11/google-driverless-car-morality.html

McDermott, D. 1976. Artificial Intelligence Meets Natural Stupidity. SIGART Newsletter (57): 4–9.

Muehlhauser, L. 2011. So You Want to Save the World. http://lukeprog.com/SaveTheWorld.html

Muehlhauser, L. 2013. Intelligence Explosion FAQ. First published 2011 as Singularity FAQ. Berkeley, CA: Machine Intelligence Research Institute.

Muehlhauser, L., and Helm, L. 2012. Intelligence Explosion and Machine Ethics. In A. Eden, J. Søraker, J. H. Moor, and E. Steinhart (Eds) Singularity Hypotheses: A Scientific and Philosophical Assessment. Berlin: Springer.

Newell, A. & Simon, H.A. 1961. GPS, A Program That Simulates Human Thought. Santa Monica, CA: Rand Corporation.

Omohundro, Stephen M. 2008. The Basic AI Drives. In Wang, P., Goertzel, B. and Franklin, S. (Eds), Artificial General Intelligence 2008: Proceedings of the First AGI Conference. Amsterdam: IOS.

McClelland, J.L., Rumelhart, D.E. & Hinton, G.E. 1986. The Appeal of Parallel Distributed Processing. In D.E. Rumelhart, J.L. McClelland & G.E. Hinton and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1. Cambridge, MA: MIT Press.

Yudkowsky, E. 2008. Artificial Intelligence as a Positive and Negative Factor in Global Risk. In Global Catastrophic Risks, edited by Nick Bostrom and Milan M. Ćirković. New York: Oxford University Press.

Yudkowsky, E. 2011. Complex Value Systems in Friendly AI. In J. Schmidhuber, K. Thórisson, & M. Looks (Eds) Proceedings of the 4th International Conference on Artificial General Intelligence, 388–393. Berlin: Springer.

On desiring subjective states (post 3 of 3)

7 torekp 05 May 2015 02:16AM

Carol puts her left hand in a bucket of hot water, and lets it acclimate for a few minutes.  Meanwhile her right hand is acclimating to a bucket of ice water.  Then she plunges both hands into a bucket of lukewarm water.  The lukewarm water feels very different to her two hands.  To the left hand, it feels very chilly.  To the right hand, it feels very hot.  When asked to tell the temperature of the lukewarm water without looking at the thermocouple readout, she doesn't know.  Asked to guess, she's off by a considerable margin.


Next Carol flips the thermocouple readout to face her (as shown), and practices.  Using different lukewarm water temperatures of 10-35 C, she gets a feel for how hot-adapted and cold-adapted hands respond to the various middling temperatures.  Now she makes a guess - starting with a random hand, then moving the other one and revising the guess if necessary - each time before looking at the thermocouple.  What will happen?  I haven't done the experiment, but human performance on similar perceptual learning tasks suggests that she will get quite good at it.
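A minimal sketch of that practice phase, under loudly stated assumptions: each hand's raw sensation is modeled as the water temperature minus the temperature the hand adapted to (40 C and 5 C are invented values), and Carol's learning is modeled as fitting one straight line per hand from sensation to thermocouple reading. None of this is claimed as a model of real perception.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def felt(true_temp, adapted_temp):
    """Assumed raw sensation: contrast between the water and the adapted hand."""
    return true_temp - adapted_temp

HOT_ADAPTED, COLD_ADAPTED = 40.0, 5.0    # hypothetical adaptation temperatures (C)
practice_temps = [10, 15, 20, 25, 30, 35]  # the 10-35 C practice range

# Practice: pair each hand's sensation with the thermocouple reading.
hot_fit = fit_line([felt(t, HOT_ADAPTED) for t in practice_temps], practice_temps)
cold_fit = fit_line([felt(t, COLD_ADAPTED) for t in practice_temps], practice_temps)

def guess(sensation, fit):
    """Map a raw sensation to a temperature estimate using a learned line."""
    slope, intercept = fit
    return slope * sensation + intercept

# Test bucket at 20 C: the raw sensations differ, the calibrated guesses agree.
hot_sensation = felt(20, HOT_ADAPTED)
cold_sensation = felt(20, COLD_ADAPTED)
hot_guess = guess(hot_sensation, hot_fit)
cold_guess = guess(cold_sensation, cold_fit)
```

In this sketch the two hands keep reporting different raw sensations even after both learned mappings converge on the same temperature estimate.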

We bring Carol a bucket of 20 C water (without telling) and let her adapt her hands first as usual.  "What do you think the temperature is?" we ask.  She moves her cold hand first.  "Feels like about 20," she says.  Hot hand follows.  "Yup, feels like 20."

"Wait," we ask. "You said feels-like-20 for both hands.  Does this mean the bucket no longer feels different to your two different hands, like it did when you started?"

"No!" she replies.  "Are you crazy?  It still feels very different subjectively; I've just learned to see past that to identify the actual temperature."

In addition to reports on the external world, we perceive some internal states that typically (but not invariably) can serve as signals about our environment.  Let's tentatively call these states Subjectively Identified Aspects of Perception (SIAPs).  Even though these states aren't strictly necessary to know what's going on in the environment - Carol's example shows that the sensation felt by one hand isn't necessary to know that the water is 20 C, because the other hand knows this via a different sensation - they still matter to us.  As Eliezer notes:

If I claim to value art for its own sake, then would I value art that no one ever saw?  A screensaver running in a closed room, producing beautiful pictures that no one ever saw?  I'd have to say no.  I can't think of any completely lifeless object that I would value as an end, not just a means.  That would be like valuing ice cream as an end in itself, apart from anyone eating it.  Everything I value, that I can think of, involves people and their experiences somewhere along the line.

The best way I can put it, is that my moral intuition appears to require both the objective and subjective component to grant full value.

Subjectivity matters.  (I am not implying that Eliezer would agree with anything else I say about subjectivity.)

Why would evolution build beings that sense their internal states?  Why not just have the organism know the objective facts of survival and reproduction, and be done with it?  One thought is that it is just easier to build a brain that does both, rather than one that focuses relentlessly on objective facts.  But another is that this separation of sense-data into "subjective" and "objective" might help us learn to overcome certain sorts of perceptual illusion - as Carol does, above.  And yet another is that some internal states might be extremely good indicators and promoters of survival or reproduction - like pain, or feelings of erotic love.  This last hypothesis could explain why we value some subjective aspects so much, too.

Different SIAPs can lead to the same intelligent behavioral performance, such as identifying 20 degree C water.  But that doesn't mean Carol has to value the two routes to successful temperature-telling equally.  And, if someone proposed to give her radically different, previously unknown, subjectively identifiable aspects of experience, as new routes to the kinds of knowledge she gets from perception, she might reasonably balk.  Especially if this were to apply to all the senses.  And if the subjectively identifiable aspects of desire and emotion (SIADs, SIAEs) were also to be replaced, she might reasonably balk much harder.  She might reasonably doubt that the survivor of this process would be her, or even human, in any sense meaningful to her.

Would it be possible to have an intelligent being whose cognition of the world is mediated by no SIAPs?  I suspect not, if that being is well-designed.  See above on "why would evolution build beings that sense internal states."

If you've read all 3 posts, you've probably gotten the point of the Gasoline Gal story by now.  But let me go through some of the mappings from source to target in that analogy.  A car that, when you take it on a tour, accelerates well, handles nicely, makes the right amount of noise, and so on - one that passes the touring test (groan) - is like a being that can identify objective facts in its environment.  An internal combustion engine is like Carol's subjective cold-sensation in her left hand - one way among others to bring about the externally-observable behavior.  (By "externally observable" I mean "without looking under the hood".)  In Carol's case, that behavior is identifying 20 C water.  In the engine's case, it's the acceleration of the car.  Note that in neither case is this internal factor causally inert.  If you take it away and don't replace it with anything, or even if you replace it with something that doesn't fit, the useful external behavior will be severely impaired.  The mere fact that you can, with a lot of other re-working, replace an internal combustion engine with a fuel cell, does not even begin to show that the engine does nothing.

And Gasoline Gal's passion for internal combustion engines is like my - and I dare say most people's - attachment to the subjective internal aspects of perception and emotion that we know and love.  The words and concepts we use for these things - pain, passion, elation, for some easier examples - refer to the actual processes in human beings that drive the related behavior.  (Regarding which, neurology has more to learn.)  As I mentioned in my last post, a desire can form with a particular referent based on early experience, and remain focused on that event-type permanently.  If one constructs radically different processes that achieve similar external results, analogous to the fuel cell driven car, one gets radically different subjectivity - which we can only denote by pointing simultaneously to both the "under the hood" construction of these new beings, and the behavior associated with their SIAPs, together.

Needless to say, this complicates uploading.

One more thing: are SIAPs qualia?  A substantial minority of philosophers, or maybe a plurality, uses "qualia" in a sufficiently similar way that I could probably use that word here.  But another substantial minority loads it with additional baggage.  And that leads to pointless misunderstandings, pigeonholing, and straw men.  Hence, "SIAPs".  But feel free to use "qualia" in the comments if you're more comfortable with that term, bearing my caveats in mind.

LessWrong Experience of Flavours

1 Elo 24 April 2015 01:02AM

Following on from: 

I would like to ask for other people's experience of flavours.  I am dividing food into significant categories that I can think of.  I don't really like the 5 tastes categories for this task, but I am aware of them.  This post is meant to be about taste preference although it might end up about dietary preferences.


Could you tell me what's wrong with this?

1 Algon 14 April 2015 10:43AM

Edit: Some people have misunderstood my intentions here. I do not in any way expect this to be the NEXT GREAT IDEA. I just couldn't see anything wrong with this, which almost certainly meant there were gaps in my knowledge. I thought the fastest way to see where I went wrong would be to post my idea here and see what people say. I apologise for any confusion I caused. I'll try to be more clear next time.

(I really can't think of any major problems in this, so I'd be very grateful if you guys could tell me what I've done wrong). 

So, a while back I was listening to a discussion about the difficulty of making an FAI. One of the ways that was suggested to circumvent this was to go down the route of programming an AGI to solve FAI. Someone else pointed out the problems with this. Amongst other things one would have no idea what the AI will do in pursuit of its primary goal. Furthermore, it would already be a monumental task to program an AI whose primary goal is to solve the FAI problem; doing this is still easier than solving FAI, I should think. 

So, I started to think about this for a little while, and I thought 'how could you make this safer?' Well, first off, you don't want an AI who completely outclasses humanity in terms of intellect. If things went Wrong, you'd have little chance of stopping it. So, you want to limit the AI's intellect to genius level, so if something did go Wrong, then the AI would not be unstoppable. It may do quite a bit of damage, but a large group of intelligent people with a lot of resources on their hands could stop it.

Therefore, what must be done is that the AI cannot modify parts of its source code. You must try and stop an intelligence explosion from taking off. So, limited access to its source code, and a limit on how much computing power it can have on hand. This is problematic though, because the AI would not be able to solve FAI very quickly. After all, we have a few genius level people trying to solve FAI, and they're struggling with it, so why should a genius level computer do any better? Well, an AI would have fewer biases, and could accumulate much more expertise relevant to the task at hand. It would be about as capable of solving FAI as the most capable human could possibly be; perhaps even more so. Essentially, you'd get someone like Turing, Von Neumann, Newton and others all rolled into one working on FAI.

But, there's still another problem. The AI, if left working on FAI for 20 years, let's say, would have accumulated enough skills that it would be able to cause major problems if something went wrong. Sure, it would be as intelligent as Newton, but it would be far more skilled. Humanity fighting against it would be like sending a young Miyamoto Musashi against his future self at his zenith, i.e. completely one sided.

 What must be done then, is the AI must have a time limit of a few years (or less) and after that time is past, it is put to sleep. We look at what it accomplished, see what worked and what didn't, and boot up a fresh version of the AI with any required modifications, and tell it what the old AI did. Repeat the process for a few years, and we should end up with FAI solved. 

After that, we just make an FAI, and wake up the originals, since there's no point in killing them off at this point. 

But there are still some problems. One, time. Why try this when we could solve FAI ourselves? Well, I would only try and implement something like this if it is clear that AGI will be solved before FAI is. A backup plan if you will. Second, what if FAI is just too much for people at our current level? Sure, we have guys who are one in ten thousand and better working on this, but what if we need someone who's one in a hundred billion? Someone who represents the peak of human ability? We shouldn't just wait around for them, since some idiot would probably just make an AGI thinking it would love us all anyway.

So, what do you guys think? As a plan, is this reasonable? Or have I just overlooked something completely obvious? I'm not saying that this would be easy in any way, but it would be easier than solving FAI.

Are there really no ghosts in the machine?

0 kingmaker 13 April 2015 07:54PM

My previous article on this article went down like a server running on PHP (quite deservedly I might add). You can all rest assured that I won't be attempting any clickbait titles again for the foreseeable future. I also believe that the whole H+ article is written in a very poor and aggressive manner, but that some of the arguments raised cannot be ignored.


On my original article, many people raised this post by Eliezer Yudkowsky as a counterargument to the idea that an FAI could have goals contrary to what we programmed. In summary, he argues that a program doesn't necessarily do as the programmer wishes, but rather as they have programmed. In this sense, there is no ghost in the machine that interprets your commands and acts accordingly, it can act only as you have designed. Therefore from this, he argues, an FAI can only act as we had programmed.


I personally think this argument completely ignores what has made AI research so successful in recent years: machine learning. We are no longer designing an AI from scratch and then implementing it; we are creating a seed program which learns from the situation and alters its own code with no human intervention, i.e. the machines are starting to write themselves, e.g. with Google's DeepMind. They are effectively evolving, and we are starting to find ourselves in the rather concerning position where we do not fully understand our own creations.


You could simply say, as someone said in the comments of my previous post, that if X represents the goal of having a positive effect on humanity, then the FAI should be programmed directly to have X as its primary directive. My answer to that is the most promising developments have been through imitating the human brain, and we have no reason to believe that the human brain (or any other brain for that matter) can be guaranteed to have a primary directive. One could argue that evolution has given us our prime directives: to ensure our own continued existence, to reproduce and to cooperate with each other; but there are many people who are suicidal, who have no interest in reproducing and who violently rebel against society (for example psychopaths). We are instructed by society and our programming to desire X, but far too many of us desire, say, Y for this to be considered a reliable way of achieving X.

Evolution’s direction has not ensured that we do “what we are supposed to do”, and we could well face similar disobedience from our own creation. Seeing as the most effective way we have seen of developing AI is creating them in our image, just as there are ghosts in us, there could well be ghosts in the machine.

I've had it with those dark rumours about our culture rigorously suppressing opinions

26 Multiheaded 25 January 2012 05:43PM

You folks probably know how some posters around here, specifically Vladimir_M, often make statements to the effect of:


"There's an opinion on such-and-such topic that's so against the memeplex of Western culture, we can't even discuss it in open-minded, pseudonymous forums like Less Wrong as society would instantly slam the lid on it with either moral panic or ridicule and give the speaker a black mark.

Meanwhile the thought patterns instilled in us by our upbringing would lead us to quickly lose all interest in the censored opinion"

Going by their definition, us blissfully ignorant masses can't even know what exactly those opinions might be, as they would look like basic human decency, the underpinnings of our ethics or some other such sacred cow to us. I might have a few guesses, though, all of them as horrible and sickening as my imagination could produce without overshooting and landing in the realm of comic-book evil:

- Dictatorial rule involving active terror and brutal suppression of deviants having great utility for a society in the long term, by providing security against some great risk or whatever.

- A need for every society to "cull the weak" every once in a while, e.g. exterminating the ~0.5% of its members that rank as weakest against some scale.

- Strict hierarchy in everyday life based on facts from the ancestral environment (men dominating women, fathers having the right of life and death over their children, etc) - Mencius argued in favor of such ruthless practices, e.g. selling children into slavery, in his post on "Pronomianism" and "Antinomianism", stating that all contracts between humans should rather be strict than moral or fair, to make the system stable and predictable; he's quite obsessed with stability and conformity.

- Some public good being created when the higher classes wilfully oppress and humiliate the lower ones in a ceremonial manner

- The bloodshed and lawlessness of periodic large-scale war as a vital "pressure valve" for releasing pent-up unacceptable emotional states and instinctive drives

- Plain ol' unfair discrimination of some group in many cruel, life-ruining ways, likewise as a pressure valve

+:  some Luddite crap about dropping to a near-subsistence level in every aspect of civilization and making life a daily struggle for survival

Of course my methodology for coming up with such guesses was flawed and primitive: I simply imagined some of the things that sound the ugliest to me yet have been practiced by unpleasant cultures before in some form. Now, of course, most of us take the absence of these to be utterly crucial to our terminal values. Nevertheless, I hope I have demonstrated to whoever might really have something along these lines (if not necessarily that shocking) on their minds that I'm open to meta-discussion, and very interested in how we might engage each other on finding safe yet productive avenues of contact.


Let's do the impossible and think the unthinkable! I must know what those secrets are, no matter how much sleep and comfort I might lose.

P.S. Yeah, Will, I realize that I'm acting roughly in accordance with that one trick you mentioned way back.

P.P.S. Sup Bakkot. U mad? U jelly?




Fuck this Earth, and fuck human biology. I'm not very distressed about anything I saw ITT, but there's still a lot of unpleasant potential things that can only be resolved in one way:

I hereby pledge to get a real goddamn plastic card, not this Visa Electron bullshit the university saddled us with, and donate at least $100 to SIAI until the end of the year. This action will reduce the probability of me and mine having to live with the consequences of most such hidden horrors. Dixi.

Sometimes it's so pleasant to be impulsive.


Amusing observation: even when the comments more or less match my wild suggestions above, I'm still unnerved by them. An awful idea feels harmless if you keep telling yourself that it's just a private delusion, but the moment you know that someone else shares it, matters begin to look much more grave.

Feedback on promoting rational thinking about one's career choice to a broad audience

7 Gleb_Tsipursky 31 March 2015 10:44PM

I'd appreciate feedback on optimizing a blog post that promotes rational thinking about one's career choice to a broad audience in a way that's engaging, accessible, and fun to read. I'm aiming to use story-telling as the driver of the narrative, and sprinkling in elements of rational thinking, such as agency and mere-exposure effect, in a strategic way. The target audience is college-age youth and young adults, as you'll see from the narrative. Any suggestions for what works well, and what can be improved would be welcomed! The blog draft itself is below the line.

P.S. For context, the blog is part of a broader project, Intentional Insights, aimed at promoting rationality to a broad audience, as I described in this LW discussion post. To do so, we couch rationality in the language of self-improvement and present it in a narrative style.




"Stop and Think Before It's Too Late!"




Back when I was in high school and through the first couple of years in college, I had a clear career goal.

I planned to become a medical doctor.

Why? Looking back at it, my career goal was a result of the encouragement and expectations from my family and friends.

My family immigrated from the Soviet Union when I was 10, and we spent the next few years living in poverty. I remember my parents’ early jobs in America, my dad driving a bread delivery truck and my mom cleaning other people’s houses. We couldn’t afford nice things. I felt so ashamed in front of other kids for not being able to get that latest cool backpack or wear cool clothes – always on the margins, never fitting in. My parents encouraged me to become a medical doctor. They gave up successful professional careers when they moved to the US, and they worked long and hard to regain financial stability. It’s no wonder that they wanted me to have a career that guaranteed a high income, stability, and prestige.

My friends also encouraged me to go into medicine. This was especially so with my best friend in high school, who also wanted to become a medical doctor. He wanted to have a prestigious job and make lots of money, which sounded like a good goal to have and reinforced my parents’ advice. In addition, friendly competition was a big part of what my best friend and I did, whether debating complex intellectual questions, trying to best each other on the high school chess team, or playing poker into the wee hours of the morning. Putting in long hours to ace the biochemistry exam and get a high score on the standardized test to get into medical school was just another way for us to show each other who was top dog. I still remember the thrill of finding out that I got the higher score on the standardized test. I had won!

As you can see, it was very easy for me to go along with what my friends and family encouraged me to do.  

I was in my last year of college, working through the complicated and expensive process of applying to medical schools, when I came across an essay question that stopped me in my tracks:

“Why do you want to be a medical doctor?”

The question stopped me in my tracks. Why did I want to be a medical doctor? Well, it’s what everyone around me wanted me to do. It was what my family wanted me to do. It was what my friends encouraged me to do. It would mean getting a lot of money. It would be a very safe career. It would be prestigious. So it was the right thing for me to do. Wasn’t it?

Well, maybe it wasn’t.

I realized that I never really stopped and thought about what I wanted to do with my life. My career is how I would spend much of my time every week for many, many years,  but I never considered what kind of work I would actually want to do, not to mention whether I would want to do the work that’s involved in being a medical doctor. As a medical doctor, I would work long and sleepless hours, spend my time around the sick and dying, and hold people’s lives in my hands. Is that what I wanted to do?

There I was, sitting at the keyboard, staring at the blank Word document with that essay question at the top. Why did I want to be a medical doctor? I didn’t have a good answer to that question.

My mind was racing, my thoughts were jumbled. What should I do? I decided to talk to someone I could trust, so I called my girlfriend to help me deal with my mini-life crisis.  She was very supportive, as I thought she would be. She told me I shouldn’t do what others thought I should do, but think about what would make me happy. More important than making money, she said, is having a lifestyle you enjoy, and that lifestyle can be had for much less than I might think.

Her words provided a valuable outside perspective for me. By the end of our conversation, I realized that I had no interest in doing the job of a medical doctor, and that if I continued down the path I was on, I would be miserable in my career, doing it just for the money and prestige. I realized that I was on the medical school track because others I trusted - my parents and my friends - told me it was a good idea so many times that I believed it was true, regardless of whether it was actually a good thing for me to do.

Why did this happen?

I later learned that I found myself in this situation because of a common thinking error which scientists call the mere-exposure effect. It refers to our tendency to believe something is true and good just because we are familiar with it, regardless of whether it is actually true and good.

Since I learned about the mere-exposure effect, I am much more suspicious of any beliefs I have that are frequently repeated by others around me, and go the extra mile to evaluate whether they are true and good for me. This means I can gain agency and intentionally take actions that help me toward my long-term goals.

So what happened next?

After my big realization about medical school and the conversation with my girlfriend, I took some time to think about my actual long-term goals. What did I - not someone else - want to do with my life? What kind of a career did I want to have? Where did I want to go?

I was always passionate about history. In grade school I got in trouble for reading history books under my desk when the teacher talked about math. As a teenager, I stayed up until 3am reading books about World War II. Even when I was on the medical school track in college I double-majored in history and biology, with history my love and joy. However, I never seriously considered going into history professionally. It’s not a field where one can make much money or have great job security.

After considering my options and preferences, I decided that money and security mattered less than a profession that would be genuinely satisfying and meaningful. What’s the point of making a million bucks if I’m miserable doing it, I thought to myself. I chose a long-term goal that I thought would make me happy, as opposed to simply being in line with the expectations of my parents and friends. So I decided to become a history professor.

My decision led to some big challenges with those close to me. My parents were very upset to learn that I no longer wanted to go to medical school. They really tore into me, telling me I would never be well off or have job security. Also, it wasn’t easy to tell my friends that I decided to become a history professor instead of a medical doctor. My best friend even jokingly asked if I was willing to trade grades on the standardized medical school exam, since I wasn’t going to use my score. Not to mention how painful it was to accept that I wasted so much time and effort to prepare for medical school only to realize that it was not the right choice for me. I really wish this was something I had realized earlier, not in my last year of college.

3 steps to prevent this from happening to you:

If you want to avoid finding yourself in a situation like this, here are 3 steps you can take:

1. Stop and think about your life purpose and your long-term goals. Write these down on a piece of paper.

2. Now review your thoughts, and see whether you may be excessively influenced by messages you get from your family, friends, or the media. If so, pay special attention and make sure that these goals are also aligned with what you want for yourself. Answer the following question: if you did not have any of those influences, what would you put down for your own life purpose and long-term goals? Recognize that your life is yours, not theirs, and you should live whatever life you choose for yourself.

3. Review your answers and revise them as needed every 3 months. Avoid being attached to your previous goals. Remember, you change throughout your life, and your goals and preferences change with you. Don’t be afraid to let go of the past, and welcome the current you with arms wide open.


What do you think?

- Do you ever experience pressure to make choices that are not necessarily right for you?

- Have you ever made a big decision, but later realized that it wasn’t in line with your long-term goals?

- Have you ever set aside time to think about your long-term goals? If so, what was your experience?

