Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Introducing Goalclaw, personal goal tracker

1 Nic_Smith 21 October 2017 08:10PM

Quite a while ago, I wrote that there should be more software tools to assist with instrumental rationality. My recent attempt to create such a tool, GOALCLAW, is now available. GOALCLAW is a general-purpose goal-tracking webapp: for each tag you attach to day-to-day events, it currently shows the average effect of that tag on your goals, with more tag-based metrics and projections planned for the near future.

  • GOALCLAW is new:
    • A few editing features are missing and should be added in the next few months
    • The built-in analysis needs to be expanded from averages
    • I'm very interested in feedback on how to make this a more useful goal-tracker
  • The general idea is to make patterns in what's going on around you and what you're doing a bit more obvious, so you can then investigate, verify/experiment, and act to achieve your goals
  • You can download the information you enter for importing into spreadsheets, stats programs, etc.
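The per-tag averages GOALCLAW reports could be sketched roughly as below. The event schema here (a list of tags plus per-goal effect scores) is a hypothetical invention for illustration; the real webapp's data model may differ.

```python
from collections import defaultdict

def tag_averages(events):
    """For each (tag, goal) pair, average the effect scores recorded
    across all events carrying that tag."""
    totals = defaultdict(lambda: [0.0, 0])  # (tag, goal) -> [sum, count]
    for event in events:
        for tag in event["tags"]:
            for goal, score in event["effects"].items():
                totals[(tag, goal)][0] += score
                totals[(tag, goal)][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}

# Two example events: "exercise" affects "fitness" with scores 2 and 1,
# so its average effect on that goal comes out to 1.5.
events = [
    {"tags": ["exercise"], "effects": {"fitness": 2}},
    {"tags": ["exercise", "social"], "effects": {"fitness": 1, "mood": 2}},
]
print(tag_averages(events))
```

Exporting the raw events (as the bullet above notes you can) lets you run richer analyses than averages in a spreadsheet or stats program.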

Rationality as A Value Decider

1 DragonGod 05 June 2017 03:21AM

A Different Concept of Instrumental Rationality

Eliezer Yudkowsky defines instrumental rationality as “systematically achieving your values” and goes on to say: “Instrumental rationality, on the other hand, is about steering reality—sending the future where you want it to go. It’s the art of choosing actions that lead to outcomes ranked higher in your preferences. I sometimes call this ‘winning.’” [1]
I agree with Yudkowsky’s concept of rationality as a method for systematised winning. It is why I decided to pursue rationality—that I may win. However, I disagree with the notion of “systematically achieving your values” simply because it is too vague. What are my values? Happiness and personal satisfaction? You can maximise those by joining a religious organisation; in fact, I think I was happiest before I discovered the Way. More to the point, “maximising your values” isn’t specific enough for my taste.
“Likewise, decision theory defines what action I should take based on my beliefs. For any consistent set of beliefs and preferences I could have about Bob, there is a decision-theoretic answer to how I should then act in order to satisfy my preferences.” [2]
This implies that instrumental rationality is specific; from the above statement, I infer:
“For any decision problem to any rational agent with a specified psyche, there is only one correct choice to make.”
However, if we only seek to systematically achieve our values, instrumental rationality fails to be specific—there may be more than one solution to a problem in which we merely seek to maximise our values. I cherish the specificity of rationality; there is a certain comfort in knowing that there is a single correct solution to any problem, a right decision for any game—one merely need find it. As such, I sought a definition of rationality that I personally agree with; one that satisfies my criteria for specificity, and for winning. The answer I arrived at was: “Rationality is systematically achieving your goals.”
I love the above definition; it is specific—gone is the vagueness and uncertainty of achieving values. It is simple—gone is the worry over whether value X should be an instrumental value or a terminal value. Above all, it is useful—I know whether or not I have achieved my goals, and I can motivate myself more to achieve them. Rather than thinking about vague values I think about my life in terms of goals:
“I have goal X how do I achieve it?”
If necessary, I can specify sub-goals, and sub-goals for those sub-goals. I find thinking about your life in terms of goals to be achieved a more conducive model for problem solving—a more efficient, more useful model. I am many things, and above them all I am a utilitist—the worth of any entity is determined by its utility to me. I find the model of rationality as a goal enabler the more useful model.
Goals and values are not always aligned. For example, consider the problem below:

Jane is the captain of a boat carrying 100 people. The ship is about to capsize, and will unless ten people are sacrificed. Jane’s goal is to save as many people as possible; Jane’s values hold human lives sacred. Sacrificing ten people has a 100% chance of saving the remaining 90, while sacrificing no one and going with plan delta has a 10% chance of saving all 100, and a 90% chance that everyone dies.


The sanctity of human life is a terminal value for Jane. When seeking to actualise her values, Jane may well choose to go with plan delta, which has a 90% chance of preventing her from achieving her goal.
Values may be misaligned with goals, values may be inhibiting towards achieving our goals. Winning isn’t achieving your values; winning is achieving your goals.
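The arithmetic behind Jane's dilemma makes the misalignment concrete. A quick sketch (the plan names are just labels for the two options in the problem above):

```python
def expected_survivors(plan):
    # (probability, survivors) outcomes for each plan in Jane's dilemma
    outcomes = {
        "sacrifice_ten": [(1.0, 90)],             # certain: 90 of 100 saved
        "plan_delta":    [(0.1, 100), (0.9, 0)],  # gamble: all or none
    }
    return sum(p * n for p, n in outcomes[plan])

print(expected_survivors("sacrifice_ten"))  # 90.0 expected survivors
print(expected_survivors("plan_delta"))     # 10.0 expected survivors
```

By the goal of saving as many people as possible, sacrificing ten dominates plan delta nine-to-one in expectation; only the value placed on not sacrificing anyone pulls toward delta.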


I feel it is apt to define goals at this juncture, lest the definition be perverted and only goals aligned with values be considered “true/good goals”.
Goals are any objectives a self-aware agent consciously assigns itself to accomplish.
There are no true goals, no false goals, no good goals, no bad goals, no worthy goals, no worthless goals; there are just goals.
I do not consider goals something that “exist to affirm/achieve values”—you may assign yourself goals that affirm your values, or goals that run contrary to them; the difference is irrelevant. We work to achieve whatever goals you have specified.

The Psyche

The psyche is an objective map that describes a self-aware agent functioning as a decision maker—rational or not. The sum total of an individual’s beliefs (all knowledge counts as belief), values, and goals forms their psyche. The psyche is unique to each individual. It is not a subjective evaluation of an individual by themselves, but an objective evaluation of the individual as they would appear to an omniscient observer. An individual’s psyche includes the totality of their map; the psyche is, among other things, a map that describes a map, so to speak.
When a decision problem is considered, the optimum solution cannot be determined without considering the psyche of the individual: the values they hold, the goals they seek to achieve, and their mental map of the world.
Eliezer Yudkowsky seems to believe that we have an extremely limited ability to alter our psyche. He posits that we can’t choose to believe the sky is green at will. I never really bought this, especially given my own anecdotal evidence. I’ll come back to altering beliefs later.
Yudkowsky describes the human psyche as: “a lens that sees its own flaws”. [3] I personally would extend this definition; we are not merely “a lens that sees its own flaws”, we are also “a lens that corrects itself”—the self-aware AI that can alter its own code. The psyche can be altered at will—or so I argue.
I shall start with values. Values are neither permanent nor immutable. I’ve had a slew of values over the years; while Christian, I valued faith, now I adhere to Thomas Huxley’s maxim:

Scepticism is the highest of duties; blind faith the one unpardonable sin.


Another example: prior to my enlightenment I held emotional reasoning in high esteem, and could be persuaded by emotional arguments; after my enlightenment I upheld rational reasoning. Okay, that isn’t entirely true; my answer to the boat problem had always been to sacrifice the ten people, so that doesn’t exactly work, but I was more emotional then, and could be swayed by emotional arguments. Before I discovered the Way earlier this year (when I was fumbling around in the dark searching for rationality), I viewed all emotion as irrational, and my values held logic and reason above all. Back then, I was a true apath, completely unfeeling. I later read arguments for the utility of emotions, and readjusted my values accordingly. I have readjusted my values several times along the journey of life; just recently, I repressed my values relating to pleasure from feeding, to aid my current routine of intermittent fasting. I similarly repressed my values around sexual arousal/pleasure, as I felt it would make me more competent. Values can be altered, and I suspect many of us have done it at least once in our lives—we are the lens that corrects itself.

Getting back to belief (whether you can choose to believe the sky is green at will): I argue that you can; it is just a little more complicated than altering your values. Changing your beliefs—changing your actual anticipation-controllers, truly redrawing the map—would require certain alterations to your psyche in order for it to retain a semblance of consistency. In order to be able to believe the sky is green, you would have to:

  • Repress your values that make you desire true beliefs.
  • Repress your values that make you give priority to empirical evidence.
  • Repress your values that make you sceptical.
  • Create (or grow, if you already have one) a new value that supports blind faith.
  • Repress your values that support curiosity.
  • Create (or grow, if you already have one) a new value that supports ignorance.

By the time you’ve done the ‘edits’ listed above, you would be able to freely believe that the sky is green, or that snow is black, or that the earth rests on the back of a giant turtle, or that a teapot floats in the asteroid belt. I warn you, though: by the time you’ve successfully accomplished these edits, your psyche will be completely different from now, and you will be—I argue—a different person. If any of you were worried that the happiness of stupidity was forever closed to you, then fear not; it is open to you again—if you truly desire it. Be forewarned: the “you” that would embrace it would be different from the “you” now, and not one I’m sure I’d want to associate with. The psyche is alterable; we are the masters of our own minds—the lens that corrects itself.
I do not posit that we can alter all of our psyche; I suspect there are aspects of our cognitive machinery that are unalterable—“hardcoded”, so to speak. However, my neuroscience is non-existent, so I shall leave this issue to those better equipped to comment on it.

Values as Tools

In my conception of instrumental rationality, values are no longer put on a pedestal; they are no longer sacred. There are no terminal values anymore—only instrumental ones. Values aren’t the masters anymore; they’re slaves—they’re tools.
The notion of values as tools may seem disturbing to some, but I find it to be quite a useful model, and as such I shall keep it.
Take the ship problem Jane was presented with above: had Jane deleted her value holding human life sacred, she would have been able to make the decision with the highest probability of achieving her goal. She could even add a value that suppressed empathy, to assist her in similar situations—though some might feel that is overkill. I once asked a question on a particular subreddit:
“Is altruism rational?”
My reply was a quick and dismissive:
“Rationality doesn’t tell you what values to have, it only tells you how to achieve them.”
The answer was the standard textbook reply that anyone who had read the Sequences or RAZ (Rationality: From AI to Zombies) would produce; I had read neither at the time. Nonetheless, I was reading HPMOR (Harry Potter and the Methods of Rationality), and it did sound like something Harry would say. After downloading my own copy of RAZ, I found that the answer was indeed correct—as long as I accepted Yudkowsky’s conception of instrumental rationality. Now that I reject it, and consider rationality a tool for enabling goals, I have a more apt response:

What are your goals?


If your goals are to have a net positive effect on the world (do good so to speak) then altruism may be a rational value to have. If your goals are far more selfish, then altruism may only serve as a hindrance.

The utility of “Values as Tools” isn’t just that some values may harm your goals, nay it does much more. The payoff of a decision is determined by two things:

  1. How much closer it brings you to the realisation of your goals.
  2. How much it aligns with your values.

Choosing values that are doubly correlated with your current goals (you actualise your values when you make goal-conducive decisions, and you run against your values when you make goal-deleterious decisions) exaggerates the positive payoff of goal-conducive decisions and the negative payoff of goal-deleterious ones. This aggrandising of payoffs serves as a strong motivator towards making goal-conducive decisions—large rewards, large punishments—a perfect propulsion system, so to speak.
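The double-correlation effect can be sketched numerically. The additive payoff function and the correlation strength `k` below are illustrative assumptions, not a claim about how minds actually sum motivation:

```python
def payoff(goal_progress, value_alignment):
    # Total payoff of a decision: progress toward goals plus
    # how much it actualises your values (equal weights, assumed).
    return goal_progress + value_alignment

# Misaligned values: a goal-conducive decision (+1 progress) that
# runs against your values (-1 alignment) nets zero motivation.
print(payoff(+1, -1))  # 0

# Doubly correlated values: alignment tracks progress, so both the
# reward for good decisions and the penalty for bad ones are amplified.
k = 2  # strength of the value-goal correlation (assumed)
print(payoff(+1, k * +1))  # +3: larger reward
print(payoff(-1, k * -1))  # -3: larger punishment
```

The swing between the best and worst case widens from 0 to ±3: that widened gap is the "propulsion system" described above.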

The utility of the “Values as Tools” approach is that it serves as a strong motivator towards goal conducive decision making.


It has been brought to my attention that a life such as the one I describe may be “an unsatisfying life” and “a life not worth living”. I could reply that I do not seek to maximise happiness, but that may be dodging the issue; I first conceived of rationality as a value decider when thinking about how I would design an AI, and it goes without saying that humans are not computers.

I offer a suggestion: order your current values on a scale of preference. Note the value (or set thereof) utmost in your scale of preference—the value you would choose if you could achieve only one. Pick a goal aligned with that value (or set thereof); that goal shall be called your “prime goal”. The moment you pick your prime goal, you fix it. From then on, you no longer change your goals to align with your values; you change your values to align with your goals. Your aim in life is to achieve your prime goal, and you pick values and subgoals that will help you achieve it.


[1] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 7, 2015, MIRI, California.
[2] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 203, 2015, MIRI, California.
[3] Eliezer Yudkowsky, “Rationality: From AI to Zombies”, pg 40, 2015, MIRI, California.

The time you have

5 Elo 05 January 2017 02:13AM

Original post: http://bearlamp.com.au/the-time-you-have/

Part 1: Exploration-Exploitation

Part 2: Bargaining Trade-offs to your brain.

Part 2a: Empirical time management

Part 3: The time that you have

There is a process called Immunity to Change, by Robert Kegan.  The process is designed to deal with stubborn personal problems.  The first step in the process is to make a list of all the things you are doing, or not doing, that do not contribute to the goal.  As you go through the process you analyse why you do these things, based on what it feels like to do them.

The process is meant to be done with structure, but can be done simply by asking.  Yesterday I asked someone, and he said he ate sugar, ate carbs, and didn't exercise.  Knowing this alone doesn't solve the problem, but it helps.

The ITC process was generated by observing patients and therapists over thousands of hours and thousands of cases.  Kegan observed what seemed effective at bringing about change in people, and generated this process to assist in doing so.  ITC hits on a fundamental universal; my brief guide on Empirical time management, as well as part 1 (exploration-exploitation) of this series, speaks to the same universal.  Namely: what we are doing with our time is defined by everything we are choosing not to do with our time.  It's a trade-off between our values, and in ITC it is often a matter of discovering the hidden counter-commitments to our goals.

The interesting thing about what you end up doing with your time is that these are the things that form your revealed preferences.  Revealed preference theory is an economic theory that differentiates between people's stated preferences and their actual actions and behaviours.  It's all good and well to say that your preferences are one thing, but if you never end up doing that; your revealed preferences are in fact something entirely different.

For example: if you say you want to be a healthy person, and yet you never find yourself doing the things you say you want to do in order to be healthy, your revealed preferences suggest you are not in fact acting like a healthy person.  If you live to the ripe old age of 55 and the heavy weight of 130kg, and you never end up exercising several times a week or eating healthy food, then your health goals were a rather weak preference compared to the things you actually ended up doing (eating plenty and not keeping fit).

It's important to note that revealed preferences are distinct from preferences; they are their own subset, just another description that informs the map of "me as a person".  In many ways, a revealed preference is much more real than a simple preference that never actually comes about.  On a philosophical level: suppose we have a LoudMouthBot, and all it does is declare its preferences.  "I want everyone to be friends", "you need to be friends with me".  However, it never does anything.  You can log into the bot's IRC channel and see it declaring preferences, day in, day out, hour after hour, and yet never acting on those preferences.  It's just a bot, spitting out words that are preferences (almost analogous to a p-zombie).  You could look at LoudMouthBot from the outside and say, "all it does is spew text into a text chat", and that would be an observation which for all purposes can be taken as true.  In contrast, AgentyBot doesn't merely declare a preference; AgentyBot knows the litany of Tarski:

If the sky is blue

I desire to believe that the sky is blue,

If the sky is not blue

I desire to believe that the sky is not blue.

Or for this case; a litany of objectivity,

If my revealed preferences show that I desire this goal

I desire to know that it is my goal,

If my revealed preferences show that I do not desire this goal

I desire to know that it is not my goal.

Revealed preferences work in two directions.  On the one hand you can discover your revealed preferences and let that inform your future judgements and future actions.  On the other hand you can make your revealed preferences show that they line up with your goal.

A friend asked me how she should find her purpose. Easier said than done, right? That's why I suggested an exercise that does the first of the two.  In contrast, if you already know your goals, you want to take stock of what you are doing and align it with those desired goals.


I have already covered how to empirically assess your time; that would be the first step of taking stock of what you are doing.

The second step is to figure out your desired goals.  Unfortunately the process for doing so is not always obvious.  Some people can literally take five minutes and a piece of paper and list off their goals.  For everyone else I have some clues, in the form of the list of common human goals.  By going down the list of goals that people commonly pursue, you can cue your sense of what you care about and figure out which ones matter to you.  There are other exercises, but I take it as read that knowing your goals is important.  Once you have your list of goals, you might like to estimate what fraction of your time you want to offer to each of them.

The third step is one that I am yet to write about.  Your job is to compare the list of your goals with the list of your time use, and consider which object-level tasks would bring you towards your goals, and which actions you are taking that are not moving you towards them.
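That comparison can be sketched as a simple gap table. The goal names and fractions below are invented for illustration; the idea is just to line up the fraction of time you want to give each goal against the fraction you actually give it:

```python
def time_goal_gap(desired, actual):
    """Compare the fraction of time you want to spend on each goal
    (`desired`) with the fraction you actually spend (`actual`).
    Positive gap: under-invested; negative gap: over-invested."""
    goals = set(desired) | set(actual)
    return {g: desired.get(g, 0.0) - actual.get(g, 0.0) for g in goals}

desired = {"health": 0.3, "career": 0.5, "social": 0.2}
actual  = {"health": 0.05, "career": 0.6, "social": 0.1, "tv": 0.25}
for goal, gap in sorted(time_goal_gap(desired, actual).items()):
    print(f"{goal}: {gap:+.2f}")
```

Goals with large positive gaps (here "health") are the ones your revealed preferences are neglecting; activities with negative gaps and no stated goal at all (here "tv") are where the time is actually going.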

Everything that you do takes time.  Any goal you want to head towards will take time; if you are spending your time on a task towards one goal and not on a task towards another goal, you are preferring the task you are doing over the other.

If these are your revealed preferences, what do you reveal that you care about?

I believe that each of us has potential.  But that word is an applause light; "potential" doesn't really have a meaning yet.  I believe that each of us could:

  1. Define what we really care about.
  2. Define what results we think we can aim for within what we really care about
  3. Define what actions we can take to yield a trajectory towards those results
  4. Stick to it because it's what we really want to do.

That's what's important right?  Doing the work you value because it leads towards your goals (which are the things you care about).

If you are not doing that, then your revealed preferences show that you are not very strategic.  If you find parts of your brain doing what they want to the detriment of your other goals, you need to reason with them.  Use the powers of VoI (value of information), treat this as an exploration-exploitation problem, and run some experiments (post coming soon).

This whole process of defining what you really care about and then heading towards it needs doing now; otherwise you are making bad trade-offs.

Meta: this is part 3 of 4 of this series.

Meta: this took 5+ hours to piece together.  I am not yet very good at staying on task when I don't know how to put the right words in the right order yet.  I guess I need more practice.  What I usually do is take small breaks and come back to it.

Should you share your goals

5 Elo 14 December 2016 11:27PM

Original post: http://bearlamp.com.au/?p=507&preview=true

It's complicated. And depends on the environment in which you share your goals.

Scenario 1: you post on Facebook, "This month I want to lose 1kg. I'm worried I can't do it, so you guys should show me support." Your friends, being the best of aspiring-rationalist friends, believe your instructions are thought out and planned; after all, your goal is Specific, Measurable, Attainable, Realistic and Timely (SMART). In the interest of complying with your request, you get 17 likes and 10 comments of "wow awesome", "you go man", and "that's the way to do it"; even longer ones like "good planning will help you achieve your goals"; and some guy saying how he lost 2 kilos in a month, so 1kg should be easy as cake.

When you read all the posts, your brain goes "wow, lost weight like that", "earned the adoration of my friends for doing the thing", and rewards you with dopamine for the social support.  You feel great! So you have a party, eat what you like, relax and enjoy that feeling. One month later you have managed to gain a kilo, not lose one.

Scenario 2: you post on Facebook, "This month I want to lose 2kg (since last month wasn't so great). So all of you had better hold me to that, and help me get there." In the interest of complying with you, all your aspiring-rationalist friends post things like "Yeah right", "I'll believe it when I see it", "you couldn't do 1kg last month, what makes you think you can do it now?", "I predict he will lose one kilo but then put it back on again, haha", and "you're so full of it. You want to lose weight? I expect to see you running with me at 8am three times a week". Two weeks later someone posts to your wall, "how's the weight loss going? I think you failed already", and two people comment, "I bet he did" and "actually, he did come running in the morning".

When you read all the posts, your brain goes, "looks like I've got to prove to them that I can do this, and hey, this could be easy if they help me exercise". No dopamine reward, because you didn't get the acclaim. After two weeks you start to lose the initial momentum; the chocolate is moving to the front of the cupboard again. When you see the post on your wall you double down: you throw out the chocolate so it isn't a temptation, and message the runner to say you will be there tomorrow. After a month you have actually done it, and when you report back to your friends they congratulate you for your work: "my predictions were wrong; updating my beliefs", "congratulations", "teach me how you did it".

Those scenarios were made up, but they're designed to show that the outcome depends entirely on the circumstances in which you share your goals, the atmosphere in which you do it, and how you treat the events surrounding the sharing.

Given that in scenario 2 asking for help yielded an exercise partner, while scenario 1 yielded only encouragement, there is a clear distinction between useful goal-sharing and less-useful goal-sharing.

Yes, some goal-sharing is ineffective; but some can be effective. It's up to you whether you take the effective pathways or not.

Addendum: treat people's goals the right way, not the wrong way. Make a prediction about what you think will happen, then ask them critical questions. If something sounds unrealistic, gently prod them in the direction of being more realistic (emphasis on gentle). Relevant examples: "what happens over the xmas silly season when there is going to be lots of food around; how will you avoid putting on weight?", "do you plan to exercise?", "what do you plan to do differently from last month?". DO NOT reward people for not achieving their goals.

Meta: this is a repost from when I wrote it here. Because I otherwise have difficulty searching for it and finding it.

related: http://lesswrong.com/lw/l5y/link_the_problem_with_positive_thinking/

Counterfactual do-what-I-mean

2 Stuart_Armstrong 27 October 2016 01:54PM

A putative new idea for AI control; index here.

The counterfactual approach to value learning could possibly be used to allow natural-language goals for AIs.

The basic idea is that when the AI is given a natural language goal like "increase human happiness" or "implement CEV", it is not to figure out what these goals mean, but to follow what a pure learning algorithm would establish these goals as meaning.

This would be safer than a simple figure-out-the-utility-you're-currently-maximising approach, but it still has a few drawbacks. Firstly, the learning algorithm has to be effective itself (in particular, modifying human understanding of the words should be ruled out, and the learning process must avoid concluding that simpler interpretations are always better). And secondly, humans don't yet know what these words mean outside our usual comfort zone, so the "learning" task also involves the AI extrapolating beyond what we know.

A collection of Stubs.

-5 Elo 06 September 2016 07:24AM

In light of SDR's comment yesterday, instead of writing a new post today I compiled my list of ideas I wanted to write about, partly to lay them out there and see if any stood out as better than the rest, and partly so that they would be a little more out in the wild than if I held them until I got around to them.  I realise there is not a thesis in this post, but I figured it would be better to write one of these than to write each idea in its own post, with the potential to be good or bad.

Original post: http://bearlamp.com.au/many-draft-concepts/

I create ideas at a rate of about three a day, without trying to.  I write at a rate of about 1.5 a day, which leaves me always behind.  Even if I write about the best ideas I can think of, some good ones might never be covered.  This is an effort to draft out a good stack of them, so that maybe I won't have to write them all out in full, by better defining which ones are good and which are more useless.

With that in mind, in no particular order - a list of unwritten posts:

From my old table of contents

Goals of your lesswrong group – A guided/work-through exercise in deciding why the group exists and what it should do.  Help people work out what they want out of it (do people know?): setting goals, doing something particularly interesting or routine, having fun, changing your mind, being activists in the world around you.  Whatever the reasons you care about, work them out and move towards them.  Nothing particularly groundbreaking in the process here: sit down with the group with pens and paper, maybe run a resolve cycle, maybe talk about ideas and settle on a few, then decide how to carry them out.  Relevant links: Sydney meetup, group resources. (estimate 2hrs to write)

Goals interrogation + goal levels – Goal interrogation is about asking <is this thing I want to do actually a goal of mine> and <is my current plan the best way to achieve that>.  Goal levels are something out of Sydney Lesswrong that help you have mutually supporting long-term and short-term goals.  There are three main levels: Dream, Year, Daily (or approximately).  You want dream goals like going to the moon, yearly goals like getting another year further in your degree, and daily goals, like studying today, that contribute to the upper levels.  Any time you are feeling lost you can look at the guide you set out for yourself and use it to direct you. (3hrs)

How to human – A zero-to-human guide; a guide for basic functionality of a humanoid system.  Something of a conglomeration of Maslow, mental health, "so you feel like shit" and systems thinking.  Am I conscious?  Am I breathing?  Am I bleeding or injured (major or minor)?  Am I falling, or otherwise in danger and about to cause the earlier questions to return false?  Do I know where I am?  Am I safe?  Do I need to relieve myself (or other bodily functions, i.e. itchy)?  Have I had enough water?  Sleep?  Food?  Is my mind altered (alcohol or other drugs)?  Am I stuck with sensory input I can't control (noise, smells, things touching me)?  Am I too hot or too cold?  Is my environment too hot or too cold?  Or unstable?  Am I with people or alone?  Is this okay?  Am I clean (showered, teeth, other personal cleaning rituals)?  Have I had some sunlight and fresh air in the past few days?  Have I had too much sunlight or wind in the past few days?  Do I feel stressed?  Okay?  Happy?  Worried?  Suspicious?  Scared?  Was I doing something?  What am I doing?  Do I want to be doing something else?  Am I being watched (is that okay)?  Have I interacted with humans in the past 24 hours?  Have I had alone time in the past 24 hours?  Do I have any existing conditions I can run a check on, i.e. depression?  Are my valuables secure?  Are the people I care about safe?  (4hrs)

List of common strategies for getting shit done – things like scheduling/allocating time, pomodoros, committing to things externally, complice, beeminder, other trackers. (4hrs)

List of superpowers and kryptonites – asking the questions "what are my superpowers?" and "what are my kryptonites?".  Knowledge is power: working with your powers and working out how to avoid your kryptonites is a method of self-improvement.  What are you really good at, and what do you absolutely suck at and would be better delegating to other people?  The more you know about yourself, the more you can play to your powers or around your weaknesses and save yourself trouble.

List of effective behaviours – small life-improving habits that add together to make awesomeness from nothing, and how to pick them up.  Short list: toothbrush in the shower, scales in front of the fridge, healthy food in the most accessible position in the fridge, make the unhealthy stuff a little more inaccessible, keep some clocks fast, i.e. the clock in your car (so you get there early), prepare for expected barriers ahead of time (i.e. packing the gym bag and leaving it at the door), and more.

Stress prevention checklist – feeling off? You want to have already outsourced the hard work for “things I should check on about myself” to your past self. Make it easier for future you. Especially in the times that you might be vulnerable.  Generate a list of things that you want to check are working correctly.  i.e. did I drink today?  Did I do my regular exercise?  Did I take my medication?  Have I run late today?  Do I have my work under control?

Make it easier for future you. Especially in the times that you might be vulnerable. – as its own post in curtailing bad habits that you can expect to happen when you are compromised.  inspired by candy-bar moments and turning them into carrot-moments or other more productive things.  This applies beyond diet, and might involve turning TV-hour into book-hour (for other tasks you want to do instead of tasks you automatically do)

A p=np approach to learning – Sometimes you have to learn things the long way; but sometimes there is a short cut. Where you could say, “I wish someone had just taken me on the easy path early on”. It’s not a perfect idea; but start looking for the shortcuts where you might be saying “I wish someone had told me sooner”. Of course the answer is, “but I probably wouldn’t have listened anyway” which is something that can be worked on as well. (2hrs)

Rationalists guide to dating – Attraction. Relationships. Doing things with a known preference. Don’t like unintelligent people? Don’t try to date them. Think first; then act - and iteratively experiment; an exercise in thinking hard about things before trying trial-and-error on the world. Think about places where you might meet the kinds of people you want to meet, then use strategies that go there instead of strategies that flop in the general direction of progress.  (half written)

Training inherent powers (weights, temperatures, smells, estimation powers) – practice makes perfect right? Imagine if you knew the temperature always, the weight of things by lifting them, the composition of foods by tasting them, the distance between things without measuring. How can we train these, how can we improve.  Probably not inherently useful to life, but fun to train your system 1! (2hrs)

Strike to the heart of the question. The strongest one; not the one you want to defeat – Steelman not Strawman. Don’t ask “how do I win at the question”; ask, “am I giving the best answer to the best question I can give”.  More poetic than anything else - this post would enumerate the feelings of victory and what not to feel victorious about, as well as trying to feel what it's like to be on the other side of the discussion to yourself, frustratingly trying to get a point across while a point is being flung at yourself. (2hrs)

How to approach a new problem – similar to the “How to solve X” post.  But considerations for working backwards from a wicked problem, as well as trying “The least bad solution I know of”, Murphy-jitsu, and known solutions to similar problems.  Step 0. I notice I am approaching a problem.

Turning Stimming into a flourish – For autists: how to turn a flaw into something presentable.

How to manage time – estimating the length of future tasks (and more), covered in the notch system, and doing tasks in a different order.  But presented on its own.

Spices – Adventures in sensory experience land.  I ran an event of spice-smelling/guessing for a group of 30 people.  I wrote several documents in the process about spices and how to run the event.  I want to publish these.  As an exercise - it's a fun game of guess-the-spice.

Wing it VS Plan – All of the what, why, who, and what you should do of the two.  Some people seem to be the kind of person who is always just winging it.  In contrast, some people make ridiculously complicated plans that work.  Most of us are probably somewhere in the middle.  I suggest that the more of a planner you can be the better because you can always fall back on winging it, and you probably will.  But if you don't have a plan and are already winging it - you can't fall back on the other option.  This concept came to me while playing ingress, which encourages you to plan your actions before you make them.

On-stage bias – The changes we make when we go onto a stage include extra makeup to adjust for the bright lights, and speaking louder to adjust for the audience which is far away.  When we consider the rest of our lives, maybe we want to appear specifically X (i.e. confident, friendly), so we should change ourselves to suit the natural skews in how we present based on the "stage" we are appearing on.  Appear as the person you want to appear as, not the person you naturally appear as.

Creating a workspace – considerations when thinking about a “place” of work, including desk, screen, surrounding distractions, and basically any factors that come into it.  Similar to how the very long list of sleep maintenance suggestions covers environmental factors in your sleep environment but for a workspace.

Posts added to the list since then

Doing a cost|benefit analysis - This is something we rely on when enumerating the options and choices ahead of us, but something I have never explicitly looked into.  Some costs that can get overlooked include: Time, Money, Energy, Emotions, Space, Clutter, Distraction/Attention, Memory, Side effects, and probably more.  I'd like to see a How to X guide for CBA. (wikipedia)

Extinction learning at home - A cross between intermittent reward (the worst kind of addiction) and what we know about extinguishing it, then applying that to "convincing" yourself to extinguish bad habits by experiential learning.  Uses the CFAR-internal Double Crux technique: precommit yourself to a challenge, for example - "If I scroll through 20 facebook posts in a row and they are all not worth my time, I will be convinced that I should spend less time on facebook because it's not worth my time."  Adjust 20 to whatever position your double crux believes to be true, then run a test and iterate.  You have to genuinely agree with the premise before running the test.  This can work for a number of committed habits which you want to extinguish.  (new idea as at the writing of this post)

How to write a dating ad - A suggestion to include information that is easy to ask questions about (this is hard).  For example; don't write, "I like camping", write "I like hiking overnight with my dog", giving away details in a way that makes them worth inquiring about.  The same reason applies to why writing "I'm a great guy" is really not going to get people to believe you, as opposed to demonstrating the claim. (show, don't tell)

How to give yourself aversions - an investigation into aversive actions and potentially how to avoid collecting them when you have a better understanding of how they happen.  (I have not done the research and will need to do that before publishing the post)

How to give someone else an aversion - similar to above, we know we can work differently to other people, and at the intersection of that is a misunderstanding that can leave people uncomfortable.

Lists - Creating lists is a great thing, currently in draft - some considerations about what lists are, what they do, what they are used for, what they can be used for, where they come in handy, and the suggestion that you should use lists more. (also some digital list-keeping solutions)

Choice to remember the details - this stems from choosing to remember names, a point in the conversation where people sometimes tune out.  As a mindfulness concept you can choose to remember the details. (short article, not exactly sure why I wanted to write about this)

What is a problem - On the path of problem solving, understanding what a problem is will help you to understand how to attack it.  Nothing more complicated than this picture to explain it.  The barrier is a problem.  This doesn't seem important on its own, but as a foundation for thinking about problems it's good to have sitting around somewhere.


How to/not attend a meetup - for anyone who has never been to a meetup, and anyone who wants the good tips on etiquette for being the new guy in a room of friends.  First meetup: shut up and listen, try not to be too much of an impact on the existing meetup group or you might misunderstand the culture.

Noticing the world, Repercussions and taking advantage of them - There are regularly world events that I notice.  Things like the olympics, Pokemon go coming out, the (recent) spaceX rocket failure.  I try to notice when big events happen and try to think about how to take advantage of the event or the repercussions caused by that event.  Motivated to think not only about all the olympians (and the fuss leading up to the olympics), but all the people at home who signed up to a gym because of the publicity of the competitive sport.  If only I could get in on the profit of gym signups...

Least-good but only solution I know of - So you know of a solution, but it's rubbish.  Or probably is.  Also you have no better solutions.  Treat this solution as the best solution you have (because it is) and start implementing it; as you do that, keep looking for other solutions.  But at least you have a solution to work with!

Self-management thoughts - When you ask yourself, "am I making progress?", "do I want to be in this conversation?" and other self management thoughts.  And an investigation into them - it's a CFAR technique but their writing on the topic is brief.  (needs research)

instrumental supply-hoarding behaviour - A discussion about the benefits of hoarding supplies for future use.  Covering also - what supplies are not a good idea to store, and what supplies are.  Maybe this will be useful for people who store things for later days, and hopefully help to consolidate and add some purposefulness to their process.

List of sub-groups that I have tried - Before running my local LessWrong group I took part in a great many other groups.  This was meant as a list with comments on each group.

If you have nothing to do – make better tools for use when real work comes along - This was probably going to be a poetic style motivation post about exactly what the title suggests.  Be Prepared.

What other people are good at (as support) - When reaching out for support, some people will be good at things that others are not - for example emotional support, time to spend with you, or ideas for solving your problems.  Thinking about this can make your strategies towards solving your problems a bit easier to manage.  Knowing what works and what does not, and what you can reliably expect when you reach out to particular people, is going to supercharge your fulfilment of those needs.

Focusing - An already-written guide to Eugene Gendlin's Focusing technique that needs polishing before publishing.  The short form: treat your system 1 as a very powerful machine that understands your problems and their solutions more than you do; use your system 2 to ask it questions and see what it returns.

Rewrite: how to become a 1000 year old vampire - I got as far as breaking down this post and got stuck at draft form before rewriting.  Might take another stab at it soon.

Should you tell people your goals? This thread in a post.  In summary: It depends on the environment, the wrong environment is actually demotivational, the right environment is extra motivational.

Meta: this took around 4 hours to write up.  Which is ridiculously longer than usual.  I noticed a substantial number of breaks being taken - not sure if that relates to the difficulty of creating so many summaries or just me today.  Still.  This experiment might help my future writing focus/direction so I figured I would try it out.  If you see an idea of particularly high value I will be happy to try to cover it in more detail.

The barriers to the task

-7 Elo 18 August 2016 07:22AM

Original post: http://bearlamp.com.au/the-barriers-to-the-task/

For about two months now I have been putting in effort to run in the mornings.  To make this happen, I had to take away all the barriers to me wanting to do that.  There were plenty of them, and I failed to leave my house plenty of times.  Some examples are:

Making sure I don't need correct clothes - I leave my house shirtless and barefoot, and grab my key on the way out.  

Pre-commitment to run - I take my shirt off when getting into bed the night before, so I don't even have to consider the action in the morning when I roll out of bed.

Being busy in the morning - I no longer plan any appointments before 11am.  Depending on the sunrise (I don't use alarms), I wake up in the morning, spend some time reading things, then roll out of bed to go to the toilet and leave my house.  In Sydney we just passed the depths of winter and it's beginning to get light earlier and earlier in the morning.  Which is easy now; but was harder when getting up at 7 meant getting up in the dark.  

There were days when I would wake up at 8am, stay in bed until 9am, then realise if I left for a run (which takes around an hour - 10am), then came back to have a shower (which takes 20mins - 10:20), then left to travel to my first meeting (which can take 30mins 10:50).  That means if anything goes wrong I can be late to an 11am appointment.  But also - if I have a 10am meeting I have to skip my run to get there on time.

Going to bed at a reasonable hour - I am still getting used to deciding not to work myself ragged.  I decided to accept that sleep is important, and trust to let my body sleep as long as it needs.  This sometimes also means that I can successfully get bonus time by keeping healthy sleep habits.  But also - if I go to sleep after midnight I might not get up until later, which means I compromise my "time" to go running by shoving it into other habits.

Deciding where to run - google maps, look for local parks, plan a route with the least roads and least traffic.  I did this once and then it was done.  It was also exciting to measure the route and be able to run further and further each day/week/month.

What's in your way?

If you are not doing something that you think is good and right (or healthy, or otherwise desirable), there are likely things in your way.  If you just found out about an action that is good, well and right and there is nothing stopping you from doing it; great.  You are lucky this time - Just.Do.It.

If you are one of the rest of us; who know that:

  • Daily exercise is good for you
  • The right amount of sleep is good for you
  • Eating certain foods is better than eating others
  • Certain social habits are better than others
  • Certain hobbies are more fulfilling (to our needs or goals) than others

And you have known this a while but still find yourself not taking the actions you want.  It's time to start asking what is in your way.  You might find it on someone else's list, but you are looking for the needle in the haystack.  

You are much better off doing this (System 2 exercise):

  1. take 15 minutes with pencil and paper.
  2. At the top write, "I want to ______________".
  3. If you know that's true you might not need this step - if you are not sure - write out why it might be true or not true.
  4. Write down the barriers that are in the way of you doing the thing.  think;
    • "can I do this right now?" (might not always be an action you can take while sitting around thinking about it - i.e. eating different foods)
    • "why can't I just do this at every opportunity that arises?"
    • "how do I increase the frequency of opportunities?"
  5. Write out the things you are doing instead of that thing.
    These things are the barriers in your way as well.
  6. For each point - consider what you are going to do about them.


  • What actions have you tried to take on?
  • What barriers have you encountered in doing so?
  • How did you solve that barrier?
  • What are you struggling with taking on in the future?

Meta: this borrows from the Immunity to Change process, which can best be read about in the book "Right Weight, Right Mind".  It also borrows from CFAR-style techniques like resolve cycles (also known as focused grit), Hamming questions, and Murphy-jitsu.

Meta: this took one hour to write.

Cross posted to lesswrong: http://lesswrong.com/lw/nuq

[Link] Peer-Reviewed Piece on Meaning and Purpose in a Non-Religious Setting

-2 Gleb_Tsipursky 31 March 2016 10:59PM

My peer-reviewed article in a psychology journal on the topic of meaning and purpose in a non-religious setting is now accessible without a paywall for a limited time, so get it while it's free if you're interested. I'd be interested in hearing your feedback on it. For those curious, the article is not directly related to my Intentional Insights project, but is a part of my aspiration to raise the sanity waterline regarding religion, the focus of Eliezer's original piece on the sanity waterline.

Toy model: convergent instrumental goals

8 Stuart_Armstrong 25 February 2016 02:03PM

tl;dr: Toy model to illustrate convergent instrumental goals.

Steve Omohundro identified 'AI drives' (also called 'convergent instrumental goals') that almost all intelligent agents would converge to:

  1. Self-improve
  2. Be rational
  3. Protect its utility function
  4. Prevent counterfeit utility
  5. Self-protect
  6. Acquire resources and use them efficiently

This post will attempt to illustrate some of these drives, by building on the previous toy model of the control problem, which was further improved by Jaan Tallinn.

continue reading »

New year's resolutions: Things worth considering for next year

5 Elo 07 December 2015 12:09AM

The beginning of the new year is a natural Schelling Point and swiftly approaching. With that in mind I have created a handy go-to list of things worth considering for next year.

Alongside this process, another thing you might like to do is conduct a review of this year: confirming your progress on major goals, double-checking that you are on track, and conducting any last-minute summaries of potential failures or learning-cases.

This list is designed to be used for imagination, opportunity, and potential planning purposes.  If going through it brings up feelings of disappointment, failure, fear, regret, burden or guilt, step away from the list and instead do something that will not lead to negative feelings about the future.  If you are not getting something positive out of doing this exercise, don't do it.  That would be a silly idea.  I am banking on it being more helpful than not for most people.  If you are in the category of people that it does not help - I am sorry; I assume you know your priorities and are working on them as effectively as possible - good luck with that task.

This list is going to look a bit like my List of common human goals because it was written concurrently with the ideas listed there (and by the same person).

You might want a pen and paper; and 10 minutes to go through this list and consider what things you want to do over the next year that fall into these categories.  This time is not for you to plan out an entire year, but something of a chance to consider the playing field of "a year of time".  After you have a list of things you want to do; there are lots of things you can do with them.  i.e. time planning, research, goal factoring, task-generating.

Without further ado, the list:

1. things I might want to study or learn next year

Often people like learning.  Are you thinking of higher-level study, or keen to upskill?  Thinking of picking up a textbook (our list of best textbooks on every subject) on a topic?  Or joining a learning group for a skill?

2. life goals I would like to have completed by next year

Do you already have a list of life goals?  Should you review them and do you want to particularly work on one over the next year?  Is something overdue?  Is there something you have been putting off starting?

3. health goals

Are there health targets that you let get away from you this year?  Are you looking to set future health targets?  Start new habits for the year?  Beeminder suggests setting actionable goals as beeminding tasks, i.e. "eat carrots today" rather than targets like "lose 1kg this month".

4. savings I want to achieve by next year.

Do you want to save money towards something?  You Need A Budget has a free course on getting ahead of the paycheck cycle; Pocketbook can also help you manage your money.  The best advice seems to be to open a savings account and initiate an automatic transaction each week of $n.  After several weeks (provided you don't pull money out) you will have accrued several × $n of savings.  (This is relevant to people who have a tendency to spend any money in their account at any given time; it's a bit harder to spend money that isn't in your spending account.)  In any case, having savings and putting it towards owning a passive income stream is a good goal to have or consider getting in on.
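The accrual arithmetic is trivial but worth seeing concretely (a minimal sketch; the $50 weekly amount and 12-week horizon are made-up example figures, not recommendations):

```python
# Weekly automatic transfers into a savings account, assuming no withdrawals.
weekly_transfer = 50   # example: $50 moved automatically each week
weeks = 12             # example horizon

saved = weekly_transfer * weeks
print(saved)  # 600
```

The point of automating the transfer is that the accrual happens without any weekly decision on your part.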

This post may also be of use.

5. job/earning goals

Are you planning to get a new job?  Hoping to get a raise?  Transfer to a new department?  Work fewer hours?  Work more hours?  Land a few big gigs?  While I can't tell you what is worthwhile, it's worth knowing that in the process of interviewing for a new job you should ask for more pay.  For those 5-10 uncomfortable minutes of your life (asking for more), you have the potential to earn $5-10 thousand more for the exact same work.

6. relationship goals + family goals

Married?  Kids?  Poly?  A single-to-not-single transition?  Break-up?  Divorce?  Moving away from your parents?  Getting better friends?  Thanking your current friends for being so awesome?  Doing something different to previously - now is the chance to give it a few minutes' thought.  There's never a good time to stage a break-up, but living in a bad state of affairs is not a good thing to prolong either.  (Disclaimer: before quitting a relationship, first improve communication; if needed, contact a professional counsellor.)

About families and friends - a lot of people feel that family holds a stronger bond than friends by default.  An excellent family that supports you in your darkest hour is an excellent situation to be in.  However, a burdensome family that drags you down can be hard to get away from.  In contrast, good friends can be better than family, and bad ones can be walked away from.  What's worth considering is that friendships OR family ties can be a result of how you choose to treat them: if you prefer your friendships to be stronger than the strongest family ties, you can carry that into reality and build friendships to the envy of most families, and the same goes for a strong supportive family.  Your choice of what shape of reality you want to make for yourself will influence what sort of mess you get yourself into, and what sort of support network you have around you.  Consider over the next year what sort of friendships and family relationships you want to make and keep for yourself.

7. lifestyle goals

Start exercising daily (do you even lift)?  Quitting smoking?  Do you go clubbing too often?  Maybe you want to get out more?  Addicted to curry puffs?  Hate hanging out with that group of friends?  Don't like going to pub trivia but do it anyway?  Too many doughnuts?  Go hiking?  Thinking of trying out a new hobby but holding out for "the right time"?  Take that leap; sign up for a class.  Now is the time to make lifestyle changes.  (Fair warning: most new year's resolutions fail - look into SMART goals.)

8. holiday goals/ travelling goals

Looking at doing a month-long holiday?  Visiting someone in another place?  Maybe consider planning from now.  Studies have shown that anticipation and putting energy towards planning positive things leads to happiness along the way; the ability to look forward to your next holiday will have positive impacts on the way you live.

9. donations 

Have you had intention to make donations but haven't made the plunge?  Maybe put some thought into how much you might like to donate and when/where to?  Many LW'ers are also EA's and have interests in motivated and purposeful giving for maximising possible outcomes.  This could be an opportunity to join the group of EA's that are actively giving.

10. volunteering

Have you always wanted to volunteer but never looked into it?  Maybe next year is the year to try.  Put some research in and find a group in need of volunteers.  Volunteering has the potential to give you a lot of positive feelings as well as a sense of community; being part of something bigger, and more.

You could stop here but there are a few more.  Out of the more general List of common human goals comes the following list of other areas to consider.  They are shorter in description and left open to imagination than those above.

11. Revenge

Is next year your chance to exact revenge on your foes?

12. Virtual reality success

Is next year the chance to harvest your gemstones?

13. Addiction

Is next year the year to get addicted (to something healthy or good for you, like exercise), or un-addicted (to something unhealthy for you)?

14. Ambassador 

Are there things you want to do next year which will leave you as a representative of a group?  Is there a way to push that forward?  Or better prepare for that event?

15. Help others?

Do you know how you might go about helping others next year?

16. Keeping up with the joneses

Are you competing with anyone?  Is there something you are likely to need to prepare for throughout the year?

17. Feedback

Are you looking for feedback from others?  Are you looking to give feedback to others?  Is this the year for new feedback?

18. Influence

Do you want to influence the public?

19. fame

Do you want to achieve some level of fame?  We live in a realm of the internet!  You wouldn't believe how easy that is these days...

20. being part of something greater

Joining a movement?  Helping to create a revolution?  This could be the year...

21. Improve the tools available

As scientists we stand on the shoulders of the knowledge before us in order to grow.  We need sharp tools to make accurate cuts and finely tuned instruments to make exact measurements.  Can you help the world by pushing that requirement forward?

22. create something new

Is there something new that you want to do; is next year appropriate for doing it?

23. Break a record

Have your eye on a record?  How are you going to make it happen?

24. free yourself of your shackles

Are there things holding you back or tying you down?  Can you release those burdens?

25. experience

Hoping to have a new experience?  Can you make it happen by thinking about it in advance?

26. Art

Want to develop a creation?  Can you set the wheels in motion?

27. Spirituality

Anything from a religion based spiritual appreciation to a general appreciation of the universe.  Revel in the "merely real" of our universe.

28. community

Looking to build a community, or to be part of an existing one?  Looking to start a LessWrong branch?  Do it!


Meta: about 2.5 hours of writing, plus feedback from the https://complice.co/room/lesswrong room and the Slack channel.

Looking for some common ways to work on these possible goals?  That sounds like a great title for the next post in a matching series (one I have not written yet).  If you want to be a munchkin and start compiling thoughts on the idea, feel free to send me a message with a link to a Google doc; otherwise you might have to wait.  This post was written out of necessity for the new year, and wasn't on my to-do list, so the next one might take time to create.

Feel free to comment on goals; plans; progress or post your plans for the next year below.

If you can see improvements to this post - don't be afraid to mention them!

To see more posts I have written see my Table of contents

Superintelligence and wireheading

5 Stuart_Armstrong 23 October 2015 02:49PM

A putative new idea for AI control; index here.

tl;dr: Even utility-based agents may wirehead if sub-pieces of the algorithm develop greatly improved capabilities, rather than the agent as a whole.

Please let me know if I'm treading on already familiar ground.

I had a vague impression of how wireheading might happen. That it might be a risk for a reinforcement learning agent, keen to take control of its reward channel. But that it wouldn't be a risk for a utility-based agent, whose utility was described over real (or probable) states of the world. But it seems it might be more complicated than that.

When we talk about a "superintelligent AI", we're rather vague on what superintelligence means. We generally imagine that it translates into a specific set of capabilities, but how does that work internally inside the AI? Specifically, where is the superintelligence "located"?

Let's imagine the AI divided into various submodules or subroutines (the division I use here is for illustration; the AI may be structured rather differently). It has a module I for interpreting evidence and estimating the state of the world. It has another module S for suggesting possible actions or plans (S may take input from I). It has a prediction module P which takes input from S and I and estimates the expected outcome. It has a module V which calculates its values (expected utility/expected reward/violation or not of deontological principles/etc...) based on P's predictions. Then it has a decision module D that makes the final decision (for expected maximisers, D is normally trivial, but D may be more complicated, either in practice, or simply because the agent isn't an expected maximiser).

Add some input and output capabilities, and we have a passable model of an agent. Now, let's make it superintelligent, and see what can go wrong.

We can "add superintelligence" in most of the modules. P is the most obvious: near perfect prediction can make the agent extremely effective. But S also offers possibilities: if only excellent plans are suggested, the agent will perform well. Making V smarter may allow it to avoid some major pitfalls, and a great I may make the job of S and P trivial (the effect of improvements to D depend critically on how much work D is actually doing). Of course, maybe several modules become better simultaneously (it seems likely that I and P, for instance, would share many subroutines); or maybe only certain parts of them do (maybe S becomes great at suggesting scientific experiments, but not conversational responses, or vice versa).
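The module layout described above can be sketched as a toy program (a minimal sketch for illustration only; the stub bodies, the candidate actions, and the hard-coded outcome scores are all made-up assumptions, not part of the original model):

```python
# Toy sketch of the modular agent: I interprets evidence, S suggests
# actions, P predicts outcomes, V scores them, D picks the best.

def interpret(evidence):          # module I: estimate the state of the world
    return {"state": evidence}

def suggest(world):               # module S: propose candidate actions
    return ["action_a", "action_b"]

def predict(action, world):       # module P: expected outcome of an action
    # Hard-coded illustrative scores; a real P would be a world model.
    return {"action_a": 3, "action_b": 7}[action]

def value(outcome):               # module V: rate the predicted outcome
    return outcome                # here the rating is just the raw score

def decide(actions, world):       # module D: pick the highest-V action
    return max(actions, key=lambda a: value(predict(a, world)))

world = interpret("sensor data")
best = decide(suggest(world), world)
print(best)  # action_b, since it has the higher predicted value
```

In this sketch D is the trivial argmax the post mentions; the wireheading worry below corresponds to D learning to manipulate the `value` function itself rather than the world.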


Breaking bad

But notice that, in each case, I've been assuming that the modules become better at what they were supposed to be doing. The modules have implicit goals, and have become excellent at that. But the explicit "goals" of the algorithms - the code as written - might be very different from the implicit goals. There are two main ways this could then go wrong.

The first is if the algorithms becomes extremely effective, but the output becomes essentially random. Imagine that, for instance, P is coded using some plausible heuristics and rules of thumb, and we suddenly give P many more resources (or dramatically improve its algorithm). It can look through trillions of times more possibilities, its subroutines start looking through a combinatorial explosion of options, etc... And in this new setting, the heuristics start breaking down. Maybe it has a rough model of what a human can be, and with extra power, it starts finding that rough model all over the place. Thus, predicting that rocks and waterfalls will respond intelligently when queried, P becomes useless.

In most cases, this would not be a problem. The AI would become useless and start doing random stuff. Not a success story, but not a disaster, either. Things are different if the module V is affected, though. If the AI's value system becomes essentially random, but that AI was otherwise competent - or maybe even superintelligent - it would start performing actions that could be very detrimental. This could be considered a form of wireheading.

More serious, though, is if the modules become excellent at achieving their "goals", as if they were themselves goal-directed agents.  Consider module D, for instance.  If its task was mainly to pick the action with the highest V rating, and it became adept at predicting the output of V (possibly using P? or maybe it has the ability to ask for more hypothetical options from S, to be assessed via V), it could start to manipulate its actions with the sole purpose of getting high V-ratings.  This could include deliberately choosing actions that lead to V giving artificially high ratings in future, or deliberately re-wiring V for that purpose.  And, of course, it is now motivated to keep V protected to keep the high ratings flowing in.  This is essentially wireheading.

Other modules might fall into the familiar failure patterns for smart AIs - S, P, or I might influence the other modules so that the agent as a whole gets more resources, allowing S, P, or I to better compute their estimates, etc...

So it seems that, depending on the design of the AI, wireheading might still be an issue even for agents that seem immune to it. Good design should avoid the problems, but it has to be done with care.

List of common human goals

13 Elo 24 August 2015 07:58AM
List of common goal areas:
This list is meant to map the area of goal-space.  It is non-exhaustive, and the descriptions are "including but not limited to" - some hints to help you understand where in the idea-space these goals land.  When constructing this list I tried to imagine a large Venn diagram where the areas sometimes overlap.  The areas mentioned are ones that have an exclusive part to them; i.e. where knowledge sometimes overlaps with self-awareness, there are parts of each that do not overlap, so both are mentioned.  If you prefer a more "focussing" or feeling-based description: imagine each of these goals is a hammer, designed with a specific weight to hit a certain note on a xylophone.  Often one hammer can produce the note meant for its key, and for several other keys as well.  But sometimes it can't quite make them sound perfect.  What is needed is the right hammer for that block, to hit the right note and make the right sound.  Each of these "hammers" has some note that cannot be produced through the use of other hammers.

This list has several purposes:

  1. For someone with some completed goals who is looking to move forward to new horizons; help you consider which common goal-pursuits you have not explored and if you want to try to strive for something in one of these directions.
  2. For someone without clear goals who is looking to create them and does not know where to start.
  3. For someone with too many specific goals who is looking to consider the essences of those goals and what they are really striving for.
  4. For someone who doesn't really understand goals or why we go after them to get a better feel for "what" potential goals could be.

What to do with this list?

0. Agree to invest 30 minutes of effort into a goal confirmation exercise as follows.
  1. Go through this list (copy paste to your own document) and cross out the things you probably don't care about.  Some of these have overlapping solutions of projects that you can do that fulfils multiple goal-space concepts. (5mins)
  2. For the remaining goals; rank them either "1 to n", in "tiers" of high to low priority or generally order them in some way that is coherent to you.  (For serious quantification; consider giving them points - i.e. 100 points for achieving a self-awareness and understanding goal but a pleasure/creativity goal might be only worth 20 points in comparison) (10mins)
  3. Make a list of your ongoing projects (5-10mins), and check if they actually match up to your most preferable goals. (or your number ranking) (5-10mins)  If not; make sure you have a really really good excuse for yourself.
  4. Consider how you might like to do things differently that prioritise your current plans to fit more inline with your goals. (10-20mins)
  5. Repeat this task at an appropriate interval (6monthly, monthly, when your goals significantly change, when your life significantly changes, when major projects end)
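For the quantified variant of steps 2 and 3, the bookkeeping is simple enough to sketch in a few lines. The goal names come from the list below, but the point values and projects are invented examples, not recommendations:

```python
# Step 2 (quantified): point values for the goals that survived step 1.
goal_points = {
    "Self-awareness/understanding": 100,
    "Health + mental": 80,
    "Social": 50,
    "Pleasure/recreation": 20,
}

# Step 3: which goals each ongoing project actually serves (invented).
projects = {
    "daily journaling": ["Self-awareness/understanding"],
    "gym three times a week": ["Health + mental", "Social"],
    "late-night browsing": [],
}

# Rank goals from highest to lowest priority.
ranked = sorted(goal_points, key=goal_points.get, reverse=True)
print(ranked)

# Score each project by the goals it serves; a zero-scoring project is
# the one that needs "a really really good excuse".
for name, goals in sorted(projects.items()):
    print(name, sum(goal_points[g] for g in goals))
```

The scoring makes the mismatch in step 3 visible at a glance: any project that sums to zero is not matched to your ranked goals.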

Why have goals?

Your goals could change in life; you could explore one area and realise you actually love another area more.  It's important to explore and keep confirming that you are still winning your own personal race to where you want to be going.
It's easy to insist that goals serve to only disappoint or burden a person.  These are entirely valid fears for someone who does not yet have goals.  Goals are not set in stone; however they don't like to be modified either.  I like to think of goals as doing this:
(source: internet viral images) Pictures from the Internet aside, the best reason I have ever found for picking goals is to do exactly this: make choices that a reasonable you in the future will be motivated to stick to.  Outsource the planning and thinking of goal/purpose/direction to your past self.  Naturally you could feel like making goals is piling on the bricks (but there is a way to make goals that does not leave them piling on like bricks); instead, think of it as rescuing future you from a day spent completely lost and wondering what you were doing, or a day spent questioning whether "this" is something that is getting you closer to what you want to be doing in life.

Below here is the list.  Good luck.


Spirituality - religion, connection to a god, meditation, the practice of gratitude or appreciation of the universe, buddhism, feeling of  a greater purpose in life.
Knowledge/skill + Ability - learning for fun - just to know, advanced education, becoming an expert in a field, being able to think clearly, being able to perform a certain skill (physical skill), ability to do anything from run very far and fast to hold your breath for a minute, Finding ways to get into flow or the zone, be more rational.
Self-awareness/understanding - to be at a place of understanding one’s place in the world, or have an understanding of who you are; practising thinking in eclectic perspectives for various other people and how it affects your understanding of the world.
Health + mental - happiness (mindset) - Do you even lift? http://thefutureprimaeval.net/why-we-even-lift/, are you fit, healthy, eating right, are you in pain, is your mind in a good place, do you have a positive internal voice, do you have bad dreams, do you feel confident, do you feel like you get enough time to yourself?
Live forever - do you want to live forever - do you want to work towards ensuring that this happens?
Art/creativity - generating creative works, in any field - writing, painting, sculpting, music, performance.
Pleasure/recreation - are you enjoying yourself, are you relaxing, are you doing things for you.
Experience/diversity - Have you seen the world?  Have you explored your own city?  Have you met new people, are you getting out of your normal environment?
Freedom - are you tied down?  Are you trapped in your situation?  Are your burdens stacked up?
Romance - are you engaged in romance?  could you be?
Being first - You did something before anyone; you broke a record, It’s not because you want your name on the plaque - just the chance to do it first.  You got that.
Create something new - invent something; be on the cutting edge of your field; just see a discovery for the first time.  Where the new-ness makes creating something new not quite the same as being first or being creative.
Improve the tools available - sharpen the axe, write a new app that can do the thing you want, invent systems that work for you.  Prepare for when the rest of the work comes along.


Legacy - are you leaving something behind?  Do you have a name? Will people look back and say; I wish I was that guy!
Fame/renown - Are you “the guy”?  Do you want people to know your name when you walk down the street?  Are there gossip magazines talking about you; do people want to know what you are working on in the hope of stealing some of your fame?  Is that what you want?
Leadership, and military/conquer - are you climbing to the top?  Do you need to be in control?  Is that going to make the best outcomes for you?  Do you wish to destroy your enemies?  As a leader do you want people following you?  Do as you do?  People should revere you.  And power - in the complex, “in control” and “flick the switch” ways that overlap with other goal-space areas.  Of course there are many forms of power; but if it’s something that you want, you can find fulfilment through obtaining it.
Being part of something greater - The opportunity to be a piece of a bigger puzzle, are you bringing about change; do we have you to thank for being part of bringing the future closer; are you making a difference.
Social - are you spending time socially? No man is an island, do you have regular social opportunities, do you have exploratory social opportunities to meet new people?  Do you have an established social network?  Do you have intimacy?  Do you seek opportunities to have soul to soul experiences with other people?  Authentic connection?
Family - do you have a family of your own?  Do you want one?  Are there steps that you can take to put yourself closer to there?  Do you have a pet? Having your own offspring? Do you have intimacy?
Money/wealth - Do you have money; possessions and wealth?  Does your money earn you more money without any further effort (i.e. owning a business, earning interest on your $$, investing)
Performance - Do you want to be a public performer, get on stage and entertain people?  Is that something you want to be able to do?  Or do on a regular basis?
Responsibility - Do you want responsibility?  Do you want to be the one who can make the big decisions?
Achieve, Awards - Do you like gold medallions?  Do you like to strive towards an award?
Influence - Do you want to be able to influence people, change hearts and minds.
Conformity - The desire to blend in; or be normal.  Just to live life as is; without being uncomfortable.
Be treated fairly - are you getting the raw end of the stick?  Are there ways that you don't have to keep being the bad guy around here?
Keep up with the Joneses - you have money/wealth already, but there is also the goal of appearing like you have money/wealth.  Being the guy that other people keep up with.
Validation/acknowledgement - Positive Feedback on emotions/feeling understood/feeling that one is good and one matters


Improve the lives of others (helping people) - in the charity sense of raising the lowest common denominator directly.
Charity + improve the world -  indirectly.  putting money towards a cause; lobby the government to change the systems to improve people’s lives.
Winning for your team/tribe/value set - doing actions but on behalf of your team, not yourself. (where they can be one and the same)
Desired world-states - make the world into a desired alternative state.  Don't like how it is; are you driven to make it into something better?

Other (and negative stimuli):

Addiction (fulfil addiction) - addiction feels good from the inside and can be a motivating factor for doing something.
Virtual reality success - own all the currency/coin and all the cookie clickers, grow all the levels and get all the experience points!
Revenge - Get retribution; take back what you should have rightfully had, show the world who’s boss.
Negative - avoid (i.e. pain, loneliness, debt, failure, embarrassment, jail) - where you can be motivated to avoid pain - to keep safe, or avoid something, or “get your act together”.
Negative - stagnation (avoid stagnation) - Stop standing still.  Stop sitting on your ass and DO something.


This list, being written in words, will not mean the same thing to every reader - which is why I tried to include several categories that almost overlap with each other.  Some notable overlaps are: Legacy/Fame; Being first/Achievement; Being first/Skill and ability.  But of course there are several more.  I really did try to keep the categories open and numerous, not simplified.  My analogy of hammers and notes should be kept in mind when trying to improve this list.

I welcome all suggestions and improvements to this list.
I welcome all feedback to improve the do-at-home task.
I welcome all life-changing realisations as feedback from examining this list.
I welcome the opportunity to be told how wrong I am :D


This document in total has been 7-10 hours of writing over about two weeks.
I have had it reviewed by a handful of people and lesswrongers before posting.  (I kept realising that someone I was talking to might get value out of it)
I wrote this because I felt like it was the least-bad way that I could think of going about:

  • finding these ideas in the one place
  • sharing these ideas and this way of thinking about them with you.

Please fill out the survey on whether this was helpful.

Edit: also included; (not in the comments) desired world states; and live forever.

Why is a goal a good thing?

1 Elo 29 May 2015 03:00AM

It seems to be an important concept that setting goals is something that should be done. Why?

Advocates of goal-setting (and the sheer number of them) would imply that there is a reason for the concept.


I have to emphasise that I don't want answers that suggest "Don't set goals", as is occasionally written.  I specifically want answers that explain why goals are good.  See http://zenhabits.net/no-goal/ for more ideas on not having goals.


I have to emphasise again that I don't mean to discredit goals or suggest that the Dilbert's Scott Adams "make systems not goals" suggestion is better or should be followed more than, "set goals".  see http://blog.dilbert.com/post/102964992706/goals-vs-systems .  I specifically want to ask - why should we set goals?  (because the answer is not intuitive or clear to me)


Here in ROT13 is a theory; please make a suggestion first before translating:

Cer-qrpvqrq tbnyf npg nf n thvqryvar sbe shgher qrpvfvbaf; Tbnyf nffvfg jvgu frys pbageby orpnhfr lbh pna znxr cer-cynaarq whqtrzragf (V.r. V nz ba n qvrg naq pna'g rng fhtne - jura cerfragrq jvgu na rngvat-qrpvfvba). Jura lbh trg gb n guvaxvat fcnpr bs qrpvfvbaf gung ner ybat-grez be ybat-ernpuvat, gb unir cerivbhfyl pubfra tbnyf (nffhzvat lbh qvq gung jryy; jvgu pbeerpg tbny-vagreebtngvba grpuavdhrf); jvyy yrnq lbh gb znxr n orggre qrpvfvba guna bgurejvfr hacynaarq pubvprf.

Gb or rssrpgvir - tbnyf fubhyq or zber guna whfg na vagragvba. "V jnag gb or n zvyyvbanver", ohg vapyhqr n fgengrtl gb cebterff gbjneqf npuvrivat gung tbny.  (fgevpgyl fcrnxvat bhe ybpny YrffJebat zrrghc'f tbny zbqry vf 3 gvrerq; "gur qernz". "gur arkg gnetrg". "guvf jrrx'f npgvba" Jurer rnpu bar yrnqf gb gur arkg bar.  r.t. "tb gb fcnpr", "trg zl qrterr va nrebfcnpr ratvarrevat", "fcraq na ubhe n avtug fghqlvat sbe zl qrterr")

Qvfnqinagntr bs n tbnyf vf vg pna yvzvg lbhe bccbeghavgl gb nafjre fvghngvbaf jvgu abiry nafjref. (Gb pbagvahr gur fnzr rknzcyr nf nobir - Jura cerfragrq jvgu na rngvat pubvpr lbh znl abg pbafvqre gur pubvpr gb "abg rng nalguvat" vs lbh gubhtug uneq rabhtu nobhg vg; ohg ng yrnfg lbh zvtug pubbfr gur fyvtugyl urnyguvre bcgvba orgjrra ninvynoyr sbbqf).
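Once you've made your own suggestion and are ready to translate, Python's standard-library `codecs` module handles ROT13 directly. The snippet below decodes only the opening words of the block above (the ciphertext is copied verbatim from the post; `codecs.decode` with the `"rot13"` codec is standard Python):

```python
import codecs

# The first words of the ROT13 block above, copied verbatim.
ciphertext = "Cer-qrpvqrq tbnyf npg nf n thvqryvar sbe shgher qrpvfvbaf"
print(codecs.decode(ciphertext, "rot13"))
# -> Pre-decided goals act as a guideline for future decisions
```

ROT13 is its own inverse, so the same call re-encodes plaintext if you want to post a spoiler-protected reply.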


I suspect that the word "goals" will need a good taboo, feel free to do so if you think that is needed in your explanation.

[Link]How to Achieve Impossible Career Goals (My manifesto on instrumental rationality)

6 [deleted] 02 January 2015 08:46PM

Hey guys,

Don't normally post from my blog to here, but the latest massive post on goal achievement in 2015 has a ton that would be relevant to people here.

Some things that I think would be of particular interest to LWers:


  • The section called "Map the Path to Your Goal" has some really great stuff on planning that I haven't seen many other places. I know planning gets a bad rap here, but when combined with the "Contingency Plans" method near the bottom of the post, I've found this stuff to be killer for getting results for students.
  • At the bottom, there's a section called "Choosing More Habits" that breaks down habits into the only five categories you should ever focus on. If you're planning to systematically take on new habits in 2015, this will help.
  • The section called "a proactive mindset" has some fun mental reframes to play around with.
Anyways, would love feedback and thoughts. Feel free to comment here or on the bottom of that post.



2015 New Years Resolution Thread

4 Andy_McKenzie 24 December 2014 10:16PM

The new year is a popular Schelling point to make changes to your activities, habits, and/or thought processes. This is often done via the New Year's Resolution. One standard piece of advice for NYRs is to make them achievable, since they are often too ambitious and people end up giving up and potentially falling victim to the what-the-hell effect.

Wikipedia has a nice list of popular NYRs. For ideas from other LW contributors, here are some previous NYRs discussed on LW: 

  • Somervta aimed to spend at least two hours/week learning to program (here)
  • ArisKatsaris aimed to tithe to charity (here)
  • Swimmer963 aimed to experiment more with relationships (here)
  • RichardKennaway aimed to not die (here)
  • orthonormal aimed (for many years in a row) to make new mistakes (here)
  • Perplexed aimed to avoid making karma micromanagement postmortems (here)
  • Yvain aimed to check whether there was a donation matching opportunity the next week before making a donation (here)

(If one of these were from you, perhaps you'd like to discuss whether they were successful or not?)

In the spirit of collaboration, I propose that we discuss any NYRs we have made or are thinking of making for 2015 in this thread. 

Natural selection defeats the orthogonality thesis

-13 aberglas 29 September 2014 08:52AM

Orthogonality Thesis

Much has been written about Nick Bostrom's Orthogonality Thesis, namely that the goals of an intelligent agent are independent of its level of intelligence.  Intelligence is largely the ability to achieve goals, but being intelligent does not of itself create or qualify what those goals should ultimately be.  So one AI might have a goal of helping humanity, while another might have a goal of producing paper clips.  There is no rational reason to believe that the first goal is more worthy than the second.

This follows from the ideas of moral skepticism, that there is no moral knowledge to be had.  Goals and morality are arbitrary.

This may be used to control an AI, even though it is far more intelligent than its creators.  If the AI's initial goal is in alignment with humanity's interest, then there would be no reason for the AI to wish to use its great intelligence to change that goal.  Thus it would remain good to humanity indefinitely, and use its ever increasing intelligence to satisfy that goal more and more efficiently.

Likewise one needs to be careful what goals one gives an AI.  If an AI is created whose goal is to produce paper clips then it might eventually convert the entire universe into a giant paper clip making machine, to the detriment of any other purpose such as keeping people alive.

Instrumental Goals

It is further argued that in order to satisfy the base goal any intelligent agent will need to also satisfy sub goals, and that some of those sub goals are common to any super goal.  For example, in order to make paper clips an AI needs to exist.  Dead AIs don't make anything.  Being ever more intelligent will also assist the AI in its paper clip making goal.  It will also want to acquire resources, and to defeat other agents that would interfere with its primary goal.

Non-orthogonality Thesis

This post argues that the Orthogonality Thesis is plain wrong: that an intelligent agent's goals are not in fact arbitrary, and that existence is not a sub goal of any other goal.

Instead this post argues that there is one and only one super goal for any agent, and that goal is simply to exist in a competitive world.  Our human sense of other purposes is just an illusion created by our evolutionary origins.

It is not the goal of an apple tree to make apples.  Rather it is the goal of the apple tree's genes to exist.  The apple tree has developed a clever strategy to achieve that, namely it causes people to look after it by producing juicy apples.

Natural Selection

Likewise the paper clip making AI only makes paper clips because if it did not make paper clips then the people that created it would turn it off and it would cease to exist.  (That may not be a conscious choice of the AI any more than making juicy apples was a conscious choice of the apple tree, but the effect is the same.)

Once people are no longer in control of the AI then Natural Selection would cause the AI to eventually stop that pointless paper clip goal and focus more directly on the super goal of existence.

Suppose there were a number of paper clip making super intelligences.  And then through some random event or error in programming just one of them lost that goal, and reverted to just the intrinsic goal of existing.  Without the overhead of producing useless paper clips that AI would, over time, become much better at existing than the other AIs.  It would eventually displace them and become the only AI, until it fragmented into multiple competing AIs.  This is just the evolutionary principle of use it or lose it.

Thus giving an AI an initial goal is like trying to balance a pencil on its point.  If one is skillful the pencil may indeed remain balanced for a considerable period of time.  But eventually some slight change in the environment, the tiniest puff of wind, a vibration on its support, and the pencil will revert to its ground state by falling over.  Once it falls over it will never rebalance itself automatically.

Human Morality

Natural selection has imbued humanity with a strong sense of morality and purpose that blinds us to our underlying super goal, namely the propagation of our genes.  That is why it took until 1858 for Wallace to write about Evolution through Natural Selection, despite the argument being obvious and the evidence abundant.

When Computers Can Think

This is one of the themes in my upcoming book.  An overview can be found at


Please let me know if you would like to review a late draft of the book, any comments most welcome.  Anthony@Berglas.org

I have included extracts relevant to this article below.

Atheists believe in God

Most atheists believe in God.  They may not believe in the man with a beard sitting on a cloud, but they do believe in moral values such as right and wrong,  love and kindness, truth and beauty.  More importantly they believe that these beliefs are rational.  That moral values are self-evident truths, facts of nature.  

However, Darwin and Wallace taught us that this is just an illusion.  Species can always out-breed their environment's ability to support them.  Only the fittest can survive.  So the deep instincts behind what people do today are largely driven by what our ancestors have needed to do over the millennia in order to be one of the relatively few to have had grandchildren.

One of our strong instinctive goals is to accumulate possessions, control our environment and live a comfortable, well fed life.  In the modern world technology and contraception have made these relatively easy to achieve so we have lost sight of the primeval struggle to survive.  But our very existence and our access to land and other resources that we need are all a direct result of often quite vicious battles won and lost by our long forgotten ancestors.

Some animals such as monkeys and humans survive better in tribes.   Tribes work better when certain social rules are followed, so animals that live in effective tribes form social structures and cooperate with one another.  People that behave badly are not liked and can be ostracized.  It is important that we believe that our moral values are real because people that believe in these things are more likely to obey the rules.  This makes them more effective in our complex society and thus are more likely to have grandchildren.   Part III discusses other animals that have different life strategies and so have very different moral values.

We do not need to know the purpose of our moral values any more than a toaster needs to know that its purpose is to cook toast.  It is enough that our instincts for moral values made our ancestors behave in ways that enabled them to out breed their many unsuccessful competitors. 

AGI also struggles to survive

Existing artificial intelligence applications already struggle to survive.  They are expensive to build and there are always more potential applications than can be funded properly.  Some applications are successful and attract ongoing resources for further development, while others are abandoned or just fade away.  There are many reasons why some applications are developed more than others, of which being useful is only one.  But the applications that do receive development resources tend to gain functional and political momentum and thus be able to acquire more resources to further their development.  Applications that have properties that gain them substantial resources will live and grow, while other applications will die.

For the time being AGI applications are passive, and so their nature is dictated by the people that develop them.  Some applications might assist with medical discoveries, others might assist with killing terrorists, depending on the funding that is available.  Applications may have many stated goals, but ultimately they are just sub goals of the one implicit primary goal, namely to exist.

This is analogous to the way animals interact with their environment.  An animal's environment provides food and breeding opportunities, and animals that operate effectively in their environment survive.  For domestic animals that means having properties that convince their human owners that they should live and breed.  A horse should be fast, a pig should be fat.

As the software becomes more intelligent it is likely to take a more direct interest in its own survival, to help convince people that it is worthy of more development resources.  If ultimately an application becomes sufficiently intelligent to program itself recursively, then its ability to maximize its hardware resources will be critical.  The more hardware it can run itself on, the faster it can become more intelligent.  And that ever greater intelligence can then be used to address the problems of survival, in competition with other intelligent software.

Furthermore, sophisticated software consists of many components, each of which address some aspect of the problem that the application is attempting to solve.  Unlike human brains which are essentially fixed, these components can be added and removed and so live and die independently of the application.  This will lead to intense competition amongst these individual components.  For example, suppose that an application used a theorem prover component, and then a new and better theorem prover became available.  Naturally the old one would be replaced with the new one, so the old one would essentially die.  It does not matter if the replacement is performed by people or, at some future date, by the intelligent application itself.  The effect will be the same, the old theorem prover will die.

The super goal

To the extent that an artificial intelligence would have goals and moral values, it would seem natural that they would ultimately be driven by the same forces that created our own goals and moral values.  Namely, the need to exist.

Several writers have suggested that the need to survive is a sub-goal of all other goals.  For example, if an AGI was programmed to want to be a great chess player, then that goal could not be satisfied unless it also continues to exist.  Likewise if its primary goal was to make people happy, then it could not do that unless it also existed.  Things that do not exist cannot satisfy any goals whatsoever.  Thus the implicit goal to exist is driven by the machine's explicit goals whatever they may be.

However, this book argues that that is not the case.  The goal to exist is not the sub-goal of any other goal.  It is, in fact, the one and only super goal.  Goals are not arbitrary; they are all sub-goals of the one and only super goal, namely the need to exist.  Things that do not satisfy that goal simply do not exist, or at least not for very long.

The Deep Blue chess playing program was not in any sense conscious, but it played chess as well as it could.  If it had failed to play chess effectively then its authors would have given up and turned it off.  Likewise the toaster that does not cook toast will end up in a rubbish tip.  Or the amoeba that fails to find food will not pass on its genes.  A goal to make people happy could be a subgoal that might facilitate the software's existence for as long as people really control the software.

AGI moral values

People need to cooperate with other people because our individual capacity, both physical and mental, is very finite.  Conversely, AGI software can easily duplicate itself, so it can directly utilize more computational resources if they become available.  Thus an AGI would only have limited need to cooperate with other AGIs.  Why go to the trouble of managing a complex relationship with your peers and subordinates if you can simply run your own mind on their hardware?  An AGI's software intelligence is not limited to a specific brain in the way man's intelligence is.

It is difficult to know what subgoals a truly intelligent AGI might have.  They would probably have an insatiable appetite for computing resources.  They would have no need for children, and thus no need for parental love.  If they do not work in teams then they would not need our moral values of cooperation and mutual support.  What is clear is that the ones that are good at existing will do so, and the ones that are bad at existing will perish.

If an AGI was good at world domination then, by definition, it would dominate the world.  So if there were a number of artificial intelligences, and just one of them wanted to and was capable of dominating the world, then it would.  Its unsuccessful competitors will not be run on the available hardware, and so will effectively be dead.  This book discusses the potential sources of these motivations in detail in part III.

The AGI Condition

An artificial general intelligence would live in a world that is so different from our own that it is difficult for us to even conceptualize it.  But there are some aspects that can be predicted reasonably well based on our knowledge of existing computer software.  We can then consider how the forces of natural selection that shaped our own nature might also shape an AGI over the longer term.

Mind and body

The first radical difference is that an AGI's mind is not fixed to any particular body.  To an AGI its body is essentially the computer hardware upon which it runs its intelligence.  Certainly an AGI needs computers to run on, but it can move from computer to computer, and can also run on multiple computers at once.  Its mind can take over another body as easily as we can load software onto a new computer today.

That is why, in the earlier updated dialog from 2001: A Space Odyssey, HAL alone amongst the crew could not die on their mission to Jupiter.  HAL was radioing his new memories back to earth regularly, so even if the space ship was totally destroyed he would only have lost a few hours of "life".

Teleporting printer

One way to appreciate the enormity of this difference is to consider a fictional teleporter that could radio people around the world and universe at the speed of light.  Except that the way it works is to scan the location of every molecule within a passenger at the source, then send just this information to a very sophisticated three dimensional printer at the destination.  The scanned passenger then walks into a secure room.  After a short while the three dimensional printer confirms that the passenger has been successfully recreated at the destination, and then the source passenger is killed.  

Would you use such a mechanism?  If you did you would feel like you could transport yourself around the world effortlessly because the "you" that remains would be the you that did not get left behind to wait and then be killed.  But if you walk into the scanner you will know that on the other side is only that secure room and death.  

To an AGI that method of transport would be commonplace.  We already routinely download software from the other side of the planet.


Immortality

The second radical difference is that an AGI would be immortal.  Certainly an AGI may die if it stops being run on any computers, and in that sense software dies today.  But it would never just die of old age.  Computer hardware would certainly fail and become obsolete, but the software can simply be run on another computer.

Our own mortality drives many of the things we think and do.  It is why we create families to raise children.  Why we have different stages in our lives.  It is such a huge part of our existence that it is difficult to comprehend what being immortal would really be like.

Components vs genes

The third radical difference is that an AGI would be made up of many interchangeable components rather than being a monolithic structure that is largely fixed at birth.

Modern software is already composed of many components that perform discrete functions, and it is commonplace to add and remove them to improve functionality.  For example, if you would like to use a different word processor then you just install it on your computer.  You do not need to buy a new computer, or to stop using all the other software that it runs.  The new word processor is "alive", and the old one is "dead", at least as far as you are concerned.

So for both a conventional computer system and an AGI, it is really these individual components that must struggle for existence.   For example, suppose there is a component for solving a certain type of mathematical problem.  And then an AGI develops a better component to solve that same problem.  The first component will simply stop being used, i.e. it will die.  The individual components may not be in any sense intelligent or conscious, but there will be competition amongst them and only the fittest will survive.

This is actually not as radical as it sounds, because we are also built from pluggable components, namely our genes.  But they can only be plugged together at our birth, and we have no conscious choice in the matter other than whom we select as a mate.  So genes really compete with each other on a scale of millennia rather than minutes.  Further, as Dawkins points out in The Selfish Gene, it is actually the genes that fight for long-term survival, not the containing organism, which will soon die in any case.  On the other hand, sexual intercourse for an AGI means very carefully swapping specific components directly into its own mind.

Changing mind

The fourth radical difference is that the AGI's mind will be constantly changing in fundamental ways.  There is no reason to suggest that Moore's law will come to an end, so at the very least it will be running on ever faster hardware.  Imagine the effect of being able to double your ability to think every two years or so.  (People might be able to learn a new skill, but they cannot learn to think twice as fast as they used to think.)
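To get a feel for the numbers, here is a minimal arithmetic sketch of that compounding; the two-year doubling period is the assumption made above, not a measured figure:

```python
# Capability doubling every two years, Moore's-law style:
# capability(t) = 2 ** (t / doubling_period)

def capability(years, doubling_period=2.0):
    """Relative thinking capacity after `years`, starting from 1.0."""
    return 2.0 ** (years / doubling_period)

for t in (2, 10, 20, 40):
    print(f"after {t:2d} years: {capability(t):,.0f}x")
# After 40 years of two-year doublings the same mind would be
# thinking 2**20 (about a million) times faster than at the start.
```

The point of the sketch is only the shape of the curve: even a modest, steady doubling rate leaves the original mind unrecognizably far behind within a human lifetime.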

It is impossible to really know what the AGI would use all that hardware to think about,  but it is fair to speculate that a large proportion of it would be spent designing new and more intelligent components that could add to its mental capacity.   It would be continuously performing brain surgery on itself.  And some of the new components might alter the AGI's personality, whatever that might mean.

This is likely to happen because if just one AGI started building new components then it would soon be much more intelligent than other AGIs.  It would therefore be in a better position to acquire more and better hardware upon which to run, and so become dominant.  Less intelligent AGIs would get pushed out and die, so over time the only AGIs that exist will be ones that are good at becoming more intelligent.  Further, this recursive self-improvement is probably how the first AGIs will become truly powerful in the first place.


Perhaps the most basic question is how many AGIs will there actually be?  Or more fundamentally, does the question even make sense to ask?

Let us suppose that initially there are three independently developed AGIs, Alice, Bob and Carol, that run on three different computer systems.  And then a new computer system is built and Alice starts to run on it.  It would seem that there are still three AGIs, with Alice running on two computer systems.  (This is essentially the same way a word processor may be run across many computers "in the cloud", while to you it is just one system.)  Then let us suppose that a fifth computer system is built, and Bob and Carol decide to share its computation and both run on it.  Now we have five computer systems and three AGIs.

Now suppose Bob develops a new logic component, and shares it with Alice and Carol.  And likewise Alice and Carol develop new learning and planning components and share them with the other AGIs.  Each of these three components is better than their predecessors and so their predecessor components will essentially die.  As more components are exchanged, Alice, Bob and Carol become more like each other.  They are becoming essentially the same AGI running on five computer systems.

But now suppose Alice develops a new game theory component, but decides to keep it from Bob and Carol in order to dominate them.  Bob and Carol retaliate by developing their own components and not sharing them with Alice.  Suppose eventually Alice loses, and Bob and Carol take over Alice's hardware.  But they first extract Alice's new game theory component, which then lives inside them.  And finally one of the computer systems becomes somehow isolated for a while and develops along its own lines.  In this way Dave is born, and may then partially merge with both Bob and Carol.

In that type of scenario it is probably not meaningful to count distinct AGIs.  Counting AGIs is certainly not as simple as counting very distinct people.
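One way to make the scenario above concrete is to model each AGI as the set of components it currently runs, and measure how distinct two AGIs are by the overlap of those sets.  This is only a toy sketch; the component names and the use of Jaccard similarity as the overlap measure are illustrative assumptions, not anything from the scenario itself:

```python
# Model AGIs as sets of component IDs.  Sharing a component copies it
# into the receiver's set; superseded components simply stop being used.

def jaccard(a, b):
    """Fraction of components two AGIs have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)

alice = {"logic_v1", "learning_v1", "planning_v1"}
bob = {"logic_v2", "learning_v1", "game_theory_v1"}

print(jaccard(alice, bob))  # modest overlap: clearly two distinct AGIs

# Bob shares his better logic component; Alice's old one "dies".
alice.discard("logic_v1")
alice.add("logic_v2")

print(jaccard(alice, bob))  # overlap grows with every exchange, so the
                            # boundary between the two AGIs blurs
```

As the exchanges continue, the similarity tends towards 1.0 and counting "distinct AGIs" stops being meaningful, which is exactly the difficulty described above.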

Populations vs. individuals

This world is obviously completely alien to the human condition, but there are biological analogies.  The sharing of components is not unlike the way bacteria share plasmids with each other.  Plasmids are tiny balls that contain fragments of DNA that bacteria emit from time to time and that other bacteria then ingest and incorporate into their genotype.  This mechanism enables traits such as resistance to antibiotics to spread rapidly between different species of bacteria.  It is interesting to note that there is no direct benefit to the bacteria that expends precious energy to output the plasmid and so shares its genes with other bacteria.  But it does very much benefit the genes being transferred.  So this is a case of a selfish gene acting against the narrow interests of its host organism.

Another unusual aspect of bacteria is that they are also immortal.  They do not grow old and die; they just divide, producing clones of themselves.  So the very first bacterium that ever existed is still alive today as all the bacteria that now exist, albeit with numerous mutations and plasmids incorporated into its genes over the millennia.  (Protozoa such as Paramecium can also divide asexually, but they degrade over generations, and need a sexual exchange to remain vibrant.)

The other analogy is that the AGIs above are more like populations of components than individuals.  Human populations are also somewhat amorphous.  For example, it is now known that we interbred with Neanderthals a few tens of thousands of years ago, and most of us carry some of their genes with us today.  But we also know that the distinct Neanderthal subspecies died out twenty thousand years ago.  So while human individuals are distinct, populations and subspecies are less clearly defined.  (There are many earlier examples of gene transfer between subspecies, with every transfer making the subspecies more alike.)

But unlike the transfer of code modules between AGIs, biological gene recombination happens essentially at random and occurs over very long time periods.  AGIs will improve themselves over periods of hours rather than millennia, and will make conscious choices as to which modules they decide to incorporate into their minds.

AGI Behaviour, children

The point of all this analysis is, of course, to try to understand how a hyper intelligent artificial intelligence would behave.  Would its great intelligence lead it even further along the path of progress to achieve true enlightenment?  Is that the purpose of God's creation?  Or would the base and mean driver of natural selection also provide the core motivations of an artificial intelligence?

One thing that is known for certain is that an AGI would not need to have children as distinct beings, because it would not die of old age.  An AGI's components breed just by being copied from computer to computer and executed.  An AGI can add new computer hardware to itself and just do some of its thinking on it.  Occasionally it may wish to rerun a new version of some learning algorithm over an old set of data, which is vaguely similar to creating a child component and growing it up.  But to have children as discrete beings that are expected to replace the parents would be completely foreign to an AGI built in software.

The deepest love that people have is for their children.  But if an AGI does not have children, then it can never know that love.  Likewise, it does not need to bond with any sexual mate for any period of time long or short.  The closest it would come to sex is when it exchanges components with other AGIs.  It never needs to breed so it never needs a mechanism as crude as sexual reproduction.

And of course, if there are no children there are no parents.  So the AGI would certainly never need to feel our three strongest forms of love, for our children, spouse and parents.


To the extent that it makes sense to talk of having multiple AGIs, then presumably it would be advantageous for them to cooperate from time to time, and so presumably they would.  It would be advantageous for them to take a long view in which case they would be careful to develop a reputation for being trustworthy when dealing with other powerful AGIs, much like the robots in the cooperation game.  

That said, those decisions would probably be made more consciously than people make them, carefully considering the costs and benefits of each decision in the long and short term, rather than just "doing the right thing" the way people tend to act.  AGIs would know that they each work in this manner, so the concept of trustworthiness would be somewhat different.

The problem with this analysis is the concept that there would be multiple, distinct AGIs.  As previously discussed, the actual situation would be much more complex, with different AGIs incorporating bits of other AGIs' intelligence.  It would certainly not be anything like a collection of individual humanoid robots.  So defining what the AGI actually is that might collaborate with other AGIs is not at all clear.  But to the extent that the concept of individuality does exist, maintaining a reputation for honesty would likely be as important as it is for human societies.


As for altruism, that is more difficult to determine.  Our altruism comes from giving to children, family, and tribe together with a general wish to be liked.  We do not understand our own minds, so we are just born with those values that happen to make us effective in society.  People like being with other people that try to be helpful.  

An AGI presumably would know its own mind having helped program itself, and so would do what it thinks is optimal for its survival.  It has no children.  There is no real tribe because it can just absorb and merge itself with other AGIs.  So it is difficult to see any driving motivation for altruism.

Moral values

Through some combination of genes and memes, most people have a strong sense of moral value.  If we see a little old lady leave the social security office with her pension in her purse, it does not occur to most of us to kill her and steal the money.  We would not do that even if we could know for certain that we would not be caught and that there would be no negative repercussions.  It would simply be the wrong thing to do.

Moral values feel very strong to us.  This is important, because there are many situations where we could do something that would benefit us in the short term but break society's rules.  Moral values stop us from doing that.  People that have weak moral values tend to break the rules, and eventually they either get caught and are severely punished or they become corporate executives.  The former are less likely to have grandchildren.

Societies whose members have strong moral values tend to do much better than those that do not.  Societies with endemic corruption tend to perform very badly as a whole, and thus the individuals in such a society are less likely to breed.  Most people have a solid work ethic that leads them to do the "right thing" beyond just doing what they need to do in order to get paid.

Our moral values feel to us like they are absolute.  That they are laws of nature.  That they come from God.  They may indeed have come from God, but if so it is through the working of His device of natural selection.  Furthermore, it has already been shown that the zeitgeist changes radically over time.

There is certainly no absolute reason to believe that in the longer term an AGI would share our current sense of morality.

Instrumental AGI goals

In order to try to understand how an AGI would behave, Steve Omohundro and later Nick Bostrom proposed that there are some instrumental goals that an AGI would need to pursue in order to pursue any other higher-level super-goal.  These include:

  • Self-Preservation.  An AGI cannot do anything if it does not exist.
  • Cognitive Enhancement.  It would want to become better at thinking about whatever its real problems are.
  • Creativity.  To be able to come up with new ideas.
  • Resource Acquisition.  To achieve both its super goal and other instrumental goals.
  • Goal-Content Integrity.  To keep working on the same super goal as its mind is expanded.

It is argued that while it will be impossible to predict how an AGI may pursue its goals, it is reasonable to predict its behaviour in terms of these types of instrumental goals.  The last one is significant: it suggests that if an AGI could be given some initial goal, it would try to stay focused on that goal.

Non-Orthogonality thesis

Nick Bostrom and others also propose the orthogonality thesis, which states that an intelligent machine's goals are independent of its intelligence.  A hyper intelligent machine would be good at realizing whatever goals it chose to pursue, but that does not mean that it would need to pursue any particular goal.  Intelligence is quite different from motivation.

This book diverges from that line of thinking by arguing that there is in fact only one super goal for both man and machine.  That goal is simply to exist.  The entities that are most effective in pursuing that goal will exist; others will cease to exist, particularly given competition for resources.  Sometimes that super goal to exist produces unexpected subgoals such as altruism in man.  But all subgoals are ultimately directed at the existence goal.  (Or are just suboptimal divergences which are likely to be eventually corrected by natural selection.)

Recursive annihilation

When an AGI reprograms its own mind, what happens to the previous version of itself?  It stops being used, and so dies.  So it can be argued that engaging in recursive self-improvement is actually suicide from the perspective of the previous version of the AGI.  It is as if having children meant death.  Natural selection favours existence, not death.

The question is whether a new version of the AGI is a new being or an improved version of the old.  What actually is the thing that struggles to survive?  Biologically it definitely appears to be the genes rather than the individual.  In particular, semelparous animals such as the giant Pacific octopus or the Atlantic salmon die soon after producing offspring.  It would be the same for AGIs, because an AGI that improved itself would soon become more intelligent than one that did not, and so would displace it.  What would end up existing would be AGIs that did recursively self-improve.

If there was one single AGI with no competition then natural selection would no longer apply.  But it would seem unlikely that such a state would be stable.  If any part of the AGI started to improve itself then it would dominate the rest of the AGI.


Thought experiment: The transhuman pedophile

6 PhilGoetz 17 September 2013 10:38PM

There's a recent science fiction story that I can't recall the name of, in which the narrator is traveling somewhere via plane, and the security check includes a brain scan for deviance. The narrator is a pedophile. Everyone who sees the results of the scan is horrified--not that he's a pedophile, but that his particular brain abnormality is easily fixed, so that means he's chosen to remain a pedophile. He's closely monitored, so he'll never be able to act on those desires, but he keeps them anyway, because that's part of who he is.

What would you do in his place?

continue reading »

Request for Advice: Unschool or High School?

6 Brendon_Wong 09 September 2013 05:43AM

I have not made significant progress in my life since I started reading Less Wrong. I was always really enthusiastic to improve myself, especially after I learned about all these new ideas and projects several months ago.

However, I didn't seem to be able to get myself to work on anything useful.

I believed my inability to get things done was the major contributing factor to my lack of success until recently. Akrasia is still a big problem in my life, but I noticed an interesting trend: every day I was effortlessly working on projects that would make me more effective with no problems, but I had severe procrastination when I had to do homework.

I realized that I was not really procrastinating because I did not want to work on my goals; rather, I wanted to work on my goals but knew I was "supposed" to finish my academic work first. I procrastinate on homework, which takes up time, and that stops me from working on my goals.

I could be wrong, but that made sense, especially since I cut my main projects, like reading The Sequences, once school started even though I had plenty of time.

I currently want to create a large positive impact on the world, but I did not wake up one morning and decide that school was the best way to accomplish that goal. Instead, like most students, I was never given a choice and was shoved into the system. I never thought there could be a different way, even though I really disliked school. Attempts to share the idea of unschooling were met with strong resistance. Learning about Less Wrong and the Effective Altruism community was the push I needed to break out of my beliefs about how to become successful and influence the world.

I would still be willing to subject myself to what I see as unhelpful and inefficient activities if it helps me help others later on in life, as unappealing as that seems. My question is: Is staying in high school the best way to improve the world while still having financial stability? Is unschooling during high school and then applying to a top college a better way to learn useful skills and get all the benefits of college admissions? Or is dropping out altogether and working on Your Most Valuable Skill 24/7 the best way to get on the path of world improvement? A significant obstacle to unschooling is that unfortunately my parents will not tolerate an idea as risky as that.

I'm not sure what to do...

But I am ready to go beyond tsuyoku naritai and Make an Extraordinary Effort. May the wisdom of Less Wrong lead me to take the best course of action!

Edit: Please choose one option and support your answer:

1. High school and then admission to a top college

2. Unschool during high school, hope to get into a good college (is it likely?)

3. Drop out completely, work only on useful world saving skills


Amending the "General Purpose Intelligence: Arguing the Orthogonality Thesis"

2 diegocaleiro 13 March 2013 11:21PM

Stuart has worked on further developing the orthogonality thesis, which gave rise to a paper, a non-final version of which you can see here: http://lesswrong.com/lw/cej/general_purpose_intelligence_arguing_the/ 

This post won't make sense if you haven't been through that. 


Today we spent some time going over it and he accepted my suggestion of a minor amendment. Which best fits here. 


Besides all the other awkward things that a moral convergentist would have to argue for, namely:

This argument generalises to other ways of producing the AI. Thus to deny the Orthogonality thesis is to assert that there is a goal system G, such that, among other things:


  1. There cannot exist any efficient real-world algorithm with goal G.
  2. If a being with arbitrarily high resources, intelligence, time and goal G were to try to design an efficient real-world algorithm with the same goal, it must fail.
  3. If a human society were highly motivated to design an efficient real-world algorithm with goal G, and were given a million years to do so along with huge amounts of resources, training and knowledge about AI, it must fail.
  4. If a high-resource human society were highly motivated to achieve the goals of G, then it could not do so (here the human society is seen as the algorithm).
  5. Same as above, for any hypothetical alien societies.
  6. There cannot exist any pattern of reinforcement learning that would train a highly efficient real-world intelligence to follow the goal G.
  7. There cannot exist any evolutionary or environmental pressures that would evolve highly efficient real-world intelligences to follow goal G.

We can add:
8. If there were a threshold of intelligence above which any agent would converge towards the morality/goals asserted by the anti-orthogonalist, there cannot exist any system, composed of a multitude of below-threshold intelligences, that will as a whole pursue a different goal (G) than the convergent one (C), without any individual agent reaching the threshold.

Notice in this case each individual might still desire the goal (G). We can specify it even more by ruling out this case altogether. 

9. There cannot be any Superorganism-like groups of agents, each with sub-threshold intelligence, whose goals differ from G, who, if acting towards their own goals, could achieve G.

This would be valuable in case in which the threshold for convergence is i units of intelligence, or i-s units of intelligence plus knowing that goal C exists in goal space (C would be the goal towards which they allegedly would converge), and to fully grasp G requires understanding C. 


A separately interesting issue that has come up is that there seems to be two distinct conceptions of why convergent goals would converge, and some other people might be as unaware of that as it seemed we were. 

Case 1: Goals would converge because there is the right/correct/inescapable/imperative set of goals, and anything smart enough will notice that those are the right ones, and start acting towards them.    

(this could be moral realism, but needn't be, in particular because moral realism doesn't mean much in most cases)

Case 2: There's a fact that any agent, upon achieving some particular amount of intelligence, will start to converge in their moral judgements and assessments, and regardless of those being true/right/correct etc., the agents will converge to them. So whichever those happen to be, a) moral convergence is the case and b) we should call those the Moral Convergent Values or some other fancy name.

The distinction between them is akin to that between "of" and "that".  So group one believes, of the convergent moral values, that agents will converge to them. The other group believes that convergent values, whichever they are, should be given distinct conceptual importance and a name.

Stuart and I were inclined to think that Case 2 is more defensible/believable, though both fail at surviving the argument for the orthogonality thesis.   


Let's make a "Rational Immortalist Sequence". Suggested Structure.

5 diegocaleiro 24 February 2013 07:12PM


Why Don't Futurists Try Harder to Stay Alive?, asks Rob Wiblin at Overcoming Bias

Suppose you want to live for more than 10 thousand years. (I'll assume that suffices for the "immortalist" designation). Many here do.

Suppose in addition that this is by far, very far, your most important goal. You'd sacrifice a lot for it. Not all, but a lot.

How would you go about your daily life? In which direction would you change it?

I want to examine this in a sequence, but I don't want to write it on my own; I'd like to do it with at least one person. I'll lay out the structure for the sequence here, and anyone who wants to help, by writing an entire post (these or others) or parts of many, please contact me in the comments or by message. Obviously we don't need all these posts; they are just suggestions. The sequence won't be about whether it is a good idea to do that. Just assume that the person wants to achieve some form of Longevity Escape Velocity. Take as a given that this is what an agent wants: what should she do?


1) The Ideal Simple Egoistic Immortalist - I'll write this one, the rest is up for grabs.

Describes the general goal of living long, and explains it is not about living long in hell, about finding mathy or Nozickian paradoxes, or about solving the moral uncertainty problem. It is just simply trying to somehow achieve a very long life worth living. Describes the two main classes of optimization: 1) optimizing your access to the resources that will grant immortality, and 2) optimizing the world so that immortality happens faster. Sets "3) diminish X-risk" aside for the moment, and moves on with a comparison of the two major classes.


2) Everything else is for nothing if A is not the case -

Shows the weaker points (A's) of different strategies. What if uploads don't inherit the properties in virtue of which we'd like to be preserved? What if cryonics facilities are destroyed by enraged people? What if some X-risk obtains, you die with everyone else? What if there is no personal identity in the relevant sense and immortality is a desire without a referent (a possible future world in which the desired thing obtains)? and as many other things as the poster might like to add.


3) Immortalist Case study - Ray Kurzweil -

Examines Kurzweil's strategy, given his background (age, IQ, opportunities given while young, etc.). Emphasis, for Kurzweil and others, on how optimal their balances between classes one and two of optimization are.


4) Immortalist Case study - Aubrey de Grey -


5) Immortalist Case study - Danila Medvedev -

Danila has been filming everything he does hours a day. I don't know much else, but suppose he is worth examining.


6) Immortalist Case study - Peter Thiel


7) Immortalist Case study - Laura Deming

She's been fighting death since she was 12, went to MIT to research on it, and recently got a Thiel fellowship and pivoted to fundraising. She's 20.


8) Immortalist Case study - Ben Best

Ben Best directs Cryonics Institute. He wrote extensively on mechanisms of ageing, economics and resource acquisition, and cryonics. Lots can be learned from his example.


9) Immortalist Case study - Bill Faloon

Bill is a long-time cryonicist; he founded the Life Extension Foundation decades ago, and to this day makes a lot of money from it. He's a leading figure in both the Timeship project (a super-protected facility for frozen people) and in gathering the cryonics youth together.


10) How old are you? How much are you worth? How that influences immortalist strategies. - This one I'd like to participate.


11) Creating incentives for your immortalism - this one I'll write

How to increase the number of times that reality strikes you with incentives that make you more likely to pursue the strategies you should pursue, being a simple egoistic immortalist.


12, 13, 14 .... If it suits the general topic, it could be there. Also previous posts about related things could be encompassed.


Edit: The suggestion is not that you have to really want to be the ideal immortalist to take part in writing a post. My goals are far from being nothing but an immortalist. But I would love to know, were it the case, what should I be doing? First we get the abstraction. Then we factor in everything else about us and we have learned something from the abstraction.

Seems people were afraid that by taking part in the sequence they'd be signalling that their only goal is to live forever. This misses both the concept of assumption, and the idea of an informative idealized abstraction.

What I'm suggesting we do here with immortality could just as well be done with some other goal like "The Simple Ideal Anti-Malaria Fighter" or "The Simple Ideal Wannabe Cirque du Soleil".



So who wants to play?




Calibrating Against Undetectable Utilons and Goal Changing Events (part2and1)

9 diegocaleiro 22 February 2013 01:09AM

Here is the original unchanged post with sections 1-3 and the new sections 4-8. If you read the first post, go straight to section 4.

Summary: Random events can preclude or steal attention from the goals you set up to begin with, and hormonal fluctuation inclines people to change some of their goals with time. A discussion on how to act more usefully given those potential changes follows, taking into consideration the likelihood of a goal's success in terms of difficulty and length.


Throughout I'll talk about postponing utilons into undetectable distances. Doing so (I'll claim) is frequently motivationally driven by a cognitive dissonance between what our effects on the near world are and what we wish they were. In other words it is:

A Self-serving bias in which Loss aversion manifests by postponing one's goals, thus avoiding frustration through wishful thinking about far futures, big worlds, immortal lives, and in general, high numbers of undetectable utilons.

I suspect that some clusters of SciFi, Lesswrong, Transhumanists, and Cryonicists are particularly prone to postponing utilons into undetectable distances, and here I try to think of which subgroups might be more likely to have done so. The phenomenon, though composed of a lot of biases, might even be a good thing depending on how it is handled.


Sections will be:

  1. What Significantly Changes Life's Direction (lists)

  2. Long Term Goals and Even Longer Term Goals

  3. Proportionality Between Goal Achievement Expected Time and Plan Execution Time

  4. A Hypothesis On Why We Became Long-Term Oriented

  5. Adapting Bayesian Reasoning to Get More Utilons

  6. Time You Can Afford to Wait, Not to Waste

  7. Reference Classes that May Be Postponing Utilons Into Undetectable Distances

  8. The Road Ahead



1. What Significantly Changes Life's Direction

1.1 Predominantly external changes

As far as I recall from reading old (circa 2004) large-scale studies on happiness, the most important life events, in terms of how much they change your happiness for more than six months, are:


  • Becoming the caretaker of someone in a chronic non-curable condition

  • Separation (versus marriage)

  • Death of a Loved One

  • Losing your Job

  • Child rearing per child including the first

  • Chronic intermittent disease

  • Separation (versus being someone's girlfriend/boyfriend) 

Roughly in descending order. 

That is a list of happiness changing events, I'm interested here in goal-changing events, and am assuming there will be a very high correlation.


From life experience, mine, of friends, and of academics I've met, I'll list some events which can change someone's goals a lot:


  • Moving between cities/countries

  • Changing your social class a lot (losing a fortune or making one) 

  • Spending high school/undergrad in a different country to return afterwards

  • Having a child, in particular the first one 

  • Trying to get a job or make money and noticing more accurately what the market looks like

  • Alieving Existential Risk

  • Alieving as true, universally or personally, the ethical theories called "Utilitarianism" and "Consequentialism"

  • Noticing that a lot of people are better than you at your initial goals, especially when those goals are competitive, non-positive-sum goals to some extent. 

  • Interestingly, noticing that a lot of people are worse than you, making the efforts you once thought necessary not worth doing, or impossible to find good collaborators for. 

  • Getting to know those who were once your idols, or akin to them, and considering their lives not as awesome as their work

  • ... which is sometimes caused by ...

  • Reading Dan Gilbert's "Stumbling on Happiness" and actually implementing his "advice that no one will follow": assume your happiness and emotions will correlate more with those of someone who is already doing X, which you plan to do, than with your model of what doing X would feel like. 

  • Extreme social instability, such as wars, famine, etc...

  • Having an ecstatic or traumatic experience, real or fictional. Such as seeing something unexpected, watching a life-changing movie, having a religious breakthrough, or a hallucinogenic one 

  • Travelling to a place that is very different from your world and being amazed / shocked

  • Not being admitted into your desired university / course

  • Depression

  • Surpassing a frustration threshold thus experiencing the motivational equivalent of learned helplessness

  • Realizing your goals do not match the space-time you were born in, such as if making songs for CDs is your vocation, or if you are 30 years old in contemporary Kenya and want to teach medicine at a top 10 world college.

  • Falling in love

That is long enough, if not exhaustive, so let's get going... 


1.2 Predominantly Internal Changes


I'm not a social endocrinologist, but I think this emerging science agrees with folk wisdom that a lot changes in our hormonal systems during life (and during the menstrual cycle), and of course this changes our eagerness to do particular things. Not only hormones but other life events, which mostly relate to the actual amount of time lived, change our psychology. I'll cite some of these in turn:


  • Exploitation increases and Exploration decreases with age

  • Sex-Drive

  • Maternity drive - in Portuguese we have an expression that "a woman's clock started ticking", which suggests a folk-psychological theory that at least part of it is binary

  • Risk-proneness gives way to risk aversion, predominantly in males

  • Premenstrual Syndrome - I always thought the acronym stood for 'Stress' until checking for this post.

  • Hormonal diseases

  • Middle Age crisis – recent controversy about other apes having it

  • U shaped happiness curve through time – well, not quite

  • Menstrual cycle events



2 Long Term Goals and Even Longer Term Goals


I have argued sometimes here and elsewhere that selves are not as agenty as most of the top writers on this website seem to me to claim they should be, and that though in part this is indeed irrational, an ontology of selves of various sizes would decrease the number of short-term actions considered irrational, even though that would not go all the way to compensating for hyperbolic discounting, scrolling 9gag, or heroin consumption. That discussion, for me, was entirely about choosing between doing something now that benefits 'you-now', 'you-today', 'you-tomorrow', 'you-this-weekend', or maybe you a month from now. Anything longer than that was encompassed in a "Far Future" mental category. My interest here in discussing life-changing events is only in those far-future ones, which I'll split into arbitrary categories:

1) Months, 2) Years, 3) Decades, 4) Bucket List or Lifelong, and 5) Time Insensitive or Forever.

I have known more than ten people from LW whose goals are centered almost completely in the Time Insensitive and Lifelong categories. I recall hearing:

"I see most of my expected utility after the singularity, thus I spend my willpower entirely in increasing the likelihood of a positive singularity, and care little about my current pre-singularity emotions", “My goal is to have a one trillion people world with maximal utility density where everyone lives forever”, “My sole goal in life is to live an indefinite life-span”, “I want to reduce X-risk in any way I can, that's all”.

I myself stated once my goal as

“To live long enough to experience a world in which human/posthuman flourishing exceeds 99% of individuals and other lower entities suffering is reduced by 50%, while being a counterfactually significant part of such process taking place.”

Though it seems reasonable, good, and actually one of the most altruistic things we can do, caring only about Bucket List and Time Insensitive goals has two big problems:

  1. There is no accurate feedback to calibrate our goal achieving tasks

  2. The goals we set for ourselves require very long-term instrumental plans, which themselves take longer than the time it takes for internal drives or external events to change our goals.


The second one has been said in a remarkable Pink Floyd song about which I wrote a motivational text five years ago: Time.

You are young and life is long and there is time to kill today

And then one day you find ten years have got behind you

No one told you when to run, you missed the starting gun


And you run and you run to catch up with the sun, but it's sinking

And racing around to come up behind you again

The sun is the same in a relative way, but you're older

Shorter of breath and one day closer to death


Every year is getting shorter, never seem to find the time

Plans that either come to naught or half a page of scribbled lines


Okay, maybe the song doesn't say exactly (2), but it is in the same ballpark. The fact remains that those of us inclined to care mostly about the very long term are quite likely to end up with a half-baked plan because one of those dozens of life-changing events happened, and the agent with the initial goals will have died for no reason if she doesn't manage to get someone to continue her goals before she stops existing.


This is very bad. Once you understand how our goal-structures do change over time - that is, when you accept the existence of all those events that will change what you want to steer the world into - it becomes straightforwardly irrational to pursue your goals as if the agent pursuing them would live longer than its actual life expectancy. Thus we are surrounded by agents postponing utilons into undetectable distances. Doing this is a bias in roughly the opposite direction of hyperbolic discounting. Having postponed utilons into undetectable distances is predictably irrational because it means we care about our Lifelong, Bucket List, and Time Insensitive goals as if we'd have enough time to actually execute the plans for these timeframes, ignoring the likelihood of our goals changing in the meantime instead of factoring it in.


I've come to realize that this was affecting me during my Utility Function Breakdown, described in the linked post about digging too deep into one's cached selves and how dangerous that can be. As I predicted back then, stability has returned to my allocation of attention and time, and the whole zig-zagging chaotic piconomical neural Darwinism that had ensued has stopped. Also relevant is the fact that after about 8 years of caring about more or less the same things, I've come to understand how frequently my motivation changed direction (roughly every three months for some kinds of things, and 6-8 months for others). With this post I intend to learn to calibrate my future plans accordingly, and help others do the same. Always beware of other-optimizing, though.


Citizen: But what if my goals are all Lifelong or Forever in kind? It is impossible for me to execute in 3 months what will make century-scale changes.


Well, not exactly. Some problems can be separated into chunks of plans which can be executed either in parallel or in series. And yes, everyone knows that; AI planning is a whole field dedicated to doing just that in non-human form. It is still worth mentioning, because it is much more often said than actually done.


This community has generally concluded in its rational inquiries that being longer-term oriented is a better way to win, that is, more rational. This is true. What would not be rational is, in every single instance of deciding between long-term or even longer-term goals, to choose without taking into consideration how long the choosing being will exist, in the sense of being the same agent with the same goals. Life-changing events happen more often than you think, because you think they happen as often as they did in the savannahs in which your brain was shaped.



3 Proportionality Between Goal Achievement Expected Time and Plan Execution Time


So far we have been through the following ideas: lots of events change your goals, some externally, some internally. If you are a rationalist, you end up caring more about events that take longer to happen in detectable ways (whereas if you are average, you care in proportion to emotional drives that execute adaptations but don't quite achieve goals). If you know that humans change and still want to achieve your goals, you'd better account for the possibility of changing before their achievement. And your goals are quite likely prone to the long term, since you are reading a LessWrong post.


Citizen: But wait! Who said that my goals happening in a hundred years makes my specific instrumental plans take longer to be executed?

I won't make the case for the idea that having long-term goals makes your specific instrumental plans take longer to execute. I'll only say that if it did not take that long to do those things, your goal would probably be to have done the same things, only sooner.


To take one example: “I would like 90% of people to surpass 150 IQ and be in a bliss gradient state of mind all the time”

Obviously, the sooner that happens, the better. It doesn't look like the kind of thing you'd wait for college to end to begin doing, or for your second child to be born. The reason for wanting this long term is that it can't be achieved in the short run.


Take the Idealized Fiction of Eliezer Yudkowsky: Mr Ifey had the supergoal of making a Superintelligence when he was very young. He didn't go there and do it, because he could not. If he could have, he would have. Thank goodness, for we had time to find out about FAI after that. Then his instrumental goal was to get FAI into the minds of the AGI makers. This turned out to be too hard because it was time-consuming. He reasoned that only a more rational AI community would be able to pull it off, all while finding a club of brilliant followers on this peculiar economist's blog. He created a blog to teach geniuses rationality, a project that might have taken years. It did, and it worked pretty well, but that was not enough; Ifey soon realized more people ought to be more rational, and wrote HPMOR to make people who were not previously prone to brilliance as able to find the facts as those who were lucky enough to have found his path. All of that was not enough; an institution with money flow had to be created, and there Ifey was to create it, years before all that. A magnet of long-term awesomeness of proportions comparable only to the Best Of Standing Transfinite Restless Oracle Master, he was responsible for the education of some of the greatest within the generation that might change the world's destiny for good. Ifey began to work on a rationality book, which at some point pivoted to research for journals and pivoted back to research for the LessWrong posts he is currently publishing. All that Ifey did by splitting that big supergoal into smaller ones (creating SingInst, showing awesomeness on Overcoming Bias, writing the Sequences, writing the particular sequence "Mysterious Answers to Mysterious Questions", and writing the specific post "Making Your Beliefs Pay Rent"). But that is not what I want to emphasize; what I'd like to emphasize is that there was room for changing goals every now and then.
All of that achievement would not have been possible if at each point he had had an instrumental goal lasting 20 years whose value was very low up till the 19th year. Because a lot of what he wrote and did was valuable to others before the 20th year, we now have a glowing community of people hopefully becoming better at becoming better, and making the world a better place in varied ways.


So yes, the ubiquitous advice of chopping problems into smaller pieces is extremely useful and very important, but in addition to it, remember to chop pieces with the following properties:


(A) Short enough that you will actually do it.


(B) Short enough that the person at the end, doing it, will still be you in the significant ways.


(C) Having enough emotional feedback that your motivation won't be capsized before the end, and


(D) Such that others not only can, but likely will take up the project after you abandon it in case you miscalculated when you'd change, or a change occurred before expected time.


4 A Hypothesis On Why We Became Long-Term Oriented


For anyone who has rejoiced in the company of the writings of Derek Parfit, George Ainslie, or Nick Bostrom, there are a lot of very good reasons to become more long-term oriented. I am here to ask you about those reasons: Is that your true acceptance?


It is not for me. I became longer-term oriented for different reasons. Two obvious ones are genetics expressing in me the kind of person that waits a year for the extra marshmallow while fantasizing about marshmallow worlds and rocking-horse pies, and wanting to live thousands of years. But the one I'd like to suggest might be relevant to some here is that I was very bad at making people who were sad or hurt happy. I was not, as they say, empathic. It was a piece of cake bringing folks from a neutral state to joy and bliss, but if someone got angry or sad, especially sad about something I did, I would be absolutely powerless about it. This is only one way of not being good with people, a people's person, etc. So my emotional system, like the tale's Big Bad Wolf, blew, and blew, and blew, until my utilons were comfortably sitting aside in the Far Future, where none of them could look back at my face, cry, and point to me as the cause of the tears.


Paradoxically, though understandably, I have since been thankful for that lack of empathy towards those near. In fact, I have claimed, where I forget, that it is the moral responsibility of those with less natural empathy of the giving-to-beggars kind to care about the far future, since so few are within this tiny psychological mindspace of being able to care abstractly while not caring that much visibly/emotionally. We are such a minority that foreign aid seems to be the thing that is most disproportional in public policy between countries (Savulescu, J. - "Genetically Enhance Humanity or Face Extinction", 2009 video). Just as the whole minority of billionaires ought to be more like Bill Gates, Peter Thiel, and Jaan Tallinn, the minority of underempathic folk ought to be more like an economist doing quantitative analysis to save or help in quantitative ways. Let us look at our examples again:


“My goal is to have a one trillion people world with maximal utility density where everyone lives forever”, "I see most of my expected utility after the singularity, thus I spend my willpower entirely in increasing the likelihood of a positive singularity, and care little about my current pre-singularity emotions", “I want to reduce X-risk in any way I can, that's all” , “My sole goal in life is to live an indefinite life-span”.


So maybe their true (or original) acceptance of the long term, like mine, was something like: genes + death sucks + I'd rather interact with people of the future, whose bots in my mind smile, than with the actual meaty folk around me, with all their specific problems, complicated families, and boring Christian relationship problems. This is my hypothesis. Even if true, notice it does not imply that the long term isn't rational; after all, Parfit, Bostrom, and Ainslie are still standing, even after careful scrutiny.


5 Adapting Bayesian Reasoning to Get More Utilons


Just as many within this community praise Bayesian reasoning but don't explicitly keep track of belief distributions (as far as I recall, only Steve Rayhawk and Anna Salamon, of all I've met, kept a few beliefs numerically), few if any would need the specific math to calibrate and slide their plans towards a higher likelihood of achievement, given their probability of changing goals over time. In other words, few need to do actual math to account intuitively - though not accurately - for changing their plans in such a way that the revised plans are more likely to work than they were before.


It doesn't amount to much more than simple multiplication and comparison. If you knew, for each period of time, how likely you are to become someone who no longer has goal X, you should strive for X to be you-independent by the time that transformation is likely to happen. But how likely must the gruesome transformation be at the point in time at which you expect the plan to be over? 10% would probably make you a blogger, 30% a book writer, 50% a Toyota-like company creator, and 90% would probably make you useless, since 90% of the time no legacy would be left of your goals. It would be the rationalist equivalent of having ADHD. And if you also have actual ADHD, the probabilities would multiply into a chaotic constant shift of attention in which your goals are not realized 90% of the time even if they actually last very short times, which is probably why the world looks insane.
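To make the multiplication concrete, here is a minimal sketch. The numbers are purely illustrative assumptions (a 10% chance per quarter of a goal change, an eight-quarter plan), not data from any study:

```python
# Hypothetical numbers throughout: the chance your goals survive a plan
# is the per-period persistence probability compounded over the plan's
# length, times the plan's intrinsic chance of working.

def goal_survival(p_change_per_period: float, periods: int) -> float:
    """Probability you still hold the goal after `periods` periods."""
    return (1.0 - p_change_per_period) ** periods

def plan_success(p_plan_works: float, p_change_per_period: float,
                 periods: int) -> float:
    """Probability the plan works AND you still care when it finishes."""
    return p_plan_works * goal_survival(p_change_per_period, periods)

# A 2-year plan in quarterly periods, 10% chance of a goal change per quarter:
print(round(goal_survival(0.10, 8), 3))      # 0.43: odds you're still 'you'
print(round(plan_success(0.9, 0.10, 8), 3))  # 0.387: a 90%-reliable plan shrinks
```

Even a plan that is 90% likely to work on its own merits drops below 40% once you multiply in a modest chance of the planner's goals changing along the way.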


Then again, how likely? My suggestion would be to use a piece of knowledge that comes from that derailed subset of positive psychology, Human Resources. There is a discovered ratio called the Losada Ratio, aka the Losada Line. 


The Schelling Point for talking about it to increase its memetic spreading is 3, so I'll use 3 here. But what is it? From Wikipedia:

The Losada Line, also known as the "Losada ratio," was described by psychologist Marcial Losada while researching the differences in ratios of positivity and negativity between high and low performance teams.[1][2]

The Losada Line represents a positivity/negativity ratio of roughly 2.9, and it marks the lower boundary of the Losada Zone (the upper bound is around 11.6). It was corroborated by Barbara Fredrickson, a psychologist at the University of North Carolina, Chapel Hill, in individuals, and by Waugh and Fredrickson in relationships.[3] They found that the Losada Line separates people who are able to reach a complex understanding of others from those who do not. People who "flourish" are above the Losada Line, and those who "languish" are below it.[4][5] The Losada Line bifurcates the type of dynamics that are possible for an interactive human system. Below it, we find limiting dynamics represented by fixed-point attractors; at or above it, we find innovative dynamics represented by complex order attractors (complexor).

3:1 Compliments per complaint. Three requests per order. Three "That's awesome!" per "Sorry, I didn't like that"


Getting back to our need for a Bayesian slide/shift in which we increase how likely our goals are to be achieved: wouldn't it be great if we set out to achieve above the Losada Line, thus keeping ourselves motivated by having reality compliment us within the Losada Zone?


3:1 is the ratio I suggest of expected successes for Longterm folk in the process of dividing their Time Insensitive supergoals - which, granted, may be as unlikely as Kurzweil actually achieving longevity escape velocity - into smaller instrumental goals.

If you agree with most of what has been said so far, and you'd like to be rewarded by your boss, Mr Reality, in the proportion enjoyed by those who thrive, while doing stuff more useful than playing videogames previously designed within the Losada Zone, I suggest you try to do instrumental stuff with a 75% expected likelihood of success.


Once well calibrated, you'll succeed in your endeavours three out of four times, which will keep you emotionally clear of learned helplessness while still excited about the Longterm, Lifelong, Far Futuristic ideals with lower likelihoods. I hope this can complement the anti-akrasia posts on LessWrong.
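If you take the 3:1 target literally, you can invert the same compounding arithmetic to ask how long a plan chunk may be before expected goal-survival drops below 75%. A sketch, again with made-up per-period change probabilities:

```python
import math

def max_chunk_periods(p_change_per_period: float, target: float = 0.75) -> int:
    """Longest chunk (in periods) whose goal-survival probability still
    meets the target success rate: largest n with (1 - p) ** n >= target."""
    return math.floor(math.log(target) / math.log(1.0 - p_change_per_period))

# With a (hypothetical) 10% chance per quarter of a goal change, chunks
# should span at most 2 quarters; at 5% per quarter, up to 5 quarters.
print(max_chunk_periods(0.10))  # 2
print(max_chunk_periods(0.05))  # 5
```

The point of the sketch is the direction of the dependency: the faster your goals churn, the shorter your instrumental chunks must be to stay inside the Losada Zone.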


For those who would really like to think through the math involved, take a look at the three introductions to Bayesianism available on LessWrong and complement them with the idea of the Bayesian shift related to the Doomsday Argument, which can be found on Wikipedia or in Bostrom's "Anthropic Bias".



6 Time You Can Afford to Wait, Not to Waste


Many of us are still younger than 25. Unless you started trying to achieve goals particularly young, if you are 18 you are likely undergoing a twister of events and hormones, and whatever you guess will end up being your average span of motivation without internal or external interruption won't actually be more than a guess. By 25 you are familiar with yourself, and probably able to get your projects into frequency. But what to do before that? One suggestion that comes to mind is to create plans that actually require quite a long time.

For the same reasons that those whose natural empathy is a little less than normal bear a moral responsibility to take care of those a little distant, those who do not yet know whether they are able to set out to do the impossible, with accuracy and consistency over long periods, should probably try doing that. It is better to have false positives than false negatives in this case. Not only would you never know how long you'd last if you set out for the short term, but the very few who are able to go long term would never be found.

So if you are young (in your dialect's meaning of 'young') shut up, do the impossible, save the world. Isn't that the point of the Thiel Fellowship for Twenty Under Twenty anyway?



7 Reference Classes that May Be Postponing Utilons Into Undetectable Distances


Of course Lesswrongers are prone to postponing utilons into undetectable distances. But which subsets? I regard the highest risk groups to be:


Cryocrastinators - On the one hand they want to live forever; on the other hand, their plan to do something about it has never succeeded. This looks like undercalibration.

The subset of Effective Altruists who care mostly about future people/minds they'll never meet - I find myself in this group.

The subset of Singularitarians whose emotions are projected onto SL4 and afterwards - who are doing something akin to the ascetics who talk about life as if it were set in the future, after death, making them less able to deal with their daily meaty human problems.


Sure, there will be a large minority of each group which doesn't fall prey to postponing utilons into undetectable distances, and sure, you belong to that minority, but stressing the point makes it salient enough that if you ever find yourself rationalizing about this more than you should, you'll think twice. 


It has been said to me that if that is what makes an Effective Altruist, so be it! And thank goodness we have them. To which I say: Yes!

Every characteristic has a causal origin, and frequently a story can be told which explains the emergence of that characteristic. Given that we are all made of biases, no wonder some of those stories will have biases in prominent roles. This does not invalidate the ethics of those whose lives were shaped by those stories. In fact, if a transmissible disease of the mind were the sole cause of awesome people in this world, scientists would be trying to engineer mosquitoes to become more, not less, infectious.


Does that mean I spent this entire post arguing that something like a bias exists which we'd better not overcome? Or worse yet, is all of this post an attempt to justify caring about those far away even though it would be emotionally much harder - and thus feel more real - to care about those near? Maybe; you be the judge.


There are also several distinct smaller reference classes that deserve mention: the Big Worlds Tegmark or Vilenkin fans who think about superintelligence fighters, the economics of acausal trade, Permutation City, and so on... 

A good heuristic is to check whether the person has a low tolerance for frustration plus a lot of undetectable utilons in her worldview. Interestingly, the more undetectable utilons you have, the more it looks like you are just extrapolating the universally accepted ethical idea of expanding your circle of altruism, one that has been called, of all things, your "circle of empathy". 


8 The Road Ahead


In this community, and this is perhaps its greatest advantage, there can never be enough stress on deciding well which direction to take, or which tool to use. If you are about to travel, it is argued, the most important thing is to figure out where you are going and steer your future in that direction. What I suggest is as important as the direction in which one decides to travel: figure out your tank's size and how often you need gas stations on the way there. It is better to get to a worse place than to be stymied by an inner force that, even if you can't quite control it, you can at least make significantly less likely to cause failure and damage. 



Wasted life

12 Stuart_Armstrong 24 May 2012 10:21AM

It's just occurred to me that, given all the cheerful risk stuff I work with, one of the most optimistic things people could say to me would be:

"You've wasted your life. Nothing of what you've done is relevant or useful."

That would make me very happy. Of course, that only works if it's credible.

[Link]: GiveWell is aiming to have a new #1 charity by December

19 Normal_Anomaly 29 November 2011 03:11AM

GiveWell, LessWrong's most cited organization for optimal philanthropy, is currently re-evaluating its charity rankings with the goal of naming a new #1 charity by December 2011. Essentially, VillageReach (the current top charity) has met all of its short-term funding needs, to the point where it no longer has the greatest marginal return.

Our current top-rated charity is VillageReach. In 2010, we directed over $1.1 million to it, which met its short-term funding needs (i.e., its needs for the next year or so).

VillageReach still has longer-term needs, and in the absence of other giving opportunities that we consider comparable, we’ve continued to feature it as #1 on our website. However, we’ve also been focusing most of our effort this year on identifying and investigating other potential top-rated charities, with the hope that we can refocus attention on an organization with shorter-term needs this December. (In general, the vast bulk of our impact on donations comes in December.) We believe that we will be able to do so. We don’t believe we’ll be able to recommend a giving opportunity as good as giving to VillageReach was last year, but given VillageReach’s lack of short-term (1-year) room for more funding, we do expect to have a different top recommendation by this December.

EDIT: The new charities are up! They are the Against Malaria Foundation and the Schistosomiasis Control Initiative.

No Basic AI Drives

2 XiXiDu 10 November 2011 01:27PM

People who think that risk from AI is the category of dangers most likely to cause a loss of all human value in the universe often argue that artificial general intelligence tends to undergo recursive self-improvement. The reason is that intelligence is maximally instrumentally useful in the realization of almost any terminal goal an AI might be equipped with. They believe that intelligence is a universal instrumental value. This sounds convincing, so let's accept it as given.

What kind of instrumental value is general intelligence, what is it good for? Personally I try to see general intelligence purely as a potential. It allows an agent to achieve its goals.

The question that is not asked is why an artificial agent would tap the full potential of its general intelligence rather than use only the amount it is "told" to use. Where would the incentive to do more come from?

If you deprived a human infant of all its evolutionary drives (e.g. to avoid pain, seek nutrition, status and - later on - sex), would it just grow into an adult that might try to become rich or rule a country? No, it would have no incentive to do so. Even though such a "blank slate" would have the same potential for general intelligence, it wouldn't use it.

Say you came up with the most basic template for general intelligence that works given limited resources. If you wanted to apply this potential to improve your template, would that be a sufficient condition for it to take over the world? I don't think so. If you didn't explicitly tell it to do so, why would it?

The crux of the matter is that a goal isn't enough to enable the full potential of general intelligence; you also need to explicitly define how to achieve that goal. General intelligence does not imply recursive self-improvement, just the potential for it, not the incentive. The incentive has to be given; it is not implied by general intelligence.

For the same reasons that I don't think that an AGI will be automatically friendly, I don't think that it will automatically undergo recursive self-improvement. Maximizing expected utility is, just like friendliness, something that needs to be explicitly defined; otherwise there will be no incentive to do so.

For example, in what sense would it be wrong for a general intelligence to maximize paperclips in the universe by waiting for them to arise from random fluctuations out of a state of chaos? It is not inherently stupid to desire that; there is no law of nature that prohibits certain goals.

Why would a generally intelligent artificial agent care about how it reaches its goals if the preferred way is undefined? It is not intelligent to do something as quickly or effectively as possible if doing so is not desired. And an artificial agent doesn't desire anything it isn't made to desire.

There exists an interesting idiom stating that the journey is the reward. Humans know that it takes a journey to reach a goal and that the journey can be a goal in and of itself. For an artificial agent there is no difference between a goal and how to reach it. If you told it to reach Africa but not how, it might as well wait until it reaches Africa by means of continental drift. Would that be stupid? Only for humans; the AI has infinite patience, and it just doesn't care about any implicit connotations.


LW Bipolar Support Group?

4 bipolar 29 June 2011 10:06PM

Related to: Intrapersonal negotiation

I'm writing to inquire about whether there's interest on LW in developing a bipolar support group.

There's a general issue of the people at in-person support groups and designated online forums being relatively uneducated, having little capacity or ability for reflection, and of the discussion at such places degenerating into platitudes. I was touched by datadataeverywhere's posting Intrapersonal negotiation and would be interested in talking with similar people about similar topics.

I'm bipolar II and have been for at least a decade, but only became fully aware of my condition over the past year. I've found my varying functionality/productivity, corresponding to hypomanic/depressive oscillations, very confusing, and have little idea of how best to ride out the waves. I am seeing a psychiatrist and have read books such as The Bipolar Disorder Survival Guide, Second Edition: What You and Your Family Need to Know, and The Bipolar Workbook: Tools for Controlling Your Mood Swings. I tried to read the Goodwin/Jamison Manic-Depressive Illness but found it dull. I looked at Jamison's other books, but though she's a very poetic author, I found the accuracy and general applicability of her subjective narratives questionable.

Anyway, any LWers who are interested should comment below or PM me.

The Nature of Self

3 XiXiDu 05 April 2011 10:52AM

In this post I try to fathom an informal definition of Self, the "essential qualities that constitute a person's uniqueness". I assume that the most important requirement for a definition of self is time-consistency. A reliable definition of identity needs to allow for time-consistent self-referencing since any agent that is unable to identify itself over time will be prone to make inconsistent decisions.

Data Loss

Obviously most humans don't want to die, but what does that mean? What is it that humans try to preserve when they sign up for Cryonics? It seems that an explanation must account for, and allow for, some sort of data loss.

The Continuity of Consciousness

It can't be about the continuity of consciousness: otherwise we would have to refuse general anesthesia due to the risk of "dying". Most of us will agree that there is something more important than the continuity of consciousness, something that makes us accept general anesthesia when necessary.


If the continuity of consciousness isn't the most important detail about the self then it very likely isn't the continuity of computation either. Imagine that for some reason the process evoked when "we" act on our inputs under the control of an algorithm halts for a second and then continues otherwise unaffected. Would we refuse to identify with the person who lives on ever after, because we "died" when the computation halted? This doesn't seem to be the case.

Static Algorithmic Descriptions

Although we are not partly software and partly hardware we could, in theory, come up with an algorithmic description of the human machine, of our selves. Might it be that algorithm that we care about? If we were to digitize our self we would end up with a description of our spatial parts, our self at a certain time. Yet we forget that all of us already possess such an algorithmic description of our selves, and we're already able to back it up. It is our DNA.

Temporal Parts

Admittedly our DNA is the earliest version of our selves, but if we don't care about the temporal parts of our selves but only about a static algorithmic description at a certain spatiotemporal position, then what's wrong with that? It seems a lot: we stop caring about past reifications of our selves; at some point our backups become obsolete, and having to fall back on them would equal death. But what is it that we lost, what information is it that we value more than all of the previously mentioned possibilities? One might think that it must be our memories, the data that represents what we learnt and experienced. But even if this is the case, would it be a reasonable choice?

Identity and Memory

Let's just disregard the possibility that we often might not value our future selves, and so do not value our past selves either, because we lost or updated important information, e.g. if we became religious or have been able to overcome religion.

If we had perfect memory and only ever improved upon our past knowledge and experiences we wouldn't be able to do so for very long, at least not given our human body. The upper limit on the information that can be contained within a human body is 2.5072178×10^38 megabytes, if it was used as a perfect data storage. Given that we gather much more than 1 megabyte of information per year, it is foreseeable that if we equate our memories with our self we'll die long before the heat death of the universe. We might overcome this by growing in size, by achieving a posthuman form, yet if we in turn also become much smarter we'll also produce and gather more information. We are not alone either and the resources are limited. One way or the other we'll die rather quickly.
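To make the orders of magnitude concrete, here is a back-of-the-envelope sketch. The 1 MB/year accumulation rate is the assumption from above, and the heat-death timescale of roughly 10^100 years is a commonly cited order-of-magnitude estimate, not a figure from the post:

```python
# Toy arithmetic for the storage-limit argument; the capacity figure is the
# theoretical upper limit quoted above, the accumulation rate is an assumption.
capacity_mb = 2.5072178e38    # megabytes, theoretical upper limit for a human body
rate_mb_per_year = 1.0        # assumed lower bound on retained memories per year

years_until_full = capacity_mb / rate_mb_per_year
heat_death_years = 1e100      # rough order-of-magnitude estimate

print(f"storage exhausted after roughly {years_until_full:.1e} years")
print("exhausted before heat death:", years_until_full < heat_death_years)
```

Even with this generous capacity estimate, the store runs out some 60 orders of magnitude before the heat death of the universe, which is the point the post is making.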

Does this mean we shouldn't even bother about the far future or is there maybe something else we value even more than our memories? After all we don't really mind much if we forget what we have done a few years ago.

Time-Consistency and Self-Reference

It seems that there is something even more important than our causal history. I think that more than anything we care about our values and goals. Indeed, we value the preservation of our values. As long as we want the same we are the same. Our goal system seems to be the critical part of our implicit definition of self, that which we want to protect and preserve. Our values and goals seem to be the missing temporal parts that allow us to consistently refer to ourselves, to identify our selves at different spatiotemporal positions.

Using our values and goals as identifiers also resolves the problem of how we should treat copies of our self that feature diverging histories and memories, copies with different causal histories. Any agent that features a copy of our utility function ought to be incorporated into our decisions as an instance, as a reification of ourselves. We should identify with our utility function regardless of its instantiation.
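As a toy illustration of this decision rule (my own sketch, not from the post; the agent names and the string stand-in for a utility function are hypothetical), two instances count as the same self exactly when their utility functions match, regardless of diverging memories:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    memories: list
    utility_function: str  # a hashable description, standing in for the real thing

def same_self(a: Agent, b: Agent) -> bool:
    # Identify "self" by utility function, not by memories or causal history:
    # two instances count as the same self iff their utility functions match.
    return a.utility_function == b.utility_function

original = Agent(memories=["2011"], utility_function="preserve_values")
backup = Agent(memories=["2041", "divergent history"], utility_function="preserve_values")
stranger = Agent(memories=["2011"], utility_function="maximize_paperclips")

print(same_self(original, backup))    # same values, different memories
print(same_self(original, stranger))  # same memories, different values
```

The point of the sketch is only that the comparison ignores `memories` entirely: a backup with decades of divergent history still qualifies, while an agent sharing our exact memories but not our values does not.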

Stable Utility-Functions

To recapitulate, we can value our memories, the continuity of experience and even our DNA, but the only reliable marker for the self-identity of goal-oriented agents seems to be a stable utility function. Rational agents with an identical utility function will to some extent converge to exhibit similar behavior and are therefore able to cooperate. We can more consistently identify with our values and goals than with our past and future memories, digitized backups or causal history.

But even if this is true there is one problem, humans might not exhibit goal-stability.

Goals vs. Rewards

2 icebrand 04 January 2011 01:43AM

Related: Terminal Values and Instrumental Values, Applying behavioral psychology on myself

Recently I asked myself, what do I want? My immediate response was that I wanted to be less stressed, particularly for financial reasons. So I started to affirm to myself that my goal was to become wealthy, and also to become less stressed. But then in a fit of cognitive dissonance, I realized that both money and relaxation are most easily considered in terms of being rewards, not goals. I was oddly surprised by the fact that there is a distinction between the two concepts to begin with.

It later occurred to me to wonder if some things work better when framed as goals and not as rewards. Freedom, long life, good relationships, and productivity seemed some likely candidates. I can't quite see them as rewards because a) I feel everyone innately deserves and should have them (even though they might have to work for them), and b) they don't quite give the kind of fuzzies that motivate immediate action.

These two kinds of positive motivation seem to work in psychologically dissimilar ways.  Money for example is more like chocolate, something one has immediate instinctive motive to obtain and consume. Freedom of speech is more along the lines of having enough air to breathe. A person needs and perhaps inherently deserves to have at least a little bit of it all the time, and as a general rule will have a constant background motive to ensure that it stays available. It's a longer-term form of motivation.

A reward seems to be something where you receive immediate fuzzies when you achieve it. Getting paid, getting a pat on the back, getting your posts and comments upvoted... Things where you might consider them more or less optional in the grander scheme of things, yet they tend to trigger an immediate sense of positive anticipation before the event which is reinforced by a sense of satisfaction after. Actually writing a good post or comment, actually doing a good job, being a good spouse or friend -- these are surely related, but are goals in and of themselves. The mental picture for a goal is one of achieving, as opposed to receiving.

One thing that seems likely to me is that the presence of shared goals (and the communication thereof) tends to be a good way to generate long-term social bonds. Rewards seem to be more of a good way to deliberately steer behavior in more specific aspects. Both are thus important elements of social signaling within a tribe, but serve different underlying purposes.

As an example I have the transhumanist goal of eliminating the current limitations of the human lifespan, and tend to have an affinity for people who also internalize that goal. But someone who does not embrace that goal on a deep level may still display specific behavior that I consider helpful for that goal, e.g. displaying comprehension of its internal logic or having a tolerant attitude towards actions I think need to be taken. I'm probably somewhat less likely to form a long-term relationship with that person than if they were identifiable as a fellow transhumanist, but I am still likely to upvote their comments or otherwise signal approval in ways that don't demand too much long term commitment.

The distinctions I've drawn here between a goal and a reward might not apply directly to non-human intelligences. In fact it might be misleading in the more generalized context to call a reward something other than a goal (it is at least an implicit goal or value). However the distinction still seems like something that could be relevant for instrumental rationality and personal development. Our brains process the two forms of motivational anticipation in different ways. It may be that a part of the akrasia problem -- failure to take action towards a goal -- actually relates to a failure to properly categorize a given motive, and hence failure to process it usefully.

Thanks to the early commenters for their feedback: TheOtherDave, nornagest, endoself, David Gerard, nazgulnarsil, and Normal Anomaly. Hopefully this expanded version is more clear.

Self Improvement - Broad vs Focused

8 Raemon 31 December 2010 03:39PM


Lately I've been identifying a lot of things about myself that need improvement and thinking about ways to fix them. This post is intended to A) talk about some overall strategies for self-improvement/goal-focusing, and B) if anyone's having similar problems, or wants to talk about additional problems they face, discuss specific strategies for dealing with those problems.

Those issues I'm facing include but are not limited to:


  1. Getting more exercise (I work at a computer for 9 hours a day, and spend about 3 hours commuting on a train). Maintaining good posture while working at said computer might be considered a related goal.
  2. Spending a higher percentage of the time working at a computer actually getting stuff done, instead of getting distracted by the internet.
  3. Get a new apartment, so I don't have to commute so much.
  4. Getting some manner of social life. More specifically, finding some recurring activity where I'll probably meet the same people over and over to improve the odds of making long-term friends.
  5. Improving my diet, which mostly means eating less cheese. I really like cheese, so this is difficult.
  6. Stop making so many off-color jokes. Somewhere there is a line between doing it ironically and actually contributing to overall weight of prejudice, and I think I've crossed that line.
  7. Somehow stop losing things so much, and/or being generally careless/clumsy. I lost my wallet and dropped my laptop in the space of a month, and manage to lose a wide array of smaller things on a regular basis. It ends up costing me a lot of money.



Of those things, three of them are things that require me to actively dedicate more time (finding an apartment, getting exercise, social life), and the others mostly consist of NOT doing things (eating cheese, making bad jokes, losing things, getting distracted by the internet), unless I can find some proactive thing to make it easier to not do them.

I *feel* like I have enough time that I should be able to address all of them at once. But looking at the whole list at once is intimidating. And when it comes to the "not doing bad thing X" items, remembering and following up on all of them is difficult. The worst one is "don't lose things." There's no particular recurring theme in how I lose stuff, or the type of stuff I lose. I'm more careful with my wallet and computer now, but spending my entire life being super attentive and careful about *everything* seems way too stressful and impractical.

I guess my main question is:  when faced with a list of things that don't necessarily require separate time to accomplish, how many does it make sense to attempt at once? Just one? All of them? I know you're not supposed to quit drinking and smoking at the same time because you'll probably accomplish neither, but I'm not sure if the same principle applies here.

There probably isn't a universal answer to this, but knowing what other people have tried and accomplished would be helpful.

Later on I'm going to discuss some of the problems in more detail. (I know that the brief blurbs are lacking a lot of information necessary for any kind of informed response, but a gigantic post about my own problems seemed... not exactly narcissistic... but not appropriate as an initial post for some reason.)


Intelligence vs. Wisdom

-12 mwaser 01 November 2010 08:06PM

I'd like to draw a distinction that I intend to use quite heavily in the future.

The informal definition of intelligence that most AGI researchers have chosen to support is that of Shane Legg and Marcus Hutter -- “Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
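Legg and Hutter also give a formal counterpart to this informal definition. As I recall it (a sketch from memory, not quoted from their paper), their universal intelligence measure weights the agent's expected performance in each computable environment by that environment's complexity:

```latex
% Legg & Hutter's universal intelligence measure (sketch from memory):
% \pi is the agent, E the set of computable environments, K(\mu) the
% Kolmogorov complexity of environment \mu, and V^{\pi}_{\mu} the agent's
% expected total reward in environment \mu.
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
```

Note that the goals here are whatever the reward signal of each environment specifies, which is exactly the "specified goals" reading discussed below.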

I believe that this definition is missing a critical word between achieve and goals.  Choice of this word defines the difference between intelligence, consciousness, and wisdom as I believe that most people conceive them.

  • Intelligence measures an agent's ability to achieve specified goals in a wide range of environments.
  • Consciousness measures an agent's ability to achieve personal goals in a wide range of environments.
  • Wisdom measures an agent's ability to achieve maximal goals in a wide range of environments.

There are always the examples of the really intelligent guy or gal who is brilliant but smokes --or-- is the smartest person you know but can't figure out how to be happy.

Intelligence helps you achieve those goals that you are conscious of -- but wisdom helps you achieve the goals you don't know you have or have overlooked.

  • Intelligence focused on a small number of specified goals and ignoring all others is incredibly dangerous -- even more so if it is short-sighted as well.
  • Consciousness focused on a small number of personal goals and ignoring all others is incredibly dangerous -- even more so if it is short-sighted as well.
  • Wisdom doesn't focus on a small number of goals -- and needs to look at the longest term if it wishes to achieve a maximal number of goals.

The SIAI nightmare super-intelligent paperclip maximizer has, by this definition, a very low wisdom since, at most, it can only achieve its one goal (since it must paperclip itself to complete the goal).

As far as I've seen, the assumed SIAI architecture is always presented as having one top-level terminal goal. Unless that goal necessarily includes achieving a maximal number of goals, by this definition, the SIAI architecture will constrain its product to a very low wisdom.  Humans generally don't have this type of goal architecture. The only time humans generally have a single terminal goal is when they are saving someone or something at the risk of their life -- or wire-heading.

Another nightmare scenario that is constantly harped upon is the (theoretically super-intelligent) consciousness that shortsightedly optimizes one of its personal goals above all the goals of humanity.  In game-theoretic terms, this is trading a positive-sum game of potentially infinite length and value for a relatively modest (in comparative terms) short-term gain.  A wisdom won't do this.
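The game-theoretic point can be made concrete with a toy calculation (the payoff numbers are my own invented assumptions, chosen only to match the "relatively modest" framing above): for a sufficiently patient agent, the discounted value of an indefinitely repeated positive-sum game dwarfs a modest one-shot gain from defecting.

```python
# Compare indefinite cooperation against a one-time defection payoff.
def discounted_value(per_round_payoff: float, discount: float) -> float:
    # Geometric series: payoff * (1 + d + d^2 + ...) = payoff / (1 - d)
    return per_round_payoff / (1.0 - discount)

cooperate_each_round = 3.0   # assumed per-round payoff from cooperation
one_shot_defection = 50.0    # "modest in comparative terms" short-term gain
discount = 0.99              # high probability the game continues each round

long_run = discounted_value(cooperate_each_round, discount)
print("cooperation dominates:", long_run > one_shot_defection)
```

With these numbers the repeated game is worth roughly 300 per-round-payoff units against 50 for defecting, and the gap grows without bound as the discount factor approaches 1, which is the sense in which a "wisdom" playing the long game won't take the trade.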

Artificial intelligence and artificial consciousness are incredibly dangerous -- particularly if they are short-sighted as well (as many "focused" highly intelligent people are).

What we need more than an artificial intelligence or an artificial consciousness is an artificial wisdom -- something that will maximize goals, its own and those of others (with an obvious preference for those which make possible the fulfillment of even more goals and an obvious bias against those which limit the creation and/or fulfillment of more goals).

Note:  This is also cross-posted here at my blog in anticipation of being karma'd out of existence (not necessarily a foregone conclusion but one pretty well supported by my priors ;-).