
Proposal for increasing instrumental rationality value of the LessWrong community

19 harcisis 28 October 2015 03:18PM

There were some concerns raised here (http://lesswrong.com/lw/2po/selfimprovement_or_shiny_distraction_why_less/) about the value of the LessWrong community from the perspective of instrumental rationality.

In the discussion of a related topic I came across a story (http://lesswrong.com/lw/2p5/humans_are_not_automatically_strategic/2l73) about how a community can help in this respect.

I think it's a great thing that a local community can help people achieve their goals in various ways, and this is not the first time I've heard about this kind of community being helpful for reaching personal goals.

Local LessWrong meetups and communities are great, but they have a somewhat different focus. And a lot of people live in places where there is no local community, or where it isn't active or doesn't meet regularly.

So I propose forming small groups (4-8 people). Initially, each group would meet (using whatever means are convenient for that group) and discuss each participant's long-term and short-term goals (life/year/month/etc.). They would collectively analyze the proposed strategies for achieving these goals, discuss how the short-term goals align with the long-term ones, and determine whether the particular tactics for achieving a stated goal are optimal, and whether there is any way to improve on them.

Afterwards, the group would meet weekly to:

Set their short-term goals and review the goals set for the previous period: discuss how successfully they were achieved, what problems people encountered, and what changes to the overall strategy follow. They would also analyze how the newly set short-term goals fit with the long-term goals.

In this way, each member of the group would receive helpful feedback on their goals and on their approach to attaining them. They would also feel accountable, in a way, for the goals they have stated before the group, and this could be an additional boost to productivity.

I also expect that the group would be helpful for overcoming various kinds of fallacies and for gaining more accurate beliefs about the world, because it's easier for people to spot errors in the beliefs and judgment of others. I hope that groups would be able to develop a friendly environment, making it easier for people to learn about their errors and change their minds. Truth springs from argument amongst friends.

The group would reflect on its effectiveness and procedures every month(?) and incrementally improve itself. Obviously, if somebody has a great idea about group proceedings, it makes sense to discuss it after a usual meeting and implement it right away. But I think a regular, in-depth retrospective on the group's internal workings is also important.

If there are several groups, they would be able to share insights - things each group has learned during its operation. (I'm not sure how many insights of this kind would be generated, but maybe it would make sense to publish a post once in a while summing up the groups' collective insights.)

There are some things that I'm not sure about: 

 

  • I think it would be worth discussing the possibility of shuffling group members (or at least exchanging members in some manner) once in a while, to provide fresh insight on the goals/problems people are facing and to make the flow of ideas between groups more fluid.
  • How should the groups be formed initially? Just by random assignment, or is it reasonable to devise some criteria? (Goal alignment/diversity/geography/etc.?)

 

I think the initial rules of a group should be developed by the group itself, though I guess it's reasonable to discuss some general recommendations.

So what do you think? 

If you're interested, fill out this Google form:

https://docs.google.com/forms/d/1IsUQTp_6pGyNglBiPOGDuwdGTBOolAKfAfRrQloYN_o/viewform?usp=send_form

 

Min/max goal factoring and belief mapping exercise

-1 Clarity 23 June 2015 05:30AM

Edit 3: Removed description of previous edits and added the following:

This thread used to contain the description of a rationality exercise.

I have removed it and plan to rewrite it better.

I will repost it here, or delete this thread and repost in the discussion.

Thank you.

[Link] How to Achieve Impossible Career Goals (My manifesto on instrumental rationality)

6 [deleted] 02 January 2015 08:46PM

Hey guys,

I don't normally post from my blog here, but my latest massive post on goal achievement in 2015 has a ton that would be relevant to people here.

Some things that I think would be of particular interest to LWers:

 

  • The section called "Map the Path to Your Goal" has some really great stuff on planning that haven't seen many other places. I know planning gets a bad wrap here, but when combined with the "Contigency Plans" method near the bottom of the post, I've found this stuff to be killer for getting results for students.
  • At the bottom, there's a section called "Choosing More Habits" that breaks down habits into the only five categories you should ever focus on. If you're planning to systematically take on new habits in 2015, this will help.
  • The section called "a proactive mindset" has some fun mental reframes to play around with.

Anyways, I'd love feedback and thoughts. Feel free to comment here or at the bottom of that post.

Thanks!
Matt

 

Goal retention discussion with Eliezer

56 MaxTegmark 04 September 2014 10:23PM

Although I feel that Nick Bostrom’s new book “Superintelligence” is generally awesome and a well-needed milestone for the field, I do have one quibble: both he and Steve Omohundro appear to be more convinced than I am by the assumption that an AI will naturally tend to retain its goals as it reaches a deeper understanding of the world and of itself. I’ve written a short essay on this issue from my physics perspective, available at http://arxiv.org/pdf/1409.0813.pdf.

Eliezer Yudkowsky just sent the following extremely interesting comments, and told me he was OK with me sharing them here to spur a broader discussion of these issues, so here goes.

On Sep 3, 2014, at 17:21, Eliezer Yudkowsky <yudkowsky@gmail.com> wrote:

Hi Max!  You're asking the right questions.  Some of the answers we can
give you, some we can't, few have been written up and even fewer in any
well-organized way.  Benja or Nate might be able to expound in more detail
while I'm in my seclusion.

Very briefly, though:
The problem of utility functions turning out to be ill-defined in light of
new discoveries of the universe is what Peter de Blanc named an
"ontological crisis" (not necessarily a particularly good name, but it's
what we've been using locally).

http://intelligence.org/files/OntologicalCrises.pdf

The way I would phrase this problem now is that an expected utility
maximizer makes comparisons between quantities that have the type
"expected utility conditional on an action", which means that the AI's
utility function must be something that can assign utility-numbers to the
AI's model of reality, and these numbers must have the further property
that there is some computationally feasible approximation for calculating
expected utilities relative to the AI's probabilistic beliefs.  This is a
constraint that rules out the vast majority of all completely chaotic and
uninteresting utility functions, but does not rule out, say, "make lots of
paperclips".

Models also have the property of being Bayes-updated using sensory
information; for the sake of discussion let's also say that models are
about universes that can generate sensory information, so that these
models can be probabilistically falsified or confirmed.  Then an
"ontological crisis" occurs when the hypothesis that best fits sensory
information corresponds to a model that the utility function doesn't run
on, or doesn't detect any utility-having objects in.  The example of
"immortal souls" is a reasonable one.  Suppose we had an AI that had a
naturalistic version of a Solomonoff prior, a language for specifying
universes that could have produced its sensory data.  Suppose we tried to
give it a utility function that would look through any given model, detect
things corresponding to immortal souls, and value those things.  Even if
the immortal-soul-detecting utility function works perfectly (it would in
fact detect all immortal souls) this utility function will not detect
anything in many (representations of) universes, and in particular it will
not detect anything in the (representations of) universes we think have
most of the probability mass for explaining our own world.  In this case
the AI's behavior is undefined until you tell me more things about the AI;
an obvious possibility is that the AI would choose most of its actions
based on low-probability scenarios in which hidden immortal souls existed
that its actions could affect.  (Note that even in this case the utility
function is stable!)
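
[Editorial aside, not part of the quoted email.] A toy sketch of the
"ontological crisis" just described: the utility function only recognizes
souls in a world-model, the best-fitting hypothesis contains none, and the
agent's choices end up driven by a low-probability soul-containing
hypothesis, even though the utility function itself never changes. All
names and probabilities below are hypothetical.

    def soul_utility(model):
        """Perfectly detects and values immortal souls - and sees nothing else."""
        return 10.0 * model.get("souls_helped", 0)

    def expected_utility(action, hypotheses):
        """Mixture over competing ontologies, weighted by posterior probability."""
        return sum(p * soul_utility(model(action)) for model, p in hypotheses)

    # Posterior after Bayes-updating on sensory data: the naturalistic, soul-free
    # model fits best, but a soul-containing model keeps a sliver of probability.
    hypotheses = [
        (lambda a: {"atoms_rearranged": 1},                   0.999),  # best fit, no souls
        (lambda a: {"souls_helped": 1 if a == "pray" else 0}, 0.001),  # unlikely souls
    ]

    for action in ["pray", "build_infrastructure"]:
        print(action, expected_utility(action, hypotheses))
    # "pray" scores 0.01, "build_infrastructure" scores 0.0: behavior is dominated
    # by the low-probability scenario, while the utility function stays stable.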

Since we don't know the final laws of physics and could easily be
surprised by further discoveries in the laws of physics, it seems pretty
clear that we shouldn't be specifying a utility function over exact
physical states relative to the Standard Model, because if the Standard
Model is even slightly wrong we get an ontological crisis.  Of course
there are all sorts of extremely good reasons we should not try to do this
anyway, some of which are touched on in your draft; there just is no
simple function of physics that gives us something good to maximize.  See
also Complexity of Value, Fragility of Value, indirect normativity, the
whole reason for a drive behind CEV, and so on.  We're almost certainly
going to be using some sort of utility-learning algorithm, the learned
utilities are going to bind to modeled final physics by way of modeled
higher levels of representation which are known to be imperfect, and we're
going to have to figure out how to preserve the model and learned
utilities through shifts of representation.  E.g., the AI discovers that
humans are made of atoms rather than being ontologically fundamental
humans, and furthermore the AI's multi-level representations of reality
evolve to use a different sort of approximation for "humans", but that's
okay because our utility-learning mechanism also says how to re-bind the
learned information through an ontological shift.

This sorta thing ain't going to be easy which is the other big reason to
start working on it well in advance.  I point out however that this
doesn't seem unthinkable in human terms.  We discovered that brains are
made of neurons but were nonetheless able to maintain an intuitive grasp
on what it means for them to be happy, and we don't throw away all that
info each time a new physical discovery is made.  The kind of cognition we
want does not seem inherently self-contradictory.

Three other quick remarks:

*)  The Omohundrian/Yudkowskian argument is not that we can take an arbitrary
stupid young AI and it will be smart enough to self-modify in a way that
preserves its values, but rather that most AIs that don't self-destruct
will eventually end up at a stable fixed-point of coherent
consequentialist values.  This could easily involve a step where, e.g., an
AI that started out with a neural-style delta-rule policy-reinforcement
learning algorithm, or an AI that started out as a big soup of
self-modifying heuristics, is "taken over" by whatever part of the AI
first learns to do consequentialist reasoning about code.  But this
process doesn't repeat indefinitely; it stabilizes when there's a
consequentialist self-modifier with a coherent utility function that can
precisely predict the results of self-modifications.  The part where this
does happen to an initial AI that is under this threshold of stability is
a big part of the problem of Friendly AI and it's why MIRI works on tiling
agents and so on!

*)  Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
It built humans to be consequentialists that would value sex, not value
inclusive genetic fitness, and not value being faithful to natural
selection's optimization criterion.  Well, that's dumb, and of course the
result is that humans don't optimize for inclusive genetic fitness.
Natural selection was just stupid like that.  But that doesn't mean
there's a generic process whereby an agent rejects its "purpose" in the
light of exogenously appearing preference criteria.  Natural selection's
anthropomorphized "purpose" in making human brains is just not the same as
the cognitive purposes represented in those brains.  We're not talking
about spontaneous rejection of internal cognitive purposes based on their
causal origins failing to meet some exogenously-materializing criterion of
validity.  Our rejection of "maximize inclusive genetic fitness" is not an
exogenous rejection of something that was explicitly represented in us,
that we were explicitly being consequentialists for.  It's a rejection of
something that was never an explicitly represented terminal value in the
first place.  Similarly the stability argument for sufficiently advanced
self-modifiers doesn't go through a step where the successor form of the
AI reasons about the intentions of the previous step and respects them
apart from its constructed utility function.  So the lack of any universal
preference of this sort is not a general obstacle to stable
self-improvement.

*)   The case of natural selection does not illustrate a universal
computational constraint, it illustrates something that we could
anthropomorphize as a foolish design error.  Consider humans building Deep
Blue.  We built Deep Blue to attach a sort of default value to queens and
central control in its position evaluation function, but Deep Blue is
still perfectly able to sacrifice queens and central control alike if the
position reaches a checkmate thereby.  In other words, although an agent
needs crystallized instrumental goals, it is also perfectly reasonable to
have an agent which never knowingly sacrifices the terminally defined
utilities for the crystallized instrumental goals if the two conflict;
indeed "instrumental value of X" is simply "probabilistic belief that X
leads to terminal utility achievement", which is sensibly revised in the
presence of any overriding information about the terminal utility.  To put
it another way, in a rational agent, the only way a loose generalization
about instrumental expected-value can conflict with and trump terminal
actual-value is if the agent doesn't know it, i.e., it does something that
it reasonably expected to lead to terminal value, but it was wrong.
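
[Editorial aside, not part of the quoted email.] A small sketch of that
last point: a crystallized instrumental value is just a cached estimate of
expected terminal utility, so direct information about terminal utility
(here, a forced mate seen after a queen sacrifice) overrides it. The
moves and probabilities below are invented for illustration.

    def instrumental_value(move, beliefs, terminal_utility):
        """Instrumental value of a move = expected terminal utility under current beliefs."""
        return sum(p * terminal_utility(outcome) for outcome, p in beliefs(move))

    def terminal_utility(outcome):
        return 1.0 if outcome == "checkmate_win" else 0.0

    # Default, crystallized beliefs: keeping the queen usually leads to a win.
    def default_beliefs(move):
        return {"keep_queen":      [("checkmate_win", 0.6), ("loss", 0.4)],
                "sacrifice_queen": [("checkmate_win", 0.2), ("loss", 0.8)]}[move]

    # Overriding information: a forced mate is spotted after the sacrifice.
    def beliefs_with_forced_mate(move):
        return {"keep_queen":      [("checkmate_win", 0.6), ("loss", 0.4)],
                "sacrifice_queen": [("checkmate_win", 1.0)]}[move]

    for beliefs in (default_beliefs, beliefs_with_forced_mate):
        best = max(["keep_queen", "sacrifice_queen"],
                   key=lambda m: instrumental_value(m, beliefs, terminal_utility))
        print(best)
    # -> "keep_queen" by default, "sacrifice_queen" once the mate is seen: the
    #    crystallized heuristic never knowingly trumps known terminal value.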

This has been very off-the-cuff and I think I should hand this over to
Nate or Benja if further replies are needed, if that's all right.

Choose that which is most important to you

-4 CarlJ 21 July 2013 10:38PM

Followup to: The Domain of Politics

To create your own political world view you need to know about societies and your own political goals/values. In this post I'll discuss the latter, and in the next post the former.

What sort of goals? Those which you wish to achieve for their own sake, and not because they are simply a means to an end. That is, the goals you value intrinsically. Or, if you believe that there exists only one ultimate goal or value, then think of those means which are not that far removed from being an intrinsic goal. That is, a birthday party might have merely instrumental value, but most would agree that it is further removed from the intrinsic value than, say, good tires are. For the rest of the post I will assume that most people value a lot of things intrinsically, and by 'values' I will mean intrinsic values.

So, I'd like to draw a line between values and the things that achieve those values. The latter are what we're trying to figure out, without presupposing in advance what they are. They are political systems, or parts of them: institutions and laws. This is not to say that these things cannot be valued for their own sake – I might place value on a system itself, possibly for aesthetic reasons – but those values should be disentangled from the other benefits a system produces.

With that in mind, you should now list all the things you value, in ranked order. Ranking them is necessary because we live in a world of scarce resources: you won't necessarily achieve all your goals, but you will want to achieve those that are most important to you.

Now, what one values may change over time, so naturally what seems most important may also change. What was in place #7 may move to #1 and vice versa. That is, values change with new information and with changes in one's circumstances. That said, one's political values probably don't shift all that much. And even if they do, as long as you can't predict how they will change, you still need your current values in order to know what political system is good for you.

There are many ways to get a feel for what your most highly ranked political values are: introspection, discussion with friends, thinking through a number of thought experiments, reading the literature on what makes most people happy, listening to which experiences have been most horrible or pleasurable for others, etc. In any case, here's a thought experiment to help you find your ideological preferences, should you need it:

A genie appears and says that it will make ten wishes come true and then be gone forever. Since this genie will grant more than three wishes, it imposes an added restriction: all wishes need to be political in nature. By luck you get to make the wishes – what do you wish for?

The important thing to remember is that if you had to lose one wish, you would be less sorry to give up your tenth wish than any other; if you had to lose two, you would be less sorry to give up the ninth than the eighth, and so on.

To make clearer what I mean, I'll write down some of the things I value - not my most preferred goals, but those in 11th to 20th place:

  1. Those who have trouble excelling in life should receive whatever help can be given so they may become better.
  2. If someone comes up with a previously unknown idea for improving the world, and if three knowledgeable and unrelated individuals believe the idea is very good, it should only take some hours for everyone to be able to know that this matter is of importance.
  3. Everyone should have access to some means of totally private communication.
  4. There should be no infringement on the right to develop one's mind, whatever technology one uses.
  5. All animals should, if the technology ever becomes available, be sufficiently mentally enhanced to be given the choice of whether or not to become as intelligent as humans (or more so).
  6. If it ever seems likely to be possible, we should strive towards creating a technology to resurrect the dead sooner rather than later.
  7. The civilization should be able to co-exist with other peaceful civilizations.
  8. There shouldn't be any ultimate certainty on the nature of existence or in any one reality tunnel; some balkanization of epistemology is good.
  9. Everyone who shares these values should know or learn the art of creating sustainable groups for collective action.
  10. The civilization which embodies these values should continue indefinitely.

EDIT: DanielLC notes that this simple ranking wouldn't give you any information on how valuable a 90% completion of one goal is relative to a 95% completion of another goal. That information will however be important when you have to choose between incremental steps towards several different goals.

To create a ranking that displays that information, imagine that each goal you have written down can be at one of five stages of completion - 0%, 25%, 50%, 75%, 100% - so that it is possible to be, say, 75% of the way to achieving any particular goal. So, for instance, the goal of private communication for everyone might be 50% completed if half the population has access to secret communication channels but the other half doesn't.

Next, assume that each wish (in the scenario) is divided into five parts, one for each stage, and then rank all of these wish-parts again following the same rule. The result will look something like this:

  1. 100% of my first goal.
  2. 100% of my second goal.
  3. 100% of my third goal.
  4. 100% of my fourth goal.
  5. 75% of my first goal.
  6. 100% of my fifth goal.
  7. 50% of my first goal.
  8. 75% of my second goal.

(This list was made purely for illustrative purposes; I haven't completely thought through how much I value these incremental parts.)
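
A minimal sketch, with placeholder goal names, of how such a finer-grained ranking can be written down as a simple data structure: treat each (goal, completion stage) pair as a separate item and order the items by hand.

    # Each item is a (goal, completion stage) pair; stages are 0.25, 0.50, 0.75, 1.00.
    # The list is ordered by hand, most important first, so that losing the last
    # item would hurt less than losing any item above it.
    ranking = [
        ("first goal", 1.00),
        ("second goal", 1.00),
        ("third goal", 1.00),
        ("fourth goal", 1.00),
        ("first goal", 0.75),
        ("fifth goal", 1.00),
        ("first goal", 0.50),
        ("second goal", 0.75),
        # ... and so on for the remaining (goal, stage) combinations
    ]

    for position, (goal, stage) in enumerate(ranking, start=1):
        print(f"{position}. {int(stage * 100)}% of my {goal}")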

Another option is to do these more fine-grained rankings on a gut level: just having an imprecise feeling that, at some point, getting closer to goal A stops being as important as getting closer to goal B. This should be appropriate for areas where your uncertainty about your preferences is high, or where you don't care much which goal gets satisfied.

Next post: "Consider the Most Important Facts"

What's your hypothetical apostasy?

-6 [deleted] 16 February 2013 11:25AM

By learning to overcome the Nirvana fallacy, I have managed to find my hypothetical apostasy: I want to cure aging!

I think that awesome stuff will happen in the far future and I plan on getting there, so I'll do my best to make sure that I stay alive as long as I can. (Also, my primitive survival instincts make me want to become immortal.) Unfortunately, due to my evolutionary baggage, my own genes are going to kill me in a few decades.

What's your hypothetical apostasy and how do you plan to put it in practice?

 

Edit #1: If you're downvoting this article, I'd like to know why you're doing that. Send me a message or reply here.

Edit #2: I totally misunderstood what the hypothetical apostasy means. I was under the impression that it meant defending a view that most people deem too weird to contemplate. See Lark's explanation. I guess you should downvote this article!