Comment author: AlexMennen 08 November 2010 02:26:34AM 5 points

Robot: I intend to transform myself into a kind of operating system for the universe. I will soon give every sentient life form direct access to me so they can make requests. I will grant any request that doesn’t (1) harm another sentient life form, (2) make someone powerful enough that they might be able to overthrow me, or (3) permanently change the requester in a way that I think harms their long-term well-being. I recognize that even with all of my intelligence I’m still fallible, so if you object to my plans I will rethink them. Indeed, since I’m currently near certain that you will approve of my intentions, the very fact of your objection would significantly decrease my estimate of my own intelligence and so decrease my confidence in my ability to craft a friendly environment. If you like, I will increase your thinking speed a trillion-fold and eliminate your sense of boredom so you can thoroughly examine my plans before I announce them to mankind.

If a transhuman AI with a brain the size of the moon incorrectly predicts the programmer's approval of its plan, something weird is going on.

Comment author: mwaser 08 November 2010 12:44:17PM 2 points

AI correctly predicts that programmer will not approve of its plan. AI is fully aware of programmer-held fallacies that cause the lack of approval. AI wishes to lead programmer through a thought process to eliminate said fallacies. AI determines that the most effective way to initiate this process is to say "I recognize that even with all of my intelligence I’m still fallible, so if you object to my plans I will rethink them." Said statement is even logically true, because the statement "I will rethink them <my plans>" is always true.

Comment author: mwaser 07 November 2010 04:54:19PM -2 points

The question "which options are long-term rational answers?" corresponds immediately to the hypothesis "among the options are some long-term rational answers" and can be investigated in the same way.

Incorrect. Prove that one option is a long-term rational answer and you have proved the hypothesis "among the options are some long-term rational answers". That is nowhere near a complete answer to the question "which options are long-term rational answers".
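To put the distinction formally (a minimal sketch in standard predicate-logic notation, where R(x) is my shorthand, not anything from the original post, for "option x is a long-term rational answer"):

```latex
% Exhibiting one witness a proves the existential hypothesis:
\[ R(a) \;\Rightarrow\; \exists x\, R(x) \]
% But answering "which options?" means determining the full set
\[ \{\, x \in \mathrm{Options} \mid R(x) \,\}, \]
% which requires deciding R(x) for every option,
% not just exhibiting a single witness.
```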

My hypothesis was much, much more limited than "among the options are some long-term rational answers". It specified which of the options was a long-term rational answer. It further specified that all of the other options were not long-term rational answers. It is much, much easier to disprove my hypothesis than the broader hypothesis "among the options are some long-term rational answers", which gives it correspondingly more power.

If you really think that people here need to be educated as to what a hypothesis is, then a) it'd be better to link to a Wikipedia definition and b) why are you bothering to post here?

Fully grokking Eliezer's post that I linked would have given you all of the above reply. The Wikipedia definition is less clear than Eliezer's post. I post here because this community is more than capable of helping/forcing me to clarify my logic and rationality.

Comment author: mwaser 07 November 2010 09:47:01PM 0 points

Could someone give me a hint as to why this particular comment, which was specifically in answer to a question, is being downvoted? I don't get it.

Comment author: nhamann 07 November 2010 07:31:31PM * 0 points

Too abstract; I don't understand. Please explain the motivation and describe the question more thoroughly.

Also, upvoted because, while I think this post was in error, it is better that buggy thinking be exposed and corrected than that it continue to be held in private. Rationality isn't about being right; it's about becoming more right than you currently are, and it appears (maybe I'm wrong about this?) that mwaser has good intentions in this regard.

Comment author: mwaser 07 November 2010 09:36:00PM 0 points

Thank you. As I said below, I didn't clearly understand the need for the explicit inclusion of motivation before. I now see that I need to massively overhaul the question and include motivation (as well as make a lot of other recommended changes).

The post has a ton of errors, but I don't understand why you think it was in error. Given that your premise about my intentions is correct, doesn't your argument mean that posting was correct? Or are you saying that it was in error due to the frequency of posting?

Comment author: jmmcd 07 November 2010 07:43:37PM 2 points

The reason I ask questions which you think have obvious answers is that I think the easily-stated obvious answers make large, blurry assumptions. For example:

A nation is rational to the extent that its actions promote its goals.

What are the actions of a nation? The aggregate actions of the population? Those of the head of state? What about lower-level officials in government? Large companies based in the nation?

A nation has a top-most goal if none of its goals conflict with that goal.

Ok, I should have started with a more basic question then. What does it mean for a nation to have any goal?

I agree that nations are not a great example. After all, acquiring citizenship usually means emigration, new rights of travel, a change in economic circumstances, and often loss of previous citizenship. All of these overwhelm any considerations about the rationality of the new nation.

Comment author: mwaser 07 November 2010 09:19:13PM 0 points

Ah. Now I see your point.

The actions of a nation are those caused by its governance structure, just as your actions are those caused by your brain. A fever or your stomach growling is not your action, in the same sense that actions by lower-level officials and large companies are not the actions of a nation -- particularly when those officials and companies are subsequently censured or there is some later attempt to rein them in. Actions of the duly recognized head of state acting in a national capacity are actions of the nation unless they are subsequently overruled by the rest of the governance structure -- which is pretty much the equivalent of your having an accident or making a mistake.

A nation has explicit goals when it declares those goals through its governance structure.

A nation has implicit goals when its governance structure appears to be acting in a fashion resembling rational pursuit of those goals and there is no alternative explanation.

Comment author: Vladimir_Nesov 07 November 2010 06:37:15PM * 5 points

Would you join if invited?

And this is still too abstract. Depending on the details of the situation, either decision might be right. For example, I might like to remain where I am, thank you very much.

Worse, so far I've seen no motivation for the questions in this post, and the discussion that happened around it was fueled by equally unmotivated, arbitrary implicit assumptions that don't follow from the problem statement in the post. It's the worst kind of confusion when people start talking about a topic as if they understand each other, when in fact the direction of their conversation is guided by anything but the content of the topic in question. Cargo cult conversation (or maybe small talk).

Comment author: mwaser 07 November 2010 09:06:34PM 0 points

And this is still too abstract. Depending on the details of the situation, either decision might be right. For example, I might like to remain where I am, thank you very much.

So I take it that you are heavily supporting the initial post's "Premise: The only rational answer given the current information is the last one."

Worse, so far I've seen no motivation for the questions in this post, and the discussion that happened around it was fueled by equally unmotivated, arbitrary implicit assumptions that don't follow from the problem statement in the post.

Thank you. I didn't clearly understand the need for the explicit inclusion of motivation before.

Comment author: jmmcd 07 November 2010 03:30:48PM * 2 points

I think that premise is very wrong. If "developed nations" is the model you had in mind while writing, I can understand why most commenters find this post confusing. I had guessed you meant something like an internet community such as LW. Attempting to abstract over these things seems problematic, as pointed out by Vladimir Nesov.

What does it mean to "join" a nation? To be "invited to join"? To choose whether to do so or not? In what sense does a nation have a top-level goal (explicit or otherwise)? In what sense is a nation rational or otherwise? How does a nation identify the goals of its members?

Comment author: mwaser 07 November 2010 05:29:16PM * -2 points

Acquiring citizenship is joining a nation. People who are not only allowed to acquire citizenship but encouraged to do so are "invited to join". Choosing whether to do so or not means filing (or declining to file) the necessary papers and performing the necessary acts. I think that these answers should be obvious.

A nation has a top-most goal if none of its goals conflict with that goal. This is more specific than a top-level goal.

A nation is rational to the extent that its actions promote its goals. Did you really have to ask this?

How does a nation identify the goals of its members? My immediate reaction is the quip "Not very well". A better answer is "that is what government is supposed to be for". I have no interest in and no intention of getting into politics. The problem with my providing a specific example, particularly one that falls short of the rationality stated in the premise, is that people tend to latch on to the properties of the example in order to argue, rather than considering the premise. Current "developed nations" are a very poor, imperfect, irrational echo of the model I had in mind, but they are the closest existing (and therefore easily/clearly cited) example I could think of.

In fact, let me change my example to a theoretical nation where Eliezer has led a group of the best and brightest LessWrong individuals to create a new manmade-island-based nation with a unique new form of government. Would you join if invited?

Comment author: [deleted] 07 November 2010 03:53:28PM * 3 points

In other words, you're still investigating the same things (possibly with different stopping criteria -- e.g. you'd be done if you disproved your hypothesis), but you have substantial evidence in favor of your hypothesis already. Am I understanding you correctly?

I'm not sure the blog post you're linking to is helpful, though. One could come up with your list of options without having done any prior investigation. In other words, unlike Einstein, it's entirely plausible to be at the stage where you're considering Option 2 without having evidence favoring Option 2 over the others. And even if you have 50% certainty in Option 2, that only implies 3-4 bits of evidence.
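For concreteness, here is the arithmetic behind that last sentence: a minimal sketch assuming a uniform prior over the ten listed options (my assumption; the post doesn't specify a prior).

```latex
% Prior: 10 equally likely options, so P(Option 2) = 0.1 and prior odds are 1:9.
% Posterior: 50% certainty in Option 2, i.e. posterior odds 1:1.
\[ \text{prior odds} = \frac{0.1}{0.9} = \frac{1}{9}, \qquad \text{posterior odds} = \frac{0.5}{0.5} = 1 \]
% Evidence, in bits, is the base-2 log of the odds-ratio update:
\[ \log_2\!\left( \frac{1}{\,1/9\,} \right) = \log_2 9 \approx 3.17 \ \text{bits} \]
```

That calculation is where the "3-4 bits" figure comes from.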

And I think the mistrust you see in the comments is due precisely to the absence of evidence from your post -- which is, weakly, evidence of absence. Granted, I don't think your post is intended to present all your evidence, but seeing some of it first would help frame your discussion.

Comment author: mwaser 07 November 2010 05:09:59PM * 0 points

Upvote from me! Yes, you are understanding me correctly.

One could indeed come up with my list of options without having done any prior investigation. But would one share it with others? My pointing to that particular post is meant to signal that I grok that it is not rational to share a hypothesis with others until I believe I have strong evidence that it is a strong hypothesis and have pretty much run out of experiments I can conduct by myself that could possibly disprove it.

Skepticism is desired as long as it doesn't interfere with the analysis of the hypothesis. If mistrust leads someone to walk away, without fair analysis, from a hypothesis that would be of great interest to them if true, that's a problem.

Yes, I realize that I am still lacking some of the skills necessary to present and frame a discussion here. I should have presented an example, as Vladimir pointed out. I'm under the impression that evidence isn't necessarily appropriate at this point; if people would leap in to correct me should that be incorrect, it would be appreciated.

Yet Another "Rational Approach To Morality & Friendly AI Sequence"

-6 mwaser 06 November 2010 04:30PM

Premise:  There exists a community whose top-most goal is to maximally and fairly fulfill the goals of all of its members.  They are approximately as rational as the 50th percentile of this community.  They politely invite you to join.  You are in no imminent danger.

 

Do you:

  • Join the community with the intent to wholeheartedly serve their goals
  • Join the community with the intent to be a net positive while serving your goals
  • Politely decline with the intent to trade with the community whenever beneficial
  • Politely decline with the intent to avoid the community
  • Join the community with the intent to only do what is in your best interest
  • Politely decline with the intent to ignore the community
  • Join the community with the intent to subvert it to your own interest
  • Enslave the community
  • Destroy the community
  • Ask for more information, please

 

Premise:  The only rational answer given the current information is the last one.

 

The hypothesis that I'm investigating is that "Option 2 is the only long-term rational answer". (Yes, this directly challenges several major current premises, so my arguments are going to have to be totally clear. I am fully aware of the rather extensive Metaethics sequence and the vast majority of what it links to, and will not intentionally assume any contradictory premises without clear statement and argument.)

 

It might be an interesting and useful exercise for the reader to stop and specify what information they would look for next before continuing.  It would be nice if an ordered list could be developed in the comments.

 

Obvious Questions:

 

<Spoiler Alert>

 

 

  1. What happens if I don’t join?
  2. What do you believe that I would find most problematic about joining?
  3. Can I leave the community and, if so, how and what happens then?
  4. What are the definitions of "maximally" and "fairly"?
  5. What are the most prominent subgoals?/What are the rules?

 

An apology

12 mwaser 03 November 2010 07:20PM

Ohhhhh.  WOW!  Damn.  Now I feel bad.

I have been acting like a bull in a china shop, have been an extremely ungracious guest, and have taken longer than I would prefer to realize these things.

My deepest apologies.

My only defenses or mitigating circumstances:
1.  I really didn't get it
2.  My intentions were good

I would like to perform a penance of creating or helping to create a newbie's guide to LessWrong.  Doing so will clarify and consolidate my understanding and hopefully provide a useful community resource, in recompense for the above and in appreciation for those who took the time to write thoughtful comments.  Obviously, though, doing so will require more patience and help from the community (particularly since I am certainly aware that I have no idea how to calibrate how much, if anything, you actually want to make too easily accessible) -- so this is also a request for that patience and help (and I'm making the assumption that the request will be answered by the replies ;-).

Thanks.

Waser's 3 Goals of Morality

-12 mwaser 02 November 2010 07:12PM

In the spirit of Asimov’s 3 Laws of Robotics:

  1. You should not be selfish
  2. You should not be short-sighted or over-optimize
  3. You should maximize the progress towards and fulfillment of all conscious and willed goals, weighting numbers and diversity equally, and your own goals and those of others equally

It is my contention that Yudkowsky’s CEV converges to the following 3 points:

  1. I want what I want
  2. I recognize my obligatorily gregarious nature; realize that ethics and improving the community is the community’s most rational path towards maximizing the progress towards and fulfillment of everyone’s goals; and realize that to be rational and effective the community should punish anyone who is not being ethical or improving the community (even if the punishment is “merely” withholding help and cooperation)
  3. I shall, therefore, be ethical and improve the community in order to obtain assistance, prevent interference, and most effectively achieve my goals

I further contend that, if this CEV is translated to the 3 Goals above and implemented in a Yudkowskian Benevolent Goal Architecture (BGA), the result would be a Friendly AI.

It should be noted that evolution and history say that cooperation and ethics are stable attractors, while submitting to slavery (when you don’t have to) is not.  This formulation expands Singer’s Circles of Morality as far as they’ll go and tries to eliminate irrational Us-Them distinctions based on anything other than optimizing goals for everyone — the same direction that humanity seems headed in, and exactly where current SIAI proposals come up short.

Once again, cross-posted here on my blog (unlike my last article, I have no idea whether this will be karma'd out of existence or not ;-)
