sbenthall comments on Depth-based supercontroller objectives, take 2 - Less Wrong

2 Post author: sbenthall 24 September 2014 01:25AM

Comment author: sbenthall 24 September 2014 10:53:00PM 2 points

Thanks for your thoughtful response. I'm glad that I've been more comprehensible this time. Let me see if I can address the problems you raise:

1) Point taken that human freedom is important. In the background of my argument is a theory that human freedom has to do with the endogeneity of our own computational process. So, my intuitions about the role of efficiency and freedom are different from yours. One way of describing what I'm doing is trying to come up with a function that a supercontroller would use if it were to try to maximize human freedom. The idea is that the choices humans make are some of the most computationally complex things they do, and so the representations created by choices are deeper than others. I realize now that I haven't said any of that explicitly, let alone argued for it. Perhaps that's something I should try to bring up in another post.

2) I also disagree with the morality of this outcome. But I suppose that would be taken as beside the point. Let me see if I understand the argument correctly: if the most ethical outcome is in fact something very simple or low-depth, then this supercontroller wouldn't be able to hit that mark? I think this is a problem whenever morality (CEV, say) is a process that halts.

I wonder if there is a way to modify what I've proposed to select for moral processes as opposed to other generic computational processes.

3) A couple responses:

  • Oh, if you can just program in "keep humanity alive" then that's pretty simple and maybe this whole derivation is unnecessary. But I'm concerned about the feasibility of formally specifying what is essential about humanity. VAuroch has commented that he thinks that coming up with the specification is the hard part. I'm trying to defer the problem to a simpler one of just describing everything we can think of that might be relevant. So, it's meant to be an improvement over programming in "keep humanity alive" in terms of its feasibility, since it doesn't require solving the perhaps impossible problem of understanding human essence.

  • Is it the consensus of this community that finding an objective function in E is an easy problem? I got the sense from Bostrom's book talk that existential catastrophe was on the table as a real possibility.

I encourage you to read the original Bennett paper if this interests you. I think your intuitions are on point and appreciate your feedback.

Comment author: Randaly 25 September 2014 12:55:53AM * 2 points

Thanks for your response!

1) Hmmm. OK, this is pretty counter-intuitive to me.

2) I'm not totally sure what you mean here. But, to give a concrete example, suppose that the most moral thing to do would be to tile the universe with very happy kittens (or something). CEV, as I understand it, would create as many of these as possible with its finite resources, whereas g/g* would try to create much more complicated structures than kittens.

3) Sorry, I don't think I was very clear. To clarify: once you've specified h, a superset of human essence, why would you apply the particular functions g/g* to h? Why not just directly program in 'do not let h cease to exist'? g/g* do get around the problem of specifying 'cease to exist', but this seems pretty insignificant compared to the difficulty of specifying h. And unlike programming a supercontroller to preserve an entire superset of human essence, g/g* might wind up with the supercontroller focused on some parts of h that are not part of the human essence, so it doesn't completely solve the problem of defining 'cease to exist'.

(You said above that h is an improvement because it is a superset of human essence. But we can equally program a supercontroller not to let a superset of human essence cease to exist, once we've specified said superset.)