wedrifid comments on A Master-Slave Model of Human Preferences - Less Wrong

58 Post author: Wei_Dai 29 December 2009 01:02AM




Comment author: Mitchell_Porter 29 December 2009 04:27:16AM 7 points

a test for those who propose to "extract" or "extrapolate" our preferences into a well-defined and rational form

If we are going to have a serious discussion about these matters, at some point we must face the fact that the physical description of the world contains no such thing as a preference or a want - or a utility function. So the difficulty of such extractions or extrapolations is twofold. Not only is the act of extraction or extrapolation itself conditional upon a value system (i.e. normative metamorality is just as "relative" as is basic morality), but there is nothing in the physical description to tell us what the existing preferences of an agent are. Given the physical ontology we have, the ascription of preferences to a physical system is always a matter of interpretation or imputation, just as is the ascription of semantic or representational content to its states.

It's easy to miss this in a decision-theoretic discussion, because decision theory already assumes some concept like "goal" or "utility", always. Decision theory is the rigorous theory of decision-making, but it does not tell you what a decision is. It may even be possible to create a rigorous "reflective decision theory" which tells you how a decision architecture should choose among possible alterations to itself, or a rigorous theory of normative metamorality, the general theory of what preferences agents should have towards decision-architecture-modifying changes in other agents. But meta-decision theory will not bring you any closer to finding "decisions" in an ontology that doesn't already have them.

Comment author: Kaj_Sotala 29 December 2009 09:05:53AM -2 points

I'd upvote this comment twice if I could.

Comment author: wedrifid 29 December 2009 09:14:39AM 1 point

I'd upvote this comment twice if I could.

p(wedrifid would upvote a comment twice | he upvoted it once) > 0.95

Would other people have a different approach?

Comment author: Kaj_Sotala 29 December 2009 11:15:07AM *  0 points

I'd use some loose scale where the quality of the comment correlated with the amount of upvotes it got. Assuming a user could give up to two upvotes per comment, a funny one-liner or a moderately interesting comment would get one vote, and a truly insightful one two.

p(Kaj would upvote a comment twice | he upvoted it once) would probably be somewhere around [.3, .6]

Comment author: wedrifid 29 December 2009 11:54:02AM 0 points

I'd use some loose scale where the quality of the comment correlated with the amount of upvotes it got.

That's the scale I use. Unfortunately, my ability to (directly) influence how many upvotes it gets is limited to a plus or minus one shift.