Post author: Stuart_Armstrong 17 March 2014 06:00PM

Comment author: Squark 31 March 2014 01:18:33PM -1 points

What do you mean by "follow a utility function"? Why do you think humans don't do it?

Humans are neither independent nor transitive...

You still haven't defined "follow a utility function". Humans are not ideal rational optimizers of their respective utility functions, but it doesn't follow that they don't have them. Deep Blue often plays moves that are not ideal; nevertheless, I think it's fair to say it optimizes for winning. If you make intransitive choices, that doesn't mean your terminal values are intransitive. It means your choices are not optimal.
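
To illustrate that last point, here is a toy sketch in Python (the utilities and noise level are made up for the example): an agent that imperfectly optimizes a perfectly transitive utility function will still, some of the time, exhibit a choice cycle A > B > C > A.

```python
import random

# Hypothetical transitive "true" utility; the options are close in value.
TRUE_UTILITY = {"A": 3.0, "B": 2.9, "C": 2.8}
NOISE = 0.2  # evaluation error added each time an option is assessed

def choose(x, y):
    """Pick whichever option gets the higher *noisy* utility estimate."""
    ux = TRUE_UTILITY[x] + random.uniform(-NOISE, NOISE)
    uy = TRUE_UTILITY[y] + random.uniform(-NOISE, NOISE)
    return x if ux > uy else y

random.seed(0)
cycles = 0
for _ in range(10_000):
    # A cycle: A beats B, B beats C, and yet C beats A.
    if (choose("A", "B") == "A"
            and choose("B", "C") == "B"
            and choose("C", "A") == "C"):
        cycles += 1
print(f"intransitive-looking rounds: {cycles} / 10000")
```

The cycles here are pure evaluation noise; the underlying utility function never changes.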

Human preferences change over time...

This is probably the case. However, the changes are slow; otherwise humans wouldn't behave coherently at all. The human utility function is only defined approximately, but the FAI problem only makes sense to within the same approximation. In any case, if you're programming an AI, you should equip it with the utility function you have at that moment.

...humans have preference over their state of knowledge...

Why do you think it is inconsistent with having a utility function?

...what does it mean to have a correct solution to the FAI problem?

A utility function which, if implemented by the AI, would result in a positive, fulfilling, worthwhile existence for humans.

How can you know that a given utility function has this property? How do you know the utility function I'm proposing doesn't have this property?

Even if humans had a utility function, it's not clear that a ruling FAI should have the same one, incidentally.

Isn't it? Assume your utility function is U. Suppose you have the choice to create a superintelligence optimizing U or a superintelligence optimizing something other than U, say V. Why would you choose V? Choosing U will obviously result in an enormous expected increase in U, which is what you want to happen, since you're a U-maximizing agent. Choosing V will almost certainly result in a lower expected value of U: if the V-AI chooses a strategy X that leads to higher expected U than the strategy that would be chosen by a U-AI, then it's not clear why the U-AI wouldn't choose X itself.
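
A minimal sketch of that last step (Python, with hypothetical strategies and payoffs): whatever the payoffs are, the strategy that maximizes expected U yields at least as much expected U as the strategy that maximizes expected V.

```python
# Made-up strategy set: strategy -> (expected U, expected V)
strategies = {
    "X": (10.0, 2.0),
    "Y": (4.0, 9.0),
    "Z": (6.0, 6.0),
}

u_ai_choice = max(strategies, key=lambda s: strategies[s][0])  # maximizes U
v_ai_choice = max(strategies, key=lambda s: strategies[s][1])  # maximizes V

u_from_u_ai = strategies[u_ai_choice][0]
u_from_v_ai = strategies[v_ai_choice][0]

# Holds for ANY payoff table: the max of U over strategies is at least
# the U of whatever strategy the V-AI happens to pick.
assert u_from_u_ai >= u_from_v_ai
print(f"U-AI picks {u_ai_choice} (U={u_from_u_ai}); "
      f"V-AI picks {v_ai_choice} (U={u_from_v_ai})")
```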

Comment author: Stuart_Armstrong 31 March 2014 02:32:15PM 3 points

Humans are not ideal rational optimizers of their respective utility functions.

Then why claim that they have one? If humans have intransitive preferences (A>B>C>A), as I often do, then why claim that their preferences are secretly transitive and that they merely fail to act on them properly? Nothing we know about the brain points to there being a hidden box with a pristine and pure utility function that we then implement poorly.
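
The impossibility is mechanical, for what it's worth (a sketch in Python, with a made-up preference list): a strict cycle A > B > C > A would require u(A) > u(B) > u(C) > u(A), so no utility assignment exists. Equivalently, the strict-preference graph contains a cycle.

```python
# Hypothetical revealed preferences, as (better, worse) pairs.
preferences = [("A", "B"), ("B", "C"), ("C", "A")]

graph = {}
for better, worse in preferences:
    graph.setdefault(better, []).append(worse)

def has_cycle(graph):
    """Return True if the strict-preference graph contains a cycle
    (i.e. no consistent utility assignment exists)."""
    visiting, done = set(), set()
    def dfs(node):
        if node in visiting:
            return True
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(nxt) for nxt in graph.get(node, [])):
            return True
        visiting.remove(node)
        done.add(node)
        return False
    return any(dfs(n) for n in list(graph))

print(has_cycle(graph))  # True: these preferences fit no utility function
```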