FAWS comments on Contrived infinite-torture scenarios: July 2010 - Less Wrong

Post author: PlaidX 23 July 2010 11:54PM


Comment author: daedalus2u 25 July 2010 01:05:43AM -1 points

To me, a reasonable utility function has to have a degree of self-consistency. A reasonable utility function wouldn't simultaneously value both doing and undoing the same action.

If an entity is using a utility function to determine its actions, then for every action the entity can perform, its utility function must be able to assign a utility value, which then determines whether the entity performs the action. If the utility function does not return a value, the entity still has to act or not act, so whatever rule fills that gap is, in effect, its utility function for that action (or non-action).
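As a hedged sketch of that point (the function names and the threshold rule are hypothetical, not anything from the comment):

```python
# Minimal sketch: an agent acting on a utility function must resolve
# every action, even when the function fails to return a value for it.

def utility(action):
    """Toy utility function; returns None for actions it cannot score."""
    return {"sing": 3.0, "sleep": 1.0}.get(action)

def decide(action, threshold=0.0):
    u = utility(action)
    if u is None:
        # The entity still has to act or not act; this default is,
        # in effect, its utility function for the unscored action.
        return False  # default to non-action
    return u > threshold
```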

The purpose of a utility function is to steer the entity toward actions that yield greater utility. A self-contradictory utility function defeats that whole purpose. While an arbitrary utility function can in principle occur, an intelligent entity with a self-contradictory utility function would achieve greater utility by modifying it until it was less self-contradictory.

It is probably not possible to have a utility function that is both complete (in that it returns a utility for each action the entity can perform) and consistent (in that it returns a single value for the utility of each such action) except for very simple entities. An entity complex enough to instantiate arithmetic is complex enough to invoke Gödel's theorem. An entity can substitute a random choice when its utility function does not return a value, but that will produce suboptimal outcomes.
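A minimal sketch of that random-choice fallback (toy names and structure are my own assumptions, not from the comment):

```python
import random

def choose(actions, utility):
    """Pick the best-scoring action; where the utility function returns
    no value, fall back on a random choice. The fallback keeps the
    entity acting, but on average it does worse than a complete,
    consistent utility function would."""
    scored = [(a, utility(a)) for a in actions]
    defined = [(a, u) for a, u in scored if u is not None]
    if defined:
        return max(defined, key=lambda au: au[1])[0]
    return random.choice(actions)  # suboptimal stopgap
```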

In the example that FAWS used, a utility function that seeks to annoy me as much as possible is inconsistent with the entity being an omnipotent AI that can simulate something as complex as me (an entity that can instantiate arithmetic). The only annoyance the AI has caused me is a -1 karma, which to me is less than a single dust mote in the eye.

Comment author: FAWS 25 July 2010 12:10:08PM 1 point

I said "as many times", not "as much as possible". The AI might value that particular kind and degree of annoyance uniquely. Say it is a failed FAI that was programmed to maximize rich, not strongly negative, human experience according to some screwed-up definition of "rich experiences"; according to that definition, your state of mind between reading and replying to that message scores best, so the AI spends as many computational resources as possible on simulating you reacting to that message.

Or perhaps it was supposed to value telling the truth to humans: there is a complicated formula for evaluating the value of each statement, and due to human error it rates truths that are told without being believed more highly (the programmer thought non-obvious truths are more valuable), so simulating you reacting to that statement is the most efficient way to make a high-scoring true statement that will not be believed.

Or it could value something else entirely that's just not obvious to a human. There should be an infinite number of non-contradictory utility functions valuing doing what it supposedly did, even though the prior for most of them is pretty low. (And only a small fraction of them should value still simulating you now, so by now you can be even more sure the original statement was wrong than you could be at the time, for reasons unrelated to your deduction.)
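That parenthetical reads as a Bayesian update; a toy version with invented numbers (the hypothesis and every probability below are assumptions for illustration, nothing from the thread):

```python
# Toy Bayesian update (all numbers invented): H = "an AI with such a
# utility function has been simulating me all along".
prior_h = 1e-6               # low prior for any such utility function
p_obs_given_h = 0.01         # few such AIs would still be simulating you now
p_obs_given_not_h = 1.0      # an unsimulated you persists as usual

posterior = (p_obs_given_h * prior_h) / (
    p_obs_given_h * prior_h + p_obs_given_not_h * (1 - prior_h)
)
print(f"{posterior:.2e}")    # ~1.00e-08: even less likely than the prior
```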