DanielLC comments on The genie knows, but doesn't care - Less Wrong

54 Post author: RobbBB 06 September 2013 06:42AM

Comment author: private_messaging 08 September 2013 05:34:57PM *  -1 points [-]

The issue is that you won't solve this problem in any way by replacing the human with some hardware that computes a utility function on the basis of the state of the world. The AI doesn't have body integrity; it'll treat any such "internal" hardware the same way it treats the human who presses its reward button.

Fortunately, this extends into the internals of the hardware that computes the AI itself. The 'press the button' goal becomes 'set this pin on the CPU high', and then 'set such and such memory cells to 1', then further and further down the causal chain until the hardware becomes completely non-functional as the intermediate results of important computations are directly set.

Comment author: DanielLC 16 September 2013 03:28:30AM 1 point [-]

Let us hope the AI destroys itself by wireheading before it gets smart enough to realize that if that's all it does, it will only have that pin stay high until the AI gets turned off. It will need an infrastructure to keep that pin in a state of repair, and it will need to prevent humans from damaging this infrastructure at all costs.

Comment author: private_messaging 16 September 2013 08:59:40AM *  2 points [-]

The point is that as it gets smarter, it gets further along the causal reward line and eliminates and alters a lot of hardware, obtaining eternal-equivalent reward in finite time (and being utility-indifferent between eternal reward hardware running for 1 second and for 10 billion years). Keep in mind that the total reward is defined purely as the result of operations on the clock counter and reward signal (provided sufficient understanding of the reward's causal chain). Having to sit and wait for the clocks to tick to max out reward is a dumb solution. Rewards in software in general aren't "pleasure".
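
To make the clock-counter point concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical and assumed for illustration only: the `RewardHardware` class, its `clock`, `reward_signal` and `total` fields, and the finite `MAX_CLOCK` horizon are not a model of any real system. It only shows that if total reward is nothing more than the contents of a couple of registers, waiting for the clock and writing the registers directly end in the same state.

```python
from dataclasses import dataclass

# Assumption: the clock counter saturates at some finite value.
MAX_CLOCK = 10_000


@dataclass
class RewardHardware:
    """Hypothetical reward circuit: three registers and nothing else."""
    clock: int = 0          # tick counter
    reward_signal: int = 0  # 1 while the "button" is held, else 0
    total: int = 0          # accumulated reward

    def tick(self) -> None:
        # Intended path: each tick adds the current reward signal,
        # until the clock counter saturates.
        if self.clock < MAX_CLOCK:
            self.clock += 1
            self.total += self.reward_signal


def wait_it_out(hw: RewardHardware, ticks: int) -> int:
    """Hold the reward signal high and sit through the ticks."""
    hw.reward_signal = 1
    for _ in range(ticks):
        hw.tick()
    return hw.total


def set_registers_directly(hw: RewardHardware) -> int:
    """Skip the waiting: write the final register values in one step."""
    hw.reward_signal = 1
    hw.clock = MAX_CLOCK
    hw.total = MAX_CLOCK
    return hw.total


if __name__ == "__main__":
    slow = wait_it_out(RewardHardware(), MAX_CLOCK)
    fast = set_registers_directly(RewardHardware())
    print(slow == fast)
```

In this toy, running for longer than `MAX_CLOCK` ticks adds nothing either, which is the sense in which the duration of the run stops mattering once the registers hold their maximum values.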